Hey everyone, I’m stuck on a problem with the Notion API. I want to pick a random entry from a big database, but there’s a catch. The API only lets me grab 100 entries at a time, and I can’t see how many total entries there are. Right now, I have to go through all the pages to find a random entry, which is slow for big databases.
I’ve set up a cron job to do this regularly, but it’s not working well with thousands of entries. Plus, I’m worried about hitting the rate limit if I make too many API calls at once.
Does anyone know a smarter way to get a random entry from a paginated database like this? I’m looking for a method that’s faster and won’t get me in trouble with rate limits. Any ideas would be super helpful! Thanks in advance for your suggestions.
As someone who’s wrestled with Notion API limitations, I feel your pain, Sophia. Here’s a trick that’s worked wonders for me: implement a weighted sampling approach. Instead of fetching all entries, periodically sample a subset of the database and assign weights based on how frequently entries are added or modified.
Store these weights locally and use them to bias your random selection. This way, you’re more likely to pick recent or frequently updated items, which often aligns with what users want anyway. You’ll need far fewer API calls, and it scales beautifully with large databases.
Just remember to occasionally re-sample to keep your weights fresh. This method has saved me tons of headaches and API quota. It’s not perfect randomness, but it’s a solid compromise between true randomness and API efficiency.
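In case a concrete sketch helps, here's roughly what I mean in TypeScript with the official @notionhq/client SDK. The `sampleDatabase` and `pickWeighted` helpers and the recency-based weighting rule are just my own assumptions to make the idea concrete; score entries however makes sense for your data.

```typescript
// Hypothetical sketch: cache a sampled subset with recency-based weights,
// then pick from the cache without touching the API on every request.
import { Client } from "@notionhq/client";

const notion = new Client({ auth: process.env.NOTION_TOKEN });

interface WeightedEntry {
  id: string;
  weight: number;
}

// Re-run this on a slow schedule (e.g. your existing cron job) to refresh the cache.
async function sampleDatabase(databaseId: string, maxPages = 3): Promise<WeightedEntry[]> {
  const entries: WeightedEntry[] = [];
  let cursor: string | undefined;

  for (let i = 0; i < maxPages; i++) {
    const response = await notion.databases.query({
      database_id: databaseId,
      start_cursor: cursor,
      page_size: 100,
    });

    for (const page of response.results) {
      // Assumption: weight by recency of last edit; any scoring rule works here.
      const edited = new Date((page as any).last_edited_time).getTime();
      const ageDays = (Date.now() - edited) / 86_400_000;
      entries.push({ id: page.id, weight: 1 / (1 + ageDays) });
    }

    if (!response.has_more || !response.next_cursor) break;
    cursor = response.next_cursor;
  }
  return entries;
}

// Weighted random pick from the locally cached sample -- zero API calls.
function pickWeighted(entries: WeightedEntry[]): WeightedEntry {
  const total = entries.reduce((sum, e) => sum + e.weight, 0);
  let r = Math.random() * total;
  for (const e of entries) {
    r -= e.weight;
    if (r <= 0) return e;
  }
  return entries[entries.length - 1];
}
```

The nice part is that the actual "give me a random entry" call reads only from the local cache, so the cron job is the only thing spending API quota.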
hey sophia, have you considered caching the total count and updating it periodically? that way you could pick a random offset, then page through with cursors just far enough to grab the item at that offset. since notion only gives you cursors (no jumping straight to page N), you'd still walk some pages, but on average it's half of them instead of all of them. just make sure to handle edge cases like deleted entries making the cached count stale!
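something like this, assuming you refresh the cached count elsewhere (the `cachedCount` parameter and `randomEntryByOffset` name are just placeholders):

```typescript
// Rough sketch: walk Notion's cursor pagination only as far as a random offset
// chosen against a cached total count (kept fresh by the cron job).
import { Client } from "@notionhq/client";

const notion = new Client({ auth: process.env.NOTION_TOKEN });
const PAGE_SIZE = 100;

async function randomEntryByOffset(databaseId: string, cachedCount: number) {
  // Pick a random index over the cached count, then walk cursors to that page.
  const target = Math.floor(Math.random() * cachedCount);
  const targetPage = Math.floor(target / PAGE_SIZE);
  const indexInPage = target % PAGE_SIZE;

  let cursor: string | undefined;
  for (let page = 0; ; page++) {
    const response = await notion.databases.query({
      database_id: databaseId,
      start_cursor: cursor,
      page_size: PAGE_SIZE,
    });
    if (response.results.length === 0) return null;

    if (page === targetPage) {
      // The cached count may be stale, so clamp to what actually came back.
      return response.results[Math.min(indexInPage, response.results.length - 1)];
    }
    if (!response.has_more || !response.next_cursor) {
      // Fewer pages than the cache suggested (entries were deleted); fall back.
      return response.results[response.results.length - 1];
    }
    cursor = response.next_cursor;
  }
}
```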
I’ve encountered a similar issue with large Notion databases. One approach that worked well for me was implementing a reservoir sampling algorithm. It allows you to select a random item from a stream of unknown length with only one pass through the data.
Here’s how it works: As you iterate through the pages, maintain a ‘reservoir’ of one item. For each new item, replace the reservoir item with probability 1/n, where n is the current item count. This ensures each item has an equal chance of being selected, regardless of database size.
This method still makes one request per page, so it doesn't reduce the number of API calls, but it only needs a single pass, constant memory, and no total count upfront. It gives you uniform selection regardless of database size, and because it's a sequential pass you can easily throttle it to stay under the rate limit. Just ensure you follow the pagination cursors correctly so you actually traverse all pages.
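Here's a minimal sketch of that idea in TypeScript against the official @notionhq/client SDK, assuming a reservoir of size one (the `randomEntryReservoir` name is mine):

```typescript
// Reservoir sampling (reservoir size 1) over Notion's cursor-based pagination:
// each item seen so far ends up in the reservoir with equal probability.
import { Client } from "@notionhq/client";

const notion = new Client({ auth: process.env.NOTION_TOKEN });

async function randomEntryReservoir(databaseId: string) {
  let reservoir: { id: string } | null = null;
  let seen = 0;
  let cursor: string | undefined;

  do {
    const response = await notion.databases.query({
      database_id: databaseId,
      start_cursor: cursor,
      page_size: 100,
    });

    for (const page of response.results) {
      seen++;
      // Keep the nth item with probability 1/n -- the first item is always kept,
      // and every item is equally likely to survive to the end of the pass.
      if (Math.random() < 1 / seen) {
        reservoir = page;
      }
    }

    cursor = response.next_cursor ?? undefined;
  } while (cursor);

  return reservoir;
}
```

If the pass over thousands of entries worries you rate-limit-wise, you can add a small delay between page requests; the algorithm doesn't care how slowly the stream arrives.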