I’m working on a Zapier integration with a polling trigger and need help with optimization. My API endpoint works fine and returns data correctly, but I’m worried about performance as my data grows.
Right now I’m using Zapier’s built-in deduplication by adding unique IDs to each item. This prevents duplicate processing, but my app still has to send all items every time Zapier polls. With hundreds of records now and thousands expected soon, this seems wasteful.
I want to optimize so my API only sends items that Zapier hasn’t seen before. This would reduce memory usage and improve performance. I thought about storing timestamps for each polling request, but that won’t work reliably since the same API might be used in multiple Zaps or for sample data requests.
What’s the best approach to track what items have already been sent to Zapier while keeping the solution robust?
tbh, tracking sent items is rough with zapier’s polling. Just send everything and let zapier handle the deduping - way less headache. Watch your performance tho. If things get slow, then you can revisit. For now, let zapier manage it.
I had the same issue with large datasets. Cursor-based pagination was a game changer for me. Instead of timestamps, I return a next_cursor parameter with each response that Zapier sends back on the next poll. The cursor itself records where the last poll left off, so the API doesn’t have to track individual Zap instances. I generate the cursor from the last record ID, or combine timestamp + ID for uniqueness. Just make sure your API can rebuild the exact query state from the cursor alone. This scaled way better than sending everything and cut our response times noticeably once we hit thousands of records.
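A rough sketch of the cursor idea above, in Python. Everything here is hypothetical and only for illustration: the in-memory RECORDS list stands in for a real table, and the names poll, encode_cursor and decode_cursor are made up. It also assumes the caller really does echo next_cursor back on its next request, as described in the post.

```python
import base64
import json
from typing import Optional, Tuple

# Hypothetical data set standing in for a database table.
RECORDS = [{"id": i, "updated_at": 1_700_000_000 + i * 60} for i in range(1, 11)]


def encode_cursor(updated_at: int, record_id: int) -> str:
    """Pack (timestamp, id) into an opaque, URL-safe string."""
    payload = json.dumps({"t": updated_at, "id": record_id}).encode("utf-8")
    return base64.urlsafe_b64encode(payload).decode("ascii")


def decode_cursor(cursor: str) -> Tuple[int, int]:
    """Recover the (timestamp, id) pair from a cursor string."""
    data = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    return data["t"], data["id"]


def poll(cursor: Optional[str] = None, limit: int = 4) -> dict:
    """Return up to `limit` records that sort strictly after the cursor."""
    after = decode_cursor(cursor) if cursor else (0, 0)
    # Ordering by (timestamp, id) keeps the position unique even when
    # several records share the same timestamp.
    ordered = sorted(RECORDS, key=lambda r: (r["updated_at"], r["id"]))
    page = [r for r in ordered if (r["updated_at"], r["id"]) > after][:limit]
    if page:
        last = page[-1]
        next_cursor = encode_cursor(last["updated_at"], last["id"])
    else:
        # Nothing new: hand back the same cursor so the position is kept.
        next_cursor = cursor
    return {"items": page, "next_cursor": next_cursor}
```

Because the whole query position lives inside the cursor, the server keeps no per-Zap state in this sketch. A request without a cursor simply starts from the beginning.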
I ran into a similar problem when our dataset grew past 2000 records. My solution was to keep a “last_synced” timestamp for each unique webhook URL. Since Zapier provides a consistent webhook URL parameter, it serves as a reliable identifier across different Zaps. I store a hash of the URL along with the timestamp of the last successful poll in my database. On each poll, I look up this timestamp and return only the records modified after it. This lets each Zap instance be tracked independently without conflicts. For sample requests, I ignore the timestamp and return only the most recent items. This cut our API response size by 90% and resolved our memory concerns.
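Here is roughly what that per-URL tracking could look like in Python. Treat it as a sketch under assumptions: the LAST_SYNCED dict stands in for the database table, RECORDS for the real data, and poll, url_key, the is_sample flag and the explicit now argument are all hypothetical names chosen for the example.

```python
import hashlib
from typing import Dict, List

# Hypothetical stand-ins for database tables.
LAST_SYNCED: Dict[str, int] = {}  # url hash -> timestamp of last successful poll
RECORDS: List[dict] = [
    {"id": 1, "modified_at": 100},
    {"id": 2, "modified_at": 200},
    {"id": 3, "modified_at": 300},
]


def url_key(url: str) -> str:
    """Hash the caller's URL so it can be used as a fixed-length lookup key."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def poll(url: str, now: int, is_sample: bool = False, sample_size: int = 3) -> List[dict]:
    """Return records modified since this URL's last poll.

    `now` is passed in explicitly to keep the sketch deterministic;
    a real service would read the clock instead.
    """
    if is_sample:
        # Sample requests ignore the stored timestamp and do not update it.
        newest_first = sorted(RECORDS, key=lambda r: r["modified_at"], reverse=True)
        return newest_first[:sample_size]

    key = url_key(url)
    since = LAST_SYNCED.get(key, 0)
    changed = [r for r in RECORDS if r["modified_at"] > since]
    # Record the poll time only after the query has succeeded.
    LAST_SYNCED[key] = now
    return changed
```

One thing to watch in a real setup: a record modified at exactly the stored poll time would be skipped by the strict comparison here. Using >= instead would resend it, which Zapier's ID-based deduplication should then filter out.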