Hey folks, I’m working on a paginated API and I’ve run into a tricky problem: I’m not sure how to handle items getting deleted between page requests.
Here’s the deal:
- My API returns 100 items per page
- If an item gets deleted after the first page request, the next page might skip an item
For example, if item 10 is deleted after the first page request, the second page returns items 102-201 instead of 101-200, so item 101 gets skipped entirely.
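To make the problem concrete, here’s a quick sketch that reproduces the skip with hypothetical in-memory data (the item IDs and page size are just for illustration):

```python
items = list(range(1, 301))  # hypothetical item IDs 1..300

def get_page(data, page, size=100):
    # Offset-based pagination: page 1 -> data[0:100], page 2 -> data[100:200]
    start = (page - 1) * size
    return data[start:start + size]

page1 = get_page(items, 1)   # items 1..100
items.remove(10)             # item 10 is deleted between the two requests
page2 = get_page(items, 2)   # every later item shifted up one slot

print(page1[-1], page2[0])   # 100 102 -> item 101 was skipped
```

The deletion shifts every subsequent item up by one position, so the fixed offset for page 2 now lands one item too far.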
I want API users to be able to get all the data without missing anything, but I also want to keep things simple and avoid things like per-request session management.
Has anyone dealt with this before? What’s a good way to handle it? Any examples from other APIs would be super helpful. Thanks!
I’ve encountered this issue in production environments. One effective approach is to implement an ETag-style system. Here’s how it works:
Generate a hash of your dataset for each page request. Send this hash along with the response.
When the client requests the next page, they include the previous etag. If it matches your current hash, proceed as normal. If it doesn’t, you know data has changed.
On a mismatch, you can either notify the client and let them decide how to proceed, or have them automatically re-fetch the entire dataset from the beginning.
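A minimal sketch of the steps above, assuming an in-memory dataset and hashing over the item IDs (all names here are illustrative, not a specific framework’s API):

```python
import hashlib
import json

dataset = [{"id": i, "name": f"item {i}"} for i in range(1, 251)]

def dataset_etag(data):
    # Hash the current item IDs so any insert or delete changes the etag.
    ids = json.dumps([item["id"] for item in data])
    return hashlib.sha256(ids.encode()).hexdigest()

def get_page(data, page, client_etag=None, size=100):
    etag = dataset_etag(data)
    if client_etag is not None and client_etag != etag:
        # Dataset changed since the client's last request: report it.
        return {"error": "dataset_changed", "etag": etag}
    start = (page - 1) * size
    return {"items": data[start:start + size], "etag": etag}

resp1 = get_page(dataset, 1)
del dataset[9]  # an item is deleted between requests
resp2 = get_page(dataset, 2, client_etag=resp1["etag"])
print(resp2.get("error"))  # dataset_changed
```

One caveat worth noting: hashing the whole dataset on every request has a cost, so in practice you’d hash something cheaper, like a max(updated_at) column or a version counter.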
This method maintains simplicity while providing a mechanism to detect changes. It’s scalable and doesn’t require maintaining server-side state.
Remember to document this behavior clearly in your API docs for developers to understand and implement correctly.
Yo, I’ve faced this before! One approach: include a last_modified timestamp with each item. Clients can then check for any items modified since their last request and fetch those separately. It’s not perfect, but it helps catch most changes without overhauling your whole system. Good luck, mate!
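In case it helps, a rough sketch of that last_modified check (timestamps and helper names are made up for the example):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical items, each carrying a last_modified timestamp.
now = datetime.now(timezone.utc)
items = [
    {"id": 1, "last_modified": now - timedelta(hours=2)},
    {"id": 2, "last_modified": now - timedelta(minutes=5)},
    {"id": 3, "last_modified": now - timedelta(days=1)},
]

def modified_since(data, since):
    # The client re-fetches anything touched after its last request.
    return [item for item in data if item["last_modified"] > since]

last_request = now - timedelta(minutes=30)
changed = modified_since(items, last_request)
print([item["id"] for item in changed])  # [2]
```

Note this catches modifications but not deletions on its own; a deleted item simply stops appearing, so you’d pair it with soft deletes or a tombstone record if you need to detect removals too.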
I’ve grappled with this issue in a few projects, and one solution that’s worked well for me is implementing cursor-based pagination instead of offset-based. Here’s how it works:
Instead of using page numbers, you use a unique identifier (like a timestamp or ID) as the cursor. When the client requests the next page, they send the cursor of the last item they received.
This way, even if items are deleted between requests, you’re always starting from a known point. It’s more robust against changes in the dataset and eliminates the ‘skipped item’ problem.
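A minimal sketch of the idea, using the item ID as the cursor over an in-memory list (a real API would translate the cursor into a WHERE id > ? query):

```python
items = list(range(1, 301))  # sorted item IDs; stand-in for a database table

def get_page(data, after_id=0, size=100):
    # Cursor-based: return up to `size` items strictly after the cursor,
    # instead of slicing by a fixed offset.
    page = [i for i in data if i > after_id][:size]
    next_cursor = page[-1] if page else None
    return page, next_cursor

page1, cursor = get_page(items)           # items 1..100, cursor = 100
items.remove(10)                          # deletion between requests
page2, _ = get_page(items, after_id=cursor)

print(page2[0])  # 101 -> nothing is skipped
```

Because the second request starts from “everything after item 100” rather than “everything after position 100”, the deletion earlier in the list can’t shift the page boundary.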
It does require some changes on both the server and client side, but in my experience, it’s worth it for the improved reliability. Plus, many major APIs (like Twitter and Facebook) use this approach, so there are plenty of examples out there to reference.
Just my two cents based on what I’ve seen in the field. Hope this helps!