Extracting JSON data from AJAX response using Puppeteer

I’m working on a web scraping project and need to capture JSON data from AJAX calls made by the page. When I try to intercept the network response and extract the JSON body, I keep getting a pending Promise instead of the actual data.

Here’s my current approach:

const puppeteer = require('puppeteer');
const extractedData = [];

(async () => {
    const browserInstance = await puppeteer.launch({
        headless: true
    });
    const currentPage = await browserInstance.newPage();
    
    await currentPage.goto("https://example-site.com/search-page", {
        waitUntil: 'networkidle0'
    });

    await currentPage.type('#location-input', 'SW1A 1AA');
    await currentPage.click('#submit-btn');

    await currentPage.on('response', async responseObj => {    
        if (responseObj.url().includes("/api/search-results")){
            console.log('AJAX call detected'); 
            console.log(await responseObj.json()); 
        } 
    }); 
})();

The problem is that responseObj.json() returns a Promise that stays pending. How can I properly await this and get the actual JSON content from the XHR response?

You’re attaching the response listener after the AJAX call starts. That’s why it’s not working. Set up the listener before anything triggers the request. Here’s how to fix it:

const puppeteer = require('puppeteer');
const extractedData = [];

(async () => {
    const browserInstance = await puppeteer.launch({
        headless: true
    });
    const currentPage = await browserInstance.newPage();

    currentPage.on('response', async responseObj => {
        if (responseObj.url().includes("/api/search-results")) {
            console.log('AJAX call detected');
            const jsonData = await responseObj.json();
            console.log(jsonData);
            extractedData.push(jsonData);
        }
    });

    await currentPage.goto("https://example-site.com/search-page", {
        waitUntil: 'networkidle0'
    });

    await currentPage.type('#location-input', 'SW1A 1AA');
    await currentPage.click('#submit-btn');
    await currentPage.waitForTimeout(2000);
})();

Once the listener’s in place before the AJAX call, it’ll catch the response.

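Another issue - you’re awaiting the currentPage.on(...) call itself, which does nothing useful: on() only registers the listener and doesn’t return a promise. Drop that await (the await on responseObj.json() inside the handler is fine and should stay). Also, don’t close the browser until the response has actually arrived. Skip waitForTimeout since a fixed delay is unreliable (too short and you miss the response, too long and you just waste time) - use waitForResponse instead, which resolves as soon as a matching response comes in.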
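
For what it’s worth, here’s a minimal sketch of the waitForResponse approach. The helper name clickAndGetJson and its parameters are made up for illustration, and it assumes the endpoint answers with a 200 and a JSON body. The wait is started before the click so the response can’t slip past.

```javascript
// Hypothetical helper: click an element and return the JSON body of the
// first matching response. `page` is expected to be a Puppeteer Page.
async function clickAndGetJson(page, selector, urlFragment) {
    const [response] = await Promise.all([
        // Start waiting first; the predicate runs for every response.
        page.waitForResponse(
            res => res.url().includes(urlFragment) && res.status() === 200
        ),
        // Then trigger the request.
        page.click(selector),
    ]);
    // json() returns a promise; the caller awaits the parsed result.
    return response.json();
}
```

In the script from the question this could stand in for both the listener and the timeout, e.g. `const data = await clickAndGetJson(currentPage, '#submit-btn', '/api/search-results');`, with the browser closed only after that line has returned.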

The timing’s your main problem - you need to register your response handler before triggering anything that makes the AJAX call. I’ve also run into cases where the response body couldn’t be read because another part of my code had already consumed it, and that can look like a promise that never resolves. Wrap your JSON extraction in a try-catch to see if that’s what’s happening. Also check the response status and content-type header before trying to parse as JSON, since endpoints sometimes return empty bodies or error pages that won’t parse. When I’m doing similar scraping, I add a small delay after the click but before processing responses, which helps make sure the network request has actually started.
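
Roughly what I mean by the defensive checks, as a sketch rather than anything definitive. safeJson is a name I made up; it assumes a Puppeteer response object (header names come back lowercased from headers()) and returns null instead of throwing when the body isn’t usable.

```javascript
// Hypothetical helper: parse a response as JSON only when it looks safe to.
async function safeJson(response) {
    const status = response.status();
    if (status < 200 || status >= 300) {
        console.warn(`Skipping ${response.url()}: HTTP ${status}`);
        return null;
    }

    const contentType = response.headers()['content-type'] || '';
    if (!contentType.includes('json')) {
        console.warn(`Skipping ${response.url()}: content-type "${contentType}"`);
        return null;
    }

    try {
        return await response.json();
    } catch (err) {
        // Empty or malformed bodies, or a body that is no longer available.
        console.warn(`Could not parse ${response.url()}: ${err.message}`);
        return null;
    }
}
```

Inside the response handler you’d then call `const jsonData = await safeJson(responseObj);` and only push it to extractedData when it isn’t null.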