How can I refresh a webpage using Puppeteer when it fails to load?

I’m working on a web scraping project and sometimes pages don’t load correctly. When this happens, I get errors trying to find elements on the page. I noticed that if I manually refresh the page in the browser, it works fine.

I tried using page.reload() but it’s not working as expected. Here’s my code:

for (const category of categories) {
    
    // Extract all product links from the page
    const productLinks = await page.$$eval('section.content > div.item-grid > article > header a.title-link', anchors => anchors.map(anchor => anchor.href));
    
    // Process each product link individually
    for (let productUrl of productLinks) {
        const index = counter++;
        try {
            await page.goto(productUrl);
            const productTitle = await page.$eval('header.product-title h1', element => element.innerText.trim());
            console.log('\n' + index);
            console.log(productTitle);
        } catch(error) {
            console.log('\n' + index);
            console.log('FAILED', error);
            await page.reload();
        }
    }
}

The error I’m getting looks like this:

FAILED Error: Error: failed to find element matching selector "header.product-title h1"
    at ElementHandle.$eval
    at process._tickCallback

Is there a better way to handle page reloading in Puppeteer when elements fail to load? The pages work when I refresh them manually, so I think there should be an automatic way to do this.

Add a wait after the reload - try await page.waitForTimeout(2000) or, better yet, await page.waitForSelector('header.product-title h1'). The page often needs extra time to fully load even after the reload finishes.
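As a rough sketch of that reload-then-wait approach (assuming page is a Puppeteer Page, and using the selector from the question):

```javascript
// After a failed extraction: reload, wait for the selector to appear,
// then retry the extraction. Throws if the element still never shows up.
async function reloadAndExtract(page, selector) {
    await page.reload({ waitUntil: 'networkidle2' });
    await page.waitForSelector(selector, { timeout: 10000 });
    return page.$eval(selector, el => el.innerText.trim());
}
```

You'd call this from the catch block instead of the bare page.reload().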

I’ve hit this same issue with production scrapers. The real problem isn’t reload timing - you’re being reactive instead of building resilience upfront.

I wrap my scraping logic in an automation pipeline. Skip the manual retry code and page reloads. Set up workflows that handle navigation failures, missing elements, and data extraction with built-in recovery.

The workflow watches each page load, catches missing elements automatically, and tries different strategies. Reload, wait for selectors, switch user agents - whatever works. No complex error handling needed.

This scales way better with hundreds of URLs. You get proper logging, failure tracking, plus alerts when sites consistently break.

Teams waste weeks on custom retry logic when they could automate the whole pipeline in hours. Let automation handle edge cases while you focus on extracting data.

Check out Latenode for resilient scraping workflows: https://latenode.com

Try page.goto(productUrl, { waitUntil: 'domcontentloaded' }) instead of the default. Sometimes the DOM isn’t ready even after goto resolves. Also, wrap your selector in try-catch with a backup selector - the HTML structure might change between loads.
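The backup-selector idea might look like this - a sketch where page is assumed to be a Puppeteer Page, and the fallback selector (h1.product-name) is an illustrative guess at the site's alternate markup:

```javascript
// Try the primary selector first; if that lookup throws (element missing),
// fall back to an alternative selector before giving up.
async function titleWithFallback(page) {
    try {
        return await page.$eval('header.product-title h1', el => el.innerText.trim());
    } catch {
        // Hypothetical fallback - adjust to whatever the site actually renders.
        return await page.$eval('h1.product-name', el => el.innerText.trim());
    }
}
```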

The problem is that page.reload() just re-requests the current page, and your catch block never retries the extraction afterward - the loop moves straight on to the next URL. Instead of reloading, retry the navigation itself. Replace your catch block with await page.goto(productUrl, { waitUntil: 'networkidle2' }); and then attempt to find the element again. The waitUntil option ensures the page has finished loading before you proceed. Additionally, add a retry counter so a URL that keeps failing can't cause an infinite loop.
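Sketched out, with a capped attempt count so a permanently broken URL can't loop forever (page is assumed to be a Puppeteer Page; maxAttempts is an illustrative choice):

```javascript
// Retry the navigation itself, then re-attempt the extraction.
// Re-throws the last error once all attempts are exhausted.
async function gotoWithRetries(page, url, maxAttempts = 3) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            await page.goto(url, { waitUntil: 'networkidle2' });
            return await page.$eval('header.product-title h1', el => el.innerText.trim());
        } catch (error) {
            if (attempt === maxAttempts) throw error;
        }
    }
}
```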

Your problem isn’t page failures - it’s timing. When page.goto() finishes, dynamic content might still be loading. I’ve had way better luck using exponential backoff with element waiting instead of reloading the page. Don’t catch the error and reload. Instead, wait for the specific element before trying to extract data. Use page.waitForSelector('header.product-title h1', { timeout: 10000 }) right after navigation. If that times out, retry with page.goto() again rather than page.reload(). I’ve scraped similar sites - the element probably loads via JavaScript after the DOM is ready. Add a small delay or check for loading indicators before selecting elements. This prevents most failures without any reload logic.
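A rough sketch of that goto-wait-retry loop with exponential backoff (page is assumed to be a Puppeteer Page; the attempt count and delays are illustrative):

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Navigate, wait for the element, extract. On timeout, back off with a
// doubling delay and re-run goto rather than page.reload().
async function extractWithBackoff(page, url, selector, attempts = 3, baseDelay = 1000) {
    let delay = baseDelay;
    for (let i = 0; i < attempts; i++) {
        await page.goto(url, { waitUntil: 'domcontentloaded' });
        try {
            await page.waitForSelector(selector, { timeout: 10000 });
            return await page.$eval(selector, el => el.innerText.trim());
        } catch {
            await sleep(delay); // back off before the next attempt
            delay *= 2;         // exponential backoff
        }
    }
    throw new Error('Element ' + selector + ' never appeared at ' + url);
}
```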

I’ve hit this same issue. A retry mechanism works way better than just reloading the page. Your catch block reloads but doesn’t retry the element selection - that’s the problem. Wrap your element extraction in a separate function with retry logic. Create a retry wrapper that runs page.$eval() multiple times with delays between attempts. Network timeouts and incomplete DOM rendering cause this constantly. Also check if the page actually loaded after page.goto() by verifying the response status before you try extracting elements. This approach has been way more reliable than reloading in my scraping projects.
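Something like this - a generic retry wrapper plus the response-status check, where the helper names are illustrative and page.goto()'s return value is Puppeteer's HTTPResponse:

```javascript
const pause = ms => new Promise(resolve => setTimeout(resolve, ms));

// Run any async operation up to `attempts` times with a delay between
// tries; re-throws the last error if every attempt fails.
async function withRetries(fn, attempts = 3, delayMs = 500) {
    let lastError;
    for (let i = 0; i < attempts; i++) {
        try {
            return await fn();
        } catch (error) {
            lastError = error;
            await pause(delayMs);
        }
    }
    throw lastError;
}

// Usage against a Puppeteer page:
// const response = await page.goto(productUrl);
// if (!response.ok()) throw new Error('HTTP ' + response.status());
// const title = await withRetries(() =>
//     page.$eval('header.product-title h1', el => el.innerText.trim()));
```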