How to handle reCAPTCHA manually during automated web scraping with Puppeteer in Node.js?

I’m building a web scraper with Puppeteer to automate searches on Google. Everything works fine until I encounter Google’s reCAPTCHA protection.

I don’t want to bypass the reCAPTCHA since that’s against their terms. Instead, I want to pause the automation and let myself solve the CAPTCHA manually when it appears, then continue with the scraping process.

Is there a way to detect when reCAPTCHA shows up and pause the script so I can solve it? After solving it, the script should resume automatically.

Here’s my current approach:

const fs = require('fs');
const puppeteer = require('puppeteer');
const cheerio = require('cheerio');

const searchTerms = ['term1', 'term2', 'term3'];

async function performSearch() {
    for (let term of searchTerms) {
        const browser = await puppeteer.launch({headless: false});
        const page = await browser.newPage();
        
        try {
            await page.goto(`https://www.google.com/search?q=${encodeURIComponent(term)}`);
            const pageContent = await page.content();
            const $ = cheerio.load(pageContent);
            
            if (pageContent.includes('unusual traffic from your computer')) {
                console.log('CAPTCHA detected - need manual intervention');
                // How do I pause here for manual solving?
                // Then continue after it's solved?
            }
            
            // Process results here
            
        } catch (error) {
            console.error('Error during search:', error);
        } finally {
            await browser.close();
        }
    }
}

performSearch();

I’m fairly new to Node.js automation. Any suggestions on how to implement this manual CAPTCHA solving approach would be really helpful.

I had this exact issue when scraping job boards last year. Use page.waitForNavigation() or page.waitForSelector() to detect when the CAPTCHA gets resolved. Here’s what worked for me:

if (pageContent.includes('unusual traffic from your computer')) {
    console.log('CAPTCHA detected - solve it manually');
    
    // Wait for navigation after CAPTCHA is solved
    await page.waitForNavigation({ waitUntil: 'networkidle2', timeout: 0 });
    
    console.log('CAPTCHA solved, continuing...');
}

The timeout: 0 disables Puppeteer's default 30-second navigation timeout, so it waits indefinitely while you solve the CAPTCHA. You can also wait for a specific element that only shows up after the CAPTCHA is solved instead of waiting for navigation. I'd throw in some user-agent rotation and random delays between requests to hit fewer CAPTCHAs in the first place. This works on any site where solving the challenge triggers a navigation back to the real page.
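
If you'd rather key off an element than a navigation, here's a minimal sketch of that variant plus a random delay between searches. Treat '#search' as an assumption on my part for Google's results container - inspect the markup before relying on it:

if (pageContent.includes('unusual traffic from your computer')) {
    console.log('CAPTCHA detected - solve it manually');

    // Resolves once the results container shows up again after you solve it.
    // '#search' is assumed to be Google's results wrapper - verify it yourself.
    await page.waitForSelector('#search', { timeout: 0 });

    console.log('CAPTCHA solved, continuing...');
}

// Random 2-7 second pause between searches so requests look less uniform
const delay = 2000 + Math.floor(Math.random() * 5000);
await new Promise(resolve => setTimeout(resolve, delay));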

I’ve had good luck combining element detection with user input prompts. I check for the reCAPTCHA iframe and Google’s blocking messages, then use readline to pause until I manually confirm it’s done:

const readline = require('readline');

async function handleCaptcha(page) {
    // Either the reCAPTCHA iframe or Google's "sorry" block form means we're blocked
    const captchaExists = await page.$('iframe[src*="recaptcha"]') ||
                          await page.$('form[action*="sorry"]');

    if (captchaExists) {
        console.log('Please solve the CAPTCHA and press Enter to continue...');

        const rl = readline.createInterface({
            input: process.stdin,
            output: process.stdout
        });

        // Blocks here until you hit Enter in the terminal
        await new Promise(resolve => rl.question('', resolve));
        rl.close();
    }
}

You get complete control over when to resume instead of relying on automatic detection. Way more reliable than waiting for navigation changes - sometimes the URL doesn’t even change after solving the CAPTCHA.
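
Here's roughly how it slots into your search loop - just a sketch reusing the names from your question, not a drop-in replacement:

// Navigate, pause for manual solving if needed, then parse as before
await page.goto(`https://www.google.com/search?q=${encodeURIComponent(term)}`);

// Blocks here until you press Enter if a CAPTCHA was detected
await handleCaptcha(page);

// Safe to parse now - the block page is gone
const pageContent = await page.content();
const $ = cheerio.load(pageContent);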

You can also use page.waitForFunction() to watch for when the CAPTCHA disappears from the DOM. I'll do something like await page.waitForFunction(() => !document.querySelector('.captcha-container'), {timeout: 0}) - just swap '.captcha-container' for whatever selector the site actually uses for its challenge. Don't close the browser while it's waiting or the pending promise will reject and crash your script.
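
For Google's block page specifically, a more targeted version might look like the sketch below. The selectors are assumptions based on what the sorry page currently renders, so verify them yourself:

// Wait until both the "sorry" form and the reCAPTCHA iframe are gone.
// Selectors are assumptions - inspect the block page and adjust if needed.
await page.waitForFunction(
    () => !document.querySelector('form[action*="sorry"]') &&
          !document.querySelector('iframe[src*="recaptcha"]'),
    { timeout: 0 }
);

If solving the challenge triggers a full navigation and the wait errors out on the context change, fall back to the waitForNavigation approach from the first answer.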