Closing Chromium processes after browser.disconnect in Puppeteer

Problem:

I’m using Puppeteer to visit a large number of websites, reusing a single browser instance via puppeteer.connect. After each visit I close the tab and disconnect, but I’m noticing problems:

  • First 100 visits work fine
  • Then it takes more tries to load pages
  • After 500 visits, I get timeout errors

I checked Task Manager and saw lots of Chromium processes still running. They use up memory and CPU.

Question:

How can I make sure Chromium processes are really closed after I use browser.disconnect?

Code example:

const puppeteer = require('puppeteer')
const sites = ['site1.com', 'site2.com', 'site3.com']

async function visitSites() {
  const mainBrowser = await puppeteer.launch({ headless: true })
  const wsEndpoint = mainBrowser.wsEndpoint() // synchronous; returns a string, no await needed

  for (const site of sites) {
    try {
      const tempBrowser = await puppeteer.connect({ browserWSEndpoint: wsEndpoint })
      const page = await tempBrowser.newPage()
      await page.goto(site)
      
      // Do stuff with the page

      await page.goto('about:blank')
      await page.close()
      await tempBrowser.disconnect()
    } catch (err) {
      console.error(err)
    }
  }
  await mainBrowser.close()
}

visitSites().catch(console.error)

Any ideas on how to fix this? Thanks!

Hey Luke, have you tried using browser.close() instead of disconnect()? disconnect() only detaches Puppeteer from the browser; it never terminates the Chromium process, so whatever you launched keeps running until something explicitly closes it. That might be exactly why those pesky processes stick around. Just a thought!
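To make the difference concrete, here's a minimal sketch (demo is just an illustrative wrapper, not part of your code):

const puppeteer = require('puppeteer');

async function demo() {
  const owned = await puppeteer.launch({ headless: true });
  const wsEndpoint = owned.wsEndpoint();

  // disconnect() only detaches this client; the Chromium process keeps running
  const remote = await puppeteer.connect({ browserWSEndpoint: wsEndpoint });
  await remote.disconnect();

  // close() actually terminates the browser; no Chromium processes remain after this
  await owned.close();
}

demo().catch(console.error);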

I’ve dealt with similar memory leaks in Puppeteer before. One thing that helped me was the --single-process flag when launching the browser. It makes Chromium run its renderers inside the browser process, which leaves fewer processes to track and clean up. Be aware that Chromium doesn’t officially support this mode and it can be unstable, so test it in your environment first.

Here’s what I’d suggest trying:

const browser = await puppeteer.launch({
  headless: true,
  args: ['--single-process']
});

Also, consider implementing a garbage collection cycle every X visits. You can force Node.js to run garbage collection using:

if (global.gc) {
  global.gc();
}

Run Node with the --expose-gc flag (node --expose-gc script.js) to make global.gc available. It’s not a silver bullet, but it helped me squeeze more performance out of long-running Puppeteer scripts.
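Wired into a visit loop, the "every X visits" idea could look like this sketch; GC_EVERY and maybeCollect are made-up names, and the interval is something you'd tune:

const GC_EVERY = 50; // assumption: adjust for your workload

let visitCount = 0;

function maybeCollect() {
  visitCount++;
  // global.gc only exists when Node was started with --expose-gc
  if (visitCount % GC_EVERY === 0 && global.gc) {
    global.gc();
  }
}

Call maybeCollect() at the end of each loop iteration.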

Lastly, monitor your memory usage closely. If you see it creeping up despite these measures, you might need to implement a full browser restart every few hundred visits as a last resort.

I’ve encountered similar issues with Puppeteer and found that explicitly closing pages and managing browser instances more aggressively helps. Instead of disconnecting after each visit, try closing the page and creating a new one for each site. Also, consider implementing a browser restart mechanism every X visits to prevent resource buildup.

Here’s a modified approach that might work better:

async function visitSites() {
  let browser = await puppeteer.launch({ headless: true });
  let pageCount = 0;

  for (const site of sites) {
    try {
      // Restart the browser every 100 pages to release accumulated resources
      if (pageCount >= 100) {
        await browser.close();
        browser = await puppeteer.launch({ headless: true });
        pageCount = 0;
      }

      const page = await browser.newPage();
      await page.goto(site);
      
      // Do stuff with the page

      await page.close();
      pageCount++;
    } catch (err) {
      console.error(err);
    }
  }
  await browser.close();
}

This approach should help manage resources more effectively and prevent the issues you’re experiencing with long-running instances.
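One caveat with the snippet above: if page.goto throws, control jumps to the catch block and the page created just before it never gets closed, so failed navigations can still leak tabs. Wrapping the per-page work in try/finally (reusing browser, site, and pageCount from the code above) avoids that:

const page = await browser.newPage();
try {
  await page.goto(site);
  // Do stuff with the page
} finally {
  await page.close(); // runs even when goto or the page work throws
}
pageCount++;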