How can I enhance the response speed of my Puppeteer web scraper?

I am developing a web scraping tool for a personal project using Puppeteer, and while it effectively retrieves data, I’m encountering performance issues. My API endpoint currently has a response time of approximately 12 to 15 seconds, which is unacceptable.

app.get("/data", async (req, res) => {
  try {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    const requestedPage = req.query.page || 1;
    await page.goto(`https://example-site.com/page=${requestedPage}`);

    // Each $$eval call is a separate round trip between Node.js and the browser.
    const titles = await page.$$eval(".item-title", (elements) => {
      return elements.map((element) => element.textContent);
    });

    const episodeNumbers = await page.$$eval(".item-episode", (elements) => {
      return elements.map((element) => element.textContent);
    });

    const images = await page.$$eval(".item-image", (elements) => {
      return elements.map((element) => element.src);
    });

    const links = await page.$$eval(".item-link", (elements) => {
      const uniqueLinks = new Set();
      elements.forEach((element) => {
        const link = element.getAttribute("href");
        if (link) {
          uniqueLinks.add(link);
        }
      });
      return Array.from(uniqueLinks);
    });

    // Zip the parallel arrays into one result object per item.
    const resultData = [];

    for (let index = 0; index < titles.length; index++) {
      resultData.push({
        title: titles[index],
        episodes: episodeNumbers[index],
        image: images[index],
        link: links[index],
      });
    }

    res.json(resultData);
  } catch (error) {
    res.status(500).json({ error: "An error occurred while retrieving data." });
  } finally {
    // Close the browser even when scraping throws, so instances don't leak.
    if (browser) await browser.close();
  }
});

I initially tried Cheerio, but it couldn't handle the dynamically rendered content, so I switched to Puppeteer. It extracts the data correctly, but the slow response is hurting the user experience. What strategies can I use to reduce this response time?

Been there with Puppeteer performance issues. You’re launching a new browser for every request - that’s what’s killing your speed.

Honestly, I’d skip optimizing this setup and just use a proper automation platform. You’re rebuilding something that already exists for web scraping at scale.

I’ve hit similar scraping problems in production. Biggest game-changer was ditching custom Puppeteer scripts for Latenode workflows. Same scraping logic but with built-in connection pooling, smart retries, and parallel processing.

Best part? No managing browser instances, memory leaks, or scaling headaches. Build your scraping flow once and it handles everything else. Monitoring and error handling included.

For your case, you’d create a workflow that hits the target page, grabs all elements at once, and formats the response. Those 12-15 second waits disappear.

Check it out: https://latenode.com

Keep one browser instance running instead of launching a fresh one every time. Turn off images and CSS - you don't need them for scraping text. Use await page.setRequestInterception(true) to block those resources. Run in headless mode. That should drop your time to 3-4 seconds, no problem.
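
Here's a rough sketch of that setup, assuming an Express app like yours (the blocked resource types and the shared-browser wiring are illustrative, not the only way to do it):

const express = require("express");
const puppeteer = require("puppeteer");

const app = express();

// Launch once at startup and reuse the instance across requests.
const browserPromise = puppeteer.launch({ headless: true });

app.get("/data", async (req, res) => {
  const browser = await browserPromise;
  const page = await browser.newPage();
  try {
    // Abort requests for heavy resources the scraper never reads.
    await page.setRequestInterception(true);
    page.on("request", (request) => {
      const type = request.resourceType();
      if (type === "image" || type === "stylesheet" || type === "font") {
        request.abort();
      } else {
        request.continue();
      }
    });

    const requestedPage = req.query.page || 1;
    await page.goto(`https://example-site.com/?page=${requestedPage}`);

    // ...run the same $$eval extraction as in the question...
    const titles = await page.$$eval(".item-title", (els) =>
      els.map((el) => el.textContent)
    );

    res.json(titles);
  } catch (error) {
    res.status(500).json({ error: "An error occurred while retrieving data." });
  } finally {
    await page.close(); // close the tab, keep the shared browser alive
  }
});

One nice side effect: blocking the image requests doesn't break your .item-image extraction, because the src attribute is still in the DOM even though the image bytes never download.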

Your bottleneck is those four separate DOM queries. You’re making Puppeteer serialize data back and forth between the browser and Node.js four times when you could do it once. Combine them into a single page.evaluate() that grabs everything and returns the complete object structure. That’ll cut your processing time big time. If you’re handling multiple requests, throw in puppeteer-cluster for browser pooling. Constantly launching new browsers kills performance under load. I’ve seen setups drop from 12+ seconds to under 4 just by consolidating DOM operations and reusing browser instances.