I’m trying to set up a REST API using Express and need to scrape data with Puppeteer during the build process. However, when deploying to Vercel, I encounter the error: ‘Failed to launch the browser process!’
I’ve found several online suggestions that recommend installing system libraries using sudo apt-get install, but Vercel restricts installations to npm packages during deployment.
I also attempted using chrome-aws-lambda along with puppeteer-core, yet the issue persists. My build command is node scrap.js && node index.js.
Below is a revised version of my scraping code:
const puppeteer = require('puppeteer-core');
const chrome = require('chrome-aws-lambda');

async function scrapeSite() {
  const browser = await puppeteer.launch({
    args: chrome.args,
    executablePath: await chrome.executablePath,
    headless: true,
  });
  const page = await browser.newPage();
  await page.goto('https://example.com');
  const pageTitle = await page.title();
  await browser.close();
  return pageTitle;
}

scrapeSite().then(result => console.log(result));
Is there a recommended workaround to enable Puppeteer to run successfully on Vercel without storing scraped data in my GitHub repository?
Have you tried Playwright instead of Puppeteer? It’s pretty sweet for scraping and, in my experience, works better on Vercel. Just swap out Puppeteer for Playwright in your code and it might solve your problem. Worth a shot if nothing else is working!
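For reference, a minimal sketch of what that swap could look like, assuming playwright-core paired with @sparticuz/chromium for the serverless binary (the pairing and launch options here are my assumptions, not something verified on the asker’s project):

```javascript
// Hypothetical sketch: Playwright on Vercel, assuming playwright-core
// and @sparticuz/chromium are added as npm dependencies.
async function scrapeSite() {
  // Requires live inside the function so this file still loads in
  // environments where the packages are not installed.
  const { chromium } = require('playwright-core');
  const sparticuz = require('@sparticuz/chromium');

  const browser = await chromium.launch({
    args: sparticuz.args,
    executablePath: await sparticuz.executablePath(),
    headless: true,
  });
  const page = await browser.newPage();
  await page.goto('https://example.com');
  const pageTitle = await page.title();
  await browser.close();
  return pageTitle;
}

module.exports = { scrapeSite };
```

The function is defined but not invoked here, since launching a browser only makes sense once it is deployed with those dependencies installed.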
I’ve encountered similar issues when deploying Puppeteer-based projects to serverless environments like Vercel. One approach that worked for me was using the @sparticuz/chromium package instead of chrome-aws-lambda. It’s specifically designed for serverless environments and has better compatibility.
Here’s how I modified my code:
const puppeteer = require('puppeteer-core');
const chromium = require('@sparticuz/chromium');

async function scrapeSite() {
  const browser = await puppeteer.launch({
    args: chromium.args,
    defaultViewport: chromium.defaultViewport,
    executablePath: await chromium.executablePath(),
    headless: chromium.headless,
    ignoreHTTPSErrors: true,
  });
  // Rest of your code...
}
Also, make sure to add @sparticuz/chromium to your dependencies. This solution has worked reliably for me in production on Vercel. If you’re still facing issues, you might want to check Vercel’s function size limits, as Puppeteer can be quite large.
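For example, the relevant part of package.json might look like this (the version numbers are illustrative only; puppeteer-core and @sparticuz/chromium need compatible versions, so check each package’s release notes before pinning):

```json
{
  "dependencies": {
    "puppeteer-core": "^21.0.0",
    "@sparticuz/chromium": "^119.0.0"
  }
}
```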
Have you considered using a headless browser API service instead of running Puppeteer directly on Vercel? Services like Browserless or ScrapingBee handle the browser infrastructure for you, which can bypass deployment issues.
You’d just need to make API calls to their service instead of managing Puppeteer locally. This approach has worked well for me on serverless platforms.
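As a rough illustration, calling such a service usually boils down to a single HTTP request. The endpoint and parameter names below follow ScrapingBee’s general pattern but should be treated as assumptions to verify against the provider’s docs:

```javascript
// Build the request URL for a ScrapingBee-style scraping API.
// Endpoint and parameter names are assumptions; verify against the
// documentation of whichever service you sign up for.
function buildScrapeUrl(apiKey, targetUrl) {
  const params = new URLSearchParams({
    api_key: apiKey,
    url: targetUrl,
  });
  return `https://app.scrapingbee.com/api/v1/?${params.toString()}`;
}

// The actual request is a plain HTTP call (Node 18+ has global fetch);
// not invoked here since it needs a real API key.
async function scrapeViaApi(apiKey, targetUrl) {
  const res = await fetch(buildScrapeUrl(apiKey, targetUrl));
  return res.text();
}

module.exports = { buildScrapeUrl, scrapeViaApi };
```

Because the browser runs on the provider’s infrastructure, your Vercel function stays small and avoids the Chromium binary problem entirely.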
If you prefer keeping it in-house, you could also explore running your scraping process on a separate server or cloud function, then have your Vercel app fetch the scraped data as needed. This separates concerns and avoids the Puppeteer deployment complexities on Vercel itself.
These solutions add some complexity but can be more reliable for production use cases in my experience.