I need help finding a quick headless browser solution for Node.js. My goal is simple: visit multiple websites and grab specific global JavaScript variables from the window object, like window.myVariable.
Right now I’m using Selenium with headless Chrome on my Ubuntu server. It works fine but takes way too long - around 1.5 to 3 seconds per site. Since I need to check tons of websites regularly, this speed is killing my workflow.
// Current slow approach with Selenium
const { Builder } = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');

async function getGlobalVar(url, varName) {
  // A fresh browser is launched (and torn down) for every single URL
  const options = new chrome.Options().headless();
  const driver = await new Builder()
    .forBrowser('chrome')
    .setChromeOptions(options)
    .build();
  await driver.get(url);
  const result = await driver.executeScript(`return window.${varName}`);
  await driver.quit();
  return result;
}
I don’t need page rendering, images, or fancy DOM manipulation. Just need to load the page enough so JavaScript runs and I can access those window variables. I looked at PhantomJS and CasperJS but they seem dead now.
What’s the fastest modern alternative for this basic task?
Have you considered switching to Puppeteer or, even better, Playwright? I had a similar bottleneck when scraping global variables from hundreds of sites daily. The game changer for me was switching to Playwright and reusing a single browser instead of launching a fresh instance every time. With Playwright, you can launch one browser instance and reuse contexts or pages, which dramatically cuts down the startup overhead. I'm now getting consistent sub-500ms times for simple variable extraction. The key is keeping the browser alive between requests and just creating new pages when needed. Another approach worth testing is a lightweight JavaScript engine like QuickJS, through the quickjs npm package, if the sites don't rely on browser-specific APIs. Keep in mind that a bare engine has no window, document, or page loading of its own, so you would have to fetch the scripts yourself. For pure JavaScript execution without DOM complexity, it can be significantly faster than a full headless browser.
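To make the "one browser, many pages" idea concrete, here is a minimal sketch, not a tuned implementation. It assumes Playwright is installed (npm install playwright) and that you pass in its chromium object; readGlobals, the { value } / { error } result shape, and the 10-second timeout are illustrative choices, not part of Playwright.

```javascript
// Sketch: launch ONE browser, then use a short-lived context per URL.
// `browserType` is expected to be Playwright's `chromium` object,
// e.g. `const { chromium } = require('playwright')`.
async function readGlobals(browserType, urls, varName, timeoutMs = 10000) {
  const browser = await browserType.launch({ headless: true });
  const results = {};
  try {
    for (const url of urls) {
      // A fresh context isolates cookies and storage between sites
      const context = await browser.newContext();
      try {
        const page = await context.newPage();
        // 'domcontentloaded' returns before images and other late assets finish
        await page.goto(url, { waitUntil: 'domcontentloaded', timeout: timeoutMs });
        const value = await page.evaluate((name) => window[name], varName);
        results[url] = { value };
      } catch (err) {
        // One failing site should not abort the whole batch
        results[url] = { error: err.message };
      } finally {
        await context.close();
      }
    }
  } finally {
    await browser.close();
  }
  return results;
}

// Usage (hypothetical URLs):
//   const { chromium } = require('playwright');
//   readGlobals(chromium, ['https://example.com'], 'myVariable').then(console.log);
```

If a variable is set by a script that runs late, 'domcontentloaded' may be too early; waiting for 'load' is the slower but safer option.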
jsdom might be exactly what you're looking for! It's way lighter than a full browser since it doesn't render anything visual; it just creates a DOM environment where JS can run. I've used it for similar variable extraction and it's usually under 200ms per site. The only downside is that some complex JS might not work perfectly, but for basic window vars it should be fine.
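A rough sketch of the jsdom route, assuming jsdom is installed (npm install jsdom) and that you pass in its JSDOM class. readGlobalWithJsdom and the 5-second timeout are made-up names and values for illustration. Note that runScripts: 'dangerously' executes the site's code inside your Node.js process, and jsdom's own documentation warns that this is not a secure sandbox, so treat it with care on untrusted sites.

```javascript
// Sketch: load a page in jsdom and read a global once its scripts have run.
// `JSDOM` is expected to be the class exported by the jsdom package,
// e.g. `const { JSDOM } = require('jsdom')`.
async function readGlobalWithJsdom(JSDOM, url, varName, timeoutMs = 5000) {
  const dom = await JSDOM.fromURL(url, {
    runScripts: 'dangerously', // execute the page's scripts
    resources: 'usable',       // fetch subresources such as external scripts
  });
  try {
    const { window } = dom;
    if (window.document.readyState !== 'complete') {
      // Wait for the load event, but give up after timeoutMs
      await new Promise((resolve) => {
        const timer = setTimeout(resolve, timeoutMs);
        window.addEventListener('load', () => {
          clearTimeout(timer);
          resolve();
        }, { once: true });
      });
    }
    return window[varName];
  } finally {
    dom.window.close(); // stops the page's timers and frees the window
  }
}

// Usage (hypothetical URL):
//   const { JSDOM } = require('jsdom');
//   readGlobalWithJsdom(JSDOM, 'https://example.com', 'myVariable').then(console.log);
```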
I ran into this exact performance problem about six months ago when extracting config variables from client sites. The solution that worked best for me was combining Puppeteer with some smart optimizations. First, skip unnecessary resources like images, CSS, and fonts. I was passing the --disable-web-security --disable-features=VizDisplayCompositor flags, but those don't actually block any resources; request interception is what does. Second, set a very short timeout for page loads, since you only need the initial JavaScript to execute. Most importantly, I found that running multiple browser instances in parallel, rather than processing sites sequentially, gave me the biggest speed boost. With 5-10 concurrent Puppeteer instances, I dropped from 2+ seconds per site to around 400-600ms on average. The trick is finding the right number of concurrent instances without overwhelming your server resources. Also consider implementing a simple cache for sites you check repeatedly, since window variables often remain static for extended periods.
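Here is one way the resource blocking, short timeout, and bounded parallelism could look, as a sketch under assumptions rather than the exact setup described above. It assumes Puppeteer is installed (npm install puppeteer) and that you pass in a launched Browser. readGlobal, mapWithConcurrency, and the blocked-type list are hypothetical names and choices. For simplicity it runs several pages inside one browser; the same limiter works if the worker launches a separate browser per task instead.

```javascript
// Resource types to skip; documents and scripts are enough to populate window.*
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font', 'media']);

// `browser` is expected to be a Puppeteer Browser,
// e.g. `await require('puppeteer').launch()`.
async function readGlobal(browser, url, varName, timeoutMs = 5000) {
  const page = await browser.newPage();
  try {
    await page.setRequestInterception(true);
    page.on('request', (req) => {
      if (BLOCKED_TYPES.has(req.resourceType())) req.abort();
      else req.continue();
    });
    // Short timeout: only the initial scripts need to run
    await page.goto(url, { waitUntil: 'domcontentloaded', timeout: timeoutMs });
    return await page.evaluate((name) => window[name], varName);
  } finally {
    await page.close();
  }
}

// Runs `worker` over `items` with at most `limit` calls in flight.
// Results keep input order; failures are captured instead of thrown.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { value: await worker(items[i]) };
      } catch (err) {
        results[i] = { error: err.message };
      }
    }
  }
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}

// Usage (hypothetical URLs, 5 pages at a time):
//   const browser = await require('puppeteer').launch();
//   const out = await mapWithConcurrency(urls, 5, (u) => readGlobal(browser, u, 'myVariable'));
//   await browser.close();
```

The limit is the knob to tune: raise it until CPU or memory on the server becomes the bottleneck, then back off.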