Setting up random user agents in Puppeteer headless browser

I’m working with a library that generates random user agent data and I need help integrating it with Puppeteer for headless Chrome automation.

I have the user-agents npm package that creates random browser fingerprints, but I’m struggling to properly apply all the generated properties to my headless browser instance.

Currently I can generate the user agent data and it shows something like this:

{
  "browserName": "Netscape",
  "networkInfo": {
    "bandwidth": 8,
    "connectionType": "4g",
    "latency": 50
  },
  "operatingSystem": "Win32",
  "extensionCount": 2,
  "browserVendor": "Google Inc.",
  "userAgentString": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36",
  "windowHeight": 720,
  "windowWidth": 1366,
  "deviceType": "desktop",
  "displayHeight": 900,
  "displayWidth": 1440
}

Here’s my current implementation:

const chrome = require('puppeteer');
const RandomUserAgent = require('user-agents');

const setupBrowserProfile = async (browserPage) => {
  const randomAgent = new RandomUserAgent();
  console.log(randomAgent.toString());
  console.log(JSON.stringify(randomAgent.data, null, 2));
  await browserPage.setUserAgent(randomAgent.toString());
}

(async () => {
  const browserInstance = await chrome.launch({
    args: ['--disable-web-security'],
    headless: true,
  });
  const newPage = await browserInstance.newPage();

  await setupBrowserProfile(newPage);

  const targetSite = 'https://example.com/browser-detection-test';
  await newPage.goto(targetSite);

  await newPage.screenshot({path: '/home/user/test-output.png'});

  await browserInstance.close();
})();

The issue is that I’m only setting the user agent string, but I need to configure other browser properties like viewport dimensions, platform info, and connection details to make it more realistic. How can I apply all these generated properties to the headless browser session?

Also check out setExtraHTTPHeaders(), since sites look at more than just the user agent; headers such as Accept-Language should fit the profile you're claiming. Try page.emulate() instead of setting everything manually, because it applies the viewport and user agent together. Setting the --window-size launch arg to match your generated dimensions also really helped me: it makes it much less obvious that the browser is headless.
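A minimal sketch of that approach, assuming the field names printed in the question (userAgentString, windowWidth, windowHeight). The helper names and the Accept-Language value are made up for illustration, so swap in whatever matches your generated profile.

```javascript
// Hypothetical helpers; field names follow the data printed in the question.

// Launch args so the real browser window matches the generated window size.
function buildLaunchArgs(profile) {
  return [`--window-size=${profile.windowWidth},${profile.windowHeight}`];
}

// page.emulate() takes a user agent and a viewport in one object.
function buildEmulationOptions(profile) {
  return {
    userAgent: profile.userAgentString,
    viewport: { width: profile.windowWidth, height: profile.windowHeight },
  };
}

// Call after browser.newPage() and before page.goto().
async function emulateProfile(page, profile) {
  await page.emulate(buildEmulationOptions(profile));
  // Assumption: an English-language profile; pick a value that fits yours.
  await page.setExtraHTTPHeaders({ 'Accept-Language': 'en-US,en;q=0.9' });
}
```

You would pass buildLaunchArgs(profile) in the args array of launch(), which means the profile has to be generated before the browser starts rather than per page.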

I’ve hit similar fingerprinting issues with scraping. The trick is mapping those generated properties to the right Puppeteer methods rather than stopping at setUserAgent. Use page.setViewport() for windowWidth and windowHeight. For platform and browser vendor, use page.evaluateOnNewDocument() to inject JavaScript that overrides navigator.platform, navigator.vendor, and other navigator properties before any page script runs. For network conditions, Puppeteer has page.emulateNetworkConditions(), which takes an object with download and upload throughput (in bytes per second) and latency (in milliseconds), so convert your bandwidth value, or a rough figure derived from connectionType, into those units. Display dimensions are not covered by the viewport config: deviceScaleFactor only sets the pixel ratio, so override screen.width and screen.height in the same evaluateOnNewDocument() script. Here’s what tripped me up: sites often check whether the window dimensions are plausible for the reported screen resolution, so make sure your displayWidth/displayHeight values are at least as large as the viewport you’re setting. Also, some detection scripts run their checks after page load, so add a page.evaluate() call to verify that your properties actually applied before moving on with your automation.
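Roughly, that mapping could look like the sketch below. It assumes the field names from the question's output, that bandwidth is in megabits per second, and that upload can reuse the download figure. All function names are hypothetical, and as far as I know Puppeteer's network conditions object uses the keys download, upload, and latency.

```javascript
// Field names (operatingSystem, browserVendor, displayWidth, ...) follow the
// data printed in the question; adjust them if your generated data differs.

// Assumption: bandwidth is in megabits per second; Puppeteer wants bytes per second.
function toNetworkConditions(networkInfo) {
  const bytesPerSecond = (networkInfo.bandwidth * 1000 * 1000) / 8;
  return {
    download: bytesPerSecond,
    upload: bytesPerSecond, // assumption: no separate upload figure is generated
    latency: networkInfo.latency, // milliseconds
  };
}

// Pick only the values the in-page script needs (must be JSON-serializable).
function toPageOverrides(profile) {
  return {
    platform: profile.operatingSystem,
    vendor: profile.browserVendor,
    screenWidth: profile.displayWidth,
    screenHeight: profile.displayHeight,
  };
}

// Runs inside the page before any site script; it only sees its argument.
function installOverrides(overrides) {
  const define = (target, key, value) =>
    Object.defineProperty(target, key, { get: () => value });
  define(navigator, 'platform', overrides.platform);
  define(navigator, 'vendor', overrides.vendor);
  define(screen, 'width', overrides.screenWidth);
  define(screen, 'height', overrides.screenHeight);
}

// Hypothetical wrapper: call it after newPage() and before goto().
async function applyProfile(page, profile) {
  await page.setUserAgent(profile.userAgentString);
  await page.setViewport({ width: profile.windowWidth, height: profile.windowHeight });
  await page.evaluateOnNewDocument(installOverrides, toPageOverrides(profile));
  await page.emulateNetworkConditions(toNetworkConditions(profile.networkInfo));
}

// After navigation, read values back to confirm the overrides took effect.
async function verifyProfile(page, profile) {
  const seen = await page.evaluate(() => ({
    platform: navigator.platform,
    screenWidth: screen.width,
  }));
  return seen.platform === profile.operatingSystem &&
    seen.screenWidth === profile.displayWidth;
}
```

In the question's code, applyProfile(newPage, profile) would replace setupBrowserProfile(newPage), with verifyProfile() called after goto(). Overriding getters this way is itself detectable by stricter scripts, so treat it as a starting point.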

You’re missing a bunch of properties that need manual configuration beyond the user agent string. Use page.setViewport() for the window dimensions from your generated data. The tricky part is spoofing navigator properties: you’ll need page.evaluateOnNewDocument() to override navigator.platform, navigator.vendor, and navigator.hardwareConcurrency before page scripts run. For network throttling, use page.emulateNetworkConditions() with your bandwidth and latency values, converted to the units it expects. Here’s a gotcha I hit: many fingerprinting libraries check whether properties agree with each other, so make sure your OS matches what the user agent claims and that the window fits inside the reported screen dimensions. Also add random delays between actions, since perfectly timed automation screams bot. The extensionCount property is trickier. You might need to inject fake extension objects into the page context, but that’s more advanced stuff.
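Here is one way the consistency check and random timing could be sketched. The function names are made up, the field names are the ones from the question's output, and the platform strings are just the common values for desktop Chrome, so extend the table for whatever profiles you generate.

```javascript
// Hypothetical helpers; field names follow the data printed in the question.

// Guess which navigator.platform a user agent string implies.
// Returns null when unsure (e.g. mobile), so the check is skipped.
function expectedPlatform(userAgent) {
  if (/Android|iPhone/.test(userAgent)) return null;
  if (/Windows NT/.test(userAgent)) return 'Win32';
  if (/Macintosh/.test(userAgent)) return 'MacIntel';
  if (/Linux/.test(userAgent)) return 'Linux x86_64';
  return null;
}

// True when the OS matches the user agent and the window fits on the screen.
function isConsistent(profile) {
  const platform = expectedPlatform(profile.userAgentString);
  const platformOk = platform === null || platform === profile.operatingSystem;
  const sizeOk =
    profile.windowWidth <= profile.displayWidth &&
    profile.windowHeight <= profile.displayHeight;
  return platformOk && sizeOk;
}

// Random number of milliseconds in the range [minMs, maxMs).
function randomBetween(minMs, maxMs) {
  return minMs + Math.random() * (maxMs - minMs);
}

// Await this between actions so the timing is not perfectly regular.
function randomDelay(minMs, maxMs) {
  return new Promise((resolve) => setTimeout(resolve, randomBetween(minMs, maxMs)));
}
```

One option is to keep generating profiles until isConsistent() returns true, then call something like await randomDelay(300, 1200) between clicks and navigations. Those delay bounds are arbitrary.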