To initiate a browser session in a controlled Node.js environment, you can use the following code snippet:
const puppeteer = require('puppeteer');

(async () => {
  const browserInstance = await puppeteer.launch({
    headless: false
  });
  const newPage = await browserInstance.newPage();
  await newPage.goto('https://www.example.com/');
  await newPage.setViewport({ width: 1280, height: 800 });
  await newPage.waitForSelector('.selector-class');
  await newPage.click('.selector-class');
  await browserInstance.close();
})();
While the browser is running, it typically produces various JSON outputs. How can I capture and save them? Is it better to use Puppeteer directly, or should I use Chrome’s Developer Tools? There seem to be two distinct approaches.
To store JSON data, capture it directly through Puppeteer’s APIs and write it to disk from the same Node.js script. This keeps everything in one place and avoids an extra layer such as the Developer Tools.
Here’s a practical way to do it:
const fs = require('fs');
const puppeteer = require('puppeteer');

(async () => {
  const browserInstance = await puppeteer.launch({ headless: true });
  try {
    const newPage = await browserInstance.newPage();
    await newPage.goto('https://www.example.com/');

    // Extract JSON data from the page. The callback runs inside the page,
    // so its return value must be serializable.
    const jsonData = await newPage.evaluate(() => {
      // Sample: assume the data is stored in a global variable on the page.
      return window.dataVariable;
    });

    // Write the JSON data to a file, pretty-printed with 2-space indentation.
    fs.writeFileSync('output.json', JSON.stringify(jsonData, null, 2));
  } finally {
    // Close the browser even if one of the steps above fails.
    await browserInstance.close();
  }
})();
Steps Explained:
- Navigate and Extract: Use Puppeteer’s evaluate function to extract JSON data. This method runs within the page context.
- Save Data: Use Node.js’s fs module to write the JSON data to a local file.
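One caveat with the two steps above: evaluate resolves to undefined when the page variable does not exist or its value cannot be serialized, and JSON.stringify(undefined) is also undefined, which makes fs.writeFileSync throw a confusing error. A minimal sketch of a guard follows; the name writeJsonFile and its error message are hypothetical, not part of Puppeteer or Node.js.

```javascript
const fs = require('fs');

// Hypothetical helper: serialize `data` and write it to `filePath`,
// failing with a clear message when there is nothing to save.
function writeJsonFile(filePath, data) {
  // JSON.stringify returns undefined for undefined, functions, and symbols.
  const text = JSON.stringify(data, null, 2);
  if (text === undefined) {
    throw new Error(`No JSON-serializable data to write to ${filePath}`);
  }
  fs.writeFileSync(filePath, text + '\n');
  return text.length;
}
```

In the script above, this would replace the fs.writeFileSync line, for example writeJsonFile('output.json', jsonData).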
Capturing the data with evaluate and writing it with fs keeps the whole workflow inside one Node.js script, with no manual export from the Developer Tools. That makes it a good fit for backend automation.
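If the JSON you are after arrives as network responses (XHR or fetch calls) rather than sitting in a page variable, which is an assumption about the question, Puppeteer's 'response' event is another option. The sketch below is only one way to do it; collectJsonResponses, save, and the output file name are hypothetical names. It relies on the response object's headers(), url(), and json() methods.

```javascript
const fs = require('fs');

// Hypothetical helper: records the body of every JSON response the page receives.
// `page` is expected to be a Puppeteer Page (any object with .on('response', fn)).
function collectJsonResponses(page) {
  const captured = [];
  const pending = [];

  page.on('response', (response) => {
    // Header names are lower-case; skip anything that is not JSON.
    const contentType = response.headers()['content-type'] || '';
    if (!contentType.includes('application/json')) return;

    // response.json() is asynchronous, so keep the promise to await later.
    const task = response
      .json()
      .then((body) => captured.push({ url: response.url(), body }))
      .catch(() => {
        // Empty or unparseable bodies are skipped.
      });
    pending.push(task);
  });

  return {
    // Waits for in-flight bodies, then writes everything to one file.
    async save(filePath) {
      await Promise.all(pending);
      fs.writeFileSync(filePath, JSON.stringify(captured, null, 2));
      return captured.length;
    },
  };
}
```

To use it, call const collector = collectJsonResponses(newPage) before newPage.goto(...) so that no early responses are missed, then await collector.save('responses.json') before closing the browser.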