I need to capture screenshots of multiple websites automatically. I have all the URLs stored in a text file and want to process them one by one using Puppeteer.
const puppeteer = require('puppeteer');

async function captureWebpage(websiteUrl, fileName) {
    let browser = await puppeteer.launch({ headless: false });
    let tab = await browser.newPage();
    await tab.goto(websiteUrl);
    await tab.setViewport({ width: 1280, height: 800 });
    await tab.waitFor(2000);
    console.log('capturing webpage screenshot');
    await tab.screenshot({ path: `${fileName}.png`, fullPage: true });
    await tab.close();
    await browser.close();
}

async function processUrls() {
    console.log('starting process');
    var fs = require("fs");
    var urlList = fs.readFileSync("urls.txt").toString().split("\n");
    for (var index = 0; index < urlList.length; ++index) {
        captureWebpage(urlList[index], "screenshot" + index);
        console.log("screenshot" + index + " done");
        await tab.waitFor(3000);
    }
}

processUrls();
When I run this code I get an error message:
ReferenceError: tab is not defined at processUrls
The error happens in the main loop and I cannot figure out why the tab variable is not accessible. Can someone help me understand what is going wrong here?
The error occurs because tab is declared with let inside captureWebpage, so it exists only in that function's scope and processUrls cannot see it. Replace the await tab.waitFor(3000) call in the loop with await new Promise(resolve => setTimeout(resolve, 3000)), or remove it altogether, since captureWebpage already waits before taking the screenshot. The setTimeout form also avoids waitFor, which has been deprecated in newer Puppeteer versions. In addition, put await in front of the captureWebpage(urlList[index], "screenshot" + index) call so that each screenshot finishes before the loop moves on to the next URL. Finally, launching the browser once outside the loop and reusing it for every URL avoids paying the browser startup cost on each screenshot.
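To make the last suggestion concrete, here is a minimal sketch of reusing one browser for all URLs. It is not the only way to structure this. The function name captureAllShared and the launchBrowser parameter are made up for illustration: the launcher is passed in so the loop can be tried without Puppeteer installed, and with Puppeteer it would be something like () => require('puppeteer').launch().

```javascript
// Sketch: one shared browser, one tab per URL, strictly sequential.
// captureAllShared and launchBrowser are hypothetical names. With Puppeteer,
// launchBrowser would be: () => require('puppeteer').launch()
async function captureAllShared(urls, launchBrowser) {
  const browser = await launchBrowser(); // started once, not once per URL
  const saved = [];
  try {
    for (let index = 0; index < urls.length; index++) {
      const path = `screenshot${index}.png`;
      const tab = await browser.newPage();
      try {
        await tab.setViewport({ width: 1280, height: 800 });
        await tab.goto(urls[index]);
        await tab.screenshot({ path, fullPage: true });
        saved.push(path);
      } finally {
        await tab.close(); // close the tab even if this page failed
      }
    }
  } finally {
    await browser.close(); // always release the browser process
  }
  return saved;
}
```

In this sketch a failing page still stops the whole run, because the error propagates out of the loop after the tab and browser are closed. If you would rather skip bad URLs, catch the error inside the loop instead.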
Your tab variable only exists inside captureWebpage, so you can’t access it in processUrls. Just remove that await tab.waitFor(3000) line, since you’re already waiting in captureWebpage. And make sure to put await before the captureWebpage call, or you’ll trigger all the screenshots at once, which might crash your system.
You’re encountering this error because tab only exists within the captureWebpage function, making it inaccessible in processUrls. Instead of await tab.waitFor(3000), use await new Promise(resolve => setTimeout(resolve, 3000)). You might not need that wait at all, though, since captureWebpage already pauses before taking the screenshot. More importantly, put await before the captureWebpage call. Without it, the loop does not wait for each capture to finish, so every browser launch and screenshot starts at nearly the same time, which can exhaust memory and CPU. Change the call to await captureWebpage(urlList[index], "screenshot" + index) for sequential processing.
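For reference, a minimal sketch of the corrected loop, assuming you keep captureWebpage as it is. The helper names sleep, parseUrlList and captureSequentially are made up for illustration, and the capture function is passed in as a parameter so the loop can be tried without launching a browser. The sketch also skips blank lines, because split("\n") yields an empty string when urls.txt ends with a newline.

```javascript
// Promise-based pause that does not depend on any Puppeteer object.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Split file contents into URLs, dropping blank lines and stray whitespace.
function parseUrlList(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Run capture(url, fileName) for each URL, one at a time.
// capture is whatever async function takes the screenshot,
// for example the captureWebpage function from the question.
async function captureSequentially(urls, capture, delayMs = 0) {
  const done = [];
  for (let index = 0; index < urls.length; index++) {
    const fileName = 'screenshot' + index;
    await capture(urls[index], fileName); // wait before moving on
    console.log(fileName + ' done');
    done.push(fileName);
    if (delayMs > 0) {
      await sleep(delayMs); // optional pause between pages
    }
  }
  return done;
}
```

With the code from the question, the call would look like captureSequentially(parseUrlList(fs.readFileSync("urls.txt", "utf8")), captureWebpage, 3000). Pass 0 as the last argument to drop the extra pause.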