I’m working on a web scraping project with Puppeteer and having trouble extracting data from the page.evaluate() method. The function runs fine inside the browser context and I can see the results in the browser console, but I can’t access those values in my Node.js code.
The promise inside the evaluate block shows the correct array in browser console, but the items variable stays empty when I try to use it outside. What am I doing wrong here?
Your page.evaluate() isn’t returning the data. You’re logging the promise result but not actually returning it.
Also, your resolve() is passing Array.from(cards), which is an array of DOM elements. Those can’t be serialized back to Node.js, so you need to extract the actual data first.
Here’s the fix:
const fetchData = async () => {
  return await page.evaluate(async () => {
    const result = await new Promise(resolve => {
      let currentHeight = 0
      const step = 150
      const scrollTimer = setInterval(() => {
        // Declare cards properly - the original assigned an implicit global
        const cards = document.querySelectorAll("div.container-class > .video-item > div.video-wrapper")
        const totalHeight = document.documentElement.scrollHeight
        window.scrollBy(0, step)
        currentHeight += step
        if (currentHeight >= totalHeight || cards.length >= 25) {
          clearInterval(scrollTimer)
          // Extract serializable data from the DOM elements
          const cardData = Array.from(cards).map(card => ({
            text: card.textContent,
            href: card.querySelector('a')?.href || '',
            // add other properties you need
          }))
          resolve(cardData)
        }
      }, 600)
    })
    return result // This was missing!
  })
}
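With the return in place, the data arrives in Node as plain objects, so your items variable will actually be populated. A minimal usage sketch:

const items = await fetchData()
console.log(items.length, items[0]) // now logs in Node, not just the browser console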
Honestly though, complex scraping logic like this gets messy fast. I’ve been using Latenode to automate these workflows instead. You can build the entire scraping pipeline visually, handle data extraction, and set up scheduling without wrestling with Puppeteer code.
I’ve hit this exact issue before. The problem’s simple - you’re missing the return statement for your Promise result. The console.log shows the data in your browser but doesn’t pass it back to Node.js.
There’s another issue I ran into early on. Using setInterval for scrolling is unreliable. Sometimes the page hasn’t finished loading new content before the next scroll, so you miss elements.
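A more deterministic pattern is to drive the scroll from Node and only advance once the card count has actually grown (or the page has stopped getting taller). This is a sketch, not a drop-in: the selector and the 25-card cap are carried over from the snippet above, and the 500 ms delay is an assumption you’d tune per site.

const scrapeWithScroll = async (page, maxCards = 25) => {
  const selector = "div.container-class > .video-item > div.video-wrapper"
  let previousCount = 0
  while (true) {
    await page.evaluate(() => window.scrollBy(0, window.innerHeight))
    // Give lazy-loaded content a moment to render before counting
    await new Promise(resolve => setTimeout(resolve, 500))
    const count = await page.$$eval(selector, els => els.length)
    const atBottom = await page.evaluate(
      () => window.innerHeight + window.scrollY >= document.documentElement.scrollHeight
    )
    // Stop once we have enough cards, or we hit bottom and nothing new appeared
    if (count >= maxCards || (atBottom && count === previousCount)) break
    previousCount = count
  }
  return page.$$eval(selector, cards =>
    cards.map(card => ({
      text: card.textContent.trim(),
      href: card.querySelector('a')?.href ?? ''
    }))
  )
}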
You’re using console.log() instead of return in your evaluate function. When you do console.log(await new Promise(...)), you’re just printing to the browser console but not returning anything to your Node.js context.
I hit this exact issue when I started with Puppeteer. Remember that page.evaluate() can only return serializable data: DOM elements, functions, and anything else that can’t survive serialization come back as undefined or as empty objects.
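A quick way to see the difference (the 'a' selector here is just for illustration):

// The element itself can't cross the serialization boundary; depending on the
// Puppeteer version this comes back as undefined or as an empty object:
const el = await page.evaluate(() => document.querySelector('a'))

// A string serializes fine, so pull the property out inside the page:
const href = await page.evaluate(() => document.querySelector('a')?.href)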
Try this:
const fetchData = async () => {
  return await page.evaluate(async () => {
    return await new Promise(resolve => {
      let currentHeight = 0
      const step = 150
      const scrollTimer = setInterval(() => {
        const cards = document.querySelectorAll("div.container-class > .video-item > div.video-wrapper")
        const totalHeight = document.documentElement.scrollHeight
        window.scrollBy(0, step)
        currentHeight += step
        if (currentHeight >= totalHeight || cards.length >= 25) {
          clearInterval(scrollTimer)
          // Extract only the data you need
          const cardData = Array.from(cards).map(card => ({
            innerHTML: card.innerHTML,
            className: card.className
            // add whatever properties you actually need
          }))
          resolve(cardData)
        }
      }, 600)
    })
  })
}
Main changes: use return instead of console.log, and extract plain, serializable data from the DOM elements before resolving the promise.
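Side note: once the scrolling is handled, the extraction step on its own can also be written with Puppeteer’s page.$$eval, which runs the callback over every matched node in the page context and serializes the result back (same selector as above):

const cardData = await page.$$eval(
  "div.container-class > .video-item > div.video-wrapper",
  cards => cards.map(card => ({
    text: card.textContent.trim(),
    href: card.querySelector('a')?.href ?? ''
  }))
)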