Getting return values from page.evaluate() function in Puppeteer

I’m working on a web scraping project with Puppeteer and having trouble extracting data from the page.evaluate() method. The function runs fine inside the browser context and I can see the results in the browser console, but I can’t access those values in my Node.js code.

Here’s what I’m trying to do:

let items = []
const fetchData = async() => {
    return await page.evaluate(async () => {
        console.log(await new Promise(resolve => {
            var currentHeight = 0
            var step = 150
            var scrollTimer = setInterval(() => {
                cards = document.querySelectorAll("div.container-class > .video-item > div.video-wrapper")
                console.log(`Found ${cards.length} items`)
                var totalHeight = document.documentElement.scrollHeight
                window.scrollBy(0, step)
                currentHeight += step
                if(currentHeight >= totalHeight || cards.length >= 25){
                    clearInterval(scrollTimer)
                    resolve(Array.from(cards))
                }
            }, 600)
        }))
    })
}
items = await fetchData()
console.log(items)

The promise inside the evaluate block shows the correct array in browser console, but the items variable stays empty when I try to use it outside. What am I doing wrong here?

Your page.evaluate() isn’t returning the data. You’re logging the promise result but not actually returning it.
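The difference is easy to reproduce without a browser. Here is a minimal sketch in plain Node.js, where logsOnly, returnsValue, and compare are hypothetical stand-ins for the callback passed to page.evaluate() and its caller (no Puppeteer involved):

```javascript
// Hypothetical stand-ins for the callback passed to page.evaluate().
// Only the value the callback returns is handed back to the caller.

// Logs the resolved value but has no return statement.
const logsOnly = async () => {
    console.log(await Promise.resolve(['a', 'b']))
}

// Returns the resolved value, so the caller receives it.
const returnsValue = async () => {
    return await Promise.resolve(['a', 'b'])
}

const compare = async () => ({
    logged: await logsOnly(),       // undefined
    returned: await returnsValue(), // ['a', 'b']
})

compare().then(result => console.log(result))
```

An async function without a return statement resolves to undefined no matter what it logged along the way.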

Also, your resolve is passing Array.from(cards), but those are DOM elements, which can’t be serialized back to Node.js. You need to extract the actual data first.

Here’s the fix:

const fetchData = async() => {
    return await page.evaluate(async () => {
        const result = await new Promise(resolve => {
            var currentHeight = 0
            var step = 150
            var scrollTimer = setInterval(() => {
                const cards = document.querySelectorAll("div.container-class > .video-item > div.video-wrapper")
                var totalHeight = document.documentElement.scrollHeight
                window.scrollBy(0, step)
                currentHeight += step
                if(currentHeight >= totalHeight || cards.length >= 25){
                    clearInterval(scrollTimer)
                    // Extract serializable data from DOM elements
                    const cardData = Array.from(cards).map(card => ({
                        text: card.textContent,
                        href: card.querySelector('a')?.href || '',
                        // add other properties you need
                    }))
                    resolve(cardData)
                }
            }, 600)
        })
        return result // This was missing!
    })
}

I’ve hit this exact issue before. The problem’s simple - you’re missing the return statement for your Promise result. The console.log shows the data in your browser but doesn’t pass it back to Node.js.

There’s another issue I ran into early on. Using setInterval for scrolling is unreliable. Sometimes the page hasn’t finished loading new content before the next scroll, so you miss elements.

This approach worked much better for me:

const fetchData = async() => {
    return await page.evaluate(async () => {
        return await new Promise(resolve => {
            const selector = "div.container-class > .video-item > div.video-wrapper";
            // Safety cap (an assumed value) so the loop also ends on pages with few items
            const maxAttempts = 40;
            let previousCount = 0;
            let attempts = 0;

            // Convert DOM elements to plain, serializable objects
            const extract = cards => Array.from(cards).slice(0, 25).map(card => ({
                text: card.textContent.trim(),
                // extract other needed properties
            }));

            const checkAndScroll = () => {
                const cards = document.querySelectorAll(selector);
                const stalled = cards.length === previousCount && cards.length >= 10;

                // Finish when enough items exist, no new items loaded, or the cap is hit
                if (cards.length >= 25 || stalled || ++attempts > maxAttempts) {
                    resolve(extract(cards));
                    return;
                }

                previousCount = cards.length;
                window.scrollBy(0, 300);
                setTimeout(checkAndScroll, 800);
            };
            checkAndScroll();
        });
    });
}

Instead of scrolling on a fixed interval no matter what happened, this pauses after each scroll and checks whether the item count grew before continuing, which gave me far more reliable results.
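The stopping rule can also be pulled away from the DOM so it is easier to reason about. This is a rough sketch rather than the code above verbatim: scrollUntilStable, getCount, scrollOnce, wait, and the option names are all hypothetical, and the idle-round and max-round limits are assumptions added so the loop always ends.

```javascript
// Hypothetical helper: scroll until the item count stops growing,
// a target count is reached, or a round limit is hit.
// getCount, scrollOnce and wait are injected, so the logic has no DOM dependency.
const scrollUntilStable = async ({
    getCount, scrollOnce, wait,
    target = 25, maxIdleRounds = 2, maxRounds = 50,
}) => {
    let previousCount = getCount();
    let idleRounds = 0;
    for (let round = 0; round < maxRounds && previousCount < target; round++) {
        scrollOnce();
        await wait();
        const count = getCount();
        // Track consecutive rounds in which scrolling produced no new items.
        idleRounds = count > previousCount ? 0 : idleRounds + 1;
        previousCount = count;
        if (idleRounds >= maxIdleRounds) break;
    }
    // Number of items worth keeping, capped at the target.
    return Math.min(previousCount, target);
};

// Simulated page: each scroll loads 4 more items, up to 10 in total.
let loaded = 0;
scrollUntilStable({
    getCount: () => loaded,
    scrollOnce: () => { loaded = Math.min(loaded + 4, 10); },
    wait: () => new Promise(resolve => setTimeout(resolve, 10)),
}).then(kept => console.log(`Keeping ${kept} items`));
```

Because the function passed to page.evaluate() runs in the browser, a helper like this would have to be defined inside that callback. There, getCount could return document.querySelectorAll(...).length and scrollOnce could call window.scrollBy(0, 300).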

You’re using console.log() instead of return in your evaluate function. When you do console.log(await new Promise(...)), you’re just printing to the browser console but not returning anything to your Node.js context.

I hit this exact issue when I started with Puppeteer. Remember that page.evaluate() can only return serializable data: strings, numbers, booleans, and plain objects or arrays built from them. DOM elements and functions don’t make it back.
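A JSON round trip in plain Node.js gives a rough feel for what survives. This is only an approximation of how values travel back from the browser, not Puppeteer’s actual mechanism, and FakeElement is a hypothetical stand-in for a DOM element, whose properties live in prototype getters rather than on the object itself:

```javascript
// Hypothetical stand-in for a DOM element: data is exposed through a
// prototype getter, so the instance has no own enumerable properties.
class FakeElement {
    get textContent() { return 'Video title' }
}

// Rough approximation of sending a value from the browser to Node.js.
const roundTrip = value => JSON.parse(JSON.stringify(value))

const cards = [new FakeElement(), new FakeElement()]

// The elements themselves come through as empty objects.
const asElements = roundTrip(cards)

// Plain objects built from their properties come through intact.
const asData = roundTrip(cards.map(card => ({ text: card.textContent })))

console.log(asElements, asData)
```

The second form is what the map() call in the answers does: it copies the values you care about into plain objects before they leave the page.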

Try this:

const fetchData = async() => {
    return await page.evaluate(async () => {
        return await new Promise(resolve => {
            var currentHeight = 0
            var step = 150
            var scrollTimer = setInterval(() => {
                const cards = document.querySelectorAll("div.container-class > .video-item > div.video-wrapper")
                var totalHeight = document.documentElement.scrollHeight
                window.scrollBy(0, step)
                currentHeight += step
                if(currentHeight >= totalHeight || cards.length >= 25){
                    clearInterval(scrollTimer)
                    // Extract only the data you need
                    const cardData = Array.from(cards).map(card => ({
                        innerHTML: card.innerHTML,
                        className: card.className
                        // add whatever properties you actually need
                    }))
                    resolve(cardData)
                }
            }, 600)
        })
    })
}

Main changes: use return instead of console.log and extract primitive data from DOM elements before resolving the promise.