I’m learning Puppeteer and I’m stuck on something. I’m trying to get the view count from a YouTube video. I found a way to do it in the Chrome console:
document.querySelector('#count > yt-view-count-renderer > span.view-count.style-scope.yt-view-count-renderer').innerText
This works fine in the console, but when I use it in my Puppeteer script, it can’t find the element. Here’s my code:
const puppeteer = require('puppeteer')

async function getViews() {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto('https://www.youtube.com/watch?v=dQw4w9WgXcQ')
  await page.waitForTimeout(1000)
  const result = await page.evaluate(() => {
    const viewCount = document.querySelector('#count > yt-view-count-renderer > span.view-count.style-scope.yt-view-count-renderer').innerText
    return { viewCount }
  })
  await browser.close()
  return result
}

getViews().then(console.log)
I ended up using the ytInitialData object to get the view count, but I’m confused why my original method didn’t work. Can someone explain what’s going on? Thanks!
I’ve dealt with similar Puppeteer challenges before. The issue likely stems from YouTube’s dynamic content loading. While the console interacts with a fully loaded page, Puppeteer may attempt to access elements before they’re ready.
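To address this, consider using page.waitForNetworkIdle() instead of a fixed timeout. It resolves once the page has made no network requests for a short idle window (500 ms by default), which is a better signal than an arbitrary one-second delay, although it does not by itself guarantee that a particular element has rendered. Additionally, YouTube’s DOM structure can be volatile, so it’s worth exploring more robust selectors or data attributes for increased reliability.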
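A minimal sketch of the network-idle approach, in case it helps. The helper name waitForQuietNetwork and the timeout values are my own choices for illustration; page is assumed to be a Puppeteer Page. A page that streams video may never go fully quiet, so the sketch treats a timeout as "carry on" rather than as a failure:

```javascript
// Sketch only: wait for the network to settle, but do not crash if it never does.
// `page` is assumed to be a Puppeteer Page; the name and defaults are illustrative.
async function waitForQuietNetwork(page, { idleTime = 500, timeout = 10000 } = {}) {
  try {
    // Resolves after `idleTime` ms without network requests, rejects after `timeout` ms.
    await page.waitForNetworkIdle({ idleTime, timeout });
    return true;
  } catch (err) {
    // Most likely a timeout because the page kept making requests.
    return false;
  }
}
```

You would call it right after page.goto(...) and then still check that the element exists before reading it, since a quiet network does not prove the view count has rendered.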
Remember, YouTube may have anti-scraping measures in place. It’s crucial to respect their terms of service and implement appropriate rate limiting in your script to avoid potential issues.
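For the rate-limiting part, one simple option is to process URLs one at a time with a pause in between. This is only a sketch: scrapeSequentially, sleep, and the 2000 ms default are names and values I made up for illustration, and scrapeOne stands for whatever function you already have (for example getViews adapted to take a URL).

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Sketch only: call `scrapeOne(url)` for each URL in order, pausing between calls.
// `scrapeOne` is a placeholder for your own async scraping function.
async function scrapeSequentially(urls, scrapeOne, delayMs = 2000) {
  const results = [];
  for (let i = 0; i < urls.length; i++) {
    if (i > 0) await sleep(delayMs); // pause between requests, not before the first one
    results.push(await scrapeOne(urls[i]));
  }
  return results;
}
```

A fixed delay is the bluntest form of rate limiting; it keeps the script from opening many pages at once, which is usually the main thing to avoid.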
Hey, I’ve run into this too. The problem is that YouTube loads stuff dynamically, so Puppeteer might not see everything right away. Try using page.waitForSelector() instead of that timeout; it’ll wait until the element actually shows up. Also, YouTube’s layout changes sometimes, so maybe use data attributes if you can find them. They tend to be more stable.
I’ve encountered similar issues with Puppeteer before, and I think I know what’s going on here. The problem likely stems from the dynamic nature of YouTube’s content loading. When you’re using the Chrome console, you’re interacting with a fully loaded page. However, with Puppeteer, the page might not have finished loading all its elements by the time your script tries to access them.
To fix this, you could try increasing your wait time or, better yet, use page.waitForSelector() instead of a fixed timeout. This ensures the element is actually present before you try to access it. Here’s how you might modify your code:
await page.waitForSelector('#count > yt-view-count-renderer > span.view-count')
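To show the wait and the read together, here is a small sketch. The helper name readTextWhenReady and the 15-second default are assumptions of mine, not anything from Puppeteer; page is assumed to be a Puppeteer Page. It relies on waitForSelector resolving to a handle for the matched element, so the text can be read from that handle directly:

```javascript
// Sketch only: wait for an element, then return its trimmed innerText.
// `page` is assumed to be a Puppeteer Page; the name and default timeout are illustrative.
async function readTextWhenReady(page, selector, timeout = 15000) {
  // Resolves once an element matching `selector` is in the DOM, rejects on timeout.
  const handle = await page.waitForSelector(selector, { timeout });
  // The callback runs in the page, like the innerText lookup in the question.
  return handle.evaluate((el) => el.innerText.trim());
}
```

In the original script this would replace both the fixed timeout and the page.evaluate call, e.g. const viewCount = await readTextWhenReady(page, '#count > yt-view-count-renderer > span.view-count'). If the selector never matches, the call rejects with a timeout error instead of failing on a null element.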
Also, YouTube’s DOM structure can change, so using more robust selectors or data attributes (if available) could make your script more resilient to updates.
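One way to make the script less dependent on a single selector is to try a short list of candidates in order. This is a sketch under assumptions: firstMatchingText is a name I made up, page is assumed to be a Puppeteer Page, and whichever fallback selectors you pass in are guesses you would need to verify against the live page.

```javascript
// Sketch only: return the trimmed innerText of the first selector that matches,
// or null if none do. `page` is assumed to be a Puppeteer Page.
async function firstMatchingText(page, selectors) {
  for (const selector of selectors) {
    const handle = await page.$(selector); // resolves to null when nothing matches
    if (handle) {
      return handle.evaluate((el) => el.innerText.trim());
    }
  }
  return null;
}
```

Note that page.$ does not wait, so this belongs after the page has had time to render (for instance after a waitForSelector on a container element). Put the most specific selector first and the looser ones after it.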
Lastly, remember that YouTube might have measures in place to detect and block automated scraping, so be mindful of rate limits and terms of service when using such scripts.