Extracting text from div elements using Puppeteer

I’m having trouble getting the text from div elements with the class ‘appname’ using Puppeteer. My code is supposed to scrape app names from a website, but I keep getting the error “TypeError: Cannot read property 'innerHTML' of null”.

Here’s a simplified version of what I’m trying to do:

const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com');

const appNames = await page.evaluate(() => {
  const nameElements = document.querySelectorAll('.appname');
  return Array.from(nameElements).map(el => el.textContent);
});

console.log(appNames);

I’ve tried using innerText, textContent, and innerHTML, but nothing seems to work. The page definitely has elements with the ‘appname’ class, so I’m not sure why I can’t access them. Any ideas on what I might be doing wrong or how to fix this?

hey man, have u tried using page.$$eval() instead? it’s a bit simpler and might solve ur issue. like this:

const appNames = await page.$$eval('.appname', elements => elements.map(el => el.textContent));

this does the querySelectorAll and mapping in 1 go. worth a shot if the other suggestions didn’t work for ya

I’ve faced similar issues with Puppeteer before, and I think I know what might be causing your problem. It sounds like the page content isn’t fully loaded when your script tries to access the elements.

Try adding a wait before querying the page. You can use page.waitForSelector() to ensure the elements are present before attempting to extract their content. Here’s how you might modify your code:

await page.goto('https://example.com');
await page.waitForSelector('.appname');

const appNames = await page.evaluate(() => {
  const nameElements = document.querySelectorAll('.appname');
  return Array.from(nameElements).map(el => el.textContent);
});

This should give the page time to load and render the elements you’re looking for. If you’re still having issues, the elements might be dynamically loaded with JavaScript. In that case, you might need to interact with the page first or wait for a specific network request to complete before querying the elements.
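To make the wait-then-extract step reusable, here is a rough sketch of a small helper. The name extractTexts and its timeoutMs parameter are my own and not part of Puppeteer; it only relies on page.waitForSelector() and page.$$eval(). It assumes that a timeout should be treated as "no matching elements" rather than a crash, which may or may not suit your case.

```javascript
// Hypothetical helper (not a Puppeteer API): waits for `selector`,
// then returns the trimmed text of every matching element.
async function extractTexts(page, selector, timeoutMs = 10000) {
  try {
    // Resolves once at least one matching element is in the DOM.
    await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (err) {
    // Assumption: a timeout just means nothing matched in time.
    if (err && err.name === 'TimeoutError') {
      return [];
    }
    throw err; // Anything else is a real failure, so surface it.
  }
  // $$eval runs querySelectorAll in the page and maps over the results.
  return page.$$eval(selector, els =>
    els.map(el => (el.textContent || '').trim())
  );
}
```

You would call it as const appNames = await extractTexts(page, '.appname'); right after page.goto(). An empty array then tells you the selector never showed up, which is easier to debug than a null error.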

Hope this helps! Let me know if you need any more assistance.

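I’ve encountered this issue before, and it’s often related to timing. The page might not be fully loaded when your script runs. Try adding a fixed delay before running your evaluation. Older Puppeteer versions offered page.waitForTimeout() for this, but it has been deprecated and removed in recent releases, so a plain setTimeout wrapped in a promise is the safer choice. Something like this:

await page.goto('https://example.com');
await new Promise(resolve => setTimeout(resolve, 5000)); // Wait for 5 seconds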

const appNames = await page.evaluate(() => {
  const nameElements = document.querySelectorAll('.appname');
  return Array.from(nameElements).map(el => el.textContent);
});

If that doesn’t work, the elements might be inside an iframe. In that case, you’ll need to switch to the correct frame before querying. Check the page structure and adjust your code accordingly. Also, ensure you’re using the correct selector for the elements you’re targeting.
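For the iframe case, one way to check is to run the same query in every frame of the page. This is only a sketch: the function name extractTextsFromAllFrames is made up for illustration, and it assumes that a frame which throws (for example because it detached mid-query) can simply be skipped. It uses page.frames(), which includes the main frame, and frame.$$eval().

```javascript
// Hypothetical helper (not a Puppeteer API): collects the text of all
// elements matching `selector` across the main frame and every iframe.
async function extractTextsFromAllFrames(page, selector) {
  const results = [];
  for (const frame of page.frames()) {
    try {
      // $$eval passes an empty array to the callback when nothing matches.
      const texts = await frame.$$eval(selector, els =>
        els.map(el => el.textContent)
      );
      results.push(...texts);
    } catch (err) {
      // Assumption: a failing frame (detached, navigating) is skipped.
    }
  }
  return results;
}
```

If this returns names while the plain page.evaluate() version returns an empty array, the elements live inside an iframe and you should query that frame directly.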