Puppeteer successfully extracts properties like innerText, but retrieving full child elements returns empty objects. How can I obtain complete child elements? See example below:
const output = await page.evaluate(() => {
  const tableNodes = [...document.getElementsByTagName('table')];
  return {
    texts: tableNodes.map(node => node.innerText),
    fullElements: tableNodes
  };
});
console.log(output);
I ran into a similar problem recently. The cause is that page.evaluate can only return values that can be serialized to JSON. DOM elements live in the browser context and cannot be serialized that way, so each one arrives in Node.js as an empty object. To work around this, I return the outerHTML of each element instead. The resulting string contains the element's own markup, its attributes, and all of its descendants, so you get the complete structure and can parse or inspect it on the Node.js side.
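As a rough sketch of the outerHTML approach, assuming `page` is an already opened Puppeteer Page (the helper names `collectTableHtml` and `extractTableHtml` are made up for illustration):

```javascript
// Runs inside the browser context. It returns only strings, which
// survive the JSON serialization that page.evaluate applies.
function collectTableHtml() {
  return [...document.getElementsByTagName('table')].map((node) => ({
    text: node.innerText,
    html: node.outerHTML, // full markup, including all descendants
  }));
}

// Runs in Node.js. `page` is assumed to be an open Puppeteer Page.
async function extractTableHtml(page) {
  return page.evaluate(collectTableHtml);
}
```

Calling `await extractTableHtml(page)` should give you an array with one `{ text, html }` entry per table. If you need to query the markup afterwards, you can feed the `html` strings to an HTML parser of your choice in Node.js.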
I encountered a similar issue and resolved it by converting the DOM elements into serializable objects. Instead of returning the element references directly, I wrote a recursive function that runs inside page.evaluate. It extracts each element’s node name, attributes, and text content, then iterates over its children to build a nested object structure. This gives you the complete hierarchy of child elements and lets you choose exactly which data to send back to Node.js. It has been particularly useful for complex nested structures in my projects.
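A minimal sketch of that recursive conversion might look like the following. The names `serializeElement` and `extractTables` and the shape of the returned object are my own assumptions, not anything Puppeteer prescribes, and `page` is assumed to be an open Puppeteer Page:

```javascript
// Runs in the browser context. Everything it needs is defined inside it,
// because Puppeteer sends the function's source text to the page.
function serializeElement(root) {
  const walk = (el) => ({
    tag: el.tagName.toLowerCase(),
    // Copy attributes into a plain object.
    attributes: Object.fromEntries(
      [...el.attributes].map((attr) => [attr.name, attr.value])
    ),
    // Only the element's own text nodes (nodeType 3), so that text of
    // descendants is not repeated at every level.
    text: [...el.childNodes]
      .filter((node) => node.nodeType === 3)
      .map((node) => node.textContent)
      .join('')
      .trim(),
    children: [...el.children].map(walk),
  });
  return walk(root);
}

// Runs in Node.js. Each handle's element is passed to serializeElement
// as its first argument.
async function extractTables(page) {
  const handles = await page.$$('table');
  return Promise.all(handles.map((handle) => handle.evaluate(serializeElement)));
}
```

The result is plain nested objects, so it passes through page.evaluate's serialization intact. You can add or drop fields in `walk` depending on what you need back in Node.js.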