I’m having trouble with Puppeteer when trying to get the updated HTML after filling out form fields. Here’s what I’m doing:
await page.fill('#email', '[email protected]');
After entering text into the input field, I want to grab the current HTML content like this:
const updatedHTML = await page.innerHTML('body');
The problem is that when I check the HTML output, the email field appears empty even though I can see ‘[email protected]’ is there when I take a screenshot with page.screenshot(). The visual content shows the text but the HTML doesn’t reflect these changes. Am I using the wrong method to capture the modified DOM state? Any ideas what could be causing this disconnect?
Yeah, this is totally normal with forms. innerHTML serializes the markup, including each element’s attributes, while the text a user types lives in DOM properties. When you use page.fill(), you’re changing the element’s internal state, not its attributes. The value attribute only sets the default - user input goes into the element’s value property instead, so the serialized HTML still shows the original (or missing) value attribute. I’ve hit this tons of times scraping dynamic sites. Don’t rely on innerHTML for form data. Use page.evaluate() to run JavaScript in the browser and grab those live properties directly. This HTML attributes vs DOM properties thing trips up a lot of developers at first.
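If you want to see the attribute/property split for yourself, here is a rough sketch. compareAttributeAndProperty is a hypothetical helper name, not a library function, and it assumes `page` is a page object exposing $eval, which both Puppeteer and Playwright provide (page.fill() and page.innerHTML() in the question look like Playwright’s API, so this only uses the call common to both).

```javascript
// Hypothetical helper (not part of any library): returns both the value
// attribute (what innerHTML serializes) and the value property (what the
// user typed) for one form field.
// Assumes `page` exposes $eval(selector, fn), as Puppeteer and Playwright do.
async function compareAttributeAndProperty(page, selector) {
  return page.$eval(selector, (el) => ({
    // The attribute is the default from the markup; null if it was never set.
    attribute: el.getAttribute('value'),
    // The property is the live state, updated by typing or filling.
    property: el.value,
  }));
}

// Example usage inside an async function, after filling the field:
//   const state = await compareAttributeAndProperty(page, '#email');
//   console.log(state.attribute, state.property);
```

After a fill, I would expect `attribute` to still hold whatever the original markup had and `property` to hold the typed text, which matches the empty field in the HTML dump versus the filled field in the screenshot.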
Totally get it! That issue can be a pain, right? innerHTML only serializes the attributes in the markup, not the live value the input is holding. You should try await page.$eval('#email', el => el.value) to access the updated value directly. It works every time!
Yeah, this happens all the time. Form inputs don’t update their HTML attributes when you fill them programmatically - the value attribute stays the same while the input’s value property changes. I’ve hit this exact issue with Puppeteer before. Instead of reading innerHTML, use page.evaluate() to grab the DOM element properties directly. Try await page.evaluate(() => document.querySelector('#email').value) to get the current input value. Or use the shorthand: page.$eval('#email', el => el.value). Your screenshot shows the right content because it captures what’s rendered, not the serialized markup. Which property holds the live state depends on the control (checkboxes and radio buttons use checked rather than value), but for text inputs reading the value property directly always works.
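If you really do need the HTML itself to contain what was typed, one possible workaround is to copy the live properties back into attributes before serializing. This is only a sketch: bodyHtmlWithLiveValues is a made-up helper name, it assumes `page` exposes evaluate() (true for Puppeteer and Playwright), and it only covers input and textarea elements.

```javascript
// Hypothetical helper (not part of any library): mirrors live form state
// into attributes, then returns the body's serialized HTML.
// Assumes `page` exposes evaluate(fn). Note that it mutates the page's DOM.
async function bodyHtmlWithLiveValues(page) {
  return page.evaluate(() => {
    for (const el of document.querySelectorAll('input, textarea')) {
      if (el.tagName === 'TEXTAREA') {
        // A textarea's default value is its text content, not an attribute.
        el.textContent = el.value;
      } else if (el.type === 'checkbox' || el.type === 'radio') {
        // Checked state is a boolean attribute: present or absent.
        if (el.checked) el.setAttribute('checked', '');
        else el.removeAttribute('checked');
      } else if (el.type !== 'file') {
        // File inputs are skipped; their value is not meaningful in markup.
        el.setAttribute('value', el.value);
      }
    }
    return document.body.innerHTML;
  });
}

// Example usage inside an async function:
//   const updatedHTML = await bodyHtmlWithLiveValues(page);
```

Keep in mind this writes password fields into the markup too, so filter those out if the HTML gets logged or stored. If you only need a handful of values, reading them with $eval is simpler.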