How do you actually keep AI-generated puppeteer workflows stable when sites redesign their entire DOM?

I’ve been struggling with this for months now. We built a puppeteer workflow that scrapes product data from an e-commerce site, and it works great for about two weeks. Then the site changes its layout slightly—a class name here, a div restructure there—and everything breaks.

I know the obvious answer is “make your selectors more robust,” but that only goes so far. The real issue is that I’m constantly rewriting selectors and logic just to keep up with changes that aren’t even major redesigns.

I got curious about whether there’s a smarter way to handle this. Like, what if instead of hardcoding selectors, I could describe what I’m trying to extract in plain English and have something intelligently figure out the right approach each time?

Has anyone here dealt with this and found a solid approach? I’m wondering if using an AI layer on top of puppeteer—something that can understand the intent of what I’m trying to extract—could make these workflows more resilient without needing constant intervention.

This is exactly where most teams hit a wall. You’re doing it the hard way: manually tweaking selectors every time.

What you’re describing is a classic use case for AI-backed automation. Instead of brittle CSS selectors, you can use an AI layer that understands the semantic meaning of what you’re extracting. So instead of looking for .product-price-v2, an AI can look for “the price” no matter how the DOM changes.
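Under the hood, the bare-bones version of that idea is just Puppeteer handing page text to a language model and asking for the value by description. A rough sketch, assuming the `puppeteer` and `openai` npm packages and an `OPENAI_API_KEY` in the environment; the model name and prompt are placeholders, not a recommendation:

```ts
// Sketch: semantic extraction with Puppeteer + an LLM instead of a hardcoded selector.
// Assumptions: Node 18+, the `puppeteer` and `openai` npm packages, OPENAI_API_KEY set.
import puppeteer from 'puppeteer';
import OpenAI from 'openai';

const client = new OpenAI();

async function extractPrice(url: string): Promise<string | null> {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });

    // Hand the model visible text, not raw HTML, so the prompt stays small and
    // doesn't depend on class names or DOM structure.
    const visibleText = await page.evaluate(() => document.body.innerText);

    const completion = await client.chat.completions.create({
      model: 'gpt-4o-mini', // placeholder model
      messages: [
        {
          role: 'user',
          content:
            'From the page text below, return only the main product price ' +
            '(currency symbol plus number), or NONE if there is no price.\n\n' +
            visibleText.slice(0, 12000),
        },
      ],
    });

    const answer = completion.choices[0]?.message?.content?.trim() ?? '';
    return answer === 'NONE' ? null : answer;
  } finally {
    await browser.close();
  }
}
```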

With Latenode, you can build this pretty elegantly. You describe what you want to extract in plain language using the AI Copilot Workflow Generation, and it builds out a workflow that combines puppeteer with AI vision and language models. The workflow can adapt when layouts change because it’s not relying on brittle selectors.

You also get access to multiple AI models through one subscription, so you can layer in Claude or GPT to do the intelligent extraction instead of playing selector roulette.

Check it out at https://latenode.com

I had this exact problem on a project where we were tracking competitor pricing. The site redesigned every quarter and we were constantly firefighting.

The turning point was switching from CSS selectors to a hybrid approach. We kept puppeteer for navigation and interaction, but added an AI component that actually understood what we were looking for semantically. Instead of targeting a specific class that changes, we’d ask the AI to find “the main product price on the page.”

It wasn’t perfect immediately, but it reduced maintenance by probably 70%. The workflow only broke on really major redesigns, not on minor layout shuffles.

The tricky part was getting the AI model to understand context consistently. Some models were better than others at this.

You’re hitting a fundamental limitation of selector-based scraping. The real solution is to move away from relying on specific DOM structure. One approach I’ve used is combining puppeteer with computer vision or optical character recognition to identify elements by their visual content rather than their HTML path. This makes the workflow much more resilient to layout changes because you’re looking for the actual information, not hoping a class name stays the same. It requires more processing power but eliminates most maintenance headaches. Another option is to use JavaScript directly on the page to extract structured data if the site has any APIs or data attributes that are less likely to change.
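For the structured-data route, here is a rough sketch of what that can look like with plain Puppeteer. Whether a given site actually ships schema.org JSON-LD is an assumption you have to verify per site:

```ts
// Sketch: pull structured data (JSON-LD) the page already exposes instead of scraping
// the rendered DOM. Many e-commerce sites embed schema.org Product markup, which tends
// to survive redesigns. Assumes the `puppeteer` npm package.
import puppeteer from 'puppeteer';

async function extractProductJsonLd(url: string): Promise<unknown[]> {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });

    // Collect every JSON-LD block and keep the ones describing a Product.
    return await page.evaluate(() => {
      const blocks = Array.from(
        document.querySelectorAll('script[type="application/ld+json"]'),
      );
      const products: unknown[] = [];
      for (const block of blocks) {
        try {
          const data = JSON.parse(block.textContent ?? '');
          // JSON-LD can be a single object or an array of objects.
          const items = Array.isArray(data) ? data : [data];
          for (const item of items) {
            if (item && item['@type'] === 'Product') products.push(item);
          }
        } catch {
          // Ignore malformed JSON-LD blocks rather than failing the whole run.
        }
      }
      return products;
    });
  } finally {
    await browser.close();
  }
}
```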

The fragility you’re experiencing is inherent to selector-based automation. The most reliable approach I’ve seen involves multiple fallback strategies. First, use the most specific semantic selector possible rather than class names. Second, implement a validation layer that checks if extraction succeeded and tries alternative methods if it fails. Third, add monitoring so you know immediately when something breaks instead of days later. Some teams also maintain multiple selectors for the same element so if one breaks, others can compensate. The maintenance burden doesn’t disappear, but it becomes predictable rather than reactive.
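A minimal sketch of the fallback-plus-validation idea; the selector list, price regex, and alert hook are placeholders you would adapt to your own site and monitoring setup:

```ts
// Sketch: try several selectors for the same element, validate that the result actually
// looks like a price, and flag it loudly when every strategy fails.
// Assumes the `puppeteer` npm package; notifyOnCall is a hypothetical alert hook.
import type { Page } from 'puppeteer';

const PRICE_SELECTORS = [
  '[itemprop="price"]',     // semantic markup, least likely to churn
  '[data-testid="price"]',  // test hooks often outlive visual class names
  '.product-price-v2',      // last resort: the current CSS class
];

const LOOKS_LIKE_PRICE = /\d+[.,]\d{2}/;

async function extractPriceWithFallbacks(page: Page): Promise<string> {
  for (const selector of PRICE_SELECTORS) {
    // $eval throws if the selector matches nothing, so treat that as "try the next one".
    const text = await page
      .$eval(selector, (el) => el.textContent ?? '')
      .catch(() => null);
    if (text && LOOKS_LIKE_PRICE.test(text)) {
      return text.trim();
    }
  }
  // Every strategy failed: surface it immediately instead of shipping bad data.
  await notifyOnCall(`Price extraction failed on ${page.url()}`);
  throw new Error('All price selectors failed validation');
}

// Placeholder for whatever alerting you already use (Slack webhook, PagerDuty, etc.).
async function notifyOnCall(message: string): Promise<void> {
  console.error(message);
}
```

The ordering is the design choice: semantic attributes first, presentation classes last, so the brittle selector only runs once the stable ones have already failed.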

Use AI to identify elements by content instead of selectors, build fallback logic, and monitor breakage immediately. That way you know what’s broken before users do and can have a quick fix ready.

Stack AI models with puppeteer. Let the AI understand intent, not parse selectors.
