What actually happens when you let an AI system coordinate multiple agents on a complex browser automation task?

I’ve been reading about autonomous AI teams and multi-agent systems, and I’m curious about how this actually works in practice for browser automation.

Like, imagine you need to log into a site, navigate to a dashboard, extract some data, and then summarize that data in a specific format. That’s multiple distinct tasks. I’m wondering if you can actually have different AI agents handle different parts of that workflow, or if it’s just a nice concept that breaks down when you try it.

My concern is handoffs between agents. In my experience, anything involving handoff between systems tends to get messy fast. Data format mismatches, context loss, agents not understanding what the previous one did.

But the appeal is obvious—if you could have a specialist agent handle each part of the task, that seems like it could be more reliable than one monolithic automation.

Has anyone actually built or worked with multi-agent automation for browser tasks? Does the coordination actually work, or does it fall apart when things get real?