I created a headless browser wrapper class that works perfectly when I run it directly in a script, but it breaks when I try to use it inside FastAPI route handlers.
Here’s my browser wrapper:
from playwright.async_api import async_playwright
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class WebScraper:
    browser_instance = None
    playwright_process = None

    @classmethod
    async def start_browser(cls):
        try:
            if cls.browser_instance is None:
                if cls.playwright_process is None:
                    cls.playwright_process = await async_playwright().__aenter__()
                cls.browser_instance = await cls.playwright_process.chromium.launch(headless=True)
                logger.info('Browser started successfully.')
        except Exception as e:
            logger.error(f'Browser startup failed: {e}')

    async def fetch_html(self, target_url):
        if self.browser_instance is None:
            raise RuntimeError("Browser not started. Call start_browser first.")
        ctx = await self.browser_instance.new_context()
        tab = await ctx.new_page()
        await tab.goto(target_url)
        html_content = await tab.content()
        await ctx.close()
        return html_content

    @classmethod
    async def scrape_url(cls, target_url):
        await cls.start_browser()
        scraper = cls()
        return await scraper.fetch_html(target_url)
This works fine in standalone mode:
import asyncio

async def test_function():
    result = await WebScraper.scrape_url("https://example.com")
    print(result)

asyncio.run(test_function())
But when I use it in my FastAPI app, it crashes:
from fastapi import FastAPI

api = FastAPI()

@api.get("/scrape/")
async def scrape_endpoint():
    result = await WebScraper.scrape_url('https://example.com')
    return result
The error message says:
RuntimeError: Browser not started. Call start_browser first.
Why does this happen in FastAPI but not when running directly?
This usually comes down to how the app is served rather than FastAPI itself. With asyncio.run(), everything lives in one process and one event loop, so class-level state sticks around for the whole run. Under uvicorn or gunicorn, the server can spawn multiple worker processes or restart them (for example with --reload or --workers), and each process gets its own copy of your class variables, so a browser started in one worker simply doesn't exist in another.
I hit this same issue building a competitor monitoring tool: the browser worked perfectly in tests but randomly died in production. Look closely at start_browser(): the try/except logs any launch failure and then swallows it. If chromium.launch() fails in the server's environment, browser_instance stays None, and the RuntimeError you see from fetch_html() is only the secondary symptom. Check your server logs for the "Browser startup failed" message to find the real error.
Ditch the shared state approach entirely. Either create a new browser instance per request or use FastAPI’s dependency injection to handle the lifecycle properly. I’d go with fresh browser contexts for each scrape instead of trying to reuse a global instance.
If you really need persistent browser sessions, look into a proper process manager or move the scraping to background tasks with Celery.
This happens because FastAPI's server manages its own event loop and process lifecycle, unlike your standalone script, so class-level state is much less predictable. Note too that your browser_instance check fails even though start_browser() runs, which points at the launch failing silently inside that try/except (the error is logged, never raised). Try initializing the browser once during FastAPI startup instead of per request:

@api.on_event("startup")
async def startup_event():
    await WebScraper.start_browser()

@api.on_event("shutdown")
async def shutdown_event():
    if WebScraper.browser_instance:
        await WebScraper.browser_instance.close()

(Newer FastAPI versions prefer a lifespan handler over on_event, but both work.) Alternatively, use dependency injection to manage the browser per request, or a singleton with proper async context management. Class variables are fragile in web frameworks that manage their own async contexts.
FastAPI's server is probably recycling the process your browser instance lives in between requests. Class variables don't behave the same under a managed web server as they do in a regular script. I had the same weird issues with Selenium before I switched to Playwright. Just create a fresh browser for each request instead of trying to reuse one - slower, sure, but it won't randomly break on you.
FastAPI (really, the ASGI server underneath it) runs its own long-lived event loop, which is different from your standalone script. asyncio.run() creates a fresh loop for a single call; the server manages one loop per worker, and your class variables won't survive worker restarts or be shared between workers. That's why the browser instance seems to vanish: class-level state just won't reliably stick around in that environment.
I’ve hit this same wall before. Instead of fighting browser lifecycle management in FastAPI, I moved my scraping to Latenode workflows. You just trigger them from FastAPI endpoints with an HTTP call.
Set up a Latenode workflow that handles browser automation - it manages all the Playwright complexity. Then your FastAPI route calls the workflow: