I’m having trouble with my headless browser in a FastAPI app. It works fine on its own, but when I use it as an endpoint, it breaks. Here’s a quick rundown:
I made a WebScraper class that uses Playwright. It has methods to start the browser and fetch page content. This works great when I run it by itself:
async def test():
    html = await WebScraper.fetch_and_parse('https://example.com')
    print(html)

asyncio.run(test())
But when I try to use it in a FastAPI endpoint, it crashes:
@app.get('/scrape/')
async def scrape_page():
    html = await WebScraper.fetch_and_parse('https://example.com')
    return html
I get this error: RuntimeError: Browser not started. Call start_browser first.
Any ideas why it’s not working in FastAPI? I thought I set it up right, but maybe I’m missing something obvious. Thanks for any help!
I’ve dealt with this exact problem in one of my projects. The issue is likely related to the lifecycle of your WebScraper instance within the FastAPI application. What worked for me was creating a singleton pattern for the WebScraper class. Here’s how I implemented it:
class WebScraperSingleton:
    _instance = None

    @classmethod
    async def get_instance(cls):
        if cls._instance is None:
            cls._instance = WebScraper()
            await cls._instance.start_browser()
        return cls._instance

@app.get('/scrape/')
async def scrape_page():
    scraper = await WebScraperSingleton.get_instance()
    html = await scraper.fetch_and_parse('https://example.com')
    return html
This ensures that only one instance of WebScraper is created and the browser is started only once. It solved the issue for me and might work for you too. Remember to handle browser closure properly when your application shuts down.
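One caveat with the lazy singleton above: `_instance` is assigned before `start_browser()` has finished, so a second request that arrives during startup could get a scraper whose browser is not ready yet. Here is a minimal sketch of one way to guard against that with an `asyncio.Lock`, plus a `close()` hook you could call at shutdown. `FakeScraper`, `ScraperSingleton`, and `demo` are hypothetical names I made up so the example runs without Playwright; your real WebScraper may behave differently.

```python
import asyncio


class FakeScraper:
    """Hypothetical stand-in for WebScraper so the sketch runs without Playwright."""

    starts = 0  # counts how many times a browser was "launched"

    def __init__(self):
        self.started = False

    async def start_browser(self):
        await asyncio.sleep(0.01)  # simulate a slow browser launch
        FakeScraper.starts += 1
        self.started = True

    async def close_browser(self):
        self.started = False

    async def fetch_and_parse(self, url):
        if not self.started:
            raise RuntimeError("Browser not started. Call start_browser first.")
        return f"<html>{url}</html>"


class ScraperSingleton:
    _instance = None
    _lock = None  # created lazily, inside a running event loop

    @classmethod
    async def get_instance(cls):
        if cls._lock is None:
            cls._lock = asyncio.Lock()
        async with cls._lock:
            if cls._instance is None:
                scraper = FakeScraper()
                await scraper.start_browser()
                cls._instance = scraper  # publish only once the browser is ready
        return cls._instance

    @classmethod
    async def close(cls):
        # Intended to be called once when the application shuts down.
        if cls._instance is not None:
            await cls._instance.close_browser()
            cls._instance = None


async def demo():
    # Ten concurrent "requests" should share a single browser launch.
    scrapers = await asyncio.gather(
        *(ScraperSingleton.get_instance() for _ in range(10))
    )
    pages = await asyncio.gather(
        *(s.fetch_and_parse("https://example.com") for s in scrapers)
    )
    await ScraperSingleton.close()
    return FakeScraper.starts, pages
```

Running `asyncio.run(demo())` exercises the concurrent path. The key design choice is that the instance is only stored after the launch has completed, so no caller can observe a half-started scraper.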
I’ve encountered similar issues with headless browsers in FastAPI. The problem likely stems from the browser not being initialized properly within the FastAPI context. A potential solution is to use a dependency injection approach. Define a dependency function that ensures the browser is started and yields the WebScraper instance:
async def get_web_scraper():
    await WebScraper.start_browser()
    try:
        yield WebScraper
    finally:
        await WebScraper.close_browser()

@app.get('/scrape/')
async def scrape_page(scraper: WebScraper = Depends(get_web_scraper)):
    html = await scraper.fetch_and_parse('https://example.com')
    return html
This method ensures the browser is started for each request and properly closed afterward, which should resolve the issue you’re experiencing.
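One thing to watch: if `start_browser()` and `close_browser()` act on state shared by the whole class (I'm assuming that, since I haven't seen your WebScraper), two overlapping requests could interfere, with one request closing the browser while the other is still fetching. A minimal sketch of a per-request variant where each request owns its own scraper instance is below. `FakeScraper`, `scraper_scope`, `handle_request`, and `demo` are hypothetical names used only so the example runs without FastAPI or Playwright.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeScraper:
    """Hypothetical stand-in for WebScraper; state lives on the instance, not the class."""

    def __init__(self):
        self.started = False

    async def start_browser(self):
        self.started = True

    async def close_browser(self):
        self.started = False

    async def fetch_and_parse(self, url):
        if not self.started:
            raise RuntimeError("Browser not started. Call start_browser first.")
        await asyncio.sleep(0.01)  # simulate a page load
        if not self.started:
            raise RuntimeError("Browser was closed mid-request.")
        return f"<html>{url}</html>"


async def get_web_scraper():
    # Same shape as a yield-style dependency, but each call owns its scraper.
    scraper = FakeScraper()
    await scraper.start_browser()
    try:
        yield scraper
    finally:
        await scraper.close_browser()


# Wrapped only so the sketch can be exercised without FastAPI.
scraper_scope = asynccontextmanager(get_web_scraper)


async def handle_request(url):
    async with scraper_scope() as scraper:
        return await scraper.fetch_and_parse(url)


async def demo():
    # Two overlapping requests; neither closes the other's browser.
    return await asyncio.gather(
        handle_request("https://example.com/a"),
        handle_request("https://example.com/b"),
    )
```

Keep in mind that launching a real browser per request is usually the slow part, so this trades speed for isolation.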
hey sophia, sounds like a tricky one! have you tried starting the browser in your FastAPI startup event? something like:
@app.on_event("startup")
async def startup_event():
    await WebScraper.start_browser()
might help kickstart things before endpoints run. lmk if that works!
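fwiw, newer FastAPI versions mark `on_event` as deprecated and recommend a `lifespan` handler instead, which also gives you a spot to close the browser at shutdown. rough sketch below, not tested against your code: `FakeScraper` is a made-up stand-in for your WebScraper (i'm assuming class-level `start_browser` / `close_browser`), and the FastAPI wiring is left as a comment so the snippet runs on its own.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeScraper:
    """Hypothetical stand-in for WebScraper with class-level start/close methods."""

    started = False

    @classmethod
    async def start_browser(cls):
        cls.started = True

    @classmethod
    async def close_browser(cls):
        cls.started = False


@asynccontextmanager
async def lifespan(app):
    await FakeScraper.start_browser()  # runs once, before requests are served
    try:
        yield
    finally:
        await FakeScraper.close_browser()  # runs once, at shutdown


# Wiring, assuming FastAPI is installed:
#     from fastapi import FastAPI
#     app = FastAPI(lifespan=lifespan)


async def demo():
    # Drive the lifespan by hand to show the start/stop ordering.
    states = [FakeScraper.started]
    async with lifespan(app=None):
        states.append(FakeScraper.started)
    states.append(FakeScraper.started)
    return states
```

calling `asyncio.run(demo())` just walks through the before / during / after states so you can see when the browser would be up.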