I have been using Python tools like LangChain and LangGraph for about two years now, mostly in FastAPI and Django projects.
Right now I am starting a fresh project that needs automated workflows: scraping web pages, sorting content into categories, and verifying that the content matches what I need. In Python I would just reach for LangGraph with PydanticAI, but this time I have to work with TypeScript and Next.js.
My plan is to set up scheduled tasks using Next.js API routes. I want to trigger these with external cron services and handle all the workflow logic inside those API endpoints.
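To be concrete, this is roughly the shape I have in mind, as a sketch only. The `CRON_SECRET` env var and the `runPipeline` function are placeholders I made up, not anything from a real codebase:

```typescript
// app/api/cron/scrape/route.ts — App Router route handlers take a standard
// Request and return a standard Response, so no framework imports are needed.

// Placeholder for the actual workflow logic (scrape → categorize → verify).
async function runPipeline(): Promise<{ processed: number }> {
  return { processed: 0 };
}

export async function GET(req: Request): Promise<Response> {
  // External cron services can't be trusted blindly — check a shared secret.
  const auth = req.headers.get("authorization");
  if (auth !== `Bearer ${process.env.CRON_SECRET}`) {
    return new Response("Unauthorized", { status: 401 });
  }

  const result = await runPipeline();
  return Response.json(result);
}
```

The open question is what `runPipeline` should be built with.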
Which library would work best for this setup? I would love to hear your reasons too.
I am also considering a different approach: running FastAPI with LangGraph and PydanticAI in a Docker container, then calling it from scheduled jobs instead.
I went the opposite route and went all-in on TypeScript for something similar about 18 months ago. Used LangChain.js with Vercel’s AI SDK for workflow orchestration. The TypeScript ecosystem has gotten way better since then.

Biggest win was keeping everything in one language and deployment pipeline. No Docker headaches, no API boundary issues between services. Just Next.js API routes handling everything from scraping to content validation.

For what you’re doing, LangChain.js plays nice with web scraping libraries like Puppeteer or Playwright. You can handle content categorization and validation through OpenAI’s structured outputs or Anthropic’s tool use.

External cron services for scheduled tasks work great. I used GitHub Actions for triggering and never had reliability problems. Response times were consistently faster than my old Python setup - no cold start penalty from separately containerized services.

That said, if you’re already comfortable with Python and you’re on a tight timeline, Docker makes sense. But don’t sleep on how smooth an all-TypeScript workflow can be once you get it dialed in.
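One habit worth keeping regardless of library: never trust the model’s JSON until you’ve validated it. Here’s a rough sketch of the guard I put between the LLM and the rest of the pipeline. The category list and field names are made up for illustration; the actual LLM call (structured outputs / JSON mode) is omitted:

```typescript
// Categories you expect the model to choose from — adjust for your content.
const CATEGORIES = ["news", "tutorial", "reference", "other"] as const;
type Category = (typeof CATEGORIES)[number];

interface ClassificationResult {
  category: Category;
  confidence: number; // 0..1, as reported by the model
}

// Type guard: validate the raw LLM reply before it enters the pipeline.
function parseClassification(raw: string): ClassificationResult | null {
  try {
    const data = JSON.parse(raw);
    if (
      typeof data === "object" && data !== null &&
      CATEGORIES.includes(data.category) &&
      typeof data.confidence === "number" &&
      data.confidence >= 0 && data.confidence <= 1
    ) {
      return { category: data.category, confidence: data.confidence };
    }
  } catch {
    // fall through — malformed JSON is treated the same as a bad shape
  }
  return null;
}
```

Structured outputs make bad shapes rare, but a `null` return is still a cleaner failure mode than an exception deep in your pipeline.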
Been there, done that. Had to migrate a content processing pipeline from Python to TypeScript three years back for the same reasons.
Stick with FastAPI + LangGraph in Docker. I wasted weeks trying to force TypeScript workflow orchestration - the ecosystem’s not ready.
What actually worked: Keep Python services containerized with clean REST endpoints. Your Next.js app hits these via API calls from cron jobs. You get proven Python AI tools plus your required Next.js frontend.
On the TypeScript side, just use fetch calls with decent error handling. Skip the fancy workflow libraries. Keep complex logic in Python where the tools actually work.
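To make “fetch calls with decent error handling” concrete, here’s the shape of the helper I’d suggest. The `python-worker` hostname and the retry/timeout numbers are just illustrative, not anything standard:

```typescript
// Exponential backoff: 500ms, 1s, 2s, ... capped at 10s.
function backoffMs(attempt: number): number {
  return Math.min(500 * 2 ** attempt, 10_000);
}

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

// 4xx means our request is wrong — retrying won't help.
class NonRetryableError extends Error {}

// Call the containerized FastAPI service, retrying on network errors and 5xx.
async function callPythonWorker<T>(
  path: string,
  body: unknown,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      const res = await fetch(`http://python-worker:8000${path}`, {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: JSON.stringify(body),
        signal: AbortSignal.timeout(30_000), // don't hang on a stuck scrape
      });
      if (res.ok) return (await res.json()) as T;
      if (res.status < 500) {
        throw new NonRetryableError(`worker rejected request: ${res.status}`);
      }
      lastError = new Error(`worker error: ${res.status}`);
    } catch (err) {
      if (err instanceof NonRetryableError) throw err;
      lastError = err;
    }
    await sleep(backoffMs(attempt));
  }
  throw lastError;
}
```

That plus a timeout on the cron side covers most of what the fancy workflow libraries were giving me anyway.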
Watch out for scaling though - make sure Docker containers can handle volume if you’re processing tons of content. Learned that one the hard way when scraping jobs started piling up.
Hybrid approach saved me months fighting half-baked TypeScript AI libraries. Pragmatic beats pure every time.
Think of Temporal as your workflow orchestration layer - it handles all the messy state management and reliability stuff that web scraping pipelines need. I’ve got a similar setup running for content processing where Temporal workers run in Node.js and call out to different services for specific tasks.

Best part? You can mix languages however you want. Run your scraping in TypeScript workers with Cheerio or Playwright, then hand off AI categorization to Python microservices when you need it. Temporal automatically handles retries, timeouts, and workflow state.

For Next.js integration, just expose Temporal workflow triggers through API routes. External cron services hit these endpoints to kick off workflows. Way more maintainable than shoving everything into API route handlers.

Your Python experience with LangGraph actually translates pretty well to how Temporal workflows work. Both handle complex multi-step processes with proper error handling and state persistence. Learning curve isn’t too bad and you skip all the Docker headaches of running separate FastAPI services.
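To show what I mean about the mental model carrying over: a Temporal workflow is basically this shape - each step is an activity, and the runtime retries it for you. Below is a dependency-free sketch of that shape, not real Temporal code: the actual SDK would use `proxyActivities` from `@temporalio/workflow` instead of my hand-rolled `withRetries`, and the activity names are made up:

```typescript
// Activities: the side-effecting steps Temporal would run on workers.
interface Activities {
  scrapePage(url: string): Promise<string>;
  categorize(html: string): Promise<string>;
  verify(html: string, category: string): Promise<boolean>;
}

// Stand-in for Temporal's automatic activity retries.
async function withRetries<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// The workflow itself: deterministic orchestration, no I/O of its own.
async function contentPipeline(acts: Activities, url: string) {
  const html = await withRetries(() => acts.scrapePage(url));
  const category = await withRetries(() => acts.categorize(html));
  const ok = await withRetries(() => acts.verify(html, category));
  return { url, category, ok };
}
```

If you squint, that’s a LangGraph graph with three nodes - which is why the transition felt natural to me.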