Need Help with Cloudflare Bypass Using Puppeteer and FlareSolverr
Hey everyone! I’ve been struggling with web scraping for the past month and could really use some advice.
I’m trying to scrape a watch marketplace website but keep running into Cloudflare protection. The site shows a “Please wait while we verify your connection” message that blocks my scraper.
What I’ve Already Tried
- Basic Puppeteer setup
- Adding proxy rotation
- Using puppeteer-extra-plugin-stealth
- Combining puppeteer-real-browser with proxies
- FlareSolverr integration with all the above
Current Setup and Problems
Right now I’m using FlareSolverr to make initial requests, then grabbing the cookies and user agent to feed into Puppeteer. But I’m hitting two major issues:
- FlareSolverr timing out: when it detects the challenge, it fails with timeout errors instead of solving it
- Puppeteer still blocked: even when FlareSolverr succeeds, Puppeteer still lands on the challenge page
I think FlareSolverr might not be properly detecting challenges or getting the right cookies.
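One way to test that suspicion is to look at what FlareSolverr actually returned before passing anything to Puppeteer. The sketch below is only an illustration. `SolverCookie`, `SolverSolution`, `SolutionReport`, and `describeSolution` are hypothetical names. It assumes the `solution` object carries an HTTP `status`, a `userAgent`, and a `cookies` array, and that a solved challenge leaves a `cf_clearance` cookie behind.

```typescript
// Hypothetical shapes: only the fields this check reads.
interface SolverCookie {
  name: string;
  value: string;
  domain?: string;
}

interface SolverSolution {
  status?: number; // HTTP status of the final page, if the solver reports it
  userAgent?: string;
  cookies?: SolverCookie[];
}

interface SolutionReport {
  hasClearance: boolean;
  hasUserAgent: boolean;
  httpStatus: number | null;
  cookieNames: string[];
}

// Summarize a solver result so a missing clearance cookie is easy to spot in logs.
function describeSolution(solution: SolverSolution): SolutionReport {
  const cookies = solution.cookies ?? [];
  return {
    hasClearance: cookies.some((c) => c.name === "cf_clearance"),
    hasUserAgent: Boolean(solution.userAgent),
    httpStatus: solution.status ?? null,
    cookieNames: cookies.map((c) => c.name),
  };
}

// Example with made-up values.
const report = describeSolution({
  status: 200,
  userAgent: "Mozilla/5.0 (example)",
  cookies: [{ name: "cf_clearance", value: "abc", domain: ".example.com" }],
});
console.log(report);
```

If `hasClearance` comes back false on a "successful" response, the solver probably never saw or never finished the challenge, which would point at the FlareSolverr side rather than at Puppeteer.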
My Code
let solverResponse: any;
let retryCount = 0;
const maxRetries = 5;

// Keep trying FlareSolverr until it succeeds or the retries run out
while (retryCount < maxRetries) {
  try {
    const apiCall = await fetch("http://localhost:8191/v1", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        cmd: "request.get",
        url: targetUrl,
        maxTimeout: 6000, // milliseconds, so 6 seconds per attempt
        session: sessionToken,
      }),
    });
    solverResponse = await apiCall.json();
    if (solverResponse?.status === "error") {
      throw new Error(solverResponse?.message ?? "API call failed");
    }
    break;
  } catch (err: any) {
    retryCount++;
    if (retryCount === maxRetries) {
      throw new Error(`All ${retryCount} attempts failed: ${err.message}`);
    }
    // Wait 2 seconds before the next attempt
    await new Promise((resolve) => setTimeout(resolve, 2000));
  }
}

if (!solverResponse) throw new Error("No response from solver");

// Pull the cookies and user agent out of the solver result
const extractedCookies = solverResponse.solution.cookies;
const extractedUserAgent = solverResponse.solution.userAgent;
if (!extractedUserAgent) throw new Error("Missing user agent");

// Copy the solver's cookies into the browser, mapping `expiry` to `expires`
if (extractedCookies.length !== 0) {
  await puppeteerBrowser.setCookie(
    ...extractedCookies.map((item: any) => ({ ...item, expires: item?.expiry ?? 0 }))
  );
}
await puppeteerPage.setUserAgent(extractedUserAgent);

// Random 2-10 second pause before loading the page
await delay(Math.random() * 8000 + 2000);
await puppeteerPage.goto(targetUrl, { waitUntil: "networkidle0" });
const pageHtml = await puppeteerPage.content();
return pageHtml;
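One thing that may be worth ruling out in the cookie hand-off above is the mapping itself. The snippet spreads every solver field into the browser cookie and sets `expires: 0` when `expiry` is missing, and I'm not certain the browser treats that as a session cookie rather than an already expired one. Below is a minimal sketch of a stricter conversion. `RawCookie`, `BrowserCookie`, `normalizeSameSite`, and `toBrowserCookie` are hypothetical names, and the field list is an assumption about what the solver returns, not a confirmed schema.

```typescript
// Hypothetical input shape; solver versions may use `expiry` or `expires`.
interface RawCookie {
  name: string;
  value: string;
  domain?: string;
  path?: string;
  expiry?: number;
  expires?: number;
  httpOnly?: boolean;
  secure?: boolean;
  sameSite?: string;
}

// A subset of the fields a browser cookie parameter usually accepts.
interface BrowserCookie {
  name: string;
  value: string;
  domain?: string;
  path?: string;
  expires?: number;
  httpOnly?: boolean;
  secure?: boolean;
  sameSite?: "Strict" | "Lax" | "None";
}

// Accept any casing and drop values that are not a known SameSite mode.
function normalizeSameSite(v?: string): "Strict" | "Lax" | "None" | undefined {
  switch (v?.toLowerCase()) {
    case "strict": return "Strict";
    case "lax": return "Lax";
    case "none": return "None";
    default: return undefined;
  }
}

function toBrowserCookie(c: RawCookie): BrowserCookie {
  const out: BrowserCookie = { name: c.name, value: c.value };
  if (c.domain !== undefined) out.domain = c.domain;
  if (c.path !== undefined) out.path = c.path;
  if (c.httpOnly !== undefined) out.httpOnly = c.httpOnly;
  if (c.secure !== undefined) out.secure = c.secure;

  // Leave `expires` out entirely for session cookies instead of sending 0.
  const expires = c.expiry ?? c.expires;
  if (typeof expires === "number" && expires > 0) out.expires = expires;

  const sameSite = normalizeSameSite(c.sameSite);
  if (sameSite !== undefined) out.sameSite = sameSite;
  return out;
}

// Example with made-up values.
const sample = toBrowserCookie({
  name: "cf_clearance",
  value: "abc",
  domain: ".example.com",
  path: "/",
  expiry: 1700000000,
  secure: true,
  sameSite: "none",
});
console.log(sample);
```

The converted objects could then be passed to the same `setCookie` call in place of the spread mapping. As far as I know, the clearance cookie is also only honored when the later requests use the same user agent and the same outgoing IP as the solver, so the proxy configuration on both sides would need to match too.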
Am I missing something obvious here? What’s the best way to reliably get past Cloudflare these days?