Help with Cloudflare Bypass Issues
I have been trying to scrape a website for the past month but keep running into Cloudflare. With plain Puppeteer, I frequently get stuck on a page that says “Waiting for the website to respond”.
I tried different methods to fix this:
- Using Puppeteer with proxy rotation
- Adding the stealth plugin for puppeteer-extra
- Switching to puppeteer-real-browser with proxies
- Combining puppeteer-real-browser with FlareSolverr and proxy rotation
My Current Setup
Right now I send the request through FlareSolverr first, then pass the cookies and user agent from its response to Puppeteer. I have two main problems:
- FlareSolverr cannot solve the challenge: when it finds a Cloudflare challenge, it just times out and gives me an error message
- Puppeteer still gets blocked: even when FlareSolverr works and I copy the cookies and user agent, Puppeteer still hits the Cloudflare page
It seems like FlareSolverr might not be finding the challenge correctly, so it does not get the right cookies.
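To narrow down which of the two problems is happening, a couple of small checks can help. The sketch below is only a heuristic: the function names are hypothetical, and the marker strings (the `cf_clearance` cookie name, the "Just a moment..." title, and the `/cdn-cgi/challenge-platform/` path) are assumptions about how Cloudflare's challenge currently looks and may change.

```javascript
// Hypothetical diagnostic helpers; the markers below are assumptions, not a stable API.

// True if the cookie list contains Cloudflare's clearance cookie.
function hasClearanceCookie(cookies) {
  return Array.isArray(cookies) && cookies.some((c) => c?.name === "cf_clearance");
}

// True if the HTML looks like a Cloudflare interstitial rather than real content.
function looksLikeChallengePage(html) {
  if (typeof html !== "string") return false;
  const markers = [
    "<title>Just a moment...</title>",
    "/cdn-cgi/challenge-platform/",
  ];
  return markers.some((marker) => html.includes(marker));
}
```

Running `hasClearanceCookie` on the cookie array returned by the solver shows whether it obtained a clearance cookie at all, and running `looksLikeChallengePage` on the HTML from `currentPage.content()` shows whether Puppeteer landed on the challenge instead of the real page.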
What am I missing here? Is there a better way to make Puppeteer work with Cloudflare protection?
let solverResponse;
let retryCount = 0;
const maxRetries = 3;

// Ask FlareSolverr to load the page, retrying on failure
while (retryCount < maxRetries) {
  try {
    const requestOptions = {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // Field names follow FlareSolverr's v1 API: cmd, url, maxTimeout, session
      body: JSON.stringify({
        cmd: "request.get",
        url: pageUrl,
        maxTimeout: 8000, // milliseconds; FlareSolverr's documented default is 60000
        session: solverSession,
      }),
    };
    const apiResponse = await fetch("http://localhost:8191/v1", requestOptions);
    solverResponse = await apiResponse.json();
    // FlareSolverr reports status "ok" on success and puts the error text in "message"
    if (solverResponse?.status !== "ok") {
      throw new Error(solverResponse?.message ?? "Solver service failed");
    }
    break;
  } catch (err) {
    retryCount++;
    if (retryCount === maxRetries) {
      throw new Error(`Request failed after ${retryCount} tries: ${err.message}`);
    }
    // Wait 2 seconds before the next attempt
    await new Promise((resolve) => setTimeout(resolve, 2000));
  }
}

if (!solverResponse) throw new Error("No response from solver service");

// Cookies and user agent live under "solution" in FlareSolverr's response
const sessionCookies = solverResponse.solution?.cookies ?? [];
const browserAgent = solverResponse.solution?.userAgent;
if (!browserAgent) throw new Error("Missing user agent");

// Copy the solver's cookies into the Puppeteer browser
if (sessionCookies.length > 0) {
  await puppeteerBrowser.setCookie(
    ...sessionCookies.map((item) => ({ ...item, expires: item?.expiry || 0 }))
  );
}

// Use the same user agent the solver used
await currentPage.setUserAgent(browserAgent);

// Random 2-7 second pause before navigating
await new Promise((resolve) => setTimeout(resolve, Math.random() * 5000 + 2000));
await currentPage.goto(pageUrl, { waitUntil: "domcontentloaded" });
const pageHtml = await currentPage.content();
return pageHtml;
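One thing I am unsure about is the cookie mapping above. The solver's cookies appear to use a Selenium-style `expiry` field, while Puppeteer expects `expires`, and spreading the whole object also passes along fields Puppeteer does not know. A more explicit conversion might look like the sketch below. `toPuppeteerCookie` is a hypothetical helper, and it assumes that leaving `expires` out makes Puppeteer treat the cookie as a session cookie.

```javascript
// Hypothetical helper: convert a Selenium-style cookie (assumed solver output)
// into the fields Puppeteer's cookie parameters use.
function toPuppeteerCookie(cookie) {
  const converted = {
    name: cookie.name,
    value: cookie.value,
    domain: cookie.domain,
    path: cookie.path ?? "/",
    httpOnly: Boolean(cookie.httpOnly),
    secure: Boolean(cookie.secure),
  };
  // Only set an expiry when the solver supplied one; otherwise leave it out
  // (assumption: this yields a session cookie).
  if (typeof cookie.expiry === "number" && cookie.expiry > 0) {
    converted.expires = cookie.expiry;
  }
  // Pass sameSite through only when it is one of the values Puppeteer accepts.
  if (["Strict", "Lax", "None"].includes(cookie.sameSite)) {
    converted.sameSite = cookie.sameSite;
  }
  return converted;
}
```

With this, the call would become `await puppeteerBrowser.setCookie(...sessionCookies.map(toPuppeteerCookie))`.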