I’m working on a project where I need to grab some JavaScript-generated cookies using a headless browser (go-rod) in Go, then reuse those cookies with the Colly scraper. Right now I can export the cookies from the headless browser to a JSON file, but I’m stuck on how to add them to Colly’s cookie jar.
Here’s what I’ve got so far:
func getCookies() {
    browserCookies := headlessBrowser.FetchCookies()
    cookieData, err := json.Marshal(browserCookies)
    if err != nil {
        log.Fatal(err)
    }
    // What to do next?
}
The cookieData looks something like this:
[
  {
    "name": "sessionID",
    "value": "abc123",
    "domain": "example.com",
    "path": "/",
    "expires": 1234567890,
    "httpOnly": true,
    "secure": true
  }
]
How can I take this data and add it to Colly’s cookie jar? Any help would be awesome!
hey there! i had a similar issue before. what worked for me was using colly’s SetCookies() method. you can convert your json to http.Cookie objects and pass them to SetCookies(). something like:
c := colly.NewCollector()
httpCookies := convertJsonToHttpCookies(cookieData)
c.SetCookies("https://example.com", httpCookies)
hope that helps!
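colly doesn’t come with that helper, you’d write it yourself. here’s a rough sketch of what it could look like — the struct fields are guessed from the json in your question, and it assumes the export is an array of those objects (needs encoding/json, log, net/http and time):
// exportedCookie mirrors the shape of one cookie in the JSON export.
type exportedCookie struct {
    Name     string  `json:"name"`
    Value    string  `json:"value"`
    Domain   string  `json:"domain"`
    Path     string  `json:"path"`
    Expires  float64 `json:"expires"` // unix timestamp in seconds
    HTTPOnly bool    `json:"httpOnly"`
    Secure   bool    `json:"secure"`
}

func convertJsonToHttpCookies(cookieData []byte) []*http.Cookie {
    var exported []exportedCookie
    if err := json.Unmarshal(cookieData, &exported); err != nil {
        log.Fatal(err) // in real code you'd probably return the error instead
    }
    cookies := make([]*http.Cookie, len(exported))
    for i, ec := range exported {
        cookies[i] = &http.Cookie{
            Name:     ec.Name,
            Value:    ec.Value,
            Domain:   ec.Domain,
            Path:     ec.Path,
            Expires:  time.Unix(int64(ec.Expires), 0),
            HttpOnly: ec.HTTPOnly,
            Secure:   ec.Secure,
        }
    }
    return cookies
}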
I’ve encountered this issue before in a project. Here’s a solution that worked for me:
First, parse your JSON cookie data into a slice of http.Cookie structs. You can use json.Unmarshal for this. Then, use Colly’s SetCookies method to add these cookies to the collector’s jar.
Here’s a code snippet to illustrate:
var cookies []*http.Cookie
if err := json.Unmarshal(cookieData, &cookies); err != nil {
    log.Fatal(err)
}

c := colly.NewCollector()
if err := c.SetCookies("https://example.com", cookies); err != nil {
    log.Fatal(err)
}
This transfers your Rod cookies into Colly’s cookie jar, with one caveat: http.Cookie’s Expires field is a time.Time, so a numeric Unix timestamp like the expires value in your example won’t unmarshal into it directly; you’d need an intermediate struct for that field (the other fields match by name, case-insensitively). Remember to handle any potential errors and adjust the URL to match your target domain.
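If you want to confirm the transfer worked, you can read the jar back for the same URL with the collector’s Cookies method:
// Quick sanity check: list what the jar now holds for this URL.
for _, ck := range c.Cookies("https://example.com") {
    log.Printf("jar has %s=%s for domain %s", ck.Name, ck.Value, ck.Domain)
}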
I’ve dealt with this exact problem in a recent project. Here’s what worked for me:
Instead of exporting to JSON, you can directly convert the Rod cookies to http.Cookie objects. Then use Colly’s SetCookies() method to add them to the cookie jar.
Here’s a quick example:
// transferCookies copies the current cookies from a Rod page into Colly's cookie jar.
func transferCookies(page *rod.Page, c *colly.Collector) error {
    rodCookies, err := page.Cookies(nil) // nil = cookies for the page's current URLs
    if err != nil {
        return err
    }

    httpCookies := make([]*http.Cookie, len(rodCookies))
    for i, cookie := range rodCookies {
        httpCookies[i] = &http.Cookie{
            Name:     cookie.Name,
            Value:    cookie.Value,
            Domain:   cookie.Domain,
            Path:     cookie.Path,
            Expires:  time.Unix(int64(cookie.Expires), 0),
            HttpOnly: cookie.HTTPOnly,
            Secure:   cookie.Secure,
        }
    }

    return c.SetCookies(page.MustInfo().URL, httpCookies)
}
This approach avoids the JSON step entirely, making it more efficient. Just call this function after your Rod operations and before beginning your Colly scraping.
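For a fuller picture, here’s a rough sketch of how it could all fit together, with transferCookies from above living in the same file (so its net/http and time imports would go in the import block too). The URLs and the selector are just placeholders, and the Colly import path assumes v2 (drop the /v2 if you’re on v1):
package main

import (
    "log"

    "github.com/go-rod/rod"
    "github.com/gocolly/colly/v2"
)

func main() {
    // Rod: load the page so the JavaScript-set cookies actually exist.
    browser := rod.New().MustConnect()
    defer browser.MustClose()

    page := browser.MustPage("https://example.com").MustWaitLoad()

    // Colly: create the collector and hand it the browser's cookies.
    c := colly.NewCollector()
    if err := transferCookies(page, c); err != nil {
        log.Fatal(err)
    }

    // The collector now sends the transferred cookies with its requests.
    c.OnHTML("title", func(e *colly.HTMLElement) {
        log.Println("page title:", e.Text)
    })
    if err := c.Visit("https://example.com/protected"); err != nil {
        log.Fatal(err)
    }
}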