I’m having trouble with a Ruby script that’s supposed to run headless in a cron job. Here’s the deal:
I wrote a script using the Headless gem and Watir-WebDriver to scrape data from an admin dashboard. It works fine when I run it manually, but it’s not playing nice with cron.
The weird part is that Firefox still pops up when the script runs. I thought Headless was supposed to prevent that. Am I missing something?
Here’s a simplified version of my code:
require 'watir-webdriver'
require 'headless'
headless = Headless.new
browser = Watir::Browser.start 'http://example.com/admin'
# Login and data scraping steps here
browser.close
headless.destroy
puts "Data grabbed at #{Time.now}"
Hey, I've run into this too. Try calling headless.start before creating the browser, and check that Xvfb is actually running. Cron jobs sometimes need it set up explicitly. If that fails, switching to Selenium with Chrome in headless mode might do the trick.
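To make the ordering concrete, here is a rough sketch of that suggestion. `with_display` and `scrape_admin` are hypothetical names, not part of the Headless gem. The helper only assumes that whatever you pass in responds to `start` and `destroy`, as a `Headless.new` instance does. As far as I know, `start` is the call that points DISPLAY at the virtual display, which would explain why Firefox shows up on the real screen without it. The gems are required inside `scrape_admin` on the assumption that headless, watir-webdriver, Xvfb and Firefox are all installed.

```ruby
# Hypothetical helper: start a virtual display, run the block, always tear down.
# `display` is anything responding to #start and #destroy (e.g. Headless.new).
def with_display(display)
  display.start   # must happen before any browser is launched
  yield
ensure
  display.destroy # runs even if the block raises
end

# Sketch of the scraping script built on the helper (hypothetical method name).
# Assumes the headless and watir-webdriver gems plus Xvfb and Firefox.
def scrape_admin(url)
  require 'headless'
  require 'watir-webdriver'

  with_display(Headless.new) do
    browser = Watir::Browser.start(url)
    begin
      # Login and data scraping steps here
    ensure
      browser.close
    end
  end
  puts "Data grabbed at #{Time.now}"
end
```

Calling `scrape_admin('http://example.com/admin')` from the cron script would then run the whole thing. The `ensure` blocks are there so a failed login or timeout does not leave a stray browser or display behind between cron runs.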
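Also, check the $DISPLAY environment variable in the cron context. Cron runs with a minimal environment, so DISPLAY is often unset or different from your login shell, which can make the headless setup fail. You can try setting it explicitly for the cron job, for example with a DISPLAY=... assignment in the crontab that matches the display your Xvfb instance serves.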
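If you want to see what cron actually hands your script, something like the following might help. Both helpers are hypothetical names I made up for illustration, and the `:99` fallback is an assumption; use whichever display number your Xvfb instance is really on.

```ruby
# Hypothetical helper: collect the variables that most often differ between
# an interactive shell and cron. `env` defaults to the real environment.
def cron_env_report(env = ENV)
  %w[DISPLAY PATH HOME SHELL].map { |key| "#{key}=#{env[key] || '(unset)'}" }
end

# Hypothetical helper: fall back to a display only when none is set.
# ':99' is an assumption; use whichever display your Xvfb instance serves.
def ensure_display(env = ENV, fallback = ':99')
  env['DISPLAY'] = fallback if env['DISPLAY'].to_s.empty?
  env['DISPLAY']
end

# Print the report to stderr so it lands in cron's mail or a redirected log.
warn cron_env_report.join("\n")
```

Run it once by hand and once from cron, then compare the two outputs. Calling `ensure_display` before launching the browser probably only matters if Xvfb is started outside the script; if you let Headless manage the display, its `start` call should set DISPLAY for you.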
Have you considered using Capybara with Poltergeist instead? It’s been more reliable for me in cron jobs. Here’s a quick example:
require 'capybara/poltergeist'
Capybara.register_driver :poltergeist do |app|
  Capybara::Poltergeist::Driver.new(app, js_errors: false)
end
session = Capybara::Session.new(:poltergeist)
session.visit 'http://example.com/admin'
# Login and scraping logic here
session.driver.quit
puts "Data grabbed at #{Time.now}"
This approach uses PhantomJS under the hood, which is truly headless. It might resolve your Firefox issues and work better with cron. Just ensure PhantomJS is installed on your system. Also, double-check your cron environment variables, particularly PATH, to make sure all necessary binaries are accessible.
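For the PATH part, a small check like this could be dropped at the top of the script to confirm cron can see the binaries. `find_executable` is a hypothetical helper using only the standard library, and the binary names in the list are assumptions; adjust them to whatever your setup actually uses.

```ruby
# Hypothetical helper: look up an executable the way a shell would, using
# only the PATH that the current process (e.g. cron) actually sees.
def find_executable(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR)
      .reject(&:empty?)
      .map { |dir| File.join(dir, name) }
      .find { |candidate| File.file?(candidate) && File.executable?(candidate) }
end

# Binary names are assumptions; adjust to what your setup uses.
%w[phantomjs firefox Xvfb].each do |binary|
  location = find_executable(binary)
  warn(location ? "#{binary}: #{location}" : "#{binary}: NOT FOUND on PATH")
end
```

If a binary shows up when you run the script manually but is reported as not found under cron, extending PATH in the crontab or using absolute paths is the usual way out.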