Recommendations for a Headless Browser or Browser Control with Multithreading and Proxy Support for Ajax/JavaScript?

I have explored various options for browser controls or headless browsers that can support multithreading and allow proxy configuration for each thread while handling Ajax and JavaScript functionalities. Here’s what I have tested so far:

  • Awesomium is decent, but I need to create a separate application for each thread, which is not ideal.
  • SimpleBrowser performs adequately, yet it is limited when it comes to JavaScript and Ajax manipulation.
  • NHtmlUnit proved too complex to deploy in C# and didn’t work reliably.
  • WatiN doesn’t allow an independent session or separate proxy settings per instance.

I tested several others as well, but these were my primary candidates. Awesomium would be an excellent fit if it supported multithreading, and I understand that this may be added in a future update.

My main objective is to automate testing on a local intranet site.

UPDATE: Unlike WatiN, Selenium WebDriver seems to require a remote server running outside the application itself, so I’m still searching for alternatives.

UPDATE2: I managed to get multithreading working with Selenium's ChromeDriver in C#, but it consumes an excessive amount of memory because every thread spawns its own driver and browser.

Here’s a snippet that demonstrates firing multiple threads:

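// Start one task per proxy entry; each task drives its own Chrome instance.
_tasks = new List<Task>();
foreach (ProxyEntry entry in _proxyList)
{
    // Copy the values into locals so each lambda captures its own pair.
    string proxyAddress = entry.Proxy;
    string domain = entry.Domain;
    _tasks.Add(Task.Run(() => FetchIpWithChrome(proxyAddress, domain)));
}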

The method to fetch the IP looks like this:

// Requires: using System.IO; using OpenQA.Selenium; using OpenQA.Selenium.Chrome;
private void FetchIpWithChrome(string proxyAddress, string domain)
{
    // Give each domain its own profile directory so sessions stay isolated.
    string directoryPath = Path.Combine(Application.StartupPath, domain);
    ChromeOptions options = new ChromeOptions();
    options.AddArguments(
        "--proxy-server=" + proxyAddress,
        "--incognito",
        "--new-window",
        "--user-data-dir=" + directoryPath);

    IWebDriver webDriver = new ChromeDriver(options);
    try
    {
        webDriver.Navigate().GoToUrl("http://checkip.dyndns.com/");
        if (webDriver.PageSource.Contains("Current IP Address"))
            MessageBox.Show(webDriver.FindElement(By.TagName("body")).Text, domain);
        else
            MessageBox.Show("Failed to retrieve IP", domain);
    }
    finally
    {
        // Always shut down the browser and driver process, even if navigation throws.
        webDriver.Quit();
    }
}

Are there any techniques or alternatives to optimize memory usage for multithreading in this context?

To optimize memory usage while maintaining support for Ajax/JavaScript, try these options:

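  • Use PuppeteerSharp: PuppeteerSharp is a .NET port of the Node.js Puppeteer library; it does not need Node.js itself. It drives headless Chrome/Chromium directly, without a separate driver process per browser, so JavaScript and Ajax behave as they do in Chrome. A proxy can be set per browser through the --proxy-server launch argument, and the API is Task-based, so it fits Task parallelism.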
  • Remote WebDriver Instances: Run the WebDriver instances on a separate server or in Docker containers. This does not shrink the total footprint, but it moves the memory load off the primary machine.
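  • Shared WebDrivers with Session Management: Use a session manager that shares a small pool of WebDriver instances across threads, so a new browser is not launched for every request. Because the proxy is fixed when the browser starts, each pooled instance stays bound to one proxy, and an instance should be used by only one thread at a time.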
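
A simpler approximation of the pooling idea is to cap how many browsers are alive at once, so memory use is bounded by the cap rather than by the length of the proxy list. The following is a minimal sketch under assumptions: BrowserThrottle is a hypothetical helper (not part of Selenium), and the work delegate stands in for a method such as FetchIpWithChrome from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs one unit of work per item, but never more than
// maxParallel of them at the same time.
public static class BrowserThrottle
{
    public static async Task RunAsync<T>(IEnumerable<T> items, int maxParallel, Action<T> work)
    {
        if (maxParallel < 1)
            throw new ArgumentOutOfRangeException(nameof(maxParallel));

        using (var gate = new SemaphoreSlim(maxParallel))
        {
            var tasks = items.Select(async item =>
            {
                // Wait for a free slot before starting the (blocking) browser work.
                await gate.WaitAsync();
                try
                {
                    await Task.Run(() => work(item));
                }
                finally
                {
                    // Free the slot even if the work throws.
                    gate.Release();
                }
            }).ToList();

            // The semaphore is disposed only after every task has finished.
            await Task.WhenAll(tasks);
        }
    }
}
```

With the code from the question, a call might look like this, where 4 is an arbitrary cap to tune against available memory:

    await BrowserThrottle.RunAsync(_proxyList, 4, e => FetchIpWithChrome(e.Proxy, e.Domain));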

These changes should lower resource usage while still meeting your multithreading requirements.
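
As a rough illustration of the PuppeteerSharp option, here is a sketch of the same IP check done through one headless browser per proxy. It assumes the PuppeteerSharp NuGet package is installed and that chromePath points at an existing Chrome or Chromium executable; the class and method names are hypothetical. Whether this actually uses less memory than the ChromeDriver version would need to be measured on your machine.

```csharp
using System;
using System.Threading.Tasks;
using PuppeteerSharp;

// Hypothetical helper, not part of PuppeteerSharp.
public static class PuppeteerIpCheck
{
    // chromePath: path to a local Chrome/Chromium binary (assumption: already installed).
    // proxyAddress: same "host:port" value that was passed to ChromeDriver in the question.
    public static async Task<string> FetchBodyTextAsync(string chromePath, string proxyAddress, string url)
    {
        var options = new LaunchOptions
        {
            Headless = true,
            ExecutablePath = chromePath,
            // The proxy is set per browser process, as with ChromeDriver.
            Args = new[] { "--proxy-server=" + proxyAddress }
        };

        var browser = await Puppeteer.LaunchAsync(options);
        try
        {
            var page = await browser.NewPageAsync();
            await page.GoToAsync(url);
            // Read the rendered body text after the page's scripts have run.
            return await page.EvaluateExpressionAsync<string>("document.body.innerText");
        }
        finally
        {
            // Close the browser even if navigation fails, so no process is left behind.
            await browser.CloseAsync();
        }
    }
}
```

Each call can be started with Task.Run or awaited directly, one call per proxy entry, in the same way the question's loop starts FetchIpWithChrome.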