Fingerprinting Threats of Tor Browser and Selenium

I’ve written about customizing PhantomJS to defend against fingerprinting but I never really looked at what affect Selenium (the driver that control lots of different Browsers) has on the fingerprintability of say something like the Tor Browser Bundle.

Some people at the Tor Project are looking into using tor-browser-selenium or the like to automate control of a Tor Browser instance. For tasks like trying to detect whether an exit is manipulating content to the user it would be useful to emulate exactly what a user’s experience would be with TBB.

If your goal is to make sure that automated requests look just like a normal user request, then Selenium’s has to prevent any fingerprintable differences.

Fingerprintable By Design

In the past, Selenium has been fingerintable… by design. Thanks to the W3C Webdriver specification, which says:

The webdriver IDL attribute of the Navigator interface must return the value of the webdriver-active flag, which is initially false.

This property allows websites to determine that the user agent is under control by WebDriver, and can be used to help mitigate denial-of-service attacks.

So Selenium should set the document.webdriver object to true.

You can still test this yourself: Use the tbselenium example below and drop to the JavaScript console of the browser. Check out window.navigation.webdriver and you might go “Oh no!.”

from tbselenium.tbdriver import TorBrowserDriver
with TorBrowserDriver("/home/USERNAME/.local/share/tor-project_en-US") as driver:
    driver.get('https://check.torproject.org')

But the mitigation is that FireFox currently blocks this by deleting (blocking?) that object when a page executes making it inaccessible to the page itself. So even if you had a malicious page execute arbitrary JavaScript, it can’t access navigator.webdriver value.

Selenium-Specific Browser Elements

What about other properties that are set? I wanted to compare the DOM between browsers. One is the normal Tor Browser Bundle and one is the same bundle using Selenium to control it. Below are the results diffing the DOMs between both browser instances:

< history	History { length: 2, scrollRestoration: "auto", state: null }
---
> history	History { length: 1, scrollRestoration: "auto", state: null }
< mozPaintCount	6
---
> mozPaintCount	5

This is showing is there are really no differences at all. mozPaintCount apparently is how many times the screen has been “painted”. I don’t know. The history differences is just because my normal TBB opens 2 tabs by default and the Selenium version doesn’t. That’s not something you’d be able to use.

Capabilities

The other area I wanted to look at was webdriver capabilities. Every webdriver must have a set of capabilities per the webdriver spec. These are properties like the user agent or the platform. Or it could control whether or not a feature has been enabled. Then there are “extension capabilities” which are browser specific. If tbselenium included custom extension capabilities it could affect the privacy of the browser.

Right now those capabilities for tbselenium the same ones as Firefox, with some defaults enabled * handlesAlerts: The browser can handle popup alerts * databaseEnabled: Apparently enabled to be able to load content on a page. * javascriptEnabled: Interesting choice by default but mirrors TBB defaults. * browserConnectionEnabled: This one is more concerning but I don’t know exactly what the implciations are.

The Selenium documentation says that browserConnectionEnabled determines

Whether the session can query for the browser’s connectivity and disable it if desired.

Does that mean turn off things like proxy settings or important network controls? I really don’t know.

Threats

From what I can see now, these are the types of threats you’d want to look out for:

WebDriver protocol listner: Selenium operates by implementing an HTTP service to communication between your script and the browser. If that ends up being accessible in some way to an attacker this would be an easy point of attack and probably really catastrophic overall since they could control your session.
DoS conditions: Adding another layer on top of the browser increases the attack footprint for a DoS condition which, in this case, would be an easy way to prevent visits from automated scanners. The most likely case being an unhandled exception in your code.
Misconfigured Capabilities: If the default capabilities change in the driver or Selenium changes them upstream or tbselenium changes them, you risk being able to be fingerprinted or worse. You can manually set these using the Selenium.DesiredCapabilities module.

There’s really only so much you can do. Browsers are fingerprintable especially when you think about the capabilties of machine learning. It’s just about limiting the attack surface as much as possible.