I’ve written about customizing PhantomJS to defend against fingerprinting, but I never really looked at what effect Selenium (the driver that controls lots of different browsers) has on the fingerprintability of something like the Tor Browser Bundle.
Some people at the Tor Project are looking into using tor-browser-selenium or the like to automate control of a Tor Browser instance. For tasks like trying to detect whether an exit is manipulating the content served to the user, it would be useful to emulate exactly what a user’s experience would be with TBB.
If your goal is to make sure that automated requests look just like a normal user’s requests, then Selenium has to avoid introducing any fingerprintable differences.
In the past, Selenium has been fingerprintable… by design, thanks to the W3C WebDriver specification, which says:
The webdriver IDL attribute of the Navigator interface must return the value of the webdriver-active flag, which is initially false.
This property allows websites to determine that the user agent is under control by WebDriver, and can be used to help mitigate denial-of-service attacks.
So Selenium should set the navigator.webdriver property to true.
You can still test this yourself: use the tbselenium example below and drop to the JavaScript console of the browser. Check out window.navigator.webdriver and you might go “Oh no!”
from tbselenium.tbdriver import TorBrowserDriver
with TorBrowserDriver("/home/USERNAME/.local/share/tor-project_en-US") as driver:
    driver.get('https://check.torproject.org')
But the mitigation is that Firefox currently blocks this by deleting (or blocking?) that object when a page executes, making it inaccessible to the page itself. So even if a malicious page executed arbitrary JavaScript, it couldn’t access the navigator.webdriver value.
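One way to check this from the automation side is to ask the driver to evaluate navigator.webdriver with Selenium’s execute_script call. This is only a rough sketch: the helper names are made up for illustration, and script injected through WebDriver does not necessarily run in the same context as a page’s own script, so treat the result as a hint rather than proof of what a malicious page would see.

```python
def page_visible_webdriver_flag(driver):
    """Return navigator.webdriver as evaluated through the driver.

    `driver` is assumed to be a Selenium-style driver object (for example
    the TorBrowserDriver from the example above) that has already loaded
    a page.
    """
    return driver.execute_script("return navigator.webdriver;")


def looks_automated(flag):
    # Only an explicit True marks the browser as automated. A missing
    # property comes back as None, and False means "not under WebDriver".
    return flag is True
```

With a live session you would call looks_automated(page_visible_webdriver_flag(driver)) after driver.get(...).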
What about other properties that are set? I wanted to compare the DOM between two browsers: one is the normal Tor Browser Bundle and the other is the same bundle with Selenium controlling it. Below are the results of diffing the DOMs of both browser instances:
< history History { length: 2, scrollRestoration: "auto", state: null }
---
> history History { length: 1, scrollRestoration: "auto", state: null }
< mozPaintCount 6
---
> mozPaintCount 5
What this shows is that there are really no meaningful differences at all. mozPaintCount is apparently how many times the screen has been “painted”, though I’m not certain. The history difference is just because my normal TBB opens 2 tabs by default and the Selenium version doesn’t. Neither is something you’d be able to use for fingerprinting.
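The comparison itself is easy to script once each browser’s properties have been dumped into a dictionary. The sketch below assumes that dump already exists (how it is collected is left out), and the function name and the flat "history.length" style keys are my own choices for illustration. The sample values are the ones from the diff above.

```python
def diff_properties(baseline, automated, ignore=()):
    """Return {name: (baseline_value, automated_value)} for differing properties."""
    missing = object()  # sentinel so that a stored None is not mistaken for "absent"
    differences = {}
    for name in sorted(set(baseline) | set(automated)):
        if name in ignore:
            continue
        left = baseline.get(name, missing)
        right = automated.get(name, missing)
        if left != right:
            differences[name] = (
                "<absent>" if left is missing else left,
                "<absent>" if right is missing else right,
            )
    return differences


# Values taken from the diff output above.
normal_tbb = {"history.length": 2, "history.scrollRestoration": "auto", "mozPaintCount": 6}
selenium_tbb = {"history.length": 1, "history.scrollRestoration": "auto", "mozPaintCount": 5}

print(diff_properties(normal_tbb, selenium_tbb))
# {'history.length': (2, 1), 'mozPaintCount': (6, 5)}

# Ignoring the two properties that vary between ordinary sessions leaves nothing.
print(diff_properties(normal_tbb, selenium_tbb, ignore={"history.length", "mozPaintCount"}))
# {}
```

Keeping an explicit ignore set makes it obvious which differences were judged to be harmless noise, so anything new that shows up in a later run stands out.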
The other area I wanted to look at was webdriver capabilities. Every webdriver must have a set of capabilities per the webdriver spec. These are properties like the user agent or the platform, or they can control whether or not a feature has been enabled.
Then there are “extension capabilities”, which are browser specific. If tbselenium included custom extension capabilities, it could affect the privacy of the browser.
Right now the capabilities for tbselenium are the same ones as Firefox, with some defaults enabled:
* handlesAlerts: The browser can handle popup alerts.
* databaseEnabled: Apparently enabled to be able to load content on a page.
* javascriptEnabled: Interesting choice by default but mirrors TBB defaults.
* browserConnectionEnabled: This one is more concerning but I don’t know exactly what the implications are.
The Selenium documentation says that browserConnectionEnabled determines:
Whether the session can query for the browser’s connectivity and disable it if desired.
Does that mean it could turn off things like proxy settings or other important network controls? I really don’t know.
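A simple guard is to audit the capabilities a session reports and flag anything enabled that you did not expect. In Selenium’s Python bindings the live session exposes these as the driver.capabilities dictionary; the sketch below works on a plain dictionary so it runs without a browser. Treating the four capabilities listed above as the expected baseline is my assumption, and the extra capability name in the sample is invented purely for illustration.

```python
# Capability names from the list above. Using them as the "expected"
# baseline is an assumption made for this sketch.
EXPECTED_ENABLED = {
    "handlesAlerts",
    "databaseEnabled",
    "javascriptEnabled",
    "browserConnectionEnabled",
}


def unexpected_enabled_capabilities(capabilities, expected=EXPECTED_ENABLED):
    """Return the sorted names of capabilities set to True that are not expected."""
    return sorted(
        name
        for name, value in capabilities.items()
        if value is True and name not in expected
    )


sample = {
    "browserName": "firefox",
    "javascriptEnabled": True,
    "handlesAlerts": True,
    "hypotheticalTrackingFeature": True,  # made-up name, for illustration only
}

print(unexpected_enabled_capabilities(sample))
# ['hypotheticalTrackingFeature']
```

With a real session you would pass driver.capabilities instead of the sample dictionary and review whatever comes back before trusting the setup.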
From what I can see now, these are the types of threats you’d want to look out for:
Selenium.DesiredCapabilities module.
There’s really only so much you can do. Browsers are fingerprintable, especially when you think about the capabilities of machine learning. It’s just about limiting the attack surface as much as possible.