GUEST ESSAY: Why there’s no such thing as anonymity in this digital age

By Goddy Ray

Unless you decide to go Henry David Thoreau and shun civilization altogether, you can’t — and won’t — stop generating data, which sooner or later can be traced back to you.


A few weeks back I interviewed a white hat hacker. After the interview, I told him that his examples gave me paranoia. He laughed and responded, “There’s no such thing as anonymous data; it all depends on how determined the other party is.”

App developers, credit card companies, telecommunications providers, and others use the term “anonymous data” because it sells. But anonymous data really doesn’t exist anymore.

Every step we take online is recorded and stored – our interactions with devices, geolocation, voter registration, time stamps, etc. Machine learning (ML) is currently the leading technique for re-identifying data. Specially designed algorithms make pattern recognition much faster and more efficient, and re-identification accuracy sometimes reaches 90% or more.

De-anonymization

In fact, 63% of the population can be uniquely identified by just the combination of their gender, date of birth, and zip code.
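To make the idea concrete, here is a minimal sketch of how such a uniqueness figure can be measured. It counts how many records in a dataset are the only ones with their particular combination of gender, date of birth, and zip code. The records and the function name `unique_fraction` are hypothetical and invented for illustration; this is not the method behind the 63% figure, only a toy version of the same kind of count.

```python
from collections import Counter

# Hypothetical toy records: (gender, date of birth, zip code). Not real data.
records = [
    ("F", "1985-03-14", "02139"),
    ("M", "1985-03-14", "02139"),
    ("F", "1990-07-01", "02139"),
    ("F", "1990-07-01", "02139"),
    ("M", "1972-11-30", "10001"),
]


def unique_fraction(rows):
    """Return the share of rows whose attribute combination appears exactly once."""
    counts = Counter(rows)
    unique = sum(1 for row in rows if counts[row] == 1)
    return unique / len(rows)


fraction = unique_fraction(records)
# Three of the five toy records are one of a kind, so this prints 60%.
print(f"{fraction:.0%} of records are unique on gender + birth date + zip")
```

Anyone holding a second dataset with the same three fields plus names, such as a voter roll, could match each unique record to a single person, even though the first dataset contains no names at all.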

“Anonymous” or “aggregated” large datasets are often released publicly. As a result, de-anonymization tools are becoming increasingly advanced. Here are a few unexpected examples of supposedly anonymous data being reversed:

• In 2016, the Australian government released what it called the “anonymous” medical data of 2.9 million people (i.e., names and other identifying features had been removed). Later, a team of researchers from the University of Melbourne discovered that the data could be used to pinpoint individuals and learn all about their medical history.

• Developers can be identified by their coding style, with an attribution accuracy of up to 96% among 100 candidate programmers and 83% among 600.

• Mobile eye-tracking can be used to learn how much we understand while reading.

• Vocal patterns can reveal signs of post-traumatic stress disorder or even heart disease.

Metadata spying

Metadata deserves separate examination. Despite warnings from cybersecurity academics, whistleblowers, and former NSA and CIA agents, the general public shows no animosity toward metadata surveillance.

Perhaps this is because government officials put up a smokescreen by saying, “It’s just plain old metadata, don’t worry about it,” or because vague laws don’t clarify when we become subject to metadata surveillance.

Metadata can be used to reverse-engineer and pinpoint identity quite easily. Recently, researchers from Greece and the US examined location metadata. They presented LPAuditor, a tool to conduct “a comprehensive evaluation of the privacy loss caused by publicly available location metadata.” This research demonstrated that the exposure of such data causes serious privacy risks which can lead to de-anonymization or even physical threats.

Another example comes from MIT and Boston University. Researchers showed that black-box network metadata from Android devices, such as transmission times and packet sizes, can be used to identify users’ web browsing history.

Unfortunately, it’s almost impossible to stop metadata from spreading in all directions. When you send an email, you generate From, To, Cc and timestamp metadata. When you take a picture, it carries location, time and exposure metadata. When you visit secure (HTTPS) websites, the IP address and domain you request are still visible.
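As a rough illustration of the email case, the sketch below uses Python’s standard `email` module to pull the sender, recipients, and timestamp out of a message without ever reading its body. The message, the addresses, and the helper name `extract_metadata` are made up for this example.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical message; the addresses and date are invented for illustration.
raw = """\
From: Alice <alice@example.com>
To: Bob <bob@example.com>
Cc: Carol <carol@example.com>
Date: Mon, 04 Mar 2019 09:15:00 +0000
Subject: Lunch?

See you at noon.
"""


def extract_metadata(raw_message):
    """Collect who-talked-to-whom-and-when from the headers only."""
    msg = message_from_string(raw_message)

    def addresses(header):
        # get_all returns every occurrence of the header, or [] if absent.
        return [addr for _, addr in getaddresses(msg.get_all(header, []))]

    return {
        "from": addresses("From"),
        "to": addresses("To"),
        "cc": addresses("Cc"),
        "timestamp": parsedate_to_datetime(msg["Date"]),
    }


meta = extract_metadata(raw)
for field, value in meta.items():
    print(field, value)
```

Collected over weeks or months, records like these are enough to map a person’s contacts and daily routine, whatever the messages themselves say.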

The consensus used to be that if data is scrubbed, it can’t be used to identify individuals and is hence suitable for analysis and marketing. However, this no longer holds. De-anonymization can be shockingly accurate, yet users, policymakers, and businesses seem to struggle to adjust to the new reality.


LW contributor Goddy Ray is a content manager and researcher at Surfshark VPN. She’s a devoted security and privacy enthusiast with a focus on public education and communication. 
