DEEP TECH NEWS: Respecting individual rights by using ‘privacy preserving aggregate statistics’

By Byron V. Acohido

To sell us more goods and services, the algorithms of Google, Facebook and Amazon exhaustively parse our digital footprints.

Related: The role of ‘attribute based encryption’

There’s nothing intrinsically wrong with companies seeking to better understand their customers. However, over the past 20 years the practice of analyzing user data hasn’t advanced much beyond serving the business models of these tech giants.

That could be about to change. Scientists at NTT Research are working on an advanced type of cryptography that enables businesses to perform aggregate data analysis on user data — without infringing upon individual privacy rights.

I had the chance to visit with Elette Boyle, senior scientist at NTT Research’s Cryptography & Information Security (CIS) Lab, to learn more about the progress being made on a promising concept called “privacy preserving aggregate statistics.”

Rising data privacy regulations underscore the need for such a capability, Boyle told me. And in the long run, the capacity to analyze our online behaviors in a much more inspired, respectful way could serve a much greater good than just fostering impulsive consumer purchases. For a full drill down, please view the accompanying videocast. Here are a few key takeaways:

Rising regulations

It’s not just the tech giants that have a strategic imperative to better understand user behaviors. Companies across all industries have long sought to better understand how consumers use their products and services; this guides their product improvements and can dictate future investments, often shaping the next big innovations.

Our smartphones, wearables, vehicles and buildings have come to be saturated with sensors that collect granular information about our daily activities and provide a wellspring of information about what we prefer and how we behave. However, this intensive ingestion of personal data points — in the absence of reasonable oversight — has triggered consumer anxiety, and rightly so.

This, in turn, has led to rising data privacy regulations. Europe’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), for instance, are two significant pieces of legislation aimed at protecting consumer privacy in the digital age. Both regulations have profound implications for companies seeking to collect and apply aggregate statistical analysis to consumer data.

GDPR requires companies to establish a legal basis for data processing as well as ensure that the aggregation and anonymization methods protect individual identities. Meanwhile, CCPA focuses on ensuring that personal information isn’t sold without the consumer’s knowledge or against their will.

Partitioning user data

So now the rub is this: companies yearn to extract useful insights from user data, yet many have lost sight of the fact that it’s going to become much more expensive for them to possess granular tracking details, going forward. This has led NTT Research to seek a way to enable businesses to perform aggregate data analysis on consumer data — with privacy built in, Boyle says.

Privacy preserving aggregate statistics revolves around partitioning sensitive user data into pieces, each of which on its own reveals nothing about the original. Meaningful computations can still be performed on the pieces, and the results can eventually be recombined. Boyle explained how a private telemetry system can be set up to split sensitive user data into two segments in such a manner.

One segment retains broad, general information, useful for tracking usage patterns; the other segment converts the individual’s private details into a random sequence of zeros and ones. As more data pours in from other users, the former gets aggregated to give shape to emerging patterns, while the latter remains incomprehensible, ensuring that individual privacy remains sacrosanct.

Beyond meeting compliance, this approach can improve the bottom line, she says, by significantly reducing the cost associated with collecting and storing sensitive personal data. NTT Research is developing the technology and getting in position to supply it, Boyle says.

“The goal is to develop solutions that allow us to only learn aggregate information, while never touching the data of individuals, in some sense, by taking private information and splitting it into pieces,” she says. “The tricky part is designing this splitting procedure so that you can actually compute on these pieces separately.”
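
To make the splitting idea concrete, here is a minimal sketch in Python of one textbook technique, additive secret sharing between two servers. It is an illustration under stated assumptions, not a description of NTT Research's actual system: the modulus, the function names and the usage figures are all hypothetical. In this construction each piece on its own looks like a random number, yet the two servers' separately computed totals recombine into the true aggregate.

```python
import secrets

# Hypothetical modulus; it only needs to be larger than any true total.
MODULUS = 2**61 - 1


def split(value: int) -> tuple[int, int]:
    """Split a value into two additive shares; each share alone is uniformly random."""
    share_a = secrets.randbelow(MODULUS)
    share_b = (value - share_a) % MODULUS
    return share_a, share_b


def aggregate(shares: list[int]) -> int:
    """Run by one server: add up the shares it received, never seeing raw values."""
    return sum(shares) % MODULUS


def recombine(total_a: int, total_b: int) -> int:
    """Combine the two servers' totals to reveal only the aggregate."""
    return (total_a + total_b) % MODULUS


# Hypothetical telemetry: minutes of app usage reported by four users.
usage_minutes = [12, 45, 7, 30]
pairs = [split(v) for v in usage_minutes]

total_a = aggregate([a for a, _ in pairs])  # everything server A ever sees
total_b = aggregate([b for _, b in pairs])  # everything server B ever sees
total = recombine(total_a, total_b)
print(total)  # 94
```

The privacy guarantee in this sketch rests on the assumption that the two servers do not pool their shares; if they did, every individual value could be reconstructed.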

A greater good

In a world that’s becoming increasingly cautious about data privacy, this new twist to data analysis could help businesses comply with privacy regulations and temper consumer anxiety. It could also provide a means for businesses to gain data-driven insights in a more efficient, respectful way.


Boyle pointed out how companies across all industries (healthcare, financial services, energy and consumer goods) could immediately leverage this new approach in a way that would allow them to begin to extract much more useful insights from the data lakes of consumer data swelling somewhat randomly.

They’d be able to examine the steadily rising influx of consumer data at a summarized level and discover overall patterns and trends. NTT Research, for instance, has successfully tested advanced privacy-preserving computations on common statistical measures such as histograms, means and standard deviations, maximums and minimums, most common values and more.
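
As a rough illustration of how a histogram, and from it a most common value, could be computed without any server seeing an individual answer, consider the sketch below. It assumes the same two-server additive sharing idea, applied to a "one-hot" vector (a 1 in the user's bucket, 0 elsewhere). The bucket labels, the modulus and the sample reports are hypothetical, and a real deployment would also need to check that each submitted vector is well formed, which this sketch omits.

```python
import secrets

MODULUS = 2**32  # hypothetical; must exceed the number of users
BUCKETS = ["0-1h", "1-3h", "3h+"]  # hypothetical daily-usage buckets


def share_one_hot(bucket_index: int, num_buckets: int) -> tuple[list[int], list[int]]:
    """Encode a bucket choice as a one-hot vector and split it into two random-looking shares."""
    one_hot = [1 if i == bucket_index else 0 for i in range(num_buckets)]
    share_a = [secrets.randbelow(MODULUS) for _ in one_hot]
    share_b = [(x - a) % MODULUS for x, a in zip(one_hot, share_a)]
    return share_a, share_b


def add_vectors(vectors: list[list[int]]) -> list[int]:
    """Run by one server: element-wise sum of the share vectors it holds."""
    return [sum(column) % MODULUS for column in zip(*vectors)]


# Hypothetical reports: each user's bucket index.
reports = [0, 2, 1, 1, 2, 1]
shared = [share_one_hot(r, len(BUCKETS)) for r in reports]

sum_a = add_vectors([a for a, _ in shared])  # server A's running totals
sum_b = add_vectors([b for _, b in shared])  # server B's running totals

# Only the combined totals reveal anything: the histogram itself.
histogram = [(a + b) % MODULUS for a, b in zip(sum_a, sum_b)]
most_common = BUCKETS[histogram.index(max(histogram))]

print(dict(zip(BUCKETS, histogram)))  # {'0-1h': 1, '1-3h': 3, '3h+': 2}
print(most_common)  # 1-3h
```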

That’s just a starting point. As this type of advanced cryptography moves into mainstream use, it has the potential to inspire innovators to leverage our digital footprints for more than just tweaking advertisements.

In one project, for instance, social scientists in Boston applied privacy-preserving computations to wages and benefits data for employees across several companies to determine whether there was a wage gap between males and females.
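
A study of that kind could, in principle, be structured along the lines of the sketch below. This is not the method the Boston researchers used, which the article does not detail. It assumes each company splits its payroll totals and headcounts into two additive shares, one per independent server, so that only the cross-company totals are ever revealed. All figures are made-up toy numbers and every name is hypothetical.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical; must exceed any true total


def split(value: int) -> tuple[int, int]:
    """Split a value into two additive shares that look random on their own."""
    share_a = secrets.randbelow(MODULUS)
    return share_a, (value - share_a) % MODULUS


# Toy figures, one tuple per company:
# (total wages men, headcount men, total wages women, headcount women)
company_data = [
    (500_000, 10, 360_000, 8),
    (300_000, 5, 330_000, 6),
    (400_000, 5, 270_000, 6),
]

server_a = [0, 0, 0, 0]  # running totals of the shares sent to server A
server_b = [0, 0, 0, 0]  # running totals of the shares sent to server B
for record in company_data:
    for i, value in enumerate(record):
        share_a, share_b = split(value)
        server_a[i] = (server_a[i] + share_a) % MODULUS
        server_b[i] = (server_b[i] + share_b) % MODULUS

# Recombining the two servers' totals reveals only cross-company aggregates.
wages_m, count_m, wages_w, count_w = [
    (a + b) % MODULUS for a, b in zip(server_a, server_b)
]
gap = wages_m / count_m - wages_w / count_w
print(gap)  # 12000.0
```

No single company's payroll appears anywhere in the servers' totals; the analysts see only the four combined numbers and the gap derived from them.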

It’s not hard to imagine how privacy-preserving statistical analysis could help climatologists better understand energy usage patterns, or medical researchers track the spread of a disease.

“Being able to somehow combine this information and learn something globally across it can have tremendous power,” Boyle says. “It’s very exciting to be in a position where mathematical concepts like abstract algebra actually play a role in designing logical systems that help solve big problems.”

The transformation progresses. I’ll keep watch and keep reporting.


Pulitzer Prize-winning business journalist Byron V. Acohido is dedicated to fostering public awareness about how to make the Internet as private and secure as it ought to be.

(LW provides consulting services to the vendors we cover.)
