Google makes privacy-focused data analysis tool open source

Its differential privacy library has helped shape many of the company's core products

Google is launching an open source version of its internally used differential privacy library, allowing businesses and data scientists to generate insights from data while protecting the privacy of the people to whom it belongs.

Google uses its differential privacy library to improve many of its core products. It is how Search can show, for example, how busy a business such as a gym is at certain times, or how popular a dish is at a given restaurant.

Differential privacy is an approach to data science which involves taking large amounts of user data and obfuscating it with carefully calibrated random noise - enough to hide any individual user's contribution but not so much that useful insights can't be drawn from the aggregate results using software-aided analysis.
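
As a rough illustration of that trade-off, the sketch below uses the Laplace mechanism, one common way of adding noise to a count. It is not Google's library or its API: the function names, the `epsilon` parameter name and the sample data are all assumptions made for this example. A smaller `epsilon` means more noise and stronger privacy.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on zero with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(values, predicate, epsilon, rng=None):
    """Return a count of matching values with Laplace noise added (illustrative only)."""
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for v in values if predicate(v))
    # Adding or removing one person changes a count by at most 1,
    # so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical data: the hour of day at which each visitor arrived at a gym.
    visit_hours = [9, 18, 18, 19, 7, 18, 12, 18]
    print(noisy_count(visit_hours, lambda hour: hour == 18, epsilon=1.0))
```

Each run prints a slightly different figure close to the true count of four, so the busy period is still visible while no single visit can be confirmed from the output.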

Businesses can now use Google's library to start forming their own conclusions from big datasets without their customers losing trust in their brand, the company argues.

Search is not the only example: Google has embedded differential privacy in its products since 2014. RAPPOR (Randomised Aggregatable Privacy-Preserving Ordinal Response) was a Chrome privacy project designed to better safeguard users' security, find bugs, and improve the overall user experience while analysing user data.

Adding to the growing list of privacy-minded applications, TensorFlow Privacy was introduced this year to help protect users from being identified when their data is used to train AI algorithms.

Apple is another company that's been hot on embedding differential privacy into its work. Since 2016, the privacy mechanism has been used in its machine learning algorithms to analyse the plethora of data it takes from its customers' iPhones.

Data is becoming increasingly valuable. Some experts even say it's the most valuable commodity in the world, and it's something that hackers can steal and sell on for profit.

In a world where data breaches are rife, protecting data and the user to whom it belongs can be a hugely significant factor when it comes to maintaining customer trust.

Unfortunately, not every company gets it right - even the big names. In the late 2000s, a well-meaning Netflix aimed to improve its film recommendation algorithm by using supposedly anonymised data, which was eventually found to be insufficiently protected.

Researchers were able to reveal user identities from the large dataset and even pinpoint their political affiliation.

"This sort of thing should be worrying to us," said Matthew Green, cryptography professor at Johns Hopkins University in a blog post.

"Not just because companies routinely share data (though they do) but because breaches happen, and because even statistics about a dataset can sometimes leak information about the individual records used to compute it," he added. "Differential Privacy is a set of tools that was designed to address this problem."

One real-world benefit of a differential privacy approach relates to health research, as explained by Miguel Guevara, product manager, privacy and data protection office at Google.

"If you are a health researcher, you may want to compare the average amount of time patients remain admitted across various hospitals in order to determine if there are differences in care," he said.

"Differential privacy is a high-assurance, analytic means of ensuring that use cases like this are addressed in a privacy-preserving manner."

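The hospital comparison Guevara describes could be sketched along the following lines. This is a minimal illustration and not the API of Google's library: the function name, the bounds on length of stay and the sample figures are assumptions, and it further assumes the number of patients per hospital is public and non-zero. Each stay is clamped to a fixed range so that no single patient can shift the average by more than a known amount, and Laplace noise scaled to that amount is then added.

```python
import random


def private_mean(values, lower, upper, epsilon, rng=None):
    """Return a noisy mean of `values` clamped to [lower, upper] (illustrative only)."""
    if rng is None:
        rng = random.Random()
    # Clamp every record so one unusually long stay cannot dominate the result.
    clamped = [min(max(v, lower), upper) for v in values]
    n = len(clamped)  # assumed to be public and greater than zero
    # Replacing one record changes the mean by at most (upper - lower) / n.
    scale = (upper - lower) / (n * epsilon)
    # The difference of two exponential draws with mean `scale` is Laplace noise.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(clamped) / n + noise


if __name__ == "__main__":
    # Hypothetical lengths of stay, in days, at two hospitals.
    hospital_a = [3, 5, 2, 8, 4] * 200
    hospital_b = [6, 7, 3, 9, 5] * 200
    for name, stays in (("A", hospital_a), ("B", hospital_b)):
        print(name, round(private_mean(stays, lower=0, upper=30, epsilon=1.0), 2))
```

With a thousand patients per hospital the added noise is small compared with the gap between the two averages, so the comparison survives even though no individual record can be recovered from the output.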