Google adds Python support to privacy-preserving data analysis tool
The addition of Python opens up the open-source differential privacy library to nearly half of all developers worldwide


Google has expanded its open-source differential privacy (DP) platform to support the Python programming language, widening availability to millions more developers and data analysts.
The announcement makes Python the fourth language supported by the project, which launched in 2019 with support for C++, Java, and Go, the Google-created language sometimes referred to as Golang.
The move follows what Google described as significant interest from developers, many of whom contacted the company about using the open-source library in their Python projects. Google worked with OpenMined for more than a year on Python support and said numerous projects have already used its DP library, including Australian developers who have accelerated scientific discoveries by analysing medical data privately.
DP is a system used by data analysts to preserve the privacy of the individuals whose data is used in an analysed data set. Work to develop strong DP dates back decades, but only in recent years have tech giants such as Google and Apple embraced the system.
One of Google's key areas of DP development over the past year has been providing a tool within the library that lets developers fine-tune 'epsilon', a mathematical measure of privacy. Finding the optimum epsilon takes a great deal of trial and error. A lower epsilon indicates a more private release, so a built-in tool for making those adjustments means individual projects can be tuned to be as private as possible.
Google said that with Python now supported, the DP library is available to nearly half of all developers worldwide, which means more developers and researchers will be able to analyse data and make new discoveries while preserving the privacy of the users to whom the data belongs.
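The link between epsilon and privacy can be sketched with the Laplace mechanism, a standard textbook technique used here purely as an illustration; it is not taken from Google's announcement or from the library's API. The function name `laplace_scale` and the sensitivity of 1 are assumptions made for the example.

```python
def laplace_scale(sensitivity, epsilon):
    """Return the Laplace noise scale for a given sensitivity and epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # Smaller epsilon (more privacy) means a larger noise scale.
    return sensitivity / epsilon


# Assumed sensitivity of 1, e.g. a count that one person can change by at most 1.
for epsilon in (0.1, 0.5, 1.0, 5.0):
    scale = laplace_scale(1.0, epsilon)
    print(f"epsilon={epsilon}: typical noise magnitude ~ {scale:.1f}")
```

Under these assumptions, halving epsilon doubles the noise added to a result, which is why tuning it is a trade-off between privacy and accuracy.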
Python is among the most popular programming languages currently in use and won 'Language of the Year 2021' from the TIOBE index, which ranks programming languages based on their popularity. Python is useful for a wide range of programming activities but is especially well-known for its capabilities in data analysis, making it a natural progression for Google's DP library.
As part of the launch, Google has released a new web-based product, pipelinedp.io, which allows any Python developer to analyse their dataset with differential privacy. Google also said it has seen organisations experimenting with new use cases such as showing a website's most visited web pages by country, in an anonymised fashion.
The library is compatible with the Spark and Beam frameworks, two leading large-scale data processing engines, and Google will be launching an additional tool to help users "visualise and better tune the parameters used to produce differentially private information".
"We encourage developers around the world to take this opportunity to experiment with differential privacy use cases like statistical analysis and machine learning, but most importantly, provide us with feedback," Google said in announcing the news. "We are excited to learn more about the applications you all can develop and the features we can provide to help along the way.
"We will continue investing in democratising access to critical privacy-enhancing technologies and hope developers join us in this journey to improve usability and coverage. As we’ve said before, we believe that every Internet user in the world deserves world-class privacy, and we’ll continue partnering with organisations to further that goal."
What is differential privacy?
Differential privacy is a tool that has gained acclaim in recent years as data and identity protection have become focal points for researchers, businesses, and regulators alike.
Some argue it is fundamentally necessary in data analytics to preserve the privacy and hide the identity of people whose data is being analysed. For technology companies especially, it has become central to how users expect them to handle the personal data they hold.
DP works by adding 'controlled noise' to datasets so that individuals cannot be identified from the data they contribute. Suppose, for example, that residents of a neighbourhood supplied their salaries for an analysis that published only the average. If one resident then left the neighbourhood, comparing the average before and after the move would reveal that resident's salary, tying the figure to their identity.
Similarly, if two databases were analysed, one holding a single data point on each of 50 people and the other on each of 51 people, the results of the two analyses would have to be indistinguishable from each other to qualify as differentially private, so that the 51st person could not be identified.
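This requirement corresponds to the standard formal definition of differential privacy, sketched below. The symbols are not part of the original announcement: M stands for the randomised analysis, D and D' for two datasets that differ in one person's record, S for any set of possible results, and ε for the epsilon privacy parameter.

```latex
\[
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
\]
```

The smaller ε is, the closer the two probabilities must be, and the harder it becomes to tell whether any one person's data was included.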
Adding controlled noise to a dataset would remove the possibility of identifying an individual by skewing the statistics just enough to remove the element of identification, without significantly compromising the accuracy of the results.
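The salary example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Google's library or its API: the salaries are invented, the 100,000 cap and the function names are hypothetical, and the noise comes from the textbook Laplace mechanism. The floating-point sampling shown here is not hardened for production use.

```python
import random

# Hypothetical salaries, invented purely for illustration.
SALARIES = {"resident_a": 40_000, "resident_b": 55_000,
            "resident_c": 70_000, "resident_d": 35_000}
SALARY_CAP = 100_000  # assumed upper bound; values are clamped to [0, cap]


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def exact_average(salaries):
    return sum(salaries.values()) / len(salaries)


def noisy_sum(salaries, epsilon, rng):
    # One person can move a clamped sum by at most SALARY_CAP,
    # so the Laplace scale is SALARY_CAP / epsilon.
    clamped = sum(min(max(s, 0), SALARY_CAP) for s in salaries.values())
    return clamped + laplace_noise(SALARY_CAP / epsilon, rng)


after_move = {k: v for k, v in SALARIES.items() if k != "resident_c"}

# Differencing attack on exact averages: average * headcount gives the total.
leaked = (exact_average(SALARIES) * len(SALARIES)
          - exact_average(after_move) * len(after_move))

# The same attack against noisy releases only yields a blurred estimate.
rng = random.Random(0)
blurred = noisy_sum(SALARIES, 1.0, rng) - noisy_sum(after_move, 1.0, rng)

print(round(leaked))   # the departed resident's salary, recovered exactly
print(round(blurred))  # salary plus noise; varies with the seed
```

With only four residents the noise swamps the signal, so the released figures would be of little use. On larger datasets the same amount of noise is small relative to the totals, which is how accuracy can be preserved while individuals stay hidden.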
All major Big Tech firms have embraced DP in different ways. Microsoft's AI Lab works with Harvard University on projects to facilitate DP-enabled research. Apple has used DP in its products since macOS Sierra and iOS 10, and Facebook and Amazon also have experience working with the system.

Connor Jones has been at the forefront of global cyber security news coverage for the past few years, breaking developments on major stories such as LockBit’s ransomware attack on Royal Mail International, and many others. He has also made sporadic appearances on the ITPro Podcast discussing topics from home desk setups all the way to hacking systems using prosthetic limbs. He has a master’s degree in Magazine Journalism from the University of Sheffield, and has previously written for the likes of Red Bull Esports and UNILAD tech during his career that started in 2015.