Privacy Preserving Machine Learning 🔐

Lately, I’ve been working on #PrivacyPreservingML 🔐 I got looped into some projects after Apple launched the AppTrackingTransparency (ATT) framework, which requires iOS apps to ask for permission before sharing users’ data w/ 3rd parties. This has triggered an industry-wide discussion on best practices for respecting user privacy.

ML models are only as good as the data we feed them 🐣 In the online world, it’s tempting to use data from each & every move a user makes, sharing info across companies to make the models smarter -- all in the guise of ‘serving the users better’. #DataCentricAI

But using every data point w/o protecting it has led to cyber attacks, reverse engineering, and leaks of sensitive data like personal conversations, financial transactions, medical history, etc. This is where #PrivacyPreservingML comes into play!


To put it simply:

Privacy-Preserving ML = Data Privacy + Model Privacy


Data Privacy = Using authorized data + not exposing the training data & its source + protecting input & output data. Check out the “Secret Sharer” paper by Carlini et al. -- certain sequences (e.g., credit card numbers, SSNs) can be unintentionally memorized by a model and later extracted!
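
One simple mitigation on the data side is to scrub likely secrets before they ever reach the training set. Here’s a minimal sketch of that idea -- the regex patterns and the `redact` helper are illustrative assumptions of mine, and real PII detection needs far sturdier tooling:

```python
import re

# Illustrative patterns for two kinds of secrets the Secret Sharer paper
# warns about (sketch-only assumptions; real PII detection is much harder).
PATTERNS = {
    "CREDIT_CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace likely secrets with placeholder tokens before training."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

print(redact("Card: 4111 1111 1111 1111, SSN: 123-45-6789"))
# -> Card: <CREDIT_CARD>, SSN: <SSN>
```

Training on the redacted corpus means a memorized sequence is at worst a placeholder token, not the secret itself.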

Model Privacy = Protecting the model and its nuances (aka parameters) from the public eye. Beyond the loss of Intellectual Property, an exposed model can be used to reverse engineer inputs from outputs -- a dangerous problem in many sensitive situations.
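
Classic model-inversion attacks feed on the full-precision confidence scores a prediction API returns. Below is a minimal sketch of one partial defense -- return only the top label with a coarsened confidence; the function name and the example numbers are purely illustrative:

```python
import numpy as np

def harden_prediction(probs: np.ndarray, decimals: int = 1):
    """Return only the top-1 label and a coarsened confidence.

    Full-precision probability vectors are exactly what model-inversion
    attacks exploit; truncating the output is a simple (partial) defense.
    """
    top = int(np.argmax(probs))
    return top, round(float(probs[top]), decimals)

# Hypothetical raw output of some 3-class classifier:
print(harden_prediction(np.array([0.03, 0.91, 0.06])))  # -> (1, 0.9)
```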

There are many techniques to ensure that data cannot be stolen by a third party and that the model itself stays private -- differential privacy is a classic example (tiny sketch below).
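
As a teaser, here’s the textbook Laplace mechanism for releasing a numeric query answer with ε-differential privacy; assume a simple count query whose answer changes by at most 1 when one person’s record is added or removed:

```python
import numpy as np

def laplace_mechanism(true_answer: float, sensitivity: float, epsilon: float) -> float:
    """Textbook epsilon-DP release of a numeric query:
    add Laplace noise with scale = sensitivity / epsilon."""
    return true_answer + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# A count query has sensitivity 1 (one person changes the count by at most 1):
print(laplace_mechanism(true_answer=42, sensitivity=1.0, epsilon=0.5))
```

Might come back to these #PrivacyPreservingML techniques in more detail some other time, until then CIAO 👋🏻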

On that note, if you are someone who is still getting started on #DataForML and need a boost to your #MachineLearning dataset-building skills using #Python and all things #OpenSource, check out @DataForML on @amazon 🛒



Written on July 10, 2022