How do you keep up with the countless innovations and state-of-the-art R&D happening?
Did you know that the ML benchmark datasets we heavily rely on like MNIST, ImageNet, CIFAR, etc., can have thousands of errors in their labels? 😱 Check out labelerrors.com for a few examples.
I just read this very interesting article -- it talks about the lives of human data annotators (especially women) based in small towns and villages in India. They are the lifeline behind the most unglamorous part of the AI pipeline: data annotation, or labeling. I also learnt that India is one of the world's largest markets for data annotation labor!
Tonight's explorations led me to the ✨gold✨ standard for mitigating data leakage in #ML -- #DifferentialPrivacy. The idea is to add carefully calibrated statistical noise (to the dataset or to results computed from it) so that it is provably hard to infer information about any individual data point.
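To make the noise idea concrete, here is a minimal sketch of one common approach, the Laplace mechanism, applied to a simple counting query. This is an illustration under assumptions, not a production recipe: the function names (`laplace_noise`, `private_count`), the toy `ages` list, and the choice of `epsilon` are all hypothetical, and the noise here is added to the query result rather than to the raw records.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records matching `predicate`.

    Hypothetical helper: a smaller epsilon means more noise and
    stronger privacy, at the cost of a less accurate answer.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one record changes a count by at most 1,
    # so the sensitivity of a counting query is 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    ages = [23, 35, 41, 29, 52, 38, 47, 31]  # made-up toy data
    noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
    print(f"Noisy count of people aged 40+: {noisy:.2f}")
```

The true count in the toy data is 3, but each run with a fresh random source returns a slightly different answer, which is what keeps any single person's presence in the data hard to pin down. Real systems also track the total privacy budget spent across queries, which this sketch leaves out.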
Lately, I've been working on #PrivacyPreservingML. I got looped into some projects after Apple launched the AppTrackingTransparency (ATT) framework, requiring iOS apps to ask permission to share users' data w/ 3rd parties. This has triggered an industry-wide discussion on best practices to respect user privacy.
DATA is the new oil 🛢️ As #DataCentric approaches to #ML gain traction, access to diverse, comprehensive, and, more importantly, quality data has been the talk of the town. Along these lines, it's important to understand what QUALITY really means in the context of DATA 🧵👇🏻
Hey folks 👋🏻 For those who missed the talk by @AndrewYNg on the #DataCentric approach to #MachineLearning, which aligns with our mission at @DataForML, here is a quick recap 🧵👇🏻
Machine Learning models are only as good as the data they consume. Data impacts the performance, fairness, robustness & scalability of #ML systems. If not taken care of, it leads to a TON of tech debt over time in a corporate setting, the downstream effects of which are termed DATA CASCADES 🧵👇🏻
I have been working professionally as a Machine Learning Engineer for more than 2 years now and also recently co-authored a book titled “Sculpting Data for ML: The first act of Machine Learning”. My past few experiences have taught me that data does not get its due limelight in #MachineLearning compared to complex model architectures. In keeping with 'more data beats clever algorithms, but better data beats more data', here are my top 5 tips for polishing the dataset to effectively solve #ML problems 👇🏻
2020 was unprecedented. My heart goes out to all who had a difficult year. Losing the privilege to socialize as much and be outdoors gave Rishabh Misra and me an opportunity to do something that was always on our bucket list.