Data Correction with Cleanlab 🧼

Did you know that the ML benchmark datasets we rely on so heavily, like MNIST, ImageNet, and CIFAR, can have thousands of errors in their labels? 😱 Check out labelerrors.com for a few examples.

Last week, I attended Snorkel AI's conference on #DataCentricAI, where one of my favorite talks was Cleanlab: AI for Correcting Errors in Any Dataset!

It was fascinating to learn that what started as a graduate school project, born from the idea of finding an erroneous label in a benchmark dataset and using #ConfidentLearning to find and fix label errors in ML datasets, is now being used to correct data at scale by companies like Microsoft, Tesla, and Google. That is fab 👏🏻
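
For the curious, here is a rough sketch of the core idea behind #ConfidentLearning as I understand it: compare each example's given label against the class the model is confidently predicting, using a per-class confidence threshold. This is a simplified illustration in plain Python, not Cleanlab's actual implementation or API. The function name `find_label_issues_sketch` and the toy data are my own assumptions, and in practice the predicted probabilities should come from a model evaluated out-of-sample (for example, via cross-validation).

```python
def find_label_issues_sketch(labels, pred_probs):
    """Return indices of examples whose given label looks suspicious.

    Hypothetical helper for illustration only (not part of cleanlab).
    labels:     list of integer class labels, one per example
    pred_probs: list of per-class predicted probabilities, one row per example
    """
    num_classes = len(pred_probs[0])

    # Per-class threshold: the average predicted probability of class k
    # over the examples that were labeled k.
    thresholds = []
    for k in range(num_classes):
        probs_k = [p[k] for y, p in zip(labels, pred_probs) if y == k]
        # Assumption: if a class has no labeled examples, never treat it as confident.
        thresholds.append(sum(probs_k) / len(probs_k) if probs_k else 1.0)

    issues = []
    for i, (y, p) in enumerate(zip(labels, pred_probs)):
        # Classes whose predicted probability clears that class's threshold.
        confident = [k for k in range(num_classes) if p[k] >= thresholds[k]]
        if not confident:
            continue  # the model is not confident about any class here
        best = max(confident, key=lambda k: p[k])
        if best != y:
            issues.append(i)  # confidently predicted class disagrees with the label
    return issues


# Toy data: six examples, two classes.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],  # labeled 0, but the model strongly prefers class 1
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]

print(find_label_issues_sketch(labels, pred_probs))
```

On this toy data only the third example is flagged, since it is the one whose given label disagrees with a confident prediction. The real method goes further (it estimates the joint distribution of noisy and true labels and ranks the issues), so treat this as intuition only.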

Also, cleanlab.ai is an #OpenSource, no-code, automated solution!


Written on August 13, 2022