Sculpting Data For ML - The first act of Machine Learning

2020 was unprecedented. My heart goes out to all who had a difficult year 🤍 Losing the privilege to socialize as much and be outdoors gave Rishabh Misra and me an opportunity to do something that was always on our bucket list.


Excited to share that our book "Sculpting Data for ML: The first act of Machine Learning" launched worldwide on January 18, 2021 🚀 The book is a culmination of our experiences in Machine Learning and Data Science. It introduces readers to the first act of Machine Learning, Dataset Curation, which often does not get its due limelight in Academia or Industry.

The book puts forward practical tips for identifying valuable information in the extensive amount of crude data available at our fingertips. Its step-by-step guide is accompanied by Python code examples drawn from the extraction of real-world datasets created by the authors themselves. It also illustrates ways to hone the skill of extracting meaningful datasets. In addition, the book dives deep into how data fits into the Machine Learning ecosystem and highlights the impact good-quality data can have on a Machine Learning system's performance.

The book is endorsed by leading ML experts from both academia and industry, and it includes forewords.

What's Inside?

  • Significance of data in Machine Learning
  • Identification of relevant data signals
  • End-to-end process of data collection and dataset construction
  • Overview of extraction tools like BeautifulSoup and Selenium
  • Step-by-step guide with Python code examples of real-world use cases
  • Synopsis of Data Preprocessing and Feature Engineering techniques
  • Introduction to Machine Learning paradigms from a data perspective

This book is for Machine Learning researchers, practitioners, or enthusiasts who want to tackle the data availability challenges to address real-world problems. Folks interested can purchase either the Kindle or the Paperback version from Amazon → 📖

A lot of thoughtful effort went into the composition of the book, and we would appreciate your take on it. Please share your thoughts with us and with potential readers, especially on the platform you used to access the book. If you have any other questions about the content of the book, feel free to reach out to us at @dataforml on Twitter or Instagram.

Written on January 24, 2021