Introduction
Get an overview of the course and its prerequisites.
We'll cover the following
This course is meant to raise awareness of the fact that no machine learning (ML) pipeline is impervious to faults. There are many ways things can go wrong, and when models fail, the consequences can be severe.
In this course, we present a series of informative lectures about the dangers of data and models. We focus on the aspects of the ML pipeline that are (more or less) the direct responsibility of engineers and data scientists: the data and the model.
This course is a mix of theory and practice. We'll discuss the theoretical basis of how bias enters data and models and how mitigation techniques remove it, and we'll also cover several real-world case studies and address industry standards.
Intended audience
This course is targeted toward learners with at least an intermediate knowledge of AI/ML. There will be some mathematical formulations, but these won't be the main focus. If you understand the general ML workflow, you'll be able to learn a lot from this course.
Prerequisites
This course requires at least an intermediate knowledge of ML. If you need a refresher, Educative has several great courses on the topic.
It will also help to have some mathematical background, specifically linear algebra. Some Python knowledge will also be useful because there are a few practical examples scattered throughout the lessons.
Importance
This course covers a critical topic that's often neglected in other tutorials and learning resources. ML is not like a standard software product: there's no such thing as a "deploy and ignore" strategy. Because data shifts and models are retrained over time, there needs to be ongoing governance and accountability to catch models if and when things go wrong. In the past, there have been data breaches, biased algorithms, and other failures that could have been mitigated with proper techniques and care.
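As a taste of what "ongoing governance" can look like in practice, here's a minimal sketch of one common monitoring idea: flagging when a feature's distribution in live traffic drifts away from the training distribution. The function name, threshold, and simple mean-shift heuristic are illustrative assumptions, not a method prescribed by this course; real systems typically use proper statistical tests.

```python
import numpy as np

def mean_shift_drift(reference, current, threshold=0.1):
    """Illustrative drift check: flag drift when the standardized
    difference between the feature means exceeds the threshold.
    (A deliberately simple heuristic; production systems often use
    statistical tests such as Kolmogorov-Smirnov instead.)"""
    ref_mean = np.mean(reference)
    ref_std = np.std(reference)
    shift = abs(np.mean(current) - ref_mean) / (ref_std + 1e-12)
    return shift > threshold

# Hypothetical example: live data whose mean has shifted by 0.5 std devs.
rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=10_000)
live_feature = rng.normal(loc=0.5, scale=1.0, size=10_000)

print(mean_shift_drift(train_feature, train_feature))  # no drift
print(mean_shift_drift(train_feature, live_feature))   # drift flagged
```

Checks like this, run on a schedule against production data, are one small piece of the governance the course advocates.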