Introduction: scikit-learn and Model Evaluation

Get introduced to our topics for this chapter: scikit-learn, a popular Python library for machine learning, and model evaluation.

Overview

This chapter introduces the core functionality of scikit-learn for training models and making predictions through simple use cases of logistic and linear regression. It then demonstrates evaluation metrics for binary classification models, both from scratch and with scikit-learn's built-in functions: true and false positive rates, the confusion matrix, the receiver operating characteristic (ROC) curve, and the precision-recall curve. By the end of this chapter, you’ll be able to build and evaluate binary classification models using scikit-learn.

Getting started with scikit-learn

In this chapter, we will finish our exploration of the data by examining the response variable. After we’ve concluded that the data is of high quality and makes sense, we will be ready to move forward with developing machine learning models. We will take our first steps with scikit-learn, one of the most popular machine learning packages available in the Python language, and start to get comfortable with the syntax for using mathematical models.
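
As a rough preview of that syntax, the sketch below fits a logistic regression and makes predictions. The data here is synthetic and purely hypothetical (it is not the chapter's dataset), and the default model settings are used only for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical synthetic data: one feature and a noisy binary response.
rng = np.random.default_rng(seed=0)
X = rng.normal(size=(200, 1))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# Instantiate the model object, then train it with fit().
model = LogisticRegression()
model.fit(X, y)

# predict() returns class labels; predict_proba() returns one
# probability per class for each sample.
y_pred = model.predict(X[:5])
y_proba = model.predict_proba(X[:5])
print(y_pred)
print(y_proba)
```

The same instantiate, `fit`, `predict` pattern applies to other scikit-learn estimators, such as `LinearRegression`.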

Model evaluation techniques

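We will also learn some common techniques for answering the question, “Is this model good or not?” There are many ways to approach model evaluation; this chapter focuses on metrics for binary classification, such as the confusion matrix, the ROC curve, and the precision-recall curve.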
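
To give a flavor of the "from scratch versus scikit-learn" comparison mentioned in the overview, here is a minimal sketch that computes the true and false positive rates by hand and checks the counts against `confusion_matrix`. The label arrays are made up for illustration and do not come from the chapter's data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true labels and model predictions for ten samples.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 0, 0, 1, 0, 1, 0, 1, 0])

# From scratch: count each cell of the confusion matrix.
tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # true positives
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # false negatives
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false positives
tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # true negatives

tpr = tp / (tp + fn)  # true positive rate
fpr = fp / (fp + tn)  # false positive rate

# With scikit-learn: for binary labels, ravel() yields tn, fp, fn, tp.
tn_sk, fp_sk, fn_sk, tp_sk = confusion_matrix(y_true, y_pred).ravel()

print(tpr, fpr)
print(tn_sk, fp_sk, fn_sk, tp_sk)
```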
