Labeled Data
Explore how to work with labeled data in machine learning by separating datasets according to class labels using scikit-learn. Learn to implement functions that categorize principal component data by class for effective data visualization and classification tasks.
We'll cover the following...
Chapter Goals:
- Learn about labeled datasets
- Separate principal component data by class label
A. Class labels
A big part of data science is classifying observations in a dataset into separate categories, or classes. A common use case is separating a dataset into "good" and "bad" categories. For example, we can classify each tumor in a breast tumor dataset as either malignant or benign.
The code below separates a breast cancer dataset into malignant and benign categories. The load_breast_cancer function is part of the scikit-learn library, and its ...
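A minimal sketch of this kind of separation, assuming scikit-learn is installed. The variable names (`bc`, `malignant`, `benign`) are illustrative, and the integer label for each class is looked up from the dataset's `target_names` rather than hard-coded.

```python
from sklearn.datasets import load_breast_cancer

# Load the dataset: bc.data holds the observations, bc.target the class labels
bc = load_breast_cancer()

# Look up which integer label corresponds to each class name
names = list(bc.target_names)
malignant_label = names.index('malignant')
benign_label = names.index('benign')

# Boolean masks select the rows belonging to each class
malignant = bc.data[bc.target == malignant_label]
benign = bc.data[bc.target == benign_label]

print('Malignant shape: {}'.format(malignant.shape))
print('Benign shape: {}'.format(benign.shape))
```

Each resulting array keeps every feature column and contains only the rows for one class, so the two row counts add up to the size of the full dataset.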