
Challenge Solution Review

Understand how to apply a gradient boosting classifier to the breast cancer dataset, split the data into training and test sets, and refine model performance by tuning parameters with GridSearchCV, optimizing for the F1-score. Learn to identify the best parameters and the best score for an improved tree-based ensemble model.

Python 3.5
import sklearn.datasets as datasets
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import GradientBoostingClassifier

# Load the breast cancer dataset as a feature matrix X and a label vector y.
X, y = datasets.load_breast_cancer(return_X_y=True)

# Hold out 20% of the data as a test set.
train_x, test_x, train_y, test_y = train_test_split(X,
                                                    y,
                                                    test_size=0.2,
                                                    random_state=42)

gb = GradientBoostingClassifier(random_state=10)

# Candidate values for the hyperparameters we want to tune.
param_grid = [{
    "n_estimators": [1, 2, 4, 16, 32],
    "learning_rate": [0.05, 0.1, 0.2, 0.4],
    "min_samples_leaf": [1, 2, 4, 8],
}]

# Search the grid with cross-validation, scoring each combination by F1.
cv = GridSearchCV(gb, param_grid=param_grid, scoring="f1", n_jobs=4)
cv.fit(train_x, train_y)

print("The best F1-score is {}.".format(cv.best_score_))
print("The parameters of the best estimator are {}.".format(cv.best_params_))

First, we use load_breast_cancer to load the breast cancer dataset. We then split it into training and test sets with train_test_split, where the test set accounts for 20% of the data.
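The held-out 20% is not touched during the grid search. If you also want a score on it (this step is not part of the original solution), you can evaluate the refit best estimator on the test split. Below is a minimal sketch, assuming the variables from the code above are still in scope:

from sklearn.metrics import f1_score

# GridSearchCV refits the best parameter combination on the whole training
# split by default (refit=True), so cv.best_estimator_ can be used directly.
test_pred = cv.best_estimator_.predict(test_x)
print("F1-score on the held-out test set: {:.3f}".format(f1_score(test_y, test_pred)))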

A ...