Single target example
It’s possible to reformulate generalized linear regression to incorporate the kernel trick. For example, the loss function L(\bold w) for generalized linear regression with a single target is as follows:
L(\bold w)= \|\phi(X) \bold w-\bold y\|_2^2 + \lambda \|\bold w\|_2^2
Note:
\bold w^T\bold w = \|\bold w\|_2^2
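As a quick illustration, the sketch below evaluates this loss with NumPy. The feature map \phi(x) = [1, x, x^2], the toy data, and the value of \lambda are illustrative assumptions, not part of the original derivation.

```python
import numpy as np

def ridge_loss(Phi, y, w, lam):
    """L(w) = ||Phi w - y||^2 + lam * ||w||^2 for a single target."""
    residual = Phi @ w - y
    return residual @ residual + lam * (w @ w)

# Toy data and an illustrative feature map phi(x) = [1, x, x^2].
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=10)
y = np.sin(np.pi * x) + 0.1 * rng.normal(size=10)
Phi = np.stack([np.ones_like(x), x, x**2], axis=1)  # shape (n, 3)

w = rng.normal(size=3)
print(ridge_loss(Phi, y, w, lam=0.1))
```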
Setting the derivative of the loss with respect to \bold w to \bold 0 results in the following:
\begin{align*}
& \phi(X)^T(\phi(X)\bold w-\bold y)+\lambda \bold w = \bold 0 \\
& \bold w = -\frac{1}{\lambda}\phi(X)^T(\phi(X)\bold w-\bold y) \\
& \bold w = \phi(X)^T\bold a
\end{align*}
Here, \bold a=-\frac{1}{\lambda}(\phi(X)\bold w-\bold y).
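This identity can be checked numerically: solve the ridge problem for \bold w directly, form \bold a from its definition, and confirm that \phi(X)^T\bold a recovers \bold w. A minimal sketch, assuming the same toy data and feature map as above:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=10)
y = np.sin(np.pi * x) + 0.1 * rng.normal(size=10)
Phi = np.stack([np.ones_like(x), x, x**2], axis=1)  # phi(X), shape (n, d)
lam = 0.1

# Primal ridge solution: (Phi^T Phi + lam I) w = Phi^T y
d = Phi.shape[1]
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ y)

# a = -(1/lam) * (Phi w - y); then phi(X)^T a should equal w
a = -(Phi @ w - y) / lam
print(np.allclose(Phi.T @ a, w))  # True
```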
Reparameterization
We can now parametrize the loss function with the parameter vector \bold a by replacing \bold w with \phi(X)^T\bold a, as follows:
\begin{align*}
L(\bold a)&= \|\phi(X) \phi(X)^T\bold a-\bold y\|_2^2 + \lambda \|\phi(X)^T\bold a\|_2^2 \\
&= \|\phi(X) \phi(X)^T\bold a-\bold y\|_2^2 + \lambda \bold a^T \phi(X)\phi(X)^T\bold a \\
&= \|K\bold a-\bold y\|_2^2 + \lambda \bold a^T K\bold a
\end{align*}
Here, K = \phi(X)\phi(X)^T is the n \times n Gram matrix whose entries are K_{ij} = \phi(\bold x_i)^T\phi(\bold x_j) = k(\bold x_i, \bold x_j).
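To make the notation concrete, the sketch below forms the Gram matrix K = \phi(X)\phi(X)^T and checks that L(\bold a) matches the original loss when \bold w = \phi(X)^T\bold a. The feature map and data are the same illustrative choices used earlier:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=10)
y = np.sin(np.pi * x) + 0.1 * rng.normal(size=10)
Phi = np.stack([np.ones_like(x), x, x**2], axis=1)
lam = 0.1

K = Phi @ Phi.T              # Gram matrix, K_ij = phi(x_i)^T phi(x_j)

a = rng.normal(size=len(x))  # any dual parameter vector
w = Phi.T @ a                # corresponding primal weights

primal = np.sum((Phi @ w - y) ** 2) + lam * (w @ w)
dual = np.sum((K @ a - y) ** 2) + lam * (a @ K @ a)
print(np.allclose(primal, dual))  # True
```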
Closed-form solution
Setting the derivative of the loss L(\bold a) with respect to \bold a to \bold 0 results in the following:
K^T(K\bold a - \bold y)+\lambda K \bold a = \bold 0
As the Gram matrix K is symmetric, that is, K^T=K, the above equation can be written as follows:
\begin{align*}
& K(K\bold a - \bold y)+\lambda K \bold a = \bold 0 \\
& K(K\bold a - \bold y + \lambda \bold a) = \bold 0 \\
& (K + \lambda I)\bold a = \bold y \\
& \bold a = (K + \lambda I)^{-1} \bold y
\end{align*}
In the third step, the factor K is dropped: setting the term in parentheses to \bold 0 always satisfies the equation, and it is the unique solution whenever K is invertible. Moreover, K + \lambda I is invertible for any \lambda > 0 because K is positive semi-definite.
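The closed-form solution translates into a single linear solve. Below is a minimal sketch; the polynomial kernel k(x, z) = (1 + xz)^2, the toy data, and the regularization strength are illustrative assumptions:

```python
import numpy as np

def fit_dual(K, y, lam):
    """Dual coefficients a = (K + lam*I)^(-1) y, computed via a linear solve."""
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), y)

# Toy 1-D data with an illustrative polynomial kernel k(x, z) = (1 + x z)^2.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=10)
y = np.sin(np.pi * x) + 0.1 * rng.normal(size=10)
K = (1.0 + np.outer(x, x)) ** 2
a = fit_dual(K, y, lam=0.1)
```

Using a linear solve rather than forming the explicit inverse is numerically preferable; since K + \lambda I is symmetric positive definite for \lambda > 0, a Cholesky-based solver would also work.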
Prediction
Once \bold a is computed, the prediction \hat y_t on an input vector \bold x_t can be made as follows:
\begin{align*}
\hat y_t &= \bold w^T \phi(\bold x_t)\\
&= \bold a^T \phi(X) \phi(\bold x_t) \\
&= \begin{bmatrix}a_1 & a_2 & \dots & a_n\end{bmatrix} \begin{bmatrix}\phi(\bold x_1)^T\phi(\bold x_t) \\ \phi(\bold x_2)^T\phi(\bold x_t) \\ \vdots \\ \phi(\bold x_n)^T\phi(\bold x_t)\end{bmatrix} \\
&= \begin{bmatrix}a_1 & a_2 & \dots & a_n\end{bmatrix} \begin{bmatrix}k(\bold x_1,\bold x_t) \\ k(\bold x_2,\bold x_t) \\ \vdots \\ k(\bold x_n,\bold x_t)\end{bmatrix}
\end{align*}
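In code, the prediction is a dot product between \bold a and the vector of kernel evaluations against the training points. A minimal sketch, again using the illustrative polynomial kernel (the function names here are assumptions for this example):

```python
import numpy as np

def poly_kernel(x, z):
    """Illustrative kernel choice: k(x, z) = (1 + x z)^2."""
    return (1.0 + x * z) ** 2

def predict(a, X_train, x_t, kernel=poly_kernel):
    """y_hat_t = sum_i a_i * k(x_i, x_t)."""
    k_t = np.array([kernel(x_i, x_t) for x_i in X_train])
    return a @ k_t

# Usage with a and x from the previous snippet:
# y_hat = predict(a, x, 0.3)
```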
Implementation
We now implement generalized linear regression for a single target using the kernel trick.
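Below is a minimal end-to-end sketch. The RBF kernel, the toy data, and the names (`KernelRidge`, `fit`, `predict`, `gamma`) are illustrative choices for this example, not a reference implementation:

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    """k(x, z) = exp(-gamma * ||x - z||^2), evaluated for all pairs of rows."""
    sq_dists = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    return np.exp(-gamma * sq_dists)

class KernelRidge:
    """Generalized linear regression with the kernel trick, single target."""

    def __init__(self, lam=0.1, gamma=1.0):
        self.lam = lam
        self.gamma = gamma

    def fit(self, X, y):
        self.X_train = X
        K = rbf_kernel(X, X, self.gamma)
        # a = (K + lam I)^(-1) y, via a linear solve
        self.a = np.linalg.solve(K + self.lam * np.eye(len(X)), y)
        return self

    def predict(self, X_new):
        # y_hat = K(X_new, X_train) a
        return rbf_kernel(X_new, self.X_train, self.gamma) @ self.a

# Toy usage on noisy sine data.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=50)

model = KernelRidge(lam=0.1, gamma=0.5).fit(X, y)
X_test = np.linspace(-3, 3, 5).reshape(-1, 1)
print(model.predict(X_test))
```

Note that the training data must be kept around at prediction time, since the prediction is a weighted sum of kernel evaluations against every training point.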