Reinforcement learning solves problems where an agent needs to learn how to make the best decisions to maximize rewards through trial and error in an uncertain environment.
Key takeaways:
Supervised, unsupervised, and reinforcement learning represent the three major categories of machine learning (ML) techniques.
Supervised learning solves static problems with labeled datasets, unsupervised learning reveals structure in unlabeled data, and reinforcement learning tackles dynamic environments where actions impact future rewards.
Supervised learning maps input to output, unsupervised learning groups inputs based on similarity, and reinforcement learning focuses on finding the best actions to maximize cumulative rewards over time.
Supervised learning algorithms, such as decision trees and linear regression, excel in prediction tasks where historical data is available.
Unsupervised learning algorithms, such as k-means clustering and hierarchical clustering, excel in identifying hidden patterns and natural groupings within data.
Reinforcement learning algorithms, like Q-learning and Deep Q-Networks, excel in environments where decisions affect long-term rewards. They learn through trial and error while addressing the exploration vs. exploitation dilemma.
The choice between supervised, unsupervised, and reinforcement learning depends on the availability of labeled data, the type of problem, and the learning environment.
Machine learning (ML) encompasses various techniques, each suited to different types of problems. Supervised, unsupervised, and reinforcement learning are the three major categories. Supervised learning relies on labeled data to make predictions, unsupervised learning uncovers hidden patterns without labels, and reinforcement learning teaches agents to make decisions through trial and error. Understanding the differences between these methods is crucial to selecting the right approach for a specific task, and this Answer walks through each one in turn.
In supervised learning, the AI model is trained on inputs paired with their expected outputs, i.e., the labels of the inputs. The model learns a mapping from inputs to outputs and uses that mapping to predict the labels of new inputs.
Suppose we have to develop a model that differentiates between a cat and a dog. To train the model, we feed multiple images of cats and dogs into it with a label indicating whether the image is of a cat or a dog. The model tries to develop an equation between the input images and their labels. After training, the model can predict whether an image is of a cat or a dog, even if the image was previously unseen by the model.
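The cat-versus-dog idea can be sketched at toy scale with a nearest-neighbor classifier. The two numeric features (weight and ear length), the sample values, and the function name `predict_label` are made up for illustration; a real image classifier would learn from pixels rather than hand-picked measurements.

```python
import math

# Hypothetical labeled training data: (weight_kg, ear_length_cm) -> label.
training_data = [
    ((4.0, 6.5), "cat"),
    ((3.5, 6.0), "cat"),
    ((5.0, 7.0), "cat"),
    ((20.0, 11.0), "dog"),
    ((30.0, 12.5), "dog"),
    ((12.0, 10.0), "dog"),
]

def predict_label(features, data=training_data):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(data, key=lambda pair: math.dist(pair[0], features))
    return nearest[1]

# Inputs the model has never seen before.
print(predict_label((4.2, 6.3)))    # cat
print(predict_label((25.0, 12.0)))  # dog
```

The labels in `training_data` are what make this supervised: the prediction for a new input is read off from examples whose answers are already known.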
Supervised machine learning primarily addresses regression and classification problems. Examples of supervised machine learning applications include detecting whether a piece of news is real or fake and predicting whether a tumor is malignant or benign.
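Regression, one of the two problem types named above, can be illustrated with a single-feature least-squares fit. The data (years of experience against salary in thousands) and the helper name `fit_line` are hypothetical and chosen only so the arithmetic is easy to follow.

```python
# Hypothetical labeled data: years of experience -> salary (in thousands).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [35.0, 40.0, 45.0, 50.0, 55.0]

def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(xs, ys)
prediction = slope * 6.0 + intercept  # predict the label of an unseen input
print(slope, intercept, prediction)
```

Here the learned mapping is literally an equation, a line with a slope and an intercept, which is then applied to an input that was not in the training data.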
Try out the “Fake News Detection Using scikit-learn” project to get hands-on practice with supervised learning.
Let’s see the code example for classifying the images using Keras. The following code trains the model with limited images (taken from the “Intel Image Classification” dataset) of seas and buildings. We’ll provide it with an unseen image from one of the two categories to see how well it predicts that image.
import base64

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

imageSize = (250, 250)
batchSize = 20

trainDirectory = 'archive/seg_train/seg_train'
testDirectory = 'archive/seg_test/seg_test'

# Augment and rescale the training images
generateTrainingData = ImageDataGenerator(
    rescale=1./255,
    rotation_range=25,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'
)

trainDataset = generateTrainingData.flow_from_directory(
    trainDirectory,
    seed=594,
    target_size=imageSize,
    batch_size=batchSize,
    class_mode='sparse'
)

# Validation images are only rescaled, so they share the [0, 1] range of the training images
generateValidationData = ImageDataGenerator(rescale=1./255)

validationDataset = generateValidationData.flow_from_directory(
    testDirectory,
    target_size=imageSize,
    batch_size=batchSize,
    class_mode='sparse',
    shuffle=False
)

classNames = list(trainDataset.class_indices.keys())
classCount = len(classNames)

# A small convolutional network that outputs one logit per class
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(20, 3, activation='relu', input_shape=(imageSize[0], imageSize[1], 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(40, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(80, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(80, activation='relu'),
    tf.keras.layers.Dense(classCount)
])

model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)

history = model.fit(
    trainDataset,
    validation_data=validationDataset,
    epochs=15
)

# Load an unseen image and preprocess it the same way as the training data
img = image.load_img('19763.jpg', target_size=imageSize)
imgArray = image.img_to_array(img)
imgArray = np.expand_dims(imgArray, axis=0)
imgArray = imgArray / 255.0

predictions = model.predict(imgArray)
predictedClassIndex = np.argmax(predictions)
predictedClass = classNames[predictedClassIndex]

plt.imshow(imgArray[0])
plt.title(predictedClass)
plt.savefig('output.png')

# Embed the saved plot in a small HTML page
with open('output.png', 'rb') as imageFile:
    encodedImage = base64.b64encode(imageFile.read()).decode('utf-8')

html = f'''
<html>
<body>
<h1>Predicted Class: {predictedClass}</h1>
<img src="data:image/png;base64,{encodedImage}" alt="Output">
</body>
</html>
'''

with open('output.html', 'w') as file:
    file.write(html)
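Because the final `Dense` layer has no activation and the loss is configured with `from_logits=True`, `model.predict` returns raw scores (logits) rather than probabilities. If a confidence value is wanted alongside the predicted class, a softmax can be applied to those scores. The sketch below uses made-up logits and plain Python; the function name `softmax` and the sample values are illustrative and not part of the example above.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]

# Hypothetical logits for a two-class model.
logits = [2.0, 0.5]
probabilities = softmax(logits)
predicted_index = probabilities.index(max(probabilities))
print(probabilities, predicted_index)
```

The index of the largest probability is the same as the index of the largest logit, so `np.argmax` on the raw predictions already picks the right class; the softmax only adds an interpretable confidence.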
In unsupervised learning, the AI model is trained only on inputs, without their labels. The model groups the input data into clusters whose members share similar features. A new input can then be assigned to the cluster whose features it most closely resembles.
Suppose we have a collection of red and blue balls that we want to separate into two groups, and that the balls are identical except for their color. The model looks for the feature that distinguishes the balls, which here is color, and groups them accordingly. The result is two clusters: one of blue balls and one of red balls.
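The ball example can be sketched with a tiny one-dimensional k-means (k = 2). Each ball is described here by a single hypothetical feature, its hue in degrees, and no labels are supplied. The values, the function name `two_means`, and the choice to start the centroids at the smallest and largest hue are all assumptions made for illustration.

```python
# Hypothetical feature: the hue of each ball in degrees (red near 0, blue near 240).
hues = [5.0, 12.0, 238.0, 0.0, 245.0, 231.0, 8.0, 240.0]

def two_means(values, iterations=10):
    """Minimal 1-D k-means with k=2; centroids start at the min and max values."""
    centroids = [min(values), max(values)]
    assignments = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        assignments = [
            0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            for v in values
        ]
        # Update step: move each centroid to the mean of its cluster.
        for k in (0, 1):
            members = [v for v, a in zip(values, assignments) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return assignments, centroids

assignments, centroids = two_means(hues)
print(assignments)  # [0, 0, 1, 0, 1, 1, 0, 1]
print(centroids)    # [6.25, 238.5]
```

The algorithm never learns the words "red" or "blue"; it only discovers that the hues fall into two groups, which is the sense in which clustering works without labels.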
Unsupervised machine learning is ideal for clustering and association rule mining tasks, allowing for the identification of hidden patterns in data without relying on predefined labels.
Notable applications of unsupervised machine learning include customer segmentation, which helps businesses tailor their marketing efforts, and recommendation systems, like those used by Netflix or Amazon, which suggest content to users based on their preferences.
Want to explore the concepts of unsupervised learning with a real-world application? Try out the “Customer Segmentation with K-Means Clustering” project.
Let’s see the code example of clustering in Python using the DBSCAN algorithm.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.cluster import DBSCAN

# Create a random dataset with 1000 samples and 2 features
X, _ = make_classification(
    n_samples=1000,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    n_clusters_per_class=1,
    random_state=4
)
df = pd.DataFrame(X)
print(df.shape)

# Define the model
dbscan_model = DBSCAN(eps=0.35, min_samples=16)

# Train the model
dbscan_model.fit(df)

# Visualize the clusters
plt.figure(figsize=(10, 10))
plt.scatter(df[0], df[1], c=dbscan_model.labels_, s=15)
plt.title('DBSCAN Clustering', fontsize=20)
plt.xlabel('Feature 1', fontsize=14)
plt.ylabel('Feature 2', fontsize=14)
plt.show()
In reinforcement learning, the AI model tries to take the best possible action in a given situation to maximize its total reward. The model learns from feedback on the outcomes of its past actions.
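The total that the agent tries to maximize is commonly formalized as a return: the sum of the rewards it collects, often with later rewards weighted less through a discount factor. The sketch below assumes a discount factor named `gamma` and a made-up reward sequence; neither comes from this Answer.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where the reward at step t is weighted by gamma ** t."""
    total = 0.0
    # Work backwards so each step needs one multiply and one add.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total

# Hypothetical episode: no reward for two steps, then a reward of 10.
episode = [0.0, 0.0, 10.0]
print(discounted_return(episode))             # discounted: the late reward counts for less
print(discounted_return(episode, gamma=1.0))  # undiscounted: a plain sum
```

With `gamma` below 1, a reward that arrives sooner is worth more than the same reward arriving later, which nudges the agent toward shorter routes to the same payoff.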
Consider the example of a robot that is asked to choose between two paths, A and B. In the beginning, the robot picks either path because it has no past experience. It then receives feedback on the path it chose and learns from that feedback. The next time the robot is in a similar situation, it can use the feedback to decide. For example, if the robot chooses path B and gets a reward, i.e., positive feedback, then the next time it knows to choose path B to maximize its reward.
Reinforcement learning suits sequential decision-making problems in which the agent must balance exploration (trying new actions) against exploitation (repeating actions that are known to pay off). In this method, an agent learns to make decisions through trial and error to maximize rewards.
Reinforcement learning has widespread applications, from robotics to autonomous driving to healthcare. Examples include self-driving cars and two-legged robots that learn to walk without falling.
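The trade-off between exploration and exploitation can be sketched with the robot's two paths, A and B, using an epsilon-greedy rule: most of the time the agent picks the path it currently rates highest, and occasionally it picks at random. The reward probabilities, the value of `epsilon`, the fixed random seed, and the function name `run_bandit` are assumptions chosen for illustration.

```python
import random

def run_bandit(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy choice between two paths with unknown reward chances."""
    rng = random.Random(seed)
    reward_chance = {"A": 0.2, "B": 0.8}  # hidden from the agent (made up)
    estimates = {"A": 0.0, "B": 0.0}      # the agent's running value estimates
    counts = {"A": 0, "B": 0}
    for _ in range(steps):
        if rng.random() < epsilon:
            path = rng.choice(["A", "B"])             # explore
        else:
            path = max(estimates, key=estimates.get)  # exploit
        reward = 1.0 if rng.random() < reward_chance[path] else 0.0
        counts[path] += 1
        # Incremental average of the rewards seen on this path.
        estimates[path] += (reward - estimates[path]) / counts[path]
    return estimates, counts

estimates, counts = run_bandit()
print(estimates, counts)
```

Without the exploration branch, the agent could settle on whichever path happened to pay off first and never find out that the other one is better.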
Get hands-on with the concepts of reinforcement learning with the “Train an Agent to Self-Drive a Taxi Using Reinforcement Learning” project.
Here is a sample simulation to see how reinforcement learning gets feedback and learns policy.
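As a text-only stand-in for such a simulation, the following is a minimal tabular Q-learning sketch. The corridor environment (five states in a row, with a reward of 1 for reaching the rightmost state), the hyperparameters `alpha`, `gamma`, and `epsilon`, the fixed seed, and the function name `train_q_table` are all assumptions chosen for illustration and are not taken from this Answer.

```python
import random

def train_q_table(n_states=5, episodes=200, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning on a corridor; the rightmost state is the goal."""
    rng = random.Random(seed)
    actions = (-1, +1)  # index 0 moves left, index 1 moves right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore with probability epsilon, or when the two values tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = min(max(state + actions[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # Q-learning update: move toward reward + discounted best next value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train_q_table()
# Read the learned policy off the table for every non-goal state.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

The feedback is the reward received after each move, and the policy is whatever action has the highest learned value in each state. Training and acting happen in the same loop, which matches the "trained and tested simultaneously" entry in the comparison table.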
Criteria | Supervised Learning | Unsupervised Learning | Reinforcement Learning |
Input Data | Input data is labelled. | Input data is not labelled. | Input data is not predefined. |
Problem | Learn the pattern between inputs and their labels. | Divide data into groups. | Find the actions that yield the best reward between a start and an end state. |
Solution | Finds a mapping from input data to its labels. | Finds similar features in input data to group it into clusters. | Maximizes reward by assessing the results of state-action pairs. |
Model Building | Model is built and trained prior to testing. | Model is built and trained prior to testing. | The model is trained and tested simultaneously. |
Applications | Deals with regression and classification problems. | Deals with clustering and association rule mining problems. | Deals with sequential decision problems that involve an exploration-exploitation trade-off. |
Algorithms Used | Decision trees, linear regression, k-nearest neighbors | K-means clustering, k-medoids clustering, agglomerative clustering | Q-learning, SARSA, Deep Q-Network |
Examples | Image detection, population growth prediction | Customer segmentation, feature elicitation, targeted marketing, etc. | Driverless cars, self-navigating vacuum cleaners, etc. |
Supervised, unsupervised, and reinforcement learning all have important roles in machine learning. Supervised learning works best when you have labeled data, helping to make accurate predictions. Unsupervised learning helps find hidden patterns in data that aren’t labeled. Reinforcement learning is great for situations where an agent learns the best actions through trial and error in changing environments. In this Answer, we understood the strengths of each type and learned how to tackle real-life problems, from sorting and grouping data to making smart decisions.