Problem-Solving Through Machine Learning

Some problems cannot be solved by simple rule-based programming. These are pattern identification problems, as we saw in the previous lessons, and identifying galaxy types is one of them. We also saw that the human brain follows a learning process to solve such problems. Machine learning rests on the idea that computer programs need to demonstrate some brain-like intelligence to solve pattern identification problems successfully.

Let’s review the pipeline of how humans learn from data and apply that knowledge to identify unknown patterns around them.

Human learning pipeline

For machines to solve the same problem, they have to learn from data and identify patterns in it.

Data

In our galaxy identification problem, we looked at a series of images labeled with their types. So, for a machine to understand the concept of spiral galaxies, it first needs to “see” some data and gather some experience. For our example, as we have already seen, this dataset contains labeled images of spiral and non-spiral galaxies.

Let’s closely observe the set of images shown above. Some images are clearly spiral, and some are clearly not. Some, however, do not look as spiral as one would expect and yet are labeled as spiral. This variation in the data matters: it lets the machine learn the underlying features that distinguish one type of object from another, even when instances of the same type look different. In our galaxy example, we would want the computer program (machine) to see images of spiral galaxies taken from different angles and having slightly different characteristics.
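To make the idea of labeled data concrete, here is a minimal sketch of what such a dataset could look like in Python. The `LabeledImage` class, the tiny 2x2 pixel grids, and the label strings are all hypothetical stand-ins chosen for illustration; they are not the format used by this course's dataset.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class LabeledImage:
    """One training example: pixel values plus the type a human assigned."""
    pixels: List[List[float]]  # tiny grayscale grid standing in for a real image
    label: str                 # "spiral" or "non-spiral"


# Hypothetical toy dataset. Note the second example: a less clear-cut image
# that is still labeled as spiral, which gives the data some variation.
dataset = [
    LabeledImage([[0.9, 0.1], [0.1, 0.9]], "spiral"),
    LabeledImage([[0.8, 0.2], [0.3, 0.7]], "spiral"),
    LabeledImage([[0.5, 0.5], [0.5, 0.5]], "non-spiral"),
    LabeledImage([[0.4, 0.6], [0.6, 0.4]], "non-spiral"),
]

# Count how many examples we have of each type.
label_counts = Counter(example.label for example in dataset)
print(label_counts)
```

Checking the label counts like this is a quick way to confirm that both types are represented before any learning starts.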

Model

If we refer to the pipeline shown above, we see that once humans have seen enough examples of galaxy images, they learn a representation made up of significant features and store it in their brains. The brain is where the learning takes place. It is a massive network of tiny cells called neurons. Each neuron receives input, either from the sensory organs or from other neurons, and “fires” an output if that input is above a certain threshold. This output becomes the input for the next neuron in the network, and so on.

Neurons communicate with each other through connections whenever they receive input signals. Learning takes place in the brain by adjusting the strength (weight) of specific neural pathways in response to specific inputs.
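The threshold behavior described above can be sketched in a few lines of Python. This is a highly simplified artificial neuron, not a model of a real biological one, and the function name, signal values, weights, and threshold are all assumptions made for illustration.

```python
def neuron_fires(inputs, weights, threshold):
    """Return True if the weighted sum of the inputs exceeds the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return total > threshold


# Three incoming signals and the strength (weight) of each connection.
signals = [1.0, 0.5, 0.0]
weights = [0.6, 0.4, 0.9]

# Weighted sum is 1.0*0.6 + 0.5*0.4 + 0.0*0.9, roughly 0.8.
fires = neuron_fires(signals, weights, threshold=0.7)
print(fires)
```

Changing the weights changes which inputs make the neuron fire, which is the sense in which adjusting connection strengths amounts to learning.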

If we want to introduce intelligence in machines, it seems intuitive to have a component in computer programs that can learn important, distinguishing features from data. This component acts as the “brain” of the program, and we call it the model. The model stores the representation learned from data, which is later used to identify patterns in similar-looking images.

Training the model

In our previous example, you were asked to look at some images of galaxies, each labeled with its type. After looking at these images and their labels multiple times, you built a representation of a spiral galaxy in your mind.

This process of machines/programs repeatedly looking at the data to determine what features are important in differentiating one type of object from another is called training a model.

Training involves showing the model different instances of the same object, in our case, spiral galaxies, so that the model learns a representation from this data. During this process, the model learns from its mistakes.
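One classic way to make "learning from mistakes" concrete is the perceptron update rule, sketched below. This is an illustrative technique swapped in for the example, not necessarily what this course's training code does. The two numeric features per image and their values are hypothetical, with label 1 meaning spiral and 0 meaning non-spiral.

```python
def predict(weights, bias, features):
    """Predict 1 (spiral) if the weighted sum is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0


def train(examples, epochs=10, learning_rate=0.1):
    """Repeatedly show the examples and nudge the weights after each mistake."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            # error is 0 when the prediction is right, +1 or -1 when it is wrong
            error = label - predict(weights, bias, features)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical features for four galaxy images, with their labels.
examples = [
    ([0.9, 0.8], 1),
    ([0.8, 0.9], 1),
    ([0.2, 0.1], 0),
    ([0.1, 0.3], 0),
]

weights, bias = train(examples)
print([predict(weights, bias, features) for features, _ in examples])
```

Because the weights change only when `error` is nonzero, the model is adjusted solely by the examples it currently gets wrong.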

Prediction

All this sounds good so far. However, how do we know that our trained model is any good at identifying new images of galaxies that it has never seen before?

The whole point of learning is to be able to predict.

Just as you were asked to label new galaxy images in the previous lesson, the trained model also needs to be tested on new galaxy images. How well the model has learned is judged by how accurately it identifies new, unseen images of galaxies. The larger and more diverse the set of images the model sees during training, the better it tends to get at identifying new ones.

This set of new images is called the test set. If the model correctly predicts the galaxy type for every image in the test set, we say that it is 100% accurate. However, just as humans make mistakes when identifying objects they have seen before, machines can also make mistakes when predicting the types of new images.
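A common way to obtain a test set is to hold back part of the labeled data before training, so the model never sees it while learning. The sketch below shows one such split. The function name `split_train_test`, the 20% test fraction, and the image identifiers are assumptions for illustration.

```python
import random


def split_train_test(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and hold back a fraction of them as the test set."""
    shuffled = list(examples)              # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    test_size = max(1, int(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical identifiers standing in for ten labeled galaxy images.
image_ids = [f"galaxy_{i:02d}" for i in range(10)]

train_set, test_set = split_train_test(image_ids)
print(len(train_set), len(test_set))
```

Shuffling before splitting helps both sets contain a mix of image types rather than whatever order the data happened to arrive in.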

Evaluation

Finally, we test our program for accuracy. Let’s see how we can represent it in numbers. One way to measure the performance of our program is the number of correctly predicted images divided by the total number of images in the test set. For example, if our program correctly identifies the galaxy type in 7 out of 10 new images, we say that it is 70% accurate.
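The accuracy calculation above can be written as a short function. In this sketch, the `accuracy` helper and the two label lists are hypothetical. The predictions are arranged so that exactly 7 of the 10 match the true labels.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# True labels for ten test images: five spiral, then five non-spiral.
actual = ["spiral"] * 5 + ["non-spiral"] * 5

# Model output: the fifth, ninth, and tenth predictions are wrong.
predicted = [
    "spiral", "spiral", "spiral", "spiral", "non-spiral",
    "non-spiral", "non-spiral", "non-spiral", "spiral", "spiral",
]

score = accuracy(predicted, actual)
print(f"{score:.0%}")  # prints 70%
```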

An acceptably good prediction accuracy means that generalized learning has taken place and the model has not just rote-learned the data that it saw in the training phase.

Machine learning-based problem-solving

What we have just seen are the steps of a new paradigm for solving complex pattern identification problems with machines. We call it the ML way of problem-solving! Its five steps are data, model, training, prediction, and evaluation.

Get used to these five steps of the machine learning process. As an ML engineer, you will think in these terms for almost every ML-related problem.

Classifying the galaxy images via machine learning

Let’s now look at a machine learning-based program that identifies galaxy types from images. The images that our simple rule-based program was unable to identify previously are correctly identified by the ML program. Run the following program, which is based on the five steps of the ML process:

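# Data: load the galaxy images and their labels (tags)
img_data, tags = prepare_data()
# Model: create the untrained model
model = get_model()
# Training: let the model learn from the labeled images
trained_model = train_model(model, img_data, tags)
# Prediction: classify galaxy images with the trained model
get_predictions(trained_model)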

We can see that the ML-based program can now correctly classify both the galaxy image without the star and the dimmed galaxy image as spiral galaxies. This was not the case with our simple comparison-based code in the previous lesson.
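The helper functions in the program above are provided by the lesson environment, and their internals are not shown here. As a self-contained illustration of the same five steps, including the evaluation step, here is a toy sketch that uses a nearest-centroid classifier. Everything in it is an assumption for illustration: the function names, the two hypothetical numeric features per image, and the choice of classifier are not the lesson's actual implementation.

```python
# 1. Data: hypothetical two-number feature vectors with their labels.
train_data = [
    ([0.9, 0.8], "spiral"),
    ([0.8, 0.9], "spiral"),
    ([0.2, 0.1], "non-spiral"),
    ([0.1, 0.3], "non-spiral"),
]
test_data = [
    ([0.7, 0.9], "spiral"),
    ([0.3, 0.2], "non-spiral"),
]


# 2. Model: a mapping from label to average feature vector, empty until trained.
def make_model():
    return {}


# 3. Training: store the average (centroid) of the examples for each label.
def train(model, examples):
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    for label, rows in grouped.items():
        model[label] = [sum(column) / len(rows) for column in zip(*rows)]
    return model


# 4. Prediction: pick the label whose centroid is closest to the new example.
def predict(model, features):
    def squared_distance(centroid):
        return sum((c - x) ** 2 for c, x in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))


# 5. Evaluation: fraction of test examples predicted correctly.
model = train(make_model(), train_data)
predictions = [predict(model, features) for features, _ in test_data]
correct = sum(p == label for p, (_, label) in zip(predictions, test_data))
test_accuracy = correct / len(test_data)
print(predictions, test_accuracy)
```

The test examples are kept out of `train`, so the accuracy reflects how the model handles data it has not seen.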

Do not worry about what is happening inside each step for now. We just wanted to see whether ML works. The next chapter will take us through each of these five steps in detail. Move on to the next lesson to internalize this machine learning way of thinking about problem-solving.