Inputs, Features, and Targets
Explore the inputs, features, and targets for various state-of-the-art ML applications.
Feature magic in machine learning
In machine learning, inputs and features are closely related because the input (commonly called the dataset) can be transformed to get more insightful features. For example, in a facial recognition application, the input is the image of the person we want to identify. Features like color, the distance between the eyes, the nose, and height can be extracted by applying different transformations to the input image. Finally, the target is the name of the person. To create this application, our model needs to learn the identities of people. For that, we provide the model with labeled examples, from which it learns a mapping function between the inputs and the identities.
Extracting insightful features from the input helps the model learn faster and more accurately. Our model might use the picture’s average brightness or the standard deviation of its pixel intensities as a feature, or it might divide the image into small patches, such as four quadrants. These patches provide local information that comes in handy when the model struggles to understand the whole picture.
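The features mentioned above can be sketched in a few lines. This is a minimal illustration, assuming a grayscale image stored as a 2-D NumPy array; the function name `simple_image_features` and the synthetic test image are hypothetical and not part of any particular library or of this lesson's application.

```python
import numpy as np


def simple_image_features(image: np.ndarray) -> dict:
    """Compute global and per-quadrant features for a 2-D grayscale image."""
    height, width = image.shape
    mid_h, mid_w = height // 2, width // 2

    # Split the image into four quadrants to capture local information.
    quadrants = {
        "top_left": image[:mid_h, :mid_w],
        "top_right": image[:mid_h, mid_w:],
        "bottom_left": image[mid_h:, :mid_w],
        "bottom_right": image[mid_h:, mid_w:],
    }

    # Global features: mean brightness and standard deviation of intensities.
    features = {
        "brightness": float(image.mean()),
        "std": float(image.std()),
    }

    # Local features: mean brightness of each quadrant.
    for name, patch in quadrants.items():
        features[f"{name}_brightness"] = float(patch.mean())
    return features


# Synthetic 4x4 image (an assumption for illustration):
# the left half is dark (0) and the right half is bright (200).
image = np.zeros((4, 4), dtype=np.float64)
image[:, 2:] = 200.0

features = simple_image_features(image)
for name, value in features.items():
    print(f"{name}: {value}")
```

The global brightness alone can't tell which side of this image is bright, but the quadrant features can. That is the kind of local information patches add.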
How can we select the features that influence a model’s results most effectively?
Feature selection process
In classical machine learning, feature selection depended heavily on hand-engineered features, which took a lot of time to design. For instance, using color as an input feature might give useful results when identifying the model of a particular car, assuming the company used different colors for different models. However, this feature won’t be handy when identifying shirts or trousers, as they’re available in almost all colors. Therefore, we can’t let our model call an object a shirt just because it’s red.
In modern machine learning, the model is designed to identify the best features without human intervention. The latest machine learning algorithms let us create models without knowing much about the features in advance, because the features are extracted from the inputs automatically.
Custom feature selection application
To create a perfect person identifier, we need a feature set that remains consistent across all pictures of the same person but differs clearly for a different person.
Machine learning algorithms analyze and process data to identify custom features, which are used to train models for classification, prediction, and other tasks. This improves the accuracy and reliability of models, helps distinguish between individuals, objects, and events, and improves understanding of complex systems and processes.
Therefore, the following application is designed in a way that allows the user to select features by clicking anywhere on two different persons’ faces. As a result, the application will output the difference between the same features but on two different faces.
1. Press the **Run** button and wait until the connection is established.
2. Select the same set of features for both faces in order to observe the difference between them.
3. Select any feature point by clicking on the image of a person.
4. Hit **Enter** after selecting the features to move on to the next person.
5. Open the **Terminal** tab to observe the difference.
6. Run the `python3 /usercode/main.py` command to execute it again.
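To make the idea of comparing the same features on two faces concrete, here is a minimal sketch. It is not the application's actual code. It assumes each click is an (x, y) pixel coordinate, that the features are the distances between every pair of clicked points, and that the reported difference is the mean absolute gap between those distances. The function names and the sample coordinates are hypothetical.

```python
from itertools import combinations

import numpy as np


def pairwise_distances(points: np.ndarray) -> np.ndarray:
    """Euclidean distance between every pair of (x, y) points, in a fixed order."""
    pairs = combinations(range(len(points)), 2)
    return np.array([np.linalg.norm(points[i] - points[j]) for i, j in pairs])


def feature_difference(face_a, face_b) -> float:
    """Mean absolute difference between the pairwise distances of two faces."""
    a = np.asarray(face_a, dtype=float)
    b = np.asarray(face_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("Select the same set of feature points on both faces.")
    if len(a) < 2:
        raise ValueError("At least two feature points are needed per face.")
    return float(np.mean(np.abs(pairwise_distances(a) - pairwise_distances(b))))


# Hypothetical clicks: left eye, right eye, and nose tip on each face.
person_1 = [(30, 40), (70, 40), (50, 70)]
person_2 = [(30, 40), (80, 40), (55, 70)]

print("Same person:      ", feature_difference(person_1, person_1))
print("Different persons:", feature_difference(person_1, person_2))
```

Comparing a face with itself gives a difference of zero, while the second face, with its eyes set farther apart, gives a positive difference.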
After playing with the application given above, it becomes clear that feature selection is itself a time-consuming process.
Note: Selecting a larger number of features will generally result in a lower error.
Deepfake
Deepfake is a machine learning technique that’s commonly used to replace a person’s face with someone else’s face in a given image or video. In the context of deepfakes, the targets are the individual pixels that make up the original image. It’s worth noting that an original image usually consists of thousands or even millions of these pixels. The input is an image of a person’s face that will replace the original face in the image or video.
In the case of creating a deepfake GIF of a statue, a single image of the statue is sufficient to make it appear to move. However, multiple images of a person’s face can improve the output, since more data generally helps in machine learning.
Activity recognition
Activity recognition is a machine learning technique used to recognize a certain activity from a given sequence of input image frames. For an activity recognition algorithm, the input is conventionally a sequence of frames rather than a single image. The output, on the other hand, is a text string describing the activity in plain language.
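The shape of this input and output can be illustrated with a toy sketch. It is not a real activity recognizer: a trained model is replaced here by a simple frame-difference threshold, purely to show a stack of frames going in and a descriptive string coming out. The function name `describe_activity`, the threshold value, and the synthetic clips are all assumptions made for illustration.

```python
import numpy as np


def describe_activity(frames: np.ndarray, threshold: float = 1.0) -> str:
    """Map a (num_frames, height, width) grayscale clip to a text description.

    A toy stand-in for a learned model: it measures the average change
    between consecutive frames and thresholds it.
    """
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("Expected at least two frames of shape (height, width).")
    # Average absolute change between each pair of consecutive frames.
    motion = np.abs(np.diff(frames.astype(float), axis=0)).mean()
    return "the scene is moving" if motion > threshold else "the scene is still"


# Synthetic clip 1: five identical frames, so nothing changes over time.
still_clip = np.full((5, 8, 8), 100.0)

# Synthetic clip 2: a bright vertical stripe that shifts one column per frame.
base = np.zeros((8, 8))
base[:, 0] = 255.0
moving_clip = np.stack([np.roll(base, shift=t, axis=1) for t in range(5)])

print(describe_activity(still_clip))
print(describe_activity(moving_clip))
```

A single frame from either clip would not be enough to tell them apart in terms of activity. The distinction only appears when several frames are compared, which is why the input is a sequence.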
Text-to-image generation
As the title indicates, certain open source models available today take a caption, in the form of a string, as input and create a synthetic image that depicts the captioned scenario. In other words, they create an image that corresponds to or illustrates the content of the given caption.