
A Beginner’s Guide to TensorFlow: Building Machine Learning Models

Najeeb Ul Hassan
Jan 15, 2024
7 min read

The “divide and conquer” rule can make solving big and complicated tasks a breeze. Imagine we have a big puzzle and access to good friends who are willing to help. Each friend solves a small part of the puzzle, and the pieces are combined once finished, so the problem is solved in smaller steps to obtain the final result. In machine learning, training a large model is resource-intensive and time-consuming. TensorFlow breaks a complex machine learning task into smaller operations, each of which can be processed in parallel, making the overall process far more efficient.


TensorFlow is an open-source framework for implementing machine learning models. The advent of deep learning has made TensorFlow increasingly popular in domains such as natural language processing and image and audio recognition. If you’re new to TensorFlow and want to train and evaluate machine learning (ML) and deep neural network (DNN) models, you’re in the right place. This blog provides a step-by-step guide to building ML and DNN models with this powerful framework.

Getting started with TensorFlow#

A tensor is a multi-dimensional data structure representing the input, output, and intermediate data in TensorFlow computations. In TensorFlow, a graph represents a computation as nodes and edges: nodes represent operations, and edges represent the data flowing between those operations.
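
As a quick illustration, here is a minimal sketch (assuming TensorFlow is installed) that creates a rank-2 tensor and inspects its shape and data type:

import tensorflow as tf

# A rank-2 tensor: a 2 x 3 matrix of constant values
t = tf.constant([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

print(t.shape)  # (2, 3): two rows, three columns
print(t.dtype)  # float32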

Building an ML model#

Here, our machine learning model is a neural network: interconnected nodes organized into layers, where each node takes input from nodes in the previous layer. Let’s take the example of a simple graph with an “add” operation, as shown below:

A simple graph example

In our simple example that depicts the “add” operation, layer one consists of two constants, a and b, and layer two consists of a single operation node c. The edges represent a flow of data between nodes. The first layer’s output becomes the second layer’s input, and this process continues until the final layer, which produces the network’s output.

Let’s see how we can create this simple graph in TensorFlow. The examples in this blog use TensorFlow’s graph-and-session (1.x-style) API, which is still available in TensorFlow 2.x through the tf.compat.v1 module:

import tensorflow.compat.v1 as tf  # TensorFlow 1.x-style API, available inside TensorFlow 2.x
tf.disable_eager_execution()       # Use graph/session execution instead of eager execution

# Create a graph
graph = tf.Graph()

# Add operations to the graph
with graph.as_default():
    # Create nodes/operations
    a = tf.constant(5, name='a')
    b = tf.constant(10, name='b')
    c = tf.add(a, b, name='c')

# Run the graph in a session
with tf.Session(graph=graph) as sess:
    # Run the session to execute the operations
    result = sess.run(c)
    print("Result:", result)
  • Lines 1–2: Imports TensorFlow’s 1.x-compatible API and disables eager execution so that we can build a graph and run it in a session explicitly.
  • Line 5: Creates a graph using the tf.Graph() function.
  • Lines 10–11: Defines two nodes, a and b, using the tf.constant() function to hold the constant values 5 and 10, respectively.
  • Line 12: Creates the c node using the tf.add() function to add the values of a and b.
  • Line 15: Creates a session using tf.Session(graph=graph).
  • Lines 17–18: Runs the session to execute the operations within the graph and prints the result.
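
Running this script prints Result: 15, since the session executes the add operation on the two constant nodes.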

Creating a simple neural network#

Let’s build a simple neural network to implement an XOR logic gate with the following structure:

  • An input layer with two nodes representing the two inputs of the XOR gate.
  • A hidden layer with two nodes. This layer is called hidden because it is not directly exposed as the network’s input or output.
  • An output layer with one node representing the network output.

The input-output relation of an XOR logic gate is depicted in the table below:

XOR Gate: Input-Output Relation

| Input 1 | Input 2 | Output |
|---------|---------|--------|
| 0       | 0       | 0      |
| 0       | 1       | 1      |
| 1       | 0       | 1      |
| 1       | 1       | 0      |

Let’s build our neural network in TensorFlow:

import tensorflow.compat.v1 as tf  # TensorFlow 1.x-style API, available inside TensorFlow 2.x
tf.disable_eager_execution()       # Required for placeholders and sessions
import numpy as np

# Step 1: Prepare the training data
x_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)  # Input features
y_train = np.array([[0], [1], [1], [0]], dtype=np.float32)  # Target outputs

# Step 2: Define the model architecture
input_dim = 2
hidden_dim = 2
output_dim = 1
  • Lines 6–7: Defines the input-output relation for the XOR logic gate.
  • Lines 10–12: Defines the number of nodes for the input, hidden, and output layers.

The resulting graph looks as follows:

Network for XOR logic gate with an input, a hidden, and an output layer

Assigning weights and bias#

Now that we have the basic structure of the XOR neural network, we start by assigning weights and biases to it. Weights are numerical values associated with the graph edges; they determine how strongly each connection influences the output. Biases are parameters associated with each node, excluding the nodes in the input layer; they introduce an offset to the weighted sum of inputs to a node.

Let’s assign weights and biases to our neural network:

# Define placeholders for input and output
x = tf.placeholder(tf.float32, shape=[None, input_dim], name='x')
y = tf.placeholder(tf.float32, shape=[None, output_dim], name='y')

# Define variables for weights and biases of the hidden layer
W_hidden = tf.Variable(tf.random_normal([input_dim, hidden_dim]), name='W_hidden')
b_hidden = tf.Variable(tf.zeros([hidden_dim]), name='b_hidden')

# Define variables for weights and biases of the output layer
W_output = tf.Variable(tf.random_normal([hidden_dim, output_dim]), name='W_output')
b_output = tf.Variable(tf.zeros([output_dim]), name='b_output')
  • Lines 2–3: Defines placeholders x and y, the graph inputs that will be fed the training inputs and expected outputs at run time.

  • Lines 6–7: Defines the weights and biases of the hidden layer.

  • Lines 10–11: Defines the weights and biases of the output layer.

Note: The initial weights assigned here are randomly chosen from a normal distribution, and all biases are zeros. We will see later how to update the values of weights and biases as we train our model.

The resultant network with weights and biases looks as follows:

Network for XOR logic gate with weights and biases

As an example, the output of the first node in the hidden layer will be calculated as follows:

$$\text{hidden}_1 = \text{input}_1 \times w_1 + \text{input}_2 \times w_2 + b_1$$
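
For example, with hypothetical values $w_1 = 0.5$, $w_2 = -0.3$, and $b_1 = 0.1$, the input $(1, 0)$ would give:

$$\text{hidden}_1 = 1 \times 0.5 + 0 \times (-0.3) + 0.1 = 0.6$$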

Applying the activation function#

An activation function is applied to a node’s weighted sum of inputs plus its bias to produce the node’s output. An XOR logic gate is a non-linear operation, so we must introduce non-linearity into the model for the network to learn the XOR logic; this is exactly what an activation function provides.

For the hidden layer, we will use a sigmoid function as an activation function that provides a smooth and continuous non-linear transformation, and it is mathematically written as follows:

$$S(\text{hidden}_1) = \frac{1}{1 + e^{-\text{hidden}_1}}$$

For the output layer, we perform a linear transformation of the hidden layer outputs, weights, and bias as follows:

$$\text{output} = \text{hidden}_1 \times w_5 + \text{hidden}_2 \times w_6 + b_3$$

Let’s apply these two activation functions:

# Step 3: Apply activation functions

# Apply sigmoid activation to the hidden layer
hidden_layer = tf.nn.sigmoid(tf.matmul(x, W_hidden) + b_hidden)

# Output layer
output_layer = tf.matmul(hidden_layer, W_output) + b_output
  • Line 4: Applies the sigmoid activation function to the nodes in the hidden layer.

  • Line 7: Applies the linear transformation to calculate the network output.
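
In matrix form, these two lines compute the following, where $\mathbf{x}$ is a batch of inputs and $S(\cdot)$ is the sigmoid applied element-wise:

$$\mathbf{h} = S\left(\mathbf{x}\,W_{\text{hidden}} + \mathbf{b}_{\text{hidden}}\right), \qquad \mathbf{o} = \mathbf{h}\,W_{\text{output}} + \mathbf{b}_{\text{output}}$$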

Defining the loss function and optimizer#

Now that we have completed one pass of the network, we need to check if the network’s output matches the desired output. To do so, we define a loss function. A loss function measures the dissimilarity between the predicted and expected output. This helps us to quantify the model’s performance. We will use a mean squared error (MSE) as a loss function that calculates the average squared difference between the predicted and expected output values.
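
For a set of $N$ training examples, the MSE loss can be written as follows, where $\hat{y}_i$ is the predicted output and $y_i$ is the expected output:

$$\text{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2$$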

Based on the value of the loss function, we now need to update the model’s parameters, that is, the weights and biases. We will use the gradient descent optimization algorithm to find optimal values for them. Gradient descent tunes the parameters in the direction of the steepest descent of the loss function.
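
Concretely, each trainable parameter $w$ is repeatedly nudged against the gradient of the loss $L$, where $\eta$ is the learning rate:

$$w \leftarrow w - \eta\,\frac{\partial L}{\partial w}$$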

Let’s implement the loss function and the optimization algorithm into our machine learning model:

# Step 4: Define the loss function and optimizer
loss = tf.reduce_mean(tf.square(output_layer - y))

# Gradient descent optimizer with a learning rate of 0.1
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train_op = optimizer.minimize(loss)
  • Line 2: Defines the loss function to reduce the mean squared error between the output_layer and y. Here y is the expected output and output_layer is the predicted output.

  • Lines 5–6: Defines the optimization algorithm with the learning_rate parameter, which sets the step size taken during each parameter update. A larger learning_rate can converge faster but risks overshooting the optimal values, while a smaller learning_rate converges more slowly but is expected to give a more precise result.

Training the model#

Now that everything is in place, it’s time to train our model using the training data defined in x_train and y_train:

# Step 1: Prepare the training data
x_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)  # Input features
y_train = np.array([[0], [1], [1], [0]], dtype=np.float32)  # Target outputs

# Step 5: Train the model
num_epochs = 1000
batch_size = 4

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # Initialize variables
    for epoch in range(num_epochs):
        # Generate random mini-batches
        indices = np.random.choice(len(x_train), batch_size, replace=False)
        x_batch = x_train[indices]
        y_batch = y_train[indices]

        # Run the optimization operation
        _, current_loss = sess.run([train_op, loss], feed_dict={x: x_batch, y: y_batch})
  • Lines 2–3: Defines the training data that represents the input-output relation for the XOR logic gate.

  • Lines 6–7: Defines the training parameters num_epochs and batch_size.

  • Lines 9–18: Performs the training by passing the data in batches of size batch_size to our model. This procedure is repeated num_epochs times.
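
If you’d like to watch the training converge, an optional addition inside the loop (a small sketch; printing every 100 epochs is an arbitrary choice) reports the loss periodically:

        # Optional: place inside the training loop, after the sess.run call above
        if (epoch + 1) % 100 == 0:
            print("Epoch", epoch + 1, "loss:", current_loss)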

Let’s visualize the training procedure using the following illustration:

Training procedure

Validating the model#

Now it’s time to validate our model and check whether the predicted output matches the expected output. We can feed in our training data once again and compare the model’s output to the expected output:

# Make predictions
# Note: run this inside the same tf.Session block used for training,
# because the trained variable values live in that session
predicted_output = sess.run(output_layer, feed_dict={x: x_train})
print("Predicted outputs:", predicted_output)

The predicted output shows that our neural network has converged to the expected output of an XOR logic gate. The first and last predicted outputs are close to 0 and the second and third predicted outputs are close to 1.
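
To make this comparison explicit, one possible check (a small sketch reusing predicted_output and y_train from the snippets above) thresholds the raw outputs at 0.5 to obtain hard 0/1 predictions:

# Threshold the raw outputs at 0.5 to get binary XOR predictions
binary_predictions = (predicted_output > 0.5).astype(np.float32)
print("Binary predictions:", binary_predictions.ravel())
print("Expected outputs:  ", y_train.ravel())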

Next steps#

This blog briefly introduced TensorFlow and showed how we can build a machine learning model using tensors, graphs, and sessions.

We encourage you to explore more activation and loss functions and practice building more complex machine learning models. You can also check out the following courses on Educative to learn machine learning:

Applied Machine Learning: Industry Case Study with TensorFlow


In this course, you'll work on an industry-level machine learning project based on predicting weekly retail sales given different factors. You will learn the most efficient techniques used to train and evaluate scalable machine learning models. After completing this course, you will be able to take on industry-level machine learning projects, from data analysis to creating efficient models and providing results and insights. The code for this course is built around the TensorFlow framework, which is one of the premier frameworks for industry machine learning, and the Python pandas library for data analysis. Basic knowledge of Python and TensorFlow are prerequisites. To get some experience with TensorFlow, try our course: Machine Learning for Software Engineers. This course was created by AdaptiLab, a company specializing in evaluating, sourcing, and upskilling enterprise machine learning talent. It is built in collaboration with industry machine learning experts from Google, Microsoft, Amazon, and Apple.

3hrs
Intermediate
16 Challenges
2 Quizzes

Become a Machine Learning Engineer


Start your journey to becoming a machine learning engineer by mastering the fundamentals of coding with Python. Learn machine learning techniques, data manipulation, and visualization. As you progress, you'll explore object-oriented programming and the machine learning process, gaining hands-on experience with machine learning algorithms and tools like scikit-learn. Tackle practical projects, including predicting auto insurance payments and customer segmentation using K-means clustering. Finally, explore the deep learning models with convolutional neural networks and apply your skills to an AI-powered image colorization project.

105hrs
Beginner
17 Challenges
11 Quizzes

Machine Learning with NumPy, pandas, scikit-learn, and More


If you're a software engineer looking to add machine learning to your skillset, this is the place to start. This course will teach you to write useful code and create impactful machine learning applications immediately. From the start, you'll be given all the tools that you need to create industry-level machine learning projects. Rather than reading through dense theory, you’ll learn practical skills and gain actionable insights. Topics covered include data analysis/visualization, feature engineering, supervised learning, unsupervised learning, and deep learning. All of these topics are taught using industry-standard frameworks: NumPy, pandas, scikit-learn, XGBoost, TensorFlow, and Keras. Basic knowledge of Python is a prerequisite to this course. This course was created by AdaptiLab, a company specializing in evaluating, sourcing, and upskilling enterprise machine learning talent. It is built in collaboration with industry machine learning experts from Google, Microsoft, Amazon, and Apple.

15hrs
Intermediate
115 Challenges
8 Quizzes

Frequently Asked Questions

How do you make an ML model using TensorFlow?

  1. Install TensorFlow: pip install tensorflow.

  2. Import TensorFlow: import tensorflow as tf.

  3. Load and preprocess data: Normalize the data, extract features, and split it into training and testing sets.

  4. Build the ML model: Define the architecture of your model. model = tf.keras.Sequential([tf.keras.layers.Dense(128, activation='relu', input_shape=(input_dim,)), tf.keras.layers.Dropout(0.2), tf.keras.layers.Dense(10, activation='softmax')])

  5. Compile the model: Configure the optimizer, loss function, and metrics. model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

  6. Train the model: Train the model on training data. model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_val, y_val))

  7. Hyperparameter tuning: Adjust the number of epochs, batch size, learning rate, and other hyperparameters to improve the model.

  8. Evaluate the model: Once training is complete, evaluate the model. test_loss, test_accuracy = model.evaluate(X_test, y_test)

  9. Make predictions: Use the trained model to make predictions on new data. predictions = model.predict(new_data). A consolidated sketch of these steps appears below.
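
Putting the steps above together, here is a minimal end-to-end sketch. The data is randomly generated purely for illustration, and the layer sizes match the snippets in the steps above; adapt the input dimension, number of classes, and preprocessing to your own dataset:

import numpy as np
import tensorflow as tf

# Synthetic data purely for illustration: 1,000 samples, 20 features, 10 classes
input_dim = 20
num_classes = 10
features = np.random.rand(1000, input_dim).astype(np.float32)
labels = tf.keras.utils.to_categorical(np.random.randint(0, num_classes, size=1000), num_classes)

# Split into training and testing sets
X_train, X_test = features[:800], features[800:]
y_train, y_test = labels[:800], labels[800:]

# Build the model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(input_dim,)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(num_classes, activation='softmax')
])

# Compile the model with an optimizer, a loss function, and metrics
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model, holding out part of the training data for validation
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.1)

# Evaluate the model and make predictions
test_loss, test_accuracy = model.evaluate(X_test, y_test)
predictions = model.predict(X_test)
print("Test accuracy:", test_accuracy)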


  
