
PyTorch vs. TensorFlow: The key differences that you should know

Nimra Zaheer
Feb 28, 2024
7 min read

Have you ever found yourself drowning in a sea of Python code written in PyTorch or TensorFlow? If you have, it might make you wonder, “Why do people always use these two frameworks for machine learning-related tasks?” Well, it’s like choosing between two heavyweight champions in machine learning. Let’s go on a journey to demystify the unique traits and distinctions that set these frameworks apart.

TensorFlow#

Google Brain developed TensorFlow as an open-source machine learning library for building and training various machine learning models, including deep learning models. TensorFlow provides a flexible and efficient platform for numerical computations and is particularly well-suited for developing and training deep neural networks.

Let’s explore computational graphs, tensors, and machine learning models in TensorFlow with code examples.

Computational graphs#

In TensorFlow, computations are represented as directed acyclic graphs, known as computational graphs. Nodes represent operations or computations, such as mathematical operations, variable assignments, or high-level operations like neural network layers. Each node performs a specific operation on its input data and produces an output.

Edges represent the flow of data between nodes. They carry tensors, which are multi-dimensional arrays, from one node to another. Edges also define the dependencies between nodes, indicating the order in which operations should be executed.

Tensors are the data structures that flow through the edges of the graph. They represent the inputs, outputs, and intermediate results of operations. Tensors can have various ranks (scalars, vectors, matrices, or higher-dimensional arrays) and are the fundamental building blocks in TensorFlow.

The following example demonstrates the creation of a simple computational graph:

import tensorflow as tf
# Define nodes in the computational graph
a = tf.constant(2.0, name='a')
b = tf.constant(3.0, name='b')
c = tf.add(a, b, name='c')
# Print the computational graph
print("Computational Graph Nodes:")
print(a)
print(b)
print(c)

In this TensorFlow code snippet, nodes are defined in a computational graph where a and b are constant nodes with values 2.0 and 3.0, respectively. The node c represents their sum, and the script prints these nodes, providing a glimpse into the structure of the computational graph.

The following illustration shows the computational graph for the code above:

Adding two tensors
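Since tensors can have various ranks, here is a minimal sketch of creating tensors of rank 0, 1, and 2 with tf.constant:

import tensorflow as tf

# Rank 0: a scalar
scalar = tf.constant(3.0)
# Rank 1: a vector
vector = tf.constant([1.0, 2.0, 3.0])
# Rank 2: a matrix
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])

print(scalar.shape)  # ()
print(vector.shape)  # (3,)
print(matrix.shape)  # (2, 2)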

Sessions and placeholders from TensorFlow 1.x are replaced by eager execution and the tf.data API in TensorFlow 2.x. Eager execution, the default mode, evaluates operations immediately, eliminating the need for explicit sessions and making the code more Pythonic; static graphs are still available through tf.function for performance-critical code. The tf.data API simplifies data input pipelines, making placeholders obsolete and providing a more efficient and flexible approach to handling input data.
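To make this concrete, here is a minimal sketch (the values and the squaring function are purely illustrative) of eager evaluation, a tf.function-compiled graph, and a tiny tf.data pipeline:

import tensorflow as tf

# Eager execution (the default): operations run immediately, no session needed
x = tf.constant([1.0, 2.0])
print(x * 2)  # tf.Tensor([2. 4.], shape=(2,), dtype=float32)

# tf.function traces the Python function into a reusable static graph
@tf.function
def square(t):
    return t * t

print(square(x))  # executes the compiled graph

# tf.data builds an input pipeline, replacing TF 1.x placeholders
dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4]).batch(2)
for batch in dataset:
    print(batch)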

Machine learning models#

TensorFlow enables the construction and training of machine learning models through its high-level APIs like Keras. Developers can easily define neural network architectures using these APIs, abstracting away many complexities associated with manual model implementation. Let’s take a look at a simple neural network that classifies images.

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
# Load and preprocess the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0 # Normalize pixel values to be between 0 and 1
# Build a simple neural network using the Keras Sequential API
model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),  # Flatten 28x28 images to a 1D array
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.2),
    layers.Dense(10, activation='softmax')  # Output layer with 10 classes for digits 0-9
])
# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Train the model on the training data
model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))
# Evaluate the model on the test data
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print(f'Test accuracy: {test_accuracy * 100:.2f}%')

In this TensorFlow code, a neural network is constructed using the Keras Sequential API to classify handwritten digits from the MNIST dataset. First, the dataset is loaded and its pixel values are normalized to the range 0–1. The network consists of a flattened input layer, a hidden dense layer with ReLU activation, dropout regularization, and an output layer with softmax activation for 10 classes (digits 0–9). The model is compiled with the Adam optimizer and sparse categorical cross-entropy loss, trained for five epochs, and then evaluated on the test data, after which the test accuracy is printed.

PyTorch#

Facebook AI Research (FAIR), now part of Meta, developed PyTorch as an open-source machine learning library for building and training various machine learning models, especially deep learning ones. PyTorch is known for its dynamic computational graph, which makes it particularly suitable for research and experimentation. Let’s discuss some key features of PyTorch.

Tensors and NumPy integration#

Tensors in PyTorch are the fundamental data structures. They behave similarly to NumPy arrays but add features such as GPU acceleration. PyTorch tensors integrate seamlessly with NumPy, making it easy to switch between the two libraries, which is crucial when working with existing codebases or datasets. Tensors can also be moved between the CPU and GPU with minimal code changes, facilitating efficient computation on GPUs.

import torch
# Creating a tensor
x = torch.tensor([1, 2, 3])
# Converting to NumPy array
numpy_array = x.numpy()
print('pytorch array:', x)
print('numpy array:', numpy_array)

Here is the output for this snippet:

pytorch array: tensor([1, 2, 3])
numpy array: [1 2 3]
The output of creating a tensor and converting it to a NumPy array

In this PyTorch code snippet, a tensor x is created with the values [1, 2, 3] and then converted to a NumPy array, numpy_array, using the .numpy() method, demonstrating the seamless interoperability between PyTorch and NumPy data structures.
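Because moving tensors between devices was mentioned above, here is a minimal sketch of CPU/GPU movement; it picks the device at runtime, so it works whether or not a CUDA-capable GPU is present:

import torch

# Pick a GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.tensor([1.0, 2.0, 3.0])
x_gpu = x.to(device)   # move the tensor to the chosen device
y = (x_gpu * 2).cpu()  # compute on the device, then move back to the CPU
print(y)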

Dynamic computational graph#

PyTorch’s dynamic computational graph allows for defining and modifying the computation graph on the fly. This contrasts with static computation graphs used by some other deep learning frameworks. The dynamic nature of PyTorch’s computational graph is advantageous during model development and debugging because we can print, modify, or analyze the computation graph at runtime. The requires_grad attribute on tensors enables automatic differentiation, allowing the computation of gradients for backpropagation.

import torch
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x**2
z = y.mean()
z.backward()
print(x.grad)

Here is the output for this snippet:

tensor([0.6667, 1.3333, 2.0000])
Output of the gradient computation with autograd

Here, a tensor with the values [1.0, 2.0, 3.0] is defined and marked for gradient tracking. Subsequent operations square it (y = x**2) and take the mean (z = y.mean()). The backward() method then calculates the gradients of z with respect to x, and x.grad holds the resulting gradients. This demonstrates automatic differentiation for computing gradients in PyTorch.

Here’s similar TensorFlow code that uses tf.GradientTape to record operations for automatic differentiation:

import tensorflow as tf

# Define the input as a trainable variable (variables are watched automatically)
x = tf.Variable([1.0, 2.0, 3.0], dtype=tf.float32)

# Record operations for automatic differentiation
with tf.GradientTape() as tape:
    y = x ** 2
    z = tf.reduce_mean(y)

# Perform the backward pass: compute gradients of z with respect to x
grads = tape.gradient(z, x)

# Print gradients
print("Gradients of x:", grads.numpy())

In this TensorFlow code, we define the input tensor x as a TensorFlow variable and express the computation y = x ** 2 and z = tf.reduce_mean(y), just as in the PyTorch code.

However, in TensorFlow, the computation must happen inside a tf.GradientTape() context so that the operations are recorded for differentiation. Because x is a variable, the tape watches it automatically; for a plain tensor, we would call tape.watch(x) explicitly. Finally, tape.gradient(z, x) computes the gradients of z with respect to x, and the result matches the PyTorch output.

PyTorch’s autograd module provides automatic differentiation. It keeps track of operations performed on tensors and computes gradients with respect to input variables. By setting requires_grad=True on tensors, we signal PyTorch to track operations for gradient computation. During the backward pass (backward()), gradients are calculated and stored in the .grad attribute of the input tensors.
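One related detail: gradients accumulate in the .grad attribute across successive backward() calls, which is why training loops (like the one in the next section) reset them with optimizer.zero_grad(). A minimal sketch:

import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

(x ** 2).sum().backward()
print(x.grad)  # tensor([2., 4.])

(x ** 2).sum().backward()
print(x.grad)  # tensor([4., 8.]) -- gradients accumulated, not replaced

x.grad.zero_()  # reset the accumulated gradients in place
print(x.grad)  # tensor([0., 0.])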

Machine learning algorithms#

As a dynamic and versatile deep learning library, PyTorch supports a wide range of machine learning algorithms catering to various tasks. Let’s take a look at a simple linear regression model implemented in PyTorch:

import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt

# Step 1: Create a synthetic dataset
torch.manual_seed(42)  # Setting a seed for reproducibility

# Generating synthetic data
X = 2 * torch.rand(100, 1)
y = 4 + 3 * X + 0.1 * torch.randn(100, 1)

# Step 2: Define the model
class LinearRegressionModel(nn.Module):
    def __init__(self):
        super(LinearRegressionModel, self).__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)

# Create an instance of the model
model = LinearRegressionModel()

# Step 3: Specify the loss function and optimizer
criterion = nn.MSELoss()  # Mean squared error loss
optimizer = optim.SGD(model.parameters(), lr=0.01)  # Stochastic gradient descent

# Step 4: Train the model
num_epochs = 1000
for epoch in range(num_epochs):
    # Forward pass
    outputs = model(X)
    loss = criterion(outputs, y)
    # Backward pass and optimize
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# Step 5: Make predictions and visualize
with torch.no_grad():
    predicted = model(X)

plt.scatter(X.numpy(), y.numpy(), label='Original data')
plt.plot(X.numpy(), predicted.numpy(), 'r-', label='Fitted line')
plt.legend()
plt.show()

Here is the output for this snippet:

Epoch [100/1000], Loss: 0.2024
Epoch [200/1000], Loss: 0.1098
Epoch [300/1000], Loss: 0.0627
Epoch [400/1000], Loss: 0.0371
Epoch [500/1000], Loss: 0.0232
Epoch [600/1000], Loss: 0.0156
Epoch [700/1000], Loss: 0.0115
Epoch [800/1000], Loss: 0.0093
Epoch [900/1000], Loss: 0.0081
Epoch [1000/1000], Loss: 0.0074
Output for linear regression in PyTorch

The fitted line can be visualized as follows:

Visualizing the fitted line for linear regression in PyTorch

In this PyTorch code, a simple linear regression model is trained on a synthetic dataset. The dataset is generated with random values, and a linear regression model with one input and one output is defined. The mean squared error loss function and the stochastic gradient descent optimizer are then specified. The model is trained for 1000 epochs, with each epoch involving a forward pass, loss calculation, backward pass for gradient computation, and an optimizer update; the loss is printed every 100 epochs. Finally, the trained model is used to make predictions, and the original data is visualized along with the fitted regression line using Matplotlib.

TensorFlow vs. PyTorch#

Let’s dive into some key differences between the two libraries:

  • Computational graphs: TensorFlow traditionally used static computational graphs (and still offers them via tf.function), while PyTorch employs a dynamic one that is built as the code runs. This impacts flexibility and ease of debugging during model development.

  • Usability: PyTorch is often considered more intuitive and user-friendly, especially for those new to deep learning. TensorFlow’s learning curve can be steeper, but TensorFlow 2.0 and the adoption of Keras have made it more accessible.

  • Community: Both frameworks have large and active communities, but PyTorch gained popularity in the research community due to its dynamic nature. TensorFlow is widely adopted in the industry and has extensive tooling for deployment.

  • Model deployment: TensorFlow is often chosen for production deployments due to its graph-export capabilities and mature integration with TensorFlow Serving. PyTorch has made strides with deployment tools like TorchServe, but TensorFlow is still popular in production environments. Both frameworks can export a trained model to a self-contained format, as sketched below.
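As a rough illustration of that export step (the paths are placeholders, and the snippets assume that model is the trained Keras model from the MNIST example and torch_model is a trained nn.Module):

import tensorflow as tf
import torch

# TensorFlow: save in the SavedModel format that TensorFlow Serving consumes
# (assumes `model` is a trained tf.keras model, e.g., the MNIST classifier above)
tf.saved_model.save(model, 'exported/mnist_model')

# PyTorch: compile to TorchScript so the model can run without Python,
# e.g., behind TorchServe (assumes `torch_model` is a trained nn.Module)
scripted = torch.jit.script(torch_model)
scripted.save('exported/linear_regression.pt')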

In summary, the choice between TensorFlow and PyTorch depends on personal preference, the nature of the project, and whether the focus is on production deployment or research and experimentation. Both frameworks are powerful and capable, and developers often find themselves comfortable with either based on their specific needs and experiences.

If you want to learn more about TensorFlow and PyTorch frameworks, look no further! Check out the following exciting courses available on the Educative platform:

Applied Machine Learning: Industry Case Study with TensorFlow

In this course, you'll work on an industry-level machine learning project based on predicting weekly retail sales given different factors. You will learn the most efficient techniques used to train and evaluate scalable machine learning models. After completing this course, you will be able to take on industry-level machine learning projects, from data analysis to creating efficient models and providing results and insights. The code for this course is built around the TensorFlow framework, which is one of the premier frameworks for industry machine learning, and the Python pandas library for data analysis. Basic knowledge of Python and TensorFlow are prerequisites. To get some experience with TensorFlow, try our course: Machine Learning for Software Engineers. This course was created by AdaptiLab, a company specializing in evaluating, sourcing, and upskilling enterprise machine learning talent. It is built in collaboration with industry machine learning experts from Google, Microsoft, Amazon, and Apple.

3hrs
Intermediate
16 Challenges
2 Quizzes

Deep Learning with PyTorch Step-by-Step: Part I - Fundamentals

This course is designed to provide you with an easy-to-follow, structured, incremental, and from-first-principles approach to learning PyTorch. In this course, you’ll be introduced to the fundamentals of PyTorch: autograd, model classes, datasets, data loaders, and more. You will develop, step-by-step, not only the models themselves but also your understanding of them. You'll be shown both the reasoning behind the code and how to avoid some common pitfalls and errors along the way. By the time you finish this course, you’ll have a thorough understanding of the concepts and tools necessary to start developing and training your own models using PyTorch.

8hrs
Beginner
184 Playgrounds
20 Quizzes

  
