This course demystifies convolutional neural network architectures using PyTorch for image classification and object detection.

Image classification and object detection have gained widespread use in recent years. Content categorization and monitoring, disease diagnosis from medical images, identifying terrain in satellite images, and detecting road elements for self-driving cars are classification problems at their core. PyTorch is a popular framework for these tasks, offering a useful mix of user-friendliness, deep learning functionality, customization, and optimization.

In this course, you will cover the fundamentals of classification and object detection models and apply them to actual datasets using PyTorch. You’ll learn popular architectures and how to implement and fine-tune them for better results. Finally, you’ll learn to convert models to ONNX and OpenVINO to deploy them on edge devices.

By the end of this course, you will have the skills to use PyTorch for image classification and object detection in real-world applications.

Using PyTorch for Image Classification and Object Detection

## Summary

We covered the popular image classification architectures and the novelties each one introduced. Let's do a quick review to sum up:

* **AlexNet and VGG:** These are architectures that use only standard convolutions with different kernel sizes.

* **InceptionV1:** It uses a network-in-network approach to build deeper networks that are smaller and have fewer parameters. Its successor, InceptionV2, adds batch normalization to the convolutional layers.

* **ResNet:** It introduces residual blocks to handle the vanishing gradient problem in deep neural networks.

* **MobileNetV1:** It uses depth-wise separable convolutions to reduce model size and parameter count.

* **MobileNetV2:** It uses inverted residual blocks to reduce model size and parameter count further.

* **EfficientNet:** It uses compound scaling to obtain better-scaled models while reducing model size and parameter count.
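Two of the building blocks reviewed above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the exact layers from any of the original papers: the channel counts (32 and 64), the 56x56 input, and the names `ResidualBlock` and `count_params` are assumptions made for this example. It shows a residual block whose skip connection adds the input back to the output, and it compares the parameter count of a standard 3x3 convolution with that of a depth-wise separable one.

```python
import torch
from torch import nn


def count_params(module: nn.Module) -> int:
    """Return the total number of parameters in a module."""
    return sum(p.numel() for p in module.parameters())


class ResidualBlock(nn.Module):
    """Illustrative residual block: output = ReLU(F(x) + x)."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The skip connection gives gradients a direct path around the body.
        return torch.relu(self.body(x) + x)


in_ch, out_ch = 32, 64  # hypothetical channel counts for the comparison

# Standard convolution: every output channel looks at every input channel.
standard = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False)

# Depth-wise separable convolution: one 3x3 filter per input channel
# (groups=in_ch), followed by a 1x1 point-wise convolution that mixes channels.
separable = nn.Sequential(
    nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch, bias=False),
    nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
)

x = torch.randn(1, in_ch, 56, 56)
residual_out = ResidualBlock(in_ch)(x)

standard_params = count_params(standard)    # 32 * 64 * 3 * 3
separable_params = count_params(separable)  # 32 * 3 * 3 + 32 * 64

print(f"standard conv:  {standard_params} parameters")
print(f"separable conv: {separable_params} parameters")
```

Both convolutions map the same input to the same output shape, but with these channel counts the separable version needs roughly an eighth of the parameters, which is the saving MobileNetV1 relies on.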

## How to choose the model?

Which model to use depends on our task. If memory is limited or we want a small model, we could go for MobileNet or EfficientNet; with EfficientNet, we can adjust the scaling until we find a good fit. If accuracy matters more than model size, a deep ResNet architecture is a reasonable choice. Alternatively, knowing the fundamental components of a convolutional neural network, we could build a custom model to suit our needs.

