Spatio-Temporal Transformers

Learn to model and recognize actions in video frames using spatio-temporal transformers, with a concise Python code example.

Let's explore the integration of transformers in tasks involving temporal relations, such as video processing and other sequences of image frames. Transformers, initially designed for natural language processing (NLP), naturally extend to model temporal sequences, making them suitable for video analysis applications.

Spatial and temporal relations in video analysis

Building on the spatial relations captured by self-attention mechanisms, transformers can also address the temporal aspects of video analysis. The input dimensions shift from (C, H, W) (channels, height, and width of a single image) to (T, H, W, C), where T introduces a time dimension. This enables modeling both the spatial and temporal relations crucial for applications like moving object detection.
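
The following is a minimal sketch, assuming PyTorch and a synthetic random clip, of how the tensor shapes change when moving from a single image to a video, and how the frames can be rearranged so a standard 2D CNN can process them one frame at a time:

```python
import torch

# A single image: (C, H, W) — channels, height, width.
image = torch.randn(3, 224, 224)

# A video clip adds a time dimension: (T, H, W, C).
# Here T = 16 frames; the values are random and purely illustrative.
video = torch.randn(16, 224, 224, 3)

# For per-frame CNN processing, rearrange to (T, C, H, W) so the
# clip looks like an ordinary batch of T images.
frames = video.permute(0, 3, 1, 2)
print(frames.shape)  # torch.Size([16, 3, 224, 224])
```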

Video transformer network architecture

The Video Transformer Network (VTN) architecture, introduced in 2021, was designed to perform video classification tasks.

This architecture processes each video frame with a convolutional neural network (CNN) backbone, extracting a 2D embedding f(x) per frame. These embeddings are then combined with positional embeddings (PE_0, PE_1, …, PE_n) that represent the time steps of the frames. A transformer encoder performs temporal attention across all frames, and a special classification token that attends over all time steps produces the final video-level prediction.
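
Below is a minimal sketch of this idea in PyTorch. It is not the original VTN implementation (the published VTN uses a sliding-window, Longformer-style temporal encoder); here a standard transformer encoder stands in, a ResNet-18 serves as the frame backbone, and the class name MiniVTN and all hyperparameters are illustrative assumptions:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MiniVTN(nn.Module):
    """Simplified VTN-style model: per-frame CNN embeddings, learned
    positional embeddings, a temporal transformer encoder, and a
    classification token. Hyperparameters are illustrative only."""

    def __init__(self, num_classes: int, max_frames: int = 32, dim: int = 512):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()  # strip the ImageNet head -> 512-d features
        self.backbone = backbone
        # Learnable classification token and positional embeddings (PE_0..PE_n).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, max_frames + 1, dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, video: torch.Tensor) -> torch.Tensor:
        # video: (B, T, C, H, W)
        b, t, c, h, w = video.shape
        # Fold time into the batch so the CNN sees ordinary images.
        feats = self.backbone(video.reshape(b * t, c, h, w))  # (B*T, 512)
        feats = feats.reshape(b, t, -1)                       # (B, T, 512)
        cls = self.cls_token.expand(b, -1, -1)                # (B, 1, 512)
        # Prepend the classification token and add positional embeddings.
        x = torch.cat([cls, feats], dim=1) + self.pos_embed[:, : t + 1]
        x = self.encoder(x)             # temporal attention across all frames
        return self.head(x[:, 0])       # classify from the [CLS] position

model = MiniVTN(num_classes=10)
clip = torch.randn(2, 8, 3, 224, 224)   # 2 clips of 8 frames each
logits = model(clip)
print(logits.shape)                      # torch.Size([2, 10])
```

Folding the time dimension into the batch lets the 2D backbone stay unchanged, while the transformer encoder handles all cross-frame interaction; this separation of spatial and temporal processing is the core design choice of VTN-style architectures.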
