Transformers Applied to Computer Vision
Learn about how transformers are applied to computer vision.
Overview
This course is about NLP, not computer vision. However, in the previous lessons of this chapter, we implemented transformer models for general-purpose sequences that can be applied to many domains. Computer vision is one of them.
The title of the article by Dosovitskiy et al. (2021) says it all: “An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale.”
Google has made vision transformers available in a Jupyter notebook.
The Jupyter notebook Compact_Convolutional_Transformers.ipynb (under the “Code playground” section) is self-explanatory. You can explore it to see how it works. However, bear in mind that when Industry 4.0 reaches maturity and Industry 5.0 kicks in, the best implementations will be obtained by integrating our data into Cloud AI platforms. Local development will diminish, and companies will turn to Cloud AI rather than bearing the costs of local development, maintenance, and support.
Some code contents
The notebook’s table of contents follows a transformer process we have gone through several times in this course. However, this time, it is applied to sequences of digital image information rather than words, as sketched below.
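To make an image look like a sequence, it is first split into small patches, each of which is flattened into a vector and treated like a token. The snippet below is a minimal sketch of that step (an illustration, not the notebook’s own code), assuming a 32×32 RGB image and a patch size of 4, and using TensorFlow’s tf.image.extract_patches:

```python
# Sketch: turning one image into a sequence of flattened patches ("tokens").
import tensorflow as tf

image = tf.random.uniform((1, 32, 32, 3))  # one 32x32 RGB image (CIFAR-10 size)
patch_size = 4

patches = tf.image.extract_patches(
    images=image,
    sizes=[1, patch_size, patch_size, 1],
    strides=[1, patch_size, patch_size, 1],
    rates=[1, 1, 1, 1],
    padding="VALID",
)

# Flatten the spatial grid of patches into a sequence:
# (batch, num_patches, patch_dims)
num_patches = (32 // patch_size) ** 2
patch_sequence = tf.reshape(
    patches, (1, num_patches, patch_size * patch_size * 3)
)
print(patch_sequence.shape)  # (1, 64, 48): a sequence of 64 patch vectors
```

Each of the 64 patch vectors then plays the role a word embedding plays in NLP: the sequence is projected, enriched with positional information, and fed to the transformer’s attention layers.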
The notebook follows standard deep learning methods. For example, it begins by displaying some sample images with their labels.
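As a rough illustration (not the notebook’s exact code), the sketch below loads CIFAR-10, the dataset used by the public Keras Compact Convolutional Transformers example, and plots a grid of images with their class names using matplotlib:

```python
# Sketch: display a grid of sample images with their labels (assumes CIFAR-10).
import matplotlib.pyplot as plt
from tensorflow import keras

(x_train, y_train), _ = keras.datasets.cifar10.load_data()
class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.imshow(x_train[i])                      # show the raw 32x32 RGB image
    plt.title(class_names[int(y_train[i][0])])  # map the numeric label to its name
    plt.axis("off")
plt.show()
```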