Overview of Transformers

Learn how transformers revolutionized the field of deep learning.

Transformer models in conversational AI

Over the last few decades, multiple developments in the field of natural language processing (NLP) have culminated in large language models (LLMs), most notably through the introduction of transformers. Transformers were introduced by Ashish Vaswani et al. in the 2017 paper “Attention Is All You Need.”

Transformers revolutionized the field of deep learning, offering a modern architecture that outperforms the recurrent neural networks (RNNs) and long short-term memory (LSTM) networks that were previously the standard for sequence modeling. This architecture not only simplifies the structure of neural networks but also significantly reduces training time.

The evolution of NLP through time

Deep neural networks had already been in development for decades. RNNs (recurrent neural networks) rose to prominence in the 1990s, and LSTMs (long short-term memory networks) followed in 1997. The basic attention mechanism became popular in neural network architectures around 2014, where it improved the performance of various sequential models, including RNNs, LSTMs, and GRUs (gated recurrent units). The transformer model was introduced in the paper “Attention Is All You Need” in 2017. BERT (Bidirectional Encoder Representations from Transformers), released by researchers at Google in 2018, became one of the first models to apply the transformer architecture to NLP tasks. Since 2018, transformer models have been widely adopted, with many adaptations and improvements; models such as GPT, T5, and others demonstrate the flexibility and effectiveness of the architecture. As of 2020, transformers are used extensively in generative AI, with models such as GPT-3 showing impressive capabilities for generating human-like text.

Transformers process text by first tokenizing it. Tokenization is the process of converting text into smaller units, or tokens, such as words or sub-words. This step is crucial for transforming natural language into a format that the model can process. These tokens are then mapped to vector representations using embedding tables, allowing the model to understand and generate text. Transformers power many applications that we use on a daily basis, such as text completion features in smartphone messaging apps (next-word prediction and auto-correction).
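To make this concrete, here is a minimal sketch of tokenization followed by an embedding lookup. The vocabulary, sentence, and embedding dimension are toy values invented for illustration; real models use learned sub-word tokenizers (such as BPE) and far larger embedding tables.

```python
import numpy as np

# Toy vocabulary and embedding table, invented for illustration.
vocab = {"the": 0, "cat": 1, "sat": 2, "<unk>": 3}
embedding_dim = 4
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), embedding_dim))

def tokenize(text):
    # Naive whitespace tokenization; unknown words map to <unk>.
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

tokens = tokenize("The cat sat")      # -> [0, 1, 2]
vectors = embedding_table[tokens]     # one vector per token: (3, embedding_dim)
print(tokens)
print(vectors.shape)
```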

Device keyboard

Once the text is embedded, the attention mechanism within the transformer model processes and interprets the input data, enabling a more nuanced understanding of the text and stronger generation capability. Essentially, the attention mechanism allows the model to focus on different parts of the input data when generating each word in the output, attending to the most relevant words at each step of the sequence. This is achieved by calculating how much importance each word in the input sequence should receive relative to the other words when predicting a specific word in the output. The self-attention mechanism uses sets of queries, keys, and values derived from the input data to perform this calculation. As a result, transformers can understand context and the relationships between words. This ability to allocate attention across the input sequence allows transformers to generate responses that enhance the quality of interaction in applications such as chatbots.
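The following is a minimal NumPy sketch of this calculation for a single attention head. The dimensions and random projection matrices are toy values chosen for illustration; production implementations add multiple heads, masking, and learned parameters.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence of embeddings X.

    X: (seq_len, d_model); W_q, W_k, W_v: (d_model, d_k) projections.
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    # How relevant each word is to every other word, scaled by sqrt(d_k).
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted mix of the value vectors.
    return weights @ V

# Toy dimensions, invented for illustration.
rng = np.random.default_rng(0)
d_model, d_k, seq_len = 8, 4, 3
X = rng.normal(size=(seq_len, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (3, 4)
```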

The output of the self-attention mechanism is then passed through a feed-forward neural network, which processes that data before it contributes to the final output. In practical applications, such as composing messages in a messaging app, a couple of words are suggested to the user. Under the hood, the sentence is sent to a neural network that predicts the next possible words as a probability vector, as shown below.

Neural network
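As a rough illustration of that probability vector, the sketch below applies a softmax to hypothetical next-word scores; the candidate words and score values are invented for this example.

```python
import numpy as np

# Hypothetical scores (logits) a network might assign to candidate
# next words for some prefix -- the values here are invented.
candidates = ["home", "to", "out", "away"]
logits = np.array([2.1, 1.3, 0.4, -0.5])

# Softmax turns the raw scores into a probability vector.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

for word, p in sorted(zip(candidates, probs), key=lambda x: -x[1]):
    print(f"{word}: {p:.2f}")
```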

This predictive capability, stemming from the transformer’s ability to weigh the context and relevance of each word in the sequence, allows for the generation of contextually relevant suggestions, enhancing the user experience.

Understanding transformer architecture

Although the transformer architecture is easier to understand than recurrent neural networks, it still consists of many blocks and layers, each comprising several sub-layers. Below is the famous transformer architecture:

The transformer architecture

To understand transformers, we need to separate their architecture into two major blocks: the encoder (on the left side of the preceding picture) and the decoder (on the right side).
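Before walking through each block, it can help to see this two-block structure in code. The sketch below instantiates PyTorch’s built-in nn.Transformer module with the hyperparameters of the base model from the original paper; the random tensors stand in for already-embedded source and target sequences, since this module deliberately leaves tokenization, embedding, and positional encoding to the user.

```python
import torch
import torch.nn as nn

# Encoder-decoder transformer with the base hyperparameters from
# "Attention Is All You Need": d_model=512, 8 heads, 6+6 layers.
model = nn.Transformer(
    d_model=512,
    nhead=8,
    num_encoder_layers=6,
    num_decoder_layers=6,
    dim_feedforward=2048,
    batch_first=True,
)

# Random stand-ins for embedded sequences: (batch, seq_len, d_model).
src = torch.rand(1, 10, 512)  # source sequence fed to the encoder
tgt = torch.rand(1, 7, 512)   # target sequence fed to the decoder

out = model(src, tgt)
print(out.shape)  # torch.Size([1, 7, 512])
```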

The encoder

  1. The text is sent to the transformer model.

  2. The text is encoded using tokenization and embedding methods.

  3. Positional encoding is applied to the embedding vectors from the previous step to preserve the order of the words in the sentence or paragraph (see the sketch after this list).

  4. Self-attention using query, key, and value vectors is performed on the positionally encoded vectors. The dot product is taken between ...
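To make step 3 concrete, here is a minimal NumPy sketch of the sinusoidal positional encoding described in “Attention Is All You Need”; the sequence length and model dimension are toy values chosen for illustration.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding from "Attention Is All You Need":
    PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    """
    positions = np.arange(seq_len)[:, np.newaxis]                # (seq_len, 1)
    div = np.power(10000.0, np.arange(0, d_model, 2) / d_model)  # (d_model/2,)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(positions / div)  # even dimensions
    pe[:, 1::2] = np.cos(positions / div)  # odd dimensions
    return pe

# Adding the encoding to the token embeddings injects word order.
embeddings = np.zeros((3, 8))  # placeholder embeddings for 3 tokens
encoded = embeddings + positional_encoding(3, 8)
print(encoded.shape)  # (3, 8)
```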
