Chunking: Entity and Relationship Extraction with LLMs at Scale

Understanding the problem: Token limit in LLMs

Tokens are the units of text that LLMs use to break down and analyze input. A token can correspond to a whole word, a piece of a word, a sequence of characters, or punctuation. For example, a short word like "chatbot" might be treated as a single token, while a sentence like "OpenAI is amazing!" is typically split into several tokens, depending on the model's tokenizer. Each large language model has a limit on the maximum number of tokens it can process in a single request.
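To see this in practice, here is a minimal sketch that uses the tiktoken library (OpenAI's open-source tokenizer, assumed to be installed via `pip install tiktoken`) to count and inspect the tokens a GPT-4-style model would see. The exact splits depend on the tokenizer version, so treat the output as illustrative rather than definitive.

```python
# A minimal sketch using tiktoken to see how text is split into tokens
# for a GPT-4-style model. The exact splits depend on the tokenizer version.
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-4")

for text in ["chatbot", "OpenAI is amazing!"]:
    token_ids = encoding.encode(text)
    # Decode each token id back to its text fragment to see the actual splits.
    pieces = [encoding.decode([tid]) for tid in token_ids]
    print(f"{text!r} -> {len(token_ids)} tokens: {pieces}")
```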

Let's take the example of the GPT-4 (8K) model that we are using in our code. The token limit for this model is 8,192 tokens, and this number includes both input tokens and output tokens.

  • Input tokens include prompts, which consist of system and user-level messages. In our case, the user-level message also includes raw text.

  • Output tokens include the response generated by the model.

If we send an input prompt that is 5,000 tokens long, the model has only 3,192 tokens left for generating a response, because the sum of input and output tokens cannot exceed the limit. If we don't account for the output token size, the model might truncate the response or fail to generate one at all because the total token limit is exceeded. We can either shorten the input to leave enough room for a meaningful response or estimate the expected size of the response based on the nature of our prompt.
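As a quick illustration of this budgeting, the sketch below counts the prompt's tokens with tiktoken and checks how much room is left for the response. The 8,192-token limit matches the GPT-4 (8K) model discussed above, while the reserved output size is an assumption chosen for illustration.

```python
import tiktoken

MODEL_TOKEN_LIMIT = 8_192      # GPT-4 (8K) total limit: input + output tokens
RESERVED_FOR_OUTPUT = 1_000    # assumed budget we want to keep for the response

encoding = tiktoken.encoding_for_model("gpt-4")

def fits_within_budget(prompt: str) -> bool:
    """Return True if the prompt leaves at least RESERVED_FOR_OUTPUT tokens for the reply."""
    prompt_tokens = len(encoding.encode(prompt))
    remaining = MODEL_TOKEN_LIMIT - prompt_tokens
    print(f"Prompt uses {prompt_tokens} tokens; {remaining} tokens remain for the response.")
    return remaining >= RESERVED_FOR_OUTPUT
```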

If we want to extract entities and relationships from a large text that exceeds the total token limit allowed by a model in a single request, we need to split our input into manageable chunks. Let’s see how it works.

Solution to the token limit: Chunking

Chunking refers to splitting the input into smaller pieces; a minimal code sketch of the full workflow follows the list below. It works by:

  • Breaking the input text into smaller chunks, each fitting within the token limit.

  • Processing each chunk separately for entity and relationship extraction.

  • Combining the results from multiple chunks to form a complete knowledge graph.
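The sketch below illustrates this workflow under a few assumptions: the chunk size is an arbitrary value that leaves room for the prompt template and the response, and `extract_entities_and_relationships` is a hypothetical placeholder for whatever LLM extraction call your pipeline uses, not a real library function. The per-chunk results are simply concatenated and de-duplicated into one list of triples.

```python
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-4")
MAX_CHUNK_TOKENS = 3_000  # assumed size; leaves room for the prompt template and the response

def split_into_chunks(text: str, max_tokens: int = MAX_CHUNK_TOKENS) -> list[str]:
    """Split text into pieces that each fit within max_tokens."""
    token_ids = encoding.encode(text)
    return [
        encoding.decode(token_ids[i : i + max_tokens])
        for i in range(0, len(token_ids), max_tokens)
    ]

def extract_entities_and_relationships(chunk: str) -> list[tuple[str, str, str]]:
    """Hypothetical placeholder for the LLM call that returns (entity, relation, entity) triples."""
    raise NotImplementedError("Call your extraction prompt / model here.")

def build_knowledge_graph(raw_text: str) -> list[tuple[str, str, str]]:
    triples: list[tuple[str, str, str]] = []
    for chunk in split_into_chunks(raw_text):
        # Each chunk is processed independently, then the results are merged.
        triples.extend(extract_entities_and_relationships(chunk))
    # Drop triples that appear in more than one chunk while preserving order.
    return list(dict.fromkeys(triples))
```

In practice, chunk boundaries are usually aligned with sentences or paragraphs, often with some overlap between consecutive chunks, so that entities and relationships spanning a boundary are not lost.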
