Exploring Open-Source LLMs
Explore the difference between open-source and closed-source LLMs.
Introduction to AI democratization
The concept of AI democratization represents a shift in the landscape of technological innovation: making advanced AI technologies available to a wider base of users beyond large tech corporations and specialized research institutions. This movement is especially important in the realm of large language models because of the complexity and resource requirements of developing and training these models. AI democratization seeks to empower individuals, startups, and academic researchers with the tools they need to explore, innovate, and contribute to the field of AI.
Open-source LLMs are at the heart of this movement, serving as a catalyst for innovation and collaboration. By providing access to pretrained models and their source code, open-source initiatives foster an environment where knowledge and resources are shared freely. This approach not only reduces the financial obstacles associated with AI development and training but also promotes transparency in AI research and applications. As a result, open-source LLMs facilitate AI advancements in many sectors, such as healthcare, education, and environmental science.
Closed-source vs. open-source LLMs
Open source refers to the practice of sharing the original source code of software with the public, allowing anyone to inspect, modify, or enhance it. This openness is essential for collaboration, because collaborative development can lead to more reliable, secure, and efficient products. What makes open-source software open is the license under which it is released, which dictates the usage, permissions, and distribution rights. Building open-source software or models means that resources such as the source code, model architecture, and documentation are freely available to the public. The open-source initiative emerged as a countermeasure to the constraints of proprietary software, encouraging freedom in the development and use of software and technology. This has led to the creation of communities and organizations that support open-source initiatives, such as the Apache Software Foundation.
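Because the license determines what you may do with a model, it is worth checking before adopting one. As a minimal sketch, the license of any model hosted on Hugging Face can be inspected programmatically with the `huggingface_hub` library; the `gpt2` checkpoint below is used purely as an example:

```python
from huggingface_hub import model_info

# Fetch the public metadata of a model hosted on Hugging Face.
info = model_info("gpt2")

# The license is exposed as a "license:<id>" tag on the model card.
license_tags = [tag for tag in info.tags if tag.startswith("license:")]
print(license_tags)  # e.g., ['license:mit']
```

The same call works for any public model ID, which makes it easy to audit licensing before building on a model.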
On the other hand, closed-source LLMs are proprietary models that can only be accessed through online platforms or APIs. The source code is not shared with the wider audience, the algorithms used are not revealed, and the training data that was utilized is not detailed. Closed-source LLMs are paid models, billed either per token or through a subscription.
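This difference in access is easiest to see in code. The sketch below contrasts the two styles, assuming the `openai` client for the closed-source call and the `transformers` library for the open-source one; the prompts and model choices are illustrative only:

```python
# Closed-source: the model runs on the provider's servers; we only see an API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What is AI democratization?"}],
)
print(response.choices[0].message.content)

# Open-source: the weights are downloaded and run locally.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
print(generator("AI democratization means", max_new_tokens=40)[0]["generated_text"])
```

With the closed-source model, we pay per token and never see the weights; with the open-source one, we bear the compute cost ourselves but can inspect, fine-tune, and redistribute the model within its license terms.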
A comparison of LLMs
| LLM | Released | Maintainer | License | Accessible via | Architecture | Params (billions) | Context length (tokens) |
|---|---|---|---|---|---|---|---|
| AutoGPT | Mar 2023 | Significant Gravitas | MIT | GitHub | Decoder | 175–1,000 | 8,192 |
| BERT | Oct 2018 | Google | Apache 2.0 | Google Cloud | Encoder | 0.34 | 512 |
| BLOOMChat | May 2023 | SambaNova & Together Computer | BLOOMChat-176B License v1.0 | Hugging Face | Decoder | 176 | NA |
| Cerebras-GPT | Mar 2023 | Cerebras | Apache 2.0 | Hugging Face | Decoder | 0.111–13 | 2,048 |
| Claude | Mar 2023 | Anthropic | Proprietary | Anthropic | NA | NA | 100,000 |
| DLite (v2) | May 2023 | AI Squared | Apache 2.0 | GitHub, Hugging Face | NA | 0.124–1.5 | 1,024 |
| Dolly 2.0 | Apr 2023 | Databricks | Apache 2.0 | Hugging Face | NA | NA | 2,048 |
| Falcon-40B | May 2023 | Technology Innovation Institute (TII) | TII Falcon LLM License | Hugging Face | Decoder | 40 | 2,048 |
| Falcon-180B | Sep 2023 | Technology Innovation Institute (TII) | Falcon-180B TII License | Hugging Face | Decoder | 180 | 3,500 |
| FastChat-T5 | Apr 2023 | LMSYS | Apache 2.0 | GitHub, Hugging Face | NA | 3 | 512 |
| FinLLM | Jun 2023 | AI4Finance Foundation | MIT | GitHub (FinGPT) & GitHub (FinNLP) | NA | NA | NA |
| GPT-3.5-Turbo | Aug 2023 | OpenAI | Proprietary | OpenAI API | Decoder | 154 | 4,096 |
| GPT-J-6B | Jun 2021 | EleutherAI | Apache 2.0 | Hugging Face | NA | 6 | 2,048 |
| GPT-2 | Feb 2019 | OpenAI | MIT | GitHub, Hugging Face | Decoder | 0.117–1.542 | 1,024 |
| GPT-3 | May 2020 | OpenAI | Proprietary | OpenAI API | Decoder | 175 | 4,096 |
| GPT-4 | Mar 2023 | OpenAI | Proprietary | OpenAI API | Decoder | > 1,000 | 8,192 |
| GPT4All-J | Jun 2023 | Nomic AI | Apache 2.0 | Hugging Face | NA | 6 | NA |
| h2oGPT | May 2023 | H2O.ai | Apache 2.0 | GitHub, chatbot (Hugging Face) | NA | NA | 256 & 2,048 |
| LLaMA | Feb 2023 | Meta | Noncommercial research license / Code released under GPL 3.0 | Meta AI | Decoder | NA | 2,048 |
| Llama 2 | Jul 2023 | Meta | Llama 2 Community License | Meta AI | NA | 7, 13, 70 | 4,096 |
| Megatron-LM | Oct 2019 | NVIDIA | Megatron-LM license | GitHub, Hugging Face | NA | 8.3 | 1,024 |
| MPT-7B | May 2023 | MosaicML | Apache 2.0 | Hugging Face | NA | 6.7 | 65,000 |
| OpenLLaMA | May 2023 | UC Berkeley | Apache 2.0 | GitHub, Hugging Face | Decoder | 3, 7, 13 | 2,048 |
| Palmyra Base | Jan 2023 | Writer | Apache 2.0 | Hugging Face | Decoder | 5 | 2,048 |
| Pythia | Apr 2023 | EleutherAI | Apache 2.0 | GitHub, Hugging Face | Decoder | 0.07–12 | 2,048 |
| RedPajama-INCITE | May 2023 | together.ai | Apache 2.0 | Hugging Face | NA | 3, 7 | NA |
| RoBERTa | Oct 2019 | Meta | GNU General Public License v2.0 | Hugging Face | Encoder | 0.125 | 512 |
| StableLM | Apr 2023 | Stability AI | CC BY-SA 4.0 / Code released under Apache 2.0 | GitHub, Hugging Face | NA | NA | 4,096 |
| T5 | Oct 2019 | Google | Apache 2.0 | GitHub, Hugging Face | Encoder-decoder | 11 | 512 |
| UL2 | Oct 2022 | Google | Apache 2.0 | GitHub, Hugging Face | Encoder-decoder | 20 | 512 & 2,048 |
| Vicuna-13B | Mar 2023 | LMSYS | GNU General Public License v3.0 / Code released under Apache 2.0 | LMSYS Org, chatbot (Hugging Face), GitHub | NA | 13 | 2,048 |
It is no secret that commercial, closed-source models dominate the market today; only a handful of open-source models come close to the quality of commercial models such as OpenAI's GPT family, as the sheer number of closed-source models in production and their market performance show. However, the open-source community is working hard to bridge this gap by pooling resources, such as datasets and shared computing power, to train models that can compete with their commercial counterparts in terms of quality.
Cost implications of LLM adoption
At first glance, these closed-source models seem cheap, costing fractions of a cent per 1,000 tokens. However, as soon as we scale our applications in production and start serving thousands of users, we realize that the cost of these models adds up quickly.
Let’s estimate the cost of utilizing closed-source models from the number of users, the cost per token, and the pattern of usage per day. For example, we can take the cost per token for one of the premium models. On average, the input costs around 0.005 dollars per 1,000 tokens, and the output costs ...
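To make this concrete, here is a minimal back-of-the-envelope sketch in Python. The input price is taken from the example above; the output price, user count, and per-request token counts are illustrative assumptions, not published figures:

```python
# Back-of-the-envelope cost estimate for a closed-source LLM API.
# All values below except the input price are illustrative assumptions.

input_price_per_1k = 0.005     # USD per 1,000 input tokens (from the example above)
output_price_per_1k = 0.015    # USD per 1,000 output tokens (assumed)

users = 10_000                 # assumed daily active users
requests_per_user_per_day = 5  # assumed usage pattern
input_tokens_per_request = 500
output_tokens_per_request = 300

daily_requests = users * requests_per_user_per_day
daily_cost = daily_requests * (
    input_tokens_per_request / 1_000 * input_price_per_1k
    + output_tokens_per_request / 1_000 * output_price_per_1k
)

print(f"Daily cost:   ${daily_cost:,.2f}")    # $350.00 under these assumptions
print(f"Monthly cost: ${daily_cost * 30:,.2f}")  # $10,500.00 under these assumptions
```

Even with these modest assumptions, this hypothetical workload costs hundreds of dollars a day, which shows how per-token pricing that looks negligible in a demo becomes a major line item at production scale.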