Foundation models are large AI models trained on vast datasets to perform a wide range of tasks. These models mark a departure from the smaller, task-specific models that previously dominated the AI landscape.
First popularized by the Stanford Institute for Human-Centered Artificial Intelligence, the term “foundation models” refers to how these large models serve as the foundation for developing more refined models and applications across diverse domains. For example, GPT-series foundation models like GPT-3 and GPT-4 are harnessed by hundreds of startups and established companies to develop task-specific models for applications ranging from content creation to marketing plan generation.
This hub includes organizations that have built large AI models, whether for in-house product development, to drive AI research, or for commercialization via third-party API access and other means. Due to the prohibitive costs of training and running large AI models, the space remains strongly incumbent-driven and is led by Big Tech firms, followed by AI research labs and communities.
Among startups, OpenAI is a frontrunner in terms of commercializing foundation models. Big Tech firms like Google and Meta had previously refrained from releasing foundation models for public use and, instead, leveraged them to enhance the functionality of in-house products. However, this posture has since shifted, with Google recently making its foundation models available via the PaLM API and Vertex AI platform, and Meta open-sourcing models like LLaMA. AI research labs and communities, such as EleutherAI and BigScience, tend to focus on open-sourcing AI development.
Foundation models are currently deployed across a variety of industries to develop new applications and introduce new AI-powered features in existing products. OpenAI was the first player to commercialize its foundation models, and its models remain the most widely used. For instance, OpenAI’s GPT-3 reportedly powers over 300 applications, while its fine-tuned, coding-focused descendant, Codex, powers over 70. Other foundation model builders like Google, AI21 Labs, Anthropic, and Midjourney are now quickly following suit with the commercialization of their models.
Large language models, fine-tuned language models, and multimodal models remain the most popular among users. Many providers have yet to disclose customer use cases for newer model types like audio, video, and speech models. Some foundation models are also still at the research stage (for example, Meta's ImageBind model).
We have identified key use cases below:
Foundation models are primarily segmented by data type, specifically the type of input data (prompt) a user provides and the type of output data the model generates. For instance, LLMs take textual inputs and generate textual outputs, while video models take textual inputs and generate video outputs.
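This input/output segmentation can be sketched as a simple mapping. The model names and modality labels below are illustrative assumptions chosen for this sketch, not an authoritative taxonomy.

```python
# Illustrative sketch: segmenting foundation models by (input, output) modality.
# Model names and modality labels are examples only, not an exhaustive taxonomy.

MODALITY_MAP = {
    # model name: (input modality, output modality)
    "GPT-4": ("text", "text"),     # large language model
    "Codex": ("text", "code"),     # fine-tuned, coding-focused model
    "DALL-E": ("text", "image"),   # text-to-image model
    "Whisper": ("audio", "text"),  # speech-to-text model
}

def segment(models):
    """Group model names by their (input, output) modality pair."""
    groups = {}
    for name, pair in models.items():
        groups.setdefault(pair, []).append(name)
    return groups

if __name__ == "__main__":
    for (inp, out), names in segment(MODALITY_MAP).items():
        print(f"{inp} -> {out}: {', '.join(names)}")
```

Segmenting on the (input, output) pair rather than on a single label makes multimodal models straightforward to place: a text-to-video model simply occupies a different cell of the same grid.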