OctoAI, founded in 2019 as a spinoff from the University of Washington, is a Seattle-based startup that specializes in optimizing and deploying machine learning and generative AI models. The company's flagship product, OctoStack, lets businesses deploy large language models (LLMs) efficiently in private cloud environments while maintaining data sovereignty and security. OctoAI's platform supports various open-source models, including Llama 2, Mistral, and Stable Diffusion, so developers can experiment with and customize models for their specific needs. The company's technology focuses on improving the performance and cost-efficiency of AI model inference through capabilities such as automated hardware selection and model optimization. In November 2023, OctoAI launched its Text Gen Solution, featuring accelerated open-source LLMs and the option for customers to bring their own fine-tuned Llama 2 models. The company positions this approach as a flexible "model-cocktail" alternative to monolithic multi-modal models: developers combine separate models to build highly composable multi-modal applications.
In September 2024, NVIDIA acquired OctoAI for USD 250 million to expand its efforts in machine learning compilers and cloud infrastructure for AI. OctoAI's CEO and co-founder, Luis Ceze, will join NVIDIA to contribute to these initiatives.