UK-based Stability AI is actively involved in building AI applications and offers a range of foundation models, including:
Text-to-image model
Stable Diffusion (SDXL 1.0): This open-source text-to-image model uses a two-stage architecture that pairs a 3.5-billion-parameter base model with a refiner, for a combined 6.6-billion-parameter ensemble pipeline. The model was trained on image-caption pairs from the LAION-5B dataset, which is derived from Common Crawl data. It excels in high-quality image generation, spatial object positioning, and complex concept recognition. The model can be fine-tuned for specific tasks using techniques such as embeddings and hypernetworks, and fine-tuning methods like DreamBooth.
Stable Diffusion 3 Medium: An open-source text-to-image model with 2 billion parameters that uses a diffusion transformer architecture and can be fine-tuned using small datasets.
Stable Fast 3D: A model that generates 3D assets from a single 2D image.
In November 2023, the company launched two new products: Sky Replacer and Stable 3D. Sky Replacer lets users replace the sky in their photos with one of nine alternatives to enhance the overall appeal of the image. Stable 3D aims to turn the creation of textured 3D objects from images or text prompts into a quick, automatic process.
Text-to-text models
Stable Beluga 1: This instruction-following language model is based on the 65-billion-parameter LLaMA model and has reasoning capabilities.
Stable Beluga 2 (FreeWilly): A successor to Stable Beluga 1, this model is based on the 70-billion-parameter Llama 2 model and reportedly ranks near the top of the Open LLM Leaderboard.
StableLM is an open-source language model suite; the company has released alpha versions of three-billion- and seven-billion-parameter models, with 15-billion- to 65-billion-parameter models to follow. It is designed to generate text and code, and can power various downstream applications. Stable LM 2 1.6B, an AI language model designed for multiple languages, was released in January 2024.
Japanese StableLM Alpha is a seven-billion-parameter Japanese language model that, according to the company, outperforms other publicly available Japanese language models. It offers text generation capabilities and is released under the Apache License 2.0 for commercial use, with an additional version available for research purposes.
StableCode has three billion parameters and is specifically designed to help programmers with coding tasks by providing autocomplete suggestions and responses to programming instructions.
Text-to-audio model
Stable Audio, which has 907 million parameters, uses a latent diffusion model architecture for audio and is conditioned on various factors including text metadata, audio file duration, and start time. This conditioning allows control over the content and length of the generated audio.
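The timing conditioning described above can be illustrated with a minimal sketch. It assumes the two timing signals are a chunk start time and a total duration, both in seconds. The class name, function names, and the 95-second default window are hypothetical choices made for this example and are not taken from Stability AI's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingConditioning:
    """Hypothetical container for the two timing values passed to the model."""
    seconds_start: float  # where the audio chunk begins inside its source file
    seconds_total: float  # total duration the model should associate with the clip


def training_conditioning(file_duration: float, chunk_start: float) -> TimingConditioning:
    """Timing values for a training chunk cut from a longer audio file."""
    if file_duration <= 0:
        raise ValueError("file_duration must be positive")
    if not 0 <= chunk_start < file_duration:
        raise ValueError("chunk_start must fall inside the file")
    return TimingConditioning(chunk_start, file_duration)


def inference_conditioning(desired_length: float, window: float = 95.0) -> TimingConditioning:
    """Timing values that request a clip starting at 0 s.

    The 95-second window is an assumed maximum output length for this sketch.
    """
    if desired_length <= 0:
        raise ValueError("desired_length must be positive")
    return TimingConditioning(0.0, min(desired_length, window))
```

In this sketch, requesting a 20-second clip means calling `inference_conditioning(20.0)`, while a request longer than the assumed window is capped at the window length.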
Text-to-image model
Stable Diffusion 3, a suite of models ranging from 800 million to 8 billion parameters that combines a diffusion transformer architecture with flow matching. The model can also handle multi-subject prompts and renders text in images with improved spelling. Public preview of this model was announced in June 2024.
Stable Cascade, a text-to-image model developed with a three-phase approach for image creation and modification.
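The flow matching mentioned for Stable Diffusion 3 can be illustrated with a toy sketch. It assumes the straight-line (rectified flow) variant, in which a sample moves linearly between data and noise and a network is trained to predict the constant velocity along that path. The function names are hypothetical, plain Python lists stand in for image latents, and an oracle replaces the trained network, so this is not a description of the model's actual training code.

```python
def interpolate(x0, noise, t):
    """Point on the straight path from data (t=0) to noise (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, noise)]


def velocity_target(x0, noise):
    """Regression target: the constant velocity d(x_t)/dt = noise - x0."""
    return [b - a for a, b in zip(x0, noise)]


def euler_sample(noise, velocity_fn, steps=10):
    """Integrate from t=1 (pure noise) back to t=0 (data) with Euler steps."""
    x = list(noise)
    dt = 1.0 / steps
    for i in range(steps):
        t = 1.0 - i * dt
        v = velocity_fn(x, t)  # a trained network would be called here
        x = [xi - dt * vi for xi, vi in zip(x, v)]
    return x


# Toy check with an oracle that already knows the true velocity.
data = [1.0, -2.0]
noise = [0.5, 0.25]
oracle = lambda x, t: velocity_target(data, noise)
recovered = euler_sample(noise, oracle, steps=10)
```

Because the oracle velocity is constant along a straight path, the Euler integration lands back on the original data point up to floating-point error. A learned model only approximates this velocity, which is why the number of sampling steps matters in practice.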
Text-to-video model
Stable Video Diffusion, an AI model designed to produce videos from images, offered as a research preview. The models, SVD and SVD-XT, can generate short, high-quality clips but come with limitations.
Stable Video 3D (SV3D), a generative AI tool that creates multi-view 3D videos and 3D models from a single input image and can produce short video content from image or text prompts.
Stable Video 4D, a model that enables users to upload video and obtain dynamic videos from different views.
There has been considerable debate about the ethical and copyright implications of using Stable Diffusion to create images. The model is particularly permissive about the types of content it will create, which has raised concerns about its potential for abuse. Separately, Stability AI has become involved in copyright infringement lawsuits, with artists and Getty Images raising legal objections to the use of their content in the model's training data.
In March 2024, Stability AI's founder and CEO, Emad Mostaque, resigned from his role and the company's board to focus on the development of decentralized AI to bring transparency and competitiveness to AI.
In April 2024, the company made Stable Diffusion 3 and Stable Diffusion 3 Turbo available on the Stability AI developer platform API.
In May 2024, Stability AI introduced Stable Artisan, a tool to enable media generation for Discord users.
Key customers and partnerships
In December 2022, Stability AI entered a partnership with Amazon, with AWS designated as the preferred cloud provider to facilitate the development and expansion of Stability AI's AI models.
In March 2024, Stability AI partnered with Tripo AI, a provider of AI 3D models, to release TripoSR, a model for 3D object reconstruction that gives entertainment, gaming, industrial design, and architecture professionals responsive outputs for visualizing detailed 3D objects.
In April 2024, the company partnered with Fireworks AI, a provider of API platforms, to deliver Stable Diffusion 3 and Stable Diffusion 3 Turbo for enterprises.
In September 2024, the company launched three of its text-to-image models, Stable Image Ultra, Stable Diffusion 3 Large, and Stable Image Core, on Amazon Bedrock.