Stability AI, the developer of the Stable Diffusion text-to-image model, has introduced Stable Cascade, a text-to-image model built on a three-stage approach and released for non-commercial use.
The model's features include image creation, modification of generated images, upscaling of an existing picture's resolution, and inpainting and outpainting, in which edits are applied to specific parts of an image. It also offers Canny edge functionality, enabling users to generate a new image from just the edges of an existing one.
Additionally, the company will release checkpoints and scripts for inference, fine-tuning, ControlNet, and LoRA training on the Stability GitHub page, enabling users to experiment with the new architecture (Würstchen).
The company claims that the model compresses images more strongly than the VAE in Stable Diffusion and that additional tuning in Stage C costs 16x less than for a similar-sized Stable Diffusion model. Moreover, the model is expected to require approximately 20GB of VRAM for inference, which can be reduced further by using smaller variants.
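To give a rough sense of what the compression claim means, the sketch below compares latent grid sizes for a 1024x1024 image. The two compression factors are assumptions not stated in this brief: 8 is the commonly cited spatial downsampling of the Stable Diffusion VAE, and 42 is the figure quoted in Stability AI's public announcement of Stable Cascade. The helper `latent_side` is a hypothetical name used only for illustration.

```python
def latent_side(image_side: int, compression_factor: int) -> int:
    """Side length of the latent grid for a square image (integer division)."""
    return image_side // compression_factor


# Assumed values, not taken from this article.
SD_VAE_FACTOR = 8      # assumption: spatial downsampling of the Stable Diffusion VAE
CASCADE_FACTOR = 42    # assumption: factor quoted in the Stable Cascade announcement

IMAGE_SIDE = 1024

sd_side = latent_side(IMAGE_SIDE, SD_VAE_FACTOR)
cascade_side = latent_side(IMAGE_SIDE, CASCADE_FACTOR)

# Number of spatial positions each latent grid contains.
sd_positions = sd_side ** 2
cascade_positions = cascade_side ** 2

print(f"Stable Diffusion latent grid: {sd_side}x{sd_side} ({sd_positions} positions)")
print(f"Stable Cascade latent grid:   {cascade_side}x{cascade_side} ({cascade_positions} positions)")
print(f"Ratio of positions: {sd_positions / cascade_positions:.1f}")
```

Under these assumptions the latent grid is far smaller, which is consistent with the lower tuning cost the company reports. It does not by itself explain the exact 16x figure.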