All Updates

Product updates
Cerebras launches Cerebras Inference for improved AI model deployment
Machine Learning Infrastructure
Aug 27, 2024
This week:
Product updates
Meta introduces Meta Credits digital currency for Horizon Worlds
Metaverse Platforms
Today
Product updates
Liquid AI launches STAR framework for AI model architecture optimization
Foundation Models
Today
Funding
9fin raises USD 50 million in Series B funding to expand US operations
Capital Markets Tech
Yesterday
Partnerships
Shastic partners with MeridianLink to provide AI workflow automation for financial institutions
Workflow Automation Platforms
Yesterday
Funding
KisoJi Biotechnology raises CAD 41 million to develop oncology candidate
AI Drug Discovery
Yesterday
M&A
Deel acquires UK money transfer startup Atlantic Money for undisclosed sum
Remote Work Infrastructure
Yesterday
Funding
Acorai raises EUR 4.2 million in funding to advance clinical studies
Next-gen Medical Devices
Yesterday
Product updates
Bosch announces Light Drive, an AR solution for all-day smart glasses
Extended Reality
Yesterday
Partnerships
Product updates
Vuzix expands OSHA collaboration; launches upgraded M400 AR smart glasses
Extended Reality
Yesterday
Product updates
StrikerVR launches pre-orders for consumer-facing Mavrik haptic VR gun
Extended Reality
Yesterday
Machine Learning Infrastructure

Aug 27, 2024

Cerebras launches Cerebras Inference for improved AI model deployment

Product updates

  • Cerebras Systems has introduced Cerebras Inference, a new service for running AI models. The service offers speeds of 1,800 tokens per second for Llama 3.1 8B and 450 tokens per second for Llama 3.1 70B, with pricing starting at 10 cents per million tokens.

  • Cerebras Inference maintains 16-bit accuracy throughout the inference process, offering high performance without compromising accuracy. The service is available in three tiers: Free, Developer, and Enterprise. The Developer tier provides an API endpoint for flexible, serverless deployment, while the Enterprise tier offers fine-tuned models, custom SLAs, and dedicated support.

  • Cerebras claims its inference service is 20x faster than NVIDIA GPU-based solutions in hyperscale clouds and offers 100x higher price performance for AI workloads. According to the company, this speed enables developers to build next-generation AI applications requiring complex, multi-step, real-time task performance, such as AI agents.

  • Analyst QuickTake: Cerebras Inference's claimed performance and lower costs challenge NVIDIA's GPU-based offerings and could reshape how AI inference is deployed. With nearly 20x the speed at a fraction of the cost, the service may appeal to developers and businesses seeking more efficient and affordable AI deployment options.
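The throughput and pricing figures quoted above translate directly into latency and cost estimates for a given workload. The sketch below is a back-of-envelope calculator using only the numbers stated in the announcement (1,800 and 450 tokens per second; 10 cents per million tokens at the entry level); the model keys and the 10,000-token example workload are illustrative assumptions.

```python
# Figures quoted in the Cerebras Inference announcement above.
# Model keys and the example workload size are illustrative assumptions.
RATES_TOK_PER_S = {
    "llama-3.1-8b": 1800,   # tokens per second
    "llama-3.1-70b": 450,
}
PRICE_USD_PER_MILLION_TOKENS = 0.10  # entry-level pricing

def generation_time_s(model: str, tokens: int) -> float:
    """Seconds to generate `tokens` at the quoted throughput."""
    return tokens / RATES_TOK_PER_S[model]

def cost_usd(tokens: int) -> float:
    """Cost in USD at the quoted entry-level price."""
    return tokens * PRICE_USD_PER_MILLION_TOKENS / 1_000_000

# Example: a hypothetical 10,000-token multi-step agent trace.
print(round(generation_time_s("llama-3.1-8b", 10_000), 2), "s")  # ~5.56 s
print(round(cost_usd(10_000), 4), "USD")                         # ~0.001 USD
```

At these rates, even a long multi-step agent run completes in seconds for well under a cent, which is the basis of the company's real-time agent pitch.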
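The Developer tier's API endpoint is described above only as "flexible, serverless." As a hypothetical sketch of what a client request might look like, the snippet below assembles an OpenAI-style chat-completions payload; the endpoint URL, model identifier, and payload shape are all assumptions modeled on that common format, not taken from Cerebras documentation.

```python
import json

# Assumed endpoint and request shape (OpenAI-style chat completions);
# not confirmed by the announcement above.
BASE_URL = "https://api.cerebras.ai/v1/chat/completions"  # assumption

def build_request(model: str, prompt: str, api_key: str) -> dict:
    """Assemble the URL, headers, and JSON body for one completion call."""
    return {
        "url": BASE_URL,
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

# The request could then be sent with any HTTP client, e.g. requests.post(...).
req = build_request("llama-3.1-8b", "Summarize this week's updates.", "YOUR_KEY")
```

Separating payload construction from transport keeps the sketch runnable offline and makes it easy to swap in whatever endpoint and model names the service actually exposes.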
