All Updates

Product updates
Cerebras launches Cerebras Inference for improved AI model deployment
Machine Learning Infrastructure
Aug 27, 2024
This week:
Management news
Consensys conducts layoffs citing regulatory battles and market pressure
Enterprise Blockchain Solutions
Yesterday
Management news
Consensys conducts layoffs citing regulatory battles and market pressure
Web3 Ecosystem
Yesterday
Funding
HOMEE raises USD 12 million in Series C funding to develop smart claims management platform
InsurTech: Infrastructure
Yesterday
Product updates
Snap launches simplified version of Snapchat to drive user engagement; adds new AI features
Creator Economy
Yesterday
Partnerships
Bloomreach partners with Planet to offer unified commerce personalization solution
Marketing Automation
Yesterday
Partnerships
Criteo partners with OMS providers and Salesforce to enhance retail media network solutions
Marketing Automation
Yesterday
Product updates
Basware launches Insights platform with GenAI capabilities to enhance touchless invoicing
Business Expense Management
Yesterday
Product updates
HiBob launches learning and development module Bob Learning to enhance workforce capability
Remote Work Infrastructure
Yesterday
Funding
Fingercheck raises USD 115 million in growth funding to support expansion
Remote Work Infrastructure
Yesterday
Funding
Blacklane raises USD 65 million in Series G funding to expand operations
Travel Tech
Yesterday
Machine Learning Infrastructure

Aug 27, 2024

Cerebras launches Cerebras Inference for improved AI model deployment

Product updates

  • Cerebras Systems has introduced Cerebras Inference, a new service for running AI model inference. The service delivers 1,800 tokens per second for Llama 3.1 8B and 450 tokens per second for Llama 3.1 70B, with pricing starting at 10 cents per million tokens.

  • Cerebras Inference maintains 16-bit accuracy throughout the inference process, offering high performance without compromising accuracy. The service is available in three tiers: Free, Developer, and Enterprise. The Developer tier provides an API endpoint for flexible, serverless deployment, while the Enterprise tier offers fine-tuned models, custom SLAs, and dedicated support.

  • Cerebras claims its inference service is 20x faster than NVIDIA GPU-based solutions running in hyperscale clouds and delivers 100x better price-performance for AI workloads. The company says this speed enables developers to build next-generation AI applications that require complex, multi-step tasks performed in real time, such as AI agents.

  • Analyst QuickTake: Cerebras Inference's performance and lower costs pose a direct challenge to NVIDIA's inference solutions and could reshape how AI inference is deployed. With roughly 20x the speed at a fraction of the cost, it may appeal to developers and businesses seeking more efficient and affordable AI deployment options.
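The quoted figures make for a quick back-of-envelope comparison. The sketch below is not Cerebras code; it simply turns the article's numbers (1,800 and 450 tokens per second, 10 cents per million tokens at the entry price) into estimated latency and cost for a given response length. Model names and function names here are illustrative placeholders, not the service's actual API identifiers.

```python
# Figures quoted in the announcement; the dict keys are illustrative
# labels, not official Cerebras model identifiers.
THROUGHPUT_TOKENS_PER_SEC = {
    "llama-3.1-8b": 1800,
    "llama-3.1-70b": 450,
}
PRICE_USD_PER_MILLION_TOKENS = 0.10  # entry price cited in the article


def generation_time_sec(model: str, tokens: int) -> float:
    """Seconds to stream `tokens` output tokens at the quoted throughput."""
    return tokens / THROUGHPUT_TOKENS_PER_SEC[model]


def generation_cost_usd(tokens: int) -> float:
    """Cost of `tokens` tokens at the quoted 10-cents-per-million price."""
    return tokens * PRICE_USD_PER_MILLION_TOKENS / 1_000_000


# A 1,000-token response from the 8B model: ~0.56 s, USD 0.0001.
print(round(generation_time_sec("llama-3.1-8b", 1000), 2))
print(generation_cost_usd(1000))
```

At these rates a full 1,000-token answer streams in well under a second, which is the property Cerebras argues matters for multi-step agent workloads, where many sequential model calls compound any per-call latency.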
