Groq, an AI solutions company, achieved over 300 tokens per second per user on Meta AI's Llama-2 70B LLM using its Language Processing Unit™ (LPU) system.
Groq's LPU system addresses the sequential, compute-intensive nature of LLM inference, delivering the ultra-low latency needed for a natural conversational rhythm in AI interfaces. Traditional solutions such as GPUs fall short on the latency and scale demands of these workloads.
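To see why 300 tokens per second per user matters for conversational latency, here is a back-of-envelope calculation. The throughput figure is from the article; the 250-token reply length is an illustrative assumption, not a Groq-published number.

```python
# Illustrative latency math from the reported throughput figure.
TOKENS_PER_SECOND = 300  # per-user throughput reported for Llama-2 70B on the LPU

# Average time between streamed tokens, in milliseconds.
per_token_ms = 1000 / TOKENS_PER_SECOND
print(f"per-token latency: {per_token_ms:.2f} ms")  # ~3.33 ms

# Assumed length of a typical chat reply (hypothetical value for illustration).
response_tokens = 250
reply_seconds = response_tokens / TOKENS_PER_SECOND
print(f"time to stream a {response_tokens}-token reply: {reply_seconds:.2f} s")  # ~0.83 s
```

At roughly 3 ms per token, a full paragraph-length reply streams in under a second, which is well inside the pause a human tolerates in conversation.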
Groq is an AI solutions company and the inventor of the Language Processing Unit accelerator, which is purpose-built and software-driven to power large language models (LLMs) for the exploding AI market.