All Updates

Product updates: NVIDIA releases Llama-3.1-Minitron 4 billion, a compressed version of the Llama 3.1 8B model (Foundation Models, Aug 20, 2024)
This week:

  • Partnerships: T-Mobile partners with OpenAI to develop AI-powered customer service platform (Generative AI Applications, Yesterday)

  • Partnerships: Runway partners with Lionsgate to develop AI video tools using studio's movie catalog (Generative AI Applications, Sep 18, 2024)

  • Funding: QMill raises EUR 4 million in seed funding to provide quantum computing industrial applications (Quantum Computing, Sep 18, 2024)

  • Product updates: QuiX Quantum launches 'Bia' quantum cloud computing service (Quantum Computing, Sep 18, 2024)

  • Partnerships: Oxford Ionics and Infineon Technologies partner to build portable quantum computer for Cyberagentur (Quantum Computing, Sep 18, 2024)

  • Partnerships, Product updates: Tencent AI Lab launches EzAudio AI for text-to-audio generation with Johns Hopkins University (Foundation Models, Sep 18, 2024)

  • Funding: TON secures USD 30 million in investment from Bitget and Foresight Ventures (Web3 Ecosystem, Sep 18, 2024)

  • Funding: Hemi Labs raises USD 15 million in funding to launch blockchain network (Web3 Ecosystem, Sep 18, 2024)

  • Product updates: Fivetran launches Hybrid Deployment for data pipeline management (Machine Learning Infrastructure; Data Infrastructure & Analytics, Sep 18, 2024)
Foundation Models

Aug 20, 2024

NVIDIA releases Llama-3.1-Minitron 4 billion, a compressed version of the Llama 3.1 8B model

Product updates
  • NVIDIA has released Llama-3.1-Minitron 4 billion, a compressed version of the Llama 3.1 8B model designed to run on resource-constrained devices.

  • The model was developed by pruning and distilling the Llama 3.1 8 billion model down to 4 billion parameters. Two variants are available: one produced with depth-only pruning and one with width-only pruning.

  • The model is claimed to be capable of instruction following, roleplay, retrieval-augmented generation, and function calling, offering a balance between training cost and inference capability.

  • The company claims that Llama-3.1-Minitron 4 billion performs comparably to other small language models (SLMs) such as Phi-2 2.7 billion, Gemma 2 2.6 billion, and Qwen2 1.5 billion, despite being trained on a much smaller dataset.

  • Analyst QuickTake: There has been a growing trend of introducing smaller language models in the AI market. Last month, NVIDIA and Mistral AI released the Mistral NeMo 12B, a 12-billion-parameter multilingual language model. NVIDIA joins the likes of Microsoft, which introduced the Phi-3 Vision model (in May) and Phi-3 Mini (in April), OpenAI, which launched GPT-4o mini (in July), and Apple, which introduced OpenELM (in April).
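The two pruning strategies mentioned above can be sketched in miniature. This is a toy illustration, not NVIDIA's actual Minitron pipeline: the list-of-layers model, the importance scores, and the L1-norm column heuristic are all assumptions made for demonstration. Depth-only pruning drops whole transformer layers; width-only pruning drops hidden units within a layer's weight matrices.

```python
# Toy stand-ins for the two structured-pruning strategies described above.
# Hypothetical representations: a "model" is a list of layer names, and a
# layer's weights are a plain list-of-lists matrix.

def depth_prune(layers, importance, keep):
    """Depth-only pruning: keep the `keep` layers with the highest
    importance scores, preserving their original order in the stack."""
    ranked = sorted(range(len(layers)), key=lambda i: importance[i], reverse=True)
    kept = sorted(ranked[:keep])          # restore original layer order
    return [layers[i] for i in kept]

def width_prune(weight, keep_cols):
    """Width-only pruning: keep the `keep_cols` columns (hidden units)
    of a weight matrix with the largest L1 norm."""
    norms = [sum(abs(row[c]) for row in weight) for c in range(len(weight[0]))]
    ranked = sorted(range(len(norms)), key=lambda c: norms[c], reverse=True)
    kept = sorted(ranked[:keep_cols])
    return [[row[c] for c in kept] for row in weight]

# Depth-only: keep the 2 most important of 4 layers.
layers = ["block0", "block1", "block2", "block3"]
importance = [0.9, 0.1, 0.8, 0.2]
print(depth_prune(layers, importance, keep=2))   # ['block0', 'block2']

# Width-only: keep the 2 highest-norm hidden columns.
weight = [[1.0, 0.1, 2.0],
          [3.0, 0.2, 4.0]]
print(width_prune(weight, keep_cols=2))          # [[1.0, 2.0], [3.0, 4.0]]
```

In the approach the article describes, pruning alone is not the end of the story: the pruned 4-billion-parameter model is then distilled against the original 8-billion-parameter teacher to recover accuracy lost during pruning.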
