Allen Institute for AI (AI2) has launched OLMo 2, a new open-source language model available in 7 billion- and 13 billion-parameter versions.
The company claims that OLMo 2 features architectural improvements, including RMSNorm, QK-Norm, rotary positional embeddings, and Z-loss regularization. The model is pre-trained in two stages: the first uses OLMo-Mix-1124, a 3.9 trillion-token corpus, and the second uses Dolmino-Mix-1124, which contains 843 billion tokens of high-quality web and domain-specific data.
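To make two of the listed components concrete, the snippet below is a minimal PyTorch sketch of RMSNorm and QK-Norm as they are commonly implemented in open transformer codebases; the tensor shapes and epsilon value are illustrative assumptions, not AI2's exact implementation.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square norm: rescales activations by their RMS, with no mean subtraction."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize along the last dimension, then apply a learned per-feature gain.
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight

# QK-Norm: normalize the query and key projections before computing attention scores,
# which helps keep attention logits in a stable range during training.
head_dim = 64                              # illustrative head size
q_norm, k_norm = RMSNorm(head_dim), RMSNorm(head_dim)
q = torch.randn(2, 8, 16, head_dim)        # (batch, heads, seq_len, head_dim)
k = torch.randn(2, 8, 16, head_dim)
q, k = q_norm(q), k_norm(k)                # normalized per head before attention
```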
According to AI2, the OLMo 2 7 billion and 13 billion models are currently the best-performing fully open models, outperforming Llama 3.1 8 billion and Qwen 2.5 7 billion despite using fewer total training FLOPs. The models show significant improvement across all tasks compared with the earlier OLMo 0424 model.
Analyst QuickTake: This follows the company's launch of the OLMoE model this September. OLMo 2 appears to be an evolution of that work and a significant step forward in AI2's development of fully open language models.