Generative AI Infrastructure

May 26, 2024

EleutherAI launches evaluation library for LLMs

Product updates

  • EleutherAI, in collaboration with Stability AI and other partners, has launched "Language Model Evaluation Harness" (lm-eval), an open-source library designed to enhance the evaluation of LLMs.

  • lm-eval offers a modular implementation of evaluation tasks and supports request types such as conditional log-likelihoods, perplexities, and text generation. It supports both qualitative and quantitative analysis, so researchers can examine model outputs in depth.

  • EleutherAI claims that lm-eval addresses the reproducibility and transparency limitations of existing evaluation methods by providing a consistent framework for fair and precise comparisons across models and techniques, which it says leads to more reliable research outcomes.
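
To make the three request types mentioned above more concrete, here is a minimal sketch in Python. It does not use lm-eval or reproduce its API. The bigram table `BIGRAM` and the helpers `conditional_loglikelihood`, `perplexity`, and `greedy_generate` are hypothetical names invented for illustration, and a real evaluation would query an actual language model instead of a lookup table.

```python
import math

# Hypothetical toy bigram "model": P(next token | previous token).
# This stands in for a real LLM and is NOT part of lm-eval.
BIGRAM = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.8, "ran": 0.2},
    "dog": {"sat": 0.4, "ran": 0.6},
}


def conditional_loglikelihood(context, continuation):
    """Sum of log P(token | previous token) over the continuation only."""
    prev = context[-1]
    total = 0.0
    for tok in continuation:
        total += math.log(BIGRAM[prev][tok])
        prev = tok
    return total


def perplexity(tokens):
    """exp of the negative mean log-probability of tokens[1:]."""
    ll = conditional_loglikelihood(tokens[:1], tokens[1:])
    return math.exp(-ll / (len(tokens) - 1))


def greedy_generate(context, max_tokens):
    """Text generation: repeatedly append the most probable next token."""
    prev = context[-1]
    out = []
    for _ in range(max_tokens):
        if prev not in BIGRAM:  # no known continuation, stop early
            break
        prev = max(BIGRAM[prev], key=BIGRAM[prev].get)
        out.append(prev)
    return out


if __name__ == "__main__":
    print(conditional_loglikelihood(["the"], ["cat", "sat"]))
    print(perplexity(["the", "cat", "sat"]))
    print(greedy_generate(["the"], 2))
```

In this sketch, a multiple-choice task could be scored by computing `conditional_loglikelihood` for each candidate answer and picking the highest, while `perplexity` summarizes how well the model predicts a whole sequence.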
