Qualcomm has partnered with Ampere Computing, a chip design company backed by Oracle, to create an AI inference solution for LLMs.
Ampere and Qualcomm's joint solution consists of a Supermicro server fitted with Ampere CPUs and Qualcomm's Cloud AI 100 accelerator chips. The combined system aims to deliver an easily deployable platform for efficient inference on models of varying sizes, targeting the computational needs of businesses working with large-parameter models. Separately, Ampere plans to release a 256-core server CPU in 2025.