All Updates

Product updates: Google Cloud Run adds NVIDIA GPU support for serverless AI inference (Generative AI Infrastructure, Aug 21, 2024)
This week:

  • Funding: Slingshot Aerospace secures USD 30 million in growth capital from Trinity Capital (Digital Twin, Sep 12, 2024)

  • Product updates: Akselos introduces software for FPSO monitoring (Digital Twin, Sep 12, 2024)

  • Product updates: Vertical Aerospace completes first phase of VX4 flight testing (Passenger eVTOL Aircraft, Sep 12, 2024)

  • Partnerships: Turbine and Ono Pharmaceutical reach developmental milestone in oncology research collaboration (AI Drug Discovery, Sep 12, 2024)

  • Partnerships: Sino Biological and BioGeometry partner to optimize protein design using GenAI technology (AI Drug Discovery, Sep 12, 2024)

  • Funding: Sinopia Biosciences raises USD 2.2 million in Phase II SBIR grant funding to accelerate oral mucositis program (AI Drug Discovery, Sep 12, 2024)

  • Regulation/policy: eToro settles for USD 1.5 million with SEC over unregistered trading of certain crypto assets (Retail Trading Infrastructure, Sep 12, 2024)

  • Funding: Amprion raises USD 100,000 in research grant funding from ALS Network to support ALS research and therapy development (Longevity Tech, Sep 12, 2024)

  • Funding: Amprion raises USD 100,000 in research grant funding from ALS Network to support ALS research and therapy development (Precision Medicine, Sep 12, 2024)

  • Funding: Epitopea raises GBP 500,000 in grant funding to develop vaccines and TCR-based therapies (Precision Medicine, Sep 12, 2024)
Generative AI Infrastructure
Aug 21, 2024

Google Cloud Run adds NVIDIA GPU support for serverless AI inference

Product updates

  • Google Cloud announced a preview of NVIDIA L4 GPU support for its Cloud Run serverless platform. The feature allows organizations to run serverless AI inference, with users only paying for the GPU resources they use.

  • The GPU-enabled Cloud Run instances can support various AI frameworks and models, including NVIDIA NIM, vLLM, PyTorch, and Ollama. Each instance can attach one NVIDIA L4 GPU, providing up to 24 GB of VRAM, and Google recommends models of up to 13 billion parameters for optimal performance.

  • Google Cloud says the integration enables real-time inference with lightweight open models, serving of custom fine-tuned GenAI models, and acceleration of compute-intensive services. The company reports cold-start times of roughly 11 to 35 seconds across the models it tested, which it cites as evidence of the platform's responsiveness.
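Based on the announcement, attaching an L4 GPU to a Cloud Run service is done at deploy time via the beta gcloud CLI. The sketch below illustrates the general shape of such a deployment; the service name, project, image path, and resource sizes are illustrative placeholders, and the preview-era flag names and minimums may change as the feature matures:

```shell
# Sketch: deploy a Cloud Run service with one NVIDIA L4 GPU (preview feature).
# "my-inference-svc" and the image path are hypothetical placeholders.
gcloud beta run deploy my-inference-svc \
  --image us-docker.pkg.dev/my-project/my-repo/inference:latest \
  --region us-central1 \
  --gpu 1 \
  --gpu-type nvidia-l4 \
  --cpu 8 \
  --memory 32Gi \
  --no-cpu-throttling \
  --max-instances 1
```

Because billing is per-use, scale-to-zero still applies: the service incurs GPU charges only while instances are serving requests, at the cost of the cold-start latency noted above.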
