All Updates

[Product updates] Google Cloud Run adds NVIDIA GPU support for serverless AI inference (Generative AI Infrastructure, Aug 21, 2024)

This week:

[M&A] N-able acquires Adlumin for USD 266 million to strengthen cybersecurity offerings (Next-gen Cybersecurity, Today)
[M&A] Bitsight acquires Cybersixgill for USD 115 million to enhance threat intelligence capabilities (Cyber Insurance, Today)
[M&A] Snowflake acquires Datavolo for an undisclosed sum to enhance data integration capabilities (Generative AI Infrastructure; Data Infrastructure & Analytics, Today)
[Product updates] Microsoft launches Copilot Actions for workplace automation (Foundation Models, Yesterday)
[M&A] Almanac acquires Gro Intelligence's IP assets for an undisclosed sum (Smart Farming, Yesterday)
[Partnerships] Aduro Clean Technologies partners with Zeton to build hydrochemolytic pilot plant (Waste Recovery & Management Tech, Yesterday)
[Funding] Oishii raises USD 16 million in Series B funding from Resilience Reserve (Vertical Farming, Yesterday)
[Management news] GrowUp Farms appoints Mike Hedges as CEO (Vertical Farming, Yesterday)
[M&A] Rise Up acquires Yunoo and expands LMS monetization capabilities (EdTech: Corporate Learning, Yesterday)
Generative AI Infrastructure

Aug 21, 2024

Google Cloud Run adds NVIDIA GPU support for serverless AI inference

Product updates

  • Google Cloud announced a preview of NVIDIA L4 GPU support for its Cloud Run serverless platform. The feature lets organizations run serverless AI inference while paying only for the GPU resources they actually use.

  • The GPU-enabled Cloud Run instances support various AI frameworks and models, including NVIDIA NIM, vLLM, PyTorch, and Ollama. Each instance can attach one NVIDIA L4 GPU, providing up to 24 GB of VRAM, and Google recommends models of up to 13 billion parameters for optimal performance.

  • Google Cloud claims the integration enables real-time inference with lightweight open models, serving of custom fine-tuned GenAI models, and acceleration of compute-intensive services. The company reports cold start times of 11 to 35 seconds across various models, which it cites as evidence of the platform's responsiveness.
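
For readers who want a feel for how the preview fits together, below is a minimal sketch of calling a GPU-backed Cloud Run service for inference. It is not Google's documented workflow: the service name, model tag, and endpoint URL are hypothetical placeholders, the deploy command in the comments reflects the preview-era gcloud flags and may change, and the endpoint assumed is the standard Ollama API rather than anything specific to Cloud Run.

    # Sketch: querying a hypothetical Ollama service deployed on Cloud Run with
    # an attached L4 GPU. Assumed deploy command (preview-era flags, may change):
    #   gcloud beta run deploy ollama-gemma --image <your-image> \
    #       --gpu 1 --gpu-type nvidia-l4 --region us-central1
    import subprocess
    import requests

    SERVICE_URL = "https://ollama-gemma-<hash>-uc.a.run.app"  # placeholder URL

    # Cloud Run services require an identity token unless deployed with
    # --allow-unauthenticated; gcloud can mint one for local testing.
    token = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

    # Standard Ollama /api/generate request; model tag is an assumption and
    # should fit the recommended <= 13B-parameter range for a 24 GB L4.
    resp = requests.post(
        f"{SERVICE_URL}/api/generate",
        headers={"Authorization": f"Bearer {token}"},
        json={"model": "gemma2:9b", "prompt": "Why is the sky blue?", "stream": False},
        timeout=120,  # allows for the reported 11-35 s cold start plus generation
    )
    resp.raise_for_status()
    print(resp.json()["response"])

Because billing is per-use, a service like this scales to zero between requests; the generous timeout above accounts for the cold start Google reports when an idle instance spins back up.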
