OpenAI has launched GPT-4o ("o" for "omni"), a model with enhanced capabilities across text, audio, and image inputs and outputs, designed to enable more natural human-computer interaction.
The company claims the model matches GPT-4 Turbo's performance on English text and code, with improved performance in non-English languages. It is also claimed to be 2x faster and 50% cheaper than GPT-4 Turbo.
Additionally, the model is claimed to respond to audio inputs in an average of 320 milliseconds, comparable to human response times in conversation.
The model has launched with text and image capabilities. Audio outputs will initially be limited to preset voices to adhere to existing safety policies, with improved features planned for release in the coming weeks.
GPT-4o is currently available in the API as a text and vision model for developers.
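As a rough illustration of what that API access looks like, the sketch below sends GPT-4o a text prompt and then a combined text-and-image prompt through OpenAI's Python SDK. The image URL is a placeholder, and exact SDK details may differ from this example.

```python
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# Text-only request to GPT-4o
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Summarize GPT-4o's capabilities in one sentence."}
    ],
)
print(response.choices[0].message.content)

# Vision request: a text question paired with an image URL (placeholder URL)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```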