Date: 2025-04-08
Amazon Says Its New Voice AI Model Will Understand a User’s Tone
The company says the voice AI technology, called Amazon Nova Sonic, is cheaper than OpenAI’s version.
BY BEN SHERRY, STAFF REPORTER @BENLUCASSHERRY
Amazon has announced a new AI model designed to power voice-to-voice interactions in natural, conversational language. The company says that the new model, named Amazon Nova Sonic, uses the same technology present in its next-gen reboot of Alexa. The model is expected to be used by software developers to create advanced AI agents that can interact with humans and take actions on their behalf.
Amazon said in a blog post that Nova Sonic combines speech understanding and speech generation into a single model, making it particularly useful for building AI-powered vocal assistants, especially in customer support.
Many other voice AI assistants work by first transcribing a user’s words into text, processing the text, generating a text response, and then converting that text back into audio. Each of those steps adds delay, which can slow down conversations. But by using speech-to-speech models like Amazon’s Nova Sonic, developers can cut out most of the steps, greatly shortening the time between a user’s request and the model’s response.
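For readers who want to see why fewer steps can mean a faster reply, here is a minimal sketch in Python that compares the two designs by adding up per-stage delays. The stage names follow the description above, but the `Stage` class, the `total_latency` helper, and every millisecond figure are hypothetical placeholders chosen for illustration, not measurements of Nova Sonic or any other product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step between the user's speech and the assistant's spoken reply."""
    name: str
    latency_ms: float  # hypothetical placeholder, not a measured value


def total_latency(stages: list[Stage]) -> float:
    """Stages run one after another, so their delays add up."""
    return sum(stage.latency_ms for stage in stages)


# Cascaded design: transcribe, process and respond in text, then synthesize audio.
CASCADED = [
    Stage("speech-to-text transcription", 300.0),
    Stage("text processing and response generation", 500.0),
    Stage("text-to-speech synthesis", 300.0),
]

# Unified design: a single model takes audio in and produces audio out.
SPEECH_TO_SPEECH = [
    Stage("speech-to-speech model", 600.0),
]

if __name__ == "__main__":
    for label, stages in (("cascaded", CASCADED), ("speech-to-speech", SPEECH_TO_SPEECH)):
        print(f"{label}: {len(stages)} stage(s), {total_latency(stages):.0f} ms total")
```

With these made-up numbers the cascaded design pays for three sequential handoffs while the unified design pays for one. Real figures would depend on the models, the hardware, and the network.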
Not only is this method faster, but it also enables the AI models to pick up on details that would otherwise be lost, like a user’s tone of voice or speaking cadence. These kinds of details wouldn’t survive transcription, but the voice-to-voice system can understand the nuances in speech and adapt accordingly. To show how this works, Amazon created a demo AI-powered travel agent and asked it to buy plane tickets. In a recording, a user expressed apprehension about the price of a ticket, and in response the AI agent’s tone became reassuring. It told the user not to worry while it searched for low-cost options.
In another example, a user asked an AI agent built with Nova Sonic to summarize trends from a financial dashboard. Because the agent had been connected to the user’s data sources, it was able to complete the summary, update the status of a report, and draft an email containing the newly surfaced data.
The Nova Sonic model is very similar to OpenAI’s realtime models, which power ChatGPT’s advanced voice mode. That feature lets users chat with AI voices, but Amazon says its model is “nearly 80% less expensive” than OpenAI’s main realtime model. That claim can’t be confirmed, though, because specific pricing for the model wasn’t yet available.
Amazon is currently rolling out Alexa+, a major update to the popular voice-based assistant. The update, which gives Alexa improved speaking capabilities through the Nova Sonic model, is free for Amazon Prime subscribers and will cost $19.99 per month for non-subscribers. It was first announced on February 26.
