
Amazon is still struggling to bring its Alexa assistant into the generative AI era. The Financial Times published a lengthy report today detailing the numerous challenges the company faces in transforming Alexa from a simple AI assistant into a much more capable “agent” that can perform complex tasks.
As you may remember, Amazon first teased new AI features for Alexa during its devices and services event in September 2023. At the time, the company demoed a more conversational version of Alexa that could engage in deeper conversations, create stories or recipes on the fly, and more. The company said this new version of Alexa with generative AI features would launch first in preview in the US, but that ultimately didn’t happen.
In an interview with the Financial Times, Rohit Prasad, senior vice president and head scientist for Artificial General Intelligence at Amazon, explained why Alexa’s generative AI overhaul is more complex than expected. His team is currently focused on improving Alexa’s response speed and reducing the occurrence of “hallucinations,” which happen when generative AI assistants make things up. We’ve already seen this with Apple Intelligence and Google’s AI Overviews, and these hallucinations are pretty much guaranteed to reduce trust in AI assistants.
“Hallucinations have to be close to zero,” Prasad told the Financial Times. “It’s still an open problem in the industry, but we are working extremely hard on it.”
To bring new generative AI capabilities to Alexa, Amazon isn’t completely replacing the assistant’s simpler, more limited algorithms with large language models (LLMs). The new version of Alexa actually uses a combination of LLMs, including Amazon’s in-house Nova models and Anthropic’s Claude models, to enable new generative AI capabilities. However, these LLMs still need to interface with Alexa’s existing infrastructure in some way, and this is one of the things causing friction.
“The new Alexa uses a bouquet of different AI models to recognise and translate voice queries and generate responses, as well as to identify policy violations, such as picking up inappropriate responses and hallucinations,” the Financial Times reported. “Building software to translate between the legacy systems and the new AI models has been a major obstacle in the Alexa-LLM integration.”
Amazon will also need to ensure that developers can easily create “skills” for the new version of Alexa. However, according to a couple of developers who spoke with the Financial Times, Amazon has yet to share important technical details with them after initially pressuring them to get ready for the next generation of Alexa.
Lastly, Amazon still has to figure out how to make Alexa a sustainable business in the generative AI era. Running complex LLMs at scale is very costly, and Amazon is reportedly exploring the possibility of offering the new Alexa as a subscription service. A former Alexa employee also told the Financial Times that the company is considering taking a cut of sales of goods and services to support its development.