Building a Voice Assistant for Pandora
The Future is Spoken presents Ananya Sharan as this week’s guest. Ananya is a search and voice expert working as the product manager for Pandora’s Voice Mode, a mobile-only voice assistant that allows users to easily discover and listen to new music. In this episode, Ananya explores the rewards and challenges of working with voice assistant technology.
As the largest audio-streaming platform in the US, Pandora creates personalized recommendations for users, whether they are looking for podcasts or music. Ananya describes Pandora’s Voice Mode as an employee-driven initiative. “We really wanted to build something in-house and showcase our data science, our deep knowledge about the listener, and all of the intelligence we have in our recommendation ensembles to bring that natural conversational way of consuming music and podcasts and put it right on your phone. So we really were thinking with our customer in mind.”
Voice Mode, a mobile-only feature available on Android and iOS, differs from other voice assistants in its ability to handle ambiguous requests in hands-free scenarios. Unlike connected devices, which may require a specific song request, Voice Mode offers personalized recommendations based on a user’s mood or activity. Ananya uses cooking as an example. “When you say, ‘Hey Pandora, play me something for cooking,’ we know what you like to listen to … so we play something like that.”
Ananya emphasizes the importance of advocating for users’ needs. Beyond advocating internally for innovative solutions, the Voice Mode team faced the challenge of getting leaders on board with a milestone-driven schedule rather than a standard timeline-driven one. Because AI models must be trained in language and accent recognition, among other hurdles, “it's hard to predict with absolute certainty when the product is going to be ready or when it's going to achieve, let's say, 90% accuracy.” Ananya explains that inaccuracies can cause considerable frustration for real-world users, who may not return to a product after one bad experience.
Another challenge relates to accents and pronunciations. Many early AI models learned only American accents. However, Ananya believes that voice technologies have begun to catch up as global usage of these products grows. She has even seen Voice Mode’s accent recognition improve since it first launched.
“The only way it can work is by getting more people to use it and training the models using a variety of accents.”
Ananya’s advice for others in voice is to “focus on real users and real use cases, and assume that there's just multiple ways to ask for the same thing.” She also suggests looking into new contexts where users might benefit from voice technology.
Ananya shares more about her work with Pandora and voice technology in this insightful podcast episode!
Find Ananya on LinkedIn
Conversation Design from Google: https://designguidelines.withgoogle.com/conversation/conversation-design/learn-about-conversation.html#learn-about-conversation-the-cooperative-principle
Voice Summit Playlist from 2018: https://www.youtube.com/playlist?list=PLn51IO3rbkV1E1a6WjgvFtW3VaOCRxzov
Voice Industry news, reports: https://voicebot.ai