Art Of Crafting Natural Conversations
The Future is Spoken presents Jeff Adams as this week's guest. Jeff has been leading prominent speech & language technology research for more than 20 years. Until 2009, he worked at Nuance / Dragon, where he was responsible for developing and improving speech recognition for Nuance's "Dragon" dictation software. He presided over many of the critical improvements in the 1990s and 2000s that brought this technology into the mainstream and enabled widespread consumer adoption.
After leaving Nuance, Jeff joined Yap, a company specializing in voicemail-to-text transcription. He assembled a strong team of 12 speech scientists who, within two brief years, were able to beat all competitors on an unbiased test set. They also matched the performance of a competitor who used (off-shore) human transcription.
Yap's success caught the interest of Amazon, who wanted to jump-start their new speech & language research lab. Upon acquisition, Jeff led efforts to build one of the industry-leading speech & language groups. His Amazon team developed products such as the Echo, Dash, and Fire TV. Jeff left Amazon in 2014 to found Cobalt Speech and Language.
Starting with their own experiences, Jeff and host Shyamala Prayaga go on to discuss how to craft natural conversations for voice bots.
Tune in now!
[00:28] The journey to Voice...
- Jeff has been working on speech technology for almost 26 years now. He started with a small speech company in Boston.
- He ended up working at Amazon on Alexa before it was launched.
- Cobalt works with companies that are looking for speech-related technologies. It licenses its technology and also customizes it.
[03:58] Can anyone design natural conversations?
- Jeff explains that designing natural conversations is an art. One way to approach it is to treat the system as if it were a human.
- He also talks about designing a system that can cater to everyone's needs.
- Giving the user what they are looking for is what matters! Jeff touches on building a system that responds appropriately to all the different ways people ask for something.
[15:10] Creating a bot despite the lack of resources?
- Jeff shares different ways of creating a natural voice application despite a lack of resources.
- A slow launch with a lot of beta testing is the key.
[18:41] NLP vs. NLU
- Jeff explains the difference between Natural Language Processing and Natural Language Understanding.
- NLP is a broad umbrella term for any computerized processing of human language, while NLU is the subset of NLP concerned with understanding meaning.
- What are the uses of NLU?
- Spoken Language Understanding is Automatic Speech Recognition (ASR) + NLU.
- Jeff also touches on ensuring that a system understands what the user says.
[34:04] The secret sauce for making ASR systems work better!
- Jeff elaborates on different approaches to making ASR systems work better.
- How can we design a speech recognition system that understands users in their natural environment?
[41:41] What are the best practices for designing a speech system?
[44:15] Must Listen
- Jeff's advice for anyone trying to get into speech recognition.
Learn more about Jeff at
If you enjoyed this episode of The Future is Spoken Podcast, then make sure to subscribe to our podcast.
Follow Shyamala Prayaga at @sprayaga