“Turn down the temperature.”
A human can use context clues and follow-up questions to grasp the exact meaning of such a request—like whether you're too warm or just trying to save on the electric bill (hi, mom). But for voice assistants, asking for clarification is easier said than done.
- Researchers have been working on solving this for over a decade, says Salim Roukos, global leader for language research at IBM Research—but over the past five years, there’s been a “dramatic acceleration” in progress.
Field testing: Two weeks ago, Amazon announced that in the coming months, Alexa will begin to ask users for clarification on certain requests. It’ll start small, limiting follow-up questions to smart home-related requests. Unsurprisingly, Amazon’s got bigger plans: It wants to apply this to all requests eventually.
Context is king
Machine learning systems are typically trained on freeform text that's been tagged with labels: By analyzing your words, they attempt to categorize what you want. But a system has no idea how to classify a request it doesn’t understand—e.g., “Play my favorite music.”
- “You’re entering a whole new world of unstructured conversation, nuance, and all that kind of stuff—and machines are just terrible at that,” Dr. Vasant Dhar, a professor and AI researcher at NYU, told us.
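To make the classification problem concrete, here's a toy sketch: score an utterance against a couple of intents and refuse to commit when no label wins convincingly. The intents, keywords, and confidence threshold are all invented for illustration; this is not Alexa's actual pipeline.

```python
# Toy intent classifier with a confidence gate (illustrative assumptions only).
from math import exp

INTENT_KEYWORDS = {
    "adjust_thermostat": {"temperature", "thermostat", "degrees", "warm", "cool"},
    "play_music": {"play", "music", "song", "volume"},
}

def classify(utterance: str):
    """Score each intent by keyword overlap, then softmax into probabilities."""
    words = set(utterance.lower().split())
    scores = {intent: len(words & kw) for intent, kw in INTENT_KEYWORDS.items()}
    total = sum(exp(s) for s in scores.values())
    probs = {intent: exp(s) / total for intent, s in scores.items()}
    best = max(probs, key=probs.get)
    return best, probs[best]

def handle(utterance: str, threshold: float = 0.6):
    intent, confidence = classify(utterance)
    if confidence < threshold:
        return "clarify"  # no label is convincing, so the system can't act
    return intent

print(handle("turn down the temperature"))  # -> adjust_thermostat
print(handle("turn it down"))               # -> clarify
```

The second utterance matches no keywords, so every intent scores the same and the system has no basis to pick one, which is exactly the "terrible at nuance" failure mode Dhar describes.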
The (in-progress) solution: Systems need to gather more info in order to give you what you want.
- Using advanced language algorithms, voice assistants essentially must gauge: Is this a request I could fulfill with additional information? If yes, which part of the request should I seek clarification on? And what’s the best way to ask?
- “That’s the intelligence that they’re now building into Alexa…and that’s not an easy problem,” says Dhar.
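The three questions in the bullet above map naturally onto what dialogue systems call slot filling: check whether the request is fulfillable at all, find which required piece of information is missing, and pick a question for it. The intent schemas and question phrasings below are hypothetical, a sketch of the pattern rather than Amazon's implementation.

```python
# Hedged sketch of the clarification decision as slot filling.
# Schemas and question templates are invented for illustration.

REQUIRED_SLOTS = {
    "adjust_thermostat": ["direction", "amount"],
    "play_music": ["item"],
}

QUESTION_TEMPLATES = {
    "direction": "Should I raise or lower it?",
    "amount": "By how many degrees?",
    "item": "What would you like to hear?",
}

def next_action(intent, filled_slots):
    """Decide: fulfill now, ask one clarifying question, or give up."""
    if intent not in REQUIRED_SLOTS:
        return ("reject", None)  # not a request this system can fulfill
    missing = [s for s in REQUIRED_SLOTS[intent] if s not in filled_slots]
    if not missing:
        return ("fulfill", None)  # everything needed is already present
    # Ask about the first missing slot; a real system would rank candidates.
    return ("clarify", QUESTION_TEMPLATES[missing[0]])

print(next_action("adjust_thermostat", {"direction": "down"}))
# -> ('clarify', 'By how many degrees?')
```

The hard part Dhar points to lives in the ranking step glossed over here: deciding which gap matters most and how to phrase the question without annoying the user.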
C student
The degree of success here hinges on ML systems learning from humans. That’s tough for a laundry list of reasons: diverse needs, varying context, and human impatience. We don’t like to be asked too many questions—whether by an algorithm, a toddler, or some guy at a networking event.
But at its core, the trouble boils down to two famously star-crossed lovers: AI and common sense. In the past, projects built on humans teaching common sense to machines have been a “colossal flop,” says Dhar.
- For this application, the key will be starting with simple use cases and building from there, says Roukos.
Roukos believes Alexa’s primary focus will be word-sense disambiguation—a subfield of computational linguistics that attempts to pin down a word’s meaning based on context clues.
- The system is a feedback loop. As it learns more about context, it should theoretically get better at asking relevant follow-ups. And better follow-ups yield a better understanding of a user's personal context.
Big picture: Amazon’s aiming to do two things at once: 1) crowdsource to solve the AI-and-common-sense problem, says Dhar, and 2) personalize smart assistants so they’re more in tune with user preferences. Its competitors will likely roll out similar features if Amazon finds success.