Voice Search | The Alter Egozi

Amazon is promoting voice shopping as part of its deals for Prime Day next week. Shoppers will get a $10 credit just for making their first voice purchase from a list of “Alexa Deals”, items that are already heavily discounted. That’s a major incentive just to push consumers into something that should be a great benefit in itself – effortless, simple, zero-click shopping. Why does Amazon have to go to so much trouble to get shoppers to use something that’s supposedly so helpful?

To understand the answer, it’s worthwhile to first understand how valuable voice shopping is for Amazon. In all demos and videos for the various Alexa devices, voice shopping is positioned as the perfect tool for spontaneous, instant purchases, such as “Alexa, I need toilet paper / diapers / milk / dog food / …” That easily explains why you need to be an Amazon Prime subscriber in order to use voice shopping: getting Prime into every household is a cornerstone of Amazon’s business strategy.

In addition, Alexa orders are fulfilled by 1-click payment, yet another highly valuable Amazon tool. Amazon also guarantees free returns for Alexa purchases, just in case you’re concerned about getting your order wrong. Now, combine all of these together and you can see how voice shopping is built to create a habit of shopping as a frictionless, casual activity. That is probably also why the current offer does not apply to voice shopping from within Amazon’s app, as the long process of launching it and reaching its voice search ruins the spontaneity.

And yet – shoppers are not convinced. On last year’s Prime Day, a similar promotion by Amazon drove on average one voice order per second. This may sound like a lot, but ~85K orders are still a tiny fraction of the total ~50M orders consumers placed on Amazon that day. This year Amazon raised the incentive even further, which indicates there is still much convincing to do. Why is that?
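For the curious, here is a quick back-of-the-envelope check of those numbers. It is only a sketch: the function name is made up for illustration, and it assumes the promotion ran for a full 24 hours and takes the ~50M total at face value.

```python
def voice_order_share(orders_per_second, total_orders, hours=24):
    """Return (voice orders, share of all orders) for a sale event.

    Assumes a constant average order rate over the whole event;
    the 24-hour default is an assumption, not a reported figure.
    """
    voice_orders = orders_per_second * hours * 60 * 60
    return voice_orders, voice_orders / total_orders


voice_orders, share = voice_order_share(1, 50_000_000)
print(f"{voice_orders:,} voice orders, {share:.2%} of all orders")
# prints "86,400 voice orders, 0.17% of all orders"
```

One order per second over a full day comes to 86,400 orders, in line with the ~85K figure, and well under a fifth of a percent of the day’s total.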

For starters, Amazon’s Alexa devices were never built to be shopping-only. Usage surveys consistently show that most users prefer to use the Alexa assistant to ask questions, play music, and even to set timers, much more than to shop. This does not mean that Amazon has done a bad job, quite the contrary. Voice shopping may not be that much of a habit initially, and getting used to voice-controlling other useful skills helps build habit and trust. The problem is, when you focus on non-shopping, you also get judged by it. That’s how Amazon gets headlines such as “Google Assistant is light-years ahead of Amazon’s Alexa”, with popular benchmarks measuring it by search, question answering and conversational AI, fields where Google has historically invested orders of magnitude more than Amazon. The upcoming HomePod by Apple is expected to complicate Amazon’s position even further, with Apple moving to claim the slot of a sophisticated, music-focused, high-end smart home device.

The “How it works” page for the Prime Day Alexa deals hints at other issues customers have with shopping in particular. Explanations aim to reassure that no unintended purchases take place (triggered by your kids, or even your TV), and that if your imperfect voice interaction got you the wrong product, returns are free for all Alexa purchases. These may sound like solved issues, but keep in mind the negative (and often unjustified) coverage around unintended purchases has sent countless Echo owners to set a passcode on ordering, which is actually a major setback for the frictionless zero-click purchasing Amazon is after.

But most importantly, voice-only search interfaces have not yet advanced to support interactions more complex than simple, context-less pattern recognition. It’s no accident that the most common purchase flows Alexa supports are around re-ordering, where the item is already known and no search actually takes place. This means that using Alexa for shopping may work well only for simple pantry shopping, and only if you have already made such purchases in the past. Google, on the other hand, is better positioned than Amazon in this respect, having more sophisticated conversational infrastructure. It even enables external developers to build powerful and context-aware Google Assistant apps using tools such as api.ai (for a quick comparison of these developer platforms, see here).

So what might Amazon be doing to make voice shopping more successful?

Re-ordering items is the perfect beginner use-case, being the equivalent of “known item” searches. Amazon may work on expanding the scope of such cases, identifying additional recurring purchase types that can be optimized. These play well with other recent moves by Amazon, such as around grocery shopping and fulfillment.

Shopping lists are a relatively popular Alexa feature (as well as on Google Home), but based on owner testimonials it seems that most users use these for offline shopping. Amazon is likely working to identify more opportunities for driving online purchases from these lists.

The voice interface has focused mainly on a single result, yielding an “I’m Feeling Lucky” interaction. Using data from non-voice interactions, Amazon could build a more interactive script, one that could guide users through more complex decisions. An interesting case study for this has been eBay with its “ShopBot” chatbot, though transitioning to voice-only control still remains a UX challenge.

And finally – it’s worth noting that in the absence of an item in the purchase history (or if the user declines it), Alexa recommends products from what Amazon calls “Amazon’s Choice”, which are “highly rated, well-priced products” as quoted from this help page. This feature is in fact a powerful business tool, pushing vendors to compete for this lucrative slot. In the more distant future, users may trust Alexa to the point of just taking its word for it and assuming this is the best product for them. That will place a huge lever in Amazon’s hands in its relationship with brands and vendors, and it’s very likely that other retailers as well as brands will fight for similar control, raising the stakes even more on voice search interfaces.

If you visit the Google homepage on your desktop, you’ll see a rare, prehistoric specimen – one that most Google users don’t see the point of: the “I’m Feeling Lucky” button.

Google has already removed it from most of its interfaces, and even here it only serves as a teaser for various Google nitwit projects. And yet the way things are going, the “Feeling Lucky” ghost may just come back to life – and with a vengeance.

In the early years, the “I’m Feeling Lucky” button was Google’s way of boldly stating “Our results are so great, you can just skip the result lists and head straight to destination #1”. It was a nice, humorous touch, but one that never really caught on as users’ needs grew more complex and less obvious. In fact, it lost Google quite a lot of money, since skipping the result list also meant users saw fewer and fewer sponsored results – Google’s main income source. But usability testing showed that users really liked seeing the button, so Google kept it there for a while.

But there’s another interface rising up that prides itself on returning the first search result without showing you the list. Did you already guess what it is?

Almost every demo of a new personal assistant product will include questions being answered by the bot tapping into a search engine. The demos will make sure to use simple single-answer cases, like “Who is the governor of California?” That’s extremely neat, and was regarded as science fiction not so many decades ago. Amazing work on query parsing and entity extraction from search results has led to great results on this type of query, and the quality of the query understanding, and resulting answers, is usually outstanding.

However, these are just some of the possible searches we want our bots to run. As we get more and more comfortable with this new interface, we will not want to limit ourselves to one type of query. If you want to be able to get an answer for “Give me a good recipe for sweet potato pie” or “Which Chinese restaurants are open in the area now?”, you need a lot more than a single answer. You need verbosity, you need to be able to refine – which stretches the limits of how we perceive conversational interfaces today.

Part of the problem is that it’s difficult for users to understand the limits of conversational interfaces, especially when bot creators pretend that there are no such limits. Another problem lies in the fact that a natural language interface may simply be a lousy choice for some interaction types, and imposing it on them will only frustrate users.

There is a whole new paradigm of user interaction waiting to be invented, to support non-trivial search and refine through conversation – for all of those many cases where a short exchange and single result will probably not do. We will need to find a way to flip between vocal and visual, manage a seamless thread between devices and screen-based apps, and make digital assistants keep context on a much higher level.

Until then, I guess we’ll continue hoping that we’re feeling lucky.

Tag Archives: Voice Search

“Alexa, add voice shopping to my to-do list”

Feeling Lucky Is the Future of Search
