Scott Monty - Strategic Communications & Leadership Advisor

We're moving into an era when more of our desires and demands will be answered — literally and figuratively — by disembodied voices. But there may be more to the story before we're through...

You know the major players in the voice-enabled speaker and assistant market: Amazon Echo (powered by Alexa) and Google Home, plus the AI-powered, voice-activated assistants built into other devices, such as Apple's Siri, Google Now, and Microsoft Cortana. And don't forget J.A.R.V.I.S.

The race is on, as 35.6 million Americans will use a voice-activated assistant device at least once a month. That's a growth of 130% in the voice-enabled speaker market — a market that is currently dominated by Amazon, with 70% market share (followed by 23% market share for Google Home).

But here's the current challenge with voice-activated systems: there's no menu. There's no dropdown of options. There's no visual cue to give you a sense of what you can ask the system. Oh sure, you can ask what your query options are, but the voice will simply read them back to you.

And if you have used laptops, smartphones, telephones, typewriters, pen and paper, wet clay tablets, or any other communication system of the last 2,500 years, you'll understand that we as a species are trained to think visually. We haven't functioned as a fully aural society since Homer's time, when epic poems were shared around fires rather than in printed tomes.

So it should be no surprise that there might not be universal adoption of voice-enabled speakers (35 million people who use them once a month?). Could the forefront of artificial intelligence-enabled systems actually be visual rather than audio? Two recent developments indicate a possibility.

First, Amazon announced a touchscreen version of Echo: the Echo Show. It has the same functionality as its tubular brother, but with visual cues through a built-in display. The upside is that the Echo Show includes video and voice call capability.

And last week in its I/O announcements, Google debuted Google Lens, which some have said is the future of Google. Essentially, you point your phone's camera at anything, and Google Lens acts as a visual search, giving you any relevant information or taking action. It is augmented reality meets search meets artificial intelligence. And Amazon ought to be concerned.


Because for years, the differentiation between Amazon and Google has been this:

Google brings you information.
Amazon brings you products.

With visual search, that has the potential to be flipped on its head. How? The phone is mobile, while Echo is not. Echo ties you to interactions with Amazon in your home. With Google Lens (and the less visual Google Now, which is now available on iOS), you're essentially carrying an AI assistant in your pocket with the potential to do much more.

With the visual search element, it makes showrooming a cinch — should ecommerce brands integrate with visual search results. It also takes the friction out of the audio-only interactions. And it puts Google back on top in this app-dominated market where a Facebook or Amazon search might occur before a Google search.

So Google might actually be able to bring you more than just information, by giving you more than just a stationary disembodied voice to talk to.

