With all the hype about voice assistants and AI, it’s no wonder that businesses and brands are afraid of being left behind.

Reactions to Google’s public demonstration of its Duplex AI bot have been polarised. Some tech commentators were reportedly amazed by Google’s creation; others were horrified.

Concerns centred on how natural it sounds and the fact that participants weren’t told whether they were speaking to a human or a machine. Doubts over the authenticity of the ‘experiment’ have surfaced recently, adding to the hype.

With such divided opinion, are voice assistants really going to take off?

Most likely. We can find voice user interfaces everywhere, even if they are not being used to their full potential quite yet. Amazon’s Alexa, Google Assistant and Apple’s Siri can be found in phones, televisions, watches, cars, and now built as standard into the smart home ecosystem.

Smart speakers such as the Amazon Echo are quickly becoming the central control hubs of the smart home.

“Over 20% of mobile search is already voice based and forecast to rise to 50% of all search by 2020”

They even teach our children. Amazon’s Alexa can now be tuned to encourage children to say “Please” and “Thank you”, instilling good manners in the absence of parents.

Hardware ubiquity aside, the contexts in which voice trumps traditional input methods are becoming even more influential.

The growing reliance on tech to control our lives, to interact with people and provide entertainment while our hands and eyes are busy doing something else like cooking, driving and washing, is encouraging more of us to try it out.

Most of us either own a smart speaker or know someone who does. Here are some stats to note:

– A Voicebot.ai report recently found that 10% of the UK adult population now has access to a smart speaker (a figure that has doubled in the last six months).

– 75% of US households are projected to own a smart speaker by 2020.

– HubSpot estimates that 19% of people use Siri at least daily.

– Over 20% of mobile search is already voice based and forecast to rise to 50% of all search by 2020.

These stats are a clear indication that users are increasingly turning to voice instead of typing their search queries. It’s hardly surprising, given that voice interaction is three times faster, more intuitive and significantly easier than typing.

But that’s just the beginning of the interaction. It’s the way search results are returned that begins to challenge the capabilities, and the acceptance, of the typical user.

A picture paints a thousand words

In a visual display, a list of results enables us to scan an array of possible options, briefly noting words, phrases and URLs – all combining to inform our decision.

With verbal content, this isn’t possible. Receiving search results through voice is a completely different experience for the user.

Given the limited working memory capacity of our brains, remembering verbal options and making decisions based on these options would be challenging.

Just think about how many times you’ve forgotten which option to press on an automated phone call menu (the simplest form of voice assistance).

Returning results using voice will need to be stripped back, direct and basic, with voice assistants most likely recommending a single choice rather than presenting options, which effectively removes the opportunity for search marketing.

Screen saver

We won’t be ditching our screens quite yet though. Our voice assistants already use a combination of voice input and visual output and, in most situations, this works well enough.

The internet is built on choice, and it will take time before voice assistants (and the companies behind them) build enough trust with their users for them to have the confidence to hand over the reins for all their search queries.

Learning the language

There will no doubt be some learning on the job. Users expect the interaction to feel more like a conversation than like operating technology. But people are used to talking to other people, not to machines.

Google said their Duplex AI was developed to have ‘natural’ conversations and would be able to accomplish real-world tasks for people via their phones. Powered by Google DeepMind’s WaveNet software (terrifying terminology), the voice has been trained using thousands of conversations. The software knows how humans sound and how they behave and react, so it can mimic them effectively, even adding its own little fillers (like “ah” or “um”) between words, just as we do.

“Like it or not, it’s happening and there are opportunities for businesses and brands to develop innovative content and experiences – and potentially even brand-new services”

Is it listening to me?

Aside from the social discomfort or ‘embarrassment factor’, concerns about privacy and security are bound to slow adoption.

The internet is filled with accounts of people finding advertising that is suspiciously similar to topics of conversations they’d been having that day in earshot of their voice assistants.

So are they always listening? Google categorically claims it does not use anything said before a person says “OK Google” to activate the voice recognition – be it for advertising or any other purpose.

Facebook allegedly doesn’t allow brands to target advertising based around microphone data and it never shares data with third parties without consent. Other big tech companies have also denied using the technique.

The Channel 4 show ‘Celebrity Hunted’ caused a bit of a backlash against Alexa as viewers saw first-hand how stored recordings could be made available for future interrogation.

Over the next few years, the challenges and barriers to adoption will inevitably be overcome.

So, like it or not, it’s happening and there are opportunities for businesses and brands to develop innovative content and experiences – and potentially even brand-new services.

That said, it’s not going to be easy. Voice interactions will pose some new and unusual challenges for brands. Here are three to start considering:

1. Don’t speak to me in that tone of voice

The traditional ‘tone of voice’ brand document takes on a whole new meaning. Having a single voice represent the brand across interactions ranging from technical support and sales through to customer services could cause some hot debate amongst brand teams.

Voice-based services will allow brands to define the characteristics of the persona that represents them.

It opens up numerous questions about personality style, gender, accents, vocabulary… the list could go on.

2. Accessibility rules

It’s likely that voice interaction will pose more of a design challenge than visual, graphics-based systems.

Without a visual interface, design teams will need to find a way to provide users with the missing information (affordances) that usually informs them of what they can do.

Systems will need to communicate instructions to the user, telling them how to express their intentions in a way that the system understands.

Feedback is also required, ensuring the user knows the system is listening and fulfilling their requests.

This approach shouldn’t be new to brands that already provide truly accessible products and services to users with cognitive and perceptual disabilities.

3. Context and subtext

Brands must also manage the expectations users bring from everyday conversations. When people talk to each other, a lot of information is not contained in the spoken message itself.

We draw on experience of similar ‘events’ and the context of the interaction to fill in that information and inform the appropriate interpretation as we listen and talk.

We are only scratching the surface here so if you want to chat about the future of voice and how you can prepare, get in touch with Jonny West on jonny.west@saintnicks.uk.com or on 0117 9270100.