A look at Duplex, Google's creepy AI chatbot


Duplex is a new Google tool that uses complex artificial intelligence (AI) to not only dial a phone number and conduct a voice conversation but to sound convincingly human in the process.

It's designed to carry out specific tasks such as booking appointments and checking opening hours. But although it undoubtedly marks the next step towards effective human-computer interaction, it's already proving controversial, in no small part thanks to rumours that Google may be faking the whole thing.

What's the idea behind it?

Google has noticed that many businesses have yet to introduce an online booking service. As such, we still primarily rely on making phone calls to book appointments with dentists, doctors, opticians, hairdressers and restaurants.

Google wants to bring these interactions in line with services that do offer internet facilities by handling the chat side of things for you.

How does it work?

Duplex is to be integrated with Google Assistant, and it's simply a matter of asking it to perform a certain task. You could enquire when a local computer shop is open, for example, or make an appointment with your nearest hairdresser.

In either case, details of the request are passed to Duplex, which then makes a call, does the talking for you and reports back with an answer or confirmation. Magically, the person on the other end shouldn't be able to tell they've just spoken to a computer.

How does it achieve that?

For several years, Google has been training a recurrent neural network using anonymised phone conversation data. Such networks work like a human brain to some degree, using memory to process input sequences while taking into account their own output, context and history.

As you can imagine, this is hugely complex in nature. Duplex needs to work with the output from Google's automatic speech-recognition software, while taking into account features of the audio and the conversation's history. It also needs to operate within the parameters of the conversation it is tasked to carry out, knowing what to ask and the kind of responses it should be giving.
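To make the idea of a network that "remembers" a little more concrete, here is a minimal sketch of a single recurrent cell in plain Python. It is a toy illustration only, not Google's model: the function names, the two-unit size and the hand-picked weights are all assumptions chosen for readability. The point it shows is that each new state depends on both the current input and the previous state, so the same input can produce a different result depending on what came before.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One step of a toy recurrent cell: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    h_new = []
    for i in range(len(h_prev)):
        total = b[i]
        # Contribution of the current input
        total += sum(w_x[i][j] * x[j] for j in range(len(x)))
        # Contribution of the previous state (the network's "memory")
        total += sum(w_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new

def run_sequence(inputs, w_x, w_h, b):
    """Feed a sequence through the cell, returning the state after each step."""
    h = [0.0] * len(b)  # start with an empty memory
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

# Hypothetical hand-picked weights: 2 inputs, 2 hidden units
W_X = [[0.5, -0.3], [0.1, 0.8]]
W_H = [[0.9, 0.0], [0.2, 0.4]]
B = [0.0, 0.0]

# The same input twice in a row gives two different states,
# because the second step also sees the memory of the first.
states = run_sequence([[1.0, 0.0], [1.0, 0.0]], W_X, W_H, B)
```

A real system would learn its weights from data and use far larger, more sophisticated cells; the recurrence itself is the part this sketch is meant to convey.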

Can't I just chat merrily away to it?

No, not yet. Duplex will not be phoning businesses for a general chit-chat, and neither could it handle one if it did: flawless, open-ended conversation is difficult to achieve because it relies on context and multiple layers of meaning.

For that reason, Google says Duplex can't carry out general chats, so discussing the weather and what you're having for your tea isn't on the agenda right now.

But is it convincing as a human?

Yes, it is. Based on the examples that Google demonstrated at its annual I/O developer conference, Duplex could start a conversation with a person who picked up the phone, make the right request, listen for a response, answer questions and deal with nuances of speech.

As a follow-up blog showed, Duplex is also able to handle interruptions and elaborations, with latency taken into account ("hellos" are responded to immediately, while responses that would ordinarily require a little bit of thought are slightly delayed).

It even "ums" and "aahs" like a human, lending that extra level of reality to its speech to trick whoever answers the phone.
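The timing behaviour described above can be sketched in a few lines of Python. This is purely illustrative and makes no claim about how Duplex is built: the greeting list, the 0.6-second pause and both function names are hypothetical. It simply shows the idea of replying instantly to a simple greeting while pausing, and perhaps adding a filler word, before a reply that a person would need to think about.

```python
# Hypothetical set of utterances that deserve an instant reply
GREETINGS = {"hello", "hi", "hey"}

def response_delay_seconds(utterance: str) -> float:
    """Return how long to wait before replying (values are assumptions)."""
    words = utterance.lower().strip(" ?!.,").split()
    if len(words) == 1 and words[0] in GREETINGS:
        return 0.0  # a bare greeting is answered immediately
    return 0.6      # anything else gets a short "thinking" pause

def with_filler(reply: str, delay: float) -> str:
    """Prefix a filler word when the reply follows a pause."""
    return f"Um, {reply}" if delay > 0 else reply

delay = response_delay_seconds("What time would you like?")
spoken = with_filler("for 7pm, please", delay)
```

Here `response_delay_seconds("Hello?")` returns no delay, while the question about the booking time gets a pause and a filler.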

Can it recognise any speech?

Apparently so, although Google has not yet provided any examples of a conversation near-drowned out by clattering plates or snipping scissors, nor has it given evidence of Duplex coping with fast talkers or strong accents.

However, voice-recognition software, neural networks and natural language processing are advancing at a rate of knots, so we wouldn't be surprised if Duplex could cope with them.

Will it replace infuriating automated phone systems?

Sadly, that's not the current intention. Instead, Google Duplex is flipping computer-human conversations around, with the business on the receiving end rather than the customer. But Duplex makes the process of talking to a computer much more natural. Google says humans do not have to adjust to the system; the system adjusts to us.

Will people be told they're talking to a robot?

In the audio examples given by Google, Duplex does not identify itself as a machine. This has caused a lot of controversy because many people worry that it could become creepy and deceptive. But Google insists it is experimenting to find the right approach and, since the demonstration, it has said Duplex will indeed identify itself (as Google Assistant) at the start of a conversation.

Isn't this still a slippery path?

Maybe. Some people believe a robot should always sound like a robot so that there are no misunderstandings, and there's certainly a fear that human-sounding robotic voices could be used to manipulate people.

We've already seen AI that can mimic any voice: a startup called Lyrebird creates digital voices that sound like you by sampling a minute of your speech. But even something as small as the "ums" and "aahs" of Duplex could be seen as a step too far.

Won't people just put the phone down?

If the system announces itself as a robo-caller, then the person may well simply hang up. But if they do, they risk losing business, so it's in their best interest to stay on the line.

Whether or not they adapt the way they talk when they're alerted to the non-human nature of the caller is another matter entirely. It's even possible that by becoming accustomed to talking to robots, we'll change the way we converse with each other.

Will Google record the conversations?

Yes, but only "in certain jurisdictions". We take that to mean it will record conversations where it is legal to do so and that it will seek permission in countries and states that require people to know a recording is being made.

Google says recordings could be shared with users so that they have a record of how they responded. The information is also likely to aid Duplex's learning process.

Why can't we just pick up the phone ourselves?

Well, we can, of course, and we envisage that many people will still prefer to call direct.

Going via Duplex does have benefits, though. It can save time and allows Google Assistant to make a note of the appointment in your calendar and set up a reminder. The system also means businesses don't have to spend a fortune on online booking systems.

It could even get us across language barriers: we may request a booking for a Parisian restaurant in English on Google Assistant and have the request made in perfect French.

Did Google fake the AI calls?

It may be testament to the impressive nature of the calls, but when Google showed off examples of Duplex in action, some journalists and experts believed the calls may have been edited or even staged.

They pointed to the lack of an introduction by the humans, who failed to identify their business names. They also believed the lack of ambient noise was a giveaway sign that something was amiss. The fact that none of the humans asked for a phone number or email was also deemed suspicious, and this was compounded by the fact that there was no evidence that the person called had agreed to be recorded.

Google has not commented on the speculation, although some web sleuths picked up on a photo of Duplex engineers Yaniv Leviathan and Matan Kalman in a restaurant. The pair were said to have booked their meal via Duplex, so reporters identified the eatery from the style of décor and booth and called it up. The restaurant appeared to confirm the AI had made the reservation.
