Build 2016: Why Microsoft predicts a world of talking bots
Satya Nadella’s big vision for Microsoft is around intelligent conversations with machines
Right after its millennial Tay bot turned genocidal might not seem like the best time for Microsoft to pin its future on bots, conversations and artificial intelligence, but that's exactly what Satya Nadella announced at Build 2016 last night.
The Redmond CEO claimed that "we are on the cusp of a new frontier that pairs the power of natural human language with advanced machine intelligence".
Microsoft wants to bring conversation into so many places that it becomes the next stage of the GUI; "we want to take the power of human language and apply it more pervasively to all computing interfaces," Nadella said.
Conversations as a Platform, as Microsoft calls it, envisions a world where you ask Cortana to block out in your calendar the week you'll be at a conference, and she tags in the bot from your favourite hotel to book a room for the right dates.
This bot, by the way, already knows what kind of room you prefer, and suggests a message to send to a friend who lives nearby letting them know when you'll be in town.
That takes the way Cortana can already index your email, looking for meetings you ought to go to and things you've promised to do, and extends it to services like Skype and Slack, so she can extract information from video voicemail (which comes with a handy transcript produced by voice recognition; since Microsoft is extracting the information for the bots, you can see it too).
If you're working late, Cortana might ask if you want to order a takeaway and pop up the Just Eat bot or the Domino's bot to take your order, depending on what she knows about your eating habits.
Behind the scenes, the Domino's bot will be using Microsoft's Cognitive Services APIs to understand what you're saying or typing, and what you're talking about, although it's up to the developer whether to accept suggestions from the text recognition API that their bot should know that 'send a pizza to my crib' means 'deliver a pizza to my house'.
The intelligent underpinnings of conversational interfaces draw on AI and machine learning research Microsoft has been doing for a couple of decades. They also rely on what Microsoft knows about how businesses work, the information in your mail, and documents stored in the Microsoft Graph.
This is the big data processing that Azure is so good at and is more evidence of the cross-platform mobility of experience Nadella has been preaching from day one, because although Microsoft wants Windows to be the best place to have a voice conversation with Cortana, this will work in Cortana on your iPhone or Nexus too.
Intelligence is not just voice recognition for talking to bots and language understanding for typing messages. It's also image recognition (which down the line is going to include recognising people, places and things in video, along with emotions), and the other 20 APIs Microsoft is already offering.
It's also holograms and the way HoloLens maps the real world to mix virtual into reality. It's also the smart handwriting recognition any developer can put in their Windows apps - when you write 'tomorrow' and Cortana recognises it as a date, it's another way you could start a chat about what you want to do or automatically get a reminder when you need it.
Really, it's the Natural User Interfaces we've been promised for about a decade, and they might finally become real.
Plausible, polite conversations
If conversations with bots sound automated and dehumanising, like a future of brands shouting their wares at you, look at projects like Seeing AI, a service for smartphones and smartglasses to help the visually impaired understand who and what is around them.
The app can guide a blind person to take a photo of the menu ('move the phone up and to the right') and then read it out to them. Rather than reading out the whole menu, it lets them pick and choose the sections they want to hear about.
Alternatively, you can swipe a finger over the earpiece of your smartglasses to snap a photo and have it automatically captioned, so you know what's happening ('I think it's a girl throwing an orange frisbee in the park', says the system, using the kind of image captioning that was a research project only last year).
Cleverly, Microsoft demoed this capability by recognising the age and gender of the people you're talking to, instead of trying to do facial recognition on strangers walking down the street. That's a perfect example of what Nadella calls "intelligence that augments human capabilities and experience", saying that "it's not going to be about man versus machines, it's going to be about man with machines".
Similarly, bots that talk to you, or that Cortana deals with automatically for you, don't get to know your location unless you allow it. For example, Cortana might ask 'do you want the courier to know where you are this afternoon so the delivery comes to the right place?'.
Cortana can also securely pass along your credit card details, an innovation that might even work with something like the FIDO standards that underpin Windows Passport, so Cortana can do your banking for you, once you've used the Windows Hello biometrics to sign in.
AI is one of the biggest buzzwords around at the moment. It's also often hype. What Microsoft is promising is more achievable than some AI futures. Think of a bot as a very simple app, or better yet a more useful search result that lets you do what you want rather than clicking on a link and hunting through a web page looking for the right tool.
That doesn't mean the idea of pervasive conversational interfaces isn't ambitious, and it won't necessarily be easy to get right. It's not that the effects of it going wrong will be much worse than the problems we have in other areas of technology; this is image recognition and language understanding, not robots with lasers, and getting a hotel booking wrong is as easy on a website as it will be talking to a bot.
But because of the personal, intuitive, conversational interface, it might feel a lot more upsetting if a hotel bot gets your room wrong or a restaurant bot books the table for the anniversary of your first wedding when you're divorced and remarried.
That's one reason Nadella was careful to position AI, bots, assistants, conversational interfaces and intelligence not only as part of the future landscape of technology, but also as something society needs to have a discussion about. He specifically mentioned the problems with Tay (while subtly noting that Microsoft's chat bots in China and Japan haven't run into the same malicious pranking by users), promising Microsoft would take "a principled approach" to adding intelligence to interfaces, and make technology that's "more inclusive and respectful", with those principles including privacy as well as security and compliance. "We want to build technology so we get the best of humanity, not the worst," he said.
That makes conversational interfaces part of the ongoing discussion about accessibility and diversity in technology. Nadella hinted at other common questions, like where the balance between security and privacy lies with things like encryption, and whether technology is actually contributing to GDP.
Conversational interfaces that actually work could be great for productivity as well as extending access to computing to a lot more people than are comfortable with laptops and smartphones. That makes the old Microsoft promise of 'a computer on every desktop' seem almost unambitious.