What is natural language processing?
Creating systems capable of meaningful conversation is one of the hardest tasks facing AI researchers
If you find yourself dealing with customer service departments on a regular basis, it's likely you've noticed a slow shift away from traditional voice recordings and 'press 1 for x' instructions, towards something a little more intuitive.
This modern service approach has automated systems capable of receiving and understanding voice input - in other words, allowing customers to vent their frustrations and have a computer resolve the problem.
Thanks to recent developments in machine learning, the technology is far more sophisticated than most people believe, particularly when our experience of this sort of thing is usually confined to interactions with an Amazon Echo. Natural language processing, the framework that such devices are built on, is responsible for a range of innovations, including highly capable 'chatbots'. These are able to ascertain the subject of a query or response, analyse its content, take in contextual information and, in many cases, assess your mood.
Whether you're asking a smart speaker what the weather will be like this week, making a call to check the latest train times, or calling to make a complaint about shoddy user experience, chances are that natural language processing (NLP) is working away in the background to provide a service that's just as good as, if not better than, one provided by a human.
It's not just voice that NLP can help with. The technology also allows our word processing software to check for grammar errors, and it sits behind Google Translate, doing its best to turn a document in a language you can't read into one that you can. It helps the chatbot used by your bank to understand and comply with your requests in the blink of an eye.
A very tricky problem to solve
However, having an effective conversation or reading text is rife with nuances, inferences and judgements. It's one thing to break a language up into nouns, verbs, adjectives and the rest, but language is about far more than the various mechanisms that structure a sentence - and it's this added complexity that makes it difficult for computers to interpret and mimic human interaction.
The same word can have different meanings depending on the context in which it is used. Consider this example: 'the aeroplane banked to the left', or 'I banked the cheque you gave me'. How about 'I stood at the bow of the ship' compared to 'Mark tied his shoelaces into a bow' - not to mention the act of bowing to an audience or the device used to shoot an arrow.
Knowing how 'bow' is used in each of those sentences depends on the context provided by the surrounding words. Yet sometimes the context of the surrounding words doesn't help either. For example, consider "I saw the hiker on the path with binoculars". Who has the binoculars - me or the hiker?
The English language is heavily reliant on context, something that non-native speakers have difficulties with, never mind a machine. And that's before we start factoring in the highly personal ways in which people express themselves, such as irony, sarcasm and humour, which even we humans can find difficult to infer from sentences alone.
What does NLP software do?
Natural language processing helps overcome many of the problems associated with understanding human language by combining data and algorithms to create context. There are two main aspects to how this works - syntax and semantics.
Syntax is by far the easier of the two to apply, given that it deals with clearly defined rules. This includes dictionary definitions, the structure of sentences and, most notably, parsing, which works out how the words in a sentence relate to one another grammatically. With the 'binoculars' example above, parsing exposes the two possible readings, and earlier or later sentences can then help to identify who the binoculars belong to.
With natural language processing, algorithms are able to apply a number of different syntactic processing techniques to work with these rules. These include: 'word segmentation', which divides a continuous chunk of text into individual words; 'stemming', which involves reducing an inflected word to its root form; and 'morphological segmentation', which involves breaking a word down into its smallest meaningful units, such as 'in', 'come', '-ing' for 'incoming'.
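The three techniques just described can be sketched in a few lines of Python. This is a deliberately crude illustration rather than how any production system works: the suffix, prefix and root lists are hypothetical and chosen only so the 'incoming' example comes out as described, and the function names are invented for this sketch.

```python
import re

# Hypothetical word lists, chosen only for illustration.
SUFFIXES = ("ing", "ed", "es", "s")
PREFIXES = ("in", "un", "re")
KNOWN_ROOTS = {"come", "train", "delay"}


def segment_words(text):
    """Word segmentation: split raw text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def stem(word):
    """Crude stemming: strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def segment_morphemes(word):
    """Morphological segmentation: peel off one known prefix and one known suffix."""
    head, tail = [], []
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            head.append(prefix)
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            tail.append("-" + suffix)
            word = word[: -len(suffix)]
            break
    # Restore a dropped silent 'e' when that yields a root we know ('com' -> 'come').
    if word not in KNOWN_ROOTS and word + "e" in KNOWN_ROOTS:
        word += "e"
    return head + [word] + tail


tokens = segment_words("The incoming trains were delayed.")
stems = [stem(token) for token in tokens]
morphemes = segment_morphemes("incoming")
```

Note that the stemmer turns 'incoming' into 'incom', which is not a real word. Simple stemmers tend to behave like this, which is one reason the more careful morphological approach exists.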
Semantics, however, is far less structured and usually the more difficult to interpret. This aspect involves understanding the meaning of individual words within the broader context of the sentence. As with the examples above, this is understanding the various and most appropriate uses of the word 'bow'.
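One simple way to picture this is to score each possible sense of 'bow' by how many of its typical companion words appear in the sentence. The sketch below does just that. The sense list and cue words are invented for illustration, and real systems learn such associations from large amounts of text rather than from a hand-written table.

```python
import re

# Hypothetical mini sense inventory for 'bow'; the cue words are invented for illustration.
SENSES = {
    "front of a ship": {"ship", "boat", "deck", "stern"},
    "decorative knot": {"tied", "shoelaces", "ribbon", "knot"},
    "weapon": {"arrow", "archer", "shoot", "quiver"},
    "bend at the waist": {"audience", "applause", "stage", "curtsy"},
}


def disambiguate(sentence, senses=SENSES):
    """Pick the sense whose cue words overlap most with the sentence, or None if none overlap."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    best_sense, best_score = None, 0
    for sense, cues in senses.items():
        score = len(words & cues)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


ship_sense = disambiguate("I stood at the bow of the ship")
knot_sense = disambiguate("Mark tied his shoelaces into a bow")
unknown_sense = disambiguate("I like the bow")
```

The last call returns nothing at all, because the sentence offers no surrounding clues. That is the same position a human reader would be in.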
Semantic analysis can include 'named entity recognition', which involves figuring out which pieces of text map to proper names, and then determining what type of name each one is, whether that's a person's name or the name of a location. Systems can rely on the convention of capitalising the first letter to identify a given name. However, the first letter of a sentence is also capitalised, and many organisation names do not capitalise every word. What's more, capitalisation rules are not uniform across languages - in German, every noun is capitalised, for example.
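The capitalisation shortcut, and the ways it falls over, can be shown with a short sketch. The function name and the example sentences are invented for illustration, and this is a naive heuristic rather than the method any particular product uses.

```python
import re


def capitalised_candidates(text):
    """Return runs of capitalised words that do not start a sentence."""
    candidates = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = re.findall(r"[A-Za-z&']+", sentence)
        run = []
        # Skip the first token: it is capitalised whether or not it is a name.
        for token in tokens[1:]:
            if token[0].isupper():
                run.append(token)
            else:
                if run:
                    candidates.append(" ".join(run))
                run = []
        if run:
            candidates.append(" ".join(run))
    return candidates


english = capitalised_candidates(
    "Yesterday Mark flew to New York. Amazon opened an office there."
)
german = capitalised_candidates("Der Hund jagt die Katze im Garten.")
```

In the English text the heuristic finds 'Mark' and 'New York' but misses 'Amazon', because it sits at the start of a sentence. In the German sentence it flags the ordinary nouns 'Hund', 'Katze' and 'Garten' (dog, cat and garden), none of which is a name.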
This is symptomatic of semantic analysis in general - although there are techniques available, there are a number of hurdles still left to overcome.
Are we there yet?
The very nature of human language, which has evolved over thousands of years, means computer systems need to be flexible, capable of accommodating approximations, and ultimately, able to make educated guesses.
M&S has already replaced the entirety of its call centre staff with software based on NLP
Systems based on natural language processing are still being developed, and only fairly rudimentary models are widely available for public use. Smart assistants such as Siri, Alexa and Google Assistant are essentially just question and answer machines and are not capable of making sense of conversation.
Apple, Amazon, Google, IBM and other major technology players are all working on their own NLP systems to address a variety of challenges. These include systems capable of analysing large streams of text in documents, assessing emails to help sift junk mail, and maintaining conversations as part of a customer service tool.
The sophistication of online chatbots is improving all the time, but they are still only really capable of handling simple question and answer interactions and require the user to provide specific targeted keywords. Complaints are often escalated to a human employee once a problem becomes more nuanced, which, unfortunately for the customer, doesn't take much doing.
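A keyword-driven chatbot of the kind described here can be sketched as follows. The keywords, replies and escalation message are all hypothetical, and commercial chatbots are more elaborate, but the basic limitation is the same: if the message doesn't contain exactly one recognised keyword, the bot gives up and hands over to a person.

```python
import re

# Hypothetical keyword-to-reply table; a real deployment would be far larger.
RESPONSES = {
    "refund": "Here is how to request a refund.",
    "delivery": "Here is how to track your delivery.",
    "password": "Here is how to reset your password.",
}
ESCALATE = "Let me put you through to a member of staff."


def reply(message, responses=RESPONSES):
    """Answer only when exactly one known keyword appears; otherwise escalate."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    matches = [keyword for keyword in responses if keyword in words]
    if len(matches) == 1:
        return responses[matches[0]]
    return ESCALATE


simple = reply("How do I get a refund?")
nuanced = reply("My parcel arrived broken and nobody will help")
mixed = reply("I want a refund because my delivery is late")
```

Only the first message gets an automated answer. The second contains no keyword and the third contains two, so both are passed to a human.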
We're a long way off the seamless interaction that the customer service industry envisages.