What is natural language processing?

Creating systems capable of meaningful conversation is one of the hardest tasks facing AI researchers

If you have recently dealt with a customer service department, you might have noticed that the experience has changed significantly over the last few years.

Traditional pre-recorded instructions, which asked you to choose from a range of easily forgotten numbers in order to access the help you need, have gradually been replaced with a more customer-friendly approach.

Modern customer service has managed to spare some of its employees from having to be on the receiving end of frustrated rants. Nowadays, customers are often greeted with the voice of an automated system able to receive and understand voice input - basically, allowing customers to do the yelling or complaining while a computer resolves the problem.

This small yet significant progress can be attributed to machine learning, which is also found in our interactions with ‘Alexa’, the ever-helpful assistant from Amazon Echo devices.

However, this technology is far more advanced than most people realise. Natural language processing (NLP), the field that underpins such devices, is behind a number of recent innovations. One example is the 'chatbot', which can determine the subject of a query or response, analyse its content, take in contextual information and, in many cases, also assess your mood.

Moreover, NLP is not just limited to voice communication. The technology is also capable of checking grammatical errors, translating entire web pages, or finding relevant information in large volumes of documents.

Whether you're asking your ‘Alexa’ to read you the morning news, making a call to check if your plane has been delayed, or enquiring about a missing parcel, there is a chance that your request or issue is being dealt with by natural language processing, which is working away in the background to provide you with the customer service you need - or deserve.

A very tricky problem to solve

However, conversation and written text are rife with nuances, inferences and judgements. It's one thing to break a language up into nouns, verbs, adjectives and the rest, but language is about far more than the various mechanisms that structure a sentence - and it's this added complexity that makes it difficult for computers to interpret and mimic human interaction.

The same word can have different meanings depending on the context in which it is used. Consider these examples: 'the aeroplane banked to the left' and 'I banked the cheque you gave me'. How about 'I stood at the bow of the ship' compared to 'Mark tied his shoelaces into a bow' - not to mention the act of bowing to an audience or the device used to shoot an arrow.

Knowing how 'bow' is used in each of those sentences depends on the context provided by the surrounding words. Yet sometimes the context of the surrounding words doesn't help either. For example, consider "I saw the hiker on the path with binoculars". Who has the binoculars - me or the hiker?
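The idea of using surrounding words to pick a meaning can be sketched in a few lines of Python. This is a minimal illustration only, loosely in the spirit of overlap-based word sense disambiguation: the sense labels and cue words in SENSES are invented for this example and do not come from any real dictionary, and production systems use far richer models.

```python
# Hypothetical mini sense inventory for the word "bow".
# The cue words are illustrative assumptions, not real dictionary data.
SENSES = {
    "front of a ship": {"ship", "boat", "deck", "stern", "sail"},
    "decorative knot": {"tied", "shoelaces", "ribbon", "knot", "hair"},
    "weapon": {"arrow", "archer", "shoot", "string", "quiver"},
    "bending gesture": {"audience", "bowed", "stage", "applause", "curtsy"},
}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return {word.strip(".,!?'\"").lower() for word in text.split()}


def guess_sense(sentence):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears, i.e. the context does not help.
    """
    words = tokenize(sentence)
    scores = {sense: len(words & cues) for sense, cues in SENSES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_sense("I stood at the bow of the ship"))      # front of a ship
print(guess_sense("Mark tied his shoelaces into a bow"))  # decorative knot
```

A sentence that contains none of the cue words returns None, which mirrors the situation where the surrounding words offer no help.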

The English language is heavily reliant on context, something that non-native speakers can have difficulty with, never mind a machine. And that's before we start factoring in the highly personal traits that characterise the person speaking, such as irony, sarcasm and humour, which even we humans can find difficult to infer from sentences alone.

What does NLP software do?

Natural language processing helps overcome many of the problems associated with understanding language by combining data and algorithms to create context. There are two main aspects to how this works - syntax and semantics.

Syntax is by far the easier of the two to apply, given that it deals with clearly defined rules. This includes dictionary definitions, the structure of sentences and, most notably, parsing, which works out how the words in a sentence relate to one another. With the 'binoculars' example above, parsing can identify the two possible structures, while earlier or later sentences help to decide who the binoculars belong to.
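To make the 'binoculars' ambiguity concrete, the sketch below writes out the two structures a parser could assign to the sentence as nested Python tuples. It is a hand-built illustration, not a parser: the function name attachment_readings and the way the sentence is pre-split into four parts are assumptions made for this example.

```python
def attachment_readings(subject, verb, obj, modifier):
    """Return both bracketings for a sentence ending in a modifier phrase.

    'verb attachment' groups the modifier with the verb (the subject uses it);
    'noun attachment' groups the modifier with the object (the object has it).
    """
    return {
        "verb attachment": (subject, (verb, obj, modifier)),
        "noun attachment": (subject, (verb, (obj, modifier))),
    }


readings = attachment_readings(
    "I", "saw", "the hiker on the path", "with binoculars"
)
for name, tree in readings.items():
    print(name, "->", tree)
```

In the first reading the speaker is looking through the binoculars; in the second, the hiker is carrying them. Both are grammatically valid, which is why syntax alone cannot settle the question.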

With natural language processing, algorithms are able to apply a number of different syntactic processing techniques to help apply these rules. These include: 'word segmentation', which divides a large chunk of text into individual words; 'stemming', which involves reducing an inflected word to its root form; and 'morphological segmentation', which involves breaking a word down into its smallest meaningful units, such as 'in', 'come', '-ing' for 'incoming'.

Semantics, however, is far less structured and usually the more difficult to interpret. This aspect involves understanding the meaning of individual words within the broader context of the sentence. In the examples above, this means working out which of the various uses of the word 'bow' is the most appropriate.
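The three techniques can be sketched in plain Python as follows. This is a toy illustration under stated assumptions: the ROOTS lexicon and the SUFFIXES and PREFIXES lists are tiny, hypothetical stand-ins invented for this example, and real systems rely on much larger resources and more careful rules.

```python
import re

# Hypothetical mini resources, invented for illustration only.
ROOTS = {"come", "walk", "lock", "parcel"}
SUFFIXES = ("ing", "ed", "s", "es")
PREFIXES = ("in", "un", "re")


def segment_words(text):
    """Word segmentation: split text into lowercase word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def split_suffix(word):
    """Return (root, suffix) if stripping a known suffix yields a known root."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            base = word[: -len(suffix)]
            # Try the bare base, then restore a dropped final 'e' (com -> come).
            for candidate in (base, base + "e"):
                if candidate in ROOTS:
                    return candidate, suffix
    return word, ""


def stem(word):
    """Stemming: reduce an inflected word to its root form."""
    return split_suffix(word)[0]


def morphemes(word):
    """Morphological segmentation: prefix, root and suffix as separate units."""
    parts = []
    for prefix in PREFIXES:
        rest = word[len(prefix):]
        if word.startswith(prefix) and stem(rest) in ROOTS:
            parts.append(prefix)
            word = rest
            break
    root, suffix = split_suffix(word)
    parts.append(root)
    if suffix:
        parts.append("-" + suffix)
    return parts


print(segment_words("Is my plane delayed?"))  # ['is', 'my', 'plane', 'delayed']
print(stem("coming"))                          # come
print(morphemes("incoming"))                   # ['in', 'come', '-ing']
```

Words that are not covered by the small lexicon, such as 'delayed' here, are simply returned unchanged, which hints at how much linguistic knowledge a practical system needs.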

Semantic analysis can include 'named entity recognition', which involves figuring out which pieces of text map to proper names, and then determining what type of name each one is, whether that's a person's name or the name of a location. Systems can rely on the convention of capitalising the first letter to identify a given name. However, the first letter of a sentence is also capitalised, and many organisation names do not capitalise every word. What's more, capitalisation rules are not uniform across languages - in German, every noun is capitalised, for example.
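The capitalisation convention, and its weaknesses, can be shown with a short Python sketch. It is a deliberately naive heuristic written for this example, not how any particular product works: the function name capitalised_candidates is hypothetical, and the pattern only matches words made of one capital letter followed by lowercase letters.

```python
import re

# One capital followed by lowercase letters, optionally repeated across words.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*")


def capitalised_candidates(sentence):
    """Return runs of consecutive capitalised words as candidate names."""
    return NAME_PATTERN.findall(sentence)


# Works when the name is the only capitalised word.
print(capitalised_candidates("I gave the cheque to Mark"))
# ['Mark']

# The sentence-initial word is wrongly merged into the name.
print(capitalised_candidates("Yesterday Mark flew to New York"))
# ['Yesterday Mark', 'New York']

# German capitalises every noun, so ordinary nouns look like names.
print(capitalised_candidates("Der Hund jagt die Katze"))
# ['Der Hund', 'Katze']
```

Even when the heuristic finds a candidate, it says nothing about whether the name belongs to a person, a place or an organisation, which is the second half of the task.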

This is symptomatic of semantic analysis in general - although there are techniques available, there are a number of hurdles still left to overcome.

Are we there yet?

The very nature of human language, which has evolved over thousands of years, means computer systems need to be flexible, capable of accommodating approximations, and ultimately, able to make educated guesses.

M&S has already replaced the entirety of its call centre staff with software based on NLP

Systems based on natural language processing are still being developed, and only fairly rudimentary models are widely available for public use. Smart assistants such as Siri, Alexa and Google Home are essentially just question and answer machines and are not capable of making sense of conversation.

Apple, Amazon, Google, IBM and other major technology players are all working on their own NLP systems to address a variety of challenges. These include systems capable of analysing large streams of text in documents, assessing emails to help sift junk mail, and maintaining conversations as part of a customer service tool.

The sophistication of online chatbots is improving all the time, but they are still only really capable of handling simple question and answer interactions, and they require the user to provide specific, targeted keywords. Complaints are often escalated to a human employee once a problem becomes more nuanced, which, unfortunately for the customer, doesn't take much doing.

We're a long way off the seamless interaction that the customer service industry envisages.
