Google to change AI forever with open source 'Parsey McParseface'

The SyntaxNet system understands human language with an incredible degree of accuracy

Google has open sourced its language parsing model, SyntaxNet, calling the English version Parsey McParseface.

The system understands human language with an incredible degree of accuracy, but attention has centred on its name, a nod to the public vote to name a science research ship Boaty McBoatface. The ship was in fact named after Sir David Attenborough.

The open sourcing of Google's parsing model means that the broader community can employ the tool to up the game of artificial intelligence (AI). This means that machines could understand sentences from a standard database of English-language journalism, the first step in their journey to take over the world.

"At Google, we spend a lot of time thinking about how computer systems can read and understand human language in order to process it in intelligent ways," explained Google senior staff research scientist, Slav Petrov.

"Today, we are excited to share the fruits of our research with the broader community by releasing SyntaxNet, an open source neural network framework implemented in TensorFlow that provides a foundation for Natural Language Understanding systems."

SyntaxNet is built on powerful machine learning algorithms that learn to analyse the linguistic structure of language, and that can explain the functional role of each word in a given sentence.


But how does it work? The system basically recognises the subject and object of a sentence and makes sense of what they mean by determining the syntactic relationships between the words, which are represented in a dependency parse tree.

To explain this visually, consider a simple dependency tree for the sentence: "Alice saw Bob".

The structure encodes that Alice and Bob are nouns and saw is a verb. The main verb 'saw' is the root of the sentence and Alice is the subject (nsubj) of saw, while Bob is its direct object (dobj). This diagram structure helps Parsey McParseface essentially suss out the meaning of the sentence, analysing it correctly.
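
For readers who think in code, the same tree can be written down as plain data. The sketch below is purely illustrative and is not SyntaxNet's own output format: the `Token` class, its field names and the `describe` helper are assumptions made up for this example. Only the tags and relations (noun, verb, nsubj, dobj, root) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One word in a dependency tree (hypothetical structure for illustration)."""
    index: int     # 1-based position in the sentence
    word: str
    pos: str       # part-of-speech tag
    head: int      # index of the governing word; 0 marks the root
    relation: str  # dependency label linking this word to its head


# "Alice saw Bob": 'saw' is the root, Alice its subject, Bob its direct object.
sentence = [
    Token(1, "Alice", "NOUN", 2, "nsubj"),
    Token(2, "saw", "VERB", 0, "ROOT"),
    Token(3, "Bob", "NOUN", 2, "dobj"),
]


def describe(tokens):
    """Return one readable line per word, showing its label and its head."""
    by_index = {t.index: t for t in tokens}
    lines = []
    for t in tokens:
        head = "ROOT" if t.head == 0 else by_index[t.head].word
        lines.append(f"{t.word} ({t.pos}) --{t.relation}--> {head}")
    return lines


for line in describe(sentence):
    print(line)
```

Each word stores a pointer to exactly one head, which is what makes the structure a tree rather than an arbitrary graph.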

"Our release includes all the code needed to train new SyntaxNet models on your own data, as well as Parsey McParseface," added Petrov. "SyntaxNet applies neural networks to the ambiguity problem. An input sentence is processed from left to right, with dependencies between words being incrementally added as each word in the sentence is considered."

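The left-to-right process Petrov describes can be sketched with a small stack-based parser. This is a toy under stated assumptions, not Google's code: it uses the textbook "arc-standard" set of moves (SHIFT, LEFT_ARC, RIGHT_ARC), and the sequence of moves is written out by hand, whereas in SyntaxNet a neural network decides what to do at each step. The function name `parse` and the move names are choices made for this example.

```python
def parse(words, actions):
    """Apply a fixed list of arc-standard moves to a sentence.

    Returns (arcs, stack): arcs maps each dependent's position to a
    (head position, label) pair; stack holds whatever is left at the end,
    which should be just the root word.
    """
    stack = []                         # words read so far that still need a head
    buffer = list(range(len(words)))   # positions not yet read, left to right
    arcs = {}
    for move, label in actions:
        if move == "SHIFT":
            # Read the next word from the input.
            stack.append(buffer.pop(0))
        elif move == "LEFT_ARC":
            # The second word from the top becomes a dependent of the top word.
            dependent = stack.pop(-2)
            arcs[dependent] = (stack[-1], label)
        elif move == "RIGHT_ARC":
            # The top word becomes a dependent of the word beneath it.
            dependent = stack.pop()
            arcs[dependent] = (stack[-1], label)
        else:
            raise ValueError(f"unknown move: {move}")
    return arcs, stack


words = ["Alice", "saw", "Bob"]
# Hand-written moves; a trained model would choose these one at a time.
actions = [
    ("SHIFT", None),
    ("SHIFT", None),
    ("LEFT_ARC", "nsubj"),   # Alice <- saw
    ("SHIFT", None),
    ("RIGHT_ARC", "dobj"),   # saw -> Bob
]

arcs, stack = parse(words, actions)
for dependent, (head, label) in sorted(arcs.items()):
    print(f"{words[dependent]} is the {label} of {words[head]}")
print(f"root: {words[stack[0]]}")
```

The ambiguity problem shows up in the choice of move: at many steps more than one move is legal, and picking the wrong one attaches a word to the wrong head.
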
According to Google, Parsey McParseface achieves 94 per cent accuracy on news text, compared with about 96 or 97 per cent for human linguists. It does not do quite as well on random sentences from the web, where it gets about 90 per cent accuracy.
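
The article does not spell out how these accuracy figures are calculated. One common way to score a dependency parser, assumed here for illustration only, is to count the share of words that were attached to the correct head. The function name `attachment_accuracy` and the sample data below are invented for this sketch and are not Google's results.

```python
def attachment_accuracy(gold_heads, predicted_heads):
    """Fraction of words whose predicted head matches the reference head."""
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("both lists must cover the same words")
    if not gold_heads:
        raise ValueError("cannot score an empty sentence")
    correct = sum(g == p for g, p in zip(gold_heads, predicted_heads))
    return correct / len(gold_heads)


# "Alice saw Bob", heads given as 1-based positions with 0 for the root.
gold = [2, 0, 2]        # reference analysis
predicted = [2, 0, 1]   # made-up parser output that attaches Bob to Alice

print(f"{attachment_accuracy(gold, predicted):.0%}")
```

On this reading, a score of 94 per cent would mean roughly one word in every 17 ends up with the wrong head.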
