Google to change AI forever with open source 'Parsey McParseface'

The SyntaxNet system understands human language with an incredible degree of accuracy


Google has open sourced its language parsing model, SyntaxNet, calling the English version Parsey McParseface.

The system understands human language with an impressive degree of accuracy, but much of the attention has centred on its name - a nod to Boaty McBoatface, the name the public voted for a polar research ship that was ultimately christened after Sir David Attenborough.

The open sourcing of Google's parsing model means the broader community can use the tool to advance artificial intelligence (AI). It means machines can understand sentences from a standard database of English-language journalism - the first step in their journey to take over the world.

"At Google, we spend a lot of time thinking about how computer systems can read and understand human language in order to process it in intelligent ways," explained Google senior staff research scientist, Slav Petrov.


"Today, we are excited to share the fruits of our research with the broader community by releasing SyntaxNet, an open source neural network framework implemented in TensorFlow that provides a foundation for Natural Language Understanding systems."

SyntaxNet is built on powerful machine learning algorithms that learn to analyse the linguistic structure of language and can explain the functional role of each word in a given sentence.

But how does it work? The system essentially recognises the subject and object of a sentence and makes sense of what they mean by determining the syntactic relationships between the words, represented in a dependency parse tree.

To explain this visually, consider a simple dependency tree for the sentence: "Alice saw Bob".

The structure encodes that 'Alice' and 'Bob' are nouns and 'saw' is a verb. The main verb 'saw' is the root of the sentence; 'Alice' is its subject (nsubj), while 'Bob' is its direct object (dobj). This tree structure is what lets Parsey McParseface suss out the meaning of the sentence and analyse it correctly.
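The parse described above can be sketched as a tiny data structure - an illustrative example, not SyntaxNet's own code: each word records its part of speech, the index of its head (governing word), and the label of the dependency relation.

```python
# Hypothetical representation of the dependency parse for "Alice saw Bob".
# Each entry: (word, part_of_speech, head_index, relation); head -1 = root.
parse = [
    ("Alice", "NOUN", 1, "nsubj"),  # subject of "saw"
    ("saw",   "VERB", -1, "root"),  # main verb, root of the sentence
    ("Bob",   "NOUN", 1, "dobj"),   # direct object of "saw"
]

def describe(parse):
    """Render each dependency as: head -relation-> dependent (POS)."""
    lines = []
    for word, pos, head, rel in parse:
        head_word = parse[head][0] if head >= 0 else "ROOT"
        lines.append(f"{head_word} -{rel}-> {word} ({pos})")
    return lines

for line in describe(parse):
    print(line)
# saw -nsubj-> Alice (NOUN)
# ROOT -root-> saw (VERB)
# saw -dobj-> Bob (NOUN)
```

Reading off the head and relation of every word is exactly what lets a downstream system answer "who did what to whom" for the sentence.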

"Our release includes all the code needed to train new SyntaxNet models on your own data, as well as Parsey McParseface," added Petrov. "SyntaxNet applies neural networks to the ambiguity problem. An input sentence is processed from left to right, with dependencies between words being incrementally added as each word in the sentence is considered."
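The left-to-right, incremental process Petrov describes resembles what is generally known as transition-based (arc-standard) dependency parsing. Here is a hedged sketch of that general technique; in SyntaxNet the neural network scores each transition, whereas this toy version simply replays a hand-supplied sequence of actions.

```python
# A minimal arc-standard transition parser (illustrative sketch only).
# Transitions: "shift" moves the next word onto the stack; "left" and
# "right" add a dependency arc between the top two stack items.
def parse_arc_standard(n_words, actions):
    """Apply a sequence of transitions and return arcs as (head, dependent)
    pairs of word indices."""
    stack, buffer, arcs = [], list(range(n_words)), []
    for action in actions:
        if action == "shift":       # consume the next input word
            stack.append(buffer.pop(0))
        elif action == "left":      # second-from-top depends on top
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif action == "right":     # top depends on the item below it
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs

words = ["Alice", "saw", "Bob"]
# Hand-written action sequence for "Alice saw Bob":
# shift Alice, shift saw, left-arc (saw <- Alice), shift Bob,
# right-arc (saw -> Bob).
arcs = parse_arc_standard(len(words), ["shift", "shift", "left", "shift", "right"])
print([(words[h], words[d]) for h, d in arcs])
# [('saw', 'Alice'), ('saw', 'Bob')]
```

The ambiguity problem Petrov mentions arises because at each step several transitions are usually legal; the neural network's job is to pick the right one given the words seen so far.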

According to Google, Parsey McParseface achieves 94 per cent accuracy on news text, compared with about 96 or 97 per cent for human linguists, but it doesn't do quite as well on random sentences from the web, where it manages about 90 per cent.

