Google to change AI forever with open source 'Parsey McParseface'

The SyntaxNet system understands human language with an incredible degree of accuracy


Google has open sourced its language parsing model, SyntaxNet, calling the English version Parsey McParseface.

The system understands human language with an incredible degree of accuracy, but attention has centred on its name. The choice follows a public vote to name a science research ship Boaty McBoatface, although the ship was in fact named after Sir David Attenborough.


The open sourcing of Google's parsing model means that the broader community can use the tool to advance artificial intelligence (AI). Machines could, for example, learn to understand sentences from a standard database of English-language journalism, the first step in their journey to take over the world.

"At Google, we spend a lot of time thinking about how computer systems can read and understand human language in order to process it in intelligent ways," explained Google senior staff research scientist, Slav Petrov.

"Today, we are excited to share the fruits of our research with the broader community by releasing SyntaxNet, an open source neural network framework implemented in TensorFlow that provides a foundation for Natural Language Understanding systems."

SyntaxNet is built on powerful machine learning algorithms that learn to analyse the linguistic structure of language, and that can explain the functional role of each word in a given sentence.


But how does it work? The system identifies the subject and object of a sentence and works out what the sentence means by determining the syntactic relationships between its words, which are represented in a dependency parse tree.

To see how this works, consider the dependency tree for a simple sentence: "Alice saw Bob".

The structure encodes that Alice and Bob are nouns and saw is a verb. The main verb 'saw' is the root of the sentence and Alice is the subject (nsubj) of saw, while Bob is its direct object (dobj). This tree structure helps Parsey McParseface work out the meaning of the sentence and analyse it correctly.
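
For readers who think in code, the tree described above could be written down roughly as follows. This is only an illustrative sketch in Python, not SyntaxNet's own data format: the `Token` class, its field names and the helper functions are assumptions made for this example, and only the labels nsubj and dobj come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One word in a dependency tree (hypothetical layout for illustration)."""
    index: int   # 1-based position in the sentence
    text: str    # the word itself
    pos: str     # coarse part-of-speech tag
    head: int    # index of the word this one depends on; 0 marks the root
    deprel: str  # label of the relation to the head


# "Alice saw Bob": 'saw' is the root, Alice its subject, Bob its direct object.
sentence = [
    Token(1, "Alice", "NOUN", 2, "nsubj"),
    Token(2, "saw", "VERB", 0, "ROOT"),
    Token(3, "Bob", "NOUN", 2, "dobj"),
]


def root(tokens):
    """Return the token that has no head, i.e. the main verb here."""
    return next(t for t in tokens if t.head == 0)


def dependents(tokens, head, deprel):
    """Return the words attached to `head` with the given relation label."""
    return [t.text for t in tokens if t.head == head.index and t.deprel == deprel]


main_verb = root(sentence)
subject = dependents(sentence, main_verb, "nsubj")
direct_object = dependents(sentence, main_verb, "dobj")
print(main_verb.text, subject, direct_object)
```

Once a sentence is stored this way, asking "who did what to whom" becomes a matter of following the head links from the root.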

"Our release includes all the code needed to train new SyntaxNet models on your own data, as well as Parsey McParseface," added Petrov. "SyntaxNet applies neural networks to the ambiguity problem. An input sentence is processed from left to right, with dependencies between words being incrementally added as each word in the sentence is considered."
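
Petrov's description of left-to-right, incremental processing can be illustrated with a toy parser. The sketch below uses the textbook "arc-standard" scheme with a stack and a buffer, which is an assumption: the article does not say which scheme SyntaxNet uses. The list of actions is also written by hand here, whereas in SyntaxNet a neural network would choose each step. The function name `parse` and the action strings are hypothetical.

```python
def parse(words, actions):
    """Apply arc-standard style actions to a sentence, left to right.

    Returns a list of (head, label, dependent) arcs in the order they
    were added. In a real parser a trained model picks each action;
    here the caller supplies them.
    """
    stack = []            # words that are partly processed
    buffer = list(words)  # words not yet read
    arcs = []
    for action in actions:
        name, _, label = action.partition(":")
        if name == "SHIFT":
            # Read the next word from the input.
            stack.append(buffer.pop(0))
        elif name == "LEFT":
            # The word below the top of the stack depends on the top word.
            dependent = stack.pop(-2)
            arcs.append((stack[-1], label, dependent))
        elif name == "RIGHT":
            # The top word depends on the word below it.
            dependent = stack.pop()
            arcs.append((stack[-1], label, dependent))
        else:
            raise ValueError(f"unknown action: {action}")
    return arcs


words = ["Alice", "saw", "Bob"]
# Hand-written action sequence standing in for the model's decisions.
actions = ["SHIFT", "SHIFT", "LEFT:nsubj", "SHIFT", "RIGHT:dobj"]
arcs = parse(words, actions)
print(arcs)
```

Each action either reads one more word or attaches two words already seen, which is how dependencies accumulate as the sentence is consumed. The ambiguity Petrov mentions arises because, at many steps, more than one action is plausible.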

According to Google, Parsey McParseface achieves 94 per cent accuracy on news text, compared with about 96 or 97 per cent for human linguists. It does less well on random sentences from the web, where it reaches about 90 per cent accuracy.
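
The article does not spell out how these percentages are calculated. One common way to score a dependency parser, assumed here purely for illustration, is to count the share of words that were attached to the correct head. The function name `attachment_score` and the sample data are hypothetical.

```python
def attachment_score(gold_heads, predicted_heads):
    """Fraction of words whose predicted head matches the reference head."""
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("both lists must cover the same words")
    if not gold_heads:
        raise ValueError("cannot score an empty sentence")
    correct = sum(g == p for g, p in zip(gold_heads, predicted_heads))
    return correct / len(gold_heads)


# Heads for "Alice saw Bob" (0 marks the root), as a linguist would mark them.
gold = [2, 0, 2]
# A made-up parser output that wrongly attaches "Bob" to "Alice".
predicted = [2, 0, 1]
score = attachment_score(gold, predicted)
print(f"{score:.0%}")
```

On this reading, a figure of 94 per cent would mean that roughly 94 in every 100 words are linked to the right head.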


