Companies House eyes potential of open data

Open data

Companies House hopes to harness the power of open data to help developers build better apps based on government information.

The business registry, part of the Department for Business, Innovation and Skills, is revamping its web services as it prepares to provide company information for free, instead of charging the current £1 fee per document.

It is replacing its relational Oracle database with NoSQL MongoDB infrastructure to support this, and releasing new REST-based APIs to developers.

Its head of IT architecture, Mark Fairhurst, told IT Pro one potential benefit of the overhaul was linking business data with that of other Whitehall departments to allow developers to build more sophisticated applications.

"None of this is set in stone but we're looking at how we can make that data linked, how we can join it up," he said.

"We have postcode data within our datasets we should be able to link directly with Ordnance Survey [data], for example."

Fairhurst believes linking datasets could help spur innovation among developers.
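As a rough illustration of the kind of linking Fairhurst describes, the sketch below joins two small datasets on a shared postcode. The field names, sample records and the idea of attaching coordinates are assumptions made for the example; they do not reflect the actual Companies House or Ordnance Survey formats.

```python
# Hypothetical sketch: field names and sample values are invented for illustration.

def normalise_postcode(postcode: str) -> str:
    """Strip whitespace and upper-case so 'cf00 0aa' and 'CF000AA' compare equal."""
    return "".join(postcode.split()).upper()


def link_by_postcode(companies, locations):
    """Attach the matching location record (or None) to each company record."""
    # Index the location dataset once so each company needs a single lookup.
    by_postcode = {normalise_postcode(loc["postcode"]): loc for loc in locations}
    linked = []
    for company in companies:
        match = by_postcode.get(normalise_postcode(company["postcode"]))
        linked.append({**company, "location": match})
    return linked


if __name__ == "__main__":
    companies = [{"number": "01234567", "postcode": "cf00 0aa"}]
    locations = [{"postcode": "CF00 0AA", "easting": 100, "northing": 200}]
    print(link_by_postcode(companies, locations))
```

Normalising the postcode before matching matters in practice, since two datasets rarely agree on spacing and case.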

"We're hoping the new technology will generate some innovation in the marketplace and people will come along and start to consume the data. With the removal of the monetary term, they may start to play with the data.

"In theory all of this making data open should allow the market to start to do those kind of things," he said.

MongoDB

But the first step was redesigning Companies House's "tired" web-based services from the ground up, starting with replacing the Oracle relational database underpinning them with a MongoDB NoSQL system.

Fairhurst said: "What we currently have with our existing web-based services is relational database technology, we wanted to rewrite those services.

"They are becoming tired from the user experience viewpoint, and there's Government drivers to make things simpler and easier for customers."

The Oracle database meant the web services were designed in a relational way, sending queries to pull data from "three or four" different tables via Oracle's XML Gateway.

"That data then needs to be brought together and the developers need to do a little SQL to do that," said Fairhurst.

While it worked, REST-based APIs are considered a much easier way for developers to hook into datasets.
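The contrast between the two approaches can be sketched in a few lines. This is only an illustration under stated assumptions: the table and field names, the sample company and the URL layout are all invented, an in-memory SQLite database stands in for Oracle, and a plain dictionary stands in for a document collection.

```python
import sqlite3
from typing import Optional

# All table, column and field names are illustrative assumptions,
# not the real Companies House schema.


def build_relational_db() -> sqlite3.Connection:
    """Create an in-memory database that splits one company across three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE company (number TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE address (company_number TEXT, line_1 TEXT, postcode TEXT);
        CREATE TABLE status  (company_number TEXT, status TEXT);
        INSERT INTO company VALUES ('01234567', 'Example Ltd');
        INSERT INTO address VALUES ('01234567', '1 Example Street', 'CF00 0AA');
        INSERT INTO status  VALUES ('01234567', 'active');
        """
    )
    return conn


def address_via_join(conn: sqlite3.Connection, number: str) -> Optional[dict]:
    """Relational style: the caller writes SQL to bring the tables together."""
    row = conn.execute(
        """
        SELECT c.name, a.line_1, a.postcode, s.status
        FROM company AS c
        JOIN address AS a ON a.company_number = c.number
        JOIN status  AS s ON s.company_number = c.number
        WHERE c.number = ?
        """,
        (number,),
    ).fetchone()
    if row is None:
        return None
    return dict(zip(("name", "line_1", "postcode", "status"), row))


# Document style: a dict stands in for a collection keyed by company number.
ADDRESS_COLLECTION = {
    "01234567": {
        "name": "Example Ltd",
        "line_1": "1 Example Street",
        "postcode": "CF00 0AA",
        "status": "active",
    },
}


def address_url(number: str) -> str:
    """Build a hypothetical REST URL for one company's address."""
    return f"https://api.example.invalid/company/{number}/address"


def address_via_document(number: str) -> Optional[dict]:
    """Document style: one lookup returns the record already assembled."""
    return ADDRESS_COLLECTION.get(number)
```

Both functions return the same record for the sample company; the difference is that the relational version requires the caller to know how the tables relate, while the document version needs only the company number.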

"As we were developing a new system we took the opportunity to look at this," explained Fairhurst.

The way he and his team designed the system with MongoDB meant it became "a collection of data".

And by providing REST-based APIs, the datasets became endpoints in themselves, he added.

"You come along and say 'I want the address for this company number', and there's a REST-based URL for that," Fairhurst said. "Someone who understands the technology can put that in and it should bring back the address. That's a collection in Mongo database that literally just returns that information in that format."

Scalability

Another benefit of the MongoDB database, and a reason Companies House picked it over other NoSQL options, was its scalability, with the organisation expecting many more queries when it scraps the fee in June next year.

Fairhurst said: "We've developed a system now that is more easily scalable so hopefully as demand goes up for our data as it goes free, we have a system there that can easily scale to cope with the demand."

UI

A public benefit of the overhaul, rather than one for developers, is the user interface.

The organisation is improving the search functionality for users, as well as the general appearance of the site.

The revamped Companies House site will look consistent with other Gov.uk sites, in line with guidance from the Government Digital Service (GDS), which is moving all Whitehall websites to the Gov.uk domain by March 2015.

Companies House is moving to Gov.uk next Wednesday.

"The new service follows those guidelines and fits in with that design strategy," confirmed Fairhurst.

The revamped Companies House site is currently in private beta with a limited number of users, and will go to public beta soon.