Did government failures lead to EU referendum voter registration website crash?
Sources blame GDS for EU referendum website failure
Failures to design the voter registration website for the cloud, or to make it scalable, may have forced the government to enact emergency legislation to allow people to register to vote in the EU referendum, according to sources speaking to IT Pro.
Tens of thousands of UK citizens were unable to register to vote in the upcoming referendum when the government's website crashed under what the Cabinet Office called "unprecedented traffic" on Tuesday, leading Whitehall to take the extraordinary step of extending the registration deadline until 11.59pm tonight.
However, sources close to the public sector with knowledge of the matter have blamed the government for the crash, revealing it used an in-house private cloud to host the service, meaning it could not scale to meet the high demand.
IT Pro has also discovered the application - designed by GDS (Whitehall's internal digital team) with help from its regular coding partners, including Kainos - was not built with the ability to run in multiple zones, meaning that when the private cloud fell over, users were not redirected to another instance of the website.
Research house Kable's chief analyst, Jessica Figueras, told IT Pro the failure raised serious questions over whether the government had thought through the consequences of mission-critical digital services failing at crucial times, saying it could have thrown any referendum result into doubt.
"It's not like an online shopping website going down," she said. "When you have IT service issues that affect democracy, that's a dangerous situation."
Despite the G-Cloud framework allowing the government to pick from thousands of cloud providers like IBM, Skyscape and HP Enterprise, Whitehall chose to host its register to vote application on servers belonging to FCO Services, a trading fund set up by the Foreign and Commonwealth Office.
While most Gov.uk website pages are hosted by Carrenza, clicking the Start Now button on the Register to vote page redirects users to the application hosted by FCO Services.
The service describes itself as running "integrated, secure services worldwide to the FCO and other UK government departments, supporting the delivery of government agendas", and sources said its hosting resembles a small private cloud with limited capacity to scale.
One source, a G-Cloud vendor, said: "They have a very small infrastructure so when I see a massive increase in traffic they just don't scale up and they run into availability issues."
Public data shows 515,000 people visited the Register to vote page on 7 June, the day it crashed, and the government separately confirmed 214,000 applications were made in the last hour of availability.
The largest previous spike was in the run-up to the last General Election, when 469,000 people registered to vote on 20 April 2015 - though the site did not appear to crash that day.
Oliver Letwin MP today told Parliament the website crashed from a traffic spike "three times as intense as the spike before the general election", but this does not appear to be backed up by the government's own data. A Cabinet Office spokesman explained Letwin's figures are "based on a longer term average" spread across several days.
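As a rough check on the "three times" claim, the two daily peaks quoted above can be compared directly. This is a minimal sketch using only the figures reported in this article. The variable names are illustrative, and the two numbers measure different things (page visits in 2016, registrations in 2015), so the ratio is indicative at best.

```python
# Figures as reported in this article; variable names are illustrative.
peak_2016_visits = 515_000         # visits to the Register to vote page, 7 June 2016
peak_2015_registrations = 469_000  # registrations on 20 April 2015

# Ratio of the 2016 daily peak to the 2015 daily peak.
ratio = peak_2016_visits / peak_2015_registrations
print(f"2016 peak is {ratio:.2f} times the 2015 peak")
```

On these daily totals the 2016 peak is only around a tenth higher than the 2015 peak, which fits the Cabinet Office's explanation that Letwin's figure rests on a different, longer-term baseline.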
Kable analyst Figueras said GDS has experienced previous failures and traffic spikes in other digital services.
Referring to an incident last year when a government site crashed as people tried to sign up for digital tax discs, she said: "They have experienced this sort of thing going wrong before. Voter registration is a deadline-driven service and they lend themselves to service spikes, it's something you need to factor in from the start."
Letwin claimed the government has now doubled the capacity of the voter registration website to deal with future spikes, but the Cabinet Office declined the opportunity to explain how it has managed this.
FCO Services declined to comment on this article.
Why wasn't the voter registration application designed with better backup support?
Letwin also said in Parliament that the sheer number of applications means "it's no surprise the website fell over", but IT Pro sources claimed the incident was avoidable.
In fact, they went as far as to say that GDS must shoulder blame for failing to design the Register to vote application to be "cloud native."
Cloud native applications are "built for failure", said one source, so that if one availability zone fails, the app continues to work by connecting users to a separate instance based in another availability zone that is running simultaneously.
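The failover behaviour the source describes can be sketched in a few lines. This is an illustration only, not how the Register to vote service or any real load balancer is implemented. The names `Zone`, `ZoneUnavailable` and `serve_with_failover` are hypothetical, and each "zone" is modelled as a plain function rather than a real hosting environment.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ZoneUnavailable(Exception):
    """Raised by a zone handler when that zone cannot serve the request."""


@dataclass
class Zone:
    name: str
    handler: Callable[[str], str]  # takes a request, returns a response


def serve_with_failover(request: str, zones: Sequence[Zone]) -> str:
    """Try each zone in order and return the first successful response."""
    failures = []
    for zone in zones:
        try:
            return zone.handler(request)
        except ZoneUnavailable as exc:
            # Record the failure and fall through to the next zone.
            failures.append(f"{zone.name}: {exc}")
    raise RuntimeError("all zones failed: " + "; ".join(failures))


def failed_zone(request: str) -> str:
    # Stand-in for a zone that has run out of capacity.
    raise ZoneUnavailable("capacity exhausted")


def healthy_zone(request: str) -> str:
    # Stand-in for a second instance running at the same time elsewhere.
    return f"accepted {request}"


zones = [Zone("zone-a", failed_zone), Zone("zone-b", healthy_zone)]
result = serve_with_failover("registration-123", zones)
print(result)
```

In this sketch the request is still accepted when the first zone fails, because a second instance is already running. With only one zone in the list, the same failure would reach the user.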
"The last couple of days I would have thought this is a mission-critical application," the G-Cloud vendor source told IT Pro. "You need a cloud native application so if a provider fails, it works elsewhere. You need a multiple cloud strategy."
However, the application, written in 2013 by GDS, was not designed to be cloud native, according to our sources - with GDS lacking the skills to build such software.
"I don't think that level of skill exists, it certainly doesn't exist within government," the source said. "There's very poor guidance and education within government of writing a cloud native application."
GDS wrote the Register to vote application largely in-house, and invited development shops including Kainos to contribute without going through a formal tender process - apparently the norm when developing in-house.
A spokesman for Kainos told IT Pro only a handful of its engineers helped with the project, and said: "Our staff have a deep and recognised level of expertise, acquired over many years. This is acknowledged by our many government clients, who have selected Kainos to help on multiple complex digital projects."
Calling the app an "early" project from GDS, Kable's Figueras said she hoped the body has learned lessons from the experience.
"I would strongly suspect that since the service, they have learned a lot of lessons and I would hope and trust these services aren't being designed in this way again," she said.
But the anonymous vendor insisted the government still misunderstands the cloud, blaming cloud providers for problems unrelated to the services they provide.
"There are times they will blame us for application issues," the source said. "The fact they say 'it's your fault the cloud native application failed us' demonstrates the lack of education and the level of skill they have."
They said GDS must set out guidance on how to build cloud native applications for all future developments.
Kainos's spokesman added: "We're still working with the Cabinet Office to determine the exact sequence of events [of what went wrong]."
Website crash raises important questions over consequences of digital service failures
The website outage raises questions over the increasingly important role GDS developers play in supporting key elements of the UK's democracy, Kable's Figueras argued.
GDS set out in 2012 with a remit to digitise public services, making them more accessible and easier for citizens to use, but in practice developers' actions can influence the real world outside of the applications they create, the analyst said.
"We often hear government services are too difficult to navigate and use and the promise was digital services would simplify that," Figueras said. "But the danger is the complicated work isn't being done; thinking of the implications when these services fall over."
She explained: "It could lead to legal questions and even questions about the outcome of the referendum - nightmare scenarios that are outside the realm of the designer.
"Government services designers always need to be thinking of the wider implications and where their service fits within the whole of government systems and even our wider democracy in this service."
The Cabinet Office declined to address the specific points raised in this article, but a spokesman said: "There was a problem with the register to vote site, due to the unprecedented demand we experienced. We are not going to be discussing any further details until we have looked into exactly what happened, but clearly there will be lessons learned.
"After the website problems on Tuesday night, the deadline for registering to vote was extended for a further 48 hours."
IT Pro will continue to follow this story and will update it as more information emerges once the government has concluded its investigation into the issues.