Decades-old bug wiped out UK air traffic control

Report on December Nats outage reveals bug grounded aircraft for five hours

A 20-year-old bug was behind a software outage that grounded planes at Heathrow for five hours in December, according to an independent inquiry into the incident.

A failure in the National Air Traffic Services (Nats) system shut down the air-traffic control centre at Swanwick at the end of last year, causing chaos at Heathrow.

Then business secretary Vince Cable told the BBC at the time that the Nats system was "ancient" and that the organisation behind it was "skimping" on investment.

Now, an official inquiry into the incident has revealed a bug in the System Flight Server was at the root of the fault - and that the flaw had been in the software since the 1990s.

The server was rolled out at Swanwick in 2002, and the bug was already present in it then. 

Despite the age of the flaw, the inquiry's findings, titled NATS System Failure 12 December 2014 Final Report, didn't criticise Nats.

Instead it said that "it is unrealistic to expect that software faults will not be introduced in development" of such complex systems. 

Nats' processes "are thorough and professional", according to the report, and there's a "strong and effective process" for software updates. 

"The resultant integrity appears better than would be expected for software of this importance," it added. 

The system is already set for an upgrade as a new Europe-wide system called SESAR is rolled out in the next few years. The report said that deployment shouldn't be accelerated in light of the discovered bug, as "a search for earlier benefits would be likely to lead to shortcuts being taken".

It also made a series of recommendations for SESAR to avoid future problems, including better hardware redundancy and software audits.

Impressive achievement

The report had praise for Nats' engineers, saying that "identifying a software fault in such a large system (the total application exceeds two million lines of code), within only a few hours, is a surprising and impressive achievement".

The system was taken offline at 2.55pm that day, and mostly restored less than an hour later; by 7pm, engineers believed they had uncovered the reason behind the fault, with full service back by 8.30pm. 

Despite reports at the time saying UK airspace was completely closed, it wasn't - the delays arose because controllers had to use manual methods to manage flight paths.

"NATS estimates that... a maximum of 1,900 flights and 230,000 passengers were affected during the afternoon and evening of 12 December," the report said. "Additionally several airlines reported some level of cancellations and flight disruption running into 13 December with approximately 60 aircraft and 6,000 passengers affected."
