Cloudflare servers "panic" after leap second

2016's extra second created a bug knocking servers offline

A leap second added to the end of 2016 sent servers at DNS security service Cloudflare into a "panic", causing some of them to briefly drop offline.

The extra second produced a 61-second minute, which hit a small number of Cloudflare servers at midnight UTC on New Year's Day because the company's DNS code could not handle the clock appearing to step backwards.

Affected customers would have seen an error message saying servers could not be reached, instead of being directed to the website they were trying to access.

The extra second was added to keep Coordinated Universal Time (UTC) in step with the Earth's rotation, which is gradually slowing. However, RRDNS, the DNS software used by Cloudflare, works under the assumption that 'time cannot go backwards', and the slight extension to 2016 caused the code to perceive a "negative resolution time".
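A minimal sketch of how a "negative resolution time" can arise, assuming (as one common way of handling a leap second) that the wall clock is stepped back by one second between two readings. The function name and the timestamp values are hypothetical and are not taken from Cloudflare's code.

```python
def resolution_time_ms(sent_ms: int, received_ms: int) -> int:
    """Elapsed time computed naively from two wall-clock readings (milliseconds)."""
    return received_ms - sent_ms


# Hypothetical wall-clock readings, in milliseconds since the Unix epoch.
LEAP_STEP_MS = 1000              # assumed: the clock is stepped back by one second
sent_ms = 1_483_228_799_950      # reading taken just before the step
true_elapsed_ms = 100            # the lookup really took 100 ms

# The second reading is taken after the clock has been stepped back.
received_ms = sent_ms + true_elapsed_ms - LEAP_STEP_MS

print(resolution_time_ms(sent_ms, received_ms))  # -900
```

The lookup took a positive amount of real time, but the subtraction reports -900 ms because the two readings straddle the step.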

"A number went negative when it should always have been, at worst, zero," said Cloudflare programmer John Graham-Cumming. "A little later this negative value caused RRDNS to panic... the net effect was that some DNS resolutions to some Cloudflare managed web properties failed."
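The article does not say which operation failed on the negative value, so the following is only an illustration of the general pattern: code that is safe for zero or positive input can raise an error when handed a negative number. The helper `pick_delay_ms` is hypothetical; the only library behaviour relied on is that Python's `random.randrange(n)` raises `ValueError` when `n` is not positive.

```python
import random
from typing import Optional


def pick_delay_ms(round_trip_ms: int) -> int:
    """Hypothetical helper: random delay between 0 and round_trip_ms inclusive.

    Works for zero or positive input; random.randrange raises ValueError
    when its argument is zero or negative, so a negative round trip fails.
    """
    return random.randrange(round_trip_ms + 1)


def try_pick(round_trip_ms: int) -> Optional[int]:
    """Return a delay, or None if the helper failed on the input."""
    try:
        return pick_delay_ms(round_trip_ms)
    except ValueError:
        return None


print(try_pick(0))     # 0: zero is the worst case the helper expects
print(try_pick(-900))  # None: the negative value triggers the error
```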

The problem is believed to have affected only a small number of customers using CNAME DNS records with the company, and fewer than 1% of user requests to its servers resulted in an error.

"The most affected machines were patched in 90 minutes and the fix was rolled out worldwide by 0645 UTC," added Graham-Cumming. "We are sorry that our customers were affected, but we thought it was worth writing up the root cause for others to understand."

The patch allows the code behind the DNS service to 'normalise' the value in the unlikely event that time is perceived to have skipped backwards.
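One way to read 'normalise' is to clamp a negative elapsed time to zero. A second common defence is to measure durations with a monotonic clock rather than the wall clock. The sketch below shows both under those assumptions; the function names are hypothetical and this is not Cloudflare's actual patch.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def normalised_elapsed_ms(sent_ms: int, received_ms: int) -> int:
    """Clamp the elapsed time to zero if the clock appears to have gone backwards."""
    return max(0, received_ms - sent_ms)


def timed_call(fn: Callable[[], T]) -> Tuple[T, float]:
    """Time a call with time.monotonic(), which is documented never to go backwards."""
    start = time.monotonic()
    result = fn()
    return result, time.monotonic() - start


print(normalised_elapsed_ms(1000, 100))  # 0: the negative reading is clamped
print(normalised_elapsed_ms(100, 250))   # 150: normal readings pass through
```

Clamping hides the bad measurement but keeps later code safe. The monotonic clock avoids producing the bad measurement in the first place.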

Although widespread software meltdowns have yet to materialise after a leap second, the change in timestamps continues to hamper high-profile tech companies. Both Twitter and Android were hit by 2015's mid-year leap second, as the services started to display notifications with incorrect dates and times.

Other major tech providers, including Instagram, Netflix and Amazon Web Services, also experienced crippling web crashes in 2015. This time, however, the disruption appears to be on a much smaller scale.

Google took a different approach to 2016's leap second. Rather than inserting it all at once, its 'smeared time' spread the extra second across many hours around the end of 31 December 2016 by making each second on its clocks fractionally longer, meaning the company was able to keep all servers that use Google's Network Time Protocol (NTP) service in time with the change.
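The idea behind a smear can be sketched as a linear ramp: over a chosen window the clock absorbs a growing fraction of the extra second, so no single second is repeated or skipped. The 20-hour window and the function name below are assumptions made for illustration, not details given in this article.

```python
ASSUMED_WINDOW_S = 20 * 3600  # assumed smear window: 20 hours, in seconds


def smear_offset_s(seconds_into_window: float,
                   window_s: float = ASSUMED_WINDOW_S) -> float:
    """Fraction of the one extra second absorbed so far, using a linear ramp."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    # Clamp progress to the range 0..1 so times outside the window are handled.
    progress = min(max(seconds_into_window / window_s, 0.0), 1.0)
    return progress * 1.0  # one leap second in total


print(smear_offset_s(0))          # 0.0 at the start of the window
print(smear_offset_s(10 * 3600))  # 0.5 halfway through
print(smear_offset_s(20 * 3600))  # 1.0 once the full second is absorbed

# Each smeared second is longer than a real one by 1 / window_s seconds.
print(1 / ASSUMED_WINDOW_S * 1e6, "microseconds of stretch per second")
```

With a window this long, each second is stretched by only about 14 microseconds, a small enough change that most software should not notice it.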
