REMINDER: LEAP SECOND

Tore Anderson tore at fud.no
Wed Jun 24 06:33:14 UTC 2015


* Harlan Stenn <stenn at ntp.org>

> Matthew Huff writes:
> > A backward step is a known issue and something that people are more
> > comfortable dealing with as it can happen on any machine with a
> > noisy clock crystal.
> 
> A clock crystal has to be REALLY bad for ntpd to need to step the
> clock.
> 
> > Having 61 seconds in a minute or 86401 seconds in a day is a
> > different story.
> 
> Yeah, leap years suck too.
> 
> And those jumps around daylight savings time.

Hi Harlan,

Leap years and DST adjustments have never caused us any major
issues. It seems these code paths are well tested and work fine.

The leap second in 2012 however ... total and utter carnage.
Application servers, databases, etc. falling over like dominoes. All
hands on deck in the middle of the night to clean up. It took days
before we stopped finding broken stuff.

Maybe all the bugs from 2012 have been fixed. Maybe they haven't. Maybe
new ones have been introduced. I'm not terribly optimistic. One example
I'm aware of: Cisco Nexus 5010/5020 switches need software that was
released as late as 29th of April this year in order to be immune to
the crash&burn leap second bug CSCub38654. The official «Cisco
Suggested release based on software quality, stability and longevity»
is older. Go figure.

In any case, we're certainly not going to risk it. So our plan is to
disconnect our local stratum-2s from their upstreams on June 29th so
they (and more crucially, their downstream clients) remain oblivious to
the leap second. Come July 1st, we'll reconnect them. The clients'
clocks will be 1s (plus any drift) off at that point, but as we're
running ntpd with the "-x" option, that shouldn't cause backwards
stepping. Running with slightly incorrect clocks for a few days is a
small price to pay to avoid a repeat of 2012's mayhem.
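
As a rough back-of-the-envelope check on the reconnect step, here is a
minimal sketch of the arithmetic. It assumes ntpd's usual maximum slew
rate of 500 ppm and the 600 s step threshold that "-x" sets; the
function and constant names are made up for illustration, and real
convergence will take longer than this lower bound since ntpd does not
slew at the maximum rate the whole time.

```python
# Sketch only: estimates how long slewing away a clock offset takes.
# Assumptions (not from any measurement on our systems):
#   - ntpd slews at no more than 500 ppm (500 microseconds per second)
#   - with "-x" the step threshold is 600 s instead of the default 0.128 s
MAX_SLEW_PPM = 500
STEP_THRESHOLD_WITH_X = 600.0  # seconds


def slew_duration_seconds(offset_s, slew_ppm=MAX_SLEW_PPM):
    """Lower bound on wall-clock seconds needed to slew away offset_s."""
    return abs(offset_s) * 1_000_000 / slew_ppm


def will_step(offset_s, threshold_s=STEP_THRESHOLD_WITH_X):
    """True if the offset exceeds the threshold, so ntpd would step."""
    return abs(offset_s) > threshold_s


if __name__ == "__main__":
    offset = 1.0  # the missed leap second, ignoring drift
    print("would step:", will_step(offset))
    print("minimum slew time (s):", slew_duration_seconds(offset))
```

Under those assumptions a 1 s offset stays far below the step
threshold and needs at least 2000 s (a bit over half an hour) of
slewing, which is why we expect no backwards step after reconnecting.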

Tore


