Need trusted NTP Sources
andriy.bilous at gmail.com
Sun Feb 9 20:08:00 UTC 2014
Best practice is five. =) I don't remember if it's in the FAQ on ntp.org or in
David Mills' book. Your local clock is kind of a gullible "push-over" which
will "vote" for the "party" providing the most reasonable data. The algorithm
filters out insane sources which run too far from the rest and then
groups the sane sources into 2 "parties" - your clock will follow the one whose
runners are closer to each other. That is why an odd number of trustworthy
sources is required, at least at the start. With 2 sources you will blindly
follow the one which is closer to your own clock. You also run the
risk of degrading into this situation when you lose 1 out of 3 sources.
Four is again 2:2, and only with five do you have a good chance to start
disciplining your clock in the right direction at the right pace, so when
1 source is lost you (most probably) won't run into insanity.
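
A toy sketch of the "voting" picture above, in Python. This is not ntpd's
actual selection code; the function name, the millisecond offsets and the
10 ms tolerance are all made up for illustration. It only shows why 2 (or
2:2) sources can leave you with a tie while 5 sources give a clear majority.

```python
def largest_agreeing_group(offsets_ms, tolerance_ms=10.0):
    """Toy model: find the largest group of sources whose offsets agree.

    Returns (group, tie). `tie` is True when a second, disjoint group of
    the same size exists, i.e. there is no clear majority to follow.
    """
    ordered = sorted(offsets_ms)
    if not ordered:
        return [], False

    # For each source, collect the run of sources within tolerance above it.
    windows = []
    for start, low in enumerate(ordered):
        end = start
        while end + 1 < len(ordered) and ordered[end + 1] - low <= tolerance_ms:
            end += 1
        windows.append((start, end))

    best = max(windows, key=lambda w: w[1] - w[0])
    size = best[1] - best[0]
    # A tie is an equally large window that shares no source with the best one.
    tie = any(
        w[1] - w[0] == size and (w[0] > best[1] or w[1] < best[0])
        for w in windows
    )
    return ordered[best[0]:best[1] + 1], tie


if __name__ == "__main__":
    for sources in ([0, 500], [0, 2, 500, 503], [0, 2, 3, 500, 503]):
        print(sources, largest_agreeing_group(sources))
```

With two sources, or four split 2:2, the toy reports a tie; with five sources
split 3:2 it reports the group of three and no tie.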
On Sun, Feb 9, 2014 at 9:03 AM, Saku Ytti <saku at ytti.fi> wrote:
> On (2014-02-08 19:43 -0500), Jay Ashworth wrote:
> > In the architecture I described, though, is it really true that the odds
> > of the common types of failure are higher than with only one?
> I think so. Let's assume arbitrarily that the probability of an NTP server not
> starting to give incorrect time is 99% over 1 year.
> Then the probability of neither of two servers giving incorrect time is
> 0.99**2, i.e. 98%, so two NTP servers would be 1 percentage point more likely
> to give incorrect time than one over 1 year.
> Obviously the chance of working is more than 99%, maybe it's something like
> 99.999%? And is that really the typical failure mode, or is the typical
> failure mode complete loss of connectivity? Two NTP servers would protect
> from this. However, loss of connectivity has minor impact on clients, while
> wrong time has major impact on clients.
> Maybe if loss of connectivity is typically fixed in a somewhat short period
> of time, single NTP always wins; if loss of connectivity is typically fixed
> in a very long period of time, single NTP loses.
> I don't really have exact data, but best practice is >2. Matthew said 4, which
> gives the advantage that after a single failure you are still operating
> and do not have urgency to fix; with 3, after a single failure another failure
> must not occur before it is fixed.
> I think 3 is enough; networks are typically designed to handle 1 arbitrary
> failure at a time, and 2 arbitrary failures in most networks, when
> chosen correctly, will cause SLA-breaking faults (cheaper to pay SLA
> compensation than to recover from any 2 failures).
> But NTP servers are cheap, so if you want to be robust and recover from n
> false tickers, have 3+n.
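
The arithmetic in the quoted message can be checked with a few lines of
Python. This is only a sketch of that back-of-the-envelope model: the 99%
figure is the arbitrary one from the message, the function names are made up
here, and the servers are assumed to fail independently (which 0.99**2
implies but the message does not state).

```python
def p_any_bad(p_good: float, servers: int) -> float:
    """Chance that at least one of `servers` independent servers goes bad,
    given each one stays good with probability `p_good` over the period."""
    return 1.0 - p_good ** servers


def servers_needed(false_tickers: int) -> int:
    """The 3+n rule of thumb from the message: survive n false tickers."""
    return 3 + false_tickers


if __name__ == "__main__":
    one = p_any_bad(0.99, 1)
    two = p_any_bad(0.99, 2)
    print(f"one server bad within a year:  {one:.4f}")
    print(f"either of two bad in a year:   {two:.4f}")
    print(f"difference in percentage pts:  {(two - one) * 100:.2f}")
    print("servers to survive 2 false tickers:", servers_needed(2))
```

Note the model only counts how often some server goes bad; it says nothing
about whether the client can tell which one is wrong, which is the point of
having more than two.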