NTP Issues Today
Jared Mauch
jared at puck.nether.net
Tue Nov 20 19:39:03 UTC 2012
On Nov 20, 2012, at 2:28 PM, Jay Ashworth <jra at baylink.com> wrote:
> ----- Original Message -----
>> From: "Leo Bicknell" <bicknell at ufp.org>
>
>> To protect against two falseticking servers (tick and tock, as we saw on
>> the 19th) you need _FIVE_ servers minimum configured if they are both in
>> the list. More importantly, if you want to protect against a source
>> (GPS, CDMA, IRIG, WWVB, ACTS, etc) false ticking, you need a minimum of
>> _FOUR_ different source technologies in the list as well.
>>
>> It's not hard, my box that I posted the logs from peers with 18
>> servers using 8 source technologies, all freely available on the Internet...
>
> I'm curious, Leo, what your internal setup looks like. Do you have an
> internal pair of masters, all slaved to those externals and one another,
> with your machines homed to them? Full mesh? Or something else?
>
> In my last big gig, it was recommended to me that I have all the machines
> which had to speak to my DBMS NTP *to it*, and have only it connect to the
> rest of my NTP infrastructure. It coming unstuck was of less operational
> impact than *pieces of it* going out of sync with one another...
Here's a sample NTP config from one of my systems.
-- snip --
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server 0.fedora.pool.ntp.org
server 1.fedora.pool.ntp.org
server 2.fedora.pool.ntp.org
server 3.fedora.pool.ntp.org
#
server 0.us.pool.ntp.org iburst maxpoll 9
server 1.us.pool.ntp.org iburst maxpoll 9
server 2.us.pool.ntp.org iburst maxpoll 9
server 129.250.35.250 iburst maxpoll 9
server 129.250.35.251 iburst maxpoll 9
-- snip --
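As a rough sanity check on a config like the one above, here is a minimal Python sketch that counts the active server lines and estimates how many falsetickers they could outvote. The helper names (count_servers, tolerated_falsetickers) are made up for illustration, and the majority rule (n servers outvote at most (n - 1) // 2 bad ones) is an assumption chosen to match the "five servers for two falsetickers" figure quoted earlier; it says nothing about source-technology diversity.

```python
# Hypothetical helpers for illustration; not part of ntpd or ntpq.

def count_servers(conf_text):
    """Count active 'server' lines, ignoring comments and blank lines."""
    count = 0
    for line in conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if fields and fields[0] == "server":
            count += 1
    return count


def tolerated_falsetickers(n_servers):
    """Assumed rule: a majority must agree, so n servers can outvote
    at most (n - 1) // 2 falsetickers (5 servers -> 2)."""
    return max(0, (n_servers - 1) // 2)


CONF = """\
# Use public servers from the pool.ntp.org project.
server 0.fedora.pool.ntp.org
server 1.fedora.pool.ntp.org
server 2.fedora.pool.ntp.org
server 3.fedora.pool.ntp.org
#
server 0.us.pool.ntp.org iburst maxpoll 9
server 1.us.pool.ntp.org iburst maxpoll 9
server 2.us.pool.ntp.org iburst maxpoll 9
server 129.250.35.250 iburst maxpoll 9
server 129.250.35.251 iburst maxpoll 9
"""

if __name__ == "__main__":
    n = count_servers(CONF)
    print(n, "servers configured; can outvote up to",
          tolerated_falsetickers(n), "falsetickers")
```

Treat the count as an upper bound: two pool hostnames can resolve to the same machine, so the number of distinct peers may be smaller than the number of server lines.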
You can audit its operation like this:
nat:~$ ntpq -p -n -c ass
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-129.250.35.250  164.244.221.197  2 u   68  512  377   19.248   -0.135   3.195
+129.250.35.251  192.5.41.40      2 u  439  512  377   41.817    1.109  15.660
-206.57.44.17    204.123.2.5      2 u  126  512  377   37.133   -6.443   9.631
+4.53.160.75     209.81.9.7       2 u   48  512  377   25.209    1.551   8.804
-64.73.32.135    192.5.41.41      2 u  349  512  377   23.418   -0.703   1.721
*50.116.38.157   64.250.177.145   2 u  380  512  377   43.021    1.267   2.136
+208.87.221.228  10.0.22.49       2 u  517  512  377   92.000    0.974   0.678
-206.212.242.132 128.252.19.1     2 u  323  512  377   21.781   -2.873   1.304
+38.229.71.1     204.123.2.72     2 u  211  512  377   21.977   -0.055   2.274
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 39973  931a   yes   yes  none   outlyer    sys_peer  1
  2 39974  941a   yes   yes  none candidate    sys_peer  1
  3 39975  9324   yes   yes  none   outlyer   reachable  2
  4 39976  942a   yes   yes  none candidate    sys_peer  2
  5 39977  931a   yes   yes  none   outlyer    sys_peer  1
  6 39978  961a   yes   yes  none  sys.peer    sys_peer  1
  7 39979  9414   yes   yes  none candidate   reachable  1
  8 39980  931a   yes   yes  none   outlyer    sys_peer  1
  9 39981  941a   yes   yes  none candidate    sys_peer  1
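With a setup like this, what you would have seen is the impacted clocks flagged as falsetickers (an 'x' in the first column of the peer list) while the remaining servers kept the box in sync.
This is a fairly reasonable setup.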
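If you want to watch for that automatically, here is a minimal Python sketch that groups peers by the tally code in the first column of `ntpq -p -n` output. The function name peers_by_tally and the "other" bucket are made up for illustration, and it assumes numeric (-n) output with the usual header and rule lines; only the four tally codes listed in the table are decoded.

```python
# Hypothetical helper for illustration; not part of ntpq.
TALLY = {
    "*": "sys.peer",
    "+": "candidate",
    "-": "outlyer",
    "x": "falseticker",
}


def peers_by_tally(ntpq_output):
    """Map each tally meaning to the list of remotes carrying it."""
    result = {}
    for line in ntpq_output.splitlines():
        stripped = line.strip()
        # Skip blank lines, the column header, and the ===== rule.
        if not stripped or stripped.startswith(("remote", "=")):
            continue
        code = line[0]
        if code in TALLY:
            label = TALLY[code]
            remote = line[1:].split()[0]
        else:
            # Leading space or a code this sketch does not decode.
            label = "other"
            remote = stripped.split()[0]
        result.setdefault(label, []).append(remote)
    return result


SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-129.250.35.250  164.244.221.197  2 u   68  512  377   19.248   -0.135   3.195
*50.116.38.157   64.250.177.145   2 u  380  512  377   43.021    1.267   2.136
+38.229.71.1     204.123.2.72     2 u  211  512  377   21.977   -0.055   2.274
"""

if __name__ == "__main__":
    groups = peers_by_tally(SAMPLE)
    for label, remotes in sorted(groups.items()):
        print(label, remotes)
    if "falseticker" in groups:
        print("WARNING: falsetickers:", groups["falseticker"])
```

To run it against a live box, feed it the output of something like subprocess.run(["ntpq", "-p", "-n"], capture_output=True, text=True).stdout instead of SAMPLE.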
I've also been looking at an item like this:
http://www.netburnerstore.com/ProductDetails.asp?ProductCode=PK70EX-NTP
which is about $300 + misc parts.
That should be well worth it to avoid the kind of 'major outage' some folks had, where they needed to reboot their servers, etc.
- Jared