NTP Sync Issue Across Tata (Europe)

Mon Aug 14 10:55:01 UTC 2023

We're going to have to somewhat disagree here...

I may not have been 100% clear about what I see as the most common risks
for GPS.  The reason I suggest that NTP risks and GPS risks are similar is
not primarily due to intentional time injection hacks (although that is a
risk).   Instead, it's related to GPS failure modes, and the increased
commonality of GPS jamming causing those failures.   I will 100% concede
that NTP carries far more spoofing or intentional DoS risks.   But GPS is
far more likely to suffer a failure in the absence of a bad actor than NTP.

The reason for this is that the GPS signal is incredibly weak, and it's
incredibly easy to break GPS reception.   Good antenna placement and
antennas that try to reject terrestrial signals help but don't always
prevent the failures from happening.

Because GPS is used more and more to track objects and people, people who
don't want to be tracked are starting to buy and use jammers.  In addition,
it's becoming increasingly common for gamers to spoof their GPS location
(and, as a result, time) via GPS injection.  So the kid down the street
trying to cheat at Pokémon Go or the truck driver not wanting to get in
trouble for speeding may unintentionally cause your time server to quit
working correctly.  Not to mention the random piece of electronic gear
which malfunctions and spews noise across the GPS band.

So, yes, I will 100% agree with you that NTP carries more intentional
hacking risk.   But I'm going to argue that GPS carries a significantly
higher risk of a jamming-related failure.   Without good statistics, it's
hard to tell which is more prevalent.   I see a lot more GPS failures from
my viewpoint, but I also have to talk to customers who are having precision
timing issues due to GPS failures.

My intuitive feeling is that in the absence of bad actors, NTP is
significantly more reliable than GPS.   In the presence of remote bad
actors, I'll grant that NTP is 100% the loser here.  When everything is
working, GPS will provide better time.  Adding a holdover oscillator to GPS
does help in marginal situations, but doesn't resolve all of the GPS issues.

In those situations where time is not critical, either NTP or GPS is a good
solution, and it largely comes down to which you prefer.   I deal with way
too many antennas so I'd rather just harden an NTP server.   You might deal
with way too many hackers getting in your systems so you might prefer
relying on a GPS antenna.   Either way, most of the time you're going to
get decent time service.   We could go into a lot of details about how each
system can fail, but for non-time critical applications I'm not sure either
would come out a clear winner.  I know you believe GPS does, and I believe
that it isn't 100% clear which one is better for those "just want time that
works most of the time" applications.   We could argue all day about this
and we won't get anywhere beyond us disagreeing about this.

Once you get to more time-critical apps where actual budget is going to be
expended on ensuring reliable NTP services are available 24x7, then neither
a default configuration NTP server nor a single GPS receiver will provide
reliable time.   Selecting servers and hardening firewalls to limit the
likelihood of time injection can work wonders on NTP robustness.   GPS
works too if you provide enough GPS timing sources that multiple locations
would have to be jammed at once.   Providing a mix of these is even
better.  If I were to go GPS-only I'd probably try to ensure a minimum of 3
different GPS receivers at 3 different locations, with internal NTP servers
pulling from each of the GPS-connected NTP servers.   5 would be even
better.   An even more robust option would be to go with 5 GPS receivers
and 2 or 3 NTP-connected stratum 1 time sources.   In this last case, you
could spoof ALL of the NTP servers and the GPS would still be in control.
You could also have signal failures at 3 of the GPS sites and the NTP
connections would provide redundant time sources.   Only with GPS failures
at multiple sites AND NTP failures or spoofing happening at the same time
would one have an issue where the NTP servers could possibly fail to
receive correct time.
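
As a rough illustration of the arithmetic above, here is a minimal Python
sketch.  It is NOT the selection algorithm ntpd actually uses; it is a
simplified majority vote I'm assuming for the sake of the example.  The
function name agreed_offset, the 50ms tolerance, and the sample offsets
are all hypothetical.  A source that is jammed or unreachable is simply
left out of the list.

```python
def agreed_offset(offsets_ms, tolerance_ms=50.0):
    """Simplified majority vote over time-source offsets (milliseconds).

    Returns the median offset of the largest group of sources that agree
    with each other within tolerance_ms, or None when that group is not a
    strict majority of the reachable sources.  This is an illustrative
    stand-in, not ntpd's real intersection/clustering logic.
    """
    ordered = sorted(offsets_ms)
    best = []
    for i, low in enumerate(ordered):
        # Every source at or above `low` that sits within the tolerance window.
        group = [o for o in ordered[i:] if o - low <= tolerance_ms]
        if len(group) > len(best):
            best = group
    if len(best) * 2 <= len(ordered):
        return None  # no strict majority, so refuse to pick a time
    return best[len(best) // 2]


# Hypothetical offsets: 5 healthy GPS-backed servers, 3 spoofed NTP servers.
gps_ok = [0.1, -0.2, 0.0, 0.3, -0.1]
ntp_spoofed = [5000.0, 5000.0, 5000.0]
print(agreed_offset(gps_ok + ntp_spoofed))   # GPS group still wins

# 3 GPS sites jammed (absent), 2 GPS plus 3 honest NTP servers remain.
print(agreed_offset([0.1, -0.2, 4.0, -3.0, 2.5]))

# 3 GPS sites jammed AND all 3 NTP servers spoofed: the bad group wins.
print(agreed_offset([0.1, -0.2] + ntp_spoofed))
```

The third call is the combined GPS-plus-NTP failure described above: with
only 2 honest sources left against 3 spoofed ones, the spoofed group forms
the majority.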

On Mon, Aug 14, 2023 at 2:00 AM Mel Beckman <mel at beckman.org> wrote:

> Forrest,
>
> I think you’re gilding the lily. My original recommendation was to use
> GPS as primary, for its superior accuracy and resistance to attack, and
> have a non-GPS backup.  If you want automatic fail over, do that in an
> intermediate server on your site that makes a conscious test and decision
> to fail over to Internet NTP.
>
> You’re mistaken to say that the vulnerability of GPS is remotely
> comparable to the vulnerability of Internet-based NTP. To interfere with a
> GPS-derived clock, an attacker has to physically be present. That’s a huge
> expense — and risk — that hackers are really not interested in undertaking.
> They would much rather sit in Russia or China and attack NTP servers
> remotely using any of the several attack methodologies I’ve cited
> previously.
>
> So curate Internet NTP or not (personally, that seems like just another
> thing to monitor and maintain), but make GPS your primary time standard.
> You’re much better off staying air-gapped from Internet NTP until you
> detect a GPS failure.  All the other machinations are pointless while GPS
> is working, because GPS gives you by far the best accuracy and security for
> the buck.  Like I said, spend $400 on a commercial GPS time server and
> timing problems are solved. Or use facility-provided GPS if you can’t get
> an antenna up.
>
>  -mel
>
> On Aug 14, 2023, at 12:10 AM, Forrest Christian (List Account) <
> lists at packetflux.com> wrote:
>
> 
> I've responded in bits and pieces to this thread and haven't done an
> excellent job expressing my overall opinion.   This is probably because my
> initial goal was to point out that GPS-transmitted time is no less subject
> to being attacked than your garden variety NTP-transmitted time. Since this
> thread has evolved, I'd like to describe my overall position to be a bit
> clearer.
>
> To start, we need a somewhat simplified version of how UTC is created so I
> can refer to it later:
>
> Across the globe, approximately 85 research and standards institutions run
> a set of freestanding atomic clocks that contribute to UTC.   The number of
> atomic clocks across all these institutions totals around 450.   Each
> institution also produces a version of UTC based on its own set of
> atomic clocks.  In the international timekeeping world, this is designated
> as UTC(Laboratory), where Laboratory is replaced with the abbreviation for
> the lab producing that version of UTC.   So UTC(NIST) is the version that
> NIST produces at Boulder, Colorado; NICT produces UTC(NICT) in Tokyo; and
> so on.
>
> Because no clock is perfectly accurate, all of these versions of UTC drift
> in relation to each other, and you could have significant differences in
> time between different labs.   As a result, there has to be a way to
> synchronize them.  Each month, the standards organization BIPM collects
> relative time measurements and other statistics from each
> institution described above.  This data is then used to determine the
> actual value of UTC. BIPM then produces a report detailing each
> organization's difference from the correct representation of UTC.   Each
> institution uses this data to adjust its UTC representation, and the cycle
> repeats the next month. In this way, all of the representations of UTC end
> up being pretty close to each other.   The document BIPM produces is titled
> "Circular T."  The most recent version indicates that most of the
> significant standards institutions maintain a UTC version that differs by
> less than 10ns from the official version of UTC.
>
> Note that 10ns is far more accurate than we need for NTP, so most of the
> UTC representations can be considered identical as far as this discussion
> goes. Still, it is essential to realize that UTC(NIST) is generated
> separately from UTC(USNO) or other UTC implementations.  For example, a
> UTC(NIST) failure should not cause UTC(USNO) to fail as they utilize
> separate hardware and systems.
>
> Each of these versions of UTC is also disseminated in various ways.
> UTC(NIST) goes out via the "WWV" radio stations, NTP, and other esoteric
> methods.   GPS primarily distributes UTC(USNO), which is also available
> directly via NTP.  UTC(SU) is the timescale for GLONASS.  And so on.
>
> So, back to NTP and the accuracy required:
>
> Most end users (people running everyday web applications or streaming
> video or similar) don't need precisely synchronized time.   The most
> sensitive application I'm aware of in this space is likely TOTP, which
> often needs time on the server and time on the client (or hardware key)
> within 90 seconds of each other.   In addition, having NTP time fail
> usually isn't the end of the world for these users.  The best way to
> synchronize their computers (including desktop and server systems) to UTC
> is to point their computer time synchronization service (whatever that is)
> at pool.ntp.org, time.windows.com, their ISP's time server, or similar.
> Or, with modern OSes, you can leave the time configured to whatever server
> the OS manufacturer preconfigured.   As an aside, one should note that
> historically Windows ticked at 15ms or so, so trying to synchronize most
> Windows systems closer than 15ms was futile.
>
> On the other hand, large ISPs or other service providers (including
> content providers) see real benefits to having systems synchronized to
> fractions of seconds of UTC.   Comparing logs and traces becomes much
> easier when you know that something logged at 10:02:23.1 on one device came
> before something logged at 10:02:23.5 on another.   Various
> server-to-server protocols and software implementations need time to be
> synchronized to sub-second intervals since they rely on timestamps to
> determine the latest copy of data, and so on.   In addition, as an ISP,
> you'll often provide time services to downstream customers who demand more
> accuracy and reliability than is strictly necessary.
>
> As a result, one wants to ensure that all time servers are synchronized
> within some reasonable standard of accuracy.   Within 100ms is acceptable
> for most applications but a goal of under 50ms is better.   If you have
> local GPS receivers, accuracy down to around 1ms is achievable with careful
> design.  Beyond that, you're chasing unnecessary accuracy.  Note that loss
> of precision is somewhat cumulative here - running a time server
> synchronized to within 100ms will ensure that no client can be synchronized
> to better than within 100ms from that server.   Generally, you'll want your
> time server to be synchronized much better than needed to avoid the time
> server being the limiting factor.
>
> In a perfect world with no bad actors and where all links ran perfectly,
> one could set up an NTP server that pulled from pool.ntp.org or used GPS
> and essentially acted as a proxy.   Unfortunately, we don't live in this
> world.   So one has to ask how you build a system that meets at least the
> following goals:
>
> * Synchronized to UTC within 50ms, with lower being better.
> * Not subject to a reasonable set of attacks (typical DoS attacks, RF
> signal attacks, spoofing, etc).
> * Able to be run by typical network operations staff.
>
> In addition, an ideal server setup would be made up of redundant servers
> in case one piece of hardware fails.  I will ignore this part, as it's
> usually just setting up multiple copies of the same thing.
>
> The two most straightforward options are using a GPS-based NTP appliance
> or installing an NTP server and pointing it at pool.ntp.org.   Under
> normal circumstances, both options will be synchronized to UTC with enough
> accuracy for most applications, and both are easy to run by typical network
> operations staff.  This assumes reasonably consistent network latency in
> the NTP case and a good sky view in the GPS case.  The GPS-based appliance
> is, however, subject to spoofing or jamming, as I've discussed earlier.
> The NTP server is at the mercy of the quality of the servers it picked
> from pool.ntp.org and is also subject to various outside attacks
> (spoofing, etc.).   One must decide how critical time is to them before
> deciding whether this option is valid.
>
> The other end of the scale is the "develop your own offline version of UTC
> using atomic clocks" methodology.  This fixes the attack issue but
> introduces several others.   The main one is that you are now relying on
> the clock's accuracy.  Admittedly rubidium and especially cesium clocks
> tend to be sufficiently reliable and stable.   However, one has to ensure
> the frequency is accurate initially and stays that way. You must also wire
> the clock to an NTP Server and calibrate the initial UTC offset.   If the
> clock goes haywire or is less accurate than is required, your in-house
> version of UTC will drift in relation to real UTC.   This means you may
> need 2 or 3 or more atomic clocks to be sufficiently reliable.  You'll then
> need to regularly take an average, compare it to UTC, and adjust if it's
> drifted too much.   This quickly becomes more of a science project than
> something you want network operations staff to deal with on an ongoing
> basis.    To be clear:  If you need robust time not subject to outside
> forces and have or can obtain the skill set to pull this off internally, I
> won't argue that this is a bad option.  However, I feel this isn't the type
> of service most providers want to run internally.
>
> So, looking at some middle-ground options that trade a bit of robustness
> for ease of use is reasonable.
>
> My lowest cost preference has always been to use a set of in-house NTP
> servers pointed at a carefully curated collection of NTP servers.    Your
> curation strategy should depend on network connectivity, the reliability of
> the time sources, etc.   In North America, picking one or two NIST servers
> from each NIST location is a good starting point.  That is one or two from
> each of Maryland, Fort Collins, Boulder, and the University of Colorado.
> One may want to add some servers from other timekeeping organizations
> (such as USNO).   Note that there is one commonality:  These time servers
> are run by organizations listed in Circular T as contributing to UTC, and
> the servers are tied to the atomic clocks. That way, we ensure that the
> servers are not subject to inaccuracies caused by time transfer from an
> authoritative source for UTC.   What is left is any potential attack on the
> time transfer over NTP itself.   I would argue that with a curated list of
> enough NTP servers, this risk can be pushed down to where it is low enough
> for many use cases.   A lot will depend on the quantity and quality of NTP
> servers you select and the robustness of the network path to those
> servers.  If the packets between your NTP server and the NTP servers you
> choose traverse a relatively secure and short path with plenty of
> bandwidth, and the paths to differing NTP servers are diverse, many attacks
> will become harder to implement.   In addition, the more NTP servers you
> add, the more likely it is that NTP will be able to correctly pick the
> servers providing the correct time, even if an attacker is successfully
> spoofing one or more sources.  In some cases it may make sense to add
> additional servers which are run by third parties if it gains additional
> robustness based on network architecture.  This is especially true if
> you're closely connected network-wise with the third party and they run a
> good quality NTP service as well.
>
> As I've mentioned, a good middle-of-the-road solution is adding various
> sources of time derived via GPS.   Note I said, "to add."    Start with the
> carefully curated NTP server set, then install one or more GPS-based NTP
> Servers polled by your NTP server.   Adding these GPS time sources to your
> NTP servers does three things:  First, it provides another source of time
> NTP can use to determine the correct time.   Second, we're now using a
> different time transmission method with different vulnerabilities.   And
> finally, it will significantly improve the accuracy of the time the NTP
> server produces as NTPd will generally prefer it to do the final trimming
> to UTC.   Combining terrestrially transmitted time via NTP with the
> precision of RF-transmitted GPS time ensures that time is both
> correct and precise.  There are still attack vectors here, but as
> you add more time sources, the complexity of pulling off a successful
> attack increases.  This is especially true if you can monitor the NTP
> server for signs of stress, such as time servers that are not telling the
> correct time or GPS signals which are inconsistent with the NTP-derived
> time.   A successful attack would require simultaneous NTP (network) and
> GPS (RF) attacks.
>
> Other options or blends of options are also possible.   With a reasonably
> large network, putting enough GPS receivers into place would significantly
> reduce the possibility of a spoofer or jammer taking out your entire GPS
> infrastructure.  Reducing or eliminating external NTP time sources might be
> reasonable in that case.   The theory is that attacking GPS receivers at
> one location is easy.  Doing it at dozens simultaneously is much more
> difficult.   To use an exaggeration to make a point:  If you had 100
> different GPS receivers spread across 100 widely geographically diverse
> locations, and all of your NTP servers were able to poll all of them for
> time, the chances that an attacker would be able to take out or spoof
> enough GPS receivers to make a difference would be close to zero.  Your
> failure point becomes UTC(USNO) and the GPS constellation itself. The same
> argument would apply to NTP servers regarding quantity and diversity.
>
> Other options involve adding additional technologies.   For example, some
> appliances use GPS to discipline (adjust) an internal atomic clock. Once
> the atomic clock is locked to UTC, the GPS can fail for extended periods
> without affecting NTP output.   In addition, some of these will filter
> updates from the GPS based on the appliance's internal atomic time.   That
> way, a spoofer would be ignored, jammers would have to continue for hours
> or days, and so on.   Of course, these solutions' reliability depends on
> the implementation quality.   If I had the budget to implement something
> like this in a network, I'd likely scatter a few of these around the
> network and then still use garden variety NTPd servers which would be
> pointed at these appliances.  I might even consider buying solutions from
> multiple vendors to ensure a bug in one implementation was filtered out and
> ignored.
>
> I can't cover every option here, but balancing security, cost, operational
> complexity, and application needs is the key.   Some solutions are cheap
> and easy but not robust.  Some are highly robust but expensive and not
> easy.   Somewhere in the middle is probably where most real implementations
> should lie.
>
> Now, to address a couple of specific items:
>
> 1) Additional GPS and commercial time distribution systems will likely
> improve reliability.   However, only GPS and Galileo are available for free
> in the US.   I'm ignoring GLONASS for various legal and political reasons.
> Galileo is a valid option but it lives in the same band as GPS, so jamming
> GPS will usually also jam Galileo.  Utilizing GNSS receivers that use the
> civilian signals in the newer bands would also help.  Some commercial
> solutions are available that don't require GNSS, but they're relatively new
> and not as commonly available as one would like.
>
> 2) For running my own time servers in a service-provider environment, I'd
> rather specifically designate the exact NTP server I want to utilize and
> not rely on a third party to give me a pool of servers.   It's more about
> ensuring the server I use is running a trusted server, and if I delegate
> the server selection, I lose this ability.  On the other hand, where I'm
> not running an NTP server that is critical for many clients, I'll just point
> it at pool.ntp.org, or north-america.pool.ntp.org and skip all of the
> recommendations that I've made above.   I would be cautious about
> requesting pool.ntp.org add entries for "stratum of server" or "origin of
> time" as this seems like it would tend to overload the stratum one servers
> in the pool with people "optimizing" their configuration to use only
> stratum one servers.   Remember that pool.ntp.org is generally intended
> as an end-user-device service, and providing ways for end users to
> bypass the robustness that a fully distributed pool will provide is
> probably not a great idea.
>
> 3) This all should hopefully sort itself out over the next few years.
> GPS and Galileo are flying new birds that have changes designed to improve
> attack resilience by using cryptography to ensure authentic transmissions
> (which may rely on ground transmission of cryptographic keys).   NTP
> already supports manual cryptographic keys that work, but NTS is a pain in
> the rear. Hopefully, NTPv5 will have a better security mechanism.   Other,
> more secure, time sources are on the horizon as the cybersecurity crowd is
> aware of the issues.
>
> And finally, as a sort of a tl;dr; Summary:  Each operator needs to decide
> how critical time is to their network and pick a solution that works for
> them and fits the organization's budget.   Some operators might point
> everything at pool.ntp.org and not run their own servers.  Others might
> run their own time lab and use that time to provide NTP time and precision
> time and frequency via various methods.  Most will be somewhere in between.
> But regardless of which you choose, please be aware that GPS isn't 100%
> secure, and neither is NTP. If attack resilience matters to you, you should
> think about all of the attack vectors and design something that is robust
> enough to meet your use case.
>

-- 
- Forrest