ARIN DNSSEC Monitoring enhancement deployed (was: 2019-01-11 ARIN.NET DNSSEC Outage – Post-Mortem)

John Curran jcurran at arin.net
Mon Feb 4 18:44:21 UTC 2019


On 11 Jan 2019, at 3:59 PM, John Curran <jcurran at arin.net> wrote:
...
My apologies for this incident – while ARIN does have some fragility in our older systems (which we have been working aggressively to phase out via system refresh and replacements), it is not acceptable to have this situation with key infrastructure such as our DNS zones. We will prioritize the necessary alert and monitor changes and I will report back to the community once that has been completed.

Folks -

I indicated that we would report back once appropriate DNSSEC monitoring was in place. That work has now been completed (ref: the forwarded announcement below).

Thanks again for your patience in this matter,
/John

John Curran
President and CEO
American Registry for Internet Numbers

Begin forwarded message:

From: ARIN <info at arin.net>
Subject: [arin-announce] DNSSEC Monitoring Enhancements
Date: 4 February 2019 at 11:32:25 AM EST
To: <arin-announce at arin.net>

On 31 January, ARIN deployed DNSSEC monitoring enhancements, including proactive RRSIG expiration checking, zone syntax checking, and DNSSEC validation. We run these checks from multiple locations across the Internet. This effort was undertaken in response to the incident that occurred on 11 January, detailed in the incident report below.

Improved monitoring of DNSSEC and the arin.net zone will alert us earlier to issues such as Resource Record Signature (RRSIG) expiration and DNSSEC validation failures. These enhancements will provide early warning of potential issues, prevent outages, and improve our ability to troubleshoot DNSSEC problems if they occur in the future.

Regards,
Mark Kosters
Chief Technology Officer
American Registry for Internet Numbers (ARIN)

Incident Report:

On 11 January 2019, at approximately 8:30 a.m. ET, ARIN monitoring systems alerted that some arin.net properties were unreachable. Users with validating DNS resolvers were unable to look up resources within arin.net and thus were unable to reach them. For those users, ARIN’s www.arin.net and ftp.arin.net sites and Whois, RPKI, and DNS services were affected.

ARIN’s Engineering staff determined that DNSSEC validation for the arin.net zone was failing and temporarily unpublished the Delegation Signer (DS) records with our registrar so that we could investigate the problem. Upon troubleshooting, ARIN staff discovered that the removal of a resource record had created a spurious record, which caused a script to fail to reload. As a result, new versions of the zone could not be loaded, and the zone file in use expired. After determining the cause of the problem, staff removed the offending file and reloaded the zone. The DS records were then republished and the zone validated, restoring service at approximately 10:30 a.m. ET.
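
For illustration only (this is not ARIN's monitoring code): a minimal sketch of the kind of proactive RRSIG expiration check the announcement mentions. It assumes the RRSIG record is available as text in the standard presentation format (RFC 4034), where the signature expiration is the fifth field after the RRSIG type, written as YYYYMMDDHHmmSS in UTC. The function names, the three-day warning threshold, and the sample record are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sample record; the key tag and signature are placeholders.
SAMPLE_RRSIG = (
    "example.net. 3600 IN RRSIG SOA 8 2 3600 "
    "20190125000000 20190111000000 12345 example.net. AAAA"
)


def rrsig_expiration(rr_line: str) -> datetime:
    """Extract the signature expiration time from an RRSIG text line.

    Fields after the RRSIG type are: type covered, algorithm, labels,
    original TTL, expiration, inception, key tag, signer, signature.
    Only the 14-digit YYYYMMDDHHmmSS form is handled here.
    """
    fields = rr_line.split()
    idx = fields.index("RRSIG")
    expiration = fields[idx + 5]
    return datetime.strptime(expiration, "%Y%m%d%H%M%S").replace(
        tzinfo=timezone.utc
    )


def check_rrsig(
    rr_line: str,
    now: datetime,
    warn_before: timedelta = timedelta(days=3),  # assumed threshold
) -> str:
    """Return "EXPIRED", "WARN", or "OK" for the given signature."""
    remaining = rrsig_expiration(rr_line) - now
    if remaining <= timedelta(0):
        return "EXPIRED"
    if remaining <= warn_before:
        return "WARN"
    return "OK"


if __name__ == "__main__":
    print(check_rrsig(SAMPLE_RRSIG, datetime.now(timezone.utc)))
```

A monitor built this way would fetch the RRSIG records for the zone periodically and raise an alert on "WARN", well before validating resolvers start rejecting answers.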

