7007 Explanation and Apology

Vincent J. Bono vbono at MAI.NET
Sun Apr 27 00:41:35 UTC 1997


Dear All,

    I would like to sincerely apologize to everyone everywhere who 
experienced problems yesterday due to the 7007 AS announcements.

    If anyone cares to know, here is what happened:


    At 11:30AM, EST, on 25 Apr 1997, our border router, stamped with 
AS 7007, received a full routing view from a downstream ISP (well, a 
view containing 23,000 routes anyway).

    There was no distribute list imposed on the downstream since they 
also advertise their customer AS's to us (they were also 
experimenting with sending some routes out through us and some out 
through the MAE).  We did filter out routes from them containing any 
of our AS numbers but since they got the view from someone at 
MAE-East none of our internal AS numbers showed up at all.  Not 
having a filter imposed on the inbound side was our error.
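
    For readers who want the idea in concrete terms, the kind of inbound 
filter described above can be sketched roughly as follows.  This is an 
illustration in Python, not router configuration: the function name, the 
allowed customer blocks, and the downstream AS numbers used in the usage 
note are hypothetical, and only AS 7007 and AS 6082 come from this message.

```python
import ipaddress

# Hypothetical values for illustration only.
OUR_AS_NUMBERS = {7007, 6082}
DOWNSTREAM_ALLOWED = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def accept_inbound(prefix, as_path):
    """Return True if a route heard from the downstream should be accepted."""
    net = ipaddress.ip_network(prefix)
    # AS-path check: reject anything that already carries one of our AS numbers.
    if OUR_AS_NUMBERS & set(as_path):
        return False
    # Prefix check: accept only routes covered by an explicitly allowed block.
    return any(
        net.version == allowed.version and net.subnet_of(allowed)
        for allowed in DOWNSTREAM_ALLOWED
    )
```

    With only the AS-path check in place, a full view learned from a third 
party passes untouched, because none of the local AS numbers appear in it.  
The prefix check is what rejects it: accept_inbound("203.0.113.0/24", 
[64500, 64501]) returns False, while accept_inbound("192.0.2.0/24", [64500]) 
returns True.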

    In an as yet unexplained twist of bits, the 7007 router then 
began to de-aggregate the 23K route view *and* strip the AS path out 
of it.  I will emphasize that we were running no IGP at the time.  
Not one.  Not OSPF, not RIP, nothing.  
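
    As a rough illustration of why de-aggregated routes are so disruptive, 
the sketch below splits an aggregate into more-specific prefixes and then 
performs a longest-prefix match.  The /24 split length, the private example 
addresses, and both function names are assumptions made for the example; 
the message itself does not say what prefix lengths were announced.

```python
import ipaddress

def deaggregate(prefix, new_length=24):
    """Split an aggregate into more-specific prefixes of the given length."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen >= new_length:
        return [net]  # already at least that specific; nothing to split
    return list(net.subnets(new_prefix=new_length))

def best_match(destination, table):
    """Longest-prefix match: return the most specific prefix covering destination."""
    addr = ipaddress.ip_address(destination)
    covering = [
        ipaddress.ip_network(p) for p in table
        if addr in ipaddress.ip_network(p)
    ]
    return max(covering, key=lambda n: n.prefixlen, default=None)
```

    Here deaggregate("10.0.0.0/22") yields four /24 prefixes, and a table 
holding both "10.0.0.0/22" and "10.0.2.0/24" resolves 10.0.2.5 to the /24.  
A more-specific announcement wins over the legitimate aggregate regardless 
of who originated it.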

    Our MAE-East border router, AS 6082, then got a feed of these 
routes, at last count 73,000+, which set off our network monitor 
system which watches for, among other things, route views over 45k 
lines in size.  At 11:45AM we disabled the BGP peering session with 
AS 1790 that was in place with the 7007 router and immediately 
contacted Sprint (contrary to popular belief that they called *us* 
first to let us know about the problem).  As we were trying to 
determine what had happened, we began getting calls from other ISPs 
saying that we were announcing their routes with specificity as well 
as best AS path.  That really alarmed us since we saw no 
announcements still going out.  When these calls persisted, we 
rebooted our 7007 router (that was at 12:00PM).  When the router came 
back up, it did begin to announce a full view to AS 1790 again, but 
this time as a normal BGP advertisement, i.e. with AS paths and 
aggregated addresses.  We then imposed a distribute filter on our 
downstream and toward 1790, which stopped the announcement and, we 
thought, solved the problem.  
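
    The route-count alarm mentioned above amounts to a simple threshold 
check.  A minimal sketch, assuming the monitor has the list of prefixes 
received from each peer (the function name and message format are 
hypothetical; only the 45k threshold comes from this message):

```python
ROUTE_COUNT_ALARM = 45_000  # threshold quoted in the message

def check_route_view(peer_name, prefixes, threshold=ROUTE_COUNT_ALARM):
    """Return an alarm string if a peer's view exceeds the threshold, else None."""
    count = len(prefixes)
    if count > threshold:
        return f"ALARM: {peer_name} sent {count} routes (limit {threshold})"
    return None
```

    A 23,000-route view stays under this limit and a 73,000-route view 
trips it.  A check like this only detects the problem after the routes have 
been received; it does not stop them from being propagated.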

    Well, the phone *kept* ringing and we then started to see the 
7007 paths coming into our other routers over the MAE's.  Okay, so 
panic ensued, and we unplugged *everything* at 12:15PM almost to the 
second.  Then, at 12:25 the Sprint NOC called us to say that they were 
about to turn down the DS-3 connection to our 7007 router since they 
were *still* seeing the routes.  We of course told them to go ahead 
(since the router had no power to it at this point we were *very* 
confused).  

    It seems that even after we stopped announcing the demon-view at 
12:00, 1790 kept propagating the routes.  We continued to field 
calls until about 4:45PM yesterday from ISP's all over the world.

    According to our conversations with the Sprint NOC at 2:14PM 
yesterday, they simply could not clear the 7007 routes from their BGP 
tables; they "just keep appearing again," as one of the techs told us.

    It also seems that some large, switched-based backbone provider 
began distributing the routes to MSN on the west coast, where they 
lingered until about 7:00PM EST.


    We had engineers from the router manufacturer in until about 
1:00AM this morning crawling all over the equipment making sure that 
we hadn't created an incorrect config set.  We also now impose full 
distribute list filters on all peers.  

    All I can say in our defense is that I believe we did debug the 
problem in the most expedient manner possible, and when it seemed 
that even after disabling the BGP session, we were still endangering 
other networks, we did completely disconnect ourselves from the Net.  

    We did *not* perform any of this maliciously; I'm not sure that I 
could duplicate the event if I tried.  Anyone who called and got a 
harsh voice on the phone, well, I sincerely apologize to them 
individually, but some in particular should not have tried to 
impersonate a company other than their own *and* should not have 
started cursing out the NOC tech who answered.

    I would also like to take this time to thank AT&T WorldNet, NASA 
Sciences Institute, and Net Access Corporation who called and did not 
just ask for an explanation, but offered assistance.

Sincerely,
MANAGEMENT ANALYSIS, INCORPORATED

Vincent J. Bono
Director Network Services