CenturyLink RCA?

Saku Ytti saku at ytti.fi
Mon Dec 31 15:06:35 UTC 2018


Hey Steve,

I will continue to speculate, as that's all we have.

> 1.  Are you telling me that several line cards failed in multiple cities in the same way at the same time?  Don't think so unless the same software fault was propagated to all of them.  If the problem was that they needed to be reset, couldn't that be accomplished by simply reseating them?

L2 DCN/OOB where the whole network shares a single broadcast domain,
so one fault reaches every site at the same time.

> 2.  Do we believe that an OOB management card was able to generate so much traffic as to bring down the optical switching?  Very doubtful which means that the systems were actually broken due to trying to PROCESS the "invalid frames".  Seems like very poor control plane management if the system is attempting to process invalid data and bringing down the forwarding plane.

L2 loop. You will kill your JNPR/CSCO with enough trash on MGMT ETH.
However, it can be argued that an optical network should fail up in the
absence of a control plane, whereas an IP network has to fail down.
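
As a rough illustration of why a loop in a flat L2 domain is so
destructive, here is a toy sketch. It assumes switches that flood a
broadcast frame out of every port except the one it arrived on, with no
spanning tree and no TTL. The topologies and the function name
simulate_flood are made up for the example and say nothing about the
actual network involved.

```python
from collections import defaultdict


def simulate_flood(links, origin, hops):
    """Count copies of one broadcast frame in flight after each hop.

    links  -- iterable of (switch_a, switch_b) pairs, one per link
    origin -- switch where the frame is first injected
    hops   -- number of forwarding steps to simulate
    """
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Each copy is (switch it is arriving at, switch it came from).
    in_flight = [(peer, origin) for peer in sorted(neighbours[origin])]
    counts = [len(in_flight)]
    for _ in range(hops - 1):
        next_in_flight = []
        for here, came_from in in_flight:
            # Flood out of every port except the ingress port.
            for peer in sorted(neighbours[here]):
                if peer != came_from:
                    next_in_flight.append((peer, here))
        in_flight = next_in_flight
        counts.append(len(in_flight))
    return counts


tree = [("A", "B"), ("B", "C")]              # loop-free chain
ring = [("A", "B"), ("B", "C"), ("C", "A")]  # one loop
mesh = [(a, b) for i, a in enumerate("ABCD") for b in "ABCD"[i + 1:]]

print(simulate_flood(tree, "A", 5))  # [1, 1, 0, 0, 0]
print(simulate_flood(ring, "A", 5))  # [2, 2, 2, 2, 2]
print(simulate_flood(mesh, "A", 5))  # [3, 6, 12, 24, 48]
```

In the loop-free case the frame dies out, in a single ring it circulates
forever, and with more redundant links the number of copies doubles at
every hop, which is the kind of load a management Ethernet port is not
built to absorb.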

> 3.  In the cited document it was stated that the offending packet did not have source or destination information.  If so, how did it get propagated throughout the network?

BPDU

> My guess at the time and my current opinion (which has no real factual basis, just years of experience) is that a bad software package was propagated through their network.

There are a lot of possible reasons. I choose to believe that what
they've communicated is what the writer of the communication thought
happened, but as they likely are not an SME, it's a game of broken
telephone. A BCAST storm on an L2 DCN would plausibly fit the very
ambiguous reason offered, and running an L2 DCN like that is something
people actually do.

-- 
  ++ytti


