Centurylink having a bad morning?

Bryan Holloway bryan at shout.net
Mon Aug 31 15:57:24 UTC 2020


Not everyone will peer with you, notably AS3356 (unless you're big 
enough, which few can say).

On 8/31/20 4:33 PM, Tomas Lynch wrote:
> Maybe we are idealizing these so-called tier-1 carriers and we, tier-ns, 
> should treat them as what they really are: another AS. Accept that they 
> are going to fail and do our best to mitigate the impact on our own 
> networks, i.e. more peering.
> 
> On Mon, Aug 31, 2020 at 9:54 AM Martijn Schmidt via NANOG 
> <nanog at nanog.org> wrote:
> 
>     At this point you don't even know whether it's a human error
>     (example: generating a flowspec rule for port TCP/179), a filtering
>     issue (example: accepting a flowspec rule for port TCP/179), or a
>     software issue (example: certain flowspec update crashes the BGP
>     daemon). And in the third scenario I think that at least some
>     portion of the blame shifts from the carrier to its vendors,
>     assuming the thing that crashed was not a home-grown BGP implementation.
> 
>     With the route optimizer incidents - because let's face it, Honest
>     Networker is on the money as usual
>     https://honestnetworker.net/2020/08/06/as10990-routing/ - there is
>     really no excuse for any tier-1 carrier, they should at the very
>     least have strict prefix-list based filtering in place for
>     customer-facing EBGP sessions. In those cases it's much easier to
>     state who's not taking care of their proverbial lawn.
> 
>     Best regards,
>     Martijn
> 
>     On 8/31/20 3:25 PM, Tom Beecher wrote:
>>
>>         https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>
>>
>>     I definitely found Mr. Prince's writing about yesterday's events
>>     fascinating.
>>
>>     Verizon makes a mistake with BGP filters that allows a secondary
>>     mistake from leaked "optimizer" routes to propagate, and Mr.
>>     Prince takes every opportunity to lob large chunks of granite
>>     about how terrible they are.
>>
>>     L3 allows an erroneous flowspec announcement to cause massive
>>     global connectivity issues, and Mr. Prince shrugs and says
>>     "Incidents happen."
>>
>>     On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher
>>     <hank at interall.co.il> wrote:
>>
>>         On 30/08/2020 20:08, Baldur Norddahl wrote:
>>
>>         https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>
>>         Sounds like Flowspec possibly blocking tcp/179 might be the cause.
>>
>>         But that is Cloudflare speculation.
>>
>>         Regards,
>>         Hank
>>         Caveat: The views expressed above are solely my own and do not
>>         express the views or opinions of my employer
>>
>>>         An outage is what it is. I am not worried about outages. We
>>>         have multiple transits to deal with that.
>>>
>>>         It is the continued announcement of prefixes after peers and
>>>         customers have withdrawn them that is the huge problem here.
>>>         That is killing all the effort and money I put into having
>>>         redundancy. It is sabotage of my network after I cut the
>>>         ties. I do not want to be a customer of a provider whose
>>>         system will do that. Luckily we do not currently have a
>>>         contract, and now they will have to convince me it is safe
>>>         for me to sign one. If that is impossible, I guess I won't
>>>         be getting a contract with them.
>>>
>>>         But I disagree that it would be impossible. They need to
>>>         write a good report stating exactly what went wrong and how
>>>         they changed the design, so that something like this cannot
>>>         happen again. The basic design of BGP is such that this
>>>         should not happen easily, if at all. They did something
>>>         unwise. Did they make a route reflector based on a database
>>>         or something?
>>>
>>>         Regards,
>>>
>>>         Baldur
>>>
>>>         On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho
>>>         <mikebolitho at gmail.com> wrote:
>>>
>>>             Exactly. And asking that they somehow prove this won't
>>>             happen again is impossible.
>>>
>>>             - Mike Bolitho
>>>
>>>             On Sun, Aug 30, 2020, 8:10 AM Drew Weaver
>>>             <drew.weaver at thenap.com> wrote:
>>>
>>>                 I’m not defending them but I am sure it isn’t
>>>                 intentional.
>>>
>>>                 From: NANOG
>>>                 <nanog-bounces+drew.weaver=thenap.com at nanog.org>
>>>                 On Behalf Of Baldur Norddahl
>>>                 Sent: Sunday, August 30, 2020 9:28 AM
>>>                 To: nanog at nanog.org
>>>                 Subject: Re: Centurylink having a bad morning?
>>>
>>>                 How is that acceptable behaviour? I shall remember
>>>                 never to make a contract with these guys until they
>>>                 can prove that they won't advertise my prefixes after
>>>                 I pull them. Under any circumstances.
>>>
>>>                 On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins
>>>                 <joe at breathe-underwater.com> wrote:
>>>
>>>                     Finally got through on their support line and
>>>                     spoke to level 1. The only thing the tech could
>>>                     say was that it was an issue with BGP route
>>>                     reflectors and that it started about 3 AM
>>>                     (Pacific). They were still trying to isolate the
>>>                     issue. I've tried failing over my circuits and
>>>                     it's a no go: the traffic just dies because L3
>>>                     won't stop advertising my routes.
>>>
>>>                     On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via
>>>                     NANOG <nanog at nanog.org> wrote:
>>>
>>>                         Hello,
>>>
>>>                         Woke up this morning to a bunch of reports of
>>>                         connectivity issues; had to shut down some
>>>                         Level3/CTL connections to get things back to
>>>                         normal.
>>>
>>>                         As of right now their support portal won’t
>>>                         load: https://www.centurylink.com/business/login/
>>>
>>>                         Just wondering what others are seeing.
>>>
>>
> 
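
For reference, the strict prefix-list filtering on customer-facing EBGP
sessions that Martijn describes above can be sketched roughly as follows.
This is only a minimal Python illustration of the idea, not any carrier's
or vendor's implementation: the function name accept_announcement, the
CUSTOMER_PREFIX_LIST table and its maximum-length values are all
assumptions, and the prefixes are documentation ranges.

```python
import ipaddress

# Hypothetical per-customer prefix list: (allowed prefix, max accepted length).
# The prefixes are documentation ranges (RFC 5737 / RFC 3849), not real data.
CUSTOMER_PREFIX_LIST = [
    (ipaddress.ip_network("192.0.2.0/24"), 24),
    (ipaddress.ip_network("198.51.100.0/24"), 24),
    (ipaddress.ip_network("2001:db8::/32"), 48),
]


def accept_announcement(prefix, prefix_list=CUSTOMER_PREFIX_LIST):
    """Return True only if the announced prefix is covered by the list.

    A prefix is accepted when it falls inside an allowed prefix and is
    not more specific than that entry's maximum length. Anything else,
    including leaked "optimizer" more-specifics, is rejected.
    """
    net = ipaddress.ip_network(prefix)
    for allowed, max_len in prefix_list:
        # subnet_of() raises TypeError across IP versions, so skip those.
        if net.version != allowed.version:
            continue
        if net.subnet_of(allowed) and net.prefixlen <= max_len:
            return True
    return False
```

With a list like this, a customer announcing 192.0.2.0/24 is accepted,
while a more-specific such as 192.0.2.0/25 or an unrelated prefix such as
203.0.113.0/24 is dropped at the edge instead of propagating.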


