Centurylink having a bad morning?

Tom Beecher beecher at beecher.cc
Mon Aug 31 20:33:55 UTC 2020


Hopefully those customers learned the difference between redundancy and
diversity this weekend. :)

On Mon, Aug 31, 2020 at 3:48 PM Eric Kuhnke <eric.kuhnke at gmail.com> wrote:

> There are a number of enterprise end-user customers of 3356 that have
> on-premises server rooms/hosting for their stuff. And they spend a lot of
> money every month for a 'redundant' metro ethernet circuit that takes
> diverse fiber paths from their business park office building to the local
> clink/level3 POP. But all that last-mile redundancy and failover ability
> doesn't do much for them when 3356 breaks its network at the BGP level.
>
>
>
> On Mon, Aug 31, 2020 at 9:36 AM Drew Weaver <drew.weaver at thenap.com>
> wrote:
>
>> I also found the part where they mention that a lot of hosting companies
>> only have one uplink to be odd, and the fact that he comes pretty close to
>> implying that it's CenturyLink's customers' fault for not having multiple
>> paths to Cloudflare that don't touch CenturyLink a bit puzzling.
>> It could have just been poorly written.
>>
>> *From:* NANOG <nanog-bounces+drew.weaver=thenap.com at nanog.org> *On
>> Behalf Of *Tom Beecher
>> *Sent:* Monday, August 31, 2020 9:26 AM
>> *To:* Hank Nussbacher <hank at interall.co.il>
>> *Cc:* NANOG <nanog at nanog.org>
>> *Subject:* Re: Centurylink having a bad morning?
>>
>>
>>
>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>
>>
>>
>> I definitely found Mr. Prince's writing about yesterday's events
>> fascinating.
>>
>>
>>
>> Verizon makes a mistake with BGP filters that allows a secondary mistake
>> from leaked "optimizer" routes to propagate, and Mr. Prince takes every
>> opportunity to lob large chunks of granite about how terrible they are.
>>
>>
>>
>> L3 allows an erroneous flowspec announcement to cause massive global
>> connectivity issues, and Mr. Prince shrugs and says "Incidents happen."
>>
>>
>>
>> On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher <hank at interall.co.il>
>> wrote:
>>
>> On 30/08/2020 20:08, Baldur Norddahl wrote:
>>
>>
>>
>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>
>>
>>
>> Sounds like Flowspec possibly blocking tcp/179 might be the cause.
>>
>>
>>
>> But that is Cloudflare speculation.
>>
>>
>>
>> Regards,
>> Hank
>>
>> Caveat: The views expressed above are solely my own and do not express
>> the views or opinions of my employer
>>
>>
>>
>> An outage is what it is. I am not worried about outages. We have multiple
>> transits to deal with that.
>>
>>
>>
>> It is the continued announcing of prefixes after withdrawal by peers and
>> customers that is the huge problem here. That is killing all the effort and
>> money I put into having redundancy. It is sabotage of my network after I
>> cut the ties. I do not want to be a customer of a provider whose system
>> will do that. Luckily we do not currently have a contract and now they
>> will have to convince me it is safe for me to make a contract with them. If
>> that is impossible I guess I won't be getting a contract with them.
>>
>>
>>
>> But I disagree that it would be impossible. They need to make a good
>> report telling exactly what went wrong and how they changed the design, so
>> something like this cannot happen again. The basic design of BGP is such
>> that this should not happen easily if at all. They did something unwise.
>> Did they make a route reflector based on a database or something?
>>
>>
>>
>> Regards,
>>
>>
>>
>> Baldur
>>
>>
>>
>> On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho <mikebolitho at gmail.com>
>> wrote:
>>
>> Exactly. And asking that they somehow prove this won't happen again is
>> impossible.
>>
>> - Mike Bolitho
>>
>>
>>
>> On Sun, Aug 30, 2020, 8:10 AM Drew Weaver <drew.weaver at thenap.com> wrote:
>>
>> I’m not defending them but I am sure it isn’t intentional.
>>
>>
>>
>> *From:* NANOG <nanog-bounces+drew.weaver=thenap.com at nanog.org> *On
>> Behalf Of *Baldur Norddahl
>> *Sent:* Sunday, August 30, 2020 9:28 AM
>> *To:* nanog at nanog.org
>> *Subject:* Re: Centurylink having a bad morning?
>>
>>
>>
>> How is that acceptable behaviour? I shall remember never to make a
>> contract with these guys until they can prove that they won't advertise my
>> prefixes after I pull them. Under any circumstances.
>>
>>
>>
>> On Sun, Aug 30, 2020 at 3:14 PM Joseph Jenkins <joe at breathe-underwater.com>
>> wrote:
>>
>> Finally got through on their support line and spoke to level 1. The only
>> thing the tech could say was it was an issue with BGP route reflectors and
>> it started about 3 a.m. (Pacific). They were still trying to isolate the issue.
>> I've tried failing over my circuits and no go; the traffic just dies as L3
>> won't stop advertising my routes.
>>
>>
>>
>> On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via NANOG <nanog at nanog.org>
>> wrote:
>>
>> Hello,
>>
>>
>>
>> Woke up this morning to a bunch of reports of issues with connectivity;
>> had to shut down some Level3/CTL connections to get it to return to normal.
>>
>>
>>
>> As of right now their support portal won’t load:
>> https://www.centurylink.com/business/login/
>>
>>
>>
>> Just wondering what others are seeing.
>>