United Airlines is Down (!) due to network connectivity problems

Wed Jul 8 21:49:27 UTC 2015

Well, that's a given. I am talking about organizations like the NYSE or
Ma Bell.

On Wed, Jul 8, 2015 at 5:44 PM, Keith Stokes <keiths at neilltech.com> wrote:

>  Who rolls out software in the middle of the week and not on weekends?
> People who have more business on the weekends than the week, such as
> retail.
>
>  On Jul 8, 2015, at 4:40 PM, Dovid Bender <dovid at telecurve.com> wrote:
>
> Other than for an emergency repair, who rolls out a software update in
> the middle of the week? We test, test and then test some more and only then
> roll out on weekends. Our maintenance window is 00:00 - 01:00 Sunday
> mornings for sw updates etc.
>
>
> On Wed, Jul 8, 2015 at 3:02 PM, Matthew Huff <mhuff at ox.com> wrote:
>
> Traders on the floor are being told that it’s a software glitch from new
> software that was rolled out Tuesday night. Nothing official has been
> said.  The only thing I know for sure is that if the NYSE was hacked, they
> wouldn’t tell anyone the details for a long time, if ever.
>
> The impact of the NYSE being down is much less significant than it used to
> be since most stocks are multiple-listed on other exchanges.
>
> The lack of information through official channels is unusual though. In
> previous situations, there has been at least a little hand-holding. So far,
> nada. In fact, other than financial service providers’ emails, there have
> been no emails so far today from the NYSE, including the announcement of
> resumption of service. According to the NYSE web page, trading will resume
> at 3:05pm EST today with primary specialist, and 3:10 for everyone.
>
>
> On Jul 8, 2015, at 2:33 PM, Brett Frankenberger <rbf+nanog at panix.com> wrote:
>
> On Wed, Jul 08, 2015 at 01:55:43PM -0400, Valdis.Kletnieks at vt.edu wrote:
>
> On Wed, 08 Jul 2015 17:42:52 -0000, Matthew Huff said:
>
>
>  Given that the technical resources at the NYSE are significant and
> the lengthy duration of the outage, I believe this is more serious
> than is being reported.
>
>
> My personal, totally zero-info suspicion:
>
> Some chuckleheaded NOC banana-eater made a typo, and discovered an
> entirely new class of wondrous BGP-wedgie style "We know how we got
> here, but how do we get back?" network misbehaviors....
>
>
> We don't know how long the underlying problem lasted, and how much of
> the continued outage time is dealing with the logistics of restarting
> trading mid-day.  Completely stopping and then restarting trading
> mid-day is likely not a quick process even if the underlying technical
> issue is immediately resolved.
>
> (Such things have happened before - like the med school a few years ago
> that extended their ethernet spanning tree one hop too far, and discovered
> that merely removing the one hop too far wasn't sufficient to let it come
> back up...)
>
>
> No, but picking a bridge in the center, giving it priority sufficient
> for it to become root, and then configuring timers[1] that would
> support a much larger than default diameter, possibly followed by some
> reboots, probably would have.
>
> From what has been publicly stated, they likely took a much longer and
> more complicated path to service restoration than was strictly
> necessary.  (I have no non-public information on that event.  There may
> be good reasons, technical or otherwise, why that wasn't the chosen
> solution.)
>
>    -- Brett
>
> [1] You only have to configure them on the root; non-root bridges use
> what the root sends out, not what they have configured.
>
> ---
>
>  Keith Stokes
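
Brett's suggestion above (make a central bridge the root, then configure
timers on it that support a larger-than-default diameter) can be sketched
numerically. The snippet below is only an illustration, assuming classic
802.1D spanning tree and the timer formulas commonly cited in vendor tuning
guides; the function name stp_timers_for_diameter is hypothetical and not
taken from any switch CLI or library. Nothing here reflects what the med
school or the NYSE actually did.

```python
import math


def stp_timers_for_diameter(diameter, hello=2):
    """Return (max_age, forward_delay) in seconds for a bridge diameter.

    Assumed formulas (classic 802.1D tuning guidance, not from the thread):
        max_age       = 4*hello + 2*diameter - 2
        forward_delay = (4*hello + 3*diameter) / 2, rounded up
    """
    if not 1 <= hello <= 10:
        raise ValueError("hello time must be between 1 and 10 seconds")
    if diameter < 2:
        raise ValueError("diameter must be at least 2 bridges")

    max_age = 4 * hello + 2 * diameter - 2
    forward_delay = math.ceil((4 * hello + 3 * diameter) / 2)

    # 802.1D only permits max_age in 6..40 s and forward_delay in 4..30 s,
    # which caps how far the diameter can be stretched with timers alone.
    if not 6 <= max_age <= 40:
        raise ValueError(f"max_age {max_age}s is outside the 6-40s range")
    if not 4 <= forward_delay <= 30:
        raise ValueError(f"forward_delay {forward_delay}s is outside the 4-30s range")

    return max_age, forward_delay


if __name__ == "__main__":
    # The default diameter of 7 reproduces the familiar default timers.
    for dia in (7, 10, 14):
        max_age, fwd = stp_timers_for_diameter(dia)
        print(f"diameter={dia}: max_age={max_age}s forward_delay={fwd}s")
```

Per footnote [1], values like these would only need to be set on the root
bridge, since non-root bridges use whatever the root advertises. Under these
assumed formulas, a diameter past the high teens falls outside the permitted
timer ranges, so timer tuning only buys a limited amount of extra reach.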