Issues with Gmail
joelja at bogus.com
Wed Sep 2 18:20:49 UTC 2009
Long before we had widespread commercial internet, we still had to have a backup plan for when the single highly fault-tolerant entity on which we depended for a particular service went out.
Sometimes that plan was to wait for restoration, whether because the Bell System got a bit melty on the long distance, or because your regional utility managed to melt down the power grid, taking out both substations providing diverse feeds.
Systemic but temporally localized failure has existed as long as the weather. One can move the failure around, but I think I can confidently assert that we'll never entirely eliminate it.
Michael Thomas <mike at mtcc.com> wrote:
>On 09/02/2009 10:33 AM, Robert Mathews (OSIA) wrote:
>> On Wed, Sep 2, 2009 at 5:05 AM, Randy Bush<randy at psg.com> wrote:
>>> [....] the internet is a wonderful
>>> demonstration of building a reliable network out of unreliable components.
>>> but what we have with google mail (and apps) is two scary problems
>>> o way too many users relying on a single point of failure. so it
>>> makes the nyt when it breaks because of the number of users
>>> affected, and
>> I choose to not assume to "what/which single point of failure" this
>> reference by Randy applies. However, we can take confidence in the
>> fact that Google's Gmail service architecture is distributed; not to be
>> interpreted of course, as suggesting that within the distribution, there
>> isn't a single point of failure. Perhaps, from a network operations
>> point of view, the point needs elaboration.
>I think that Randy might be conflating single point of failure with
>"resilience". Google, distributed on every level as it is, is still
>just one operator and in this case the lemmings faithfully followed
>each other into the sea. We've been on an anti-resilience binge for
>quite some time, accelerated to warp speed by the advent of the
>Internet itself. There's something to be said about not having all of your
>police scanners, etc, etc on the internet from a resilience
>standpoint, but the siren call is strong for good reasons too.