FYI Netflix is down

Bryan Horstmann-Allen bdha at mirrorshades.net
Sat Jun 30 20:45:11 UTC 2012


+------------------------------------------------------------------------------
| On 2012-06-30 16:08:40, Rayson Ho wrote:
| 
| If I recall correctly, availability zone (AZ) mappings are specific to
| an AWS account, and in fact there is no way to know if you are running
| in the same AZ as another AWS account:
| 
| http://aws.amazon.com/ec2/faqs/#How_can_I_make_sure_that_I_am_in_the_same_Availability_Zone_as_another_developer
| 
| Also, AWS Elastic Load Balancer (and/or CloudWatch) should be able to
| detect that some instances are not reachable, and thus can start new
| instances and remap DNS entries automatically:
| http://aws.amazon.com/elasticloadbalancing/
| 
| This time only 1 AZ is affected by the power outage, so sites with
| fault tolerance built into their AWS infrastructure should be able to
| handle the issues relatively easily.

Then explain Netflix and Heroku last night. Both architect across multiple
AZs, and have for many years.

The API and EBS across the region were also affected. ELB was _also_ affected
across the region, and many customers continue to report problems with it.

We were told in May of last year after the last massive full-region EBS outage
that the "control planes" for the API and related services were being decoupled
so issues in a single AZ would not affect all of them. That seems not to be the
case.

Just because they offer these features that should help with resiliency doesn't
actually mean they _work_ under duress.
-- 
bdha
cyberpunk is dead. long live cyberpunk.



