FYI Netflix is down

Jimmy Hess mysidia at gmail.com
Sat Jun 30 15:06:52 UTC 2012


On 6/30/12, Cameron Byrne <cb.list6 at gmail.com> wrote:
> On Jun 30, 2012 12:25 AM, "joel jaeggli" <joelja at bogus.com> wrote:
>> On 6/30/12 12:11 AM, Tyler Haske wrote:
> Geo-redundancy is key. In fact, i would take distributed data centers over
> RAID, UPS, or any other "fancy pants" © mechanisms any day.

Geo-redundancy is more expensive than any of those technologies, because it
directly impacts every application and reduces performance.  Suppose,
for example, that an application needs to guarantee something is
persisted to a distributed database, such as a record that a given
user's credit card has just been charged $X, or that a given user has
uploaded a blob to the web service.  The round-trip time of the
longest-latency path between any of the redundancy sites is added to
the critical path of the WRITE transaction latency during the commit
stage, because you cannot complete a transaction and ensure consistent,
correct data until that transaction reaches a system at the remote
site managing the persistence and is acknowledged as received intact.

For example, take geo sites that are a minimum of 250 miles apart.
Recall that light travels only 186.28 miles per millisecond.  That
means a 500-mile round trip, which adds a bare minimum of 2.6
milliseconds of latency to every write transaction, and probably more
like 15 milliseconds.
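
A rough sketch of that arithmetic in Python, for anyone who wants to
plug in their own distances.  The function name and the velocity_factor
parameter are illustrative assumptions, not from any library; the
default of 1.0 models light in a vacuum, which is the best case.

```python
# Speed of light in a vacuum, in miles per millisecond (figure used above).
LIGHT_MILES_PER_MS = 186.28


def min_round_trip_ms(site_distance_miles: float,
                      velocity_factor: float = 1.0) -> float:
    """Lower bound on round-trip propagation delay between two sites.

    velocity_factor is a hypothetical knob for the medium: 1.0 is a
    vacuum; signals in optical fiber travel at roughly two thirds of
    that, so a value near 0.67 gives a more realistic floor.
    """
    one_way_ms = site_distance_miles / (LIGHT_MILES_PER_MS * velocity_factor)
    return 2 * one_way_ms


if __name__ == "__main__":
    # 250 miles apart means a 500-mile round trip.
    print(f"{min_round_trip_ms(250):.2f} ms")  # prints "2.68 ms"
```

This covers propagation only.  Routing, queueing, and the remote
system's own processing all come on top, which is why the practical
figure is much higher than the floor.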

If your original transaction latency was 1 millisecond, or 1000
transactions per second, AND you require only that the data reaches
the remote site and is acknowledged (not that the transaction
succeeds at the remote site before you commit), you are now at a
minimum of 2.6 milliseconds, or an average of 384 transactions per
second.

To actually do it safely, you require 3.6 milliseconds (the 2.6
millisecond round trip plus the 1 millisecond transaction itself),
limiting you to an average of 277 transactions per second.
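
The transactions-per-second figures above follow directly from the
per-transaction latency.  A minimal Python sketch, assuming a single
serialized stream in which each transaction waits for the previous one
to commit (no pipelining or concurrency); the function name max_tps
and the scenario labels are illustrative only:

```python
def max_tps(latency_ms: float) -> float:
    """Transactions per second for one serialized stream of commits.

    Assumption: each transaction must finish before the next begins,
    so throughput is simply the inverse of the commit latency.
    """
    return 1000.0 / latency_ms


if __name__ == "__main__":
    scenarios = [
        ("local commit only", 1.0),
        ("remote acknowledgement", 2.6),
        ("remote commit succeeds", 3.6),
    ]
    for label, latency_ms in scenarios:
        # int() truncates, matching the rounded-down figures quoted above.
        print(f"{label}: {latency_ms} ms -> {int(max_tps(latency_ms))} tps")
```

An application that can keep many independent transactions in flight
at once is not bound by this per-stream limit, but each individual
write still pays the added latency.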

If the application is not specially designed for remote-site
redundancy, then you need a scheme such as synchronous storage-level
replication to achieve clustering, which has even worse results if
there is significant geographic dispersion.


RAID transactional latencies are much lower.

UPSes and redundant power do not increase transaction latencies at all.

--
-JH



