NANOG 40 agenda posted

Paul Vixie paul at vix.com
Mon Jun 4 07:29:03 UTC 2007


two replies here.  i (paul at vix.com) said:

> > quagga ospf6d works great, and currently lacks only a health check API.

Donald Stahl <don at calis.blacksun.org> answered:

> Health checks are unfortunately the most important aspect of a LB for some
> people.

understood.

> Can you elaborate on where you use ECMP and specifics about your
> implementation that might interest people?

i could, but joe abley already did, and i wouldn't want to plagiarize him.
plz see <http://www.isc.org/pubs/tn/index.pl?tn=isc-tn-2004-1.html>.

---

Colm MacCarthaigh <colm at stdlib.net> answered:

> If you're load-balancing N nodes, and 1 node dies, the distribution hash
> is re-calced and TCP sessions to all N are terminated simultaneously.

i could just say that since i'm serving mostly UDP i don't care about this,
but then i wouldn't have a chance to say that paying the complexity and bug
and training cost of an extra in-path powered box 24x365.24 doesn't weigh
well against the failure rate of the load balanced servers.  somebody could
drop an anvil on one of my servers twice a day (so, 730 times per year) and
i would still come out ahead, given that most TCP traffic comes from web
browsers and many users will click "Reload" before giving up.  then there's
CEF which i think keeps existing flows stable even through an OSPF recalc.
finally, there's the fact that we see less than one server failure per month
among the 100 or so servers we've deployed behind OSPF ECMP.
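
for readers who want to see the re-hash effect colm describes, here is a toy sketch. it assumes a naive "hash modulo number of live nodes" next-hop choice, which is not claimed to be what CEF or any particular router does. the node names, flow keys, and function names are all hypothetical. it counts how many flows that were on a still-healthy node get moved when one node is withdrawn.

```python
import hashlib

def flow_hash(flow_key: str) -> int:
    # Stable hash of a flow identifier (hypothetical "addr:port" key).
    digest = hashlib.sha256(flow_key.encode()).digest()
    return int.from_bytes(digest[:8], "big")

def pick_node(flow_key, nodes):
    # Naive modulo selection: an assumption, not a real router's algorithm.
    return nodes[flow_hash(flow_key) % len(nodes)]

def moved_fraction(flow_keys, nodes, dead):
    # Fraction of flows on healthy nodes that land elsewhere after
    # `dead` is removed from the next-hop set.
    survivors = [n for n in nodes if n != dead]
    moved = kept = 0
    for key in flow_keys:
        before = pick_node(key, nodes)
        if before == dead:
            continue  # these flows are lost no matter how we hash
        if pick_node(key, survivors) == before:
            kept += 1
        else:
            moved += 1
    total = moved + kept
    return moved / total if total else 0.0

nodes = ["srv-a", "srv-b", "srv-c", "srv-d"]          # hypothetical servers
flows = [f"192.0.2.{i % 250 + 1}:{1024 + i}" for i in range(10000)]
fraction = moved_fraction(flows, nodes, dead="srv-d")
print(f"{fraction:.0%} of flows on healthy nodes were remapped")
```

under this assumption a majority of otherwise healthy flows change servers on a single failure, which is the cost being weighed above against how rarely the servers actually fail. a forwarding plane that pins existing flows would behave differently.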

i know a lot of people who get paid well for building and selling and
supporting Extra Powered Boxes, and a lot of other people who will never
get fired for buying one... but that doesn't make it right.


