DNS Based Load Balancers (redux)

ennova2005-nanog at yahoo.com
Wed Jul 5 09:40:37 UTC 2006

 Stepping back for a moment...
  Many (most) popular services end up in multiple data centers, first of all because they want diversity (of data centers, of ISPs, maybe of pricing).  All mission-critical sites will be designed such that a subset of these data centers can take their entire load if need be.
  Once spread out this way, you may need to run some or all of them in an active/active configuration, so you need to balance load between them in some fashion.
  If you are going to split the load - a natural desire is to split it such that it actually increases performance for users.   
  You figure network proximity (of the end user to the serving destination) ought to be a criterion - but the load on your cluster may be more important for personalization-intensive sites.
  You start with round robin DNS but it leaves you unsatisfied along the way. You play around with souped-up DNS servers that are fed by monitoring tools that measure reachability as well as some measure of load. You also discover that the most popular browser will gladly ignore your TTL settings and insist on sending your traffic to the data center that is down. You are frustrated when you find out that users of ISP A are being served out of your data center at ISP B, even though you have a data center connected to ISP A.  You think Anycast might be the answer, but not everyone is set up to do Anycast. You find that some clever people have been aggregating data that offers to geolocate your callers' IP addresses, and maybe there is a way to use that information to find the nearest server. You realize the accuracy of this list is dubious, the exchange points for several countries may actually be on the coasts of the United States, and how would you integrate this into your DNS or HTTP redirector while still doing a two-shift day job?
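  For what it is worth, the "round robin fed by monitoring" idea can be sketched in a few lines. This is only an illustration under assumptions: the RoundRobin class, the is_up callback and the documentation-range addresses are all made up for the example, and no real DNS server is involved.

```python
class RoundRobin:
    """Rotate through site addresses, skipping any the monitor reports down.

    Hypothetical sketch: the class name, the is_up callback and the
    addresses used below are illustrative assumptions, not a real product.
    """

    def __init__(self, addresses):
        self.addresses = list(addresses)
        self.next_index = 0  # where the next rotation starts

    def answer(self, is_up):
        """Return the next reachable address, or None if every site is down."""
        count = len(self.addresses)
        for offset in range(count):
            index = (self.next_index + offset) % count
            candidate = self.addresses[index]
            if is_up(candidate):
                # Resume the rotation just after the site we handed out.
                self.next_index = (index + 1) % count
                return candidate
        return None


# Example: three data centers, the second one currently unreachable.
down = {"198.51.100.10"}
rotation = RoundRobin(["192.0.2.10", "198.51.100.10", "203.0.113.10"])
answers = [rotation.answer(lambda addr: addr not in down) for _ in range(3)]
```

  Note that this does nothing about a client that ignores your TTL; it only keeps the server from handing out a site it already knows is down.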
  You turn to alternatives, and find the shiny boxes and/or services called the GLBS. They perform 2 main services.
  First, they hand out answers, which may vary in time and space,  to your clients as to where to find the service they are looking for.
  Second, they decide what this "right" answer is.
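  The split between the two services can be made concrete with a small sketch. Everything here is an assumption for illustration: the DataCenter fields, the decide() scoring rule and its load_weight parameter are invented, and no actual GLBS is claimed to work this way.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    address: str
    up: bool         # reachability as reported by monitoring
    load: float      # 0.0 idle .. 1.0 saturated (hypothetical measure)
    distance: float  # hypothetical proximity metric, lower is nearer


def decide(centers, load_weight=0.5):
    """Second service: pick the 'right' data center.

    Scores each reachable, unsaturated site by a weighted mix of load and
    normalized distance, and returns the lowest score (None if no site
    qualifies). The scoring rule is an assumption made for this sketch.
    """
    candidates = [c for c in centers if c.up and c.load < 1.0]
    if not candidates:
        return None
    max_distance = max(c.distance for c in candidates) or 1.0

    def score(center):
        proximity = center.distance / max_distance
        return load_weight * center.load + (1 - load_weight) * proximity

    return min(candidates, key=score)


def answer(center, ttl=30):
    """First service: hand the decision to the client, here as a plain dict."""
    if center is None:
        return None
    return {"address": center.address, "ttl": ttl}


centers = [
    DataCenter("east", "192.0.2.10", up=True, load=0.9, distance=10),
    DataCenter("west", "198.51.100.10", up=True, load=0.2, distance=50),
    DataCenter("eu", "203.0.113.10", up=False, load=0.1, distance=5),
]
balanced = answer(decide(centers))                   # proximity wins: east
load_first = answer(decide(centers, load_weight=0.9))  # load wins: west
```

  The point of keeping decide() and answer() apart is that an objection to one (how the answer is delivered) need not be an objection to the other (how it was calculated).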
  You post to NANOG and you get admonished about their efficacy on both counts. This is initially wrapped in appeals to love of God and country and general harm that might befall mankind but no one says what or why.
  On reflection, objections to the first part of this are usually along the "strict constructionist" point of view. No real harm comes from returning changing answers but when the Man who wrote the book jumps in with both feet you take pause.  He chides people for using stupid tricks. You wonder if they are stupid in the same way as the "For Dummies" series of books is not really for dummies.
  Objections to the determination of what the "right" answer is are more vociferous. Some immediately take the view that since the question was about DNS based load balancers, the inference was that the GLBS must be using DNS logistics to decide what the right answer is, even though DNS may simply be used to communicate the right answer (the first part), not to calculate it (the second part).
  The GLBS may indeed be using some measure of server load, or even BGP derived network maps, or some other knowledge of topology or proximity, but that gets drowned out by the objection that "the proximity of the DNS resolver to the GLBS is not a proxy for the actual end user".  The latter is actually strictly true, and it is difficult to argue with given the specific examples of where it fails, but no one is able to say how many times in normal use this technique actually returns a bad answer.
  You even hear from a man with one leg in the US and one in Europe using a split tunnel VPN, who wonders why, when he orders pizza using his tunnel to the HQ back in Europe, he doesn't get greasy satisfaction back in the US.  You wonder what happens when he calls 911 on his VoIP phone without having manually configured his PSAP in that configuration, but you have other problems to worry about at the moment. You also hear about the "AOL Proxy" effect masking all users behind it. Well, actually you don't hear that, but someone should have chimed in about that.
  You hear some mumbling about the use of AS path lengths or a geo-location database of end user IPs not being a true measure. Yet you wonder if the Internet is actually not getting more stable every day, and whether the nominal topology and the AS paths for the more heavily trafficked routes may actually not change that rapidly in the normal course.
  You also hear from others who have been using variations of GLBS for several years, and have even created large businesses by serving their customers this way. Their web sites are full of gleaming testimonials from these customers. Someone says no one got fired for using the GLBS... You wonder if those customers just bought insurance.

  You scratch your head some more. You want to order that pizza online but you decide against it.
  You realize for all its resiliency and elegance at the packet shoveling level,  the services architecture on the Internet still leaves a few things to be desired. 
  You finally realize that most of the objections, well reasoned as they have been, are on specific and narrow grounds and these may or may not actually matter in your situation. 

You understand that it is futile to look for the "best" answer every single time - you just want it a large portion of the time, while still meeting your site diversity goal (the failure of which will actually get you fired).

You finally look at your budget, you examine your proclivity to hack and tweak, you consider the other demands on your time, and you finally choose a solution that is somewhere on the nonlinear continuum of time, money and benefit: round robin DNS, special purpose DNS servers that also calculate the "right answer", HTTP redirectors that are topology aware, Anycast if appropriate, a GLBS appliance, a GLBS service, and other assorted glueware tossed in the middle.

Everyone is slightly dissatisfied, but hey, isn't that the hallmark of a successful negotiation?

