ISC DHCP server failover

David W. Hankins David_Hankins at isc.org
Sun Mar 21 17:33:39 UTC 2010


On Fri, Mar 19, 2010 at 05:10:04PM -0700, Mike wrote:
> With all due respect and acknowledgment of the tremendous contributions of 
> ISC and you yourself Mr. Hankins, I have to comment that failover in 
> isc-dhcp is broken by design because it requires the amount of handholding 
> and operator thinking in the event of a failure that you explained to us at 
> length is required. Failure needs to be handled automatically and without 
> any intervention at all, otherwise you might as well not have it and I 
> think most network operators would agree.

First let me say that I wasn't involved in failover's design; I'm only
a sort of "maintainer," so the criticism doesn't offend me in the
slightest. :)

Failover's design was chiefly concerned with the cross-country,
geographically diverse DHCP server situation, in the hope that solving
that case would also give the "HA", heartbeat-cable types of folks a
tool they could use, although it isn't explicitly designed for that
purpose alone.  That does tend to leave this community a little
under-served and unhappy, which was my motivation for the failover
features in 4.2 that try to support their needs better (auto
partner-down, greater endurance in comms-interrupted).
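
For readers who have not seen the 4.2 feature, here is a rough sketch
of how auto partner-down might appear in a dhcpd.conf failover
declaration.  The peer name, addresses, ports and every timer value
are illustrative assumptions, not recommendations; check the
dhcpd.conf manual page for your version before relying on any of it.

```
# Hypothetical primary-side failover declaration (all values assumed).
failover peer "example-failover" {
  primary;
  address 192.0.2.1;          # this server (documentation address)
  port 647;
  peer address 192.0.2.2;     # the failover partner
  peer port 647;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;
  split 128;
  load balance max seconds 3;

  # After this many seconds in communications-interrupted, move to
  # partner-down without operator intervention.
  auto-partner-down 7200;
}
```

The trade-off is that the server cannot tell a dead partner from a
broken link between two live servers, so the timer should be chosen
with that in mind.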

What you describe as an alternative (although I will criticize it
slightly: I think you are under-estimating DHCP's needs, and the
question of message delivery is really not relevant) amounts to the
building blocks for something I would refer to as "DHCP Server
Clustering".

I fully endorse it.

That is a set of separate programs that work together to appear from
the outside to be a single DHCP server (as those terms are defined in
the RFCs), along with the ways in which you can build in redundancy
and self-healing (self-restarting components, component failures that
only affect a subset of services, redundant processes that cover gaps
in coverage, etc).

In short, you're describing one of our key motivations for migrating
ISC DHCP to the BIND 10 framework.

That gives us a complete set of tools.  Within the same rack, you will
ultimately be able to implement what is, to all outside observation, a
"single server" that is actually implemented redundantly across (N+1)
systems*, or across CPUs within one system.  You still keep a failover
ability to tie two such geographically diverse clusters together (not
to mention co-habitation with BIND 10's DNS services in the same
configuration and monitoring plane), and the two failover peers don't
actually have to be clusters if you don't want all that baggage.

So everyone's happy.

Unfortunately at the moment we are still collecting sponsors for the
DHCP-in-BIND-10 project, and no shovels have been turned.  But I'm
confident the work will proceed (and if anyone wishes to help as a
sponsor or a participant, please contact us!  We are in Anaheim this
week, and there is also a link in my signature you can click).

In the meantime, failover is a tool we have whereas DHCP clustering
software is so far only a tool we want to create.

* Some objects in the future-mirror may be further away than they
  appear.

-- 
David W. Hankins	BIND 10 needs more DHCP voices.
Software Engineer		There just aren't enough in our heads.
Internet Systems Consortium, Inc.		http://bind10.isc.org/