US patent 5473599

Eygene Ryabinkin rea+nanog at grid.kiae.ru
Wed May 7 07:36:02 UTC 2014


Constantine,

Tue, May 06, 2014 at 06:11:04PM -0700, Constantine A. Murenin wrote:
> On 6 May 2014 15:17, David Conrad <drc at virtualized.org> wrote:
> > Except it wasn't useless: it was, in fact, in use by VRRP.
> > Further, the OpenBSD developers chose to squat on 240 for pfsync -
> > a number that has not yet been allocated.  If the OpenBSD
> > developers were so concerned about making the best choice, it
> > seems odd they chose an allocated number for one protocol and an
> > unallocated number for another protocol.
> 
> Can you explain why exactly do you find this odd?
> 
> VRRP/HSRP have had only one protocol number allocated to it; it's not
> like it had two, so, another one had to come out of somewhere else.

VRRP/HSRP come from Cisco (well, VRRP has been an RFC for some time,
but its origin is Cisco too), so there is a straightforward route for
agreeing on protocol versioning within the same protocol ID.

And _I_ find this odd because you would probably find me very odd if
I came into your house, found a currently unused room and said
"oh, I'll live here; no one will be confused or troubled, the room is
unused".

Seriously, do you think that "we used the same protocol number, but
the version field tells you what's going on" is a really good excuse?
Have you ever run tcpdump on CARP traffic, seen it decode the packets
as VRRP, and discovered that you need '-T carp'?  Do you think that
in-hardware or in-software implementations really need to check an
extra field in the packet's contents to judge whether this is VRRP (or
CARP), instead of just testing the protocol number and leaving packets
for an unsupported protocol out of the path?  Do you think that the
extra coordination with Cisco, so that it does not reuse the VRRP/HSRP
protocol version that was unused at that time (and picked by CARP), is
really worth it?
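
To illustrate the extra check, here is a minimal Python sketch.  It
assumes that the version sits in the high 4 bits of the first header
byte (as in the VRRP RFCs) and that CARP lays out its header the same
way.  The function names and the version table are hypothetical; the
version values each protocol actually sends are not given in this
thread, so the caller supplies them.

```python
VRRP_IP_PROTO = 112  # IANA-assigned IP protocol number for VRRP


def classify_by_protocol_only(ip_proto):
    """Dispatch that suffices when every protocol has its own number."""
    return "vrrp" if ip_proto == VRRP_IP_PROTO else "other"


def classify_shared_number(ip_proto, payload, version_map):
    """Dispatch needed once two protocols share one protocol number.

    version_map is a caller-supplied (hypothetical) table mapping the
    4-bit version value to a protocol name.
    """
    if ip_proto != VRRP_IP_PROTO:
        return "other"
    if not payload:
        return "unknown"
    version = payload[0] >> 4  # high nibble of the first header byte
    return version_map.get(version, "unknown")


# Placeholder version values, NOT real assignments.
example_map = {2: "proto-a", 3: "proto-b"}
print(classify_shared_number(112, bytes([0x21]), example_map))
print(classify_shared_number(112, bytes([0x31]), example_map))
```

The first function needs only the IP header; the second has to look
into the payload and needs a version table that somebody has to keep
in sync across vendors.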

Tue, May 06, 2014 at 07:43:11PM -0700, Constantine A. Murenin wrote:
> On 6 May 2014 18:51, Jared Mauch <jared at puck.nether.net> wrote:
> > On May 6, 2014, at 9:11 PM, Constantine A. Murenin <mureninc at gmail.com> wrote:
> >> So, then the only problem, perhaps, is that noone has apparently
> >> bothered to explicitly document that both VRRP and CARP use
> >> 00:00:5e:00:01:xx MAC addresses, and that the "xx" part comes from the
> >> "Virtual Router IDentifier (VRID)" in VRRP and "virtual host ID
> >> (VHID)" in CARP, providing a colliding namespace, so, one cannot run
> >> both with the same Virtual ID on the same network segment.
> >
> > Or that CARP didn't get their OUI, ask for help from one of the
> > vendors that supports *BSD for use of their space or something
> > else.
> 
> Politics.  Again, this is a non-issue for most users -- there's a very
> easy, straightforward and complete workaround.

That you haven't seen cases where the same VRIDs in the same network
were used for both VRRP and CARP doesn't mean that they aren't
occurring in the real world.  We use CARP and VRRP quite extensively,
and when we were first hit by this issue, it was not that funny.  And
our Cisco folks had no knowledge of CARP, because they are just
Cisco-heads (and because there is no CARP specification of any kind).
Besides, such a "clever" choice of OUI limits the total number of
CARP + HSRP/VRRP instances in the same L2 network to 256 instead of
256 + 256.  When you're running many CARP- and VRRP-based clusters,
especially with ARP load-balancing (multiple VRIDs for a single IP),
this pushes the VRID space somewhat close to its limits.
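
To make the collision concrete, here is a minimal Python sketch.  It
assumes, as quoted above, that both protocols build the virtual MAC
as 00:00:5e:00:01:xx with xx taken from the one-byte VRID or VHID.
The names virtual_mac and find_mac_clashes are made up for this
illustration and are not part of any CARP or VRRP implementation.

```python
from collections import defaultdict

VIRTUAL_MAC_PREFIX = "00:00:5e:00:01"  # prefix quoted in this thread


def virtual_mac(group_id):
    """Build the virtual MAC for a one-byte VRID/VHID."""
    if not 0 <= group_id <= 0xFF:
        raise ValueError("group ID must fit in one byte")
    return f"{VIRTUAL_MAC_PREFIX}:{group_id:02x}"


def find_mac_clashes(groups):
    """groups: iterable of (protocol_name, group_id) on one L2 segment.

    Returns {mac: sorted protocol names} for every MAC that is claimed
    by more than one protocol.
    """
    owners = defaultdict(set)
    for proto, group_id in groups:
        owners[virtual_mac(group_id)].add(proto)
    return {mac: sorted(p) for mac, p in owners.items() if len(p) > 1}


segment = [("vrrp", 1), ("vrrp", 7), ("carp", 7), ("carp", 20)]
print(find_mac_clashes(segment))
# -> {'00:00:5e:00:01:07': ['carp', 'vrrp']}
```

Because the last byte is the only part that varies, both protocols
draw from the same 256 values; with separate prefixes the two ID
spaces would not overlap.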

One may say (and I have seen all of these in real-world conversations)
that

 a. people who use CARP and VRRP should know what both of them are and
    avoid having the same VRIDs;

 b. no sane person will use CARP load-balancing;

 c. the probability of having the same VRIDs (and, thus, MACs) is
    small,

but choosing the OUI from the VRRP space (hijacking that space) was
clearly a poor design choice.  Full stop.  You may rant about Google,
SPDY and other stuff, but giving examples of people doing crueler
things doesn't help to alleviate the problem we're talking about.
That's just a polite way of saying "CARP developers were right, piss
off".


Getting the same protocol ID and reusing the OUI assignment is a
potential point of confusion and errors.  It manifests itself in the
real world.  Such potential points of breakage tend to strike at the
worst possible time, and current networks are complicated enough even
without this mess; I think that if you have worked in a complex
network environment, you know what I am talking about.  If not, well,
just think of OSPF, BGP, BFD, VRRP, MPLS, LACP, PAgP, ESRP, xSTP and
the other protocols that are running today in any reasonably mature
network (not to mention fancier stuff like TRILL, SDN and others that
tend to show up and are even necessary in some cases, but are not very
broadly deployed just now).  Does anyone want more sources of problems
here?  You appear to believe that it isn't an issue; well, that's your
mindset.  Other people think that there is some level of controversy
in all this.

Having the same protocol ID and OUI, but bumping the protocol version
looks like "hey, we have a better VRRP over here, let's use that", but
that didn't work well, as we see now, and we have what we have: two
incompatible protocols that use the same IDs and sometimes clash.
Just learn the lesson and do the relevant parts of the protocol design
better next time.  And the "next time" can mean CARPv2 ;))


My personal view (if someone is interested) is that CARP did and still
does a very good job, and at the time of its appearance it was
technically superior to VRRP/HSRP, just by having authentication.  And
it is/was a free and non-patented thing, which is good for some
environments.  But some points in the CARP design are very rough and
tend to create a network mess.  One can learn and avoid them, but it
would be much better if these points had been eliminated from the
beginning or, at least, are eliminated in the course of its further
life.

Shutting up.
-- 
Eygene "using CARP and VRRP extensively" Ryabinkin,
National Research Centre "Kurchatov Institute"

Always code as if the guy who ends up maintaining your code will be
a violent psychopath who knows where you live.

