Unicast Flooding

Matthew Huff mhuff at ox.com
Wed Jun 17 21:58:23 UTC 2009


Unicast flooding is a common occurrence in large datacenters, especially with asymmetrical paths caused by different first-hop routers (via HSRP, VRRP, etc.). We ran into this some time ago. Most ARP-sensitive systems, such as clusters, HSRP, and content switches, are smart enough to send out gratuitous ARPs, which eliminates the worry about increasing the timeouts. We haven't had any issues since we made the changes.

After debugging the problem, we added "mac-address-table aging-time 14400" to our data center switches. That sets the MAC aging time to the same value as the ARP timeout (14400 seconds, i.e. 4 hours).
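
As a rough illustration of why matching the two timers helps, the sketch below computes the worst-case window in which a router can still hold an ARP entry for a host whose MAC address the switch has already aged out. The 300-second MAC aging value is an assumption (a commonly cited switch default, not stated in this thread), the function name is hypothetical, and the model assumes the switch sees no frames from the host between ARP refreshes, as in the asymmetric-path case.

```python
# Assumed values for illustration only; check the actual timers on your gear.
ASSUMED_DEFAULT_MAC_AGING_S = 300   # assumption: common switch default
ARP_TIMEOUT_S = 14400               # value used in the command above (4 hours)


def worst_case_flood_window(arp_timeout_s, mac_aging_s):
    """Return the seconds during which the ARP entry can outlive the MAC entry.

    While the router still has an ARP entry but the switch has dropped the
    MAC entry, frames to that host are flooded as unknown unicast.
    """
    return max(0, arp_timeout_s - mac_aging_s)


if __name__ == "__main__":
    before = worst_case_flood_window(ARP_TIMEOUT_S, ASSUMED_DEFAULT_MAC_AGING_S)
    after = worst_case_flood_window(ARP_TIMEOUT_S, ARP_TIMEOUT_S)
    print("window with mismatched timers:", before, "seconds")
    print("window with matched timers:", after, "seconds")
```

Under those assumptions, the mismatched timers leave a window of 14100 seconds per ARP lifetime, and matching them reduces it to zero.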

----
Matthew Huff       | One Manhattanville Rd
OTA Management LLC | Purchase, NY 10577
http://www.ox.com  | Phone: 914-460-4039
aim: matthewbhuff  | Fax:   914-460-4139


> -----Original Message-----
> From: Brian Shope [mailto:blackwolf99999 at gmail.com]
> Sent: Wednesday, June 17, 2009 5:33 PM
> To: nanog at nanog.org
> Subject: Unicast Flooding
> 
> Recently while running a packet capture I came across some unicast
> flooding that was happening on my network.  One of our core switches
> didn't have the mac-address for a server, and was flooding all packets
> destined to that server.  It wasn't learning the mac-address because
> the server was responding to packets out on a different network card
> on a different switch.  The flooding I was seeing wasn't enough to
> cause any network issues, it was only a few megs, but it was something
> that I wanted to fix.
> 
> I've run into this issue before, and solved it by statically entering
> the mac-address into the cam tables.
> 
> I want to avoid this problem in the future, and I'm looking at two
> different things.
> 
> The first is preventing it in the first place.  Along those lines,
> I've seen some recommendations on-line about changing the arp and cam
> timeouts to be the same.  However, there seems to be a disagreement on
> which is better, making the arp timeouts match the cam table timeouts,
> or vice versa.  Also, when talking about this, everyone seems to be
> only considering routers, but what about the timers on a firewall?
> I'm worried that I might cause other issues by changing these timers.
> 
> The second thing I'm considering is monitoring.  I'd like to set up
> something to monitor for any excessive unicast flooding in the future.
> I understand that a little unicast flooding is normal, as the switch
> has to do a little bit of flooding to find out where people are.
> While looking for a way to monitor this, I came across the
> 'mac-address-table unicast-flood' command on Cisco switches.  This
> looked perfect for what I needed, but apparently it is currently not
> an option on 6500 switches with Sup720s.  Since there doesn't appear
> to be an option on Cisco that monitors specifically for unicast
> floods, I thought that maybe I could set up a server with a network
> card in promiscuous mode and then keep stats of all packets received
> that aren't destined for the server and that also aren't legitimate
> broadcasts or multicasts.  The only problem with that is that I don't
> want to have to completely custom build my own solution.  I was hoping
> that someone may have already created something like this, or that
> maybe there is a good reporting tool for wireshark or something that
> could generate the report that I want.
> 
> Anyone have any suggestions on either prevention/monitoring?
> 
> Thanks!!
> 
> -Brian
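
The promiscuous-mode monitoring idea in the quoted message could be sketched roughly as below. This is only the classification step, not a finished tool: it assumes the destination MAC addresses have already been extracted from a capture by some other means, and the function names and the sample addresses are hypothetical. A frame counts as flooded unicast here when its destination is not broadcast, not multicast (group bit clear in the first octet), and not one of the monitoring server's own addresses.

```python
from collections import Counter

BROADCAST = "ff:ff:ff:ff:ff:ff"


def classify_dst(dst_mac, local_macs):
    """Classify one destination MAC as seen by a promiscuous-mode NIC.

    local_macs is a set of lowercase MAC strings owned by the monitor itself.
    """
    dst = dst_mac.lower()
    if dst == BROADCAST:
        return "broadcast"
    # The least significant bit of the first octet marks a group address.
    first_octet = int(dst.split(":")[0], 16)
    if first_octet & 0x01:
        return "multicast"
    if dst in local_macs:
        return "local"
    # Unicast for someone else: the switch flooded it to this port.
    return "flooded_unicast"


def summarize(dst_macs, local_macs):
    """Count frames per class for a list of destination MAC strings."""
    local = {m.lower() for m in local_macs}
    return Counter(classify_dst(m, local) for m in dst_macs)


if __name__ == "__main__":
    sample = [
        "ff:ff:ff:ff:ff:ff",   # broadcast
        "01:00:5e:00:00:01",   # IPv4 multicast
        "00:11:22:33:44:55",   # the monitor's own NIC (hypothetical)
        "00:aa:bb:cc:dd:ee",   # someone else's unicast
    ]
    print(summarize(sample, ["00:11:22:33:44:55"]))
```

Tracking the "flooded_unicast" count over time, and alerting when it exceeds whatever baseline is normal for the segment, would approximate the report described above.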



