Thousands of hosts on a gigabit LAN, maybe not

Brandon Martin lists.nanog at monmotha.net
Fri May 8 19:41:31 UTC 2015


On 05/08/2015 02:53 PM, John Levine wrote:
> Some people I know (yes really) are building a system that will have
> several thousand little computers in some racks.  Each of the
> computers runs Linux and has a gigabit ethernet interface.  It occurs
> to me that it is unlikely that I can buy an ethernet switch with
> thousands of ports, and even if I could, would I want a Linux system
> to have 10,000 entries or more in its ARP table.
>
> Most of the traffic will be from one node to another, with
> considerably less to the outside.  Physical distance shouldn't be a
> problem since everything's in the same room, maybe the same rack.
>
> What's the rule of thumb for number of hosts per switch, cascaded
> switches vs. routers, and whatever else one needs to design a dense
> network like this?  TIA

Unless you have some dire need to get these all on the same broadcast 
domain, numbers like that on a single L2 would send me running for the 
hills for lots of reasons, some of which you've identified.

I'd find a good L3 switch, put no more than ~200-500 IPs on each L2, and 
let the switch handle gluing it together at L3.  With the proper 
hardware, this is a fully line-rate operation and should have no real 
downsides aside from splitting up the broadcast domains (if you do need 
multicast, make sure your gear can do it).  With a divide-and-conquer 
approach, you shouldn't have problems fitting the L2+L3 tables into even 
a pretty modest L3 switch.
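To make the divide-and-conquer concrete, here's a rough Python sketch 
of carving one aggregate into per-L2 subnets in that size range.  The 
10.0.0.0/16 aggregate and the /24 subnet size are illustrative choices 
on my part, not anything from the original question:

  import ipaddress

  aggregate = ipaddress.ip_network(u"10.0.0.0/16")  # hypothetical block

  # /24s give ~254 usable hosts per L2, inside the ~200-500 range above
  hosts = 10000
  needed = -(-hosts // 254)  # ceiling division: 40 subnets

  for net in list(aggregate.subnets(new_prefix=24))[:needed]:
      print(net, "gateway", net.network_address + 1)

Forty-odd /24s with the L3 switch routing between them keeps every 
per-segment table comfortably small.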

The densest chassis switches I know of get about 96 ports per RU (48 
ports each on a half-width blade, though you need breakout panels to 
get standard RJ45 8P8C connectors since the blades have MRJ21s), less 
rack overhead for power supplies, management, etc.  That should get you 
~2000 ports per rack [1].  Such switches can be quite expensive.  The 
trend seems to be toward stacking pizza boxes these days, though.  Get 
the number of ports you need per rack (you're presumably not putting 
all 10,000 nodes in a single rack) and aggregate up one or two layers. 
This gives you a pretty good candidate for your L2/L3 split.
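A quick back-of-envelope for the pizza-box route (the 48-port leaf 
size and 4 uplinks per leaf are assumptions for illustration, not a 
recommendation):

  # Rough port budget for 10,000 nodes on 48-port leaf switches
  nodes = 10000
  leaf_ports = 48
  uplinks_per_leaf = 4                            # toward aggregation
  hosts_per_leaf = leaf_ports - uplinks_per_leaf  # 44

  leaves = -(-nodes // hosts_per_leaf)            # ceil -> 228 leaves
  agg_ports = leaves * uplinks_per_leaf           # 912 agg-facing ports
  print("%d leaves, %d aggregation-facing ports" % (leaves, agg_ports))

A subnet per leaf (or per small group of leaves) then falls out as a 
natural L2/L3 split.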

[1] Purely as an example, you can cram 3x Brocade MLX-16 chassis into a 
42U rack (with 0RU to spare).  That gives you 48 slots for line cards. 
Leaving at least one slot in each chassis for 10Gb or 100Gb uplinks to 
something else, 45 slots x 48 ports = 2160 1000BASE-T ports 
(electrically) in a 42U rack, and you'll need 45 more RU somewhere for 
breakout patch panels!
-- 
Brandon Martin


