Recommended L2 switches for a new IXP

Tue Jan 13 21:50:03 UTC 2015

Manuel Marín writes:
> Dear Nanog community
> [...] There are so many options that I don't know if it makes sense to
> start with a modular switch (usually expensive because the backplane,
> dual dc, dual CPU, etc) or start with a 1RU high density switch that
> support new protocols like Trill and that supposedly allow you to
> create Ethernet Fabric/Clusters. The requirements are simple, 1G/10G
> ports for exchange participants, 40G/100G for uplinks between switches
> and flow support for statistics and traffic analysis.

Stupid thought from someone who has never built an IXP,
but has been looking at recent trends in data center networks:

There are these "white-box" switches mostly designed for top-of-rack or
spine (as in leaf-spine/fat-tree datacenter networks) applications.
They have all the necessary port speeds - well, 100G seems to be a few
months off.  I'm thinking of brands such as Edge-Core, Quanta etc.

You can get them as "bare-metal" versions with no switch OS on them,
just a bootloader according to the "ONIE" standard.  Equipment cost
seems to be on the order of $100 per SFP+ port w/o optics for a
second-to-last generation (Trident-based) 48*10GE+4*40GE ToR switch.

Now, for the limited and somewhat special L2 needs of an IXP, couldn't
"someone" hack together a suitable switch OS based on Open Network Linux
(ONL) or something like that?

You wouldn't even need MAC address learning or most types of flooding,
because at an IXP this often hurts rather than helps.  For building
larger fabrics you might be using something other (waves hands) than
TRILL; maybe you could get away without slightly complex "multi-chassis
multi-channel" mechanisms, and so on.
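
To make the "no MAC learning, no flooding" idea a bit more concrete,
here is a minimal sketch in Python.  Everything in it is an assumption
for illustration: the participant registry, the port numbers and the
MAC addresses are made up, and a real switch OS would program such a
table into the forwarding hardware rather than decide per frame in
software.  Broadcast handling (e.g. ARP) is deliberately left out.

```python
# Hypothetical participant registry: switch port -> MAC address that the
# participant registered with the IXP.  All values are invented.
PARTICIPANTS = {
    1: "02:00:00:00:00:01",
    2: "02:00:00:00:00:02",
    3: "02:00:00:00:00:03",
}


def build_static_fdb(participants):
    """Build a MAC -> port table from the registry instead of learning it."""
    fdb = {}
    for port, mac in participants.items():
        mac = mac.lower()
        if mac in fdb:
            raise ValueError(
                f"MAC {mac} registered on ports {fdb[mac]} and {port}")
        fdb[mac] = port
    return fdb


def forward(fdb, in_port, src_mac, dst_mac):
    """Return the output port for a unicast frame, or None to drop it."""
    src_mac, dst_mac = src_mac.lower(), dst_mac.lower()
    # Drop frames whose source MAC is not the one registered on this port.
    if fdb.get(src_mac) != in_port:
        return None
    # Unknown destination: drop instead of flooding.
    out_port = fdb.get(dst_mac)
    if out_port is None or out_port == in_port:
        return None
    return out_port


fdb = build_static_fdb(PARTICIPANTS)
print(forward(fdb, 1, "02:00:00:00:00:01", "02:00:00:00:00:02"))
```

The point of the sketch is only that the table comes from configuration,
so an unknown source or destination results in a drop rather than in
learning or flooding.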

"Flow support" sounds somewhat tough, but full NetFlow support that
would get Roland Dobbins' "usable telemetry" seal of approval is
probably out of reach anyway - it's a high-end feature on classical
gear.  With white-box switches, you could try to use the given 5-tuple
flow hardware capabilities (which might not scale that well), or use
packet sampling, or try to use the built-in flow and counter mechanisms
in an application-specific way.  (Except *that's* a lot of work on the
software side, and a usably efficient implementation requires slightly
sophisticated hardware/software interfaces.)
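
For the packet sampling option, the usual arithmetic is to pick roughly
one packet in N at random and multiply the sampled byte counts by N.
A rough Python sketch follows; the (source, destination, length) tuple
format and the function names are my own assumptions for illustration,
not any particular switch's or collector's interface.

```python
import random
from collections import Counter


def sample_packets(packets, rate, rng):
    """Yield roughly one packet in `rate`, chosen at random."""
    for pkt in packets:
        if rng.randrange(rate) == 0:
            yield pkt


def estimate_bytes(samples, rate):
    """Scale sampled byte counts by the sampling rate, per (src, dst) pair."""
    totals = Counter()
    for src, dst, length in samples:
        totals[(src, dst)] += length * rate
    return totals


# Hypothetical traffic between two participants "A" and "B".
packets = [("A", "B", 1500)] * 10000
samples = sample_packets(packets, 100, random.Random())
print(estimate_bytes(samples, 100))
```

The estimate is statistical, so it gets noisy for pairs that exchange
few packets; that is the price for not keeping per-flow state in the
switch.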

Instead of a Linux-based switch OS, one could also build an IXP
"application" using OpenFlow and some kind of central controller.
(Not to be confused with "SDX: Software Defined Internet Exchange".)

Has anybody looked into the feasibility of this?

The software could be done as an open-source community project to make
setting up regional IXPs easier/cheaper.

Large IXPs could sponsor this so they get better scalability - although
I'm not sure how well something like the leaf-spine/fat-tree design maps
to these IXPs, which are typically distributed over several locations.
Maybe they could use something like Facebook's new design [1], treating
each IXP location as a "pod".
-- 
Simon.
[1] https://code.facebook.com/posts/360346274145943