Software router state of the art

Kevin Oberman oberman at es.net
Wed Jul 23 21:23:18 UTC 2008


> Date: Wed, 23 Jul 2008 16:51:50 -0400
> From: "William Herrin" <herrin-nanog at dirtside.com>
> Sender: wherrin at gmail.com
> 
> On Wed, Jul 23, 2008 at 3:59 PM, Kevin Oberman <oberman at es.net> wrote:
> >> The first bottleneck is the interrupts from the NIC. With a generic
> >> Intel NIC under Linux, you start to lose a non-trivial number of
> >> packets around 700mbps of "normal" traffic because it can't service
> >> the interrupts quickly enough.
> >
> > Most modern high performance network cards support MSI (Message Signaled
> > Interrupts) which generate real interrupts only on an intelligent
> > basis, and only at a controlled rate. Windows, Solaris and FreeBSD have
> > support for MSI and I think Linux does, too. It requires both hardware
> > and software support.
> 
> "ethtool -c". Thanks Sargun for putting me on to "I/O Coalescing."
> 
> But cards like the Intel Pro/1000 have 64k of memory for buffering
> packets, both in and out. Few have very much more than 64k. 64k means
> 32k to tx and 32k to rx. Means you darn well better generate an
> interrupt when you get near 16k so that you don't fill the buffer
> before the 16k you generated the interrupt for has been cleared. Means
> you're generating an interrupt at least for every 10 or so 1500 byte
> packets.

You have just hit on a huge problem with most (all?) 1G and 10G
hardware. The buffers are way too small for optimal performance. In any
case where the RTT is anything more than half a millisecond, you exhaust
the window and stall the stream.

I need to move multi-gigabit streams across the country and between the
US and Europe. Those are a bit too far apart for those tiny buffers to
be of any use at all. This would require 3 GB of buffers. This same
problem also makes TCP off-load of no use at all.
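
For anyone who wants to play with the numbers, here is a rough sketch of
the bandwidth-delay arithmetic behind the buffer complaint. The function
names, the 1 Gbps rate and the 70 ms and 150 ms RTTs are illustrative
assumptions, not measurements from any real path, and the sketch does not
try to reproduce the 3 GB figure above, which depends on stream counts and
rates not given here.

```python
def bdp_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a path of this rate and RTT full."""
    return rate_bps * rtt_s / 8


def max_rtt_s(buffer_bytes: float, rate_bps: float) -> float:
    """Longest RTT a buffer of this size can cover at line rate before stalling."""
    return buffer_bytes * 8 / rate_bps


if __name__ == "__main__":
    # A 64k buffer at 1 Gbps only covers roughly half a millisecond of RTT.
    print(f"64 KiB at 1 Gbps covers {max_rtt_s(64 * 1024, 1e9) * 1e3:.3f} ms")

    # Hypothetical long-haul paths (RTT values are assumptions).
    for label, rtt in (("cross-country", 0.070), ("US to Europe", 0.150)):
        mb = bdp_bytes(1e9, rtt) / 1e6
        print(f"{label}: {mb:.2f} MB in flight per 1 Gbps stream")
```

Scale the rate up to 10G or add more streams and the required buffering
grows linearly, which is why a few tens of kilobytes on the card is not
in the same league.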
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman at es.net			Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751