MTU of the Internet?

Sean M. Doran smd at clock.org
Sun Feb 8 19:33:14 UTC 1998


Paul A Vixie <paul at vix.com> writes:

> yes.  this is why we have a window scale tcp option now.  and this buffer
> requirement is per end-host, whether there are multiple tcp sessions or
> just one.

You need to support approximately a window's worth of
buffering in each device along the path where there is the
possibility of transient congestion; this turns out to
mean you need approximately the line bandwidth * ca. 400ms
per link (or at least space for a decent delay * bandwidth
product on very, very fast links).  See the work by Guy
Almes et al. on the T3 NSFNET link to Hawaii, and endless
rants by me, Peter and Curtis.
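
To make that arithmetic concrete, here is a back-of-the-envelope
sketch in Python (the helper name and the example link rates are
illustrative; the ~400ms figure is the one mentioned above):

    # Rough per-link buffer estimate: bandwidth * delay.
    # Link rates here are just standard T1/T3 figures used
    # for illustration.
    def buffer_bytes(link_bps, delay_s=0.400):
        return link_bps * delay_s / 8

    for name, bps in [("T1", 1_544_000), ("T3", 44_736_000)]:
        print(f"{name}: ~{buffer_bytes(bps) / 1024:.0f} KB per interface")

Which puts a T3 at roughly a couple of megabytes of buffer per
interface, and even a T1 at around 75 KB.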

So, big amounts of unacknowledged data == big amounts of buffer.

Big amounts of buffer not being used == big amounts of
space for TCP to probe into and fill up.  Bad news
for a number of reasons.  Meet RED, which keeps buffer
occupation small, leaving it available for dealing with
transient congestion, so already-transmitted unacknowledged
data from far away don't get lost when someone clicks on a
web page or starts an FTP or something.
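
For those who haven't met RED, here is a minimal sketch of the
idea in Python; all the thresholds, weights and names below are
illustrative assumptions, not tuning advice or anyone's shipping
implementation:

    import random

    class RedQueue:
        """Minimal RED sketch: a smoothed average of queue depth
        drives an early-drop probability, so the average occupancy
        stays low and headroom remains for transient bursts."""
        def __init__(self, min_th=5000, max_th=15000,
                     max_p=0.02, weight=0.002):
            self.min_th, self.max_th = min_th, max_th
            self.max_p, self.weight = max_p, weight
            self.avg = 0.0     # smoothed queue depth, bytes
            self.depth = 0     # instantaneous queue depth, bytes

        def enqueue(self, pkt_len):
            # Update the moving average before deciding.
            self.avg = (1 - self.weight) * self.avg + self.weight * self.depth
            if self.avg < self.min_th:
                drop = False
            elif self.avg >= self.max_th:
                drop = True
            else:
                frac = (self.avg - self.min_th) / (self.max_th - self.min_th)
                drop = random.random() < self.max_p * frac
            if not drop:
                self.depth += pkt_len
            return not drop    # True == packet accepted

        def dequeue(self, pkt_len):
            self.depth = max(0, self.depth - pkt_len)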

> at least one satellite ip provider is evaluating transparent caching as a
> way to obviate these effects by aggregating dumb-client connections into
> something smarter for the sky link.

At least six such companies have wandered into either my
office or my managing director's in the past three months.
At least four other companies have telephoned.  Just FYI. --:)

> > 	So, if you have a system with per-connection buffering, and
> > that hits a "speed limit" of 512 kbits/sec over your satellite T1,
> > you can raise the speed up to the full T1 by opening 3 streams.  Thus
> > more connections == faster downloads.
> 
> it's still the same amount of buffering per end host though.

The amount of per-interface buffering on routers is,
optimally, a function of the propagation delay and
bandwidth of the network, whether there is one flow or ten
flows or ten thousand flows passing through.  Based on
limited experiments done with various satellites used for
real transatlantic circuit restoration, this router
buffering is more of a "speed limit" than host buffering
is.  The problem is that a short transient spike comes
along and eats a couple of segments in a very wide window,
causing slow-start.  From anecdotal evidence, this often
happens well away from the pair of routers on either end
of the satellite connection.
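
For what it's worth, the arithmetic behind that quoted 512
kbit/sec ceiling is just window / RTT.  A rough sketch in Python
(the 32 KB window is an assumed value chosen for illustration,
not something measured in those tests):

    # A window-limited TCP flow cannot exceed window / RTT,
    # however fat the link.  The 32 KB window is an assumption.
    WINDOW_BYTES = 32 * 1024
    RTT_SECONDS = 0.500                 # geostationary round trip
    T1_BPS = 1_544_000

    per_flow_bps = WINDOW_BYTES * 8 / RTT_SECONDS   # ~524 kbit/s
    flows_needed = T1_BPS / per_flow_bps            # ~3 flows
    print(f"one flow tops out near {per_flow_bps / 1000:.0f} kbit/s")
    print(f"about {flows_needed:.1f} such flows fill a T1")

Which is why opening three streams looks "faster" even though
neither the path nor the routers have changed.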

> > 	... The same group ... did some satellite testing, trying
> > a parallel FTP client they had.  They benchmarked 1, 2, 4, 8, and 16
> > streams between two SunOS 4 boxes over a T1 satellite (500ms RTT).
> > Maximum performance (shortest transfer time) was achieved with 4
> > TCP streams.  
> 
> that's got more to say about the sunos tcp implementation than about tcp
> as a protocol.

No, it has more to say about the test methodology.
It is difficult to interfere with your own packets if you
have only one interface to transmit on...

A better small-scale test would involve several sinks and
sources clustered around a router that does not impose a
round-robin or similar serialization scheme on inbound
packets, combined with a "burst generator" which sends
traffic across that single router, interfering with stuff
coming down from the satellite.

Cisco, to their credit, has in the past spent lots of time
studying this, based on work led by Curtis and others on
the performance of the Bellcore ATM NAPs, and on other
experience garnered here and there.  Other vendors are
probably not ignorant of these testing methodologies...

Anyone at ANS who wants to talk about the IBM traffic
recorder is welcome to do so. :)

	Sean.


