400G forwarding - how does it work?

Dave Taht dave.taht at gmail.com
Sun Aug 7 17:34:17 UTC 2022

If it's of any help... the bloat mailing list at lists.bufferbloat.net has
the largest concentration of queue theorists and network operators +
developers I know of. (Also, bloat readers: this ongoing thread on NANOG
about 400Gbit is fascinating.)

There are 10+ years' worth of debate in the archives:
https://lists.bufferbloat.net/pipermail/bloat/2012-May/thread.html as one

On Sun, Aug 7, 2022 at 10:14 AM dip <diptanshu.singh at gmail.com> wrote:

> Disclaimer: I often use the M/M/1 queuing assumption for much of my work
> to keep the maths simple and believe that I am reasonably aware in which
> context it's a right or a wrong application :). Also, I don't intend to
> change the core topic of the thread, but since this has come up, I couldn't
> resist.
> >> With 99% load M/M/1, 500 packets (750kB for 1500B MTU) of
> >> buffer is enough to make packet drop probability less than
> >> 1%. With 98% load, the probability is 0.0041%.
> To expand the above a bit so that there is no ambiguity: the above assumes
> that the router behaves like an M/M/1 queue. The expected number of packets
> in the system is given by
>     $E[N] = \frac{\rho}{1 - \rho}$
> where $\rho$ is the utilization. The probability that at least B packets
> are in the system is given by $\rho^B$, where B is the number of packets
> in the system. For a link utilization of .98, the packet drop probability
> is .98**(500) = 0.000041, i.e. 0.0041%. For a link utilization of 99%,
> .99**500 = 0.00657, i.e. 0.657%.
Regrettably, TCP congestion controls (CCs), by design, do not stop growing
until they get that drop, i.e. until they hit 100+% utilization.
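
For anyone who wants to check the quoted M/M/1 arithmetic, here is a minimal
Python sketch. It assumes only the textbook M/M/1 results (mean occupancy
rho/(1-rho), tail probability rho**B). The function names are mine, purely
for illustration, and none of this models real TCP behavior.

```python
import math


def mm1_mean_in_system(rho):
    """Expected number of packets in an M/M/1 system: rho / (1 - rho)."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return rho / (1.0 - rho)


def mm1_tail_probability(rho, b):
    """P(at least b packets in the system) = rho ** b for an M/M/1 queue."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return rho ** b


def buffer_for_target(rho, target):
    """Smallest b (in packets) such that rho ** b <= target."""
    if not (0.0 < rho < 1.0 and 0.0 < target < 1.0):
        raise ValueError("rho and target must both be in (0, 1)")
    return math.ceil(math.log(target) / math.log(rho))


if __name__ == "__main__":
    buffer_packets = 500
    mtu_bytes = 1500
    print(f"buffer size: {buffer_packets * mtu_bytes} bytes")
    for rho in (0.98, 0.99):
        p = mm1_tail_probability(rho, buffer_packets)
        mean = mm1_mean_in_system(rho)
        print(f"rho={rho}: P(N >= {buffer_packets}) = {p:.4%}, "
              f"mean occupancy = {mean:.0f} packets")
```

Under the same assumption, buffer_for_target(0.99, 0.01) comes out at 459
packets, which is where the "500 packets is enough at 99% load" figure comes
from. It says nothing about what happens once the arrivals stop being Poisson.
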

> >> When many TCPs are running, burst is averaged and traffic
> >> is poisson.
> M/M/1 queuing assumes that traffic is Poisson, and the Poisson assumption
> requires that:
> 1) The number of sources is infinite
> 2) The traffic arrival pattern is random.
> I think the second assumption is the one I often question: is the
> traffic arrival pattern truly random? I have seen cases where traffic
> behaves more like a self-similar process. Most Poisson models rely on the
> central limit theorem, which loosely states that the sample distribution
> will approach a normal distribution as we aggregate more from various
> distributions. The mean will smooth towards a value.
> Do you have any good pointers where the research has been done that
> today's internet traffic can be modeled accurately by Poisson? For as many
> papers supporting Poisson, I have seen as many papers saying it's not
> Poisson.
> https://www.icir.org/vern/papers/poisson.TON.pdf
> https://www.cs.wustl.edu/~jain/cse567-06/ftp/traffic_models2/#sec1.2

I am firmly in the not-Poisson camp. However, by inserting (especially) FQ
and AQM techniques on the bottleneck links, it is very possible to smooth
traffic into something closer to this more analytically tractable model,
and to gain enormous benefits from doing so.
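
A rough way to see why the Poisson question matters is the index of
dispersion for counts (variance over mean of arrivals per time window),
which is 1 for a Poisson process. The sketch below compares exponential
interarrival gaps with heavy-tailed Pareto gaps of the same mean. The Pareto
source is only a crude stand-in for bursty traffic, not a proper self-similar
model, and the rate, shape, seed and function names are all assumptions made
for illustration.

```python
import random
from statistics import mean, pvariance


def counts_per_window(interarrival, n_windows, window=1.0):
    """Count arrivals in each of n_windows consecutive time windows.

    interarrival is a zero-argument callable returning the next gap.
    """
    counts = [0] * n_windows
    horizon = n_windows * window
    t = interarrival()
    while t < horizon:
        # Clamp the index to guard against floating-point edge cases.
        idx = min(int(t // window), n_windows - 1)
        counts[idx] += 1
        t += interarrival()
    return counts


def index_of_dispersion(counts):
    """Variance-to-mean ratio of per-window counts (1.0 for Poisson)."""
    return pvariance(counts) / mean(counts)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so runs are repeatable
    n = 10000

    # Poisson arrivals: exponential gaps, mean gap 0.1 (10 per window).
    poisson = counts_per_window(lambda: rng.expovariate(10.0), n)

    # Heavy-tailed gaps: Pareto with shape 1.5 has mean 3, so dividing
    # by 30 gives the same mean gap of 0.1.
    bursty = counts_per_window(lambda: rng.paretovariate(1.5) / 30.0, n)

    print("Poisson IDC:", round(index_of_dispersion(poisson), 2))
    print("Heavy-tailed IDC:", round(index_of_dispersion(bursty), 2))
```

The Poisson source should come out near 1 and the heavy-tailed source
noticeably above it, even though both offer the same average load. That gap
is what the M/M/1 drop estimates do not capture.
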

> On Sun, 7 Aug 2022 at 04:18, Masataka Ohta <
> mohta at necom830.hpcl.titech.ac.jp> wrote:
>> Saku Ytti wrote:
>> >> I'm afraid you imply too much buffer bloat only to cause
>> >> unnecessary and unpleasant delay.
>> >>
>> >> With 99% load M/M/1, 500 packets (750kB for 1500B MTU) of
>> >> buffer is enough to make packet drop probability less than
>> >> 1%. With 98% load, the probability is 0.0041%.
>> > I feel like I'll live to regret asking. Which congestion control
>> > algorithm are you thinking of?
>> I'm not assuming LAN environment, for which paced TCP may
>> be desirable (if bandwidth requirement is tight, which is
>> unlikely in LAN).
>> > But Cubic and Reno will burst tcp window growth at sender rate, which
>> > may be much more than receiver rate, someone has to store that growth
>> > and pace it out at receiver rate, otherwise window won't grow, and
>> > receiver rate won't be achieved.
>> When many TCPs are running, burst is averaged and traffic
>> is poisson.
>> > So in an ideal scenario, no we don't need a lot of buffer, in
>> > practical situations today, yes we need quite a bit of buffer.
>> That is an old theory known to be invalid (Ethernet switches with
>> small buffers are enough for IXes) and theoretically denied by:
>>         Sizing router buffers
>>         https://dl.acm.org/doi/10.1145/1030194.1015499
>> after which paced TCP was developed for unimportant exceptional
>> cases of LAN.
>>  > Now add to this multiple logical interfaces, each having 4-8 queues,
>>  > it adds up.
>> Having so many queues requires sorting of queues to properly
>> prioritize them, which costs a lot of computation (and
>> performance loss) for no benefit and is a bad idea.
>>  > Also the shallow ingress buffers discussed in the thread are not delay
>>  > buffers and the problem is complex because no device is marketable
>>  > that can accept wire rate of minimum packet size, so what trade-offs
>>  > do we carry, when we get bad traffic at wire rate at small packet
>>  > size? We can't empty the ingress buffers fast enough, do we have
>>  > physical memory for each port, do we share, how do we share?
>> People who use irrationally small packets will suffer, which is
>> not a problem for the rest of us.
>>                                                 Masataka Ohta

FQ World Domination pending:
Dave Täht CEO, TekLibre, LLC