400G forwarding - how does it work?
Masataka Ohta
mohta at necom830.hpcl.titech.ac.jp
Sun Aug 7 11:16:07 UTC 2022
Saku Ytti wrote:
>> I'm afraid you imply too much buffer bloat only to cause
>> unnecessary and unpleasant delay.
>>
>> With 99% load M/M/1, 500 packets (750kB for 1500B MTU) of
>> buffer is enough to make packet drop probability less than
>> 1%. With 98% load, the probability is 0.0041%.
> I feel like I'll live to regret asking. Which congestion control
> algorithm are you thinking of?
I'm not assuming a LAN environment, for which paced TCP may
be desirable (if the bandwidth requirement is tight, which is
unlikely in a LAN).
> But Cubic and Reno will burst tcp window growth at sender rate, which
> may be much more than receiver rate, someone has to store that growth
> and pace it out at receiver rate, otherwise window won't grow, and
> receiver rate won't be achieved.
When many TCP flows are running, the bursts are averaged out and
the aggregate traffic is Poisson.
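As a rough check of the M/M/1 figures quoted at the top, here is a minimal sketch. It assumes that "drop probability" means the chance that an infinite-buffer M/M/1 queue holds at least K packets, which is load**K. The function name is made up for illustration, and a finite-buffer M/M/1/K model would give a somewhat different number.

```python
def mm1_tail_probability(load: float, buffer_packets: int) -> float:
    """P(queue length >= buffer_packets) for an infinite-buffer M/M/1 queue.

    Assumption: this tail probability stands in for the packet drop
    probability of a buffer holding `buffer_packets` packets.
    """
    if not 0.0 <= load < 1.0:
        raise ValueError("load must be in [0, 1) for a stable queue")
    return load ** buffer_packets


BUFFER_PACKETS = 500
MTU_BYTES = 1500

# 500 packets of 1500 B each is the 750 kB mentioned in the quoted text.
print(f"buffer size: {BUFFER_PACKETS * MTU_BYTES / 1000:.0f} kB")

for load in (0.99, 0.98):
    p = mm1_tail_probability(load, BUFFER_PACKETS)
    print(f"load {load:.0%}: overflow probability {p:.4%}")
```

Under this assumption the 99% case comes out below 1% and the 98% case at about 0.0041%, in line with the numbers quoted.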
> So in an ideal scenario, no we don't need a lot of buffer, in
> practical situations today, yes we need quite a bit of buffer.
That is an old theory known to be invalid in practice (Ethernet
switches with small buffers are enough for IXes) and refuted
theoretically by:
Sizing router buffers
https://dl.acm.org/doi/10.1145/1030194.1015499
after which paced TCP was developed for the unimportant,
exceptional case of LANs.
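For reference, the main result of that paper is that a link carrying N long-lived, desynchronized TCP flows needs a buffer of about RTT * C / sqrt(N), instead of the older rule of thumb RTT * C. A minimal sketch of the comparison follows. The RTT, link rate and flow count are illustrative assumptions, not measurements, and the function names are made up for this example.

```python
import math


def rule_of_thumb_bits(rtt_s: float, capacity_bps: float) -> float:
    """Classic bandwidth-delay-product buffer: RTT * C, in bits."""
    return rtt_s * capacity_bps


def small_buffer_bits(rtt_s: float, capacity_bps: float, n_flows: int) -> float:
    """Buffer from "Sizing router buffers": RTT * C / sqrt(N), in bits."""
    if n_flows < 1:
        raise ValueError("n_flows must be at least 1")
    return rtt_s * capacity_bps / math.sqrt(n_flows)


# Illustrative assumptions only: 100 ms RTT, one 400G port, 100,000 flows.
rtt_s = 0.1
capacity_bps = 400e9
n_flows = 100_000

classic = rule_of_thumb_bits(rtt_s, capacity_bps)
small = small_buffer_bits(rtt_s, capacity_bps, n_flows)

# Convert bits to megabytes for printing.
print(f"RTT * C           : {classic / 8 / 1e6:.0f} MB")
print(f"RTT * C / sqrt(N) : {small / 8 / 1e6:.1f} MB")
```

The more flows share the link, the smaller the required buffer becomes relative to the bandwidth-delay product.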
> Now add to this multiple logical interfaces, each having 4-8 queues,
> it adds up.
Having so many queues requires sorting the queues to prioritize
them properly, which costs a lot of computation (and performance)
for no benefit and is a bad idea.
> Also the shallow ingress buffers discussed in the thread are not delay
> buffers and the problem is complex because no device is marketable
> that can accept wire rate of minimum packet size, so what trade-offs
> do we carry, when we get bad traffic at wire rate at small packet
> size? We can't empty the ingress buffers fast enough, do we have
> physical memory for each port, do we share, how do we share?
People who use irrationally small packets will suffer, which is
not a problem for the rest of us.
Masataka Ohta