Latency/Packet Loss on ASR1006

Brian Turnbow b.turnbow at twt.it
Thu Dec 9 09:19:39 UTC 2021



> On 11/26/2021 1:09 PM, Colin Legendre wrote:
> > Hi,
> >
> > We have ...
> >
> > ASR1006  that has following cards...
> > 1 x ESP40
> > 1 x SIP40
> > 4 x SPA-1x10GE-L-V2
> > 1 x 6TGE
> > 1 x RP2
> >
> > We've been having latency and packet loss during peak periods...
> >
> > We notice all is good until we reach 50% utilization on output of...
> >
> > 'show platform hardware qfp active datapath utilization summary'
> >
> > Literally ... 47% good... 48% good... 49% latency to next hop goes
> > from 1ms to 15-20ms... 50% we see 1-2% packet-loss and 30-40ms
> > latency... 53% we see 60-70ms latency and 8-10% packet loss.
> >
> > Is this expected... the ESP40 can only really push 20G and then starts
> > to have performance issues?
> >

We had a similar issue about 4 years ago.
We were seeing packet loss and drops get progressively worse, and the router was falling over at about 70% utilization.
We could see interface reliability go down, along with input errors due to overruns on the interfaces.
Cisco blamed it on microbursts that could not be handled under load.


"We were able to replicate this scenario in our lab as well.
QFP under high load generated input errors and overruns which in turn led to unicast failures/ drops/ latency.
The issue is not consistent with QFP % utilization as sometimes with even 80%+ traffic, we do not see the drops:"

They recommended removing traffic or upgrading the ESP.

One of our guys disabled NBAR on the router and the problem disappeared.
I would suggest taking a look at which features you are using and, if you can, disabling them one at a time to see whether it makes any difference.
We then upgraded the ESPs and all has been fine since.
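
If it helps with the before/after comparison when disabling a feature, here is a rough Python sketch that finds the QFP utilization at which things start to go bad. It is only an illustration: the sample values are the lower bounds of the figures from the original post (loss at 47-49% is assumed to be 0, since the post only says "good" or gives latency there), and the function name and the 5 ms / 0.5% thresholds are my own assumptions, not anything Cisco defines.

```python
from typing import Optional, Sequence, Tuple

# (QFP utilization %, latency to next hop in ms, packet loss %)
Sample = Tuple[float, float, float]

# Lower bounds of the ranges quoted in the original post.
# Loss of 0 at 47-49% is an assumption; the post does not state it.
QUOTED_SAMPLES = [
    (47, 1, 0),
    (48, 1, 0),
    (49, 15, 0),
    (50, 30, 1),
    (53, 60, 8),
]


def find_knee(
    samples: Sequence[Sample],
    latency_ms_limit: float = 5.0,   # hypothetical threshold
    loss_pct_limit: float = 0.5,     # hypothetical threshold
) -> Optional[float]:
    """Return the lowest utilization at which latency or loss exceeds
    its limit, or None if no sample does."""
    for util, latency_ms, loss_pct in sorted(samples):
        if latency_ms > latency_ms_limit or loss_pct > loss_pct_limit:
            return util
    return None


if __name__ == "__main__":
    print(find_knee(QUOTED_SAMPLES))
```

Collect the same kind of samples with a feature enabled and again with it disabled; if the knee moves noticeably higher (or disappears) with the feature off, that feature is a good suspect.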

Brian


