Latency/Packet Loss on ASR1006

Jean St-Laurent jean at ddostest.me
Fri Dec 10 01:12:04 UTC 2021


If you still need netflow to gain some visibility on what’s happening, you could check what percentage of packets is being sampled for netflow export.

 

Usually 1 in 1,000 (0.1%) is good. For you, maybe 1 in 1,000,000 could be good enough too.
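
To make the arithmetic behind those ratios concrete, here is a rough Python sketch. It assumes a plain 1-in-N packet sampler; the function names are made up for illustration and are not part of any Cisco tool or netflow collector.

```python
def sampling_percent(one_in_n: int) -> float:
    """Return the percentage of packets sampled by a 1-in-N sampler."""
    if one_in_n < 1:
        raise ValueError("one_in_n must be >= 1")
    return 100.0 / one_in_n


def estimate_total(sampled_count: int, one_in_n: int) -> int:
    """Scale a sampled packet (or byte) count back up to an estimated total.

    This is only an estimate: the coarser the sampling, the less
    accurate it is for small or short-lived flows.
    """
    if one_in_n < 1:
        raise ValueError("one_in_n must be >= 1")
    return sampled_count * one_in_n


if __name__ == "__main__":
    # 1 in 1 is unsampled (100%); the other two are the ratios mentioned above.
    for n in (1, 1_000, 1_000_000):
        print(f"1 in {n:,}: {sampling_percent(n):.4f}% of packets sampled")

    # Hypothetical numbers: 2,500 sampled packets at 1 in 1,000.
    print("estimated total packets:", estimate_total(2_500, 1_000))
```

The collector has to multiply by the same ratio the router samples at, otherwise the traffic volumes it reports will be off by that factor.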

 

If 100% (unsampled) was used, then there is indeed a real-time performance penalty. Not many people need netflow exports that are 100% accurate. If you do need 100% accuracy, then you need dedicated hardware.

 

0%, or totally disabled, is also often good enough if you don’t need visibility. 😊

 

Netflow is useful in my opinion, but maybe not for every case.

 

Jean

 

From: NANOG <nanog-bounces+jean=ddostest.me at nanog.org> On Behalf Of Colin Legendre
Sent: December 9, 2021 7:18 PM
To: Brian Turnbow <b.turnbow at twt.it>
Cc: nanog <nanog at nanog.org>
Subject: Re: Latency/Packet Loss on ASR1006

 

NBAR was not enabled.. just netflow export.. and that was enough..


 

---
Colin Legendre
President and CTO

Coextro - Unlimited. Fast. Reliable.
w: www.coextro.com <http://www.coextro.com> 
e: clegendre at coextro.com <mailto:clegendre at coextro.com> 

p: 647-693-7686 ext.101
m: 416-560-8502
f: 647-812-4132

 

 

On Thu, Dec 9, 2021 at 7:17 PM Colin Legendre <clegendre at coextro.com <mailto:clegendre at coextro.com> > wrote:

Thanks for this.. turned off netflow export.. and it dropped our qfp load from 44% to 18%.  ugh..


 

---
Colin Legendre

 

 

On Thu, Dec 9, 2021 at 4:22 AM Brian Turnbow via NANOG <nanog at nanog.org <mailto:nanog at nanog.org> > wrote:



> On 11/26/2021 1:09 PM, Colin Legendre wrote:
> > Hi,
> >
> > We have ...
> >
> > ASR1006  that has following cards...
> > 1 x ESP40
> > 1 x SIP40
> > 4 x SPA-1x10GE-L-V2
> > 1 x 6TGE
> > 1 x RP2
> >
> > We've been having latency and packet loss during peak periods...
> >
> > We notice all is good until we reach 50% utilization on output of...
> >
> > 'show platform hardware qfp active datapath utilization summary'
> >
> > Literally ... 47% good... 48% good... 49% latency to next hop goes
> > from 1ms to 15-20ms... 50% we see 1-2% packet-loss and 30-40ms
> > latency... 53% we see 60-70ms latency and 8-10% packet loss.
> >
> > Is this expected... the ESP40 can only really push 20G and then starts
> > to have performance issues?
> >

We had a similar issue about 4 years ago.
We were seeing packet loss and drops get progressively worse, and the router was falling over at about 70% utilization.
We could see interface reliability go down, along with input errors due to overruns on the interfaces.
Cisco blamed it on microbursts that could not be handled under load.


"We were able to replicate this scenario in our lab as well.
QFP under high load generated input errors and overruns which in turn led to unicast failures/ drops/ latency.
The issue is not consistent with QFP % utilization as sometimes with even 80%+ traffic, we  do not see the drops:"

They recommended removing traffic or upgrading the ESP.

One of our guys disabled NBAR on the router and the problem disappeared.
I would suggest taking a look at which features you are using and, if you can, disabling them one at a time to see if it makes any impact.
We then upgraded the ESPs and all has been fine since.

Brian
