100G, input errors and/or transceiver issues
Stonebraker, Jack J
jjs at ots.utsystem.edu
Mon Jul 19 17:40:55 UTC 2021
We have a moderately dense deployment of 100-Gig LR4 (both DWDM lambdas and Juniper MX) around our WAN, and we don't clock any background input errors on our interfaces unless there is an ongoing problem. That said, we have experienced sub-millisecond link state changes between two endpoints that are physically cross-connected to one another with no intermediary Layer 1 (DWDM, etc.). There doesn't seem to be any rhyme or reason to this; we've looked at each lane extensively, and so far everything has been inconclusive. We also experienced some code issues on Juniper MPC3D-NGs running 100-Gig and on our DWDM client ports, where timing would start to slip and eventually cause the link to fail. Both Juniper and the DWDM vendor found code variances, which they patched. We haven't had any such issues on Juniper MPC5s, 7s, or the 10003 line cards.
TL;DR: In my experience, 100-Gig might require some more TLC than 10-Gig to run clean and is more sensitive to variations in transport. Others' mileage may vary.
JJ Stonebraker | Associate Director
The University of Texas System | Office of Telecommunication Services
(512) 232-0888 | jjs at ots.utsystem.edu
From: NANOG <nanog-bounces+jjs=ots.utsystem.edu at nanog.org> on behalf of Graham Johnston <johnston.grahamj at gmail.com>
Sent: Monday, July 19, 2021 12:19 PM
To: Saku Ytti <saku at ytti.fi>
Cc: nanog list <nanog at nanog.org>
Subject: Re: 100G, input errors and/or transceiver issues
I don't at this point have long-term data collection compiled for the issues that we've faced. That said, we have two 100G transport links that have a regular background level of input errors: one link hovers between 0.00055 and 0.00383 PPS, and the other ranges from none to 0.00135 PPS (that one jumped to 0.03943 PPS over the weekend). The range is often associated with direction, rather than reflecting variable behavior in a single direction. The data comes from the last 24 hours, and the two referenced links are operated by different providers on very different paths (opposite directions). Over shorter distances, we've definitely seen input errors that have affected PNI connections within a datacenter as well. In the case of the last PNI issue, the other party swapped their transceiver and we didn't even physically touch our side; I note this only to express that I don't think this is just a case of the transceivers that we are sourcing.
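To put the PPS figures above in more familiar terms, here is a minimal sketch in Python of how such a rate can be derived from two readings of an interface input-error counter, and what the quoted rates amount to per day. The function names, the 64-bit counter width, and the sample readings are assumptions made for illustration; they are not taken from any particular platform or from the poster's tooling.

```python
SECONDS_PER_DAY = 86_400


def error_rate_pps(count_start, count_end, interval_seconds, counter_bits=64):
    """Average input errors per second between two counter readings.

    counter_bits is an assumption about the counter width; the modular
    subtraction tolerates a single counter wrap between the two samples.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = (count_end - count_start) % (1 << counter_bits)
    return delta / interval_seconds


def errors_per_day(rate_pps):
    """Convert an errors-per-second rate into errors per 24 hours."""
    return rate_pps * SECONDS_PER_DAY


if __name__ == "__main__":
    # Rates quoted in the message above, expressed as errors per day.
    for rate in (0.00055, 0.00383, 0.00135, 0.03943):
        print(f"{rate:.5f} PPS is about {errors_per_day(rate):.0f} errors/day")

    # Hypothetical counter readings taken 24 hours apart.
    print(f"{error_rate_pps(1_000, 1_331, SECONDS_PER_DAY):.5f} PPS")
```

Read this way, the background levels described work out to somewhere between a few dozen and a few hundred errored packets per day, with the weekend jump on the order of a few thousand per day.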
Comparatively, other than clear transport system issues, I don't recall this sort of thing at all with 10G "wavelength" transport that we had purchased for years prior. I put wavelengths in quotes there knowing that it may have been a while since our transport was a literal wavelength as compared to being muxed into a 100G+ wavelength.
On Mon, 19 Jul 2021 at 12:01, Saku Ytti <saku at ytti.fi> wrote:
On Mon, 19 Jul 2021 at 19:47, Graham Johnston
<johnston.grahamj at gmail.com> wrote:
> How commonly do other operators experience input errors with 100G interfaces?
> How often do you find that you have to change a transceiver out? Either for errors or another reason.
> Do we collectively expect this to improve as 100G becomes more common and production volumes increase in the future?
New rule. Share your own data before asking others to share theirs.
In the DC and SP markets, 100GE has dominated for several years
now, so 'more common' rings odd to many. 112G SERDES is shipping
on the electrical side, and from a 100GE POV there is nowhere more
mature to go. The optical side, QSFP112, is really the only thing
left to cost optimise 100GE.
We've had our share of MSA ambiguity issues with 100GE, but today
100GE looks mature to our eyes in failure rates and compatibility. 1GE
is really hard to support and 10GE is becoming problematic, in terms
of hardware procurement.