RS/960 Upgrade for T3 Backbone - 5th Week of 5

mak mak
Thu May 21 20:26:38 UTC 1992


	Phase-III T3 Network Deployment - Step 4 Status Report
	======================================================
	Jordan Becker, ANS		Mark Knopper, Merit

	Step 4 of the phase-III network deployment was successfully completed
last Saturday 5/16.  All T3 nodes were back on-line within the scheduled
maintenance window with the exception of the Houston CNSS67 T1 concentrator,
due to a software configuration problem, and ENSS129 (Champaign), due to a T3
circuit problem.  The upgrade of all other nodes was completed by 10:00 EST
on 5/16.  The following T3 backbone nodes are currently running with new T3
RS960 hardware and software in a stable configuration:

Seattle POP:		CNSS88, CNSS89, CNSS91
Denver POP:		CNSS96, CNSS97, CNSS99
San Fran. POP:		CNSS8,  CNSS9,  CNSS11
L.A. POP:		CNSS16, CNSS17, CNSS19
Chicago POP:		CNSS24, CNSS25, CNSS27
Cleveland POP:		CNSS40, CNSS41, CNSS43
New York City POP:	CNSS32, CNSS33, CNSS35
Hartford POP:		CNSS48, CNSS49, CNSS51
St. Louis POP:		CNSS80, CNSS81, CNSS83
Houston POP:		CNSS64, CNSS65, CNSS67

Regionals:	ENSS141 (Boulder), ENSS142 (Salt Lake), ENSS143 (U. Washington)
		ENSS128 (Palo Alto), ENSS144 (FIX-W), ENSS135 (San Diego)
		ENSS130 (Argonne), ENSS131 (Ann Arbor),
		ENSS132 (Pittsburgh), ENSS133 (Ithaca), ENSS134 (Boston)
		ENSS137 (Princeton), ENSS129 (Champaign), ENSS140 (Lincoln)
		ENSS139 (Rice)

The CNSS32, CNSS48, CNSS64 nodes are now running with mixed technology (e.g.
3xRS960 T3 interfaces, 1xHawthorne T3 interface).


Step 4 Deployment Difficulties
==============================
	The RS960 week 4 deployment was successful with two exceptions.  The
first problem involved the link between ENSS129 (Champaign) and CNSS81 (St.
Louis POP).  The second problem was the T1 concentrator in Houston (C67).
This affected ENSS174 (IBM Austin) and ENSS173 (ITESM).

	On ENSS129, a jumper was initially found to be missing from the HSSI
board and was added.  ENSS129 was then successfully brought online at
6:20 EST.  Later in the morning, the ENSS129<->CNSS81 link began to
experience packet loss.  The DSUs were reporting coding violations and bipolar
violations on the E129<->C81 link.  The problem turned out to be a bad DS3
radio circuit between ENSS129 and CNSS81, and MCI swapped the radio channel to
clear the problem.

	At the Houston POP, CNSS64 and CNSS65 were upgraded and came up
without problems.  ENSS139 (Rice U.) also came up without any problems.
However, CNSS67 (the T1 concentrator) had complex problems.  CNSS67 was taken
down around 00:00; the mechanical modifications were completed and bootup
began at 2:15 EST.  The machine rebooted and we started troubleshooting.  All
RS960 cards and the planar board were replaced in various combinations.  After
considerable analysis of the ODM configuration database, we suspected some ODM
problems and we chose to re-install the system software from a tape built from
another machine.  The machine came right up on the first try at 14:00 EST with
the same original hardware installed (including the original new RS960 card).
This was clearly a system software corruption problem and there are no
suspected hardware failures resulting from this.

	At the St. Louis POP, CNSS80, CNSS81, and CNSS83 were all upgraded and
came up without any problems, 3 hours ahead of schedule at 4:45 EST.

	At the Denver POP, a loose serial port connector delayed the upgrade
start time by a few minutes.  We could not access the DSU through the
out-of-band modem.  An ODM adapter configuration problem, in which cards were
coming up in the wrong slots, was also fixed.  However, these problems were
solved and the Denver maintenance was completed well within the scheduled
window.

	Routing was enabled and traffic started flowing through the southern
route (through the Houston POP) by 9:35 EST, even though the Houston T1
concentrator was still down.  This allowed the RS960 hybrid link upgrade at
CNSS24 in the Chicago POP and the scheduled CNSS25 RS960 card replacement to
begin on schedule.  When CNSS24 would not come up smoothly with the new RS960
card, the card was replaced without attempting to troubleshoot the system or
the card.  The replaced RS960 card will be returned to IBM for failure
analysis.  The RS960 card in CNSS25 was replaced as scheduled due to DMA
under-runs which we observed last week.  This installation went as planned and
the machine came up at 10:50 EST.

	New T3 internal link metrics have been installed to support load
balancing of traffic across the 3 different hybrid technology links that now
exist (e.g.  CNSS64<->CNSS72, CNSS48<->CNSS72, CNSS32<->CNSS56).
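
	As a rough illustration of how per-link metrics steer traffic, the
sketch below runs a plain shortest-path (Dijkstra) computation over a toy
graph.  It is only a sketch under stated assumptions: the "SRC" node, its
adjacencies, and all metric values are hypothetical, and nothing here
describes the actual backbone routing software, topology, or installed
metrics.  Only the two link names CNSS64<->CNSS72 and CNSS48<->CNSS72 are
taken from the report.

```python
import heapq


def make_graph(links):
    """Build an undirected graph {node: {neighbor: metric}} from (a, b, metric) tuples."""
    graph = {}
    for a, b, metric in links:
        graph.setdefault(a, {})[b] = metric
        graph.setdefault(b, {})[a] = metric
    return graph


def shortest_path(graph, src, dst):
    """Dijkstra's algorithm; returns (total_metric, path) or None if unreachable."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, metric in graph.get(node, {}).items():
            if nbr not in seen:
                heapq.heappush(queue, (cost + metric, nbr, path + [nbr]))
    return None


def demo_links(metric_64_72, metric_48_72):
    """Toy topology: hypothetical SRC node with two ways to reach CNSS72."""
    return [
        ("SRC", "CNSS64", 1),               # hypothetical adjacency and metric
        ("SRC", "CNSS48", 1),               # hypothetical adjacency and metric
        ("CNSS64", "CNSS72", metric_64_72),  # link named in the report, metric assumed
        ("CNSS48", "CNSS72", metric_48_72),  # link named in the report, metric assumed
    ]


# Lower metric on CNSS64<->CNSS72: the path through CNSS64 is preferred.
print(shortest_path(make_graph(demo_links(1, 5)), "SRC", "CNSS72"))
# Lower metric on CNSS48<->CNSS72: the path through CNSS48 is preferred.
print(shortest_path(make_graph(demo_links(5, 1)), "SRC", "CNSS72"))
```

	Swapping the two assumed metrics moves the preferred path from one
link to the other, which is the basic lever that adjusting link metrics
provides.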


Step 5 Deployment Scheduled for 5/22
====================================
	Based upon the successful completion of step 4 of the deployment, step
5 is currently scheduled to commence at 23:00 local time on 5/22.  Step 5 is
the final phase-III upgrade step and will complete the deployment.  This will
involve the following nodes/locations:

Greensboro POP:		CNSS72, CNSS73, CNSS75
Washington D.C. POP:	CNSS56, CNSS57, CNSS58, CNSS59
2nd Site Visit:		CNSS32 (New York POP), CNSS64 (Houston POP)
			CNSS48 (Hartford POP)

Regionals:		ENSS138 (Georgia Tech.), ENSS136 (College Park)
			ENSS145 (FIX-E)

Other Nodes Affected:	ENSS150 (Concert), ENSS151, ENSS153, ENSS166

	Following the step 5 deployment, all T3 internal link metrics will be
re-adjusted to their normal values, since no hybrid technology links will
remain.




