<html>


<head>


<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


</head>


<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">


I think the primary issue for front- vs rear-mounted switches is cooling. As long as you use switches that can pull cooling air from either the front or the back, it’s feasible to mount the TOR switches in the back. 


<div class=""><br class="">


</div>


<div class="">For example, I think these are parts I used to order for Cisco Catalyst 3850-48XS switches:</div>


<div class=""><br class="">


</div>


<div class="">FAN-T3-R= Fan module front-to-back airflow for 48XS<br class="">


<br class="">


FAN-T3-F= Fan module back-to-front airflow for 48XS</div>


<div class=""><br class="">


</div>


<div class="">But if the switch is hardwired to pull cooling air from its front and you mount it in the back of the rack, it’s going to be taking in hot-aisle air, not cold air, which could lead to overheating.</div>


<div class=""><br class="">


</div>


<div class="">As far as rail mounting time goes, it’s just not a big enough factor to outweigh more important ones such as switch feature set, management architecture, or performance. Dell is pretty much at the back of the line on all of those.</div>


<div class=""><br class="">


</div>


<div class=""> -mel</div>


<div class=""><br class="">


<div>


<blockquote type="cite" class="">


<div class="">On Sep 27, 2021, at 2:32 PM, Andrey Khomyakov <<a href="mailto:khomyakov.andrey@gmail.com" class="">khomyakov.andrey@gmail.com</a>> wrote:</div>


<br class="Apple-interchange-newline">


<div class="">


<div dir="ltr" class="">


<div class="">Folks,</div>


<div dir="ltr" class=""><br class="">


</div>


<div dir="ltr" class="">I had started to doubt my perception of our failure rates (we don't officially calculate them) until Mel provided this:</div>


<div dir="ltr" class="">"That’s about the right failure rate for a population of 1000 switches. Enterprise switches typically have an MTBF of 700,000 hours or so, and 1000 switches operating 8760 hours (24x7) a year would be 8,760,000 hours. Divided by 12 failures


 (one a month), yields an MTBF of 730,000 hours.<span class="gmail-Apple-converted-space">" </span>At least I'm not crazy and our failure rate is not abnormal.</div>
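<div dir="ltr" class="">The arithmetic in that quote can be checked with a minimal Python sketch. The function names are hypothetical, and the model is a simplifying assumption: a constant failure rate, every switch powered 24x7, and fleet MTBF taken as total operating hours divided by observed failures.</div>

```python
HOURS_PER_YEAR = 8760  # 24 hours x 365 days


def observed_mtbf_hours(switch_count, failures_per_year):
    """Fleet operating hours in one year divided by failures seen in that year."""
    fleet_hours = switch_count * HOURS_PER_YEAR
    return fleet_hours / failures_per_year


def expected_failures_per_year(switch_count, mtbf_hours):
    """Failures per year implied by an MTBF, assuming a constant failure rate."""
    return switch_count * HOURS_PER_YEAR / mtbf_hours


if __name__ == "__main__":
    # Figures from the thread: 1000 switches, about one failure a month.
    print(observed_mtbf_hours(1000, 12))  # 730000.0
    # Typical enterprise MTBF quoted in the thread: 700,000 hours.
    print(round(expected_failures_per_year(1000, 700_000), 1))  # 12.5
```

<div dir="ltr" class="">Under those assumptions, a 1000-switch fleet at the quoted 700,000-hour MTBF would be expected to lose roughly 12 or 13 switches a year, which is in line with about one swap a month.</div>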


<div dir="ltr" class=""><br class="">


</div>


<div dir="ltr" class="">I really don't buy the near-total lack of failures over 15 years of operation, or whatever the period was; that is a crazy long time, longer than the standard depreciation period in an average enterprise. I operated small colo cages with a handful of Cisco Nexus switches, and something would fail at least once a year. I operated small enterprise data centers with 5-10 rows of racks, and something most definitely failed at least once a year. Fun fact: there was a batch of switches with the Intel Atom clocking bug. Remember that one a couple of years ago? The whole industry was swapping out switches like mad over a span of a year or two... While I admit that was an abnormal event, the quick rails definitely made our efforts a lot less painful.</div>


<div dir="ltr" class=""><br class="">


</div>


<div dir="ltr" class="">It's also interesting that several folks dismissed the need for toolless rails because switching to them would not save much time compared to recabling the switch. That completely ignores the fact that recabling has to happen regardless of the kind of rail kit, i.e. it's not a data point in and of itself. And since we brought up the time it takes to recable a switch during a replacement, how is tacking on more time to deal with the rail kit a good thing? You have a switch hard down and you are running around looking for a screwdriver and a bag of screws. Do we truly take that as a satisfactory way to operate? Screws run out, the previous tech misplaced the screwdriver, the screw was too tight and you stripped it while undoing it, etc.</div>


<div dir="ltr" class=""><br class="">


</div>


<div dir="ltr" class="">Finally, another interesting point was brought up about having to rack the switches in the back of the rack vs the front. In an average rack we have about 20-25 servers, each consuming at least 3 ports (two data ports for redundancy and one for idrac/ilo) and sometimes even more than that. Racking the switch with ports facing the cold aisle then seems to mean routing 60 to 75 patch cables from the back of the rack to the front. All of a sudden the cables need to be longer, heavier, and harder to manage. Why would I want to face my switch ports into the cold aisle when all my connections are in the hot aisle? What am I missing?<span class="gmail-Apple-converted-space"> </span></div>


<div dir="ltr" class=""><br class="">


</div>


<div dir="ltr" class="">I went back to a document my DC engineering team produced when we asked them to evaluate Mellanox switches from their point of view. They report that it takes 1 person 1 minute to install a Dell switch, from cutting open the box to applying power. It took 2 people 15 minutes (hence my 30 min statement, i.e. 30 person-minutes) to install a Mellanox switch on traditional rails (it was a full width switch, not the half-RU one). Furthermore, they had to install the rails in reverse and load the switch from the front of the rack, because with 0-U PDUs in place the racking "ears" prevent the switch from going in or out of the rack from the back.</div>


<div dir="ltr" class=""><br class="">


</div>


<div class="">The theme of this whole thread kind of makes me sad, because summarizing it in my head comes off as "yeah the current rail kit sucks, but not enough for us to even ask for improvements in that area." It is really odd to hear that most folks are


 not even asking for improvements to an admittedly crappy solution. I'm not suggesting making the toolless rail kit a hard requirement. I'm asking why we, as an industry, don't even ask for that improvement from our vendors. If we never ask, we'll never get.</div>


<div dir="ltr" class="">


<div class="">


<div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">


<div class=""><br class="">


</div>


--Andrey</div>


</div>


<br class="">


</div>


<br class="">


<div class="gmail_quote">


<div dir="ltr" class="gmail_attr">On Mon, Sep 27, 2021 at 10:57 AM Mel Beckman <<a href="mailto:mel@beckman.org" class="">mel@beckman.org</a>> wrote:<br class="">


</div>


<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">


That’s about the right failure rate for a population of 1000 switches. Enterprise switches typically have an MTBF of 700,000 hours or so, and 1000 switches operating 8760 hours (24x7) a year would be 8,760,000 hours. Divided by 12 failures (one a month), yields


 an MTBF of 730,000 hours. <br class="">


<br class="">


 -mel <br class="">


<br class="">


> On Sep 27, 2021, at 10:32 AM, Doug McIntyre <<a href="mailto:merlyn@geeks.org" target="_blank" class="">merlyn@geeks.org</a>> wrote:<br class="">


> <br class="">


> On Sat, Sep 25, 2021 at 12:48:38PM -0700, Andrey Khomyakov wrote:<br class="">


>> We operate over 1000 switches in our data centers, and hardware failures<br class="">


>> that require a switch swap are common enough where the speed of swap starts<br class="">


>> to matter to some extent. We probably swap a switch or two a month.<br class="">


> ...<br class="">


> <br class="">


> This level of failure surprises me. While I can't say I have 1000<br class="">


> switches, I do have hundreds of switches, and I can think of a failure<br class="">


> of only one or two in at least 15 years of operation. They tend to be<br class="">


> pretty reliable, and have to be swapped out for EOL more than anything.<br class="">


> <br class="">


</blockquote>


</div>


</div>


</div>


</blockquote>


</div>


<br class="">


</div>


</body>


</html>