Why the US Government has so many data centers
george.herbert at gmail.com
Sat Mar 19 05:21:01 UTC 2016
Before I go on: I have not been in Todd's shoes, either serving in or directly supporting an org like that.
However, I have indirectly supported orgs like that and consulted at or supported literally hundreds of commercial and a few educational and nonprofit orgs over the last 30 years.
There are corner cases where distributed resilience is paramount, including a lot of field operations (of all sorts), on ships (and aircraft and spacecraft), or in places where the net really is unstable. Any generalization that sweeps in those legitimate exceptions overreaches its valid descriptive range.
That said, in the vast bulk of normal-world environments, individuals make justifications like Todd's and argue for distributed services, private servers, etc., and then do not run them reliably, with patches, backups, central security management, asset tracking, redundancy, DR plans, etc.
And then they break, and in some cases are and will forever be lost. In other cases they will "merely" take 2, 5, 10, or in one case more than 100 times longer to repair, and cost more money to recover, than they should have.
Statistically, these are very poor operational practice, not so much because of location (though that matters some) but because of the lack of care and quality management once they get distributed and drop out of IT's view.
Statistically, after several hundred clients and a hundred or so organizational assessments: if I find servers that matter under desks, there is about a 2% chance that your IT org can handle supporting and managing them appropriately.
If you think it is acceptable for 98% of servers in a particular category to be at high risk of unrecoverable failure or very difficult recovery when problems crop up, your successor may be hiring me, or someone else who consults a lot, for a very bad day's cleanup.
I have literally been at a billion-dollar IT disaster, and at tens of smaller multimillion-dollar ones, trying to clean them up. This is a very sad type of work.
I am not nearly as cheap for recoveries as for preventive management and proactive fixes.
George William Herbert
Sent from my iPhone
> On Mar 18, 2016, at 9:28 PM, Todd Crane <todd.crane at n5tech.com> wrote:
> I was trying to resist the urge to chime in on this one, but this discussion has continued for much longer than I had anticipated... So here it goes
> I spent 5 years in the Marines (out now) in which one of my MANY duties was to manage these "data centers" (a part of me just died as I used that word to describe these server rooms). I can't get into what exactly I did or with what systems on such a public forum, but I'm pretty sure that most of the servers I managed would be exempted from this paper/policy.
> Anyways, I came across a lot of servers in my time, but I never came across one that I felt should've been located elsewhere. People have brought up the case of the personal share drive, but what about the combat camera (think public relations) that has to store large quantities (100s of 1000s) of high-resolution photos and retain them for years? Should I remove that COTS (commercial off-the-shelf) NAS underneath the boss's desk and put it in a data center 4 miles down the road, and force all that traffic down a network that was designed for light to moderate web browsing and email traffic, just so I can check a box for some politician's reelection campaign ads on how they made the government "more efficient"?
> Better yet, what about the backhoe operator who didn't call before he dug, and cut my line to the datacenter? Now we cannot respond effectively to a natural disaster in the Asian Pacific, or a bombing in the Middle East, or a platoon that has come under fire and will die if they can't get air support, all because my watch officer can't even log in to his machine since I can no longer have a backup domain controller on-site.
> These seem very far fetched to most civilian network operators, but to anybody who has maintained military systems, this is a very real scenario. As mentioned, I'm pretty sure my systems would be exempted, but most would not. When these systems are vital to national security and life & death situations, it can become a very real problem. I realize that this policy was intended for more run of the mill scenarios, but the military is almost always grouped in with everyone else anyways.
> Furthermore, I don't think most people realize the scale of these networks. NMCI, the network that the Navy and Marine Corps used (when I was in), had over 500,000 active users in the AD forest. When you have a network that size, you have to be intentional about every decision, and you should not leave it up to a political appointee who has trouble even checking their email.
> When you read about how much money the US military hemorrhages, just remember....
> - The multi million dollar storage array combined with a complete network overhaul, and multiple redundant 100G+ DWDM links was "more efficient" than a couple of NAS that we picked up off of Amazon for maybe $300 sitting under a desk connected to the local switch.
> - Using an old machine that would otherwise be collecting dust to ensure that users can log in to their computers despite conditions outside of our control is apparently akin to treason and should be dealt with accordingly.
> Sent from my iPad
>>> On Mar 14, 2016, at 11:01 AM, George Metz <george.metz at gmail.com> wrote:
>>> On Mon, Mar 14, 2016 at 12:44 PM, Lee <ler762 at gmail.com> wrote:
>>> Yes, *sigh*, another what kind of people _do_ we have running the govt
>>> story. Altho, looking on the bright side, it could have been much
>>> worse than a final summing up of "With the current closing having been
>>> reported to have saved over $2.5 billion it is clear that inroads are
>>> being made, but ... one has to wonder exactly how effective the
>>> initiative will be at achieving a more effective and efficient use of
>>> government monies in providing technology services."
>>> Best Regards,
>> That's an inaccurate cost savings though most likely; it probably doesn't
>> take into account the impacts of the consolidation on other items. As a
>> personal example, we're in the middle of upgrading my site from an OC-3 to
>> an OC-12, because we're running routinely at 95+% utilization on the OC-3
>> with 4,000+ seats at the site. The reason we're running that high is
>> because several years ago, they "consolidated" our file storage, so instead
>> of file storage (and, actually, dot1x authentication though that's
>> relatively minor) being local, everyone has to hit a datacenter some 500+
>> miles away over that OC-3 every time they have to access a file share. And
>> since they're supposed to save everything to their personal share drive
>> instead of the actual machine they're sitting at, the results are
>> So how much is it going to cost for the OC-12 over the OC-3 annually? Is
>> that difference higher or lower than the cost to run a couple of storage
>> servers on-site? I don't know the math personally, but I do know that if we
>> had storage (and RADIUS auth and hell, even a shell server) on site, we
>> wouldn't be needing to upgrade to an OC-12.
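
For a rough sense of scale on the OC-3 figures quoted above, here is a back-of-the-envelope sketch. It assumes the nominal SONET line rates (155.52 Mbit/s for OC-3, 622.08 Mbit/s for OC-12) and ignores framing overhead. The 4,000 seats and 95% utilization come from the quoted message. The function name and the use of 4,000 as a floor for "4,000+" are illustrative assumptions only.

```python
# Nominal SONET line rates in Mbit/s; usable payload is somewhat lower
# once framing overhead is subtracted (ignored in this rough estimate).
OC3_MBPS = 155.52
OC12_MBPS = 622.08


def per_seat_kbps(link_mbps: float, seats: int, utilization: float = 1.0) -> float:
    """Average share of the link per seat, in kbit/s, at the given utilization."""
    return link_mbps * utilization * 1000 / seats


if __name__ == "__main__":
    seats = 4000  # "4,000+ seats" in the quoted message; treated as a floor
    oc3 = per_seat_kbps(OC3_MBPS, seats, 0.95)
    oc12 = per_seat_kbps(OC12_MBPS, seats)
    print(f"OC-3 at 95% utilization: {oc3:.1f} kbit/s per seat")
    print(f"OC-12 at full line rate: {oc12:.1f} kbit/s per seat")
```

This says nothing about circuit cost, which is the question the quoted message actually asks. It only shows how thin the average per-seat share is when every file-share access crosses the WAN link.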