9/23 e-mail to G.North re UPS in large Telco Switchroom


This is the first time I have ever posted a question, so be gentle. I am not sure how to do the link; copy/paste did not work. Here is the general URL: http://www.garynorth.com

Today on Gary North's list of new data there is an e-mail from a UPS provider who describes a couple of visits to what sounds like a major hub of a telecommunications carrier or Regional Bell Operating Company. No company name is given, and no URL is provided either. Please look under new information or telecommunications.

While it was highly detailed in terms of what the room looked like, I am going to go out on a limb and say it looks to me like a well-crafted fake. I would like the central office telecommunications types who post or lurk here to look at this and comment.

The reason I say this is that there are a lot of ex-Bell System people populating Regional Bell Operating Companies and carriers who have a ton of knowledge on switchroom design, single points of failure, and the interaction of UPS and battery systems. Issues such as redundancy, HVAC, and MTBF, and no, you do not have a site with a large sunny window: all of these and more are drilled into people from this industry. Telecommunications switchroom design issues are the same as those for computer rooms. I have been involved in both for private customers.

If this post is plausible, it is scary indeed.

The scenario the e-mail painted was right out of a Dilbert cartoon.

-- Nancy (wellsnl@hotmail.com), September 23, 1999

Answers

I worked in similar conditions for about five years. It was not with any Bell system, but with microwave transmission for telecommunications. Our carrier service back in 1988 resembled some of the mistakes mentioned in the posting at North's site. The UPSs made by APC at the time were sounding alarms every 30 minutes. The power in the Pontiac, MI area was not all that stable. We didn't have a great view of the city, only Interstate 75. And, yes, of all things, our I/O room was indeed facing sunny skies with the A/C maxed out. We ran fans at our desks. In the winter, we had floor heaters under our desks for our feet. Finally, we did have a system crash with an AT&T type system. Our processing plants in several states went offline, not for hours, but for days.

The electrical contractors came in and replaced a 440 V system transformer that had overheated.

I can relate to the post on North's site about people who have no idea how to set things up. Our I/O room was just free space in our building that had been converted over. The main engineering floor had the center of the building.

Finally, in or around 1992, the company hired a "real" IT person who was an IBM type. A new room was built, and an IBM AS/400 went in. BUT the communications room to this day is still there in the original place, doing its job in the sun.

-- Joe Martin (nospam@nospam.com), September 23, 1999.


Joe, thanks for the reply. Since the Bell System deregulated in the early '80s, there have been a lot of new entrants who provide basic local and long distance dial tone. Your site sounds like Ameritech, which I guess does cover about a five-state area. Who knows?

My dial tone experience has all been in central office switchrooms and toll offices in California.

If there are a lot of rooms that answer to that description out there, the vulnerability index just crept up a notch or two.

I am always a little sceptical of G North's column when there is no url for attribution. I respect his work effort, but not always his views.

-- Nancy (wellsnl@hotmail.com), September 23, 1999.


I have worked in various computer rooms, and yes, there always seem to be facility, security, and layout problems. The previous room was on the first-floor corner of the building, with at least 4 or 5 crank-out type windows in the room. Both rooms have pull tile floors with 2'x2' tiles that can be easily removed and crawled under to gain access. Junk accumulates in and around the rooms until you can hardly get around. Wires run all over the place, above and below the floor, twisted and tangled. Many little problems all around.

I haven't read the GN piece yet, but I too am wary of any "letters" he receives. I don't think they're all fake, but liberally embellished I'm sure. I remember seeing one about 6-12 months ago that I KNEW was a fake. I just wish I had found this board before then to bring it up......

-- Jim (x@x.x), September 23, 1999.


For informational purposes. Copy/pasted from http://www.garynorth.com/y2k/detail_.cfm/6258

[start copy/paste] Date: 1999-09-23 04:52:45
Subject: Insider Reports on How His Local Phone Company Works. They Are Not Going to Run This Manually!
Comment: I received this e-mail on Sept. 22.

First, I hear a lot of talk about how, if there are failures, "they will fix it." Well, what happens if "they" (the people who know how to fix "it") don't want to assume the risk of collateral damage and simply walk away?

* * * * * * * * * * * * *

Back in the early 1990's, I had a small business doing power systems consulting, primarily in the area of backup/standby power systems. I was asked to come to a site of one of the large telephone service providers whose UPS (Uninterruptible Power Supply) had experienced a fault condition and was no longer able to provide backup power. I got directions to the site and drove there that afternoon.

When I arrived I was taken to the computer room where the UPS was located. I was a bit surprised to see that it was located in a room with an exterior wall, where large windows faced the parking lot and city streets. Normally, computer rooms are located towards the center of a building in rooms with limited exterior access and visibility. This makes it easier to maintain the room at controlled temperature and humidity, and also provides a good measure of physical security. I also noticed that the critical load being protected by the UPS was rack after rack of smaller computers, not the usual (larger) mini or mainframe computer. I didn't think much of it as there is a general trend in the industry toward distributed processing, the move from central computer systems to an array of smaller computers that share the data processing burden. Also being powered were banks of electronic switches and routers.

Well, the UPS had indeed suffered a fault condition that necessitated me taking it off line for repairs. The larger UPS systems have a feature which allows you to do this without interrupting power to the load, but there is always a small possibility that this feature can malfunction, especially when the UPS is already experiencing problems. When this happens, the power to the computers is interrupted long enough to cause them to reboot - terminating any jobs they are doing at the time and often requiring a lot of manual intervention to restart those jobs. I told the manager that I needed to take the UPS off-line to repair it, and informed him of this remote possibility of powering down all those computers and phone switches during the switchover. I asked him what would happen if such an outage occurred. His response was that the phone service for five states would be interrupted until all the computers could be brought back up! When my heart started beating again, I asked how long it would take to restore the service if this happened. He replied that it would take some time, since by now it was after hours and the person needed to bring the system back up would have to be called in from home. At that point I backed away from the situation, told him that he needed to schedule all of this, and asked him to call me when he had all the backup resources in place to deal with any possible outages. To his credit, he did just that and a few days later the system was taken down (without incident), repaired, and brought back on line.

About a year later, I was called back to this site to work on another system problem. This one didn't deal with a UPS per se, but with the transformer that was located between it and the utility power source. They had noticed that this transformer was running very hot and wanted someone to take a look at it. When I got there, I found that they had done just about everything wrong with this installation. The transformer was located in a small utility closet (no ventilation) and was undersized for the load it was carrying. It was indeed running very hot, so I looked to see what fire protection measures were installed in the room in case it gave up the ghost. No Halon system, just a fire sprinkler.... Have you ever seen an electrical fire when it's hit with water? To make matters worse, the transformer was located on the "wrong" side of the backup generator, between it and the UPS. (Normally the generator is connected through a transfer switch directly to the UPS input, supplying it with power when the utility power is lost. Since the generator needs time to start up and stabilize, the UPS carries the load on backup batteries for seconds to minutes while the generator comes up to speed. In this configuration the UPS needs only enough reserve battery capacity to hold the load up until the generator can take over and supply the power needed.) In this case, the generator's output power was routed along with the utility power through this overheating transformer. If the transformer had failed, the circuit would have been broken, the UPS would have run until its batteries were depleted (about 15 minutes) and the backup generator would not be able to connect to the UPS and keep the load powered up.

Needless to say, this was a bad situation. I informed the manager and asked that he immediately call a meeting with everyone needed to rectify this situation. Within the hour we were meeting in a large conference room with big windows that overlooked the main operations area. On the wall of this area was a huge active display of the United States with telephone routing networks superimposed on it. From their terminals, technical specialists were busily routing telephone traffic to ensure smooth operation of this network. As I commented to the site manager about what I saw, he explained that this site controlled the routing for 100 of the largest "800 number" lines in the country. "That must be a lot of commerce," I noted. "Yep. About a million dollars a minute goes over those lines," he said. If this network was the operation being threatened by the failure of a single transformer (and it was), I knew the management would understand the need to address the problem immediately.

I explained the problem and the necessity of properly installing the correct transformer ASAP. It wouldn't increase revenues, or make the system run faster or better, it would simply make sure that the system kept running. I got the expected response - "You're right, it needs to be done, but I don't know how we're going to afford it...." I explained that the amount of money that they needed to spend corresponded to a loss of only a few seconds of that 800 number traffic I had just been told about. They proceeded to discuss back and forth how the needed funds could be appropriated. As they discussed things off to the side, I was again watching the big display board in action. One of the managers came over and said "Hey, check this out!" and pressed a button below one of the windows. Immediately the pane I was looking through became opaque - the window was actually a large LCD-type panel. "We do that when we need to have privacy in this room. Pretty slick, eh? Those windows cost a couple of grand each!" At that point, I knew that it was time to walk away from this whole thing. I sent them a bill for my time and never went back.

A couple of observations about all this in light of the Y2K problem. First, I hear a lot of talk about how, if there are failures, "they will fix it." Well, what happens if "they" (the people who know how to fix "it") don't want to assume the risk of collateral damage and simply walk away? In the first episode above, I backed away from tackling the problem immediately because, had I dropped the phone lines for a period of time, I would probably have been sued for negligence. It simply wasn't worth the risk. The old phrase comes to mind - "Poor planning on your part does not constitute an emergency response on my part." In the second episode, I knew that when the transformer did fail, there was going to be panic until it was fixed, and a lot of finger-pointing from some high profile folks afterwards. I wasn't even going to be in the same county when all that went down.

Second, with all the concern about deliberate mischief occurring during the rollover, what if some miscreant had wanted to inflict physical damage on a facility such as this one? The equipment was rather exposed to this sort of threat (which is why the location of this facility has been kept deliberately vague). Getting all that equipment replaced and brought back on line again would certainly take some time. How much revenue would be lost if that were to occur? If there are multiple, systemic failures, would replacement equipment even be available?

Third, there is often an incredible arrogance on the part of managers who control budgets for things like Y2K and facilities repairs. Cory Hamasaki was right, the "horn-hairs" are the rule, not the exception. Money goes for the fancy stuff (like LCD window panes); Y2K repairs and transformers just aren't cutting edge items. The essential, mundane stuff loses out to superficial glitz and glamour, even when the business case is obvious. They don't make your systems any prettier, faster or cutting-edge, they simply ensure that the systems will continue to function post-2000. This makes it hard to get money for Y2K repairs even when the problem is staring you in the face.

Fourth, the phone switches I saw were entirely electronic and controlled by powerful computers. As far as I'm aware, you cannot run those switches manually.
[end copy/paste]
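
For anyone trying to picture the transformer problem described in the e-mail above, here is a minimal sketch in Python that treats each power layout as a small directed graph and asks whether the generator can still reach the UPS input when one component fails. The node names, the edge lists, and the presence of a transfer switch in the "as described" layout are my own assumptions for illustration; the e-mail does not give a wiring diagram, and this is not a model of any real site.

```python
from collections import deque

# Hypothetical layouts (assumed from the prose, not from a real diagram).
# Each pair (a, b) means power can flow from a to b.
AS_DESCRIBED = [
    ("utility", "transfer_switch"),
    ("generator", "transfer_switch"),
    ("transfer_switch", "transformer"),  # both sources pass through the transformer
    ("transformer", "ups_input"),
]
CONVENTIONAL = [
    ("utility", "transformer"),
    ("transformer", "transfer_switch"),
    ("generator", "transfer_switch"),    # generator bypasses the transformer
    ("transfer_switch", "ups_input"),
]

def reachable(edges, sources, target, failed=frozenset()):
    """Breadth-first search: can any source feed the target, avoiding failed nodes?"""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
    queue = deque(s for s in sources if s not in failed)
    seen = set(queue)
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in graph.get(node, []):
            if nxt not in failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def single_points_of_failure(edges, sources, target):
    """List intermediate components whose loss alone cuts every source off from the target."""
    nodes = {n for edge in edges for n in edge}
    candidates = nodes - set(sources) - {target}
    return sorted(
        n for n in candidates
        if not reachable(edges, sources, target, failed={n})
    )

if __name__ == "__main__":
    for name, layout in (("as described", AS_DESCRIBED), ("conventional", CONVENTIONAL)):
        gen_ok = reachable(layout, ["generator"], "ups_input", failed={"transformer"})
        spofs = single_points_of_failure(layout, ["utility", "generator"], "ups_input")
        print(name, "| generator survives transformer loss:", gen_ok, "| SPOFs:", spofs)
```

In this toy model the transformer is a single point of failure only in the "as described" layout, which matches the e-mail's point that the UPS would run on batteries (about 15 minutes, per the e-mail) with no way for the generator to take over. The transfer switch shows up in both layouts simply because the sketch has only one of them.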



-- Chris (#$%^&@pond.com), September 23, 1999.


Nancy, Dilbert cartoons don't leave me feeling sick to my stomach.

I wish I had a 70 IQ at such times so I could live in bliss. The world is run by idiots and I can't do anything about it.

-- Chris (#$%^&@pond.com), September 23, 1999.



Believe me, Chris, I understand. I took early retirement in April from a high-tech telecom company in Silicon Valley. To be charitable, my vice president acted a lot like the pointy-haired boss in Dilbert. Much of Dilbert is hilarious until it hits close to home.

-- Nancy (wellsnl@hotmail.com), September 23, 1999.
