Dick Mills' Y2K Power Prognosis

By Dick Mills
June 1999 Prognosis
This prognosis was written looking back as I concluded the column in June of 1999.
My outlook on the Y2K problem in American utilities is not based on estimates of percent Y2K readiness. If it were, I'd have to update it every day as new news appeared. Rather, I base my predictions on the historical record.
The most significant historical record is that of the software industry as explained by Capers Jones in his book, The Year 2000 Problem. Mr. Jones says that completely finished, debugged and tested applications have on the average, 15% of the bugs outstanding when they go into service. There is no good reason why utilities or embedded chips should have an average much different than this.
Therefore, the best we can hope for is that the overwhelming majority (85%) of the bugs get fixed. What does that mean in terms of keeping the lights on? I have two specific predictions that have remained unchanged since June of 1998.
1. It is prudent to expect a blackout with the extent and duration of a typical once-in-ten-year event to happen in January 2000. Include that contingency in your Y2K preparations. If nothing happens, you won't be embarrassed, because that degree of preparation is always prudent.
2. Generating margins are tight in the summer in many parts of the USA. Because of direct Y2K problems, indirect Y2K supply chain problems such as fuel, and maintenance and financial problems, I expect the margins to sink even lower in 2000. This can result in power shortages, curtailments and rationing. The utility industry has not acknowledged this problem, nor, to my knowledge, have they analyzed it.
For utilities, my primary recommendation at this date is to focus on operational contingency planning, training and drills. It is not at all necessary that residual Y2K bugs should lead to more than brief power outages.
June 1998 Prognosis
This prognosis was written looking forward as I commenced the column in June of 1998.
I've been getting some really interesting mail since starting this column. Some people like what I've been saying. Some don't. There have been lots of good questions that show the readers are really thinking.
Many readers ask for me to clearly state my prognosis, what I really expect to happen. Others have inferred my opinion even though I haven't stated it yet.
If you continue to read my column, you'll see that I sound optimistic sometimes and pessimistic other times. I'm going to propose that some really radical actions be taken before I'm done. I'm not trying to send mixed messages. I'm not trying to send any message at all. I don't have an agenda. I am trying to explain in layman's terms what I know about power systems and Y2K. This reality is neither black nor white; it's mottled.
Indeed, I'm going to do my best to avoid saying what I think, unless I can simultaneously present the rationale for why I think that way. That means we can only cover one small piece of the puzzle each week. It will take several dozen columns to hit all the important points.
Notwithstanding the above, I'll reluctantly bow to pressure and give an advance peek as to what I feel and to where all this logic is leading. I have separated it from the main thread of my columns to preserve their objectivity.
I believe that the electric utility can fulfill its primary mission in 2000. I define the mission as, "Provide power to most of the people most of the time in 2000 so as to avoid disastrous injury, suffering, business failures or unemployment due to lack of electricity."
In other words, we can avoid causing the end of civilization by blackouts, but the year 2000 will hardly be business as usual.
Is that optimistic or pessimistic? You choose.
You may be surprised to hear that Rick Cowles and I agree on almost all the factual risks and probable outcomes. However, where Rick prefers to call the glass half empty, I prefer to call it half full. Rick also comes from a utility background while I don't. My definition of the primary mission is considerably less than the standard of service that the electric utility culture strives to deliver, or that regulators require. Some things that I consider acceptable, others may consider unacceptable.
One thing I have very little patience with is the persistent demand from many quarters that we experts should reduce our conclusion to a sound bite. "So is it a big deal or not?" "Don't you expect significant problems?" First, the reality of the problem is too complicated to simplify that much. Second, I can't guess what other people mean by phrases like "big deal" or "significant" or "acceptable." I'm an engineer and I need to be able to put engineering units beside my predictions.
Another thing I believe is that it is far too late to promote awareness among utilities by writing letters or asking tough questions. They are aware. The Y2K remediation wheels are in motion. Any wheels not in motion in June 1998 are too late to start.
It's also too early to give in to despair. There are still substantial things we can do in addition to fixing Y2K bugs to mitigate the effects. That will remain true up to the fall of 1999. Here are a couple:
1. Electricity customers are advised to make contingency plans for lack of power. Both short-term blackout and longer-term power-shortage scenarios should be considered.
2. Electricity producers are advised to make contingency plans for avoiding power outages despite challenges caused by Y2K failures in utility-owned equipment or disruptions in the supply chain.
The cost of contingency measures is relatively small. Therefore, these contingency measures are reasonable and prudent, with or without the immediate threat of Y2K, and regardless of what other Y2K remediations may be in progress.
======================================== End
Dick Mills has been upbeat about Y2K and power, but I thought he displayed more concern in this article.
Ray

-- Ray (ray@totacc.com), June 17, 1999

Answers

Even a casual comparison of Mills' articles from a year ago with the ones that he writes now supports your opinion, Ray. Mills always seems to call 'em as he sees 'em, and his last few articles reflect a genuine concern, especially since his optimism of last year was based on the utilities devoting a lot of time and money towards manual operation contingency plans. So far, if this has been done to any significant degree, neither Mills, Cowles, nor anyone else has commented on it.

-- King of Spain (madrid@aol.com), June 17, 1999.

Point #1.
"Mr. Jones says that completely finished, debugged and tested applications have on the average, 15% of the bugs outstanding when they go into service. There is no good reason why utilities or embedded chips should have an average much different than this.
Therefore, the best we can hope for is that the overwhelming majority (85%) of the bugs get fixed."
I might like to point out that what Mr. Jones talks about, and what Mr. Mills quotes, is software-completion-bugs-included rates up until and including now. All software, on average, across the board, is only 85% bug-free, in 1970, 1980, 1990, and now. Yet the world has managed to survive. 85% is the NORM.
If the Y2k thing ends up being only 85% fixed come Jan. 1, why should things all of a sudden blow up then? Is there some sort of pretzel logic, that Einstein couldn't even understand, going on here? If so, please do tell.
Point #2.
Dick Mills has always said that any power disruptions shouldn't last any more than 2 or 3 days. And that power plants can be "black-started" without too much trouble. There is NOTHING in his latest prognostications that contradicts those ideas.

-- Chicken Little (panic@forthebirds.net), June 17, 1999.

Chicken Big commented:
"Dick Mills has always said that any power disruptions shouldn't last any more than 2 or 3 days. And that power plants can be "black-started" without too much trouble. There is NOTHING in his latest prognostications that contradicts those ideas. "
Conversely, there is NOTHING in his latest prognostications that CONFIRMS he still thinks this way .... Doo Doo Bird !!
Ray

-- Ray (ray@totacc.com), June 17, 1999.

Whoa, Chicken Little. You're getting your facts mixed up.
We (meaning, businesses and government) are decidedly NOT running with software that is 85% reliable. It would never be used.
What Capers Jones is saying with his 85% number is that when applications are first put into service, 15% of them fail AND THE APPLICATION IS TAKEN OUT OF SERVICE -- until the failure is isolated and fixed (the system is 'rolled back' to the previous version of the application).
Almost all production software is better than 99.99% bug free right now.
But this won't be the case after 1/1/00 (or, for look-ahead scheduling and ordering software, some time in July, October, etc.). After that (those) dates, the application will no longer be able to be rolled back to a previous version, because the previous version is known to not work.
That's where the 15% bug rate will kick us in the pants.

-- Dean -- from (almost) Duh Moines (dtmiller@nevia.net), June 17, 1999.
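
Dean's rollback argument can be restated as a small decision function. This is only a sketch of the reasoning in his post, not a description of how any utility or vendor actually deploys software; the function name `recover` and its two boolean inputs are made up for illustration.

```python
def recover(new_version_works: bool, previous_version_works: bool) -> str:
    """Return the action available when a new release goes into service.

    Hypothetical helper that mirrors the argument in the post:
    a failed release is normally survivable because the previous
    version is still there to fall back on.
    """
    if new_version_works:
        return "run new version"
    if previous_version_works:
        return "roll back to previous version"
    return "out of service until fixed"


# Before the date rollover: a failed release can be rolled back.
print(recover(new_version_works=False, previous_version_works=True))

# After the rollover: the previous, unremediated version is assumed
# to be known not to work, so there is nothing to roll back to.
print(recover(new_version_works=False, previous_version_works=False))
```

The second call is the case Dean is worried about: the same failure that used to mean a rollback now means an outage until the fix is found.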

Ah SH*T,
Anyone got a spare shovel? I need to bury my head for a while if it's ok. Power has always been my major concern and the more I read the more I am convinced it will not be there. To say power might be out for 2-3 days is nuts. If it goes down there is a reason. Pollys like to make the analogy of things like a downed cable in a storm...complete BS...the grid and all the "chips" that control it are not a "downed cable". Forget the shovel..too many heads stuck in the sand already.
FUWEY!

-- Mike (midwestmike_@hotmail.com), June 17, 1999.

I don't think the embedded chips fit into the 15% failure rate. Aren't we estimating 1.5 to 3% at this late date? Sen. Bennett blurted out something like that a month ago or so. (In a WRP maybe?) So if they replace the....let's say 4% of bad chips then we should see a 15% failure rate on those 4%, which is....a very small number (.006?). Of course, embedded chips that keep track of the date probably do so for maintenance intervals, and if they built that into the chip then it's probably an important one. Now we would get the full 15% of software failures but, unlike the chips, most of the software wouldn't be critical. (I hope).
In any case, it wouldn't hurt to be prepared to provide heat, water, food and light for a few weeks without power now would it? (How about 12 weeks, just in case)
If the lights go out, there's no sense keeping your...

-- eyes_open (best@wishes.net), June 17, 1999.
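
The ".006?" guess in the post above can be checked directly. The sketch below simply multiplies the two figures the poster gives (4% of chips replaced, 15% of fixes still faulty); treating them as independent fractions is an assumption, and the function name is made up for illustration.

```python
# Figures taken from the post; neither is a measured value.
REPLACED_FRACTION = 0.04   # share of chips assumed bad and replaced
RESIDUAL_BUG_RATE = 0.15   # residual-defect figure Mills attributes to Capers Jones


def residual_failure_fraction(replaced: float, residual_rate: float) -> float:
    """Fraction of ALL chips that were replaced and are still faulty."""
    return replaced * residual_rate


fraction = residual_failure_fraction(REPLACED_FRACTION, RESIDUAL_BUG_RATE)
print(f"{fraction:.3f} of all chips, i.e. {fraction * 100:.1f}%")
```

So the guess holds: 15% of 4% is 0.006, or 0.6% of all chips.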

-- (@ .), June 17, 1999.

Can we please turn the italics off?

-- Joe Six-Pack (Average@Joe.Blow), June 17, 1999.

Is this just as annoying?

-- Joe Six-Pack (Average@Joe.Blow), June 17, 1999.

Unlike Ray I won't bore you with my minimal html skills

-- Joe Six-Pack (Average@Joe.Blow), June 17, 1999.

-- Joe Six-Pack -- wrote: "Unlike Ray I won't bore you with my minimal html skills"
Thank you...Your undignified "AIR" of superiority does a remarkable job on its own.

-- (cujo@baddog.com), June 17, 1999.

When we talk about a 15% failure rate, we are talking about 15% of a defined amount of software. It is very rare to find an organization that has a 100% complete inventory of their software to start with. And, most organizations are only fixing their mission critical systems.
What about the software that is deemed to be non-mission critical? You can't apply the 85% success ratio to software that is not being fixed.
The situation is probably just as bad, if not worse, with the embedded systems problem. Many embedded systems are installed once and then forgotten about. I don't see how everyone is going to have a complete inventory to start from.
B.K.

-- B. K. Myers (B.K.Myers@cwix.com), June 18, 1999.

eyes open stuttered:
I don't think the embedded chips fit into the 15% failure rate. Aren't we estimating 1.5 to 3% at this late date? Sen. Bennett blurted out something like that a month ago or so. (In a WRP maybe?) So if they replace the....let's say 4% of bad chips then we should see a 15% failure rate on those 4%, which is....a very small number (.006?).
Just some idle speculation, so please don't take it as gospel.
Wasn't there a figure of around 50 billion embedded microchips world wide? If we just worked on 20 billion then if 5 billion people tested and repaired (if needed) embedded microchips then it would take one day (that is not including having to replace any other components).
If there were 5 billion embedded microchips left to test and 1 million people working on it and they did 4 a day like mentioned above. Then you only need 1250 more days to check every single one.
But in 200 days they would get 800 million tested (and remediated if needed), so that would take you up to the year 2000. That would leave you with 4.2 billion embedded microchips.
So if there was a 3 percent failure/minor problem/major problem etc rate then out of that remaining 4.2 billion you would only have 126 million embedded microchips experience some sort of difficulty come the Year 2000.
So the question is: Which ones? :-)
Remember just idle speculation ...
Regards, Simon

-- Simon Richards (simon@wair.com.au), June 18, 1999.

If there are undiscovered Y2K sensitivities in the chips used in the electrical grid systems that can not be overcome by manual methods it could take 6 + months to replace them and restore power.
My microprocessor engineer friends that work for chip companies told me that it takes, on average, 6 months to create a custom chip and deliver it to the customer in significant quantities. The necessary steps include:
* design the needed micro-code
* pre-production fabrication testing
* reworking
* testing
* developing the new QA procedures
* pre-production documentation for manufacturing and testing
* final debugging
* release to manufacturing
* programming the test equipment
* first production run and testing
* make modifications as necessary
* deliver units to customer
The 6 month estimate assumed that all was normal at the engineering & manufacturing facility: electrical power, communications, all employees on site, all vendors providing necessary support and transport.
But if there is no power or communications at the engineering facility - then what happens to that normal 6 month schedule? It is the 17th century again baby!
Recall that last December there was a 10+ hour power outage for the entire San Francisco Bay area. It was NOT Y2K related. It was a simple human error. Somebody forgot to remove some shorting bars when power was reapplied after some scheduled changes. The shorting bars caused significant damage to 6 huge transformers which had to be replaced. Fortunately emergency replacements were located and power was restored after 10 hours.
Manual override procedures were useless. The Bay area was power dead.
Here is the point : It took that power company 10+ hours to restore the power when it was caused by 1 original problem. They knew what the problem was and where it was. But it still took 10 + hours to fix !
Suppose they have 1000+ near-simultaneous and avalanching problems at unknown locations. And the replacement parts are a MINIMUM of 6 months away. Then what?

-- Ron Sander (judy_sander@hotmail.com), June 18, 1999.
