Can Y2K Problems be Fixed in 15-30 days?

greenspun.com : LUSENET : Electric Utilities and Y2K : One Thread

Rick in your interview on the CBN site you state at the conclusion:

"As time progresses, and those faults begin building up in the system, and capacity issues start to come into play, what I'm concerned about is seeing the system absolutely stressed to a breaking point. In my worst-case scenario, if regional transmission facilities hit a critical mass of fault propagation, you're going to start seeing some real regional issues. I don't expect to see that kind of thing really play out in the first day or two after 1-1-2000. If you asked me to try to nail down a timeline, strictly off the top of my head, I'd say two weeks after 1-1-2000, and then you'll see a slow recovery for 15 or 30 days to some kind of equilibrium where you've at least got some degree of reliability just about everywhere."

In the interview you indicated that Y2K remediation will likely be substantially more difficult after 2000 than before because of bottlenecks. I don't think you are suggesting that Y2K problems that have taken several years to find, test and fix, and that are still less than 50% completed at this point, will somehow quickly be remedied within 15-30 days after 2000. If that's all it took to restore reliability to the system, then why can't the industry just do that now, in the next 15-30 days, and be done with it?

I saw another commentator on CBN (can't remember his name) say, after stating how grave the Y2K problem was, that the systems would be restored to normal after 72 hours of general disruption. How can that be? If the problems can be fixed 72 hours after 2000, why not 72 hours before?

What gives here?

-- Anonymous, February 06, 1999

Answers

Just to clarify: the other person you're referring to was probably Dick Mills, who believes a 72-hour blackout in early 2000 is a realistic possibility. I don't think that's the same as getting everything fixed in 72 hours, though; but I can't answer for Mr Mills. I suppose he means enough will have been done by that time to fix whatever problems remain. Just a guess.

-- Anonymous, February 06, 1999

Joseph,

What you read was a "scope of problem" definition. As stated in the interview, I do not expect this to be a big bang on 01/01/2000. There is a certain amount of fault tolerance built into every complex system, and certainly, the electric system in the U.S. meets the "complex system" definition. I expect the number of problems to reach an apex sometime in January, and then decrease much as a fire would burn itself out. After all, there *is* a finite number of things that can fail, regardless of how large a number do fail. That doesn't mean that the problems to the end user (you!) go away as the number of failures begins to decrease, or that total normalcy to the end user returns as the number of failures tapers off. If a bunch of things fail at one time, all of the technicians in the world can't fix that bunch of failures all at the same time. It takes time.

What this means is: I expect to see *some* problems on 1/1/2000; even some that may be of a magnitude to impact system reliability in *some* isolated areas. These will primarily come from problem areas that no one anticipated in advance. Over the next 15 days, I'm postulating that the number of Y2k problems will increase, at first linearly, and toward the middle of January, exponentially, because of a combination of equipment failures and bad data propagating through systems (large, complex, embedded and otherwise). Failures of complex systems are never linear in nature.
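To make the "it takes time" point concrete, here is a minimal sketch of a repair backlog. The daily failure counts and the repair capacity are invented purely for illustration; nothing in this thread supplies real numbers. It only shows the mechanism: when new failures outrun the crews for a while, the pile of open problems keeps growing after new failures have already started to taper.

```python
def simulate_backlog(new_failures_per_day, repairs_per_day):
    """Track unresolved failures when crews can fix only so many per day."""
    backlog = 0
    history = []
    for arrivals in new_failures_per_day:
        backlog += arrivals                      # today's new failures join the queue
        backlog -= min(backlog, repairs_per_day) # crews clear what they can
        history.append(backlog)
    return history


# Hypothetical daily counts: slow start, sharp rise to a peak, then a taper.
arrivals = [2, 3, 4, 5, 6, 8, 12, 18, 27, 40, 27, 18, 12, 8, 5, 3, 2, 1, 0, 0]
history = simulate_backlog(arrivals, repairs_per_day=10)  # assumed capacity

peak_arrival_day = arrivals.index(max(arrivals)) + 1
peak_backlog_day = history.index(max(history)) + 1

print(f"new failures peak on day {peak_arrival_day}")
print(f"open failures peak on day {peak_backlog_day}")
print(f"still open after day {len(arrivals)}: {history[-1]}")
```

With these made-up inputs the open-failure count peaks a few days after the new-failure count does, and work is still outstanding when new failures have stopped. Different assumptions would move the dates, not the shape.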

My scenario does *not* assume that the problems get fixed and everything is hunky dory by the end of January. What my scenario suggests is that the full magnitude of the causal factors won't be known until sometime well after 1/1/2000. In a situation like this, you can't fix the failures until you know *what* the failures (and failure mechanisms) are. That translates into service and reliability problems on the other end. I believe that some level of sporadic service disruptions will continue through the Year 2000 because of the Y2k issue.

And of course, my scenario is just that - an "educated" scenario. Don't like mine? Feel free to develop one of your own. ;-) But I hope this helps clarify the statements in my interview a bit.

-- Anonymous, February 06, 1999


Joseph, assuming a regional or grid-wide outage, there are some reasons which would support a gradual stabilization of the grid in the mentioned timeframes. The premise of a Y2K grid outage generally comes about not because it's thought that the power generating facilities in the grid will *all* be individually inoperable, but because the instability caused by those plants or transmission facilities which *do* end up having critical Y2K problems will bring the others down with them. (I suppose some people might think all the utilities will have critical problems, but I think it's safe to say that most acknowledge some utilities are ahead of others and will likely be Y2K good-to-go.)

It's this premise of some fine, some not, which allows for a shorter stabilization time. After the initial disruptions, those utilities which can't operate will remain off-line until repairs are made, however long those take. Those utilities which can operate will re-start and re-connect, as they are able, with other operable utilities. The "grid" would be less effective, but some power would flow. As Mr. Mills readily acknowledges in his articles, his definition of adequate power is not the same which the average citizen thinks of. He considers having some power flowing, even if it's only a few hours/day in a rolling ration situation, to be adequate in an emergency. Large manufacturing companies on "interruptible power contracts" would not get power in this case, so that available generation could go to homes and hospitals, etc.

So I would say that the prognostications of a stabilized system within a few days or a month are based on two premises: some available generation, and minimum requirements rather than what we now think of as "normal".

-- Anonymous, February 07, 1999


Joseph,

From another discussion, here is an excerpt that may be pertinent:

Many, many devices contain embedded systems. So many, in fact, that there is not enough time to test them all. Such devices are essential components of numerous kinds of automated processes, as well as safety monitoring processes, so that not testing them has serious risks. On the other hand, most embedded systems have absolutely no date dependency and therefore will have no internal Y2K problem. And some that will fail, will fail only once, at midnight of the century change. Once restarted after that time, they will work fine. So, some organizations have concluded that since they cannot all be tested in the time available, and since most of them will have little or no Y2K problems, wait and see (or is it: hide and watch?). While the number of devices containing embedded systems is huge (estimated to be several billions, that's right, several billions), the percentage of them that may have Y2K problems is estimated to be small. The catch 22 is that a small percentage of several billions is several millions! In one sense, considering the huge number of devices in question, fix on failure may be regarded as a practical, albeit risky, approach under the circumstances. However, it very clearly has the drawback that when a plan includes this approach, then even if "things are going according to plan", the plan includes a whopper of a gamble. The gamble is complicated by the fact that a sudden large demand for specific devices would be likely to exceed the supply. Furthermore, for large numbers of older designs of embedded systems, the supply is zero! Simple replacement will not be an option. Alternate designs will be required, and will take more time to implement.
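The "small percentage of several billions" arithmetic can be checked in a few lines. The excerpt gives no exact figures, so both the device count and the failure fraction below are assumed values chosen only to show the scale; substitute your own estimates.

```python
def expected_failures(devices: int, fraction: float) -> int:
    """Return the expected number of failing devices for a given failure fraction."""
    return round(devices * fraction)


# Hypothetical inputs: the excerpt says only "several billions" and "small".
embedded_devices = 5_000_000_000   # assumed device count
failure_fraction = 0.001           # assumed: 0.1% have a Y2K fault

failing = expected_failures(embedded_devices, failure_fraction)
print(f"{failing:,} devices expected to fail")
```

Even at a tenth of one percent, the assumed population yields failures in the millions, which is the catch 22 described above.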

Jerry

-- Anonymous, February 07, 1999


The previous answers were all well stated. Thank you. Can I summarize in saying that for a significant period of time after 2000 the electric systems nationwide will not be running "normally" and normalcy will not return until the Y2K issues are ultimately corrected, however long that takes?

-- Anonymous, February 07, 1999


Joseph,

Yes and no, sort of. :-)

You can summarize it as you wish. However, I think that almost any summaries in between "no problems" and "total blackout" are at risk of being oversimplified.

"Can I summarize in saying that for a significant period of time after 2000 the electric systems nationwide will not be running "normally" and normalcy will not return until the Y2K issues are ultimately corrected, however long that takes?"

The phrase "electric systems nationwide" may likely be interpreted to suggest that what follows is intended to apply to every generation plant, transmission system, and distribution network in the country. I do not know if you have that interpretation in mind; it is one that I could not support.

The part about normalcy returning may be interpreted as a gradual, continuous, improvement of conditions until normalcy is achieved. I, for one, regard fluctuations in the amounts and quality of electric power in various parts of the country, and for varying periods in different areas to be among the more plausible of the possibilities.

Jerry

-- Anonymous, February 07, 1999


Joseph, I think you made a fine summary if we take "until Y2K issues are ultimately corrected" to mean all Y2K failures with global implications, not just in the U.S. electric industry. Oil production and imports, shipping, economic disruptions, political instabilities, and many other possible disruptions could all affect, in either large or small ways, not only electric power production in this country, but many other infrastructure aspects as well.

There's always the definition of "normalcy" to deal with, too. Ten years ago the use of cell phones was not considered normal for the average American citizen, but now it is. What's normal changes as society changes and time moves on. That's what I find interesting about the TEOTWAWKI acronym being used nowadays. The operative part of that for me is "As We Know It". The only prognostication I personally am *positive* of is that the Year 2000 date problem will create changes in various places around the world. Whether those changes will be small or great, widespread or localized, may be open to debate. In my own mind, however, it's a done deal that what the average person might consider normal right now will change after 2000 - whether that change only involves being more aware of how technology impacts our lives, or something much greater.

It's often touted that the only sure things in life are death and taxes. I add Change to that category. The world as we know it changes constantly. When Pearl Harbor was bombed, it was the end of the world as people were used to then; when President Kennedy was assassinated, it was the end of the world as Americans had known it; when Jonas Salk developed a vaccine for polio, it was the end of the world as we had known it; when a man loses his job it's the end of his world as he had known it; when a couple has their first baby it's the end of their world as they have known it. Some changes can be prepared for and some can't. We're all just going to have to learn to value our innate flexibility, prepare as we can, and understand that all life is "Change".

-- Anonymous, February 08, 1999


No matter how you slice it, the timeline simply doesn't compute. Are you people telling me that if the electrical utility industry would have needed another year to generate reliable power on 12/31/99, that somehow they will be able to proceed faster AFTER the new millennium begins?

With ALL the simultaneous disruptions that are likely to occur AFTER the new millennium begins, it defies logic how anyone could reach such a conclusion. Am I missing something here? Please explain just how the remediation, testing as well as data exchange validation and non-compliant embedded chip location/replacement will all be done QUICKER in the first year of the new millennium.

-- Anonymous, February 08, 1999


Roger,

There are at least two parts of the Y2K problem that will be resolved much more quickly after 1/1/00 than before: finding which embedded systems will have a problem on 1/1/00, and fixing those that have a one time, roll over problem, on 1/1/00, but which work upon restarting.

This is not to say that they will be fixed in a predictable amount of time; it simply acknowledges that, once they have failed, you know which ones need to be fixed, and do not need to spend time searching for them among all the billions that will not fail. I am not talking code remediation here, but target recognition.

One of the aspects of the embedded system problem that has been stressed up one side and down the other, is that there are immensely too many embedded systems even to locate, before 1/1/00, which ones may have Y2K problems. A related aspect that has not been quite so widely stressed, is that the effort to test them would itself risk many disruptions, even if we had the time, and the numbers of knowledgeable technicians, needed to do so.

Perceiving that waiting until they fail saves the time wasted on testing those that will not fail, may not be particularly comforting, since we do not know how many of which kinds of failures will occur; it is merely recognizing, perhaps with some irony, that it is easier to spot a failure after it happens, than before.

And, of course, there remains the problem of getting the replacements for those that need to be replaced, and getting them installed and up and running while all whatever is breaking loose. Depending on how many of which kinds of failures occur, a big unknown at this time, the fix on failure approach may, after the fact, appear to have been a good idea, or a really, really, bad idea, or perhaps, something in between.

Stay tuned.

Jerry

-- Anonymous, February 09, 1999


Jerry:

I am not a computer nor an embedded chip expert, but I do have an engineering background. Frankly, it is difficult for me to imagine using "fix-on-failure" as a practical way to restart a plant that has been disrupted by non-compliant chips for at least these reasons (and anyone who wants to add to this list is certainly welcome to do so):

1) A "fail safe" shut down is definitely NOT certain. Consequently, there could be damage, and loss of life. Do you think the plant manager will take the responsibility, or, more likely, will even be authorized to "flip the switch" after the FIRST malfunctioning chip has been found and replaced?

2) One non-compliant chip can cause a cascading number of "errors". Finding the original culprit is definitely NOT a slam dunk, to say nothing of clean-up, having adequately trained personnel, and replacement units on hand to trouble shoot and repair the faulty unit(s).

3) It is quite possible that successive operating failures can damage normal equipment, further complicating repairs and trouble shooting.

4) Quite often an out of date, non-compliant controller cannot be reprogrammed, or off-the-shelf units do not exist, creating extensive downtime which could have a huge domino impact on vertical, industrial operations leading to bankruptcy in short order.

-- Anonymous, February 09, 1999



Roger,

Those are certainly among the risks of a fix on failure approach in certain kinds of facilities, and such risks will, I suspect, induce the managements of some facilities to shut them down before 1/1/00 (in spite of the costs of doing so), hoping to restart them later. Some reports indicate that Rohm and Haas is already considering shutting plants down before the big day.

However, there will be some number of people who will adopt one form or another of a fix on failure approach and hope that in their situation, it will work out well. Probably, in some cases it will, and in some it will not.

I am not advocating it, and I am happy that I am not in a position of someone who has more embedded systems than they have time to test and repair. But it will be the policy in many situations, either by choice or by default. And it has its advocates, some of whom use the rationale that I mentioned in my previous post.

Jerry

-- Anonymous, February 09, 1999


This has been an excellent dialog. But I have a few higher-level, different concerns that Bonnie touched on earlier.

Even if you can get the power plants' internal systems operational in a month or so, how do you get by the problem that the plant relies on outside entities for fuel and communications (data)?

Florida, where I am from, gets most of its power from petroleum. While not all of the oil comes from the Middle East, where they are not fully Y2k aware, it generally comes by boat. Given that it has been released that many ships have compliance problems and that the Middle Eastern countries will have compliance problems, can we assume that after the 30 days we will have an adequate supply of oil to fuel the plants?

Florida also gets a significant amount of its power from coal. I could be wrong, but I don't see coal plants as a predominant feature of Florida's geography. I believe coal comes from the north via trains. Trains rely on phone communications, not telegraph as they used to, to schedule and operate the trains. If the phones aren't working, the trains aren't going to be running at full steam. Oh, and don't the phones use electricity?

I understand, having worked on two Y2k problems, that it can be very difficult, if not virtually impossible, to find and repair ALL embedded systems. But if you can get a good number of them fixed ahead of time, the find and fix theory might not be too bad. Much time is spent in the inventory and assessment phases of the project. Fix on fail will eliminate those phases because it will identify which systems are not compliant right off. Remediation can be difficult only if the system broke on rollover. Other than that, as pointed out above, it is just a restart. Finally, I don't believe future date testing is applicable. The timeframe to fix embedded chips is much easier.

One other variable comes into play. If there is social unrest because of a panic, will the employees of the power plants leave their families to go to work and turn the systems on?

Back to the 30 days. If the power plants were an island and were self supporting, I would believe that. However, given that they are reliant on many many other entities, that oddly rely on them in turn, it is hard to believe that they will be back up in such a short period of time. Comments????

-- Anonymous, February 10, 1999


Matthew, you've raised good points about the interconnection problems, which are why any Year 2000 "guesses" are just that -- guesses. In my above post, I was giving possible reasons to support a return to power in the event of an outage, based on certain premises. Perhaps I should have made it clearer that this was not a prediction that things will happen that way.

In October of last year, there was a thread on this Forum titled, "Poll: Will the grid work?" Several people gave their best "guesses". The guess I made did take into account supply problems, as did some of the others. I'll re-post my guess here, and you can do a search for the thread to read the other predictions:

My personal opinion, derived as a composite from many sources, is that the grid will initially go down, whether from failures of some of the generating facilities, or because some of those facilities will deliberately be off line in order to deal with potential problems in a more controlled start up after the 2000 rollover. I think it's statistically very likely that some areas will experience prolonged outages, but for the majority, however, I believe power will be restored to some extent within a few hours to a couple of days.

This is not to say that restoration will be a case of "the problems are over, it's back to normal", however. I believe the control and monitoring by computers, which allows for far quicker and more accurate adjustments than can be accomplished by humans, will be degraded. I expect power, yes, but I also expect brownouts, possible rationing (a few hours/day), and in general a much lowered efficiency for many weeks. I also expect some fuel transportation difficulties which could cause problems in generation quite awhile after Jan.1. I wouldn't be surprised to see generation in an up/down flux. In other words, a roller coaster effect. Better than a long term outage, but much less than the optimum we are now used to.

I absolutely think it's worth preparing to be able to heat at least one room of your home without electricity if you live in a cold northern area.

-- Bonnie Camp (bonniec@mail.odyssey.net), October 30, 1998.

My personal opinion, derived as a composite from many sources, is that the grid will initially go down, whether from failures of some of the generating facilities, or because some of those facilities will deliberately be off line in order to deal with potential problems in a more controlled start up after the 2000 rollover. I think it's statistically very likely that some areas will experience prolonged outages, but for the majority, however, I believe power will be restored to some extent within a few hours to a couple of days.

This is not to say that restoration will be a case of "the problems are over, it's back to normal", however. I believe the control and monitoring by computers, which allows for far quicker and more accurate adjustments than can be accomplished by humans, will be degraded. I expect power, yes, but I also expect brownouts, possible rationing (a few hours/day), and in general a much lowered efficiency for many weeks. I also expect some fuel transportation difficulties which could cause problems in generation quite awhile after Jan.1. I wouldn't be surprised to see generation in an up/down flux. In other words, a roller coaster effect. Better than a long term outage, but much less than the optimum we are now used to.

I absolutely think it's worth preparing to be able to heat at least one room of your home without electricity if you live in a cold northern area.

-- Bonnie Camp (bonniec@mail.odyssey.net), October 30, 1998.

-- Anonymous, February 10, 1999


Sorry for the dupe, folks. I think I need a vacation!

-- Anonymous, February 10, 1999

Bonnie,

It was just as good the second time as the first! Kidding aside, I would say that your very concise summary includes key attributes of what seems to me to be likely.

Matthew,

The trains that carry the coal have the communications exposure that you mentioned, and also the exposure of potential problems in the programming of the centralized computers that schedule the switches and schedule which cars join which trains going where.

As for oil, there may be problems in the highly automated refineries, and when the refined products leave the refineries, much of it is transported by very long, highly automated pipelines which frequently use electric motors at their many pumping stations.

For example, at: http://www.enerlink.com/success/page_colonial.html you will find a description of a pair of pipelines 2600 miles long each, with 73 main pumping stations using electricity from 26 different utilities!

I mentioned earlier in this thread some of the rationale underlying some people's hopes for quick fixes (which are separate from some people's hopes for very few problems), but I would not count on either quick or few.

Jerry

-- Anonymous, February 10, 1999



Bonnie Camp wrote: "I think it's statistically very likely that some areas will experience prolonged outages, but for the majority, however, I believe power will be restored to some extent within a few hours to a couple of days." Bonnie, could you provide us with those statistics that you have? What kind of statistical projection method are you using to make that forecast? Also, what is the evidence that those outages will last a few hours or a couple of days? Have you studied utilities case by case to make those forecasts? Where are those areas that you are mentioning? Are they in Alaska or in sunny California? I might cancel my trip to Alaska, or take my own generator on the plane together with my fishing utensils in that case :)

Anyway.

Thanks

Carlos Fernandez

-- Anonymous, February 11, 1999


Carlos,

www.euy2k.com/guest2.htm www.euy2k.com/guest4.htm

Bonnie went through every single electric company's 10-Q from 3rd quarter 1998. Quite frankly, I challenge anyone to show me someone who has done more research. I don't envy her but I truly thank her. Don't be too quick to dismiss good reports and analysis just because they don't predict the world ending. She has done much more research than I have. I predict a much gloomier scenario and I give fewer references, yet you didn't question me. My references are there, I just don't keep a good record of the websites I visit so I am not a good source. Do your own research and don't be too quick to dismiss good news.

Bonnie,

I think that analysis is an excellent one. But the bottom line is no one really will know for certain until it happens. All the scenarios are too complex. Way too many things can happen and their effects cannot always be predicted. Things can go wrong, but the effects can be minimal, or things can go wrong and start a cascade of "other" problems. We won't know. We can only speculate.

Matt

-- Anonymous, February 11, 1999


Carlos, I base the statistical probability of outages in some areas on the bell curve phenomenon. The concept that in any group dynamic there will always be a few leading, the majority grouped in a central mass, and a few behind is one that's been observable in human nature for ages, but only in the last decades has it been validated by studies. (A case of scientists concurring with what any average person could pretty much observe about life for themselves.) My premise is that there will always be those at the head of the class, the majority in the average ranges, and those who squeak by or fail to graduate altogether.

Not surprisingly, the data summaries I did of the estimated completion percentages according to the Jan. NERC report fell into a bell curve shape. Even though I didn't bring that point out in the commentary I wrote, (which I probably should have, but it was long enough as it was) it was observed by others, including Gary North who used the title, "NERC: The Bell-Shaped Curve of Completion" when he posted the link to my assessment.

Despite my dealing with various data compiling, I always try to take a broad view based on what I consider common sense observations. Putting aside possible disruptions from outside sources, that's why I believe there will be a few utilities out ahead in their Y2K projects and which will be the best-of-the-best, experiencing no Year 2000 problems (or completely negligible ones). The majority fall into the middle range for project completion and therefore will probably experience a middle range of problems, from very slight ones to more severe. Then there will be a few who will be behind all the rest, either in time frame or quality of work (or both) and their areas are where the outages will occur.
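For anyone who wants to see what the bell curve premise implies in rough numbers, here is a small sketch. Every input is an assumption made up for illustration (the number of utilities, the mean and spread of completion percentages, and the cutoffs for "leading" and "lagging"); none of them come from the NERC report or from my summaries. It simply splits a normal distribution into a lagging tail, a middle mass, and a leading tail.

```python
from statistics import NormalDist


def tail_counts(n_utilities, mean_pct, stdev_pct, lag_cutoff_pct, lead_cutoff_pct):
    """Split a bell-shaped population into lagging, middle, and leading groups."""
    dist = NormalDist(mu=mean_pct, sigma=stdev_pct)
    lagging_share = dist.cdf(lag_cutoff_pct)        # below the lagging cutoff
    leading_share = 1 - dist.cdf(lead_cutoff_pct)   # above the leading cutoff
    middle_share = 1 - lagging_share - leading_share
    return {
        "lagging": round(n_utilities * lagging_share),
        "middle": round(n_utilities * middle_share),
        "leading": round(n_utilities * leading_share),
    }


# All values below are hypothetical, chosen only to show the shape.
counts = tail_counts(
    n_utilities=3000,     # assumed number of utilities
    mean_pct=50,          # assumed average completion percentage
    stdev_pct=15,         # assumed spread
    lag_cutoff_pct=20,    # assumed threshold for "far behind"
    lead_cutoff_pct=80,   # assumed threshold for "far ahead"
)
print(counts)
```

The point is not the particular counts but that even a thin tail of a large population is a nonzero number of utilities, which is why I expect outages somewhere rather than nowhere or everywhere.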

Even NERC has stated concerns about the status of the smaller Independent Power Producers. There have been discussions on other threads in this forum about smaller facilities not having the financial resources to hire outside specialist help. Nor do they have the resources to pay a wage scale which would attract or keep the most experienced in-house technicians, upon which any remediation project would then fall.

The bell curve phenomenon applies in many areas, also. Not all engineers are created equal. Not all doctors are created equal. Neither are all ditch-diggers. For every super-programmer, who is dedicated, professional, up to date on all the newest technology advances, and with a work ethic which doesn't quit, there is also the programmer who does only what he has to, hasn't learned anything new in a couple of years or more, and whose work philosophy is, "don't sweat the small stuff, put in your 8 hours and get outta here."

There are many permutations and complications to this, which is why Matthew is right that we can only speculate and make guesses. For instance, what happens when you get opposing ends of the bell curve in the same company? A real life example: my husband is a systems consultant, and is aware of some large manufacturing companies where different project sections are being handled in very different ways. There is a legacy mainframe system where the programmers are working their butts off, appear to be doing everything right, and are completely dedicated to finding and fixing all Y2K problems before 2000, no matter how much time they have to put in. In the same business, the network manager, who has charge of remediation in that area, has checked (not fixed yet) some of the users' PCs. That's it. The servers, routers, software, etc. fall into the category of "they don't pay me enough to do all this stuff plus my regular job, and I don't understand what the big deal is anyway."

Even though the variables are so great, I still believe that somewhere, someplace, there is a utility (or several) which is last in the remediation race, and there will be a failure of sufficient magnitude to cause a prolonged outage in that area. I've already addressed, in the Feb. 7 post above, why I think it's possible some grid power in many areas will be restored in a relatively short time frame. This also deals with the bell curve premise that the most prepared utilities will have a quick recovery from any outside disruptions, because these utilities will be completely operable individually.

As to what areas are the most at risk, I have my own hunches, but as Rick Cowles has posted previously, each interested person should gather as much info about their local situation as they can, and make their own judgements. Nobody has a crystal ball where absolutes exist.

Finally, for those dedicated people who are working so hard to remediate whatever area their job entails, my sincere thanks to all. Each bit of progress before time runs out raises the upper end of that bell curve. Your efforts *will* matter when taken in the context of the whole. At least that's my opinion. We all form our own ideas based on our individual life experiences and knowledge. I think it's likely that there will be many opinions offered about what will happen in 2000, which will each be proved correct _somewhere_ in the world. *smile* As I see it, it's going to be a mixed bag of successes and failures, like everything else in life.

-- Anonymous, February 12, 1999

