NRC Y2K Information Notice 99-12; Some specifics on the difference between critical safety systems and mission critical non-safety systems.

greenspun.com : LUSENET : Electric Utilities and Y2K : One Thread

Wow, I had just finished reading this Information Notice and was about to post this when I discovered that Rick had just posted mention of this I.N. So we're going to have back-to-back complementary info here.

The purpose of this I.N. 99-12 is stated as: "The U.S. Nuclear Regulatory Commission (NRC) is issuing this information notice (IN) to inform addressees of observations made by NRC staff during audits conducted on the Year 2000 (Y2K) readiness programs of twelve plants. It is expected that recipients will review the information for applicability to their facilities and consider actions, as appropriate. However, suggestions contained in this information notice are not NRC requirements: therefore, no specific action or written response is required."

The most interesting part of this Information Notice for me was the specifics given about the differences in mission critical safety systems versus mission critical non-safety systems. The entire I.N. is worth reading but here are the two paragraphs which specifically caught my attention:

"Most commercial nuclear power plants have protection systems based on analog technology rather than digital technology. Since Y2K concerns are associated with digital systems, analog reactor protection system functions are not affected directly by the Y2K problem. Although there is limited use of computer systems in nuclear power plant mission critical and safety-related functions, licensee Y2K programs have identified some software and digital devices that affect a small number of safety functions. None of these safety functions are actuation based."

"Digital systems and components requiring remediation of Y2K-related problems perform functions such as post-accident sampling, fuel handling, core power distribution monitoring, and reactor vessel level measurement. Licensees have identified incorrect dates in safety-related printouts, logs, and displays in systems such as radiation monitoring. These errors, however, have not affected the functions performed by the devices or systems. There are mission critical non-safety-related functions such as digital feedwater controls, moisture separator reheater controls, reactor recirculating coolant controls, and motor generator set controls that are affected by the Y2K concern and have required remediation. These balance-of-plant functions are critical for power generation."

Well, now we know that there ARE mission critical non-safety-related functions, considered critical for power generation, that are affected and do need remediation. The general idea of this Information Notice seems to be that, since the NRC is concerned only with the oversight of safety-related systems, it is letting licensees know that its auditors found Y2K problems in other critical non-safety systems, even though those fall outside the NRC's jurisdiction.

-- Anonymous, May 19, 1999

Answers

Sorry! I forgot to post the URL for I.N. 99-12. Go to:

http://www.nrc.gov/NRC/GENACT/GC/IN/1999/in99012.txt

-- Anonymous, May 19, 1999


Bonnie --

If you dig a little deeper into one of the actual audits, the research is not much more reassuring. I studied the Washington Nuclear Plant Audit and found several interesting statements.

"...four major computer system (replacements) projects titled "Related Projects," although managed independently from the Y2K project, are tracked by the project manager for Y2K readiness because their completion on time is critical to the success of the Y2K project. These are: (1) the Control Room Plant Data Information System (PDIS) Project, (2) the PeopleSoft Project, (3) the Client Server Project and (4) the Passport Project.

"The audit team identified that the Y2K program schedule has no flexibility to account for unforseen problems in the Y2K readiness activities of the four related projects, and the remaining work on software assets and embedded systems. The Y2K project manager (PM) and project sponsor acknowledged this and stated their intent to address the issue. "

The plant gave the NRC auditors an example of an awareness program that may have pointed out a startling lack of thoroughness on the part of the initial inspectors:

"The awareness effort is ongoing. Recently a site-wide challenge was issued to all employees to identify items not already in the Y2K database, and a reward of a free lunch was offered those who were successful. As a result, 23 entries were submitted, of which 12 were assets not previously included in the inventory of digital assets (for example: digital camera and label maker). On the basis of its review of the licensee's communications, the audit team concluded that the licensee's Y2K awareness program is effective."

Finally, the report states:

"The licensee has not yet completed the detailed assessment phase. This is scheduled for completion by May 30, 1999. The purpose of the detailed assessment is to obtain sufficient information about each inventoried item to determine its expected Y2K performance. The licensee states that as of February 1999, 89% of Mission Critical detailed assessments and 82% of the total detailed assessments have been completed.

Based on the review of some "Low" priority asset documentation, the audit team noted that the Y2K project team appeared to give less attention to the non-safety-related and non-operating-plant software and personal computers (PCs). For example, they were grouping their PCs by type, and testing one of each type. There may be differences in the BIOS of two PCs having the same part number but different dates of manufacture. Also there could be a problem with stand-alone software. Many suppliers do not upgrade the version number for minor changes. This was identified as a potential implementation problem."
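The auditors' point about testing one PC per type can be illustrated with a rough sketch. The inventory records, field layout, and function name below are all hypothetical (nothing here comes from the audit itself); the sketch only shows how grouping by part number alone can hide a BIOS variant that grouping by part number and BIOS date would pick up for testing.

```python
from collections import defaultdict

# Hypothetical inventory records: (asset_id, part_number, bios_date).
# These values are invented for illustration and are not from the audit.
inventory = [
    ("PC-001", "MODEL-A", "1995-03"),
    ("PC-002", "MODEL-A", "1997-11"),
    ("PC-003", "MODEL-A", "1997-11"),
    ("PC-004", "MODEL-B", "1996-06"),
]


def pick_test_samples(records, key):
    """Group records by key(record) and return one asset id per group."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record[0])
    # The first asset listed in each group stands in for the whole group.
    return sorted(ids[0] for ids in groups.values())


# One sample per part number: PC-002's later BIOS is never tested.
by_type = pick_test_samples(inventory, key=lambda r: r[1])

# One sample per (part number, BIOS date): each BIOS variant gets tested.
by_type_and_bios = pick_test_samples(inventory, key=lambda r: (r[1], r[2]))

print(by_type)
print(by_type_and_bios)
```

With this made-up inventory, the finer grouping adds one more machine to the test list, which is the kind of gap the audit team flagged.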

So, I guess if I'm reading all this correctly, this particular plant has yet to finish even assessing their systems, much less remediating them. In my experience, the replacement, integration and testing of new systems is the most time-consuming part of the project. And then the final statement points out the fact that they appear to be fudging a little on the non-safety and "low" priority systems. One thing I learned from Mr. Cowles was that a nuclear plant can have rather low tolerances for the failure of what may appear to be a low priority system. I'll be looking for his assessment of all this.

I found the report to be about 95% reassuring, but that last 5% is really bugging me! Keep up the quality research, Bonnie.

-- Anonymous, May 20, 1999


For Bonnie Camp;

Bonnie,

I am quite interested to discover the focus of your review. I followed a link from the 'infamous' Gary North (man, what a job he's done, eh ?) that led to your items.

This is not an 'answer' to your immediate subject, but I am offering info and another area toward which you might want to direct your investigative efforts.

I 'flagged' this and sent it to several people. I have gotten some patronizing responses, with one exception, and the bulk have not responded at all.

I think this incident, and the communication about it, needs higher visibility and debate.

Let's see what the following means to you. Let me know what you think.

Best regards,

Justin Kline TPheenix@aol.com

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (excerpt of previous cover letter) I get riled about this happy-face-non-journalism spin crap as much as anything I have studied or seen about Y2k.

Here's a challenge, or case-in-point, for you. I wrote to my cohorts in the Upstate Year 2000 User Group about the item I dug up, which I have pasted in below. My original was around Mid-March. I did the same, since, with a couple of 'more visible' folks. I am totally without a response, even from people who are seeking ways to understand.

Please check out what I wrote, for yourself, and tell me where you might find fault with my effort. I'm not looking for 'credit'. I am looking for confirmation from others who are willing to look at what's really happening behind the 'spins' and beyond paragraph 2 or 3.

I would also like to know if you encounter problems trying to back-track my info. I'm not going to just serve it up and hand it to you. I am inviting you, so don't be bashful. This is actually the second challenge. I surmise the first challenge will be finding the time to consider, or review, or respond.

Our User Group is hosted and maintained by russkelly.com. You might have heard of him. You can find out a little about me at his site, under resumes.

Best regards,

Justin Kline TPheenix@aol.com

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alright -- it's soap-box time - - - - - - -

This is why people like me DO NOT TRUST assurances from people ALMOST like me !!!

The following is pasted in direct from the NRC web page from 2/12/1999.

Ask yourselves -- looking at the single line I spaced by itself -- Why ?? There should have been a procedure in place to prevent this type of error from being made. It is very clear (always thus in 20/20 hindsight). It is very stupid. It is inexcusable.

When in uncharted territory and trying new procedures (especially involving life safety issues), EVERY single step involves knowing with absolute certainty where you are BEFORE you take the next step, i.e., make sure you have actually arrived at the point in the process where you THINK you are. Another way to think of it is "grounding" or "centering" for those NEW AGE aficionados.

Part of the cause for this type of mistake is that it is unnatural to our basic human psyche to want things to 'fail'. This creates a natural 'blind spot' that keeps ALL of us from truly trying to 'break' something, in a test procedure such as the immediate subject, and as we function in our day-to-day worlds. You have to dismantle and completely replace fundamental thought processes, and then get yourself to deliberately find ways to 'break' what is being built, or to cause failures.

Yes, I said failures with an 's', because you want to know ALL the ways that a failure could attack or compromise your situation BEFORE you are faced with having to r-e-a-c-t to a failure you didn't have a plan to cover.

See for yourselves ;

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Region I

Peach Bottom Unit 2: Loss of Plant Monitoring System Computers During Y2K Testing

On February 8, 1999, while performing testing for a Y2K remediation modification to the Unit 2 rod worth minimizer (RWM) system, operators experienced a lock-up of both the primary and backup plant monitoring system (PMS) computers. As a result, operators also lost the following PMS-supported systems for about seven hours: safety parameter display system (SPDS), emergency response data system (ERDS), and 3D Monicore thermal limit monitoring system. Engineers had taken the backup PMS computer off-line and had advanced the PMS clock to a year 2000 Date. This led to a lockup of the backup PMS, and the system transferred to the primary, on-line PMS computer.

The engineers did not recognize that the system had transferred and,

believing that the original command was not accepted, again advanced the system clock, causing the primary PMS to lock up also. Several initial attempts to restore the PMS computers were unsuccessful, and operators determined that this constituted a major loss of emergency assessment capability. The PMS computers are not Y2K compliant, but the engineers believed that this would not impact the testing. Operators did not expect the testing would affect the on-line PMS computer. However, before the testing began, operators took contingency actions to lower Unit 2 power slightly to ensure shift average power levels were not exceeded. The licensee plans to perform a full root cause analysis of this event. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

"root cause analysis, my .. .. .. .."

Enjoy.

Justin

-- Anonymous, May 20, 1999


Justin, the Peach Bottom incident got quite a lot of input/debate on this forum after it had happened. If you go to the New Questions page of this forum and do a search (link at the top of the page) for "Peach Bottom", you'll get dozens of links with comments on that issue.

I have stated at other times on this forum that in my view Y2K is as much a human problem as it is a computer problem. Humans make mistakes. Things will be missed, or errors made, probably even in the best-run Y2K projects. Whether the errors of commission or omission or tardiness will cause a major problem, or accumulate into a larger failure pattern, we won't know until it happens - or doesn't happen. This is one of the reasons I continue to advocate personal risk management plans. "Murphy" is not going to suspend his law for the Year 2000.

ariZONEa, I understand your concerns about that "5%". There is positive industry news, but there are too many variables and potential "gotchas" in all industry and government areas for me to be comfortable with PR reassurances. Contingency plans should be across the board: individual, community, business, and government.

-- Anonymous, May 20, 1999


The NRC IN: "There are mission critical non-safety-related functions such as digital feedwater controls, moisture separator reheater controls, reactor recirculating coolant controls, and motor generator set controls that are affected by the Y2K concern and have required remediation. These balance-of-plant functions are critical for power generation."

Bonnie's Conclusion: "Well, now we know that there ARE mission critical non-safety-related functions, considered critical for power generation, that are affected and do need remediation."

FF's Comments: Bonnie, you are reading more into this than the NRC said. Just because Y2K problems require remediation doesn't automatically mean that they were severe enough to impact power generation. In fact, I have seen no such examples for nuclear plants in the industry findings I have reviewed. Almost all nukes have SOME Y2K remediation of non-safety systems, and in the examples I have seen these were always minor bugs that would not have impacted plant operation.

As I have said before, a factual example of a system/component y2k bug in a nuclear plant that would impact power operation would be nice.

Regards,

-- Anonymous, May 22, 1999



Factfinder, I respect what you have reported about your findings before on this forum. However, you'll have to deal with the NRC on this one. If, as you say, these non-safety critical systems "would not have impacted plant operation", then why does the NRC state they "required" remediation? The last I knew the word "required" meant "needed", not "would be nice but not necessary".

In this same Information Notice, the NRC also had already mentioned those systems "such as post-accident sampling, fuel handling, core power distribution monitoring, and reactor vessel level measurement" for which the Y2K problems found "have not affected the functions performed by the devices or systems". In fact, this classification of systems was mentioned in the SAME paragraph!

If there was not a difference in those critical "digital feedwater controls, moisture separator reheater controls" etc., there would be absolutely NO reason to separate them in a statement apart from those in which the performance functions would not be affected by Y2K problems. They could have just listed them all together into the "not affected the functions performed" statement.

They didn't. They separated the systems into two classes. I don't think I'm reading anything into what the NRC said. I think they said exactly what they intended to say. And if you're contending that even if the performance functions of these non-safety mission critical systems were affected, then that still wouldn't affect plant operations, then tell me why they're classified as "mission critical" in the first place? Not "important", not "non-essential", but as the NRC stated, "critical for power generation"?

Your assertions do not logically fit into the total context of this NRC Information Notice, unless you're also contending the NRC is sending out unsubstantiated info to licensees.

-- Anonymous, May 22, 1999


You make some good points, Bonnie. In re-reading IN 99-12, it does seem that the NRC is implying that there are mission critical non-safety systems that had Y2K bugs with more of an impact than the minor problems noted in the safety systems. So I concur that you may indeed be correct in your interpretation - only the NRC can shed light on exactly what they are trying to say here; I certainly cannot.

What I can speak to is the findings in the industry and the use of the phrase "required remediation." I cannot speak for all plants, but several that I have in-depth knowledge of typically want to have full "Y2K compliance". In these cases, even a minor problem means the component is NOT Y2K compliant, and therefore will be upgraded. In these cases, remediation is "Required". That's the exact terminology I have seen used, even for minor problem remediation.

The second point I would like to make is this: I have always thought it very possible that there could be Y2K bugs severe enough to impact a component to the degree that power generation could be affected. Yet in all the industry findings and information I have seen, I have not come up with such a beast (smoking gun, etc.) complete with a manufacturer, model, etc. That's not to say it's not out there, but if it is, it's obviously not very prevalent. But please, let's get some input from others out there, for I would not be surprised to find a significant "smoking gun"....has anyone out there found one?

The NRC IN does seem to imply more significant problems in non-safety systems, but I do not see this as very strong "evidence"; it looks more like semantics to me. In reviewing the audits at the plants, I could find no details concerning the types of problems. A few other bits of information for your consideration: 1. If a vendor says a device is "not compliant", there may not be a lot of digging into the details of the problem; the utility may want full compliance and spend its efforts upgrading, not digging into the details of the Y2K bug found. 2.

In closing, I concur that your reading of the report was fair, and I was wrong in stating that you were reading too much into it; your interpretation seems very reasonable in retrospect. I believe that the wording in the IN itself is unclear and appears to imply that there are more serious bugs in non-safety components that could affect power generation. Indeed, this may be the case, but the IN does not clearly state that, and therefore is not very good evidence. As always, my criterion for evidence of a smoking gun remains the manufacturer and model number, which allows us to confirm the report. As requested above, perhaps others in the industry can supply us with such evidence.

Regards,

-- Anonymous, May 26, 1999

