Type testing is a waste of time, money & energy.

greenspun.com : LUSENET : Electric Utilities and Y2K : One Thread

Why spend a lot of time, money and energy on stupid type tests? The same test x times? No, thanks. The same results x times? Yes.

It has to be the same result, because it is the same type of device, the same architecture. The Y2K problem is a date problem, and that is related to the architecture of an object. When two devices of the same type react differently to a date test, then one of the two devices is either malfunctioning or is a different version. The malfunction has nothing to do with Y2K, because the architecture is the same. It must be something else inside the device. You have to test all the different versions, but not the same version over and over.

Typetesting is a waste of time, money and energy, unless someone can prove that it's necessary.

-- Anonymous, April 27, 1999

Answers

Menno.van.der.Wal, an industry "insider" from another country, wrote:

> It has to be the same result, because it is the same type
> of device, the same architecture.

And that, my friends, is exactly why I feel completely justified in believing we may have major power problems.

Jon

-- Anonymous, April 27, 1999


Menno,

I can understand *logically* why you would say this; I think most all of us would say this makes sense. Unfortunately, reality doesn't always match with our common sense.

I will provide one example that was documented to me by a friend on assignment to one of the world's largest pharmaceuticals companies. He sent me documents detailing tests that were run on two identical machines used in their drug-making process. Both were the same model, from the same manufacturer and--get this--purchased on the same purchase order to arrive on the same date.

In Y2k testing, one machine worked fine, the other failed and shut down. A mystery indeed. They worked hard to locate the problem, finally isolating the culprit: one internal chip.

Since multi-purpose chips are a common thing, it turns out that this manufacturer bought its chips at the best price that could be found on the particular day they were needed. When those chips arrived, they would dump them into the bin--alongside the remnants of the previous chip purchase--and continue production.

Unfortunately, as this company learned, one of the chip makers was producing a compliant chip, and one was not.

When these two identical drug-making machines were made, one of them *just happened* to utilize a chip from the first batch, while the other was made with a chip from the new batch.

Though the timing function was not critical to the drug-making machine in question, the multi-purpose chip still had the function embedded within it for other possible applications. When it encountered '00' it failed.

So... two identical machines. One Y2k compliant, the other not. If this company had been content with "type-testing" they would have checked the first machine and then moved on thinking all was OK.
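The gap between the two approaches can be sketched in a few lines of Python. This is only an illustration of the scenario described above, not data from the memo: the `Machine` record, the serial numbers, the model name "X100" and the batch labels "A" and "B" are all hypothetical, and the sketch simply assumes that chips from batch "B" mishandle the year '00'.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    serial: str
    model: str
    chip_batch: str  # hypothetical: which supplier's chip ended up in this unit


# Assumption for illustration only: chips from batch "B" fail on year '00'.
NON_COMPLIANT_BATCHES = {"B"}


def passes_rollover_test(machine: Machine) -> bool:
    """Stand-in for actually running a date rollover test on one unit."""
    return machine.chip_batch not in NON_COMPLIANT_BATCHES


def type_test(machines):
    """Test the first unit of each model and assume the rest behave the same."""
    verdict_by_model = {}
    for m in machines:
        if m.model not in verdict_by_model:
            verdict_by_model[m.model] = passes_rollover_test(m)
    return {m.serial: verdict_by_model[m.model] for m in machines}


def individual_test(machines):
    """Test every unit, regardless of how identical the units look."""
    return {m.serial: passes_rollover_test(m) for m in machines}


# Two machines of the same model, built with chips from different batches.
fleet = [Machine("SN-001", "X100", "A"), Machine("SN-002", "X100", "B")]

print(type_test(fleet))        # both units reported as passing
print(individual_test(fleet))  # the second unit is reported as failing
```

In this sketch the type test only ever exercises SN-001, so the batch "B" chip in SN-002 is never seen.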

Their logic--or common sense--would have led to a production shutdown later. And how long do you think it would have taken under high stress circumstances--with the line stopped, and other failures potentially occurring at the same time--to isolate that same, one chip problem? Now imagine this in a power plant, a chemical plant, etc.

The good news is: this pharmaceuticals company did NOT trust 'type-testing.' They were very thorough. Your response (and I hope you can see that this is not a personal attack, just a *logical assessment* based on what you've said) indicates that many others are not being as thorough--for whatever sincere and common sense reasons.

I am forced by documented facts to side with Jon on this one: The future should prove to be very interesting.

Bob Allen

-- Anonymous, April 27, 1999


Bob,

Was the machine called compliant by the manufacturer or supplier ? Was there a difference in versions between the two machines ?

Menno

-- Anonymous, April 27, 1999


Menno,

I do not know whether the manufacturer made any claims of compliance.

As to 'different versions', this I can answer. No. As I wrote before, these were two identical machines, manufactured, purchased and integrated at the same time. That was the point of the internal memo passed along to me from the pharmaceuticals company. It was essentially a warning from the management saying, "Don't trust type-testing."

The document did say that their policy on vendor claims was "trust, but verify." In other words, this company tested everything. That's how seriously they consider the whole Y2k challenge. But, again, as to a specific claim of Y2k-compliant status for these two machines, I cannot tell you whether it had been made.

Thanks for your questions, and I hope this helps you and many others to an awareness that will spare us from similar "illogical" system failures in the days to come. We're in this together.

Bob Allen

-- Anonymous, April 27, 1999


I believe there may be a difference in the definition of type-testing here, as Menno is using it, and as the term is used in the U.S. Menno wrote, "Typetesting is a waste of time, money and energy, unless someone can prove that it's necessary." It's my understanding that the definition of type-testing as used here in the U.S. is to test just one device and if that one test indicates compliance or "readiness" then the assumption is that you don't have to test any other devices of the exact same make. Under this definition, type-testing is a way to save time, money and energy, not the reverse. This definition of type-testing (test one and therefore assume all the rest are ok, too) is what has spurred some anxiety and debate surrounding the lack of time needed for testing, because of the late industry start in addressing Y2K issues and subsequent deadline pressure.

As Bob has related, there have been cases where taking the time to test each device, regardless of how identical they may seem, has been shown to be *not* a waste of time. NERC has acknowledged the danger of type-testing (of the definition I gave) in their reports, as have the auditors in NRC reports. NERC has also stated that some utilities are doing type-testing (of the definition I gave). However, NERC's recommendation is that this type-testing NOT be used for mission-critical systems and that all devices be assessed and tested individually for optimum safety. The NRC audit reports have indicated that some utilities are going with type-testing for critical items because of their extremely high confidence in the vendor of a device, and said vendor's own testing and compliance statement. Nevertheless, for the reasons Bob gave, NERC has steadily continued to recommend individual testing of mission-critical devices.

Unfortunately, the fact of the matter is that there is often not enough time, money or resources to do that individual testing, even if it remains the optimal practice. The high confidence in a vendor which is needed for a utility to be comfortable about not testing each device is demonstrated by an answer in the NERC March utility survey to the question, "List the greatest obstacles your organization faces in achieving Y2K readiness by December 31, 1999." This answer was:

"Having vendors come back and say hardware or software that was Y2K ready or compliant no longer is ready."

The situation boils down to having to make tough decisions regarding how much testing actually can be accomplished in the short time remaining and how much trust can be placed in specific vendors or manufacturers.

-- Anonymous, April 27, 1999



I have found this notion of chips that mysteriously fail highly suspect. Can you provide a part number for the changed chip? If you have the documentation you claim it should give the part number and manufacturer of the chip. Don't have to name the company using the chip to make this widget (I have heard it called a dozen things, this story has made the rounds), just give me the ID for the chip and who is making it - this is stamped on the plastic case around the chip.

-- Anonymous, April 28, 1999

Paul,

Is this a chip failure?

http://www.greenspun.com/bboard/q-and-a-fetch-msg.tcl?msg_id=000ljA

~C~

-- Anonymous, April 28, 1999

Hopefully, a better link to embedded testing article:

http://www.greenspun.com/bboard/q-and-a-fetch-msg.tcl?msg_id=000ljA

-- Anonymous, April 28, 1999

Ok... Over on Ed Yourdon's forum is a thread titled:

Embedded Systems: The Wildcard?

It's a clear piece about an embedded systems test at Texaco.

Big Dog, you're right. I could benefit from a little break in the action.

~C~

-- Anonymous, April 28, 1999

Darned if I can find that thread, Critt. I will say that people who claim hardware clock failure and such get awfully shy when you ask them for a simple part number.

-- Anonymous, April 28, 1999


Paul,

No one's gotten 'shy' on you.

Unfortunately, the documentation I have does not contain the information you are requesting. This is an internal memorandum which explains the incident, but does not give all the minutiae. It is from the company that experienced the failure, so it's NOT a second-hand report about "something someone heard from somewhere."

As Marcella points out in another thread, this is one of the great frustrations of Y2k: the people who have access to the firsthand information are often unable or unwilling to provide it because they have been warned (or mandated to sign documents) to remain silent.

I have personally received a lot of documents where the providers will not allow me to reveal certain elements of the information contained. This limits my ability to publicize the contents, and thereby makes it a matter of trust, just as we all find ourselves blindly trusting some reporter or editor on stories like Kosovo, etc.

We will never have certainty either way about the exact outcome of Y2k until it is too late. This applies to all types of companies, including those that are the subject of this forum on electricity.

And to take this one step further, I have information on several companies--including one of the ten largest in America--that, if widely publicized and believed, have the potential to collapse their stock values overnight. I have internal remediation schedules for this company--a company that you would undoubtedly recognize--that are in total contradiction to SEC filings and prove that at least some companies are lying to the public, and only have faint hopes that they will finish a number of their critical fixes in time.

This, and the drug-making machine incident, deals with what I know. What about all that we still don't know? I have to agree with Bonnie's continuing admonition that prudence dictates some level of preparation. How much? That's for each of us to decide.

Looking for verification has an important part to play in all of this. At some point, however, information must lead to action. If you persist in skepticism past the point of being able to prepare, I hope you live very close to someone like Bonnie or me. We're busy getting ready to help.

Bob Allen

-- Anonymous, April 28, 1999


Motorola is not shy about listing their non-compliant RTCs and the part numbers. Here are 70: http://mot-sps.com/y2k/mailing.html#rtclist

Almost all large semiconductor manufacturers make clones of these parts.

The Texaco offshore oil rigs are largely controlled by QNX based industrial PCs.

QNX's home page is here: www.qnx.com

An example of an industrial PC can be found here: www.texasmicro.com

Here's what QNX has to say about Y2K and PCs:

"A further concern which affects almost every user of PCs is that the hardware clock on IBM-compatible personal computers stores the year as a two-digit value. QNX systems are most often configured to set the system time from this hardware clock upon startup by running the rtc utility. (QNX does not use the BIOS to determine system time at startup, and the rtc utility is the only component of the QNX operating system which uses or interacts with the hardware clock.) Current (QNX 4.23 and later) versions of the rtc utility function correctly through the year 2000 and beyond on most systems. Free updates are available for older versions of the rtc utility which have problems handling the transition to the year 2000 on machines whose BIOSes do not automatically make the century transition."

Notice the line, "QNX does not use the BIOS to determine system time at startup". Where's "Fact"Finder when you need him?

Look here http://www.qnx.com/support/y2k/index.html for the other 15 pages describing the problem, fixes, patches and work arounds.
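The two-digit hardware clock year mentioned in the QNX statement above can be illustrated with a small sketch. This is not QNX's rtc utility or any vendor's actual code; the function names and the pivot value of 70 are assumptions chosen only to show why software that prefixes "19" breaks at '00', and how a "windowing" rule is one common way to interpret a two-digit year.

```python
def naive_year(two_digit_year: int) -> int:
    """Interpret a two-digit year by always assuming the 1900s."""
    return 1900 + two_digit_year


def windowed_year(two_digit_year: int, pivot: int = 70) -> int:
    """Interpret a two-digit year with a fixed window.

    Values at or above the (assumed) pivot are read as 19xx,
    values below it as 20xx.
    """
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year in the range 0..99")
    century = 1900 if two_digit_year >= pivot else 2000
    return century + two_digit_year


# The clock rolls from 99 to 00 at the century transition.
print(naive_year(99), naive_year(0))        # 1999 1900
print(windowed_year(99), windowed_year(0))  # 1999 2000
```

A window only moves the ambiguity to another date (here, two-digit years of 70 and above are still read as 19xx), which is one reason the choice of pivot has to be checked per system.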

Here's a list of other real world applications deployed on QNX and industrial PCs. All of which have potential Y2K problems. http://www.qnx.com/realworld/index.html

As I am very fond of repeating, QNX is only _one_ OS of many other proprietary and completely custom embedded OSen. Texas Micro is only _one_ of many manufacturers of industrialized PCs. The Motorola chips are not the only ones with the problem.

QNX and Motorola should be commended on how upfront they have been with their Y2K information.

Partly as a result of their openness and cooperation Texaco is almost completely finished (and tested!) with their Y2K remediation in their plant and automation facilities (there was a good article about it in Wired magazine a couple of months back). I have tried to communicate some of the more subtle issues concerning embedded systems in various threads in this forum (you can find them all by doing a search for "ajedgar"). Sometimes I don't explain things as well as I might. If any of you have questions you would like answered please feel free to ask. If I don't have the answer at hand I'll almost certainly be able to find it.

Regards,

--aj

-- Anonymous, April 28, 1999


Thanks Bonnie for making it clear. You are a big help. The question has to be: Individual testing is a waste of time, money & energy.

Bob's story doesn't bring me any closer to individual testing.

Some steps we have taken in our project are:

1. Send a questionnaire about compliance to manufacturers/suppliers.
2. Investigate the answers.
3. Decide what to do (upgrade, audit, test, etc.). If we trust the vendor's compliance statement, then we do a type test, even if it is a critical system. If we don't trust the vendor's compliance statement, then we do individual tests.
4. Perform an integral test.
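The decision in step 3 above can be written down as a very small rule. This is only a sketch of the rule as stated in this post; the names `Approach` and `choose_approach` are hypothetical, and the other possible outcomes mentioned (upgrade, audit) are left out.

```python
from enum import Enum


class Approach(Enum):
    TYPE_TEST = "type test"
    INDIVIDUAL_TEST = "individual test"


def choose_approach(vendor_statement_trusted: bool) -> Approach:
    """Pick a test approach from trust in the vendor's compliance statement.

    As described in the post, criticality of the system does not change
    the outcome: a trusted vendor statement leads to a type test even
    for a critical system.
    """
    if vendor_statement_trusted:
        return Approach.TYPE_TEST
    return Approach.INDIVIDUAL_TEST


print(choose_approach(True).value)   # type test
print(choose_approach(False).value)  # individual test
```

Everything in this rule hinges on the single trust input, which is exactly the point the other posters in this thread are disputing.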

As Bonnie wrote: The situation boils down to having to make tough decisions regarding how much testing actually can be accomplished in the short time remaining and how much trust can be placed in specific vendors or manufacturers.

Back to Bob,

I need to know more facts like: Who was the manufacturer ? Type ? Version ? Compliant ?

This is no evidence.

But if Jon Hylands will pay the price we are willing to do individual testing.

-- Anonymous, April 29, 1999


Menno,

Again, I wish that I *could* provide the information you and others seek... unfortunately it is not possible for me to do so.

Probably for the same reasons everyone else refuses to mention part numbers and makers (i.e., lawyers out to make a short-term buck for themselves or their benefactors at the possible expense of the public at large) the internal memo I have in my possession does not include that specific information. And, I'm not surprised. The intent of the memo was to tell other project heads within the company that type-testing was not sufficient in upper management's eyes--based on this one experience. I am sure that those who worked with similar equipment were given all the details necessary, but it was not published for me to see.

Complicating things for me in doing further research on this incident (which occurred over a year ago) is that my source has been reassigned recently to another facility (a nuclear power plant, of all things!).

If my story isn't enough to make someone think long and hard about being content with type-testing on critical systems... there's nothing more I can say or do. When I have solid information I will try to sound the alarm. If people ignore the fire alarm and burn... they can't blame anyone for not trying to warn them.

Unfortunately, as is always the case, those who have ignored the various Y2k alarms for too long may take a lot of others with them. I'm thankful I no longer fear dying.

Bob Allen

-- Anonymous, April 29, 1999


Menno wrote:

> But if Jon Hylands will pay the price we are willing to
> do individual testing.

Heh heh, no, I think I'll spend my money on my solar/wind system and more stored food...

Jon

-- Anonymous, April 29, 1999


