Embedded Chips.....a dumb question?

greenspun.com : LUSENET : TimeBomb 2000 (Y2000) : One Thread

Can someone explain to me HOW they can test an embedded chip or embedded system?

-- Sheila (sross@bconnex.net), January 14, 1999

Answers

This depends on the system. Embedded systems take every form we've been able to dream up some practical use for.
In most cases, embedded systems that need to keep track of calendar time (a small minority) have some way to set that time. This is necessary because clocks aren't very accurate and wander quite a bit over time (and there can be other reasons depending on the system). The task is to determine how to set that time, then (usually) set it forward and see if anything at all happens that should not. Usually there's a technician who knows how to do this, or it's in the instruction manual. Many embedded systems are built around AT-type motherboards running DOS, for example.
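The set-it-forward test described above can be mimicked in a few lines. This is only a sketch under stated assumptions: the two-digit year storage and the function names are hypothetical and not taken from any particular device. It steps a simulated clock across the rollover and reports any date on which the date arithmetic produces a result that should not happen.

```python
from datetime import date

def days_in_service_2digit(install_yy: int, today_yy: int) -> int:
    """Hypothetical firmware-style arithmetic that keeps only two year digits."""
    return (today_yy - install_yy) * 365

def set_forward_test(install: date, test_dates):
    """Try each test date and collect those where the computed value goes negative."""
    findings = []
    for d in test_dates:
        value = days_in_service_2digit(install.year % 100, d.year % 100)
        if value < 0:  # elapsed time can never be negative, so this is a fault
            findings.append((d.isoformat(), value))
    return findings

if __name__ == "__main__":
    install = date(1995, 6, 1)
    dates = [date(1999, 12, 31), date(2000, 1, 1), date(2000, 2, 29)]
    for when, value in set_forward_test(install, dates):
        print(f"{when}: elapsed days computed as {value}")
```

On real equipment the "findings" are whatever the technician observes after setting the clock forward; the point of the sketch is only that the fault stays invisible until the test date crosses the rollover.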
This testing is usually done off-line, so as not to interfere with normal operations. That's why power plants schedule downtime for testing, and why some manufacturing operations have decided to fix on failure - they simply can't afford to stop the lines. Some processes run continuously, and shutdowns can take anywhere from weeks (refineries) to six months (some nuclear plant designs), so finding the opportunity to test is more difficult than performing the test itself.
One challenge is determining the boundaries of the 'system', since embedded systems often don't stand alone -- they're connected to other systems. These systems of systems might have multiple clocks keeping track of the date for different purposes, and they communicate with one another. It may or may not be necessary to keep these clocks synchronized, and this necessity (or lack of it) must be determined and followed for the test to be properly indicative. In some cases, there is no easily defined boundary, making true end-to-end testing impossible (until 01/01/00, when such a 'test' will be performed ready or not).
In some forums, there is almost what amounts to a bounty to anyone who can find any embedded systems that need to keep track of calendar time, and will have problems at or after rollover, and there's no way to set the time or any visible indication that the time is actually being used. I know of one case that almost fits -- it will roll over and die when the internal time representation runs out of bits, early in April 2023. Fortunately, this is not a critical system (and as usual, I don't expect it to be still around in 2023 (grin)).
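When an internal time representation "runs out of bits", the date can be worked out from the epoch, the tick length and the counter width. The post does not give those details for the April 2023 example, so the sketch below uses two well-known cases instead (signed 32-bit Unix time and the 10-bit GPS week number) and makes no claim about that particular device. The function name is an assumption for illustration.

```python
from datetime import datetime, timedelta

def rollover_time(epoch: datetime, bits: int, tick_seconds: int) -> datetime:
    """First instant at which an unsigned counter of `bits` bits wraps to zero."""
    return epoch + timedelta(seconds=(2 ** bits) * tick_seconds)

if __name__ == "__main__":
    # 31 usable bits of one-second ticks since 1970 (signed 32-bit Unix time).
    print(rollover_time(datetime(1970, 1, 1), 31, 1))
    # 10-bit counter of one-week ticks since 1980-01-06 (GPS week number).
    print(rollover_time(datetime(1980, 1, 6), 10, 7 * 86400))
```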
In summary, each system must be individually investigated, there is no standard way of doing the testing, and in some cases it might be more trouble than it's worth (we hope).

-- Flint (flintc@mindspring.com), January 14, 1999.

Embedded Systems/Chips/Devices

-- Jack (jsprat@eld.net), January 14, 1999.

What I don't see discussed in detail with reliable sources on this forum is the "stamped chip" problem, where chips were mass produced with all functions, calendar included, and inserted into "black boxes" whether they needed the calendar date or not, for cost efficiency. (I've seen only one personal page from an engineer with no other sources.) Maybe I've missed a thread where this was discussed in detail?
- How many of these are really out there?
- What types of devices were they put in, and how critical are they to infrastructures and businesses?
- How sure are we of their initial set dates?
- Are there reliable blueprints for these devices?
- Are there reliable sources of manufacturers still in existence? How many are gone?

-- Chris (catsy@pond.com), January 14, 1999.

What is even more fun are the embedded systems that don't need to use time/date but need to (for example) check a sensor every 'x' seconds. Rather than use a rolloverable counter, an RTC may have been used for supply/logistical reasons.
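For comparison, interval timing with a plain wrapping counter can be sketched as follows. The 16-bit width and the function names are assumptions for illustration. Unsigned modular subtraction gives the right elapsed time across one wrap of the counter, so no calendar date is ever involved.

```python
COUNTER_BITS = 16                 # assumed width of a hypothetical tick counter
MASK = (1 << COUNTER_BITS) - 1

def ticks_elapsed(start: int, now: int) -> int:
    """Elapsed ticks between two counter readings, correct across one wrap."""
    return (now - start) & MASK

def sensor_due(last_check: int, now: int, interval: int) -> bool:
    """True when at least `interval` ticks have passed since the last check."""
    return ticks_elapsed(last_check, now) >= interval

if __name__ == "__main__":
    # Counter wrapped from 65535 to 0 between the two readings.
    print(ticks_elapsed(65530, 4))
    print(sensor_due(65530, 4, 10))
```

This only works when the interval is shorter than one full counter period. A design that reads an RTC and compares dates for the same job picks up a dependency on date arithmetic that it did not need.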
On the newsgroups for automobiles there is a several hundred dollar bounty for proof of a car failing to work due to Y2K. I'll see if I can find the website with the parameters for winning.

-- Tod (muhgi@yahoo.com), January 14, 1999.

For about a month I have been trying to find a reliable report of an industrial quality PLC that stops due to time keeping problems. None yet. (Reliable = serial #, model #, manufacturer name.) A couple of control panels have problems, but I would rather replace/repair 1 control panel than the 250 PLCs it controls.

-- Paul Davis (davisp1953@yahoo.com), January 14, 1999.

And that is why we are always seeing the quotable quote:

"Nobody knows ...."

when it comes to Y2K, especially regarding embedded chips.

And I think that many are missing the point, here. The challenge is not to find and prove that, yes indeed, one of the estimated 1%-2% of all embedded chips that have been produced that will fail has been identified and found to fail. The challenge is to find and prove that all embedded chips that our life sustaining systems depend on will not fail due to Y2K (replacing them with Y2K compliant ones where necessary).

For example, this is from the recent (1/11) NERC report regarding the electric utilities, which I think qualifies as a life sustaining enterprise:
Of greater concern, both in the electric industry and elsewhere,
is the pervasiveness of the Y2k bug in embedded chips. Small
electronic chips control devices used throughout our society.
Examples include heating and cooling systems, VCRs, answering
machines, facsimile machines, coffeepots, microwave ovens, and
traffic light controls.
 
In the electric industry, these chips are used in
communications and numerous power system device controllers.
Electronic chips are generally mass-produced without knowing
the ultimate application of the chip. A single circuit board
can have 20-50 of these chips from various manufacturers.
Because of the diversity of chip suppliers, one vendor may use
a different mix of chips even within devices labeled with the
same name, model number, and year. Many of these chips have
built-in clocks that may experience date change anomalies
associated with Y2k. The difficulty is in identifying all of
these devices, determining if they have a Y2k problem, and
repairing or replacing those that do. It is estimated that less
than 1-2% of these devices may use a time/date function in a
manner that could result in a Y2k malfunction of the device.
Think of it like a Christmas tree string of bulbs, where one bad bulb can effectively cause the entire string of lights to fail!
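The string-of-bulbs analogy can be put into numbers, with heavy caveats. The sketch below assumes every part fails independently with the same probability and that any single failure takes down the whole chain. Neither assumption comes from the NERC text, and the part counts and probabilities are purely illustrative.

```python
def p_any_failure(n_parts: int, p_each: float) -> float:
    """Chance that at least one of n independent parts fails."""
    return 1.0 - (1.0 - p_each) ** n_parts

if __name__ == "__main__":
    # Illustrative values only; not measurements of any real system.
    for n in (20, 50):
        for p in (0.01, 0.02):
            print(f"{n} parts at {p:.0%} each: {p_any_failure(n, p):.1%}")
```

Under those assumptions a small per-part rate compounds quickly in a long chain, which is the worry behind the analogy. If most parts never use the date at all, the independence assumption overstates the risk.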

-- Jack (jsprat@eld.net), January 14, 1999.

How do you test an embedded system for programmed date functions? When I check PROM program versions on our equipment, I have two methods available to me.
The first one is easy, if there is an ID label on the PROM, I can check it against a list for which revision program it's using. The programs have been evaluated using an off-line system (a PC and another, same model, embedded system running in the Program mode) and checking spare PROMs for the program contents. Works great when there are multiple applications of identical programs.
The second method is used when there is no ID label, no spare to check against or it's a one-of-a-kind application. That's when the entire system must be shut-down so that the embedded device can be put into "Program" and a laptop PC connected to it to read out the program to memory for evaluation back in the office. It doesn't take too long to do the download, but access to the device and time to take the system it operates off-line can be the hardest part.
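Once a program has been read out of an unlabeled PROM, one way to avoid evaluating the same code twice is to match the image against those already evaluated. This is a sketch of that idea only, not a description of the tooling used above. The inventory, the function names and the choice of SHA-256 are all assumptions.

```python
import hashlib

# Hypothetical inventory: digest of an evaluated PROM image -> evaluation result.
EVALUATED = {}

def image_digest(image: bytes) -> str:
    """Fingerprint of a PROM image as read out of the device."""
    return hashlib.sha256(image).hexdigest()

def register(image: bytes, revision: str, date_sensitive: bool) -> None:
    """Record the outcome of an off-line evaluation for this exact image."""
    EVALUATED[image_digest(image)] = (revision, date_sensitive)

def classify(image: bytes):
    """Return (revision, date_sensitive) for a known image, or None if it is new."""
    return EVALUATED.get(image_digest(image))

if __name__ == "__main__":
    register(b"\x00\x01\x02", "rev A", False)
    print(classify(b"\x00\x01\x02"))   # already evaluated
    print(classify(b"\xff\xff"))       # unknown, needs evaluation
```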
In one of the earlier posts, someone touched on the subject of "date-stamped devices". These are chips which have their creation date set as a default function to be "etched in silicon" electronically as the chip is designed or manufactured. Another version is a date field that can be loaded into the chip by connecting a computer using the proper software. Such chips can be reset to the default by either removing a battery on the circuit board where the chip is used (like a PC motherboard), by using a jumper to short out certain pins or by plugging into the device where the chip is used with a PC running the proper software.
Hope this provides some answers.
WW

-- Wildweasel (vtmldm@epix.net), January 14, 1999.

Wildweasel
The first one is easy, I THINK I GET IT.
The second method is used when there is
1. no ID label, 2. no spare to check against 3. or it's a one-of-a-kind application.
4. OR YOU CANNOT TRACK DOWN THE MANUFACTURER OF THE EQUIPMENT THE CHIP IS IN.
IT MAY NOT BE THE CHIP BUT THE PROGRAM ON IT.
That's when the entire system must be shut-down so that the embedded device can be put into "Program"
ARE YOU TALKING ABOUT:
1. IN CIRCUIT TESTING- LEAVING THE CHIP IN PLACE.
OR
2. DESOLDERING OR UNPLUGGING THE CHIP.
IF 1. WHERE DO YOU GET THE PROBES?
I WOULD THINK THERE IS A BUSINESS IN MAKING PROBES.
4-LEG ALL THE WAY UP TO 48+ LEG PROBES THAT ARE LIKE ALLIGATOR CLIPS

-- Steven Belsky (balstarr@idt.net), January 14, 1999.

"Stamped chips" are not going to be a widespread problem. Neither are things that use realtime clock chips because they are cheap. I've answered in detail before, but in brief for clock functionality to upset an embedded system either
1. there must be a realtime clock with a battery backup and a hardware design fault that will cause malfunctions in and of itself, without the processor even using the time, or
2. the embedded system must use a realtime clock **that is set to the correct time** and contain a software Y2K bug.
Case 1 is rare. In case 2, note the emphasized phrase. If you are using a realtime clock chip for interval timing, you won't bother with a battery backup or with setting the clock when the system is manufactured. Therefore every time it's powered down, the clock will reset to some arbitrary base date and it'll never reach 2000.
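The argument about clocks without battery backup comes down to simple date arithmetic: such a clock can only show 2000 if the device stays powered long enough to count there from its power-on default. A minimal sketch follows, assuming a hypothetical default of 1980-01-01 (real chips vary) and hypothetical function names.

```python
from datetime import datetime, timedelta

Y2K = datetime(2000, 1, 1)

def latest_rtc_reading(power_on_default: datetime, longest_uptime: timedelta) -> datetime:
    """Latest date an unset, non-battery-backed clock can show before power is lost."""
    return power_on_default + longest_uptime

def can_reach_2000(power_on_default: datetime, longest_uptime: timedelta) -> bool:
    """True only if the clock can count from its default all the way to 2000."""
    return latest_rtc_reading(power_on_default, longest_uptime) >= Y2K

if __name__ == "__main__":
    default = datetime(1980, 1, 1)   # assumed base date, for illustration only
    print(can_reach_2000(default, timedelta(days=5 * 365)))
    print(can_reach_2000(default, timedelta(days=7305)))
```

With this assumed default the device would need about twenty years of uninterrupted power before the rollover could matter, which is why such designs are a low priority for assessment.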
So why the embedded systems fuss? Because you have to find and check every embedded system, or risk it causing a production shutdown (or worse). Also because the consequences in the oil and power industries have the *potential* to be disastrous. (I'm very glad I don't live in the frozen north of Canada).
So, a huge amount of assessment work is needed. Remediation will be needed in very few cases and is trivial in those where the embedded thing is a standard off-the-shelf item. (On the other hand, if it's custom code written 15 years ago by a company now out of business, it's extremely non-trivial!)

-- Nigel Arnot (nra@maxwell.ph.kcl.ac.uk), January 15, 1999.

Answer for Steven Belsky:
The method I was referring to was indeed in-circuit testing. If one of the PROMs has a program that needs replacement, we shut down the system and simply unplug the old PROM and plug in a new one (plug-in chip sockets are wonderful). If there is any problem with that method it's that some older devices use PROMs that are no longer in production. Kinda hard to get some vendor to start production of blank PROMs in memory sizes like 64K or 128K when the world is using 512K and 1MB PROMs.
There have been cases where we've paid several hundred dollars and in a few cases over one thousand dollars per chip (who says Y2K isn't good for business?). In cases where the device supports it, we've been installing EEPROMs instead of one-time use PROMs. Using erasable PROMs that can be re-programmed might make life easier the next time there's a programming crisis. Like when somebody finds an "OOPS!" in one of the Y2K fixes.
WW

-- Wildweasel (vtmldm@epix.net), January 15, 1999.

For Steven Belsky Part II, or OOps, I forgot this part.
The embedded devices I've been working with (Matsushita and NAIS PLCs) can accept a plug-in cable hooked to the RS-232 port on my laptop. There is a slider switch with Run, Program and Remote selections, and a second switch with PROM and EEPROM positions to select which type of chip is going to be installed.
WW

-- Wildweasel (vtmldm@epix.net), January 15, 1999.
