The Importance of EMBEDDED Systems Revisited: Info, Discussion, Links


There has been many a great discussion on Embedded systems. Some seem to get lost in the flurry, so I thought I would excerpt and link a couple forums on this page. Feel free to add all relevant links you wish.
A great quick discussion (please visit that thread for much more great info):
http://www.greenspun.com/bboard/q-and-a-fetch-msg.tcl?msg_id=001p5v
-------------------------
Laura --
Sorry, was out doing some chores and just got back.
As to the 'real' numbers, I couldn't tell you. I've heard numbers ranging from a low of 30 billion to a high of 165 billion. And it really depends on how you count them. Is a microwave an 'embedded system'? How about a coffee maker? The chips in your car?
If those are all counted then I would believe the higher numbers. (As an example, I believe my car has about 30 chips in it. How many millions of cars in the U.S. alone?)
Sorry I cannot provide a soporific for you. I don't sleep all that well some nights either. There are a whole lot of variables involved in this.
First, we don't know how many of these things there are. It can be argued that 'Well, nobody actually has to know this. Each individual company or organization knows what *they* have, and why should they care about what somebody else has?' This argument fails due to the fact that
a). Not every organization *does* know what they have. The thing that people forget is that a lot of this stuff was designed to be 'fire and forget'. You turn it on and forget about it until it either fails, is replaced, or requires maintenance. And there are some that were procured, installed, maintained, etc, by people who are no longer with the organization. There may or may not have been records, but who thinks to look at them?
b). They may or may not know what *other* systems the given one interacts with. That is, the organization may know what systems *they* have, but not what systems are *also* required for their systems to work.
Second, with respect to 'embedded' chips, there is a marked tendency to forget about the *software* (or firmware, for the purists), and concentrate on the *hardware*. This was one of the main thrusts of the Dale Way essay (Critique of Ed Yourdon's Y2K End Game Essay). As soon as you concentrate on the *hardware* you are, indeed, probably looking at something like a 3% or lower failure incidence. But the *real* issue is the code that resides in these things. As an interesting example of this, there was a thread three or four days ago in which I was arguing with Paul Davis about this sort of thing, (the one where he was pontificating about 'hand waving', 'magic', and the lack of chips in cars), and he stated something like 'Of course, it is possible to put non-compliant code in a compliant chip...'. I restrained myself, with difficulty, from pointing out the obvious, which is that *THAT IS EXACTLY WHAT THE PROBLEM IS!* I mean, the hardware itself *almost* never cares about the date. This is the basis for all of the arguments about the 'hardware only cares about the "tick" of the real-time clock'. Which is true, but tells one ABSOLUTELY NOTHING about whether the software in there cares.
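To make the hardware/software distinction concrete, here is a minimal Python sketch of the kind of firmware logic at issue. It is not taken from any real device: the function names, the two-digit year storage, and the pivot value of 80 are all assumptions made for illustration. The clock hardware just counts; whether the step from 99 to 00 causes trouble depends on what the code does with the number.

```python
def years_since_service_naive(current_yy: int, last_service_yy: int) -> int:
    """Elapsed years computed directly on two-digit years (hypothetical firmware logic)."""
    return current_yy - last_service_yy


def expand_year(yy: int, pivot: int = 80) -> int:
    """Windowing: treat yy >= pivot as 19yy, otherwise as 20yy.

    The pivot of 80 is an assumed value chosen for this example.
    """
    return 1900 + yy if yy >= pivot else 2000 + yy


def years_since_service_windowed(current_yy: int, last_service_yy: int) -> int:
    """Elapsed years after expanding both two-digit years to four digits."""
    return expand_year(current_yy) - expand_year(last_service_yy)


if __name__ == "__main__":
    # Serviced in '99, checked in '00.
    print(years_since_service_naive(0, 99))     # -99
    print(years_since_service_windowed(0, 99))  # 1
```

The same chip runs both versions equally well, which is the point: "compliant hardware" says nothing about which version is in the ROM.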
Third, a good many of these things were written in the late seventies and early eighties. Almost nobody gave a thought to Y2K back then, and even those who did usually got overruled because there was "no way this system will survive till then." After all, the rated life of the chips themselves was only about 5 years. Unfortunately, this isn't the way it worked out. An awful lot of those systems are still in place. There typically isn't any source code for them. There isn't any documentation on them. (Frequently, not even a requirements document. This was one reason why a lot of them have survived. Nobody knows or remembers what all they were supposed to do, or what sort of restraints were required, so nobody wants to replace them, not knowing what will be overlooked.) Systems like this are *extremely* hard to remediate. Shoot, they are hard to inventory or assess.
Fourth, an awful lot of the 'speculation' concerns individual chips. This is probably a mistake. It would probably be a lot more intelligent to concentrate on the number of *systems* which contain these chips. I have no feel for this at all. There has been virtually no discussion of this point that I have found; it has all concentrated on the chips.
Looking at systems would be easier, as I suspect that the numbers would be much more manageable. Instead of 30-165 billion chips, you would probably be looking at 1-2 billion systems. The *downside* of this is that you would probably be looking at *MUCH* more significant failure rates. (I believe the Gartner Group posited 30% to as much as 60% 'systemic' failure rates, but I didn't read the whole article, just the part that was 'cut and pasted'.) I do suspect that a lot of systems are vulnerable to failures of the *system* due to failures of small proportions (possibly as few as *one*) of the chips.
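The chip-versus-system point can be put into rough numbers. The sketch below is only an illustration under two strong assumptions that are not in the post: chips fail independently of one another, and any single chip failure takes the whole system down. The 3% per-chip rate and the 30-chip count are borrowed from the figures mentioned earlier in this post; everything else is hypothetical.

```python
def system_failure_probability(chip_failure_rate: float, chips_per_system: int) -> float:
    """P(at least one chip fails) = 1 - (1 - p)^n.

    Assumes independent chips and that one failed chip fails the system.
    """
    return 1.0 - (1.0 - chip_failure_rate) ** chips_per_system


if __name__ == "__main__":
    for n in (1, 10, 30):
        p_sys = system_failure_probability(0.03, n)
        print(f"{n:>2} chips at 3% each -> {p_sys:.1%} chance the system fails")
```

Under those assumptions a 30-chip system comes out at roughly 60%, which shows how a small per-chip rate can turn into a large per-system rate. Real systems will not be this fragile on every chip, so treat the result as a bound on the reasoning, not a prediction.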
I don't know if this helps or not. It is about the best I can do.
-- just another (another@engineer.com), November 21, 1999.
-----------------------------
Another short great description:
--------------
From: http://www.greenspun.com/bboard/q-and-a-fetch-msg.tcl?msg_id=001rdB
Mr. Fey - I would like to address your concern about the applicability of considering the importance of secondary clocks.
To understand the importance I will use a 'real-world' example of which you might have some familiarity. The clocking system of the computer sitting in front of you.
Your system, if it is like most, has a BIOS subroutine that enables you to reset the real-time clock on the motherboard of that computer. The input you provide the BIOS routine is a form of primary clock. It provides a start reference point akin to "At the tone, the time will be...!" In almost all other applications of importance to the y2k issue ... telecom, power, and other utilities; military systems; satellites; etc ... the primary clock consists of an electromagnetic signal emanating from the National Institute of Standards and Technology or the Global Positioning System (GPS). Just like entering the wrong date/time into your BIOS configuration, an erroneous date/time signal from NIST or GPS will result in an erroneous time stored in affected systems.
The next level in the clocking hierarchy, the secondary clock, would include such circuitry as the real-time, or hardware, clock itself in your system. Every time the secondary clock circuit initializes, it recovers the stored values of current date and time from the original primary input and adds the elapsed time since that input. The secondary clock then outputs the 'corrected' current date/time to the working, or tertiary, clock. For MOST systems, the secondary clock is what provides the continuous definition of current date/time to the hardware and core operating system.
Finally, in your computer is what is commonly referred to as the Software clock ... which is the tertiary clock. This is used by application software for maintaining current date/time when running those applications. This clock is nothing more than a software subroutine that keeps a continuous count of elapsed time and adds that to the value of the secondary clock when the software application was initialized. The problem here is that the subroutine uses the timing chain provided by the secondary clock as a means of keeping track of time intervals to determine that elapsed time.
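The software clock just described (a start value plus a running count of ticks) can be sketched in a few lines of Python. The class name, the one-second tick, and the example dates are assumptions for illustration, not a description of any particular operating system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SoftwareClock:
    """Tertiary clock: a start value plus a count of elapsed ticks."""
    start: datetime       # value read from the secondary clock at initialization
    tick_seconds: float   # assumed length of one timing tick
    ticks: int = 0

    def tick(self, count: int = 1) -> None:
        """Advance the elapsed-tick counter."""
        self.ticks += count

    def now(self) -> datetime:
        """Current time = start value + elapsed ticks."""
        return self.start + timedelta(seconds=self.ticks * self.tick_seconds)


if __name__ == "__main__":
    good = SoftwareClock(start=datetime(1999, 12, 31, 23, 59, 0), tick_seconds=1.0)
    bad = SoftwareClock(start=datetime(1900, 1, 1, 0, 0, 0), tick_seconds=1.0)
    for clock in (good, bad):
        clock.tick(120)
        print(clock.now())
```

Both clocks count the same 120 ticks correctly. The one that was handed a wrong start value simply stays wrong by the same amount, and nothing in the counting logic can notice.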
Y2K has the POTENTIAL of creating the following problems in clock/timing circuits:
a) If GPS glitches, so does the entire continuum of systems utilizing it for primary clock and timing information.
b) If the BIOS is not compliant - the secondary clock may lose the date/time input from the primary clock and/or it may not run at the correct rate.
In the first case, the system hardware may either not initialize at all or it may fail at the start of the operating system, resulting in the 'blue screen of death'. Either way, the result is an obviously dead system.
In the second case, one error that might result will probably be that system timing can become compromised resulting in a system error shut-down. The timing chain upon which the entire system depends becomes scrambled. The operating and/or hardware systems lock up and the system operator will have to reboot. All application software data entered from the time the timing chain started to glitch until the system locks up will most likely be lost. Note: this condition could also lead to the demise of circuit boards and/or components in the system due to additive effects of timing pulse overvoltages burning out microcircuit components that have a clock input. Overlapping voltages on the clock bus are additive, like batteries in series.
The other scenario is perhaps the most insidious of all - application software would be initialized with an erroneous initialization date/time. The software may suffer an obvious error and shut down or, worse, continue to operate using that faulty date/time info.
Conclusion: The secondary clock is VERY important ... it is the heart of the system ... everything is derived from, and depends upon, the accurate initialization and operation of the secondary clock.

-- hiding in plain sight@edge. of no-where, November 26, 1999.
----------------------------------
An excellent, albeit long, thread for review:
http://www.greenspun.com/bboard/q-and-a-fetch-msg.tcl?msg_id=001eWJ
"The Institution of Electrical Engineers: The Millenium Problem in Embedded Systems": List of applications of embedded systems
Also there:
Embedded Systems Fault Casebook
GN: Category: Noncompliant_Chips
*Embedded Systems and the Year 2000 Problem
(The OTHER Year 2000 Problem)
The "Embedded Systems" archive from TB2000:
http://www.greenspun.com/bboard/q-and-a-one-category.tcl?topic=TimeBomb%202000%20%28Y2000%29&category=Embedded%20Systems

-I am in a hurry, and I know there are a great many more valuable resources re. this on the web. Please add relevant and lucid explanation info and links. Thanks!

-- faith'nhope (y2kaos@home.com), December 04, 1999

Answers

Embedded clocks that are "unknown" will of necessity fall into two rough categories: those without battery backup, and those with battery backup. ("Megacaps" fall into the "battery backup" category, with the caveat that they will only sustain the clock for a few minutes to a few weeks, depending on system design.)
Systems designed without battery backup will revert to their epoch date each time they're powered down, and can *probably* be presumed to be able to handle it gracefully.
That leaves us with systems -- with "unknown" clocks -- that are battery backed.
If we narrow our focus to those systems that will cause 1/1/0 failures when they hit The Date, we should brace *not* for a sudden rash of problems at the stroke of midnight Jan 1, but for a bell curve starting *very* soon, and extending an equal amount of time *after* the end of the year.
The reason for this is simple. Clocks drift. Most (heck, *all*) of the computer clocks I've used are *far* less accurate timekeepers than even the cheapest two dollar digital wristwatch I've ever seen.
With your PC, it's trivial to adjust the clock -- easier than a typical wristwatch, in fact.
But with a system where the clock is *not* even known to *exist*, the drift will accumulate, and accumulate, and accumulate...
Some will run a tad fast, some will run a tad slow.
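For a feel of how much drift can pile up, here is a small Python sketch. The drift rates (in parts per million) and the ten-year unattended period are hypothetical values picked for the example, not measurements of any real device.

```python
SECONDS_PER_DAY = 86_400


def accumulated_drift_seconds(drift_ppm: float, days_unattended: float) -> float:
    """Offset built up by a clock running drift_ppm parts per million fast (+) or slow (-)."""
    return drift_ppm * 1e-6 * days_unattended * SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical drift rates over a hypothetical 10 years without adjustment.
    for ppm in (-100, -20, 20, 100):
        offset = accumulated_drift_seconds(ppm, 3652.5)
        print(f"{ppm:+5d} ppm -> clock is {offset / 3600:+.1f} hours off true time")
```

A positive offset means the clock is ahead and would reach its rollover early; a negative one means it would reach it late. Larger drift rates or longer unattended periods widen the spread in proportion.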
It'll be interesting to see what comes of this. I would *not* be surprised to find out that some of the recent refinery and processing kabooms were caused by drifting embedded clocks.
I'd be even *more* surprised if we *did* find out, even if it were the case. CYA, I suspect, has become the order of the day.

-- Ron Schwarz (rs@clubvb.com.delete.this), December 04, 1999.

Well, this thread certainly is interesting, though quite flawed technically. I recommend the Dell discussion of the RTC and secondary (system) clock issues, if anyone desires factual information:
------------------
The Century Rollover and the PC System Date
September 1996
http://www.dell.com/us/en/bsd/topics/vectors_1996-century.htm
Pete Woytovech, Senior Programmer, Dell BIOS Development
There has been much discussion of the information systems challenges facing most companies as the year 2000 approaches. Most of the attention has centered on the significant problem of identifying and updating business software applications to handle the century rollover. Another aspect of the problem, however, has received less attention. Most of the personal computers (PCs) in operation today will be unable to advance their hardware-based system dates to the year 2000 without intervention through the system BIOS. It is most likely to happen to PCs that are powered off at the time of the century rollover, and then powered up on January 1, 2000 or after.
Unlike the larger software applications problem, the system date problem is quite easily remedied. The ramifications of not fixing it, however, can be major. For instance, many applications rely on the system date for accurate processing. Additionally, network operating systems (NOSs) and workgroup applications base many automated network functions on the server system date.
Most PCs maintain two system dates: one in the CMOS real-time clock (RTC), a chip located on the system board that maintains the date and time even when the system is powered off, and one in the operating system (OS) software. The RTC system date can be set and retrieved using BIOS calls. When the OS is booted, it normally initializes the current date from the RTC through a BIOS call. Once initialized, the OS maintains the system date as long as the system is powered on.
Conversely, under certain conditions, most OSs can update the RTC date while the system is running. This behavior applies both to standalone PCs and PC-based network servers. In contrast, networked PCs can behave somewhat differently, depending upon the NOS. Many NOSs are configured so that when a PC logs in to the network, the local OS can initialize the current date from the NOS so that the server controls the system date on its attached networked PCs.
The system date is maintained correctly as long as the RTC is able to automatically update to the correct date or the OS has an opportunity to update it. However, the century rollover challenges this process because, under certain conditions, the PC RTC will not be updated correctly to the year 2000. This is due to an inherent flaw in most RTCs dating back to the early days of the PC, as well as peculiarities in the system dates of some of the most common OSs.
[Much more at the link above].
---------
The title of this thread is catchy, but I of course prefer the original "Embedded Systems Revisited" posts:
Embedded Systems Revisited
Oil & Gas Revisited - IT and Embedded Systems - (html format)
10 Documented Examples of Y2K Functional Failures in Embedded Systems
Regards,

-- FactFinder (FactFinder@bzn.com), December 04, 1999.
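
The RTC behaviour described in the Dell excerpt above can be sketched roughly as follows. This assumes an RTC that keeps a two-digit BCD year plus a separate century byte that the hardware does not advance on its own; the function names and the 1980 cutoff are assumptions for illustration and are not taken from Dell's BIOS.

```python
def bcd_to_int(value: int) -> int:
    """Decode one packed-BCD byte, e.g. 0x99 -> 99."""
    return (value >> 4) * 10 + (value & 0x0F)


def year_from_rtc_naive(century_bcd: int, year_bcd: int) -> int:
    """Trust the stored century byte as-is."""
    return bcd_to_int(century_bcd) * 100 + bcd_to_int(year_bcd)


def year_from_rtc_corrected(century_bcd: int, year_bcd: int, earliest: int = 1980) -> int:
    """BIOS-style intervention (hypothetical): a year before `earliest` cannot be
    right, so assume the century byte was left behind and move it forward by one."""
    year = year_from_rtc_naive(century_bcd, year_bcd)
    return year + 100 if year < earliest else year


if __name__ == "__main__":
    # PC was powered off over the rollover: year digits went 99 -> 00,
    # century byte still holds 19.
    print(year_from_rtc_naive(0x19, 0x00))      # 1900
    print(year_from_rtc_corrected(0x19, 0x00))  # 2000
```

An OS that initializes its own clock from the naive value at boot starts out a century behind, which is why the intervention has to happen at or before the point where the OS reads the date.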

A few more comments on the clock issue. Typically, the RTC is referred to as the primary clock, the operating system as the "secondary clock". GPS time/date is sometimes used for improved accuracy, but this is not very common in the power industry except perhaps in the SCADA systems used for grid operations, and even then it's not always present (or required).
Some companies talk about "tertiary" clocks, blah blah, but basically they are talking about RTC date/time ---> BIOS date/time storage in CMOS ---> OS clock.
Basically, it's really two clocks: the RTC and your operating system clock. Turn on your PC, the BIOS gets the RTC date, your OS starts up, then your OS gets the date from the BIOS CMOS registers.
The secondary clock generally referred to is the operating "system clock" (DOS/Win 3.1/95/98 for example), which grabs the RTC date via the BIOS CMOS upon bootup, then keeps its OWN time off of the microprocessor clock pulses. The system clock is pretty inaccurate compared to the RTC, which is why your PC can "drift" off time if left on for days. Also, if you change the Win/DOS system clock, it updates the RTC clock as well. (Note: some networks feed the time instead from another server on bootup.)
Regards,

-- FactFinder (FactFinder@bzn.com), December 04, 1999.

The two constants of the universe appear to be:

1) The speed of light.

2) Vastly different "expert" opinions as to the significance of Y2K on embedded systems.

The speed of light is a measured value. The Y2K/embedded issues are anyone's guess.

"Hope for the best, prepare for the worst."

-- King of Spain (madrid@aol.cum), December 04, 1999.

KOS, hate to be the one to break the news, but the speed of light also varies....according to the medium....
Regards,

-- FactFinder (FactFinder@bzn.com), December 04, 1999.

FactFinder --
And what, pray tell, do PC's have to do with embedded systems?

-- just another (another@engineer.com), December 04, 1999.

FactFinder -- And what, pray tell, do PC's have to do with embedded systems?
-- just another (another@engineer.com), December 04, 1999.
Just another, quite a lot actually. In the context of y2k, many use the term "embedded systems" to describe any systems outside of the traditional IT domain, such as DCCs, PLC based systems, SCADA based systems, etc. Many of these systems use PCs as the man-machine interface (MMI), and in some cases the system itself is PC based.
The Gartner Group, GIGA, IEE (UK) all typically include PC based systems used for monitoring, SCADA, etc., under the broad category definition of "embedded systems".
If you are a purist and prefer to use the term "embedded system" only for firmware based devices (this is my preference as well), be aware that even the field of embedded systems programming is more and more moving into the typical PC-type microprocessor area with embeddeds.
Even using the narrow definition of "embedded systems" you will find that it is quite common that a PC is used (yes, a standard IBM compatible PC in many cases) to communicate and exchange data with the embedded devices. Examples are PCs used as the MMI for PLC based systems, RTUs used with SCADA PCs, radiation monitoring embeddeds communicating with PCs, portable test equipment using PCs, data acquisition systems using PCs, etc. In assessing and testing in the power industry I frequently had to test embedded devices along with PC based equipment in order to test the system as a whole (end to end testing).
Regards,

-- FactFinder (FactFinder@bzn.com), December 05, 1999.
