Embedded Systems -- Failures

greenspun.com : LUSENET : TimeBomb 2000 (Y2000) : One Thread

I don't work on Embedded systems. I'm strictly a Cobol/VB programmer that has done y2k work. I don't understand why the pollies discount evidence like this. Do you guys think these are all lies? I'm just curious.


-- Larry (cobol.programmer@usa.net), October 28, 1999


I don't think anyone is saying that embedded systems cannot have Y2k failures. Just like any system that misuses a date construct embedded systems can misuse date constructs too. I work with embedded systems and I have always maintained that the Doomers wildly exaggerate the scope of the embedded problem. They assume that if it is an embedded system then it must have a clock which cares about the year and somewhere embedded software is using the year incorrectly. This is totally untrue and I guess only engineers with embedded experience can understand this fact.

-- You Knowwho (debunk@doomeridiots.com), October 28, 1999.
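
For readers unsure what "misusing a date construct" means in practice, here is a minimal sketch in C. The function names are hypothetical and come from no system mentioned in this thread. It assumes a record that stores only the last two digits of the year, which is the classic case.

```c
/* Hypothetical helper: elapsed years computed from two-digit years,
   the way many legacy records stored them (assumption for illustration). */
int elapsed_years_2digit(int start_yy, int end_yy)
{
    /* Works while both years fall in the same century: 99 - 95 = 4.
       Breaks at the rollover: 00 - 99 = -99 instead of 1. */
    return end_yy - start_yy;
}

/* The same calculation with full four-digit years has no rollover problem. */
int elapsed_years_4digit(int start_yyyy, int end_yyyy)
{
    return end_yyyy - start_yyyy;
}
```

Whether a wrong result like this matters depends entirely on what the surrounding code does with it, which is the point being argued in this thread.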

What "evidence" do you mean? This looks like a cut and paste of the old IEE website list of alleged embedded failures, which has been discussed over and over. In general, there is no independent verification (sound familiar?) of any of this so-called evidence, and the alleged consequences are way overexaggerated in many cases. There are and will be problems with embedded systems, just not to the extent as was originally thought.

-- There Is (noone@home.now), October 28, 1999.

Looks like a rework of the IEEE link.

Eminently fixable, virtually all cases were upgrades or replaces, and practically all I bet have been fixed.



-- shuggy (shimei123@yahoo.co.uk), October 28, 1999.


Some of the consequences appear to be serious. Then it is safe to say that the consequences listed on the link are lies because they "wildly exaggerate the scope of the embedded problem."

If I didn't fix my software to understand 2000, then the consequences would be serious. A shutdown, plain and simple. This was my assessment and it wasn't "wildly exaggerated." It was a fact. They would be down right now as we speak if I didn't make the repairs. But because it sounds like the "doomer language" we choose not to use this language. I refuse to be dishonest. I tell things the way they are.

But thanks for your opinion, I guess I still don't understand what a shutdown is.

-- Larry (cobol.programmer@usa.net), October 28, 1999.

Youknowwho, I don't think the scenario of every Embed failing is at issue here. Nobody that I know of assumes total failure as even a remote possibility. The question is of odds. The best figures I have gleaned indicate a realistic number of Embedded Systems worldwide at roughly 65 billion. If only 5 percent of these systems are date critical, that leaves a staggering sum of 3.25 billion systems left to be found and remediated. Many of these systems are in far-flung places such as wellheads, offshore platforms, the tops of frac towers and other remote, inaccessible places. I certainly wouldn't take those odds to Monte Carlo (or Vegas).

-- Wild Celt (oddsare@againstus.com), October 28, 1999.
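
The arithmetic in the post above is easy to check, and easy to rerun with a different percentage. This is only a sketch: the function name is hypothetical, and the 65 billion total and 5 percent share are the poster's figures, not verified ones.

```c
/* Hypothetical helper: number of date-critical systems, given a total count
   and a whole-number percentage. Dividing before multiplying keeps the
   intermediate value small; totals that are not a multiple of 100 are
   rounded down. */
unsigned long long date_critical_systems(unsigned long long total,
                                         unsigned percent)
{
    return total / 100ULL * percent;
}
```

With the poster's inputs, date_critical_systems(65000000000ULL, 5) gives 3,250,000,000, which matches the 3.25 billion quoted. At 1 percent the same total gives 650 million.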

Thereis and Shuggy --

I will assume your answer is that they are lies (to sum it up). You're also saying that even if there were date functions, they're all fixed now or enough of them so our day to day living isn't impacted in any way.

-- Larry (cobol.programmer@usa.net), October 28, 1999.


Beats me. I'm a Sr. Programmer/Analyst too. Been doin this stuff since 1975. Started courses in 73 actually. Remember talking to cohorts back in 83 about Year 2000 and saying I didn't want to be in the business when it hit, and that was then! Well, I'm still here. Worked on IBM mainframe MVS TSO, OS JCL CICS VSAM DB2 COBOL, AS 400 CL COBOL 400, DEC PDP8 Dibol, PDP 11/34 - 11/44 Basic+ RSTS/E, VAX VMS DCL Vax-Cobol RMS, Windows NT 4.00sp4 Visual Studio C++ 6.0 multi threaded Oracle Client/Server with embedded SQL and PRO*C all DOS stuff and many others too numerous to mention. Worked for Banks, Telephony, Insurance and Now the Federal Gubbmint. I'm Scared shitless and I know where you're comin from bro. Function point analysis didn't work in software engineering for estimates of labor hours and sure won't work in this convoluted situation. That Hoff guy is running a bluff. I did the same thing once, the management at one company I worked for was pullin a CYA job, i.e., WE NEED MAN HOUR ESTIMATES FOR ALL OUTSTANDING PHASES OF PROJECT X NOW!! Figuring I couldn't produce one, so the SR Analyst is off the hook by indirection and I'm holdin the bag. No way we could produce anything good. So, I used the COCOMO model for cost estimation tied to FP/LOC and a multi-variable formula that produced something like 10 man years at some astronomical cost. Nobody wanted to hear the answer but they couldn't disprove it and the numbers weren't palatable so they just dropped the subject. It was disinformation and it works well. Made em shut up. Those figures can be skewed depending on your sampling and without independent verification are worthless. The basic answer here is the Pollies don't get it, they haven't got the years of hard knocks in the field. And to you pollies out there, yes I know the Cocomo is based on LOC, but don't argue unless you understand simultaneous solving of equations and associative properties of algebra. In my opinion we'll get thru but just barely.
I think I'll go join shakey in a bunker at forty feet. The doomers are the programmers and technicians for christ sake. Wake up, we're in it for real!!!

-- Slammer (BillSlammer@Yahoo.Com), October 28, 1999.

Where is just another engineer? He could shed some serious light. Or Larry, see his previous threads....

-- preparing (preparing@home.com), October 28, 1999.

MTBF is what is missing and there is no way to obtain it currently.

When you add redundancy you add failure.

-- snooze button (alarmclock_2000@yahoo.com), October 28, 1999.

Relax Slammer. I'm a computer engineer and a Debunker not a Doomer. Stating that Doomers are all technical is about the most laughable statement that anyone has said about Y2k. But thanks for the laugh.

Somehow an old mainframer stating that he is scared sh*tless about Y2k doesn't surprise me. I guess I've seen Hamasaki going through panic attacks too much. Doesn't give me too much respect for the mainframe profession.

-- You Knowwho (debunk@doomeridiots.com), October 28, 1999.

Slammer -- It seems you're right. The only ones who really seem to get it are the programmers (in most cases, there are exceptions). Yet we're being told that we're off our rockers. Funny isn't it? This world runs the way it does because of people like you and me. In short, to put it mildly, millions are living because of people like us. This is the thanks we get for warning people. I'm made fun of down here at work. I often think, how ungrateful people are to us and in some cases, we're blamed for this problem.

-- Larry (cobol.programmer@usa.net), October 28, 1999.


Have you ever contributed anything to this forum, other than your superior self-satisfied snot-flinging at people who are trying to address the problem? If you think they are overstating the problem, why not USE FACTS TO ALLAY THEIR CONCERNS instead of dropping trou' and pissing on them like a two year old? You are an embarrassment.

And what's in it for you? Your sole purpose seems to be to downplay concern using haughty dismissal and insult, devoid of facts. Does it just boost your ego, or are you getting in on the PR cash bonanza?


-- Liberty (liberty@theready.now), October 28, 1999.

"I don't think anyone is saying that embedded systems cannot have Y2k failures. Just like any system that misuses a date construct embedded systems can misuse date constructs too. I work with embedded systems and I have always maintained that the Doomers wildly exaggerate the scope of the embedded problem. They assume that if it is an embedded system then it must have a clock which cares about the year and somewhere embedded software is using the year incorrectly. This is totally untrue and I guess only engineers with embedded experience can understand this fact." _________________________

Well as an engineer who works with embedded systems I must take issue with this.

It's a matter of complexity.

The odds of a microprocessor failing in your microwave are next to nil.

The odds of a complex system failing that has thousands of embedded chips in it, and also interfaces with outside systems which may be date sensitive for some reason completely unrelated to the actual embedded chips, are exponentially greater.

Most embedded chips in and of themselves do not care in the slightest what the date is (for the moment I will ignore the possibility that the basic programming in the chip before it gets to the end user might have some long forgotten problems with dates. I have seen lots of opinion and papers speculating on this but am not expert enough to decide if this is a real or imagined issue; if it is real then the point is moot, there would be massive failures then).

Where the problem lies is in large complex embedded SYSTEMS. The larger and more complex the system the greater the chance that there will be a Y2K issue.
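
The effect of system size on the odds can be put in rough numbers. This is a sketch under a loud assumption that the post itself does not make: every component fails independently with the same small probability p, and the system has a problem if any one of its n components does. Real systems are not that tidy, and the function name is hypothetical.

```c
/* Probability that at least one of n components fails, assuming each fails
   independently with probability p (an idealized model, not measured data).
   Computes 1 - (1 - p)^n with a plain loop so no math library is needed. */
double prob_any_failure(double p, unsigned n)
{
    double all_ok = 1.0;            /* probability that no component fails */
    for (unsigned i = 0; i < n; i++)
        all_ok *= (1.0 - p);
    return 1.0 - all_ok;
}
```

Under this model a single part with a 1-in-1000 chance of a date problem is a non-issue, while a system built from 1000 such parts has a problem roughly 63 percent of the time.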

In other words the systems most likely to have problems are the most critical and complex ones and the most difficult to troubleshoot and fix.

These systems represent a very small percent of embedded chips as a whole but a large number of complex systems.

It also represents many systems that are dangerous or impossible to operate manually. Want to go crawling around in a toxic area that is cleaned and maintained by robots if a system fails?

Complex systems would include: oil refineries, oil rigs, food processing plants, chemical processing plants, military hardware, communications networks, assembly lines of all types, and so on.

Only a small percentage of the total would have to fail to cause chaos.

If you really are an engineer then you will certainly agree that replacing or repairing any failed chips in a complex system is going to go way beyond any 3 or 4 day timeframe. Weeks or months is more likely, and that's assuming that they are still available. Many manufacturers are no longer in business and most are in Asia.

The amazing thing about complex electronics is that they work at all. People have gotten so used to EVERYTHING always working that they have no clue as to how much effort and time goes into making them work and how many products fail to get to market because they can't be made to work outside of a lab. Gartner Group (who I consider to be quite polly) estimates that as many as 35% of all complex embedded systems will have Y2K issues ALL AT THE SAME TIME.

It does not need to be a catastrophic failure to shut down a production line, something simple will do. The part of the system that measures salt in crackers for instance. Want a cracker with half salt in it?

Many large complex systems are not designed to be shut down except during very costly and time consuming scheduled maintenance. They run 24 hours a day 7 days a week. Many of these systems can take days to get back into production once forced to shut down.

Not to mention that if a system fails the fix is likely to have the exact same problem and shut down again (or worse yet cause a different failure elsewhere in the system that you then have to find to attempt to fix).

So please, don't pull the "I'm an engineer and I know more than you do" crap.

-- John Beck (eurisko111@aol.com), October 28, 1999.

Slammer I love it when you talk dirty!

-- pauline jansen (paulinej@angliss.vic.edu.au), October 28, 1999.

John...you get it. And beyond that you can articulate the problem. Thanks.

-- Mark (Ezmonydm@aol.com), October 28, 1999.


I don't recall anyone stating that embedded systems were not included in any company's remediation effort. Processes and controls which are critical to a company's work efforts are not being ignored. However, it doesn't take a genius to determine whether a programmed date is part of an embedded system's function. Systems that I have worked on have a Dallas RTC. We did provide the ability to set and display the RTC but we did not display the year in 4 digits. We had requirements from customers back in 1997 to change the display to 4 digits and test for all the Y2k rollover scenarios. This date had nothing to do with the functioning of the system, but we made it compliant anyway. If we had done nothing, there would have been no dire consequences.
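
A two-digit RTC year can be turned into a four-digit display value with a pivot window, roughly as sketched below. The post does not say how its firmware did this. The sketch assumes the clock chip exposes the year as one packed-BCD byte holding 00 to 99, and the pivot value of 70 and both function names are hypothetical.

```c
#include <stdint.h>

/* Convert one packed-BCD byte (for example 0x99) to its value (99). */
unsigned bcd_to_dec(uint8_t bcd)
{
    return (unsigned)((bcd >> 4) * 10 + (bcd & 0x0F));
}

/* Expand a two-digit year with a pivot window: values below the pivot are
   read as 20yy, values at or above it as 19yy. The pivot is a design choice
   and must suit the product's lifetime. */
unsigned expand_year(unsigned yy, unsigned pivot)
{
    return (yy < pivot ? 2000u : 1900u) + yy;
}
```

With a pivot of 70, a register value of 0x99 displays as 1999 and 0x00 displays as 2000. A window only moves the ambiguity to a later year, which is acceptable for a display but needs more care if the value feeds a calculation.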

I understand that there are systems which may perform date calculations. But I also understand that I am not the only engineer who understands this and can analyze which systems require remediation and which ones don't.

But I won't propagate the myth that there are millions of "hidden" embedded "chips" which will all explode at midnight on January 1st.

-- You Knowwho (debunk@doomeridiots.com), October 28, 1999.

YouKnowWho = Poole. But he doesn't dare post under his own name.

-- BigDog (BigDog@duffer.com), October 28, 1999.


You are assuming that the people making the decisions are engineers and they are choosing to replace potentially problematic systems instead of the much more economical option of fix on failure.

As far as my clients go the "pointy haired bosses" are far too concerned with their stock options and losing them to be spending money on repairs that the odds tell them are unlikely to fail. They are almost as a rule doing FOF.

The ones who are right will be heroes and get promoted.

The ones who are wrong will be looking for a new job.

The vast majority of systems will not have critical failures, but many will and it could seriously affect many companies.

-- John Beck (eurisko111@aol.com), October 28, 1999.

To set the record straight, I am not Poole.

I'm sure Diane can tell you that.

-- You Knowwho (debunk@doomeridiots.com), October 28, 1999.

Youknow who? Ok, so you're really involved with embedded systems. Why don't you address some of the obvious problems that have been stated. All you offer is how some easily fixable systems "might" have been fixed.

I'm not experienced in this field, but it seems from what I've read that many "boards" are put together using a certain type of chip, or chip run, and integrated into a "system" to perform a dedicated function. That these "groupings" of chips, although never intended to care about the date or time, might put out completely wrong output when the date reaches 2000. (I would take wrong output to also mean shutdown / not perform the intended task correctly.)

As Just said, some of these date capabilities were put on these chips just because they could. Who ever tested them (as a system) with the Rollover in mind back then? And since you can't conceive of why a device like that would care about the time/date, you or someone like you would never think to test many systems that may in fact have a problem.

As to not buying that "millions of systems" will fail at the Rollover, what date would you choose? April 1st, 2000????

-- Gregg (g.abbott@starting-point.com), October 28, 1999.

Yeeeehaaaa! Look at all the Polly hand-waving in this thread. It's enough to make ya dizzy.

There are and will be problems with embedded systems, just not to the extent as was originally thought.

Is there supposed to be some meaning attached to that assertion? Is it like saying "It's not as bad as we thought, ma'am. Your husband isn't going to be quadriplegic. Just paraplegic"?


-- Lane Core Jr. (elcore@sgi.net), October 28, 1999.


What are you trying to say here? To the extent that our former ignorance is being rectified, it's due to extensive test results NOT finding problems. Of all those "billions" of embeddeds, the list of actual problems discovered is something you can read in 15 minutes. The list of problems likely to have serious impacts you can read in 2 minutes. The actual number of serious-impact devices appears to be small and the remediation manageable, based on what we've found so far, but our search isn't complete nor is our knowledge.

So this is like saying "ma'am, we've done a cursory examination of your husband and find nothing serious yet. We're still looking."

On the other hand, Lane, if you *do* have some hard information rather than just trying to mock "the enemy", where is it? Otherwise, you're in the uncomfortable position of saying "OK, everything we've looked at has been pretty good, but we haven't looked at everything yet, so things are sure to be awful and anyone who has the sense to recognize a clear trend is just waving their hands."

You surely know better.

-- Flint (flintc@mindspring.com), October 28, 1999.

pauline: Do you like to mudwrestle?

-- King of Spain (madrid@aol.cum), October 28, 1999.

Larry -- Good post.

Beck -- Bingo

Beck -- Bingo again (Dilbert is actually a documentary.)

YouKnowWho -- Come off of the 'I'm an engineer and I know better' attitude. I'm one too, and I sure don't. I did embedded systems for a number of years, until the distributed networking applications started paying better, and I don't know what will go wrong. Shoot, Beck brought up the point that there could be chips out there with corrupted microcode and I hadn't even *THOUGHT* of that!

Consider the conversation that Flint and I had on an earlier thread on this topic. The one about the hierarchy. My point there, and actually, I think he had it too, but I misread his answer, was that even stuff down at the bottom of the hierarchy can cause a shutdown. In particular, anything mandated by EPA or OSHA is going to do this.

And the bigger point I apparently failed to make is that *NOONE*, let's see, can that be further emphasized?, has an overview of this whole picture. The Engineer is over there with his two or three or four pieces of the puzzle and they all fit together and look okay, and he can't see the whole picture, or even what it will look like, but his stuff is okay, and you're over here with your three or four pieces, and they all fit together and look okay, and Flint is in the corner with his three or four pieces, and some are okay and some don't fit with anything else, and Beck is next door with his pieces, and I've got 5 or 6 that don't any of them fit with anything else, and the problem is it is a 65 billion piece puzzle, so we can't even get a hint from our pieces what the finished thing will look like. I realize that analogy is suspect, but this one is pretty close, if you ask me.

-- just another (another@engineer.com), October 28, 1999.

"Beck -- Bingo again (Dilbert is actually a documentary.)"

My professional life IS a Dilbert cartoon. I play Dogbert the consultant.

Most people I know who are not in the engineering field hold the profession in awe. The reality is something VERY different.

The majority of engineers these days are not qualified to sort buttons much less design products.

Most of the creative work is done by a very small minority.

But it pays well.

-- John Beck (eurisko111@aol.com), October 29, 1999.


Very astute observation regarding engineers of today. I assume you're referring to the younger ones. My oil industry boys pretty much think this is true in regards to refining processes. The designers are one thing but some of the "engineers" who supervise the daily operations are an entirely different matter. The stories I've heard in coffee shops by plant operators and other engineers would curl your hair concerning stupid engineers who nearly blow the whole d--n refinery apart, save for some astute operator/foreman who catches it in time to prevent disaster.

-- R.C. (racambab@mailcity.com), October 29, 1999.

You Knowwho,

"and I have always maintained that the Doomers wildly exaggerate the scope of the embedded problem."

OH, reeaaalllyyyy????? How about the oil industry? The scope of the embedded problems in the oil industry is under-exaggerated, pardner. I've posted extensively on this in various reports to this forum. I've yet to find anyone who can refute anything that my various sources from around the nation in the oil biz have told me. Embeddeds in other industries may be one thing, but in the oil/gas/petrochemical biz, it's quite another. Considering the scope of the problem, the fact that the oil industry may have only identified and tested 5% of the inventories IS alarming, to say the least. Type-testing is not acceptable for risk-management reduction. Unfortunately this is all the oil industry could do. SCADA systems are a nightmare in the oil fields, pipelines and refineries. Inability to access is another. I've got guys right now who've told me that they have non-compliant systems that they know of and can NOT replace because there are no parts, or because remediation requires a shutdown that would become permanent, especially on oil wells where, on older systems, once operations stop, you have to re-drill a different hole because the pressure is gone and the oil falls back to an inaccessible level.

Your statements on embeddeds tell me YOU KNOW NOTHING about the oil situation... and it's shaping up that oil is the most critical of the industries right now. Serious problems indeed. Will the oil industry go completely down? It's certainly a distinct possibility and you get better odds at Vegas than the oil industry is getting right now. The odds are probably 50/50 for at least a 1 to 3 week downtime and maybe much worse. Let's not forget foreign oil here, which supplies 56% of our domestic consumption, and the CIA says these guys are in critical trouble. Remember the 1973-74 oil embargo crisis only saw a 2.2% drop in supplies... By CIA estimates it looks like we'll easily exceed that and I think it's 50/50 for a 10% or greater supply loss for both foreign and domestic sources. It could be as much as 60 to 80% loss of supply!!! And I'm just speaking of crude itself, not the refining aspects, or the pipelines, i.e. I'm talking about at the well itself. I'd say it's about 90%-95% certain we'll at least see a 2.2%+ drop in supplies come January for probably a minimum of a month and more likely a lot longer. The ramifications of even these minimum amounts will be severe on our economy and that of the whole world. So, think about that the next time you open your keyboard to debunk the doomers. It won't be a 10 but a 5 (even a 3 or 4) will have very serious ramifications on our economy at least in the short term.

-- R.C. (racambab@mailcity.com), October 29, 1999.

No One, obviously you've not read the CIA report or the IMM report... and apparently you must have missed the IMF and the BIS statements regarding the world situation.

The BIS is warning of international banking clearance problems in the system... cascading failures and cross defaults are now considered to be very, very real possibilities according to the BIS.

Then there is the CIA report on the world scene... Have you bothered to read the assessment of the nations that supply our foreign oil??? Check this out: http://www.iea.org/ieay2k/newlinks/imports.htm

The IMF has also issued veiled warnings especially regarding SE Asia, Latin Am, Africa and Russia.

I believe it was the IMM that had also issued some serious concerns if not alarms over the oil situation especially in the OPEC and Latin American suppliers.

-- R.C. (racambab@mailcity.com), October 29, 1999.

R.C. Thanks for the reality check. But for those of you who still trust in the 1-10 scale, I warn you: anything more than a 5 will rapidly escalate to a 9 or 10 because the credit/debt balloon inside which we live will implode ferociously. Too many fail to realize this. When you have a house of cards, you don't need an earthquake to bring it down, only the pitter-patter of little feet. There's 700 trillion in debt out there in the world today; the US stock markets are overvalued probably 400%. Take a look at figures on household installment debt, government-guaranteed loan programs, underfunded pension debt obligations, Social Security 'trust' obligations, the $80 trillion in derivatives globally, etc. A frugal, healthy, balanced, financially secure society might weather Y2k with mere belt-tightening. The world today can't. What's the USA going to do if things are bad? Deficit spend? We're already 6 trillion in debt!!!

-- Stantheman (Heidrich@presys.com), October 29, 1999.

Flint, to the extent that your ignorance is being rectified formally, I say you are the epitome of ineffectualness.

-- Gregg (g.abbott@starting-point.com), October 29, 1999.

"Doomers wildly exaggerate the scope of the embedded problem. They assume that if it is an embedded system then it must have a clock which cares about the year and somewhere embedded software is using the year incorrectly."

Say what? Where in hell did you get this idea? Listen genius, it's just like mainframe or PC programs. Some, many in fact, don't do anything with date or time. So what? Many do. Many that do critical things do.

Go blow your smoke somewhere else. You're avoiding the point. Just another distraction.

Tick... Tock... <:00=

PS - I'll read the rest of this thread later. I just wanted to make a point with "You Don't Know Shit" before I go to work. <:)=

-- Sysman (y2kboard@yahoo.com), October 29, 1999.

You KnowWho,

Hey asshole, I resent that remark about being an old Mainframer. It's true I've done that route but if you really read my post you would've noted that I worked with mid range and the most modern distributed systems using multi threaded intranet stuff. In my opinion if all we had were mainframes we would be better off. If anything networking and distributed systems amplify the problem, Mr. I'm not surprised about mainframers panicking. So why don't you relax some more yourself.

I'll tell you what.. I think you're a lightweight. And I'll prove it. See below a chunk of code that has a Y2k caused Problem. See if you can point it out for us all. This is only a small tiny chunk of one app that has over 50,000 lines of code. Maybe Hoff's function point analysis could solve it.

/* PIPE SERVICE FUNC's GROUP #1: This is a set of WIN32 functions that  */
/* have been developed to service the pipe communicating with the MFDC  */
/* Acquisition module, and the pipe or pipes that service the Oracle    */
/* Ring processing engine. Ack NaK and multi ring processor logic is    */
/* controlled thru an intermediary global structure for buffering rings */

void MFDCPipeServiceThread(){
    DWORD PipeMessageLength=0,Rcount=0L;
    char PipeXBuff[PIPE_LEN], StaleRing[PIPE_LEN],Caption[80];
    struct _PROCESS_INFORMATION ProcessInformation;
    struct _PROCESS_INFORMATION *lpProcessInformation=&ProcessInformation;
    STARTUPINFO si;
    STARTUPINFO *psi=&si;
    int OraInstanceIndex;

    sprintf(&Caption[0],"PRMS->MFDC Ring Acquisition Process");
    si.cb=sizeof(STARTUPINFO);
    si.lpReserved=NULL;
    si.lpDesktop=NULL;
    si.lpTitle=&Caption[0];
    si.cbReserved2=0;
    si.lpReserved2=NULL;
    si.dwFlags= STARTF_FORCEONFEEDBACK || STARTF_USESTDHANDLES;
    si.wShowWindow=SW_MAXIMIZE;

    /* Step #1: Setup a Named Pipe for InterProcess Communications with the */
    /* MFDCAQC module we are about to initiate, then start the process..    */
    AcqPipeHandle=CreateNamedPipe(
        MFDCACQPIPENAME,                // address of pipe name
        PIPE_ACCESS_DUPLEX,             // pipe open mode
        PIPE_TYPE_MESSAGE| PIPE_WAIT,   // pipe-specific modes
        1,                              // maximum number of instances
        PIPE_LEN,                       // output buffer size, in bytes
        PIPE_LEN,                       // input buffer size, in bytes
        0,                              // time-out time, in milliseconds
        (LPSECURITY_ATTRIBUTES) NULL ); // address of security attributes structure

    if (AcqPipeHandle==INVALID_HANDLE_VALUE) {
        CloseHandle(AcqPipeHandle);
        MFDCthreadID=0L;
        return;
    }

    AcqMutexHandle=CreateMutex(0,FALSE,"MFDCACQMUTEX");
    if (!AcqMutexHandle){
        CloseHandle(AcqPipeHandle);
        MFDCthreadID=0L;
        return;
    } // Setup A Mutex for the ReAnimator Module

    if (!CreateProcess(
            NULL,                       // Application Path+Name
            MFDCACQUISITIONPROCESS,     // Command Line
            NULL,                       // Process Attrib.
            NULL,                       // Thread Attrib.
            FALSE,                      // Inherit Handles
            CREATE_NEW_CONSOLE| NORMAL_PRIORITY_CLASS, // Indicates a New console..
            NULL,                       // Use My ENV Var's
            NULL,                       // Process Working Dir
            psi,                        // Startup Info Structure
            lpProcessInformation)       // Process_Info Structure
       ){
        CloseHandle(AcqPipeHandle);
        ReleaseMutex(AcqMutexHandle);
        CloseHandle(AcqMutexHandle);
        MFDCthreadID=0L;
        return;
    }


    if (!ConnectNamedPipe(AcqPipeHandle,(LPOVERLAPPED) NULL))
        if (GetLastError()!=ERROR_PIPE_CONNECTED){
            TerminateProcess(MFDCProcessHandle,0);
            CloseHandle(AcqPipeHandle);
            ReleaseMutex(AcqMutexHandle);
            CloseHandle(AcqMutexHandle);
            MFDCProcessHandle=0L;
            MFDCthreadID=0L;
            return;
        }

    /* Setup a thread that will monitor the health of the Process we just */
    /* Created using a WaitOnObject() for the Mutex we created. The mutex */
    /* will Be opened by the Acquisition Process spawned and be closed .. */
    /* when the process terminates, at which time our montior thread will */
    /* fall thru and release and close the mutex, stop this thread and Re */
    /* start the thread. All of this is naturally providing AutoStart Set */
    if (RegistryBuff.AcquisitionAutoStart=='Y') {

        MFDC_CPRthreadHandle=CreateThread(0,0,
            (LPTHREAD_START_ROUTINE) MFDC_CPRThread,
            0,0,&MFDC_CPRthreadID);

    }
    /* Must be Done After ConnectNamedPipe above to assure the ACQuisition   */
    /* Was Up and running to begin with before we start monitoring the Mutex */

    /* Step #2: We will Loop infinitely within this thread after creating the */
    /* Pipe and Acquisition process, interacting with the acquisition process */
    /* to acquire the rings, until such time as the main process copies STOP  */
    /* into memory at the Address of the command string passed to this thread */
    while (strcmp(MFDCCommandString,"STOP")) {

        /* First thing, check for any rings left laying around on Failed */
        /* ORAPipeServiceThreads and attempt a recovery..                */
        memset(&StaleRing[0],'\0',sizeof(StaleRing));

        for(OraInstanceIndex=0; OraInstanceIndex
            if (!ORAthreadID[OraInstanceIndex]&&
                RingXbuff[OraInstanceIndex].Ring[0]=='Q' &&
                RingXbuff[OraInstanceIndex].RingAck!='X'){

                memcpy(&StaleRing[0],
                       &RingXbuff[OraInstanceIndex].Ring[0],
                       strlen(&RingXbuff[OraInstanceIndex].Ring[0]));
                memset(&RingXbuff[OraInstanceIndex].Ring[0],'\0',PIPE_LEN);
                break;
            }

        } // End FOR check All OraInstances for Rings Whose Process has died
          // and do a pushdown on the RingProcessor Buffer Stack to next Active


        if (!strcmp(MFDCCommandString,"PING"))
            strcpy(&PipeXBuff[0],"PING");
        else
            strcpy(&PipeXBuff[0],"ACK");
        PipeMessageLength=strlen(&PipeXBuff[0]);

        if (!WriteFile(AcqPipeHandle,PipeXBuff,strlen(PipeXBuff)+1,
                       &PipeMessageLength,(LPOVERLAPPED) NULL) ){
            CloseHandle(AcqPipeHandle);
            ReleaseMutex(AcqMutexHandle);
            CloseHandle(AcqMutexHandle);
            MFDCProcessHandle=0L;
            MFDCthreadID=0L;
            return;
        }

        if (StaleRing[0]!='\0'){
            memcpy(&PipeXBuff[0],&StaleRing[0],strlen(&StaleRing[0]));
            PipeMessageLength=strlen(&StaleRing[0]);
        }
        else {
            ReadFile( AcqPipeHandle, &PipeXBuff[0], PIPE_LEN,
                      &PipeMessageLength, (LPOVERLAPPED) NULL);

            PipeXBuff[PipeMessageLength]='\0';
        }
        /* At this point we get data either by finding and reprocessing a  */
        /* StaleRing or on a new Pipe Feed from the MFDC aquisition Module */

        if (!strcmp(PipeXBuff,"DROPPING")) {
            memset(MFDCCommandString,'\0',10);
            break;
        }
        // If MFDCACQ.EXE indicates to us its dropping, exit the thread
        // by falling out of the Loop. This will Allow restart.
        if (PipeXBuff[0]=='?') continue;
        // Go to top of loop and ACK this pulse
        // The Acquisition Program doesn't have
        // input but wants a periodic handshake

        CheckPointBufferToDisk();
        // At this point have valid ring. Checkpoint any unprocessed rings
        // to Disk Buffer at first opportunity which would exclude this one.
        // However, if we never get back to Write on Pipe to request another
        // ring, .cnt file would not have been updated on Acquisition side,
        // so this latest ring will be restart point @.TOE or 3B2 again. Also
        // OraPipe side checkpoints to disk when processed. So we are tight.

        for (;;) {

            if (!strcmp(MFDCCommandString,"STOP")) break;
            if (!memcmp(&MFDCCommandString[0],"PING",4) &&!StaleRing[0]) {
                strcpy(MFDCPingResponse," Interface Up..");
                memset(MFDCCommandString,'\0',10);
                if (!memcmp(&PipeXBuff[0],"PONG",4)) break;
            }
            // If our thread exists when PING'ed we are up because
            // our pipe is intact.. Otherwise we would have died.
            // Clear MFDCCmdStrng when processed here because thread is
            // multi tasked. So only it knows when he's done.. PONG is
            // MFDCACQ return value, so don't pass to Ring Processor.

            for(OraInstanceIndex=0; OraInstanceIndex
                if (ORAthreadID[OraInstanceIndex]&&
                    RingXbuff[OraInstanceIndex].RingAck=='X'){
                    // Use Previously ACKed array slot

                    memset(&WorkingRing,'\0',sizeof(struct RingBuff));
                    memcpy(&WorkingRing.Ring,&PipeXBuff[0],PipeMessageLength);
                    WorkingRing.Ring[PipeMessageLength]='\0';
                    WorkingRing.RingNum=++Rcount;
                    // Build Working Ring image first then load
                    // Buffer in one stmt to prevent other thread
                    // from Pulling Ring Prematurely

                    memcpy(&RingXbuff[OraInstanceIndex],&WorkingRing,sizeof(struct RingBuff));
                    memset(&PipeXBuff[0],'\0',PIPE_LEN);
                    break; // Exit For To pause for Acked Ring
                } // End IF found Acked Ring

                // Loop internally While Waiting for a Ring to Be ACKED By
                // RingProcPipe Thread, or STOP Issued by USER via GUI.
                // In Multiple Ora Thread situation use First ACK..
            } // END internal loop thru all instances

            Sleep(10); /* Book recommends Sleep to free CPU cycles */
            if (PipeXBuff[0]=='\0') break;

        } // END LOOP till Ring Processed.. by ORARINGPROC side.

    } // End While !STOP

  /* Step #3: Once we've fallen out of the above loop due to a STOP command */
  /* we will pass the stop request thru the pipe to the acquisition module */
  /* to cause a shutdown, close the pipe and terminate the thread instance. */
  memcpy(&PipeXBuff[0],"STOP",4);
  PipeMessageLength=4;
  memset(MFDCCommandString,'\0',10); // Must Clear Cmmds when processed here
                                     // because thread is multi tasked. So
                                     // Only he knows when he's done..

  WriteFile(AcqPipeHandle,PipeXBuff,strlen(PipeXBuff)+1,
            &PipeMessageLength,(LPOVERLAPPED) NULL);

  ReleaseMutex(AcqMutexHandle);
  CloseHandle(AcqMutexHandle);
  CloseHandle(AcqPipeHandle);
  MFDCProcessHandle=0L;
  MFDCthreadID=0L;
  return;
}


/* The Following procedure will sleep a designated time first and */
/* subtract its interval from the Thread Instance time remaining till */
/* the MFDC Acquisition Process Pipe Service Thread TimeOut occurs */
/* At that time an event triggering a shutdown and restart will occur. */
/* Consequently the Timed thread is responsible for continuing to */
/* increment its time remaining at critical junctures or suffer a time out. */

void MFDC_CPRThread(void){
  DWORD ExitStatus=0;

  WaitForSingleObject(AcqMutexHandle,INFINITE);
  // Since the Connection on Named Pipe from
  // MFDCPipeServiceThread <-> MFDCACQ.EXE is
  // established after the OpenMutex within
  // MFDCACQ, and this thread is kicked off
  // After ConnectNamed Pipe, we are sure the
  // Wait will function immediately here..

  ReleaseMutex(AcqMutexHandle); /* Since Acq. Process is Dead in order */
  CloseHandle(AcqMutexHandle);  /* to Get Here, Clear Mutex before Re */
                                /* Starting the MFDCPipeService for next*/
                                /* Goround in case PipeService thread */
                                /* Doesn't Close this guy.. */

  /* May have been turned off while I was waiting ... */
  if (RegistryBuff.AcquisitionAutoStart=='Y') {
    if (MFDCthreadID){
      strcpy(MFDCCommandString,"STOP");
      Sleep(100);

      if (MFDCProcessHandle){
        if (GetExitCodeProcess(MFDCProcessHandle,&ExitStatus)){
          if (ExitStatus==STILL_ACTIVE)
            TerminateProcess(MFDCProcessHandle,0);
        } /* Harsh Termination if Still Active at Time expiration */
      } /* END if MFDCProcessHandle indicates still in memory */

      if (MFDCthreadID&&MFDCthreadHandle){
        TerminateThread(MFDCthreadHandle,0);
        MFDCthreadHandle=0L;
        MFDCthreadID=0L;
      }
      /* Use Unpreferred method for thread termination */
      /* if still active but Acquisition dead.The thread*/
      /* will clear properly if it senses broken pipes */
      /* however, since we create the pipes if the Acq*/
      /* Process is terminated by say a Ctrl-C we would*/
      /* Hang on read until timeout thinking Acq may be */
      /* Processing and not respond to "STOP" So Kill it*/

    } /* If old Thread Still Active, Stop it First */

    Sleep(4000); // Delay for old threads and process to clear..

    MFDC_CPRthreadID=0L;
    MFDCthreadHandle=CreateThread(0,0,
                       (LPTHREAD_START_ROUTINE) MFDCPipeServiceThread,
                       0,0,&MFDCthreadID);
    /* Apply CPR to the thread (Bring it Back to Life) */

  } /* End if AutStart Still set to 'Y' */
  else MFDC_CPRthreadID=0L;
  return;
}
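
The comment block above MFDC_CPRThread describes a countdown: a monitor sleeps for an interval, subtracts it from the watched thread's time remaining, and forces a restart at zero unless the thread keeps topping its time back up. The posted function does not show that bookkeeping, so what follows is only a minimal sketch of the idea in portable C. Every name in it (Watchdog, watchdog_init, watchdog_checkin, watchdog_tick) is hypothetical and not taken from the production code, and the sleep itself is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; none of these appear in the posted code. */
typedef struct {
    long remaining_ms;  /* time left before the watched thread is declared dead */
    long budget_ms;     /* value restored each time the watched thread checks in */
} Watchdog;

/* Start the countdown with a full budget. */
void watchdog_init(Watchdog *w, long budget_ms) {
    assert(budget_ms > 0);
    w->budget_ms = budget_ms;
    w->remaining_ms = budget_ms;
}

/* Called by the watched thread at its critical junctures to show it is alive. */
void watchdog_checkin(Watchdog *w) {
    w->remaining_ms = w->budget_ms;
}

/* Called by the monitor after it has slept for interval_ms.
   Returns true when the budget is used up and a restart is due. */
bool watchdog_tick(Watchdog *w, long interval_ms) {
    assert(interval_ms > 0);
    w->remaining_ms -= interval_ms;
    return w->remaining_ms <= 0;
}
```

On NT the monitor loop would presumably call Sleep(interval_ms) before each watchdog_tick, and the pipe service thread would call watchdog_checkin after each completed read or write. A real version would also have to protect remaining_ms against concurrent access from the two threads.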


-- Slammer (BillSlammer@Yahoo.Com), October 29, 1999.

Slammer, do you have this code in .txt format? I'm a VI guy and this formatting is confusing. Also, how are you implementing your timers in this system?

-- You Knowwho (debunk@doomeridiots.com), October 29, 1999.

Dear YouknowWho,

If you're a Unix guy and can read C, you should be able to follow the C++ Win32 API for NT, which is 95% similar if you're not using foundation classes. The timers are normal and contained within the threads created in the above code, in conjunction with mutual-exclusion objects for interprocess control and health monitoring. This post is in text mode, and you should be able to tell where the carriage returns go. Slammer. P.S. Don't call me an old mainframer again.

-- Slammer (BillSlammer@Yahoo.com), October 29, 1999.


I've never used any Microsoft libraries. We have had homegrown timer implementations and embedded RTOS. So is there an issue with NT timers?

-- You Knowwho (debunk@doomeridiots.com), October 29, 1999.

You KnowWho,

The answer is that it was not a problem with Microsoft .LIB libraries or functionality. Rather, it was the way the timers acted in conjunction with hardware that was changed for the Y2K implementation. In truth it was a somewhat baited question, and yes, it is a timing issue dealing with the synchronicity of threads timing the processes that handle data acquisition. It was truthful, though, because we did have a production failure in the Y2K-remediated system: Oracle latency in the remote server was perceived by a faster machine as being longer, causing a repeated timeout-and-rollback situation that resulted in data loss.

You see my point: if you find it confusing, don't you think an IV&V team looking at this code will also? That fragment is exactly what was given to them, in a larger context of code. They looked it over, but my experience is that they don't look past "is there any place in the code with a date-sensitive variable?" (Sorry, IV&V guys, but you all didn't get the hardware context in real-time internals.) There was no cohesiveness between the test plan and a production rollout coordinated with hardware acquisition, because while the code tests were going on TPTB were still deciding what type of hardware to use while demanding certain compliance numbers to meet mandated deadlines. They did the best they could, however, and did turn up bugs... Look, in this situation all the IV&V in the world wouldn't catch it. The stuff was fixed on failure, but these are complex issues.

I'm not a doomer and I'm sure we're going to get thru this, but I'm also sure we're going to have big problems. I'm not hysterical either. If anything I'm a debunker also, as I don't believe we are facing TEOTWAWKI. My money's still in the bank, but you better believe I checked out my bank's homepage, Y2K statements and the SEC 10-Q before I left it there. I think things are not as good as people believe, and those denialists will be hurt by this. Take for example this scenario..
A company has a payroll system, an Accounts Receivable system, an Accounts Payable system and a Budget system. They claim all systems are fixed except the Budget system, which will be installed in December. To the public eye, 3 out of 4, or 75 percent, are fixed. The contingency plan is that if Budget fails they will use the previous fiscal year's budget, so they remove it from the critical list. Voila, they are now 100% compliant. They even did volume and regression tests. Along comes the big enchilada (2000/01/01) and the Budget system fails. The problem is that since it was still undergoing remediation when the others were deemed compliant, no integration or end-to-end test was run. So it fails in production, which blows down other applications on the second business day when the files are left unprocessed.

Simplistic, yes. But more complex scenarios bring more complex problems than this, and ones less likely to be spotted, don't you think? The fix is in, but incomplete due to inadequate testing. I've already experienced this. Code walkthroughs and unit-level debugging on a project workbench are no guarantee of compliance without end-to-end and parallel testing.

All people posting here have incomplete viewpoints, myself included. The business programmers don't understand the ladder logic in the PLCs or work at the microcode level. The assembly guys don't always understand the complexities of large-scale business systems, and real-time data acquisition and DBMS for expert systems governing plant operations can be like rocket science. As programmers we are always forced to prepare for worst-case scenarios and to err on the side of accuracy. That's why the programmers tend to be more doomer than the managers or politicians, who are eternal yes men. Let's all adopt a realistic viewpoint and take some precautions, especially non-techies. A healthy skepticism is warranted here.
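
One way the timeout failure described at the start of this post can arise is sketched below. It assumes, and this is only an assumption since the production timeout code was not posted, that the wait for the remote reply was budgeted as a number of polls rather than as elapsed time. On a faster CPU each poll costs less, so the same poll count covers less wall-clock time and an unchanged Oracle latency starts to look too long. The function names and the numbers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Timeout expressed as a number of polls: the wall-clock budget is
   poll_cost_us * max_polls, so it shrinks when the CPU gets faster. */
bool times_out_by_poll_count(long reply_latency_us, long poll_cost_us,
                             long max_polls) {
    assert(poll_cost_us > 0 && max_polls > 0);
    return reply_latency_us > poll_cost_us * max_polls;
}

/* Timeout expressed as a wall-clock deadline: independent of CPU speed. */
bool times_out_by_deadline(long reply_latency_us, long deadline_us) {
    assert(deadline_us > 0);
    return reply_latency_us > deadline_us;
}
```

With a 500 ms reply and a 1000-poll budget, times_out_by_poll_count(500000, 1000, 1000) is false on a machine where a poll costs 1 ms, but times_out_by_poll_count(500000, 100, 1000) is true once a poll costs 0.1 ms. The deadline version gives the same answer on both machines, which is why a code walkthrough that only hunts for date variables would never flag the difference.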

-- Slammer (BillSlammer@Yahoo.Com), October 29, 1999.

Lucent had to replace or upgrade about one third of the control elements in their factories. That information was in one of their 10Q reports.

-- Dave (dannco@hotmail.com), October 29, 1999.


I am not saying the potential consequences in the link you provided are lies. I am saying "Were they fixed? Were there solutions?" The answer is "Yes!".

The original issue was the supposed intractability of identifying and remediating embedded systems. The link provided proves that embedded systems are not insoluble problems.

The potentially insoluble problem is human management and incompetence which has zilch to do with embedded chips and more to do with money and kudos.

I expect the utilities to stay up in the UK, they may stumble but I do not believe they will fall. That is what people are worried most about on this forum - will they have water, heat and light.

What happens in Asia and South America is secondary. I do expect them to have greater problems and I do expect an energy crisis akin to the 1970s - did the 1970s energy crisis result in wastelands and migrant hordes?

If the supply lines from foreign countries become constricted, we'll get through. We Brits have seen it all before: petrol station queues, enforced power cuts, rationing during the War, little money.

Not exactly a Sunday picnic, but great at showing the heights the human spirit can rise to.

As for doomer and polly camps - believe it or not but it is possible to take a middle position. Nothing is black and white in this Y2K speculation scenario.



-- Shuggy (shimei123@yahoo.co.uk), October 29, 1999.


The function point analysis was an attempt to quantify and compare something that's been all too obvious to anyone willing to look: that the system replacements and modifications in the past year have generated enormous amounts of system failures and errors, and even business disruptions. And they've been handled, some better than others. But no meltdown. No collapse. No TEOTWAWKI. Not even a recession.

But don't feed me this "Pollies don't get it, they haven't got the years of hard knocks in the field" crap. Yeah, I do Project Management now. But the reason I'm still working when others aren't is I also do development. Been in the trenches, systems programming. Built and maintained a system of 1800 Series/1 systems in EDL on EDX, nightly connects for updates and data feeds to MVS. Wrote 5250 datastream applications to allow TELON to execute applications on AS/400's (remember when they were called SilverLake?). Even had my hands deep in the guts of TELON's code generator, which was in essence a Macro-Assembler on MVS. Started down the MVS sysprog route, when I jumped to SAP. Right decision? Depends on how you measure, but I've got a few fairly crass ones that say YES.

"Hard Knocks in the field"? Bullshit. Been living them the last four years, in too many locations to count. Doing it now, in fact.

Oh yeah, and don't bother dumping a load of unformatted C++. Doesn't prove a thing. I could dump an ABAP/4 program here, but what would it prove?

-- Hoffmeister (hoff_meister@my-deja.com), October 29, 1999.

More than some function point analysis, it proves that code is complex enough that IV&V guys don't catch everything, and they're the ones we spent all the money on to save us.

-- Slammer (Billslammer@Yahoo.com), October 29, 1999.

Also Hoff guy, you said:

"Oh yeah, and don't bother dumping a load of unformatted C++. Doesn't prove a thing. I could dump an ABAP/4 program here, but what would it prove?"

It would prove my point: if you don't think I would understand it, what makes you think the contract guys in the code factories would? Doing code walkthroughs in esoteric languages won't spot all problems; only extensive, broad and deep testing involving unit, volume, integration and parallel tests catches it all. The failure rate has been increasing and will continue to increase as the pot continues to boil. The IV&V teams which have been thrown at this stuff in the Fortune 1000, costing billions in labor, have been doing just that: for the most part code examination with some superficial tests (not end-to-end or parallel), and that comprises the extent of the work on a lot of systems deemed COMPLIANT. So you support my point exactly. Thank you. I believe that FP analysis can be skewed way out of line by the sampling of cases and objects involved, and is only worthwhile in systems with a small number of permutations of languages, objects and artifacts. An analysis of this type will always be called into question.
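
To make the parallel-testing point concrete, here is a minimal sketch in C, not taken from any system discussed in this thread. It assumes a legacy routine that treats every two-digit year as 19xx and a remediated routine that uses a fixed window with a made-up pivot of 50; both rules and all three function names are hypothetical. The parallel run feeds the same inputs to both and counts the disagreements.

```c
#include <assert.h>

/* Legacy rule (assumed): every two-digit year belongs to the 1900s. */
int legacy_year(int yy) {
    assert(yy >= 0 && yy <= 99);
    return 1900 + yy;
}

/* Remediated rule (assumed): fixed window with a hypothetical pivot of 50. */
int windowed_year(int yy) {
    assert(yy >= 0 && yy <= 99);
    return (yy < 50 ? 2000 : 1900) + yy;
}

/* Parallel run: feed the same inputs to both routines and count
   how many of them the two routines disagree on. */
int count_mismatches(const int *yy, int n) {
    int mismatches = 0;
    for (int i = 0; i < n; i++) {
        if (legacy_year(yy[i]) != windowed_year(yy[i])) mismatches++;
    }
    return mismatches;
}
```

Run against 1997 to 1999 data only, the two routines agree on every record, so a test built from current production files shows nothing. The differences appear only when the inputs cross the rollover, which is the kind of broad test data a walkthrough or a superficial test never exercises.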

-- Slammer (BillSlammer@Yahoo.com), October 29, 1999.

Naw, Slammer. The ones who will "save us" are the same ones who've been "saving us" for the last 10, 20, 30 years.

-- Hoffmeister (hoff_meister@my-deja.com), October 29, 1999.


Nobody knows all languages.

Yes, the failure rate due to date problems will rise.

The failure rate due to replacements and modifications has already peaked.

Comparing the two was the point of the analysis.

-- Hoffmeister (hoff_meister@my-deja.com), October 29, 1999.

Beck --

I know what you mean about the profession. Of course, from my 'advanced' age viewpoint, most of these should still be playing with blocks somewhere. (Unfair, I know, but still.)

What is worse, most of the Y2K remediation I've seen done was done by 'summer interns' (read: kids on summer vacation), who understood neither the problem, the implications of the errors, nor the process, and had no guidance from a grownup. (This is not calculated to inspire me with even the same warm feeling I'd get from wetting my pants!)

As to the point about the minority, it is true, but the minority is shrinking every day. I haven't written a line of code in two months. My position is supposed to be a senior technical one, but I spend my time writing memos, answering email, attending meetings, filling out forms, traveling to meetings, publishing minutes of meetings, reading published meeting minutes, creating and updating project plans, and other assorted management details. And most of the senior guys that I have worked with over the years are doing much the same.

-- just another (another@engineer.com), October 30, 1999.

After spending about 30 years with several mainframe and mini companies, watching Joe Sixpack and the people he elects, and lurking on this forum for 18 months, I will continue to prepare for the worst.

Thank you all for expressing your opinions.

-- Started in 1964 with IBM (Long@TimeLurker.com), October 31, 1999.
