Crouch-Echlin and Time Dilation

greenspun.com : LUSENET : TimeBomb 2000 (Y2000) : One Thread

I posted this on euy2k.com's forum and wanted to share it with the group...

I picked this up from Cory Hamasaki's DC WRP [...]

---------------------------------------------------------------------

"TD, Time Dilation, the Crouch-Echlin Effect or CE/TD is an elusive but serious aspect of the larger Year 2000 issue that was discovered by Jace Crouch and Mike Echlin and first reported on the newsgroup comp.software.year-2000.

Specifically, TD refers to the time and date instabilities that will occur in the year 2000 and beyond on some personal computers and some embedded systems. These time and date instabilities occur when BIOS time and date routines improperly access a non-buffered RTC during startup, resulting in a personal computer or an embedded system that has difficulty calculating or retaining the correct time and/or date in the year 2000 and beyond.

On these systems the time and/or date will intermittently and abruptly "leap" forward (or occasionally backward) when the system is powered up, not only causing the system to display and store an incorrect time and date, but also leading in certain instances to the failure of COM ports and hard drives, CMOS scrambling, the OS ceasing to function properly because it is suddenly operating at a date beyond its original design parameters, and occasionally resulting in a system that will not boot up, or even make it out of POST.

These time and date instabilities can occur after the year 1999 because the BIOS then takes longer to access and process data obtained from the RTC, and on systems with a non-buffered RTC the BIOS may do this while the data is incorrect. In the era 20xx, a non-buffered RTC accessed shortly before the update flag is set may return bad data because the time and date calculations take longer than 244 microseconds in the era 20xx and the calculations may extend into the period when the RTC is in update status.
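The timing argument in the paragraph above can be sketched as simple arithmetic. This is only an illustrative model, not code from any BIOS: the function name, parameter names, and the sample durations are assumptions, and the only figure taken from the text is the 244 microsecond lead time.

```python
# Illustrative model only; names and sample durations are hypothetical.

# Lead time, in microseconds, between the update flag being set and the
# RTC actually entering update status (the figure quoted in the text).
UPDATE_LEAD_US = 244


def read_overruns_update(flag_set_after_us, access_duration_us,
                         lead_us=UPDATE_LEAD_US):
    """Return True if an RTC access is still running when the update begins.

    flag_set_after_us:  how long after the access starts the update flag
                        gets set (0 means "immediately afterwards").
    access_duration_us: how long the BIOS takes to read and process the data.
    """
    update_starts_at_us = flag_set_after_us + lead_us
    return access_duration_us > update_starts_at_us


if __name__ == "__main__":
    # Hypothetical durations: one shorter and one longer than the lead time.
    for duration in (200, 300):
        print(duration, read_overruns_update(0, duration))
```

In this model an access that starts just before the flag is set is safe only if it finishes within the lead time, which is the condition the text says is no longer met in the era 20xx.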

If this occurs when the RTC is accessed during POST, time and date instabilities can occur not only because this incorrect data is used to calculate the time and date for the software clock, but also because the incorrect time and date may get written back to the RTC/CMOS, thereby sustaining the time and date errors until the RTC/CMOS is reset by the user or by remediative software.
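A toy simulation may help show why the write-back matters. It is a sketch under stated assumptions: the function names and sample dates are hypothetical, the passage of real time between boots is ignored, and it models only the behaviour described above, where whatever the BIOS reads during POST is written back to the RTC/CMOS.

```python
# Illustrative model only; function names and sample dates are hypothetical.

def power_up(rtc_date, bad_read=None):
    """Simulate one power-up.

    rtc_date: the (year, month, day) currently stored in the RTC/CMOS.
    bad_read: if given, the incorrect date the BIOS obtains during POST.
    Returns (software_clock_date, rtc_date_after_boot).
    """
    seen = bad_read if bad_read is not None else rtc_date
    # The date the BIOS saw feeds the software clock and is written back.
    return seen, seen


def run_boots(rtc_date, reads):
    """Apply a sequence of power-ups; return the software clock after each."""
    history = []
    for bad_read in reads:
        clock, rtc_date = power_up(rtc_date, bad_read)
        history.append(clock)
    return history
```

For example, `run_boots((2000, 1, 5), [None, (2000, 3, 9), None])` has a single bad read on the second boot, yet the third boot still reports the wrong date because the RTC/CMOS now holds it.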

Occasionally (but devastatingly), the events that result in TD also result in CMOS corruption and/or hard drive boot sector corruption. For a detailed description of how this works, see . [...]

-Jace Crouch

---------------------------------------------------------------------

My questions: Are utilities taking this effect into account when assessing their systems? Can TD or CE/TD create problems for utilities? Thanks!

----------------------------------------------------------

Asked by Michael Taylor (mtdesign3@aol.com) on October 29, 1998.

Answers

_Any_ IBM PC/AT based embedded system will almost certainly cause problems for _any_ application in which it is used. I have yet to hear of any system involving more than 3 PCs that did not have to be remediated. In other words, if you have any moderately sophisticated embedded system controlling or monitoring _any_ process that involves more than 3 IBM PC/AT compatible boxes, the chances are greater than 90% that the system will fail. Period.

Don't pay any heed to these people who say "no problem". I have found that they fall into one of three broad categories: they are either just plain ignorant and like the sound of their own voice and spouting off on the 'net; or they are not completely ignorant but in denial; or, and this is the most dangerous class, they do understand the problem but they are over specialised and do not recognise nor understand the scope of the problem.

For a list of realtime clock chips that HAVE the Y2K bug and will NOT be fixed check here: http://www.mot-sps.com/y2k/yr2knote.html and here: http://www.mot-sps.com/y2k/black_prod.html

These chips are used in hundreds of millions of IBM PC/AT compatible embedded systems (and hundreds of millions of IBM PC/AT compatible desktop systems too). These chips are also used in hundreds of millions of other embedded systems of which I know absolutely nothing. I am an IBM PC/AT embedded systems expert (But not overly specialised :-)

Check here to see what an IBM PC/AT compatible embedded CPU board looks like: http://www.ampro.com/products/coremod/cm-p5i.htm That board measures 3.5" x 3.25"; you can put them everywhere, and people do. Also notice what they say about Y2K (hint: nothing).

Check here for an excellent page describing Y2K problems and embedded PCs: http://www.qnx.com/support/y2k/index.html (it also gives a pretty good feeling for the complexity of the issue)

Check here for a "short" list of industries and applications that use embedded PCs: http://www.qnx.com/realworld/index.html and here: http://www.qnx.com/company/compover.html#Customers

As I have mentioned many times before, there are dozens of embedded OS manufacturers and dozens of embedded hardware platforms for the IBM PC/AT compatible market alone. There are also literally hundreds, maybe thousands, of custom/in-house/proprietary embedded kernels and OS's too.

Something else to think about: once you've got your head wrapped around this part of the story, remind yourself that the embedded IBM PC/AT compatible market is a small fraction of the overall embedded systems market and hence a small part of the embedded systems Y2K problem.

See my comments in the thread "Gartner Report (98-10-12)" in this forum for more info. http://www.greenspun.com/bboard/q-and-a-fetch-msg.tcl?msg_id=000BzM Feel free to email me if you'd like further clarification on any of this.

Finally, don't let anyone tell you that there is no problem with embedded PCs. They simply do not know what they are talking about.

Regards,
Andrew J. Edgar
Manager, Systems Software
Centigram Communications Corp.

Disclaimer: I speak only for myself and from my personal experience. I do not speak as any kind of representative, nor spokesperson of my employer.

Answered by Andrew J. Edgar (ajedgar@centigram.com) on October 30, 1998.

---------------------------------------------------------------------

This may be why we hear that systems will have problems well into the year 2000 and it will be like a rolling storm of problems after problems after p

-- Michael Taylor (mtdesign3@aol.com), October 30, 1998

Answers

I've been trying to tell people about this for about 6 months. It's even more invisible than the desktop/mainframe application stuff. It is second in importance only to primary electric power generation. If the infrastructure falls, this could be the main culprit. So many systems, so diverse and hidden.

-- R. D..Herring (drherr@erols.com), October 30, 1998.

I can't find anyone supporting these findings, so for now I will watch and wait. It seems odd to me that the guys who discovered this are also selling the test, and the cure. If this were a widespread problem, you'd think they would be able to give away the test, and still sell the cure. A couple of 286 or 386 PCs going whacky after 1/1/00 is not going to keep me awake at night. I need more than what I've seen.

The Compaq/Digital "confirmation" of the effect is less than convincing. No web site, no research data, just an email which I've seen reproduced a couple of places. Until we have more widespread confirmation, and more solid data, I'm not going to be too concerned.

-- Mike (gartner@execpc.com), October 30, 1998.


I did a search at Infoseek with "Crouch Echlin" as the search parameters. I found one (1) web-site with any reference to this effect, and that was http://www.nethawk.com/~jcrouch/dilation.htm which is Crouch's own site. There were links there to Compaq/Digital's site. When I followed the links I found no mention of this so-called effect. I used the search facility on the Compaq/Digital site with both "time dilation" and "Crouch Echlin" and again got no hits mentioning this effect. Crouch's site also had a link to "their favorite critic", which was a c.s.y2k posting by Charles Reuben, who argued that they did not follow proper scientific/engineering methods investigating the problem and had wasted everyone's time with an unproven theory. The Crouch site also said that Digital was accepting orders for his "TD" testing software, but only via e-mail. It also offered another company's site for orders of less than 50 copies. All very suspicious.

-- Buddy Y. (DC) (buddy@bellatlantic.net), October 30, 1998.

Buddy and Mike, try this site:

http://www.intranet.ca/~mike.echlin/bestif/tdpaper.htm

At no point is any wild claim made and there is no request for money for a fix. In fact, a request for help in testing and augmenting the findings is made.

Like I said, I'm not a computer tech. I don't even use a "PC". I'm on a Mac. Why is it I can see exactly why this is possible and what could happen to systems that suffer from such a disruption?

The serious nature of problems such as this and the fact that these problems are very, very obscure has me very worried. The "Jo Anne Effect" was something that wasn't a mainstream concept or even widely known until earlier THIS year. The embedded systems issue didn't become a mainstream y2k problem until earlier this year.

Oh, and if you visit the site above it outlines exactly what occurs and how you yourself can also conduct similar tests.

How can you discount this problem if you have just heard about it or if you yourself have never tried a similar series of tests?

Should this kind of reaction only further my fears that myopic vision clouds technical people from the broad scope of y2k?

I'm with R.D. and I'm scared stiff. ==========================================================

-- Michael Taylor (mtdesign3@aol.com), October 31, 1998.


So, basically, another contingency plan is to have a "large" back-up supply of paper, pencils and pencil sharpeners? I still love my Mac.

-- Diane J. Squire (sacredspaces@yahoo.com), October 31, 1998.


I'm very surprised by the skepticism. Following is a link to a well written explanation of TD. It also includes data from testing AND the name of the Compaq/Digital person responsible for further investigation.

http://www.intranet.ca/~mike.echlin/bestif/tdpaper.htm

Barry Pardee
Americas Year 2000 Expertise Center Manager
Compaq Computer Corporation
Barry.Pardee@digital.com
http://www.digital.com/services/nsis/adi/y2k.htm
http://www.software.digital.com/year2000

I don't have any problem with Crouch/Echlin devising a tool for sale. They are professional software people! It's how they make a living.

I recently completed my own tests on a local shop 386 and a Gateway 486. Both exhibited time dilation after one week! I'm testing a Pentium 100 now - no problems so far. This is a real effect that WILL kill people. It's more than a date problem. It also causes unpredictable losses of serial I/O. I haven't figured out how yet.

Don't you see what this means? Some of these boards aren't firmware fixable. But it's not always clear which ones. You have to scrap and replace to be sure. And that's just not happening out there in manufacture/industry land. Companies are concentrating on their billing systems first, inventory and payroll second, then everything else somewhere at the end of the train. I predict there will be a lot of "compliant" companies with dead production by March 2000. (Not to mention a few explosions, fires etc.)

-- R. D..Herring (drherr@erols.com), October 31, 1998.


I've read the Mike Echlin and Jace Crouch stuff about TD. I've read the copy of the email from Barry Pardee.

I searched the Digital/Compaq sites for anything related to "TD", "Time Dilation", "Crouch" or "Echlin". Nothing there. Are they sure enough about this thing to publicize it?

I'm no electrical engineer or computer scientist, so a lot of the explanations for the phenomenon are beyond me. But I'm reserving judgement until there is something more to sink my teeth into.

The email from Barry Pardee at Compaq/Digital says that there may be a problem with some unbuffered RTCs. "Some". From what I understand, these unbuffered RTCs were only common in older PCs (286s). That's not enough to even justify buying a single copy of TD Tools, let alone a copy for every PC in our office. I have no idea how often these are used in other devices, but I haven't seen any test results to prove or disprove the possibility of problems outside PCs.

After I posted a reply about TD to another thread here about ten days ago, I got an email from Charles Reuben. He's a major critic of the TD believers. Since I'd prefer not to post any part of that message without his permission, I hope he reads this thread and responds in person. I can send you to this link: which you may have seen already.

As for TD Tools, I have no problem with programmers making money from their software. It just seems that if this was a fairly common problem there would be plenty of money to be made just from the fix, without having to sell the test. TD Tools doesn't tell you if your PC *will* suffer from TD, only whether or not it has an unbuffered RTC, which means it *might*. TD Tools will also fix the problem, if it occurs. According to Mr. Reuben, problems with unbuffered RTCs can be addressed with RighTime's BIOS solution also. He seems to think there might be more to this than just selling software.

I dunno, but I'll keep watching.

-- Mike (gartner@execpc.com), October 31, 1998.



Mike, I've read Reuben's rhetorical rantings about TD being either "nonsense" or "due to power spikes". That's just plainly nuts. He offers zero evidence for his supposition about internal power fluctuations. He did not do any experimentation of his own. He just huffs and puffs about the "scientific method". Well, part of the scientific method is experimentation, and yet he seems allergic to actually getting his hands dirty. However, Reuben then goes on to say that the RighTime BIOS fix can fix the problem. Now think about this: first there is no problem, then there is a random problem caused by "power spikes", and finally, don't worry about it, just fix the BIOS. That's the strangest argument string I've ever encountered. But if you take his ending, fix the BIOS, that's MY POINT. However you get there, you have to examine and fix each 286/386/486 controller out there. (And that's a lot!)

In any case, Crouch and Echlin published their raw data, including internal registers as a function of time. All I can tell you is round up a half dozen or so 386s and 486s and try it. See what happens. You won't be satisfied until you see the results yourself. Yes, lack of buffering may be part of the problem. It seems to have more to do with a "window" of about 240 microseconds in which the results of certain registers may be inaccurate if queried. Therefore, this will happen in increasing order of frequency: Pentium, 486, 386, 286. (As an aside, I was about to buy a Sub Zero refrigerator for a kitchen re-do. On reading the specs, I found out the main controller was 486 based! Decided to hold off for this year.)

As for Compaq/Digital not having it on their web site, I'm not surprised. It's hardly the thing they want to advertise. I'll try to call Barry and find out what info the corp is willing to make public.

-- R. D..Herring (drherr@erols.com), November 01, 1998.

