Y2K is NOT a maintenance problem!

Y2K is expected to cause problems quite simply because of those systems (almost all of them that use dates) that are DESIGNED with the assumption that they are operating in the 20th century.

Maintenance is normally a process that maintains or keeps in operation a machine, process, etc. With mechanical systems, such as aircraft maintenance, this most usually involves replacement of parts before they wear out, lubrication, adjustment, inspection, and even, on occasion, replacement of a component that was ill conceived to begin with. In such a case, however, the maintenance folks are not the same folks who re-design that component.

In the case of software maintenance, we have some differences. Nearly ALL systems contain components that do date arithmetic. Those components are not all alike; if anything, they differ more often than they match. The software components that are designed to operate in the 20th century are far from uniform and, for the most part, share only the assumption that the 20th century is "now". Whether you call it a "fix" or not, the nature of the correction is one of re-design.
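To make the 20th-century assumption concrete, here is a minimal sketch in Python. It is not taken from any real system; the function names and the account example are made up purely for illustration. The first routine does exactly what it was designed to do, which is why correcting it is a change of design rather than a repair.

```python
# Hypothetical illustration: records store only the last two digits of the
# year, and the code assumes the century is always 19xx.

def years_elapsed_20th_century(start_yy: int, end_yy: int) -> int:
    """Elapsed years as originally designed: both years are taken as 19yy."""
    return (1900 + end_yy) - (1900 + start_yy)


def years_elapsed_four_digit(start_year: int, end_year: int) -> int:
    """Redesigned version: the century is carried in the data itself."""
    return end_year - start_year


# Account opened in 1995 and checked in 1999: works as designed.
print(years_elapsed_20th_century(95, 99))    # 4

# Same account checked in 2000, which is stored as "00".
print(years_elapsed_20th_century(95, 0))     # -95

# With four-digit years the answer is the expected one.
print(years_elapsed_four_digit(1995, 2000))  # 5
```

Nothing in the first function is "broken" in the wear-and-tear sense. It meets its design, and the design is what has to change.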

Unfortunately, the assignment to "make it work" is largely perceived by the decision makers as one of repairing something that is broken instead of re-designing something that is doing what it is supposed to do.

On the bright side, those actually remediating the code understand this and do their best to re-design it.

On the dark side, the fact that most of the remediation is based on "windowing" is prima facie evidence that the decision makers are taking the short way out. A 100-year "window" is what got us here in the first place.
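For anyone who hasn't seen windowing up close, the idea can be sketched roughly as below. This is only an illustration: the pivot value of 30 and the name expand_year are assumptions chosen for the example, and real remediation projects picked their own pivots.

```python
# Hypothetical sketch of a fixed "window": two-digit years below the pivot
# are read as 20yy, the rest as 19yy. The pivot of 30 is an assumption.
PIVOT = 30


def expand_year(yy: int, pivot: int = PIVOT) -> int:
    """Map a two-digit year onto a 100-year window starting at 1900 + pivot."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    return 2000 + yy if yy < pivot else 1900 + yy


print(expand_year(99))  # 1999
print(expand_year(0))   # 2000
print(expand_year(29))  # 2029
print(expand_year(30))  # 1930, even if the record really meant 2030
```

The stored data still holds only two digits, so the ambiguity has not gone away. It has been moved from the year 2000 to the edge of the new window.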

"Bad code" is what is killing us now. "Bad code" is not what got us here. It was short sighted thinking and decision making that got us here. It was taking the short way out that got us here. It was bad management and greedy people who got us here.

They're still doing it.

We're still letting them.

We'll deserve whatever we get from them.

My Grandfather often told me, "There's never time to do it right, but there's always time to do it over."

Not this time, Gramp.

-- Hardliner (searcher@internet.com), January 26, 1999

Answers

"There's never time to do it right, but there's always time to do it over."

And that is the way our world has worked for a long time now, and has never let us down. Sure, people get mad, maybe even demoted or fired, and sure the money that gets coughed up to do the re-work is always way more than it would have taken to do it right (and mucho better) in the first place. But, all in all, this approach works.

But not this time. Y2K CANNOT BE FIXED.

-- Jack (jsprat@eld.net), January 26, 1999.

Amen, Hardliner!

Unfortunately, even the portion of the public that is "semi-technical" will probably never realize this...and the PR folks do all they can to maintain this misconception.

I already have a few old elms in my yard targeted for the axe.

-- Delete (del@dos.com), January 26, 1999.


W..e..l..l.. I like your point and it is one that will affect my future Y2K thinking/blabbing, thanks. Still, I think the subject is a little more nuanced. Lots of Y2K remediation can legitimately be termed maintenance, but you are right that much is really redesign.

Another of the hairy gotchas here is that geeks/geekettes are, under great stress and pressure and without useful management input, making on-the-spot decisions about whether to fix cleanly, kludge or redesign to do dates right with subsequent implications upon failures that will show up at unit, system, production and regression testing time. You know, when we get around to that in 2000 or 2001.

I will amend my future insistence on Y2K being a singular maintenance problem to its being a singular maintenance and redesign problem with very ill-defined theoretical boundaries between the two. Yummy.

-- BigDog (BigDog@duffer.com), January 26, 1999.


>>We'll deserve whatever we get from them.

Hardliner, I understand you're trying to inculcate responsibility and vigilance in the reader, but I tend to formulate this as "They'll deserve whatever they get from us." In a limited sense, we are "to blame" if we are sold shoddy goods, or if we mistakenly place our trust in those who abuse us. But our sense of self-responsibility shouldn't be allowed to obviate the requirements of justice. I'm not concerned with the programmers, but the managers (of power plants, rail companies, etc.) who ignored programmers' pleas to fix the problem -decades ago- ought to be breaking rock on a farm for a few decades. Same goes for lying bankers, city officials, etc., who, evidence in hand, are telling people NOT to prepare. They're ensuring future panic and death, while they go one last round at the Wall Street roulette wheel. I look around at the people who look with trust to their civil servants, managers of utilities, etc., and see their more-or-less innocent kids at play, their struggle to stay afloat - much less stay aware of Y2k-type developments - and I can't really say they *deserve* what's coming. More to the point, I don't think they're going to dust themselves and their surviving family members off and say "oh well, we deserve financial ruin and death - we trusted them!"

I'm not advocating vigilantism, but pointing out that if justice is not served in an official capacity, it will likely be served in an unofficial one. With less precision.

E.

-- E. Coli (nunayo@beeswax.com), January 26, 1999.


E.,

You're right. I'm afraid I didn't put that very well. What I was trying to express was the idea that the first time someone wrongs you, it's "Shame on you!" The second time they do it, it's "Shame on me!" as in, "I should have known better."

"Deserve" was the wrong word and clearly, the victim whose trust has been betrayed is not deserving of the consequences of that betrayal.

I share your views on "justice" as well.

-- Hardliner (searcher@internet.com), January 26, 1999.



Well said, E.C.

Hardliner, the nuances will be lost on most of the inhabitants of this board. You are mostly correct in the re-design vs maintenance argument, but many will think that re-design is part of maintenance. It's the kind of argument that will be a Ph.D. thesis in 2050.

I wonder if Paul Milne could be the Warden at the Rock Farm?



-- RD. ->H (drherr@erols.com), January 26, 1999.

"My Grandfather often told me, "There's never time to do it right, but there's always time to do it over." Not this time, Gramp. "

Your Gramps wouldn't have thought of it, because it's a hard thought to think when everything is working just fine. But we'll probably have the opportunity, not to "do it over," but to do it differently.

There's gotta be another way.

-- Tom Carey (tomcarey@mindspring.com), January 27, 1999.


It may be semantics, but I'd say it's a maintenance problem in the same way that automobile recalls are a maintenance problem. Things quite often prove to have been designed not-quite-right and have to be fixed in a hurry before they fail disastrously.

Fixing a Y2K bug is usually trivial. Finding them and testing the fixes is the problem. And the unique fact that the problem is hard-linked to the date.

-- Nigel Arnot (nra@maxwell.ph.kcl.ac.uk), January 27, 1999.


I can't remember the name of the book (quality related), but one of its basic premises was that quality is 'the adherence to specifications'. The book suggested that the Pinto automobile (notorious for blowing up when rear-ended) was in fact a quality piece because it DID meet or exceed the designers' specifications. The reason it had a tendency to blow up was POOR design.

Software maintenance is the same. If the system is not performing to the original specs, corrections to bring the system up to spec are maintenance. Changes to the original specs ARE redesign.

MoVe Immediate

-- MVI (vtoc@aol.com), January 27, 1999.


The book title synapse just got a jump start. The name of the book is "Quality is Free".

MoVe Immediate (who wonders why I keep getting erratic occurrences of single quote marks floating through my text after I post)

-- MVI (vtoc@aol.com), January 27, 1999.



Nigel,

Even in your auto recalls, it's only the implementation of the correction that's a maintenance function. The redesign is done elsewhere.

The fact that the same software engineer may perform both functions vis a vis Y2K tends to blur the distinction but there is still a difference that cannot be attributed to simple semantics between changing hard-coded city names or telephone area codes, for example, and rewriting the procedure and redefining the criteria for handling date arithmetic. The first is clearly maintenance and the second is just as clearly redesign.

If you simply exchange a part during routine maintenance, you have to be sure that the new part fits, that it is the correct part, that all the bolts are tight and that all the safety wire is in place, but there is no concern as to how this new part will interact with the rest of the mechanism. That is a known quantity. The part has not changed in nature, it has only been replaced by a new, perfectly functioning one.

If, on the other hand, you replace, again during routine maintenance, a redesigned part, in addition to all the concerns above, hopefully someone, somewhere, has not only redesigned the part, but they have investigated and taken into account how the changes made will affect how the redesigned part will interact with the rest of the mechanism.

That is the critical difference, in Y2k and in other like instances. That's why the testing function is so vital and necessary and why implementation without testing is such a risk.

There's an old saying that goes, "If it ain't broke, don't fix it," and it has been around a long time for a very good reason. Every time you perform a maintenance action of any kind, you have another chance to break something. If your action is implementation of an untested but redesigned part, it is obvious that the risk is that much greater.

Aviation maintenance is a good parallel because of the criticality of the work done. The lack of a single cotter pin can, and has, cost many lives. In Y2K work, it is not hard to find examples of life or death applications.

Again, let me ask those of you that understand these distinctions to bear with me. I have no wish to insult your intelligence. It is, however, my hope that that rarest of animals, a manager with a conscience, may get a glimmer. That a light bulb will go on somewhere and a seed may get planted that grows into understanding. I'm not optimistic about that, but I share with "Deano" and, I'm sure, a lot of the rest of you a refusal to quit before the finish line regardless of odds. Whatever the odds may be, Y2K is the stakes game to end all stakes games and we all have to play.

-- Hardliner (searcher@internet.com), January 27, 1999.


Freeze and test is the ball-game in 1999, not to avoid breakdowns but to contain them. Hardliner is right-on in that the next spin (and it will be *within* IT) is that "testing, especially acceptance test" isn't **really** needed because it's just maintenance. Wrong.

-- BigDog (BigDog@duffer.com), January 27, 1999.

Hardliner,

Some recalls require the fitting of a part redesigned elsewhere. Others just require the removal and replacement of a standard part, because it has become apparent that due to a failure in quality control, a defective batch got into production. It can be as simple as a batch of high-tensile steel bolts that were incorrectly tempered.

There are indeed Y2K bugs that require subtle and complex redesign of systems, but they are (IMHO) a minority. Most are trivial -- once you've found them! -- and in this case are a lot more like replacing a component that wasn't redesigned, simply one of a bad batch. The overall architecture of the program is usually unchanged or only slightly changed.

But you're absolutely right to emphasize the importance of testing. The most trivial of fixes can be got wrong (like a mechanic over-torquing that bolt!), and the worst possible time to discover that is in production shortly after 1/1/2000. Especially if it slags a database, which is not so unlikely. And given that you're probably looking at a very large number of trivial fixes, each different, it's a near-certainty that at least one will itself be a dud.

Testing is also vital to locate those Y2K bugs which weren't spotted when the code was "proof-read", whether by a person or a machine or both. The more thorough the test, the less chance of a bug slipping through the net. In the absence of thorough testing, Y2K bugs that get missed will outnumber Y2K fixes that were wrong.
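As a rough sketch of how a test catches a "trivial" fix that is itself a dud, consider the leap-year rule, since 2000 is a leap year (divisible by 400). The routines is_leap_buggy and is_leap are hypothetical, and the list of boundary years is an assumption chosen for the example. Python's standard datetime.date is used only as an independent reference.

```python
from datetime import date


def is_leap_buggy(year: int) -> bool:
    """Hypothetical quick fix that forgets the divisible-by-400 rule."""
    return year % 4 == 0 and year % 100 != 0


def is_leap(year: int) -> bool:
    """Full Gregorian rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def reference_is_leap(year: int) -> bool:
    """Independent check: February 29 exists only in leap years."""
    try:
        date(year, 2, 29)
        return True
    except ValueError:
        return False


def failing_years(candidate, years):
    """Return the years where the candidate disagrees with the reference."""
    return [y for y in years if candidate(y) != reference_is_leap(y)]


# Assumed set of years around the rollover, plus one later century year.
BOUNDARY_YEARS = [1996, 1999, 2000, 2001, 2004, 2100]

print(failing_years(is_leap_buggy, BOUNDARY_YEARS))  # [2000]
print(failing_years(is_leap, BOUNDARY_YEARS))        # []
```

The buggy version agrees with the reference on every year except 2000, so a test that never exercises the rollover year would pass it straight into production.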

-- Nigel Arnot (nra@maxwell.ph.kcl.ac.uk), January 28, 1999.

