The $64,000 Question...

greenspun.com : LUSENET : TimeBomb 2000 (Y2000) : One Thread

I think the following essay bears in-depth reading as it closely ties in with the theories of Infomagic, Dr. Altman and Roberto Vacca, to name just three.

It addresses the system of interdependencies we have built up, by accident - not design, throughout the technological world.

The $64,000 question is - how many of these linked sub-systems need to function, and at what capacity, to retain the structure of the whole?

This was written by Mike Goodwin.

I've been pondering the root Y2K problem for many years, searching for a concise way to describe the true nature of the potential threat. This week, aided by the phraseology of a scientist, I've constructed this question:

"What is the fault tolerance of our globally-distributed specialization network?"

This is the relevant Y2K question. Remember, it's not the compliance of home appliances that matters ( and why polls keep asking people about home appliances is an unfortunate mystery... ) [no mystery to me pal, deliberate skewing of questionnaires in a disinformation campaign - Andy], and failures somewhere on the planet are all but certain. Failures are going to occur, without a doubt.

The question concerns the ability of our globally-distributed specialization network to survive faults. If the global system is highly fault tolerant, it will survive intact, with few disruptions. If the global system has low fault tolerance, we're in for a very rough ride. Perhaps even a multi-year shutdown of civilization as we know it.

FAULT TOLERANCE HAS NEVER BEEN TESTED

Recognize the fault tolerance of our "new" global community has never been tested. In the days of World War II, America was relatively isolated. We could build our own planes, trains and automobiles ( tanks, too ) . We had factories, we had relatively short, U.S.-based assembly lines with skilled U.S.-based workers who possessed labor skills. The network of specialization was much smaller, and therefore, more fault tolerant. Everybody knows the fewer pieces you have in an engine, the less likely it is to fail. Simplicity leads to reliability. Complexity results in a low fault tolerance.

Today, the manufacturing base of America is nearly extinct, and the supply lines for building products stretch across oceans, involving a half-dozen countries for parts. This is the "globally-distributed" specialization network to which I refer, and it is a relatively young system.

It's been driven by economics, by specialization, by efficient ocean-going transports and air deliveries. It's enabled by international telecommunications: e-mails, faxes, phone calls, even video conferencing. International banks allow the moving of funds from buyer to seller, through trusted international clearinghouse networks. This is, indeed, a "network" of a thousand parts, and each part of the machine must work at near-perfect efficiency for the whole system to operate correctly.
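Taken literally, the claim that every part must work can be checked with simple arithmetic. The sketch below is a minimal Python illustration, assuming the parts fail independently and the chain needs all of them at once. Both assumptions are illustrative simplifications, stricter than a real supply network with its spares and workarounds, and the function name is hypothetical.

```python
def chance_all_work(per_part_reliability: float, parts: int) -> float:
    """Probability that every one of `parts` independent parts works."""
    return per_part_reliability ** parts


# A "network of a thousand parts", at three assumed per-part reliabilities.
for p in (0.99, 0.999, 0.9999):
    print(f"per-part reliability {p}: whole chain works with "
          f"probability {chance_all_work(p, 1000):.4f}")
```

Under these assumptions the result falls off quickly as per-part reliability drops, which is the sense in which a long chain needs near-perfect parts.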

WE ALREADY KNOW THE SYSTEM CAN HANDLE A 1% FAILURE RATE

So what is the fault tolerance of this system anyway? That's the debate, that's the big question. Clearly, the people who say that systems fail all the time -- with no big deal -- are missing the point. Yes, power plants fail on a daily basis. Phone lines go down somewhere on the planet on a daily basis. Banks mess up transactions with frightening regularity. We understand that this global network has a fault tolerance of at least 1%. But that's not the right question.

Y2K isn't a local hurricane. It isn't a local power outage or a local bank error. It's a simultaneous, global slam-dunk event. It may raise the failure rate of this network to 10%. And *that* is the big question: is our globally-distributed specialization network able to withstand a simultaneous failure of 10% of its parts?

See, isolated failures always rely on the non-failing services -- and an excess of available resources -- to complete repairs. When a power plant fails, all the power experts get called on the phone lines, and they rush to the scene to fix this lone failing power plant. They use credit cards to buy plane tickets, gas, food, you name it. And when they're done, they go home and wait for the next power emergency. This demonstrates the 1% fault tolerance of our current system.

But what if ten power plants go down? Suddenly you've got 1/10th of the available resources for each power plant. Then what if the telecomm is down? You can't reach the people qualified to repair the power. If the telecomm is down, they can't use their credit cards to get there. Then what if the airlines aren't flying? You've got delays, people have to drive. So they depend on oil, but what if the oil tanker shipments are delayed?
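The resource-dilution argument above can be put into numbers. This is a minimal Python sketch, assuming a fixed pool of repair crews split evenly across all simultaneous failures. The pool size of 50 and the function name are hypothetical, not figures from the essay.

```python
def crews_per_failure(total_crews: int, simultaneous_failures: int) -> float:
    """Repair crews available to each failed site when the pool is split evenly."""
    if simultaneous_failures <= 0:
        raise ValueError("need at least one failure to share crews across")
    return total_crews / simultaneous_failures


# Hypothetical pool of 50 crews; not a figure from the essay.
POOL = 50
for failures in (1, 10, 100):
    share = crews_per_failure(POOL, failures)
    print(f"{failures:>3} simultaneous failures -> {share:.1f} crews each")
```

The even split is the simplest possible rule. It ignores the further point made above, that the crews themselves depend on phones, flights and fuel that may also be down.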

AT WHAT POINT IS THE FAILURE UNIVERSAL?

See, at some point, somewhere between 1% and 100%, you get a total failure of the network. The real Y2K question, when you boil it down, concerns this number. What percentage of simultaneous failure can the network withstand without collapsing?

Clearly, it's something lower than 80%, something higher than 1%. Perhaps the network could withstand a 5% failure; that's debatable. Imagine if 5% of all financial transactions were bad. That would clobber the financial institutions: busy signals forever. Imagine Wall Street with a 5% transaction failure. The whole system would shut down due to the 5% failures. A 10% failure would seemingly bring most networks down. Imagine if 10% of the parts in a power plant didn't work correctly. That's an off-line plant in short order. Imagine if 10% of the parts didn't show up at the Chrysler plant. That's a sure-thing shutdown. Imagine if 10% of the water treatment plants in the country failed. It would be a Red Cross nightmare, just attempting to supply water to 10% of the population.

In my opinion, the world probably can't withstand a 10% failure rate without severe and long-term consequences. A 20% failure rate would be, I think, a fatal economic event. It would thrust the world into a depression with all the resulting costs in dollars and lives. At a 20% failure rate, the efficiencies break down: the food production and deliveries, the oil, power, banking, telecommunications, and so on.
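One way to explore where an initial failure rate tips into wider collapse is a toy cascade model. The Python sketch below is only an illustration under loudly stated assumptions: a random network in which each node relies on 4 others and goes down once 2 of them are down. The network size, those two numbers and the function name are all hypothetical, and the output says nothing about the real-world threshold.

```python
import random


def final_failed_fraction(initial_fraction, n=1000, suppliers=4, limit=2, seed=0):
    """Fraction of nodes down once failures stop spreading.

    Toy model (all parameters are assumptions): each node relies on
    `suppliers` randomly chosen other nodes and goes down once at least
    `limit` of them are down.
    """
    rng = random.Random(seed)
    nodes = range(n)
    # Each node's suppliers, drawn from all the other nodes.
    needs = [rng.sample([m for m in nodes if m != node], suppliers) for node in nodes]
    # The simultaneous initial failures.
    failed = set(rng.sample(nodes, round(initial_fraction * n)))

    spreading = True
    while spreading:
        newly_down = {
            node for node in nodes
            if node not in failed
            and sum(s in failed for s in needs[node]) >= limit
        }
        failed |= newly_down
        spreading = bool(newly_down)
    return len(failed) / n


for start in (0.01, 0.05, 0.10, 0.20):
    print(f"initial failures {start:.0%} -> final failures "
          f"{final_failed_fraction(start):.0%}")
```

Changing `suppliers` and `limit` moves the tipping point a great deal, which is one way of seeing why the percentage is so hard to pin down.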

80% ISN'T GOOD ENOUGH

This is why, when people tell you that 80% of the systems are going to be ready, that's not nearly good enough. Technically, if you believe my analysis, 80% of the systems working is still a disaster. 20% of the systems failing could break the global network's back. In fact, a 95% "working" ratio isn't good enough, either. Even a 5% failure could have long-term, painful consequences. In order to avoid the worst effects of the Millennium Bug, systems need to operate at 99% or better. We need to have less than one failure per one hundred systems. At that rate, I'm confident the fault tolerance ability is sufficient.

Comments ladies and gentlemen???

Andy

Two digits. One mechanism. The smallest mistake.

"The conveniences and comforts of humanity in general will be linked up by one mechanism, which will produce comforts and conveniences beyond human imagination. But the smallest mistake will bring the whole mechanism to a certain collapse. In this way the end of the world will be brought about."

Pir-o-Murshid Inayat Khan, 1922 (Sufi Prophet)

-- Andy (2000EOD@prodigy.net), January 30, 1999

Answers

This is why a Worldwide Depression is my best case scenario. A house of cards only needs to have one or two key cards pulled from the base to fall. If we lose oil, electricity, transportation, the world is down for the count.

-- Bill (y2khippo@yahoo.com), January 30, 1999.

I would comment that it's more like the 64 Trillion Dollar Question...

-- pshannon (pshannon@inch.com), January 30, 1999.

Purely Bullshit from the people who are on top of the list of those who like to hear themselves talk. Dr. Altman has got a bug so far up his ass he doesn't know whether to shit or go crazy. His "probabilities for systemic failure" are reminiscent of the most popular forms of mathematical cultism and not nearly as telling as your average guy on the streets who can tell you which way the wind is blowing. And take it from THIS guy on the streets...

The wind ain't blowing in that direction. You want to mock me? Go ahead, that won't change the fact that % points don't mean a damn thing in this world of human nature. We don't accept % points, we accept reality, as transient and changing as it is. In short, your theories don't amount to shit. Theories are good on paper, but this isn't paper we are dealing with. It's a system that's 5,000 years in the making. You couldn't stop it if you wanted to. Like it or not, your deluded fantasies just WON'T ring true.

-- (Y2K My @SS . Com), January 31, 1999.


Never wrestle with a pig. You'll get dirty, and the pig will love it.

-- (x@y.z), January 31, 1999.

It's a system that's 5,000 years in the making. So what is your background, Mr. y2k? It is a system that is full of cancer. It's a system that has no landlord and no one to guide the overall system, where each of the components answers only to itself and not to the whole. It is a system that rarely learns from the past and so is condemned to repeat its mistakes. Are things getting better? I can't see it. Will life go on? Yes, but not as we know it, and there is precious little time before we find out if you are right or wrong. If you are right, cool; however, if you are wrong, then what? At any rate we only have 11 months to go and then it's show time. Tman

-- Tman (Tman@k2k.com), January 31, 1999.


A friend of mine piloting a B-17 lost three of the four engines over France coming back from a raid into Germany. The surviving crew threw out everything loose, he got down to the treetops, and made it back to England. Now that's "fault-tolerance."

Charlotte's web ain't like that.

-- Tom Carey (tomcarey@mindspring.com), January 31, 1999.


Y2Ketc. --

Your style of evangelism isn't very persuasive, for a fact. But you're welcome to think what you like about Y2K -- just like everyone else.

Obviously this forum bothers you a lot. According to your own statements, there's nothing here for you to learn.

So why do you keep coming back?

-- Tom Carey (tomcarey@mindspring.com), January 31, 1999.


To agitate and irritate, and because he's worn out his welcome on other BB's. He'll soon get tired of us because there are too many intellectual thinkers and wise people that post here, and he isn't capable of keeping up with the thought process.

-- ~~ (~~@~~.com), January 31, 1999.

Tom

That is the most ass kickin plane ever made, Deedah's favorite of all time.

PS, AKA "Ground Effects" when flying so low.

-- Uncle Deedah (oncebitten@twiceshy.com), January 31, 1999.


I'm not sure about Goodwin's numbers - I think there's certainly room to debate - but his point is well taken and something to keep in mind as we move forward. Since all the programs will not be fixed, his question points the way toward mitigation through a combination of remediation and preparation at all levels.

I believe his question could be better stated as:

What are the most effective (time and money) ways to both DECREASE the number of errors and, simultaneously, collectively INCREASE our fault tolerance in the short time remaining?

It does little good to argue about Goodwin's number - yours are as good as his. And we could argue for months over the implications of those numbers. It doesn't really matter.

It is the abstraction here that is important. It suggests that we should not sit by doing nothing - waiting for the programmers to lower the number of faults - for this only lowers one side of the equation. If we can proactively raise our collective fault tolerance, then we have attacked this problem from two different but productive angles.

Building in more 'fault tolerance' is something that each of us can do in very meaningful ways. People who are prepared have no reason to panic, no reason to be frightened and no reason to cause problems for others around them.

-- Arnie Rimmer (Arnie_Rimmer@usa.net), January 31, 1999.



Thanks chaps and chapesses,

It is early yet - there are a lot of yee hahs amongst us - by and large good questions all - put your thinking caps on as this question will be The Deciding Factor over everything this oncoming year - interdependencies... not that difficult if you think about it...

Arnie - good point - however I do think Mr. Goodwin is trying to lock down a percentage - we all know this cannot be done - but it does pay to think about the small numbers (percentage-wise) rather than the larger ones in this context.

Let's stick to the question shall we?

Percentages - call it odds if you will - maybe we should get a Vegas bookie to give us all some input???

Andy

Two digits. One mechanism. The smallest mistake.

"The conveniences and comforts of humanity in general will be linked up by one mechanism, which will produce comforts and conveniences beyond human imagination. But the smallest mistake will bring the whole mechanism to a certain collapse. In this way the end of the world will be brought about."

Pir-o-Murshid Inayat Khan, 1922 (Sufi Prophet)

-- Andy (2000EOD@prodigy.net), January 31, 1999.


oops.

-- Andy (2000EOD@prodigy.net), January 31, 1999.

testing testing 1 2 3 how does this friggin' html work fer crissakes!!!???

-- Andy (2000EOD@prodigy.net), January 31, 1999.

i furgin give up...

answer the bleedin question and stop laughing at my ineptitude you're embarrassing me!

-- Andy (2000EOD@prodigy.net), January 31, 1999.


Help - No Spam Please!!!

-- Andy (2000EOD@prodigy.net), January 31, 1999.


Bloody hell! - what did I do??

-- Andy (2000EOD@prodigy.net), January 31, 1999.

Order. Order I say!

So, what's the percentage?

My take - I think it could be very low. I keep remembering 1973, I was 15 and reading Melody Maker and New Musical Express in London, there were three page articles about how the lack of oil, petrol, would impact the production of, wait for it... LP's (I'm showing my age now...) LP = Long Playing Record... 33 rpm... have you seen Austin Powers? - that's me without the bad teeth - Shaggadelick Bayby!!!

Petrol rationing, garbage strikes (rubbish strikes), power strikes...

I remember the date - 1973.

Just like 1987. Black Monday.

1966 - England won the World Cup.

Anyone here remember two out of three :), off the top of your head, immediately?

2000.

No oil.

NO xxxx, fill in the blank.

And repeat.

Percentage that won't make it in the USA??? Higher than you would think.

Percentage that won't make it world-wide - catastrophic.

Work out the interdependencies for yourselves :)

Andy

1 byte.

-- Andy (2000EOD@prodigy.net), January 31, 1999.


Andy, andy, andy....

sigh...

Did you learn Nothing from the fumble-thumb post?

(take it easy, bub, or you'll blow a gasket!)

Leave the HTML to the big people, 'K?

-- Mutha Nachu (---@trees.com), January 31, 1999.


The point of fault tolerance is well taken but vastly oversimplified in your statement. Different systems have different fault tolerances before becoming non-functional. A typical bank's fault tolerance is probably on the order of 1 in 10,000 or 0.01% while the IRS is more like your stated 1%. (A B17 can fly with one engine but a 747 is wreckage.)

-- RD. ->H (drherr@erols.com), January 31, 1999.
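The point in the post above, that each system has its own threshold, can be sketched in a few lines of Python. The two tolerance figures are the guesses given in that post (0.01% for a typical bank, 1% for the IRS). The sample failure rates and the function name are hypothetical.

```python
# Tolerances as guessed in the post above: bank ~0.01%, IRS ~1%.
TOLERANCE = {"typical bank": 0.0001, "IRS": 0.01}


def still_functional(failure_rate: float) -> dict:
    """Map each system to True if the failure rate is within its tolerance."""
    return {name: failure_rate <= limit for name, limit in TOLERANCE.items()}


# Hypothetical failure rates, chosen to fall below, between and above the two limits.
for rate in (0.00005, 0.001, 0.05):
    print(f"failure rate {rate:.3%}: {still_functional(rate)}")
```

A single global percentage hides this spread: a rate the IRS could shrug off would already put the bank out of action.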

Death by a thousand cuts.

-- Bill (y2khippo@yahoo.com), January 31, 1999.

When the B-17 lost the first engine, the PIC (Pilot In Command) commented that they would be an hour late returning to their home field. After the second engine quit, the PIC tried to keep the crew calm by saying they would be 2 hours late. After engine number 3 failed, the PIC was using all his experience, training and knowledge to keep his butt in one piece, but took the time to say they would now be 3 hours late.

One crew member, Paul Davis, said: "I hope nothing happens to that last engine. We'll be up here all day."

Fellow crewmembers Deano and Troll Maria nodded in agreement.

-- PNG (png@gol.com), January 31, 1999.


Andy, you hit the nail on the head - that's exactly why I'm preparing!

Arnie, very good point, one Diane makes frequently.

PNG, LOL and many thanks for not abandoning us completely!

Since I am not a compugeek, my best response must be preparation for me and my community.

-- Tricia the Canuck (jayles@telusplanet.net), February 01, 1999.

