This is something of an availability/SL management question. We are running into a wall identifying a suitable metric that helps us and our business unit stakeholders determine SL compliance.

Problem Statement: Currently, uptime reports do not accurately reflect the true customer impact, and there is concern that some of these numbers might mislead management.

Example: We reported 99.95% uptime during one month; however, there were probably 10 events that had various impacts on customers. Some outages may have impacted only a small fraction of users, who were still able to log right back into the system and continue their work. One such outage lasted 20 minutes and was centered on a single web server in a configuration that involves 6 others.

Question: What is the industry norm for measuring total uptime of an e-commerce system? NOTE: One idea we are tinkering with is the notion of customer impact minutes. This treats each customer-minute as a failure opportunity, and the total customer down minutes are factored in to get a percentage of "success".
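
To make the customer impact minutes idea concrete, here is a minimal sketch of how the percentage might be computed. The names (Outage, customer_impact_availability), the 1,000-customer figure, and the assumption that load is spread evenly across the 7 web servers (so one failed server affects roughly 1/7 of customers) are all illustrative guesses, not measurements from our environment.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    duration_minutes: float   # how long the event lasted
    fraction_impacted: float  # share of customers affected, 0.0 to 1.0


def customer_impact_availability(outages, total_customers, period_minutes):
    """Return the percentage of customer-minutes that were not impacted."""
    # Every customer-minute in the period is one "failure opportunity".
    opportunity = total_customers * period_minutes
    # Customer down minutes: duration weighted by how many customers were hit.
    down = sum(
        o.duration_minutes * o.fraction_impacted * total_customers
        for o in outages
    )
    return 100.0 * (1.0 - down / opportunity)


# Hypothetical month: 30 days, 1,000 customers, one 20-minute outage on
# 1 of 7 web servers (assumes evenly balanced load).
month_minutes = 30 * 24 * 60
partial = customer_impact_availability([Outage(20, 1 / 7)], 1000, month_minutes)
# Same outage counted as if every customer had been down.
full = customer_impact_availability([Outage(20, 1.0)], 1000, month_minutes)
print(f"weighted by impact: {partial:.4f}%  counted as full outage: {full:.4f}%")
```

The gap between the two figures is the point: a time-only uptime number counts the 20 minutes as if everyone was down, while the weighted number charges only the customers who were actually affected.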

Regards, SteveR SLM Analyst

-- Steve Rodriguez


Steve, my initial impression is that you may be making this harder than it needs to be. Some basic statistical methods using synthetic transactions ought to give you a reasonable estimate of overall availability, taking partial outages and the like into account. The bottom line is whether or not there was an impact on a user.
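
As a rough illustration of the statistical approach, the sketch below treats each synthetic transaction as a pass/fail sample and reports the observed success rate with a Wilson score interval. The function name, the 5-minute probe spacing, and the failure count are hypothetical; the Wilson interval is just one common choice for a proportion close to 100%, not a prescribed method.

```python
import math


def estimate_availability(successes, total, z=1.96):
    """Return (point estimate, lower bound, upper bound) as fractions.

    Uses the Wilson score interval; z=1.96 gives roughly 95% confidence.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)


# Hypothetical month: one synthetic transaction every 5 minutes for 30 days,
# with 10 of them failing.
probes = 30 * 24 * 12
p, low, high = estimate_availability(probes - 10, probes)
print(f"availability {100 * p:.3f}% (about 95% CI {100 * low:.3f}% to {100 * high:.3f}%)")
```

Because the probes exercise the same path a user would take, a partial outage shows up only in the share of transactions that actually fail, which is how partial outages get counted in proportion to their impact.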

Rick Sturm President Enterprise Management Associates

-- Rick Sturm

