Availability Guide for Problem Management
Introduction to Problem Management
Availability Guide for Problem Management–125509
Measuring Outages
Outage Minutes
While the computer industry often measures availability as a percentage of total time, 
Tandem recommends measuring availability in outage minutes, assuming 24x7x365 
operations. Using an outage-minutes-per-year measurement is easy to understand and 
provides more meaningful data than percentage numbers. Table 1-1 compares 
percentages with equivalent outage minutes and the resulting user impact. 
Measuring Downtime in Minutes
A couple of decades ago, it was reasonable to assume that a computer system should be 
available 75 percent of the time. Today, however, reliability standards have increased 
substantially. For example, you might well compare a computer system that is available 
99.9 percent of the time with a computer system that is available 99.99 percent of the 
time.
Now consider the same two computer systems in terms of outage minutes. The first 
system is unavailable for 500 minutes during the year while the other system is 
unavailable for only 50 minutes during the same year. These values are more meaningful 
because the cost of application downtime is usually expressed as a cost per minute.
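The conversion from a percentage to outage minutes can be sketched in a few lines of Python. This is an illustrative sketch, not part of the guide: the function name outage_minutes_per_year and the constant MINUTES_PER_YEAR are assumptions, and the calculation assumes a 365-day year of continuous operation. The exact results come out slightly higher than the rounded figures of 500 and 50 minutes used in the text.

```python
# Minutes in a 365-day year of continuous (24x7) operation.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def outage_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly outage minutes implied by an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Compare the two systems discussed in the text.
    for pct in (99.9, 99.99):
        minutes = outage_minutes_per_year(pct)
        print(f"{pct}% available: about {minutes:.1f} outage minutes per year")
```

Multiplying the result by a cost per minute of downtime gives a rough yearly cost of unavailability for each system.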
In addition, measuring downtime in minutes makes it easier to understand the benefits of 
automated problem resolution. For example, suppose one of your service-level 
objectives is to keep downtime to less than 50 minutes per year. If it takes, on average, 5 
minutes to manually correct an outage, then your application can tolerate 10 outages per 
year, or an average of about 1 outage every 5 weeks. A fully automated solution to a 
problem can typically be completed 20 times faster than a manual solution of the same 
problem. It follows that, using fully automated solutions, you can tolerate up to 200 
outages each year, or about one outage every 1.5 to 2 days, and still achieve the same 
goal.
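The arithmetic behind this example can be checked with a short Python sketch. The function name tolerable_outages and the constant names are assumptions introduced for illustration; the 50-minute budget, the 5-minute manual repair time, and the 20-fold speedup are the figures from the example above.

```python
def tolerable_outages(budget_minutes: float, minutes_per_outage: float) -> int:
    """Return how many outages per year fit within the downtime budget."""
    if minutes_per_outage <= 0:
        raise ValueError("minutes_per_outage must be positive")
    return int(budget_minutes // minutes_per_outage)


# Figures taken from the example in the text.
DOWNTIME_BUDGET_MINUTES = 50.0   # service-level objective per year
MANUAL_REPAIR_MINUTES = 5.0      # average time to correct an outage manually
AUTOMATION_SPEEDUP = 20.0        # automated resolution is about 20 times faster

manual = tolerable_outages(DOWNTIME_BUDGET_MINUTES, MANUAL_REPAIR_MINUTES)
automated = tolerable_outages(
    DOWNTIME_BUDGET_MINUTES, MANUAL_REPAIR_MINUTES / AUTOMATION_SPEEDUP
)

if __name__ == "__main__":
    print(f"Manual:    {manual} outages/year, one every {365 / manual / 7:.1f} weeks")
    print(f"Automated: {automated} outages/year, one every {365 / automated:.1f} days")
```

With these inputs the sketch yields 10 outages per year for manual correction and 200 for automated correction, matching the figures in the text.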
Measuring Downtime in a Client/Server Application
For client/server applications, it is useful to take the measurement of downtime a step 
further and express it as the number of user outage minutes. A failure in the client part 
of the application might affect only one user, but to that user the application is down. A 
failure in part of the network could affect several users. A failure in the server, however, 
could affect hundreds of users. It is, therefore, important that an outage in the server be 
weighted more heavily than an outage in the client.
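One plausible way to apply this weighting is to multiply the duration of each outage by the number of users it affects. The guide does not give a formula, so the following Python sketch is an assumption, as are the Outage class, the user_outage_minutes function, and the sample user counts, which are purely illustrative.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Outage:
    """One outage record (hypothetical structure for illustration)."""
    component: str          # for example "client", "network", or "server"
    duration_minutes: float
    users_affected: int


def user_outage_minutes(outages: Iterable[Outage]) -> float:
    """Sum of outage duration multiplied by users affected, over all outages."""
    return sum(o.duration_minutes * o.users_affected for o in outages)


# Illustrative data only: three 5-minute outages with different reach.
sample = [
    Outage("client", 5.0, 1),
    Outage("network", 5.0, 12),
    Outage("server", 5.0, 300),
]

if __name__ == "__main__":
    for o in sample:
        print(f"{o.component}: {o.duration_minutes * o.users_affected:.0f} user outage minutes")
    print(f"total: {user_outage_minutes(sample):.0f} user outage minutes")
```

Although all three sample outages last the same 5 minutes, the server outage accounts for nearly all of the user outage minutes, which is the weighting effect described above.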
Table 1-1. Outage Minutes per Year (Assuming 24x7x365 Operations)
Percent Availability    90%       99%        99.9%       99.99%       99.999%     100%
Outage Minutes/Year*    50,000    5,000      500         50           5           0
User Impact*            35 days   3.5 days   8.3 hours   50 minutes   5 minutes   0 minutes
*Outage minutes per year and user impact days are approximations.










