Availability Guide for Application Design

What Is Application Availability?

Availability Guide for Application Design—525637-004


of the application might affect only one user, but to that user the application is down. A

failure in part of the network could affect several users. A failure in the server, however,

could affect hundreds of users. It is therefore important that an outage in the server be

weighted more heavily than an outage in the client.

When downtime is expressed in user outage minutes, a one-minute outage in the

client equals one user outage minute. An outage of one minute in the server, however,

equals one minute times the number of users accessing the server.
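The arithmetic behind user outage minutes can be sketched in a few lines of Python. This is only an illustration: the function name and the figure of 300 server users are assumptions chosen for the example, not values from this guide.

```python
def user_outage_minutes(duration_minutes: float, affected_users: int) -> float:
    """Return downtime as outage duration multiplied by the number of users affected."""
    if duration_minutes < 0 or affected_users < 0:
        raise ValueError("duration and user count must be non-negative")
    return duration_minutes * affected_users


# A one-minute client outage affects a single user.
client_outage = user_outage_minutes(1, 1)

# A one-minute server outage affects every user of the server
# (300 users is a hypothetical figure).
server_outage = user_outage_minutes(1, 300)
```

Under these assumptions, the same one-minute failure counts 300 times more heavily in the server than in the client.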

Of course, not all user outage minutes are equal. You might need to refine the model

for measuring outage minutes to suit your business needs. For example, a seed

retailing company might have a different sales channel for backyard gardeners than for

commercial market gardeners and farmers. Clearly, a one-minute outage on a line that

typically carries orders for a few packets of border plant seeds should be weighted less

than a one-minute outage on a line that often carries orders for planting several

thousand acres of corn.
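One possible refinement is to multiply each outage by a weight that reflects the business value of the affected channel. The sketch below assumes two channels with hypothetical names and weights; a real site would derive its weights from its own order values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outage:
    channel: str             # sales channel affected by the outage
    duration_minutes: float  # how long the outage lasted
    affected_users: int      # number of users who lost service


# Hypothetical weights: a commercial order line is assumed to be worth
# 50 times a backyard-gardener order line.
CHANNEL_WEIGHTS = {"backyard": 1.0, "commercial": 50.0}


def weighted_outage_minutes(outage: Outage, weights=CHANNEL_WEIGHTS) -> float:
    """Return user outage minutes scaled by the weight of the affected channel."""
    return outage.duration_minutes * outage.affected_users * weights[outage.channel]


backyard = weighted_outage_minutes(Outage("backyard", 1, 1))
commercial = weighted_outage_minutes(Outage("commercial", 1, 1))
```

With these assumed weights, a one-minute outage on the commercial line counts as 50 weighted minutes, compared with 1 for the backyard line.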

The correct way to measure an outage affecting a batch program varies from one

application to another. The batch program could be considered a major user and,

therefore, should be weighted more heavily than single-transaction users. Conversely,

you could argue that the batch program should be weighted more lightly because you

can easily restart it, and neither its run time nor its completion time is important.

Alternative Ways to Measure Downtime

Of course, many users might choose to measure downtime in ways other than user

outage minutes, depending on their specific business needs. For example, a site might

be obligated to pay penalties for each transaction that does not get processed while

the application is down. Such a site might supplement its measure of downtime as

follows.

To measure the number of transactions that would have been processed during an

outage, the site keeps a record of the number of transactions it normally processes by

minute and by day of the week. If an outage occurs, for example, at 10 a.m. on a

Tuesday and lasts for 15 minutes, the site can calculate the average number

of transactions that would normally be processed during that period. Subsequently, the

site pays a corresponding penalty to its customer.
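The calculation described above might look like the following Python sketch. The baseline table layout, the rate of 40 transactions per minute, and the penalty of 0.50 per transaction are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def expected_missed_transactions(start: datetime, duration_minutes: int, baseline: dict) -> float:
    """Sum the historical per-minute averages over the outage window.

    baseline maps (weekday, hour, minute) to the average number of
    transactions normally processed in that minute; weekday follows
    datetime.weekday(), where Monday is 0.
    """
    total = 0.0
    for offset in range(duration_minutes):
        t = start + timedelta(minutes=offset)
        total += baseline.get((t.weekday(), t.hour, t.minute), 0.0)
    return total


# Hypothetical record: 40 transactions per minute between 10 and 11 a.m. on Tuesdays.
baseline = {(1, 10, minute): 40.0 for minute in range(60)}

# A 15-minute outage starting at 10 a.m. on a Tuesday (2 January 2024).
missed = expected_missed_transactions(datetime(2024, 1, 2, 10, 0), 15, baseline)

# Hypothetical contractual penalty per unprocessed transaction.
penalty = missed * 0.50
```

Keeping the baseline at one-minute granularity lets the same function handle outages that cross hour or day boundaries, at the cost of a larger table.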

Using this method leads to significantly different outage costs depending on the time of

day and the day of the week. An hour-long outage at 2 a.m. on a Monday might

carry a negligible penalty when compared with a 15-minute outage at 5 p.m. on a

Friday.
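To make the comparison concrete, the sketch below applies assumed average transaction rates to the two outages. The rates are hypothetical, and the function assumes that each outage falls within a period that has a single average rate.

```python
# Hypothetical average transactions per minute, keyed by (day, hour).
AVERAGE_RATE = {
    ("Mon", 2): 2.0,     # quiet overnight period
    ("Fri", 17): 400.0,  # end-of-week peak
}


def missed_transactions(day: str, hour: int, duration_minutes: float) -> float:
    """Estimate unprocessed transactions as average rate times outage duration."""
    return AVERAGE_RATE[(day, hour)] * duration_minutes


overnight = missed_transactions("Mon", 2, 60)  # hour-long outage at 2 a.m. Monday
peak = missed_transactions("Fri", 17, 15)      # 15-minute outage at 5 p.m. Friday
```

With these assumed rates, the shorter Friday outage accounts for 50 times as many missed transactions as the hour-long Monday outage.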

What Causes Outages?

Before attempting to design an available application, it is important to understand the

potential causes of outage. While the specific causes of outage are many, it is possible

to place them into meaningful categories. This subsection introduces the five outage