Availability Guide for Application Design

Instrumenting an Application for Availability

Availability Guide for Application Design—525637-004

Writing Code to Handle Problem Errors

When using the technique of looping and waiting, it is preferable to try to recover a few times, over a period, but to reduce the frequency of retries if the first two or three retries fail.

3. Tell appropriate people about the error (get help and warn users).

4. Recognise when the problem is repaired.

5. Know what processes in the application need restarting, and where (perhaps based on the content of a configuration file or database).

6. Tell the users when they can resume work, and enable them to do so (the error or its handling might have locked out some users).
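
The looping-and-waiting advice above (retry a few times, then retry less often) might be sketched as follows. This is a minimal illustration in Python, not code from this guide: the names `retry_delays` and `recover_with_backoff`, the doubling factor, and the default timings are all assumptions chosen for the example, and `try_recover` stands in for whatever recovery action the application performs.

```python
import time
from typing import Callable, Iterator


def retry_delays(base: float = 1.0, quick_retries: int = 3,
                 factor: float = 2.0, max_delay: float = 60.0) -> Iterator[float]:
    """Yield wait times: a few retries at the base interval, then longer waits.

    All defaults are illustrative assumptions, not recommended values.
    """
    attempt = 0
    delay = base
    while True:
        yield delay
        attempt += 1
        # After the first few retries have failed, reduce the retry frequency.
        if attempt >= quick_retries:
            delay = min(delay * factor, max_delay)


def recover_with_backoff(try_recover: Callable[[], bool],
                         max_attempts: int = 8,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call try_recover until it returns True or max_attempts is reached."""
    delays = retry_delays()
    for attempt in range(max_attempts):
        if try_recover():
            return True
        # Do not wait after the final attempt; the caller must escalate instead.
        if attempt < max_attempts - 1:
            sleep(next(delays))
    return False
```

A caller would pass its own recovery function, for example `recover_with_backoff(reopen_file)`, where `reopen_file` is a hypothetical function returning `True` on success. A `False` result is the point at which steps 3 through 6 (telling people, recognising the repair, restarting, and resuming) take over.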

For example, if a disk or disk file has no more space, a temporary resource problem exists. The following questions need to be answered to properly cope with the situation:

• What file system error does the program look for to identify the problem?

• Should the whole application abort, or should the server process enter a waiting loop while checking to see if operations has fixed the problem (in this case, either by raising the maximum number of extents for the file or by running an online partition split [SQL only])?

  If the application enters a waiting loop, how does it handle the wait?

• What other processes are affected by the application?

  Does the program need a communication strategy which will send messages to other servers affected by this problem, and stop them, too?

  Will those other processes be able to stop themselves independently by using the same logic as the process encountering the problem?

• Will users be able to do useful work while the problem exists?

  Should the program tell the users about the error?

  Which users need to know about the error?

• How does the program tell the users?

  Should the process send a message to the requester program to display on the user screen that says “Transaction types … are currently unavailable - try later, please”?

  The program can do that in a terminal/Pathway TCP environment by using an unsolicited message to the TCP. However, if the program is a client/server PC application on a LAN, how can the program get the message to the PC?

  Does the program's client/server design need to allow for such messages to the end-user screen? Does it need a pop-up dialog error box?

• How does the program communicate with the operations/technical support people?
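
One possible answer to the first two questions (which error to look for, and whether to abort or wait) is sketched below. It is an illustration only and makes several assumptions: it is written in Python and uses the POSIX `errno.ENOSPC` code as a stand-in for whatever file-system error the target platform reports for a full disk or file, and `write_when_space_available`, `write`, `notify`, the polling interval, and the message texts are all hypothetical names and values.

```python
import errno
import time
from typing import Callable


def write_when_space_available(write: Callable[[], None],
                               notify: Callable[[str], None],
                               poll_seconds: float = 30.0,
                               max_polls: int = 20,
                               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try the write; if the disk is full, warn once and poll until it succeeds.

    Returns True when the write completes, False if the limit is reached.
    Any error other than "no space" is re-raised for the caller to handle.
    """
    notified = False
    for _ in range(max_polls):
        try:
            write()
        except OSError as exc:
            # Assumption: ENOSPC identifies the temporary resource problem.
            if exc.errno != errno.ENOSPC:
                raise
            if not notified:
                notify("disk full: waiting for operations to add space")
                notified = True
            sleep(poll_seconds)
        else:
            if notified:
                notify("space available again: resuming work")
            return True
    notify("giving up: disk still full")
    return False
```

The `notify` callback is where the remaining questions apply: it could write to an operations log, send a message to a requester, or both, depending on who needs to know. A fixed polling interval is used here for brevity; a process that waits a long time might instead lengthen the interval after the first few polls.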