TFDS Manual
Tandem Failure Data System (TFDS) Manual—520628-003
1 Introduction to TFDS
This section presents a brief overview of what TFDS is and what it can do for you.
Section 2, Using TFDS, provides more detail on how TFDS works.
Overview of TFDS
TFDS is a key automation and problem-management component of the NonStop
Kernel operating system. It automates most tasks associated with data collection and
resource recovery in the event of a software-related processor or subsystem failure.
TFDS monitors processors in HP NonStop servers for software failure notifications. It
collects and analyzes failure data against a local incident database to determine
whether the failure is a first-time occurrence or the result of a recurring defect.
You can configure TFDS to automatically initiate a processor dump and reload the
processor if the software failure is a processor halt. Automating these steps eliminates
the delay of waiting for manual intervention to collect the data and reload the
processor. The only manual operation, when necessary, involves retrieving the
magnetic tape containing the captured failure data and forwarding it to your service
provider. For known problems, TFDS suppresses redundant dumps, tracks the number
of occurrences, and notifies your service provider (if configured).
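The recurrence check described above can be pictured with a small sketch. The Python below is purely illustrative: it is not TFDS code and does not reflect TFDS configuration syntax. The IncidentDatabase class, the signature string, and the action names are all hypothetical. The sketch assumes each failure can be reduced to a signature that is looked up in a local store: a first-time failure triggers a dump, while a known problem only has its occurrence count incremented and, if so configured, a notification sent.

```python
from dataclasses import dataclass, field


@dataclass
class IncidentDatabase:
    """Hypothetical stand-in for a local incident database."""
    occurrences: dict = field(default_factory=dict)

    def record(self, signature: str) -> int:
        """Count one more occurrence of this failure and return the new total."""
        self.occurrences[signature] = self.occurrences.get(signature, 0) + 1
        return self.occurrences[signature]


def handle_failure(db: IncidentDatabase, signature: str,
                   notify_provider: bool = False) -> list:
    """Return the (hypothetical) actions taken for one failure notification."""
    count = db.record(signature)
    if count == 1:
        # First-time occurrence: capture the failure data.
        return ["dump"]
    # Known problem: skip the redundant dump and keep only the count.
    actions = ["suppress_dump"]
    if notify_provider:
        # Notification is optional, mirroring "if configured" in the text.
        actions.append("notify_provider")
    return actions
```

In this sketch, the same signature seen twice yields a dump on the first call and a suppressed dump on the second, with the occurrence count kept in the store.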
What TFDS Can Do for You
TFDS helps you manage software failures. For a processor halt, the TFDS data
collection and recovery services help you bring a halted processor back online as
quickly as possible. TFDS can also provide cost savings:
• Besides saving time, the suppression of redundant dumps also saves valuable disk
  space on your system.
• TFDS reduces overhead associated with failure data capture by minimizing
  downtime. Because the pertinent data is captured the first time a failure occurs,
  you rarely need to reproduce software failures to accommodate further data
  collection.
• TFDS is configurable to meet your specific needs. This flexibility includes dump
  placement, dump file analysis, and the ability to control the dumping and reloading
  of software-related failed processors. For example, you decide whether your
  priority is retrying a failed dump to gather failure data or reloading your processor
  as quickly as possible.
• In addition to monitoring and processing processor halts, TFDS also processes
  software instrumentation calls issued by many HP software subsystems to indicate
  internal software failure. Instrumentation calls are embedded in software to provide
  serviceability.
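The trade-off mentioned in the configurability item above (retrying a failed dump versus reloading the processor as quickly as possible) can be sketched as a single tunable limit. This is an illustration only: recover, attempt_dump, reload_processor, and max_dump_attempts are hypothetical names, not TFDS parameters or commands. The sketch assumes a dump attempt reports success or failure, and that the processor is reloaded once the dump succeeds or the attempt limit is reached.

```python
def recover(attempt_dump, reload_processor, max_dump_attempts=1):
    """Try the dump up to max_dump_attempts times, then reload.

    attempt_dump: callable returning True if the dump succeeded.
    reload_processor: callable that brings the processor back online.
    Returns True if failure data was captured, False otherwise.
    """
    dumped = False
    for _ in range(max_dump_attempts):
        if attempt_dump():
            dumped = True
            break  # Data captured; no need to retry.
    # Reload happens whether or not the dump succeeded.
    reload_processor()
    return dumped
```

A limit of 1 favors a fast reload at the risk of losing the failure data. A higher limit favors data capture at the cost of a longer outage.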