HP NetRAID-4M Configuration and Upgrade Guide (Release 5)
Chapter 9 Issues and Problem Resolution
Problem: When a running service such as the CLI, the SNMP agent, or the AIF daemon
dies, it may not exit gracefully, which triggers this problem. To receive and display
AIF information, a user program must register with the driver by calling an IOCTL.
When the user program no longer wants to receive and display AIF information, it
must unregister with the driver. While the user program is registered with the driver,
each AIF received by the driver is copied into a private kernel File Information Block
(FIB) and queued to be retrieved by the program. If the user program exits without
unregistering with the driver, its queued FIBs are not released and remain in memory.
If this happens repeatedly, the kernel may run out of memory, which can cause
either a hang or a panic.
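The register/unregister behavior described above can be illustrated with a small
model. The following is a minimal sketch in Python, not the driver's code: the class
and method names (AifDriverModel, AifListener, register, unregister, deliver_aif) are
hypothetical stand-ins for the driver's IOCTL interface, and the "FIBs" are plain byte
strings. It only shows why queued FIBs accumulate when a program exits without
unregistering, and how a program can make sure it always unregisters.

```python
from collections import deque


class AifDriverModel:
    """Toy stand-in for the driver's AIF bookkeeping (hypothetical, not the real driver)."""

    def __init__(self):
        self._queues = {}   # context id -> FIB copies queued for that program
        self._next_id = 1

    def register(self):
        """Model of the 'register' IOCTL: returns a context id for the caller."""
        ctx = self._next_id
        self._next_id += 1
        self._queues[ctx] = deque()
        return ctx

    def deliver_aif(self, payload):
        """Copy an incoming AIF into one queued FIB per registered program."""
        for queue in self._queues.values():
            queue.append(bytes(payload))

    def unregister(self, ctx):
        """Model of the 'unregister' IOCTL: drops everything still queued."""
        self._queues.pop(ctx, None)

    def queued_fibs(self):
        """Total number of FIB copies still held in (modelled) kernel memory."""
        return sum(len(queue) for queue in self._queues.values())


class AifListener:
    """Context manager that unregisters even if the body raises an exception."""

    def __init__(self, driver):
        self.driver = driver
        self.ctx = None

    def __enter__(self):
        self.ctx = self.driver.register()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.driver.unregister(self.ctx)
        return False  # do not swallow the exception


def run_demo():
    """Return (FIBs left by a careless program, FIBs left by a careful one)."""
    # Careless program: registers, then dies without unregistering.
    leaky_driver = AifDriverModel()
    leaky_driver.register()
    for _ in range(3):
        leaky_driver.deliver_aif(b"event")
    leaked = leaky_driver.queued_fibs()

    # Careful program: fails mid-run, but the context manager still unregisters.
    clean_driver = AifDriverModel()
    try:
        with AifListener(clean_driver):
            for _ in range(3):
                clean_driver.deliver_aif(b"event")
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass
    clean = clean_driver.queued_fibs()

    return leaked, clean
```

In this model, run_demo() leaves three queued FIBs behind in the first case and none
in the second. Cleanup inside the program cannot cover a process that is killed
outright, so this pattern complements the workaround below rather than replacing it.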
Workaround: Install the patch for the NetRAID-4M controller. A Readme file is
provided with the patch, which is distributed through the Red Hat Linux web site.
To install the patch, complete the following steps:
1. If you have not already done so, download the patch utility.
2. Copy the patch to your Linux source directory, typically /usr/src/linux.
3. Run the patch utility.
4. Rebuild the kernel in accordance with the kernel README at:
/usr/src/linux/README.
5. Reboot your system.
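Steps 2 and 3 can be sketched as a short command plan. This is an illustration under
stated assumptions, not the procedure from the patch Readme: the function name
plan_patch_commands and the patch file name netraid4m.patch are hypothetical, and the
-p1 strip level is an assumption that depends on how the patch was generated. The
function only builds the commands and does not execute anything. The kernel rebuild
(step 4) is left out on purpose because it should follow /usr/src/linux/README.

```python
import os


def plan_patch_commands(patch_file, source_dir="/usr/src/linux", strip_level=1):
    """Build (working_directory, argv) pairs for steps 2 and 3; nothing is run.

    patch_file  -- path to the downloaded patch (hypothetical name in the example)
    source_dir  -- Linux source directory, typically /usr/src/linux
    strip_level -- value for patch's -p option (assumed to be 1 here)
    """
    patch_name = os.path.basename(patch_file)
    strip_option = f"-p{strip_level}"
    return [
        # Step 2: copy the patch into the Linux source directory.
        (None, ["cp", patch_file, source_dir]),
        # Optional check: GNU patch's --dry-run reports problems without changing files.
        (source_dir, ["patch", strip_option, "--dry-run", "-i", patch_name]),
        # Step 3: apply the patch from inside the source directory.
        (source_dir, ["patch", strip_option, "-i", patch_name]),
    ]


if __name__ == "__main__":
    for cwd, argv in plan_patch_commands("/tmp/netraid4m.patch"):
        location = cwd if cwd else "(any directory)"
        print(f"in {location}: {' '.join(argv)}")
```

If the dry run reports rejected hunks, the strip level or the kernel source version is
probably not what the patch expects; consult the Readme provided with the patch
before applying it.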
36 Very short I/O timeout values can cause a Linux kernel panic
Problem: In normal operation, the NetRAID-4M controller can typically recover
from errors within 21 seconds. More serious SCSI problems (e.g., SCSI-related
“unexpected bus free” and “command timeout” errors, drive spin-up delays, and
failover to a bad drive) require extensive fault isolation, during which the OS cannot
access the drives. Depending on the severity of the SCSI device failure, the error
isolation process may take more than 40 seconds to complete, causing the Linux
kernel to panic. This error rarely occurs.
Workaround: Make sure the SCSI bus and the devices attached to it are in good
working order. An extended kernel I/O timeout will be implemented in a future
Linux release.
37 Exchange -1018 drive error
Problem: When you perform an online backup of the Exchange Server directory or
information store on a single-processor computer, a -1018 error
(JET_errReadVerifyFailure) may be returned even though the corresponding