6.3.2 HP StoreAll OS Release Notes

Segment evacuation

• The segment evacuator cannot evacuate segments in a READONLY, BROKEN, or UNAVAILABLE

state.

The ibrix_collect command

• If collection does not start after a node recovers from a system crash, check the

/var/crash/<timestamp> directory to determine whether the vmcore is complete. The

ibrix_collect command does not process incomplete vmcores. Also check

/usr/local/ibrix/log/ibrixcollect/kdumpcollect.log for any errors.

• If the status of a collection is Partially_collected, typically the management console service

was not running or there was not enough space available in the /local disk partition on the

node where the collection failed. To determine the exact cause of a failure during collection, see

the following logs:

◦ /usr/local/ibrix/log/fusionserver.log

◦ /usr/local/ibrix/log/ibrixcollect/ibrixcollect.py.log

• Email notifications do not include information about failed attempts to collect the cluster

configuration.

• In some situations, ibrix_collect successfully collects information after a system crash but

fails to report a completed collection. The information is available in the

/local/ibrixcollect/archive directory on one of the file serving nodes.
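
The manual checks described in this section can be gathered into one script. The following is a minimal sketch, not an HP-supplied tool: the paths are the ones named above, while the function names and the case-insensitive search for the word "error" are assumptions made for illustration. The script only lists vmcore files with their sizes; deciding whether a vmcore is complete is left to the operator.

```shell
#!/bin/sh
# Troubleshooting sketch for ibrix_collect. Paths come from the notes above;
# the function names and the "error" search pattern are assumptions.

# Print the number of lines in a log file that mention "error"
# (case-insensitive). Prints 0 when the file is missing or unreadable.
count_log_errors() {
    if [ -r "$1" ]; then
        grep -ci 'error' "$1" || true
    else
        echo 0
    fi
}

# List every vmcore under the crash directory with its size and timestamp.
# Judging whether a vmcore is complete is left to the operator.
list_vmcores() {
    crash_root=${1:-/var/crash}
    for core in "$crash_root"/*/vmcore*; do
        [ -e "$core" ] || continue
        ls -l "$core"
    done
}

# Print the free space, in KB, of the file system holding a directory.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

log_dir=/usr/local/ibrix/log
list_vmcores /var/crash
for log in "$log_dir/ibrixcollect/kdumpcollect.log" \
           "$log_dir/fusionserver.log" \
           "$log_dir/ibrixcollect/ibrixcollect.py.log"; do
    echo "$log: $(count_log_errors "$log") line(s) mentioning error"
done
if [ -d /local ]; then
    echo "/local free space (KB): $(free_kb /local)"
fi
```

Run the script on the node where the collection failed. A nonzero count only indicates that a log deserves a closer look; it does not identify the cause of the failure.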

Cluster component states

• Changes in file serving node status do not appear on the management console until 6 minutes

after an event. During this time, the node status may appear to be UP when it is actually DOWN

or UNKNOWN. Be sure to allow enough time for the management console to be updated before

verifying node status.

• Generally, when a vendorstorage component is marked Stale, the component has failed

and is not responding to monitoring. However, if all components are marked Stale, this implies

a failure of the monitoring subsystem. Temporary failures of this system can cause all monitored

components to toggle from Up to Stale and back to Up. Common causes of failures in the

monitoring system include:

◦ Reboot of a file serving node

◦ Network connectivity issues between the management console and a file serving node

◦ Resource exhaustion on a file serving node (CPU, RAM, I/O or network bandwidth)

While network connectivity and resource exhaustion issues should be investigated, they can also

occur during normal operation under heavy workloads. In these cases, you can reduce the frequency

at which vendorstorage components are monitored by using the following command:

ibrix_fm_tune -S -o vendorStorageHardwareStaleInterval=1800

The default value of vendorStorageHardwareStaleInterval is 900 seconds. A higher value reduces

the probability of all components toggling from Up to Stale and back to Up because of the

conditions listed above, but increases the time before an actual component failure is reported.
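
Because the interval is given in seconds, it can help to work in minutes and convert. The following sketch only prints the command for review and does not run it. The helper name stale_interval_cmd is hypothetical; the command, option, and parameter name are the ones shown above.

```shell
#!/bin/sh
# Print (do not run) the ibrix_fm_tune command for a stale interval given
# in minutes. stale_interval_cmd is an illustrative helper, not a product CLI.
stale_interval_cmd() {
    minutes=$1
    seconds=$((minutes * 60))
    echo "ibrix_fm_tune -S -o vendorStorageHardwareStaleInterval=$seconds"
}

stale_interval_cmd 15   # the default: 900 seconds
stale_interval_cmd 30   # the value used in the example above: 1800 seconds
```

Doubling the interval from 15 to 30 minutes makes brief monitoring interruptions less likely to mark every component Stale, at the cost of reporting a real component failure later.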
