Using HP Serviceguard for Linux with VMware

Timer. The “Timer” function of virtual machines is implemented in software, whereas physical
machines implement it in hardware. If too many virtual machines, along with their applications, run
on a single physical machine, timer interrupts can be missed, which in turn can cause Serviceguard
to miss heartbeats. In that case, the heartbeat interval needs to be increased on all clusters that
include that VM node. While there is no specific limit to the number of virtual machines that can run
on a physical machine, administrators should be aware of this behavior and set limits on a
case-by-case basis.
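As a minimal sketch, the heartbeat timing might be increased by editing the cluster ASCII
configuration file and reapplying it. Parameter names, units, and defaults vary by Serviceguard
release, and the cluster name and values shown here are placeholders only:

    # Export the current cluster configuration to an ASCII file
    cmgetconf -c cluster1 cluster1.ascii

    # In cluster1.ascii, raise the heartbeat interval and node timeout,
    # for example (values in microseconds on older releases):
    #   HEARTBEAT_INTERVAL   2000000
    #   NODE_TIMEOUT         6000000

    # Verify and apply the modified configuration
    cmcheckconf -C cluster1.ascii
    cmapplyconf -C cluster1.ascii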
Logical NICs. There can be practical difficulties in allocating more than three logical NICs to a
virtual machine. Serviceguard configuration requires at least two heartbeat links, so if the
applications need multiple data networks, you may have to share the logical NICs between data
and heartbeat traffic, as in the sketch below.
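A hedged excerpt of what such a shared configuration might look like in the cluster ASCII file
(the node name, interface names, and addresses are hypothetical; a subnet configured with
HEARTBEAT_IP can carry application data as well as heartbeats, while STATIONARY_IP marks a
data-only network):

    NODE_NAME vmnode1
      NETWORK_INTERFACE eth0
        HEARTBEAT_IP 192.168.1.10     # heartbeat link, also shared with data traffic
      NETWORK_INTERFACE eth1
        HEARTBEAT_IP 10.10.1.10       # second heartbeat link, also shared with data
      NETWORK_INTERFACE eth2
        STATIONARY_IP 172.16.1.10     # data-only network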
VMotion not supported. VMware VMotion allows a VM to move between physical platforms
while the VM is running, for example as part of scheduled maintenance. Depending on the
configuration of the VM, the time it takes for VMotion to complete varies, and this can lead to
unforeseen interactions. For this reason, HP does not support VMotion on VM nodes of a
Serviceguard cluster.
Multiple VM guests on an ESX host. You can create multiple VM guests on an ESX host and
create a Serviceguard cluster with one VM guest from each host. You can also add a physical server
to this cluster. One issue has been seen in this kind of setup: if a VM guest is powered on or off
while another VM guest on the same host is running cmapplyconf with a lock LUN, cmapplyconf
may fail, reporting that the physical lock LUN device cannot be used for the cluster lock because it
is not similar to another node’s lock LUN device. The chance of this failure occurring is very rare.
The workaround is to retry cmapplyconf, as sketched below.
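Because the failure is transient, a simple retry is enough. The following is a hypothetical wrapper
(the cluster file name is a placeholder, and the retry count and delay are arbitrary):

    # Retry cmapplyconf a few times if it fails with the lock LUN
    # mismatch message; the condition clears once the other VM guest
    # finishes powering on or off
    for attempt in 1 2 3; do
        cmapplyconf -C cluster1.ascii && break
        echo "cmapplyconf attempt $attempt failed; retrying..." >&2
        sleep 30
    done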
Using VMware NIC teaming to avoid Single Points of Failure
Because virtual machines use virtual network interfaces and HP does not support channel bonding
of virtual NICs, you should use VMware NIC teaming instead.
How NIC teaming works. VMware NIC teaming at the host level provides the same
functionality as Linux channel bonding, allowing you to group two or more physical NICs into a
single logical network device called a bond.⁴ Once a logical NIC is configured, the virtual machine
is not aware of the underlying physical NICs. Packets sent to the logical NIC are dispatched to one
of the physical NICs in the bond, and packets arriving at any of the physical NICs are automatically
directed to the appropriate logical NIC. NIC teaming can be configured in load-balancing or
fault-tolerant mode; you should use fault-tolerant mode. When NIC teaming is configured in
fault-tolerant mode and one of the underlying physical NICs fails or its cable is unplugged, ESX
Server detects the fault condition and automatically moves traffic to another NIC in the bond. This
eliminates any one physical NIC as a single point of failure and makes the overall network
connection fault-tolerant. This feature requires beacon monitoring [1], [3] to be enabled on both the
physical switch and the ESX Server NIC team. (Beacon monitoring⁵ allows ESX Server to test the
links in a bond by sending a packet from one adapter to the other adapters within a virtual switch
across the physical links.)
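As a sketch of how such a team might be assembled from the ESX service console, assuming
ESX 3.x-style esxcfg-* commands (command names and options vary by ESX release; vSwitch0,
vmnic0, and vmnic1 are placeholders):

    # List the physical NICs available on the host
    esxcfg-nics -l

    # Attach two physical NICs to the same virtual switch; together they
    # form the NIC team (bond) behind the virtual machine's logical NIC
    esxcfg-vswitch -L vmnic0 vSwitch0
    esxcfg-vswitch -L vmnic1 vSwitch0

    # Confirm that both uplinks are now part of the virtual switch
    esxcfg-vswitch -l

The fault-tolerant (failover) policy and beacon monitoring are then enabled in the virtual switch’s
NIC teaming properties, for example through the VI Client, rather than from the command line.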
⁴ A bond created by NIC teaming is different from a bond created by Linux channel bonding.
⁵ Turning on beacon monitoring has been reported to cause problems: administrators have completely lost access to the ESX
server, and a Cisco whitepaper recommends against turning it on, as it sometimes generates false failures.