Chapter 6 - High-Availability and Load-Balancing Architecture
The following are the high-availability and load-balancing options that Mercator
products currently offer:
- Server clustering
- Enterprise JavaBeans
Server Clustering
A server clustering approach provides high availability and failover but does not
address the load-balancing requirement. The Mercator products support failover in
the Microsoft Cluster Service and Sun Cluster 3 server environments.
Typically, in clustering approaches, there are two server nodes, both of which
have access to a shared disk array. One server node is the active, primary node,
and the second server is online but passive. The two servers exchange heartbeat
information, which is monitored by cluster management software. If the primary
server fails and stops sending its heartbeat, the monitoring software initiates
failover to the secondary server, which then becomes the active node.
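To illustrate the heartbeat exchange, the following is a minimal sketch in Java. It is not the actual cluster management software; the one-second heartbeat interval and five-second timeout are assumptions chosen only for the example.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

public class HeartbeatMonitor {
    // Illustrative values; real cluster software makes these configurable.
    private static final long INTERVAL_MS = 1_000;  // heartbeat period
    private static final long TIMEOUT_MS  = 5_000;  // silence before failover

    private final AtomicLong lastBeat = new AtomicLong(System.currentTimeMillis());
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

    /** Called each time a heartbeat arrives from the active (primary) node. */
    public void recordHeartbeat() {
        lastBeat.set(System.currentTimeMillis());
    }

    /** Run by the cluster manager: watch for missed heartbeats. */
    public void startMonitoring(Runnable failover) {
        scheduler.scheduleAtFixedRate(() -> {
            long silence = System.currentTimeMillis() - lastBeat.get();
            if (silence > TIMEOUT_MS) {
                // Primary is presumed dead: promote the passive node.
                failover.run();
                scheduler.shutdown();
            }
        }, INTERVAL_MS, INTERVAL_MS, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        HeartbeatMonitor monitor = new HeartbeatMonitor();
        monitor.startMonitoring(() ->
            System.out.println("Heartbeat lost: failing over to secondary node"));
        // In a real cluster the primary sends heartbeats over the network;
        // here we simply never call recordHeartbeat() again, simulating a crash.
    }
}
```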
The cluster management software lets you select which processes and services
are restarted on the secondary node if the primary node fails. Because the Event
Server is installed as a service on Windows or a daemon on UNIX, it is an ideal
candidate for restarting on the secondary node. The key to avoiding data loss
during failover is that both nodes point to a shared disk and that each map
thread is a single atomic transaction.
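To make the restart concrete, the following sketch shows the shape of the operation performed on the secondary node. The binary path and configuration flag are hypothetical placeholders; in practice Microsoft Cluster Service or Sun Cluster 3 performs this restart itself, and no user code is involved.

```java
import java.io.IOException;

public class FailoverRestart {
    // Hypothetical start command for the Event Server daemon; the real
    // command is whatever was registered with the cluster software.
    private static final String[] EVENT_SERVER_CMD = {
        "/opt/mercator/bin/eventserver", "-config", "/shared/disk/eventserver.cfg"
    };

    /** Invoked on the secondary node once the primary's heartbeat is lost. */
    public static void restartEventServer() throws IOException {
        // Both nodes mount the same shared disk array, so the restarted
        // daemon finds the map components, import/export directories, and
        // local message queues exactly where the primary left them.
        new ProcessBuilder(EVENT_SERVER_CMD).inheritIO().start();
    }
}
```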
The map components and any import and export directories, local message
queues, and so forth are installed on the shared disk array. Because each map
thread is transactional, if any map threads are running on the primary node when
it fails, each map rolls back the transaction it was processing at the time of the
failure, leaving all input sources (files, databases, message queues, and so on)
completely intact. When the system fails over to the secondary node and the
Mercator Event Server is started, the map picks up precisely from the point the
primary Event Server rolled back to, working with the components left intact on
the shared disk array.
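This rollback-and-resume behavior can be sketched with ordinary JDBC transaction semantics. The connection URL, table names, and SQL below are hypothetical stand-ins, not the Event Server's actual storage; the point is only that a failed unit of work rolls back in full, so the restarted node sees the inputs untouched and can pick up the same work.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class MapThread {
    /**
     * One map thread = one atomic transaction. If the node dies mid-run,
     * the open transaction rolls back and the input row remains available
     * to the Event Server restarted on the secondary node.
     */
    public static void runOnce(String jdbcUrl) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            conn.setAutoCommit(false);            // begin the transaction
            try (PreparedStatement take = conn.prepareStatement(
                     "SELECT id, payload FROM inbound_messages "
                     + "WHERE processed = FALSE LIMIT 1 FOR UPDATE");
                 ResultSet rs = take.executeQuery()) {
                if (rs.next()) {
                    long id = rs.getLong("id");
                    String out = transform(rs.getString("payload")); // run the map
                    try (PreparedStatement put = conn.prepareStatement(
                             "INSERT INTO outbound_messages (payload) VALUES (?)");
                         PreparedStatement mark = conn.prepareStatement(
                             "UPDATE inbound_messages SET processed = TRUE WHERE id = ?")) {
                        put.setString(1, out);
                        put.executeUpdate();
                        mark.setLong(1, id);
                        mark.executeUpdate();
                    }
                }
                conn.commit();                    // all or nothing
            } catch (SQLException e) {
                conn.rollback();                  // inputs left completely intact
                throw e;
            }
        }
    }

    private static String transform(String payload) {
        return payload.toUpperCase();             // stand-in for the real map logic
    }
}
```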