Designing Disaster Tolerant High Availability Clusters: Chapter 4, Designing a Continental Cluster

Designing a Disaster Tolerant Architecture for use with Continentalclusters


The Continentalclusters product operates as a configuration of two Serviceguard clusters, which can run a package on a Primary Cluster and a Recovery Cluster. The key elements that provide disaster tolerance in a continental cluster are:

  • Mutual Recovery

  • Serviceguard clusters

  • Data replication

  • Highly available WAN networking

  • Data center processes and procedures coordinated between the two cluster sites

You have a great deal of latitude in selecting these elements for your configuration. It is recommended that you record your choices on worksheets which can be reviewed and updated periodically.

Mutual Recovery

For mutual recovery, any cluster in a continental cluster may contain both primary and recovery packages for any recovery group. Recovery groups may be defined, for example, so that both cluster A and cluster B contain recovery packages. In this case, cmrecovercl can be run on cluster B to recover packages from cluster A, or on cluster A to recover packages from cluster B.
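
The mutual recovery arrangement can be pictured with a small model. The following Python sketch is illustrative only: the RecoveryGroup class, the function, and all cluster, group, and package names are hypothetical, and it does not reproduce the Continentalclusters configuration file format. It simply lists which recovery groups have their recovery package on a given cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryGroup:
    """Hypothetical model of one recovery group (not the real config format)."""
    name: str
    primary_cluster: str
    primary_package: str
    recovery_cluster: str
    recovery_package: str


# Invented example: each cluster is primary for one group and
# recovery for the other, which is the mutual recovery case.
GROUPS = [
    RecoveryGroup("sales_rg", "clusterA", "sales_pkg", "clusterB", "sales_bak"),
    RecoveryGroup("hr_rg", "clusterB", "hr_pkg", "clusterA", "hr_bak"),
]


def groups_recovered_on(cluster, groups=GROUPS):
    """Return the names of groups whose recovery package lives on `cluster`."""
    return [g.name for g in groups if g.recovery_cluster == cluster]


if __name__ == "__main__":
    for cluster in ("clusterA", "clusterB"):
        print(cluster, "recovers:", groups_recovered_on(cluster))
```

In this model, each cluster appears as the recovery cluster for the other cluster's primary package, so a recovery could be started from either side.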

Serviceguard Clusters

Each Serviceguard cluster in a continental cluster provides high availability for an application at the local level at that particular site. For optimal performance, and to assure adequate capacity on the recovery cluster, it is best to have similar hardware on both clusters. For example, if one cluster contains two V class HP 9000 systems with 1 GB of memory each, it is not a good idea to have a low-end K series HP 9000 with 128 MB of memory in the other cluster. Each cluster may have as many nodes as are permitted in an ordinary Serviceguard cluster, and each may be running packages that are not configured to fail over between clusters.

NOTE: Remember that when cluster A takes over for cluster B, it must run cluster B’s packages as well as any packages that it was already running on its own, unless you choose to stop those packages.
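
The capacity consideration in the note above can be checked with simple arithmetic during planning. The following Python sketch is a rough aid, not part of the product: the function name and all memory figures are hypothetical, and memory is used as the only capacity measure for simplicity.

```python
def can_absorb(capacity_mb, local_packages_mb, incoming_packages_mb,
               stop_local=False):
    """Return True if the surviving cluster has enough memory for the load.

    All figures are hypothetical planning estimates in megabytes.
    If stop_local is True, the cluster's own packages are assumed to be
    halted before the takeover, so only the incoming packages count.
    """
    local = 0 if stop_local else sum(local_packages_mb)
    return local + sum(incoming_packages_mb) <= capacity_mb


if __name__ == "__main__":
    # Cluster A: 2048 MB usable, already running two packages of its own.
    print(can_absorb(2048, [512, 768], [1024]))                   # False
    print(can_absorb(2048, [512, 768], [1024], stop_local=True))  # True
```

A real sizing exercise would also consider CPU, I/O, and network load, which this sketch leaves out.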

Data Replication

Data replication between the Serviceguard clusters extends the scope of high availability to the level of the continental cluster. You must select a technology for data replication between the two clusters. There are many possible choices, including:

  • Logical replication of databases

  • Logical replication of filesystems

  • Physical replication of data volumes via software

  • Physical replication of disk units via hardware

Table 4-3 “Data Replication and Continentalclusters” briefly describes how each data replication method affects a continental cluster environment. A detailed description of data replication can be found in Chapter 1, in the section titled “Disaster Tolerance and Recovery in a Serviceguard Cluster.” Specific guidelines for configuring the HP StorageWorks E Disk Array XP Series and the EMC Symmetrix Disk Array for physical data replication in a continental cluster are provided in Chapters 5 and 6. To use these data replication solutions in a Continentalclusters environment, you must purchase the Metrocluster/CA or Metrocluster/SRDF product separately.

White papers describing specific implementations are also available from http://docs.hp.com/hpux/ha.

If you choose a data replication technology that is not mentioned above and you want to do your own integration, follow the guidelines described in the section “Using the Recovery Command to Switch All Packages.” In that case, note the following:

  • The Continentalclusters product is responsible only for the following: Continentalclusters configuration and management commands, monitoring of remote cluster status, and notification of remote cluster events.

  • The Continentalclusters product provides a single recovery command that starts all recovery packages configured in the Continentalclusters configuration file. These recovery packages are ordinary Serviceguard packages. The Continentalclusters recovery command does not check the status of the devices and data used by the application before it starts the recovery package. The user is responsible for checking the state of the devices and the data before executing the Continentalclusters recovery command.
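
Because the recovery command does not verify devices or data, a site doing its own integration may want to gate the command behind its own checks. The following Python sketch shows one possible shape for such a gate. The function name and the check callables are hypothetical and site-specific; the only product command used is cmrecovercl, invoked here without options.

```python
import subprocess


def recover_if_safe(checks, run=subprocess.run):
    """Run cmrecovercl only if every pre-recovery check passes.

    `checks` maps a description to a zero-argument callable returning bool.
    The checks themselves are site-specific and hypothetical in this sketch.
    Returns the list of failed check descriptions (empty on success).
    """
    failed = [name for name, check in checks.items() if not check()]
    if not failed:
        # Invoked without options; consult the product documentation for
        # the options your release supports. stdin is not redirected, so
        # any prompt the command issues reaches the operator's terminal.
        run(["cmrecovercl"], check=True)
    return failed


if __name__ == "__main__":
    # Placeholder checks that always fail, so nothing is executed here.
    site_checks = {
        "replicated devices are writable": lambda: False,
        "data is consistent": lambda: False,
    }
    print("Failed checks:", recover_if_safe(site_checks))
```

Passing the runner as a parameter keeps the gate testable without executing the real command.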

Table 4-3 Data Replication and Continentalclusters

| Replication Type | How it Works | Continentalclusters Implication |
|---|---|---|
| Logical Database Replication | Transactions from the primary application are applied from logs to a copy of the application running on the recovery site. (This is an example only; there are other methods.) | Requirements on CPU and I/O may limit or prevent the Recovery Cluster from running additional applications. |
| Logical Filesystem Replication | Writes to the filesystem on the primary cluster are duplicated periodically on the recovery cluster. | CPU issues are the same as for Logical Database Replication. The software may have to be managed as a separate Serviceguard package. |
| Physical Replication of Data Volumes via Software | Disk mirroring via LVM software. Mirroring is done on disk links (SCSI or FibreChannel). | Requirements on CPU are less than for logical replication, but there is still some CPU use. Distance limits may make this type of replication inappropriate for Continentalclusters. |
| Physical Replication of Disk Units via Hardware | Replication of the LUNs within a disk array through dedicated hardware links such as EMC SRDF or Continuous Access XP. | Limited CPU requirements, but the requirement of synchronous data replication slows replication, and may impair application performance. Increased network speed and bandwidth can remedy this. |

Logical data replication may require the use of packages to handle software processes that copy data from one cluster to another or that apply transactions from logs that are copied from one cluster to another. Some methods of logical data replication may use a logical replication data sender package; others may use a logical replication data receiver package; some may use both. Logical replication data sender and receiver packages are configured as part of the data recovery group, as shown below under “Creating the Continentalclusters Configuration.”

Physical Data Replication using Special Environment files

For physical data replication, Continentalclusters has two pre-integrated solutions. One uses XP/CA and the other uses EMC/SRDF. To use these data replication solutions in a Continentalclusters environment, you must purchase the Metrocluster/CA or Metrocluster/SRDF product separately.

Physical data replication generally does not require the use of separate sender or receiver packages, but it does require specialized logic in the package control scripts to handle the transfer of control from the storage units of one cluster to the storage units at the other cluster. Packages that use physical data replication with the HP StorageWorks E Disk Array XP Series and Continuous Access XP should have an environment file created from the template /opt/cmcluster/toolkit/SGCA/xpca.env; packages that use physical data replication with EMC Symmetrix and the SRDF facility should have one created from the template /opt/cmcluster/toolkit/SGSRDF/srdf.env.

Each template is provided with the corresponding separately purchased product, Metrocluster/CA or Metrocluster/SRDF.

Details on configuring the special Continentalclusters control scripts are in Chapters 5 and 6. Some additional notes are provided below.

Highly Available Wide Area Networking

Disaster tolerant networking for Continentalclusters is directly tied to the data replication method. In addition to the reliability of the redundant lines connecting the remote nodes, you also need to consider what bandwidth you need to support the data replication method you have chosen. A continental cluster that handles a high number of write transactions per minute will not only require a highly available network, but also one with a large amount of bandwidth. Details on highly available networking can be found in Chapter 1, in the section titled “Disaster Tolerant Architecture Guidelines.” White papers describing specific implementations are also available from http://docs.hp.com.
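
A first estimate of the bandwidth needed for replication can be derived from the write rate. The following Python sketch is a back-of-the-envelope calculation only: the function name, the sample figures, and the overhead multiplier are hypothetical placeholders, not measured values or product guidance.

```python
def required_mbps(writes_per_minute, avg_write_bytes, overhead=1.2):
    """Estimate sustained replication bandwidth in megabits per second.

    `overhead` is a hypothetical multiplier for protocol and replication
    framing; 1.2 is a placeholder, not a measured value.
    """
    bytes_per_second = writes_per_minute * avg_write_bytes / 60
    return bytes_per_second * 8 * overhead / 1_000_000


if __name__ == "__main__":
    # 60,000 writes per minute averaging 8 KB each.
    print(round(required_mbps(60_000, 8192), 1), "Mbit/s")
```

The result is a sustained average; peak write bursts and any resynchronization after a link outage would need additional headroom.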

Data Center Processes

Continentalclusters provides the cmrecovercl command that fails over all applications on the primary cluster that are protected by Continentalclusters. However, application failover also requires well-defined processes for the two sites. These processes and procedures should be written down and made available at both sites.

Some considerations for site management are as follows:

  • Who notifies whom for the various events: configuration changes, alerts, alarms?

  • What communication methods should be used? Email? Phone? Beeper? Multiple methods?

  • Who has authority to perform what sort of configuration modifications? Can the administrator at one site log in to the nodes on the remote site? If so, what permissions would be set?

  • How often is a practice failover done?

  • Is there a documented test plan?

  • What is the process for tracking changes made to the primary cluster?

Continentalclusters Worksheets

Planning is essential to creating a robust continental cluster environment. It is recommended that you record the details of your configuration on planning worksheets. These worksheets can be filled in partially before configuration begins, and then completed as you build the continental cluster. Both the site with the Primary Cluster and the site with the Recovery Cluster should have a copy of these worksheets to help coordinate initial configuration and subsequent changes. Complete the worksheets in the following sections for each pair of clusters that will be monitored by the Continentalclusters monitor.

Data Center Worksheet

The following worksheet will help you describe your specific data center configuration. Fill out the worksheet and keep it for future reference.

    =======================================================================

    Continental Cluster Name: _____________________________________________

    =======================================================================

    Primary Data Center Information:

         Primary Cluster Name: ____________________________________________

         Data Center Name and Location: ___________________________________

         Main Contact: ____________________________________________________

         Phone Number: ____________________________________________________

         Beeper: __________________________________________________________

         Email Address: ___________________________________________________

         Node Names: ______________________________________________________

         Monitor Package Name: __ccmonpkg__________________________________

         Monitor Interval: __60 seconds____________________________________

    =======================================================================

    Recovery Data Center Information:

         Recovery Cluster Name: ___________________________________________

         Data Center Name and Location: ___________________________________

         Main Contact: ____________________________________________________

         Phone Number: ____________________________________________________

         Beeper: __________________________________________________________

         Email Address: ___________________________________________________

         Node Names: ______________________________________________________

         Monitor Package Name: __ccmonpkg__________________________________

         Monitor Interval: __60 seconds____________________________________

Recovery Group Worksheet

The following worksheet will help you organize and record your specific recovery groups. Fill out the worksheet and keep it for future reference.

    =======================================================================

    Continental Cluster Name: _____________________________________________

    =======================================================================

    Recovery Group Data:

         Recovery Group Name: _____________________________________________

         Primary Cluster/Package Name:_____________________________________

         Data Sender Cluster/Package Name:_________________________________

         Recovery Cluster/Package Name:____________________________________

         Data Receiver Cluster/Package Name:_______________________________

    Recovery Group Data:

         Recovery Group Name: _____________________________________________

         Primary Cluster/Package Name:_____________________________________

         Data Sender Cluster/Package Name:_________________________________

         Recovery Cluster/Package Name:____________________________________

         Data Receiver Cluster/Package Name:_______________________________

    Recovery Group Data:

         Recovery Group Name: _____________________________________________

         Primary Cluster/Package Name:_____________________________________

         Data Sender Cluster/Package Name:_________________________________

         Recovery Cluster/Package Name:____________________________________

         Data Receiver Cluster/Package Name:_______________________________





Cluster Event Worksheet

The following worksheet will help you organize and record the cluster events you wish to track. Fill out a worksheet for each primary or recovery cluster that you wish to monitor. You must monitor each cluster containing a recovery package.


    Continental Cluster Name: _____________________________________________

    =======================================================================

    Cluster Event Information:

         Cluster Name: ____________________________________________________

         Monitoring Cluster: ______________________________________________

         UNREACHABLE:

         Alert Interval:___________________________________________________

         Alarm Interval:___________________________________________________

         Notification:_____________________________________________________

         Notification:_____________________________________________________

         Notification:_____________________________________________________


         DOWN:

         Alert Interval:___________________________________________________

         Notification:_____________________________________________________

         Notification:_____________________________________________________

         UP:

         Alert Interval:___________________________________________________

         Notification:_____________________________________________________

         Notification:_____________________________________________________

         ERROR:

         Alert Interval:___________________________________________________

         Notification:_____________________________________________________

         Notification:_____________________________________________________
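
The entries on the Cluster Event Worksheet can also be captured as data for review. The following Python sketch assumes that a notification becomes due once a cluster has remained in a given state for at least the corresponding interval; that reading, the EventRule class, and the sample intervals are assumptions made for illustration and should be checked against the product documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EventRule:
    """One worksheet entry: a cluster state and its notification intervals."""
    state: str                           # "UNREACHABLE", "DOWN", "UP", "ERROR"
    alert_after_s: int                   # alert interval, in seconds
    alarm_after_s: Optional[int] = None  # alarm interval, if one is recorded


def notification_level(rule, seconds_in_state):
    """Return "alarm", "alert", or None under the assumed semantics."""
    if rule.alarm_after_s is not None and seconds_in_state >= rule.alarm_after_s:
        return "alarm"
    if seconds_in_state >= rule.alert_after_s:
        return "alert"
    return None


if __name__ == "__main__":
    unreachable = EventRule("UNREACHABLE", alert_after_s=300, alarm_after_s=600)
    for elapsed in (100, 300, 600):
        print(elapsed, notification_level(unreachable, elapsed))
```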