After a failover to a cluster occurs, restoring disaster tolerance
presents several challenges, the most significant of which are:

- Restoring the failed cluster

  Depending on the nature of the disaster, you may need to create
  a new cluster, or you may be able to restore the failed cluster. Steps
  for each scenario are discussed in the following sections.

  Before starting up the new or the failed cluster, make sure
  the AUTO_RUN flag for all of the Continentalclusters application packages
  is disabled. This prevents the packages from starting unexpectedly
  with the cluster.

- Resynchronizing the data

  To resynchronize the data, you either restore the data to
  the cluster and continue with the same data replication procedure,
  or set up data replication to function in the other direction.
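
As a pre-start check for the AUTO_RUN requirement, a minimal sketch along the following lines can list package configuration files in which AUTO_RUN is still enabled. It assumes the package ASCII configuration files sit in one directory, end in .config, and set the flag on a line such as AUTO_RUN YES (or auto_run yes). The function name and the directory in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: list package configuration files in which AUTO_RUN is still enabled.
# Assumptions (not from the manual): files are named *.config and contain a
# line such as "AUTO_RUN YES" or "auto_run yes".

find_autorun_enabled() {
    conf_dir=$1
    # -l prints only the names of matching files, -i ignores case,
    # -E enables extended regular expressions.
    # Commented-out lines (starting with '#') do not match the pattern.
    grep -liE '^[[:space:]]*auto_run[[:space:]]+yes' "$conf_dir"/*.config 2>/dev/null
}

# Usage (hypothetical directory):
#   find_autorun_enabled /etc/cmcluster/pkgs
```

Any file the function prints should be edited and re-applied before the cluster is started. An empty result only means that no file in that directory matched the assumed syntax.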
The following sections briefly outline some scenarios for
restoring disaster tolerance.

Restore Clusters to their Original Roles

If the disaster did not destroy the cluster, you can return
both clusters in a recovery pair to their original roles. To
do this:

1. Make sure that both clusters are up and running, with the
   recovery packages continuing to run on the surviving cluster.

2. On each cluster, stop the Continentalclusters monitor package
   if it is still running:

   # cmhaltpkg ccmonpkg

3. Compare the clusters to make sure their configurations
   are consistent. Correct any inconsistencies.

4. For each recovery group where the restored cluster will
   run the primary package:

   a. Synchronize the data from the disks on the surviving cluster
      to the disks on the restored cluster. This may be time-consuming.

   b. Halt the recovered application on the surviving cluster
      if necessary, and start it on the restored cluster.

   To keep application downtime to a minimum, start
   the primary package on the cluster before resynchronizing the
   data of the next recovery group.

5. Restart the monitor using the following command on each cluster:

   # cmrunpkg ccmonpkg

   Alternatively, if you have modified the monitoring package configuration,
   use the following sequence on each cluster to apply the new configuration
   and start the monitor:

   # cmapplyconf -P ccmonpkg.config
   # cmmodpkg -e ccmonpkg

6. View the status of the Continentalclusters configuration:

   # cmviewconcl

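
The per-group ordering in the procedure above (resynchronize one recovery group, start its primary package, then move to the next group) can be sketched as a dry-run plan. This is only an illustration: sync_data is a hypothetical placeholder for the site-specific resynchronization method, "target cluster" stands for the cluster that will run the primary package, and the package names in the example are invented.

```shell
#!/bin/sh
# Sketch: print the switchback order for a list of recovery groups.
# Dry run only: the function prints a plan and executes nothing.
# "sync_data" is a hypothetical placeholder for the site-specific
# resynchronization step; cmhaltpkg and cmrunpkg are Serviceguard commands.

plan_switchback() {
    # Arguments come in pairs: <recovery package> <primary package>,
    # one pair per recovery group, in the order the groups are processed.
    while [ "$#" -ge 2 ]; do
        recovery_pkg=$1
        primary_pkg=$2
        shift 2
        echo "surviving cluster: sync_data $recovery_pkg"
        echo "surviving cluster: cmhaltpkg $recovery_pkg"
        echo "target cluster:    cmrunpkg $primary_pkg"
    done
}

# Example with hypothetical package names:
plan_switchback pkgX_rec pkgX pkgY_rec pkgY
```

Processing the groups one at a time means the first application is back in service on its primary cluster while the data for later groups is still being copied.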
Primary Packages Remaining on the Surviving Cluster

Configure the failed cluster in a recovery pair as a recovery-only
cluster and the surviving cluster as a primary-only cluster. This
minimizes the downtime involved with moving the applications back
to the restored cluster. It also assumes that the surviving cluster
has sufficient resources to run all critical applications
indefinitely.

NOTE: In a multiple recovery pair scenario, where more than
one primary cluster is configured to share the same recovery cluster,
do not use the following procedure to switch the roles of the failed
cluster and the surviving cluster.

Use the following procedure:

1. Halt the monitor packages. Issue the following command on each cluster:

   # cmhaltpkg ccmonpkg

2. Edit the Continentalclusters ASCII configuration file. You will need
   to change the definitions of monitoring clusters, and switch the names
   of primary and recovery packages in the definitions of recovery groups.
   You may also need to re-create data sender and data receiver packages.

3. Check and apply the Continentalclusters configuration:

   # cmcheckconcl -v -C cmconcl.config
   # cmapplyconcl -v -C cmconcl.config

4. Restart the monitor packages. Issue the following command on each cluster:

   # cmmodpkg -e ccmonpkg

5. View the status of the Continentalclusters configuration:

   # cmviewconcl

Before applying the edited configuration, prepare the data storage
associated with each cluster to match its new role. In addition,
change the data replication direction so that data is mirrored
from the new primary cluster to the new recovery cluster.

Primary Packages Remaining on the Surviving Cluster using cmswitchconcl

Continentalclusters provides the cmswitchconcl command to facilitate
steps two and three described in the section “Primary Packages Remaining
on the Surviving Cluster”. The cmswitchconcl command switches the roles
of the primary and recovery packages of the Continentalclusters recovery
groups for which the specified cluster is defined as the primary cluster.

Do not use the cmswitchconcl command in a multiple recovery pair
configuration where more than one primary cluster shares the same
recovery cluster. In such a configuration, the command fails.

To restore disaster tolerance with cmswitchconcl while continuing to run
the packages on the surviving cluster, use the following procedure:

1. Halt the monitor package on each cluster:

   # cmhaltpkg ccmonpkg

2. Run:

   # cmswitchconcl \
     -C currentContinentalclustersConfigFileName \
     -c oldPrimaryClusterName \
     [-a] [-F NewContinentalclustersConfigFileName]

   This command switches the roles of the primary and recovery packages
   of the Continentalclusters recovery groups for which “oldPrimaryClusterName”
   is defined as the primary cluster.

   The default values of monitoring package name (ccmonpkg), monitoring
   interval (60 seconds), and notification scheme (SYSLOG) with notification
   delay (0 seconds) are added for cluster “oldPrimaryClusterName”,
   which will serve as the recovery-only cluster. To change the default
   values, edit the file “NewContinentalclustersConfigFileName” if -F is
   specified, or the file “currentContinentalclustersConfigFileName” if
   -F is not specified. If you need to edit the new configuration file,
   do not use the -a option, because with -a the new configuration
   is applied automatically.

3. If the -a option was specified with cmswitchconcl in step 2, skip this
   step. Otherwise, manually apply the new Continentalclusters configuration:

   # cmapplyconcl -v -C NewContinentalclustersConfigFileName
     (if -F is specified in step 2)

   # cmapplyconcl -v -C currentContinentalclustersConfigFileName
     (if -F is not specified in step 2)

4. Restart the monitor packages. Issue the following command on each cluster:

   # cmmodpkg -e ccmonpkg

5. View the status of the Continentalclusters configuration:

   # cmviewconcl

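
The relationship between the -a and -F options and the manual apply step can be sketched as a small dry-run planner. It only prints the command sequence described in the procedure above and runs nothing. The function name and its argument convention are illustrative assumptions, and the file and cluster names passed to it are placeholders.

```shell
#!/bin/sh
# Sketch: print the command sequence for a cluster-based role switch.
# Dry run only: nothing is executed. The function name and argument
# order are illustrative, not part of Continentalclusters.

plan_role_switch() {
    current=$1      # current Continentalclusters configuration file
    old_primary=$2  # cluster that was primary before the failover
    auto_apply=$3   # "yes" to pass -a, anything else to apply manually
    new_file=$4     # optional output file for -F (empty if not used)

    echo "cmhaltpkg ccmonpkg"        # run on each cluster

    switch="cmswitchconcl -C $current -c $old_primary"
    if [ "$auto_apply" = "yes" ]; then
        switch="$switch -a"
    fi
    if [ -n "$new_file" ]; then
        switch="$switch -F $new_file"
    fi
    echo "$switch"

    # Without -a the configuration is applied manually: from the -F file
    # if one was written, otherwise from the current file.
    if [ "$auto_apply" != "yes" ]; then
        echo "cmapplyconcl -v -C ${new_file:-$current}"
    fi

    echo "cmmodpkg -e ccmonpkg"      # run on each cluster
    echo "cmviewconcl"
}

# Example with placeholder names: edit new.config by hand, then apply it.
plan_role_switch cmconcl.config ClusterA no new.config
```

Leaving out -a, as in the example call, keeps a window open for reviewing or editing the generated file before it is applied.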
The cmswitchconcl command can also be used to switch the package roles
of a single recovery group. If only a subset of the primary packages will
remain running on the surviving (recovery) cluster, use the -g option of
the cmswitchconcl command. This option reconfigures the roles of the packages
of a recovery group and helps retain recovery protection after a
failover. The -g option (recovery group based role switch reconfiguration)
is used in the same way as the -c option (cluster based role switch
reconfiguration). However, the -c and -g options of the cmswitchconcl
command are mutually exclusive.

   # cmswitchconcl \
     -C currentContinentalclustersConfigFileName \
     -g RecoveryGroupName \
     [-a] [-F NewContinentalclustersConfigFileName]

The following is a sample of input and output files for running:

   # cmswitchconcl -C sample.input -c ClusterA -F sample.output

sample.input
============

    ### Section 1. Cluster Information
    CONTINENTAL_CLUSTER_NAME Sample_CC_Cluster
    CLUSTER_NAME ClusterA
      CLUSTER_DOMAIN cup.hp.com
      NODE_NAME node1
      NODE_NAME node2
      MONITOR_PACKAGE_NAME ccmonpkg
    CLUSTER_NAME ClusterB
      CLUSTER_DOMAIN cup.hp.com
      NODE_NAME node3
      NODE_NAME node4
      MONITOR_PACKAGE_NAME ccmonpkg
      MONITOR_INTERVAL 60 SECONDS

    ### Section 2. Recovery Groups
    RECOVERY_GROUP_NAME RG1
      PRIMARY_PACKAGE ClusterA/pkgX
      RECOVERY_PACKAGE ClusterB/pkgX'
    RECOVERY_GROUP_NAME RG2
      PRIMARY_PACKAGE ClusterA/pkgY
      RECOVERY_PACKAGE ClusterB/pkgY'
      DATA_RECEIVER_PACKAGE ClusterB/pkgR1
    RECOVERY_GROUP_NAME RG3
      PRIMARY_PACKAGE ClusterB/pkgZ
      RECOVERY_PACKAGE ClusterA/pkgZ'
    RECOVERY_GROUP_NAME RG4
      PRIMARY_PACKAGE ClusterB/pkgW
      RECOVERY_PACKAGE ClusterA/pkgW'
      DATA_RECEIVER_PACKAGE ClusterA/pkgR2

    ### Section 3. Monitoring Definitions
    CLUSTER_EVENT ClusterA/DOWN
      MONITORING_CLUSTER ClusterB
      CLUSTER_ALERT 60 SECONDS
        NOTIFICATION TEXTLOG /home/user/logs/events.log "CC alert: DOWN"
        NOTIFICATION SYSLOG "CC alert: DOWN"
      CLUSTER_ALARM 90 SECONDS
        NOTIFICATION TEXTLOG /home/users/logs/events.log "CC alarm: DOWN"
        NOTIFICATION SYSLOG "CC alarm: DOWN"

sample.output
=============

    ### Section 1. Cluster Information
    CONTINENTAL_CLUSTER_NAME Sample_CC_Cluster
    CLUSTER_NAME ClusterA
      CLUSTER_DOMAIN cup.hp.com
      NODE_NAME node1
      NODE_NAME node2
      MONITOR_PACKAGE_NAME ccmonpkg
      MONITOR_INTERVAL 60 SECONDS
    CLUSTER_NAME ClusterB
      CLUSTER_DOMAIN cup.hp.com
      NODE_NAME node3
      NODE_NAME node4

    ### Section 2. Recovery Groups
    RECOVERY_GROUP_NAME RG1
      PRIMARY_PACKAGE ClusterB/pkgX'
      RECOVERY_PACKAGE ClusterA/pkgX
    RECOVERY_GROUP_NAME RG2
      PRIMARY_PACKAGE ClusterB/pkgY'
      RECOVERY_PACKAGE ClusterA/pkgY
      DATA_RECEIVER_PACKAGE ClusterA/pkgR1
    RECOVERY_GROUP_NAME RG3
      PRIMARY_PACKAGE ClusterB/pkgZ
      RECOVERY_PACKAGE ClusterA/pkgZ'
    RECOVERY_GROUP_NAME RG4
      PRIMARY_PACKAGE ClusterB/pkgW
      RECOVERY_PACKAGE ClusterA/pkgW'
      DATA_RECEIVER_PACKAGE ClusterA/pkgR2

    ### Section 3. Monitoring Definitions
    CLUSTER_EVENT ClusterB/DOWN
      MONITORING_CLUSTER ClusterA
      CLUSTER_ALERT 0 MINUTES
        NOTIFICATION SYSLOG "CC alert: DOWN"
      CLUSTER_ALARM 0 MINUTES
        NOTIFICATION SYSLOG "CC alarm: DOWN"
    CLUSTER_EVENT ClusterB/UNREACHABLE
      MONITORING_CLUSTER ClusterA
      CLUSTER_ALERT 0 MINUTES
        NOTIFICATION SYSLOG "CC alert: UNREACHABLE"
      CLUSTER_ALARM 0 MINUTES
        NOTIFICATION SYSLOG "CC alarm: UNREACHABLE"
    CLUSTER_EVENT ClusterB/ERROR
      MONITORING_CLUSTER ClusterA
      CLUSTER_ALERT 0 MINUTES
        NOTIFICATION SYSLOG "CC alert: ERROR"
    CLUSTER_EVENT ClusterB/UP
      MONITORING_CLUSTER ClusterA
      CLUSTER_ALERT 0 MINUTES
        NOTIFICATION SYSLOG "CC alert: UP"

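
To make the Section 2 change in the sample concrete, the following awk sketch swaps PRIMARY_PACKAGE and RECOVERY_PACKAGE for every recovery group whose primary package is on the old primary cluster. It illustrates the transformation only and is not how cmswitchconcl is implemented. It assumes one keyword per line with RECOVERY_PACKAGE directly following PRIMARY_PACKAGE, it does not preserve indentation on swapped lines, and it leaves the data sender, data receiver, and monitoring definitions untouched. The function name swap_roles is hypothetical.

```shell
#!/bin/sh
# Sketch: swap primary and recovery packages for groups whose primary
# package runs on the given cluster. Illustration only; assumes the
# RECOVERY_PACKAGE line directly follows the PRIMARY_PACKAGE line.

swap_roles() {
    old_primary=$1   # cluster name, e.g. ClusterA
    conf_file=$2     # Continentalclusters ASCII configuration file
    awk -v old="$old_primary" '
        # Hold the primary line until the matching recovery line arrives.
        $1 == "PRIMARY_PACKAGE" { primary_line = $0; primary = $2; next }

        $1 == "RECOVERY_PACKAGE" && primary != "" {
            split(primary, parts, "/")        # parts[1] is the cluster name
            if (parts[1] == old) {
                print "PRIMARY_PACKAGE " $2   # old recovery becomes primary
                print "RECOVERY_PACKAGE " primary
            } else {
                print primary_line            # group not affected
                print $0
            }
            primary = ""
            next
        }

        { print }                             # all other lines pass through
    ' "$conf_file"
}

# Usage (writes the result to standard output):
#   swap_roles ClusterA sample.input
```

Applied to the sample input, this rewrites RG1 and RG2 and leaves RG3 and RG4 as they are, matching the package roles in the sample output.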
Newly Created Cluster Will Run Primary Packages

After you create a new cluster to replace the damaged cluster,
you may choose to restore the critical applications to the new cluster and
restore the other cluster to its role as a backup for the recovered
packages.

1. Configure the new cluster as a Serviceguard cluster.
   Use the cmviewcl command on the surviving cluster and compare the
   results to the new cluster configuration. Correct any inconsistencies
   on the new cluster.

2. Halt the monitor package on the surviving recovery cluster:

   # cmhaltpkg ccmonpkg

3. Edit the Continentalclusters configuration file
   to replace the data from the old failed cluster with data from the
   new cluster.

4. Check and apply the Continentalclusters configuration:

   # cmcheckconcl -v -C cmconcl.config
   # cmapplyconcl -v -C cmconcl.config

5. For each recovery group where the new cluster will
   run the primary package:

   a. Synchronize the data from the disks on the surviving recovery
      cluster to the disks on the new cluster. This may be time-consuming.

   b. Halt the application on the surviving recovery cluster
      if necessary, and start it on the new cluster.

   To keep application downtime to a minimum, start
   the primary package on the cluster before resynchronizing the
   data of the next recovery group.

6. If the new cluster acts as a recovery cluster for any recovery group,
   create a monitor package for the new cluster. Use the following command
   to apply the configuration of the new monitor package:

   # cmapplyconf -P ccmonpkg.config

7. Restart the monitor package on the surviving cluster:

   # cmrunpkg ccmonpkg

8. View the status of the Continentalclusters configuration:

   # cmviewconcl
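
The check-and-apply step in the procedure above can be wrapped so that the configuration is applied only when the check succeeds. This is a minimal sketch that assumes cmcheckconcl exits with a non-zero status when it finds a problem. The function name check_and_apply is hypothetical.

```shell
#!/bin/sh
# Sketch: apply a Continentalclusters configuration only if it verifies.
# Assumption: cmcheckconcl returns a non-zero exit status on failure.

check_and_apply() {
    conf_file=$1
    if cmcheckconcl -v -C "$conf_file"; then
        cmapplyconcl -v -C "$conf_file"
    else
        echo "cmcheckconcl reported problems in $conf_file; not applying" >&2
        return 1
    fi
}

# Usage:
#   check_and_apply cmconcl.config
```

Stopping on a failed check keeps a partially edited file, for example one that still names nodes of the destroyed cluster, from being distributed.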