A cluster or its component nodes may be in several different
states at different points in time. Status information for clusters,
packages and other cluster elements is shown in the output of the cmviewcl command and in some displays in Serviceguard Manager.
This section explains the meaning of many of the common conditions
the cluster or package may be in.
Information about cluster status is stored in the status database,
which is maintained on each individual node in the cluster. You
can display information contained in this database by issuing the cmviewcl command.
You can also have the output formatted as it was in a specific
earlier release by using the -r option to indicate the release
format you want.
Types of Cluster and Package States
A cluster or its component nodes may be in several different
states at different points in time. The following sections describe
many of the common conditions the cluster or package may be in.
The status of a cluster may be one of the following:
- Up. At least one node has a running cluster daemon, and reconfiguration is not taking place.
- Down. No cluster daemons are running on any cluster node.
- Starting. The cluster is in the process of determining its active membership. At least one cluster daemon is running.
- Unknown. The node on which the cmviewcl command is issued cannot communicate with other nodes in the cluster.
The status of a node is either up (active as
a member of the cluster) or down (inactive in
the cluster), depending on whether its cluster daemon is running
or not. Note that a node might be down from the cluster perspective,
but still up and running HP-UX.
A node may also be in one of the following states:
- Failed. A node never sees itself in this state. Other active members of the cluster will see a node in this state if that node was in an active cluster, but is no longer, and is not halted.
- Reforming. A node is in this state when the cluster is re-forming. The node is currently running the protocols which ensure that all nodes agree to the new membership of an active cluster. If agreement is reached, the status database is updated to reflect the new cluster membership.
- Running. A node in this state has completed all required activity for the last re-formation and is operating normally.
- Halted. A node never sees itself in this state. Other nodes will see it in this state after the node has gracefully left the active cluster, for instance with a cmhaltnode command.
- Unknown. A node never sees itself in this state. Other nodes assign a node this state if it has never been an active cluster member.
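The distinction between states a node can report for itself and states that only other members assign to it can be modeled as a small lookup. The following is a minimal sketch, not part of Serviceguard; the names NodeState, REMOTE_ONLY_STATES, can_self_report and node_status are hypothetical, and the mapping from state to up/down status is an assumption based on whether the cluster daemon would be running.

```python
from enum import Enum


class NodeState(Enum):
    """Node states as listed in the text (hypothetical model, not a Serviceguard API)."""
    FAILED = "failed"
    REFORMING = "reforming"
    RUNNING = "running"
    HALTED = "halted"
    UNKNOWN = "unknown"


# Per the descriptions above, a node never sees itself in these states;
# they are only assigned to it by other cluster members.
REMOTE_ONLY_STATES = {NodeState.FAILED, NodeState.HALTED, NodeState.UNKNOWN}


def can_self_report(state):
    """Return True if a node could observe this state about itself."""
    return state not in REMOTE_ONLY_STATES


def node_status(state):
    """Map a state to up/down.

    Assumption: only reforming and running nodes have an active cluster daemon.
    """
    return "up" if state in (NodeState.REFORMING, NodeState.RUNNING) else "down"
```

For example, node_status(NodeState.HALTED) gives "down", which matches the "down halted" pairing shown for a halted node in cmviewcl output.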
The status of a package can be one of the following:
- Up. The package control script is active.
- Down. The package control script is not active.
The state of the package can be one of the following:
- Starting. The start instructions in the control script are being run.
- Running. Services are active and being monitored.
- Halting. The halt instructions in the control script are being run.
Package Switching Attributes
Packages also have the following switching attributes:
- Package Switching. Enabled means that the package can switch to another node in the event of failure.
- Switching Enabled for a Node. Enabled means that the package can switch to the referenced node. Disabled means that the package cannot switch to the specified node until the node is enabled for the package using the cmmodpkg command.
Every package is marked Enabled or Disabled for each node
that is either a primary or adoptive node for the package.
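The way the two switching attributes combine can be sketched as a simple filter. This is an illustration under stated assumptions, not Serviceguard logic: the function name eligible_switch_targets and its parameters are hypothetical, and the extra requirement that a target node be up is an assumption added for the example.

```python
def eligible_switch_targets(package_switching, node_switching, node_is_up, current_node):
    """Return the nodes a package could switch to, in configured order.

    package_switching: True if Package Switching is Enabled for the package.
    node_switching: dict of node name -> True if switching is Enabled for that node.
    node_is_up: dict of node name -> True if the node is up (assumed requirement).
    current_node: the node the package is running on now.
    """
    if not package_switching:
        # With package switching disabled, the package cannot move at all.
        return []
    return [
        node
        for node, enabled in node_switching.items()
        if enabled and node != current_node and node_is_up.get(node, False)
    ]


# Hypothetical two-node example: the package runs on ftsys9.
switching = {"ftsys10": True, "ftsys9": True}
up = {"ftsys10": True, "ftsys9": True}
print(eligible_switch_targets(False, switching, up, "ftsys9"))
print(eligible_switch_targets(True, switching, up, "ftsys9"))
```

The first call returns an empty list because package switching is disabled; the second returns ftsys10 only.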
Status of Group Membership
The state of the cluster for Oracle RAC is one of the following:
- Up. Services are active and being monitored. The membership appears in the output of cmviewcl -l group.
- Down. The cluster is halted and GMS services have been stopped. The membership does not appear in the output of cmviewcl -l group.
The following is an example of the group membership output
shown by the cmviewcl command:
# cmviewcl -l group
GROUP       MEMBER    PID      MEMBER_NODE
DGop        1         10394    comanche
            0         10499    chinook
DBOP        1         10501    comanche
            0         10396    chinook
DAALL_DB    0         10396    comanche
            1         10501    chinook
IGOPALL     2         10423    comanche
            1         10528    chinook
where the cmviewcl output values are:
- GROUP
the name of a configured group
- MEMBER
the ID number of a member of a group
- PID
the process ID of the group member
- MEMBER_NODE
the node on which the group member is running
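If you need the group membership in a script, the four columns described above can be turned into records. The following is a minimal sketch that parses text shaped like the example output; the function name parse_group_output is hypothetical, and the assumption that continuation rows omit the GROUP column is inferred from the sample rather than from a documented format.

```python
SAMPLE = """\
GROUP       MEMBER    PID      MEMBER_NODE
DGop        1         10394    comanche
            0         10499    chinook
DBOP        1         10501    comanche
            0         10396    chinook
DAALL_DB    0         10396    comanche
            1         10501    chinook
IGOPALL     2         10423    comanche
            1         10528    chinook
"""


def parse_group_output(text):
    """Parse group membership text into a list of dicts.

    Assumption: a row with four fields starts a new group, and a row with
    three fields belongs to the most recently named group.
    """
    records = []
    group = None
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and any echoed command line.
        if not fields or fields[0] == "GROUP" or fields[0].startswith("#"):
            continue
        if len(fields) == 4:
            group, fields = fields[0], fields[1:]
        if len(fields) != 3 or group is None:
            continue  # not a membership row in the assumed layout
        member, pid, node = fields
        records.append(
            {"group": group, "member": int(member), "pid": int(pid), "node": node}
        )
    return records


for record in parse_group_output(SAMPLE):
    print(record)
```

In practice the text would come from the command's standard output instead of the embedded SAMPLE string.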
Services have only status, as follows:
- Up. The service is being monitored.
- Down. The service is not running. It may have halted or failed.
- Uninitialized. The service is included in the package configuration, but it was not started with a run command in the control script.
The network interfaces have only status, as follows:
- Unknown. We cannot determine whether the interface is up or down. This can happen when the cluster is down. A standby interface has this status.
The serial line has only status, as follows:
- Up. Heartbeats are received over the serial line.
- Down. Heartbeat has not been received over the serial line within 2 times the NODE_TIMEOUT value.
- Recovering. A corrupt message was received on the serial line, and the line is in the process of resynchronizing.
- Unknown. We cannot determine whether the serial line is up or down. This can happen when the remote node is down.
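The serial line rules above, in particular the threshold of 2 times NODE_TIMEOUT, can be expressed as a small classification function. This is only a sketch of the rules as described; the function name, its parameters, and the order in which the conditions are checked are assumptions, and both time values are taken to be in the same unit.

```python
def serial_line_status(since_last_heartbeat, node_timeout,
                       resynchronizing=False, remote_node_up=True):
    """Classify serial line status from the conditions described in the text.

    since_last_heartbeat and node_timeout must use the same time unit.
    The precedence of the checks is an assumption made for this example.
    """
    if not remote_node_up:
        # The line's condition cannot be determined when the remote node is down.
        return "unknown"
    if resynchronizing:
        # A corrupt message was received and the line is resynchronizing.
        return "recovering"
    if since_last_heartbeat > 2 * node_timeout:
        # No heartbeat within 2 times the NODE_TIMEOUT value.
        return "down"
    return "up"


print(serial_line_status(1.0, 2.0))
print(serial_line_status(5.0, 2.0))
```

With a timeout of 2.0, a heartbeat seen 1.0 ago is classified as up, while one last seen 5.0 ago exceeds the 4.0 threshold and is classified as down.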
Failover and Failback Policies
Packages can be configured with one of two values for the FAILOVER_POLICY parameter:
- CONFIGURED_NODE. The package fails over to the next node in the node list in the package configuration file.
- MIN_PACKAGE_NODE. The package fails over to the node in the cluster with the fewest running packages on it.
Packages can also be configured with one of two values for
the FAILBACK_POLICY parameter:
- AUTOMATIC. With this setting, a package, following a failover, returns to its primary node when the primary node becomes available again.
- MANUAL. With this setting, a package, following a failover, must be moved back to its original node by a system administrator.
Failover and failback policies are displayed in the output
of the cmviewcl -v command.
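The difference between the two failover policies, and between the two failback policies, can be illustrated with a short sketch. It is not how Serviceguard is implemented; the function names and parameters are hypothetical, "next node" is interpreted as the first available node in list order, and ties under MIN_PACKAGE_NODE are broken by list order, both of which are assumptions.

```python
def choose_failover_node(policy, node_list, current, up_nodes, running_packages):
    """Pick a failover target, or None if no node is available.

    node_list: nodes in the order given in the package configuration file.
    up_nodes: set of nodes currently active in the cluster.
    running_packages: dict of node name -> number of running packages.
    """
    candidates = [n for n in node_list if n != current and n in up_nodes]
    if not candidates:
        return None
    if policy == "CONFIGURED_NODE":
        # Assumption: "next node" means the first available node in list order.
        return candidates[0]
    if policy == "MIN_PACKAGE_NODE":
        # min() keeps the earliest candidate when counts are equal.
        return min(candidates, key=lambda n: running_packages.get(n, 0))
    raise ValueError("unknown FAILOVER_POLICY: %s" % policy)


def should_fail_back(failback_policy, primary, current, up_nodes):
    """Return True if the package would return to its primary node on its own."""
    return failback_policy == "AUTOMATIC" and current != primary and primary in up_nodes


nodes = ["manx", "burmese", "tabby", "persian"]
up = {"manx", "burmese", "tabby", "persian"}
load = {"burmese": 2, "tabby": 0, "persian": 1}  # hypothetical package counts
print(choose_failover_node("CONFIGURED_NODE", nodes, "manx", up, load))
print(choose_failover_node("MIN_PACKAGE_NODE", nodes, "manx", up, load))
```

With these hypothetical counts, CONFIGURED_NODE selects burmese (next in the list) while MIN_PACKAGE_NODE selects tabby (fewest running packages).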
Examples of Cluster and Package States
The following sample output from the cmviewcl -v command shows status for the cluster in the sample configuration.
Everything is running normally; both nodes in a two-node cluster
are running, and each Oracle RAC instance package is running as
well. The only packages running are Oracle RAC instance packages.
CLUSTER      STATUS
example      up

  NODE         STATUS       STATE
  ftsys9       up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           56/36.1      lan0
    STANDBY      up           60/6         lan1

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    ops_pkg1     up           running      disabled     ftsys9

      Policy_Parameters:
      POLICY_NAME     CONFIGURED_VALUE
      Start           configured_node
      Failback        manual

      Node_Switching_Parameters:
      NODE_TYPE    STATUS       SWITCHING    NAME
      Primary      up           enabled      ftsys9       (current)

  NODE         STATUS       STATE
  ftsys10      up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           28.1         lan0
    STANDBY      up           32.1         lan1

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    ops_pkg2     up           running      disabled     ftsys10

      Policy_Parameters:
      POLICY_NAME     CONFIGURED_VALUE
      Start           configured_node
      Failback        manual

      Node_Switching_Parameters:
      NODE_TYPE    STATUS       SWITCHING    NAME
      Primary      up           enabled      ftsys10      (current)
      Alternate    up           enabled      ftsys9
If the cluster is using a quorum server for tie-breaking services,
the display shows the server name, state and status following the
entry for each node, as in the following excerpt from the output
of cmviewcl -v:
CLUSTER      STATUS
example      up

  NODE         STATUS       STATE
  ftsys9       up           running

  Quorum Server Status:
  NAME         STATUS       STATE
  lp-qs        up           running
...

  NODE         STATUS       STATE
  ftsys10      up           running

  Quorum Server Status:
  NAME         STATUS       STATE
  lp-qs        up           running
If the cluster is using the VERITAS Cluster Volume Manager
for disk storage, the system multi-node package VxVM-CVM-pkg must
be running on all active nodes for applications to be able to access
CVM disk groups. This package is shown in the following output of
the cmviewcl command:
CLUSTER      STATUS
example      up

  NODE         STATUS       STATE
  ftsys8       down         halted
  ftsys9       up           running

SYSTEM_MULTI_NODE_PACKAGES:

  PACKAGE        STATUS       STATE
  VxVM-CVM-pkg   up           running
When you use the -v option, the display shows the system multi-node package
associated with each active node in the cluster, as in the following:
SYSTEM_MULTI_NODE_PACKAGES:

  PACKAGE        STATUS       STATE
  VxVM-CVM-pkg   up           running

  NODE         STATUS       STATE
  ftsys8       down         halted

  NODE         STATUS       STATE
  ftsys9       up           running

    Script_Parameters:
    ITEM       STATUS   MAX_RESTARTS  RESTARTS   NAME
    Service    up       0             0          VxVM-CVM-pkg.srv
Status After Moving the Package to Another Node
After issuing the following command:
# cmrunpkg -n ftsys9 pkg2
the output of the cmviewcl -v command is as follows:
CLUSTER      STATUS
example      up

  NODE         STATUS       STATE
  ftsys9       up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           56/36.1      lan0
    STANDBY      up           60/6         lan1

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    pkg1         up           running      enabled      ftsys9

      Policy_Parameters:
      POLICY_NAME     CONFIGURED_VALUE
      Failover        min_package_node
      Failback        manual

      Script_Parameters:
      ITEM       STATUS   MAX_RESTARTS  RESTARTS   NAME
      Service    up       0             0          service1
      Subnet     up       0             0          15.13.168.0
      Resource   up                                /example/float

      Node_Switching_Parameters:
      NODE_TYPE    STATUS       SWITCHING    NAME
      Primary      up           enabled      ftsys9       (current)
      Alternate    up           enabled      ftsys10

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    pkg2         up           running      disabled     ftsys9

      Policy_Parameters:
      POLICY_NAME     CONFIGURED_VALUE
      Failover        min_package_node
      Failback        manual

      Script_Parameters:
      ITEM       STATUS   NAME           MAX_RESTARTS  RESTARTS
      Service    up       service2.1     0             0
      Subnet     up       15.13.168.0    0             0

      Node_Switching_Parameters:
      NODE_TYPE    STATUS       SWITCHING    NAME
      Primary      up           enabled      ftsys10
      Alternate    up           enabled      ftsys9       (current)

  NODE         STATUS       STATE
  ftsys10      up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           28.1         lan0
    STANDBY      up           32.1         lan1
Now pkg2 is running on node ftsys9.
Note that it is still disabled from switching.
Status After Package Switching is Enabled
The following command changes package status back to Package Switching
Enabled:
# cmmodpkg -e pkg2
The output of the cmviewcl command is now as follows:
CLUSTER      STATUS
example      up

  NODE         STATUS       STATE
  ftsys9       up           running

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    pkg1         up           running      enabled      ftsys9
    pkg2         up           running      enabled      ftsys9

  NODE         STATUS       STATE
  ftsys10      up           running
Both packages are now running on ftsys9, and pkg2 is enabled
for switching. The cluster daemon is running on ftsys10, but no
packages are running there.
Status After Halting a Node
After halting ftsys10 with the following command:
# cmhaltnode ftsys10
the output of cmviewcl is as follows on ftsys9:
CLUSTER      STATUS
example      up

  NODE         STATUS       STATE
  ftsys9       up           running

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    pkg1         up           running      enabled      ftsys9
    pkg2         up           running      enabled      ftsys9

  NODE         STATUS       STATE
  ftsys10      down         halted
This output is seen on both ftsys9 and ftsys10.
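Summary output like the examples in this section can be checked from a script, for instance to confirm which nodes are halted and which packages have switching enabled. The sketch below parses text in the same shape as the non-verbose output shown here; the function name parse_cmviewcl_summary is hypothetical, the whitespace-separated column layout is an assumption taken from these examples, and the parser is not intended for the more detailed -v output.

```python
SAMPLE = """\
CLUSTER      STATUS
example      up

  NODE         STATUS       STATE
  ftsys9       up           running

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    pkg1         up           running      enabled      ftsys9
    pkg2         up           running      enabled      ftsys9

  NODE         STATUS       STATE
  ftsys10      down         halted
"""


def parse_cmviewcl_summary(text):
    """Group rows under the most recent CLUSTER, NODE, or PACKAGE header.

    Assumption: every data row has exactly as many whitespace-separated
    fields as the header row that precedes it.
    """
    sections = {"CLUSTER": [], "NODE": [], "PACKAGE": []}
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in sections and len(fields) > 1 and fields[1] == "STATUS":
            header = fields  # a new header row starts a new table
            continue
        if header is not None and len(fields) == len(header):
            sections[header[0]].append(dict(zip(header, fields)))
    return sections


summary = parse_cmviewcl_summary(SAMPLE)
halted = [n["NODE"] for n in summary["NODE"] if n["STATE"] == "halted"]
print(halted)
```

Here the halted list contains only ftsys10, and both package rows report AUTO_RUN as enabled.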
If you are using a serial (RS232) line as a heartbeat connection,
you will see a list of configured RS232 device files in the output
of the cmviewcl -v command. The following shows normal running status:
CLUSTER      STATUS
example      up

  NODE         STATUS       STATE
  ftsys9       up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           56/36.1      lan0

    Serial_Heartbeat:
    DEVICE_FILE_NAME     STATUS       CONNECTED_TO:
    /dev/tty0p0          up           ftsys10 /dev/tty0p0

  NODE         STATUS       STATE
  ftsys10      up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           28.1         lan0

    Serial_Heartbeat:
    DEVICE_FILE_NAME     STATUS       CONNECTED_TO:
    /dev/tty0p0          up           ftsys9 /dev/tty0p0
The following shows status when the serial line is not working:
CLUSTER      STATUS
example      up

  NODE         STATUS       STATE
  ftsys9       up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           56/36.1      lan0

    Serial_Heartbeat:
    DEVICE_FILE_NAME     STATUS       CONNECTED_TO:
    /dev/tty0p0          down         ftsys10 /dev/tty0p0

  NODE         STATUS       STATE
  ftsys10      up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           28.1         lan0

    Serial_Heartbeat:
    DEVICE_FILE_NAME     STATUS       CONNECTED_TO:
    /dev/tty0p0          down         ftsys9 /dev/tty0p0
Viewing Data on Unowned Packages
The following example shows packages that are currently unowned,
that is, not running on any configured node. Information on monitored resources
is provided for each node on which the package can run; this allows
you to identify the cause of a failure and decide where to start
the package up again.
UNOWNED_PACKAGES

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    PKG3         down         halted       enabled      unowned

      Policy_Parameters:
      POLICY_NAME     CONFIGURED_VALUE
      Failover        min_package_node
      Failback        automatic

      Script_Parameters:
      ITEM       STATUS   NODE_NAME    NAME
      Resource   up       manx         /resource/random
      Subnet     up       manx         192.8.15.0
      Resource   up       burmese      /resource/random
      Subnet     up       burmese      192.8.15.0
      Resource   up       tabby        /resource/random
      Subnet     up       tabby        192.8.15.0
      Resource   up       persian      /resource/random
      Subnet     up       persian      192.8.15.0

      Node_Switching_Parameters:
      NODE_TYPE    STATUS       SWITCHING    NAME
      Primary      up           enabled      manx
      Alternate    up           enabled      burmese
      Alternate    up           enabled      tabby
      Alternate    up           enabled      persian