Implement automated or manual recovery?
Manual recovery costs less to implement and gives more flexibility in making decisions
while recovering from a disaster. Evaluating the data and making
decisions can add to recovery time, but it is justified in some
situations, for example if applications compete for resources following
a disaster and one of them has to be halted.
Automated recovery reduces recovery time and, in most cases,
eliminates the human intervention needed to recover from a disaster. You
may want to automate recovery for any number of reasons:
- Automated recovery is usually faster.
- Staff may not be available for manual recovery,
  as is the case with "lights-out" data centers.
- Reduction in human intervention is also a reduction
  in human error. Disasters don't happen often, so lack of
  practice and the stressfulness of the situation may increase the
  potential for human error.
- Automated recovery procedures and processes can
  be transparent to the clients.
Even if recovery is automated, you may choose to, or need
to, recover from some types of disasters manually. A rolling disaster,
which is a disaster that happens before the cluster has recovered
from a previous disaster, is an example of when you may want to
manually switch over. If the data link failed and, while it was coming back
up and resynchronizing data, a data center then failed, you would want
human intervention to make judgment calls on which site had the
most current and consistent data before failing over.
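
The rule described above, automate the common case but hand a rolling
disaster to a person, can be sketched as a small decision function. This is
only an illustration, not part of any cluster product: the names SiteStatus,
Action and choose_recovery are hypothetical, and the two status flags are
assumed to come from whatever monitoring the cluster already has.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    AUTOMATIC_FAILOVER = "automatic failover"
    MANUAL_DECISION = "wait for operator decision"


@dataclass
class SiteStatus:
    # Hypothetical inputs; a real cluster would obtain these from its monitoring.
    primary_site_up: bool
    resync_in_progress: bool  # data was still resynchronizing after a link failure


def choose_recovery(status: SiteStatus) -> Optional[Action]:
    """Return the recovery action to take, or None if no recovery is needed."""
    if status.primary_site_up:
        return None
    if status.resync_in_progress:
        # Rolling disaster: the site failed before the previous failure had
        # been fully recovered, so it is unclear which copy of the data is
        # most current and consistent. Leave the decision to an operator.
        return Action.MANUAL_DECISION
    # Single failure with fully synchronized data: safe to fail over unattended.
    return Action.AUTOMATIC_FAILOVER
```

For example, choose_recovery(SiteStatus(primary_site_up=False,
resync_in_progress=True)) returns Action.MANUAL_DECISION, while the same call
with resync_in_progress=False returns Action.AUTOMATIC_FAILOVER.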