Restoring the Cluster After Disaster Recovery

This topic describes how to restore the API Gateway cluster after you have recovered the primary site.
The restoration process assumes that you have a recent backup of the cluster, made prior to the disaster.
To restore the cluster:
  1. Restore the cluster to its last known good state, including database replication between the database nodes. For more information, see:
  2. Schedule a maintenance window for fail-back to the cluster.
    Wait until the maintenance window begins before continuing.
  3. Access the privileged shell on each cluster node, then stop the Gateway service and confirm the listening ports:
    [root@Gateway1 ~]# service ssg stop
    Shutting down Gateway Services: [ OK ]
    [root@Gateway1 ~]# netstat -tnl
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address    Foreign Address    State
    tcp    0      0     0.0.0.0:3306     0.0.0.0:*          LISTEN
    tcp    0      0     0.0.0.0:22       0.0.0.0:*          LISTEN
    [root@Gateway1 ~]#
    ...
    [root@Gateway2 ~]# service ssg stop
    Shutting down Gateway Services: [ OK ]
    [root@Gateway2 ~]# netstat -tnl
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address    Foreign Address    State
    tcp    0      0     0.0.0.0:3306     0.0.0.0:*          LISTEN
    tcp    0      0     0.0.0.0:22       0.0.0.0:*          LISTEN
    [root@Gateway2 ~]#
  4. Confirm that database replication is functioning on each node:
    [root@Gateway1 ~]# mysql -e "SHOW SLAVE STATUS\G" | grep Running
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    [root@Gateway1 ~]#
    ...
    [root@Gateway2 ~]# mysql -e "SHOW SLAVE STATUS\G" | grep Running
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    [root@Gateway2 ~]#
    You should see "Yes" for both values on each node. If not, examine your replication configuration and adjust as necessary.
    The mysql command used in the example above is a shortcut that displays only the slave thread status. For a more comprehensive replication status, see Check Replication Status.
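    As an illustration (the exact fields vary by MySQL version), running the same query without the grep filter prints the full replication status, including the master host, binary log coordinates, and any error text:
    [root@Gateway1 ~]# mysql -e "SHOW SLAVE STATUS\G"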
  5. Open a privileged shell on the DR (Disaster Recovery) node.
  6. Dump the database and then copy it to the primary database node (Gateway1):
    [root@Gateway-DR ~]# mysqldump -r Gateway-DR.sql ssg
    [root@Gateway-DR ~]# scp Gateway-DR.sql ssgconfig@Gateway1.mycompany.com:
    The authenticity of host 'Gateway1.mycompany.com (10.7.50.201)' can't be established.
    RSA key fingerprint is 7a:43:f1:ed:a3:23:e1:70:ea:dc:3b:f7:c0:b8:c7:b2.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'Gateway1.mycompany.com,10.7.50.201' (RSA) to the list of known hosts.
    ssgconfig@Gateway1.mycompany.com's password: <password>
    Gateway-DR.sql 100% 1052KB 1.0MB/s 00:00
    [root@Gateway-DR ~]#
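    Optionally, before copying the dump you can sanity-check that mysqldump finished cleanly; with default options the last line of the file is a "Dump completed" comment (this check is a suggestion, not part of the documented procedure):
    [root@Gateway-DR ~]# tail -1 Gateway-DR.sql
    -- Dump completed on <date>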
  7. On Gateway1, load the Gateway-DR.sql database dump into the Gateway database:
    [root@Gateway1 ~]# mysql ssg < ~ssgconfig/Gateway-DR.sql
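    To spot-check that the import completed, you can list the tables now present in the ssg database (the exact table set depends on your Gateway version):
    [root@Gateway1 ~]# mysql ssg -e "SHOW TABLES"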
  8. Start the Gateway service on both cluster nodes:
    [root@Gateway1 ~]# service ssg start
    Starting Gateway Services: [ OK ]
    [root@Gateway1 ~]#
    ...
    [root@Gateway2 ~]# service ssg start
    Starting Gateway Services: [ OK ]
    [root@Gateway2 ~]#
  9. Confirm that replication monitoring is working on both nodes. Also confirm that the Policy Manager can connect. When this is verified, you can redirect traffic back to the cluster.
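    A minimal way to verify this from the shell, assuming a default configuration, is to repeat the replication check from step 4 on each node and confirm that the Gateway is again listening on its HTTPS ports (typically 8443 and 9443 on a default install; confirm the ports used in your deployment):
    [root@Gateway1 ~]# mysql -e "SHOW SLAVE STATUS\G" | grep Running
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    [root@Gateway1 ~]# netstat -tnl | grep -E '8443|9443'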
  10. Stop the Gateway on the DR node. To do this, use option 7 (Manage CA API Gateway status) in the Gateway Configuration menu.
  11. Reconfigure the DR node to replicate from Gateway2. For more information, see Configuring Cluster Database Replication.
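    As a rough sketch only (the host name, user, password, and binary log coordinates below are placeholders; follow Configuring Cluster Database Replication for the actual values and procedure), re-pointing the DR node's replication at Gateway2 amounts to a CHANGE MASTER TO statement run on the DR node:
    [root@Gateway-DR ~]# mysql
    mysql> STOP SLAVE;
    mysql> CHANGE MASTER TO
        ->   MASTER_HOST='Gateway2.mycompany.com',
        ->   MASTER_USER='<replication user>',
        ->   MASTER_PASSWORD='<password>',
        ->   MASTER_LOG_FILE='<binary log file>',
        ->   MASTER_LOG_POS=<log position>;
    mysql> START SLAVE;
    mysql> SHOW SLAVE STATUS\G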