Restart Replication
When replication fails, there is a risk that the Secondary database node is not properly prepared for a failover condition when the Primary database is no longer accessible. This is not considered a critical condition, but it should be resolved promptly.
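One way to confirm that replication has actually failed (a minimal check, not part of the product tooling) is to inspect the slave threads and the last recorded error on the affected node:

```
# Minimal sketch: check replication health on a database node.
# A failed slave typically shows Slave_IO_Running or Slave_SQL_Running = No,
# and Last_Error describes what went wrong.
mysql -e "SHOW SLAVE STATUS\G" | grep -E 'Slave_(IO|SQL)_Running|Last_Error'
```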
The restart_replication.sh script is designed to restart the replication starting at the current point of the replay on the MASTER database. Note that there is inherent risk if the MASTER database contains pertinent information added after the replication failure, as this information is not replayed to the SLAVE.
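Before restarting, you can gauge how much data falls into that unreplayed gap by comparing the point where the failed slave stopped executing against the MASTER's current binary log position (a sketch, assuming root MySQL credentials are available locally on each node):

```
# On the node whose slave failed: the last MASTER coordinates it executed
mysql -e "SHOW SLAVE STATUS\G" | grep -E 'Relay_Master_Log_File|Exec_Master_Log_Pos'

# On the MASTER: its current binary log file and write position
mysql -e "SHOW MASTER STATUS\G" | grep -E 'File|Position'
```

Anything written to the MASTER between those two coordinates is skipped when replication restarts at the current position.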
When replication is successfully restarted, you see the message "Slave successfully started."
Notes:
- Be sure to back up the Primary Gateway node before performing any of the procedures below. The procedures can be executed on a running Gateway.
- Replication must already be correctly configured and previously running before you can restart it.
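As one minimal way to address the backup note (a database-level sketch only; a full backup of the Gateway node may be what your policy requires), you could dump the ssg database before starting:

```
# Sketch: database-level backup of the ssg database before touching replication.
# Assumes root MySQL credentials are available locally, as in the transcripts below.
mysqldump --routines ssg > ssg-backup-$(date +%Y%m%d%H%M%S).sql
```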
Virtual appliances may experience a slight delay when restarting, as the vmware-tools_reconf_once service takes a moment to prepare the VMware tools for the new OS kernel.

When Primary Node Slave Fails
Since the Primary Node is considered to be the authoritative database, it is safe to run the restart_replication.sh script if this node fails. To restart replication on the Primary Node:
- Run the following script on the Primary Node:

```
# /opt/SecureSpan/Appliance/bin/restart_replication.sh
```
- Complete the prompts in the script:

```
[primary]# ./restart_replication.sh
Enter hostname or IP for the MASTER: [SET ME] machine.mycompany.com
Enter replication user: [repluser] repluser
Enter replication password: [replpass] [password]
Enter MySQL root user: [root] root
Enter MySQL root password: [] [password]
Slave successfully started
[primary]# mysql -e "SHOW SLAVE STATUS\G"
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: machine.mycompany.com
                  Master_User: repluser
                  Master_Port: 3307
                Connect_Retry: 10
              Master_Log_File: ssgbin-log.000016
          Read_Master_Log_Pos: 6587150
               Relay_Log_File: ssgrelay-bin.000002
                Relay_Log_Pos: 264
        Relay_Master_Log_File: ssgbin-log.000016
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 6587150
              Relay_Log_Space: 422
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
                  Master_Bind:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
[primary]#
```
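Once the script reports success, the slave may still be replaying events that accumulated while replication was down. A quick way to watch it catch up (a sketch; confirm that watch is available on your appliance before relying on it):

```
# Poll slave health every 5 seconds; healthy output shows both threads Yes
# and Seconds_Behind_Master trending toward 0.
watch -n 5 'mysql -e "SHOW SLAVE STATUS\G" | grep -E "Slave_(IO|SQL)_Running|Seconds_Behind_Master"'
```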
When Secondary Node Slave Fails
Restarting replication on the Secondary Node is more involved. Running restart_replication.sh may result in missing configuration details in the Secondary database. It is preferable to destroy the Secondary database and then rebuild it from the Primary.

The /opt/SecureSpan/Appliance/bin/create_slave.sh script clones the database from the Primary node.

Note: If the database is very large, the clone operation may time out. If this occurs, see "Manually Rebuilding Replication" below.

WARNINGS:
- Stop the slave on the Primary node before continuing with this procedure! Failure to do so drops the database on the Primary node.
- During the cloning of the database in the procedure below, the Gateway cluster does not process any incoming requests.

Rebuilding with create_slave.sh
To rebuild the database using the create_slave.sh script:
- On the Primary database node, run mysqladmin stop-slave and confirm:

```
[primary]# mysqladmin stop-slave
Slave stopped
[primary]# mysql -e "SHOW SLAVE STATUS\G"
...
  Slave_IO_Running: No
 Slave_SQL_Running: No
...
[primary]#
```
- On the Secondary database node, ensure that the slave is stopped and then run the create_slave.sh script. Answer "yes" to cloning the database:

```
[secondary]# mysqladmin stop-slave
Slave stopped
[secondary]# cd /opt/SecureSpan/Appliance/bin/
[secondary]# ./create_slave.sh -v
Enter hostname or IP for the MASTER: machine.mycompany.com
Enter replication user: [repluser] repluser
Enter replication password: [replpass] [password]
Enter MySQL root user: [root] root
Enter MySQL root password: [] [password]
Do you want to clone a database from machine.mycompany.com (yes or no)? [no] yes
Enter name of database to clone: [ssg] ssg
--> MASTER = machine.mycompany.com
--> DBUSER = repluser
--> DBPWD = replpass
--> ROOT = root
--> ROOT_PWD = 7layer
--> CLONE_DB = yes
--> DB = ssg
--> Stopping slave
--> File = ssgbin-log.000020
--> Position = 8405121
--> Changing MASTER settings
--> Confirming slave not running on machine.mycompany.com
--> Slave_IO_Running = No
--> Slave_SQL_Running = No
--> Master_Host = machine.mycompany.com
W A R N I N G
About to drop the ssg database on localhost
and copy from machine.mycompany.com
Are you sure you want to do this? [N] Y
--> Dropping database
--> Creating database: ssg
--> Copying database from machine.mycompany.com
--> Starting slave
--> Confirming slave startup
--> Slave_IO_Running = Yes
--> Slave_SQL_Running = Yes
Slave successfully created
Manually confirm that slave is running on machine.mycompany.com
[secondary]#
```
- Restart the replication on the Primary database node:

```
[primary]# ./restart_replication.sh
Enter hostname or IP for the MASTER: [SET ME] machine.mycompany.com
Enter replication user: [repluser] repluser
Enter replication password: [replpass] [password]
Enter MySQL root user: [root] root
Enter MySQL root password: [] [password]
Slave successfully started
[primary]#
```
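At this point both nodes should be replicating from each other again. As a final verification pass (a sketch, run on each node in turn):

```
# Run on the Primary, then on the Secondary; both threads should report Yes
# on both nodes, and Master_Host should name the opposite node.
mysql -e "SHOW SLAVE STATUS\G" | grep -E 'Slave_(IO|SQL)_Running|Master_Host'
```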
Manually Rebuilding Replication
If the cloning step times out under the "Rebuilding with create_slave.sh" method described above, do the following to recover:
- On the Primary database node, run mysqladmin stop-slave and confirm:

```
[primary]# mysqladmin stop-slave
Slave stopped
[primary]# mysql -e "SHOW SLAVE STATUS\G"
...
  Slave_IO_Running: No
 Slave_SQL_Running: No
...
[primary]#
```
- Stop replication and drop the database on the Secondary database node:

```
[secondary]# mysqladmin stop-slave
Slave stopped
[secondary]# mysqladmin drop ssg
Dropping the database is potentially a very bad thing to do.
Any data stored in the database will be destroyed.

Do you really want to drop the 'ssg' database [y/N] y
Database "ssg" dropped
[secondary]#
```
- Dump the database on the Primary database node and transfer it to the Secondary database node using scp. (The --master-data=1 option records the Primary's binary log coordinates in the dump as a CHANGE MASTER TO statement, so the Secondary resumes replication from the correct position.)

```
[primary]# mysqldump --master-data=1 --routines -r ssg.sql ssg
[primary]# scp ssg.sql ssgconfig@machine.mycompany.com:
ssgconfig@machine.mycompany.com's password: [password]
ssg.sql                                    100% 1726KB   1.7MB/s   00:00
[primary]#
```
- Create the database on the Secondary database node and load the dump file into it:

```
[secondary]# mysqladmin create ssg
[secondary]# mysql ssg < ~ssgconfig/ssg.sql
[secondary]# mysqladmin start-slave
Slave started
[secondary]# mysql -e "SHOW SLAVE STATUS\G"
...
  Slave_IO_Running: Yes
 Slave_SQL_Running: Yes
...
[secondary]#
```
- Run restart_replication.sh on the Primary database node and confirm:

```
[primary]# cd /opt/SecureSpan/Appliance/bin
[primary]# ./restart_replication.sh
Enter hostname or IP for the MASTER: [SET ME]
Enter replication user: [repluser] repluser
Enter replication password: [replpass] [password]
Enter MySQL root user: [root] root
Enter MySQL root password: [] [password]
Slave successfully started
[primary]# mysql -e 'SHOW SLAVE STATUS\G'
...
  Slave_IO_Running: Yes
 Slave_SQL_Running: Yes
...
[primary]#
```
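If you perform this manual rebuild often, the middle steps lend themselves to a small wrapper. The sketch below is not part of the product tooling; it assumes ssh access from the Primary to the Secondary as ssgconfig, that the ssgconfig account can run the MySQL administrative commands (in the transcripts above they are run as root), and that steps 1 and 2 (stopping both slaves and dropping the Secondary's ssg database) have already been done:

```
#!/bin/bash
# Sketch: automate steps 3-4 of the manual rebuild, run from the Primary node.
set -euo pipefail

SECONDARY="machine.mycompany.com"   # hypothetical hostname; substitute your Secondary node

# Dump the ssg database with the Primary's binlog coordinates embedded
mysqldump --master-data=1 --routines -r ssg.sql ssg

# Copy the dump to the Secondary node
scp ssg.sql ssgconfig@"$SECONDARY":

# Recreate the database, load the dump, and start the slave on the Secondary
ssh ssgconfig@"$SECONDARY" \
  'mysqladmin create ssg && mysql ssg < ~/ssg.sql && mysqladmin start-slave'
```

Step 5 (restart_replication.sh on the Primary) still needs to be run interactively, since the script prompts for the MASTER hostname and credentials.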