Restart Replication
When replication fails, there is a risk that the Secondary database node is not properly prepared for a failover condition when the Primary database is no longer accessible. This is not considered a critical condition, but it should be resolved promptly.
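One way to confirm that replication has actually failed (a minimal check, not part of the product tooling) is to inspect the slave threads and the last recorded error on the affected node:

```
# Minimal sketch: check replication health on a database node.
# A failed slave typically shows Slave_IO_Running or Slave_SQL_Running = No,
# and Last_Error describes what went wrong.
mysql -e "SHOW SLAVE STATUS\G" | grep -E 'Slave_(IO|SQL)_Running|Last_Error'
```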
The restart_replication.sh script is designed to restart the replication starting at the current point of the replay on the MASTER database. Note that there is inherent risk if the MASTER database contains pertinent information added after the replication failure, as this information is not replayed to the SLAVE.
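Before restarting, you can gauge how much data falls into that unreplayed gap by comparing the point where the failed slave stopped executing against the MASTER's current binary log position (a sketch, assuming root MySQL credentials are available locally on each node):

```
# On the node whose slave failed: the last MASTER coordinates it executed
mysql -e "SHOW SLAVE STATUS\G" | grep -E 'Relay_Master_Log_File|Exec_Master_Log_Pos'

# On the MASTER: its current binary log file and write position
mysql -e "SHOW MASTER STATUS\G" | grep -E 'File|Position'
```

Anything written to the MASTER between those two coordinates is skipped when replication restarts at the current position.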
When replication is successfully restarted, you see the message "Slave successfully started."
Notes:
- Be sure to back up the Primary Gateway node before performing any of the procedures below. The procedures can be executed on a running Gateway.
- Replication must already be correctly configured and previously running before you can restart it.
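As one minimal way to address the backup note (a database-level sketch only; a full backup of the Gateway node may be what your policy requires), you could dump the ssg database before starting:

```
# Sketch: database-level backup of the ssg database before touching replication.
# Assumes root MySQL credentials are available locally, as in the transcripts below.
mysqldump --routines ssg > ssg-backup-$(date +%Y%m%d%H%M%S).sql
```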
Virtual appliances may experience a slight delay when restarting, as the vmware-tools_reconf_once service takes a moment to prepare the VMware tools for the new OS kernel.

When Primary Node Slave Fails
Since the Primary Node is considered to be the authoritative database, it is safe to run the restart_replication.sh script if this node fails. To restart replication on the Primary Node:
- Run the following script on the Primary Node:

```
# /opt/SecureSpan/Appliance/bin/restart_replication.sh
```
- Complete the prompts in the script:

```
[primary]# ./restart_replication.sh
Enter hostname or IP for the MASTER: [SET ME] machine.mycompany.com
Enter replication user: [repluser] repluser
Enter replication password: [replpass] [password]
Enter MySQL root user: [root] root
Enter MySQL root password: [] [password]
Slave successfully started
[primary]# mysql -e "SHOW SLAVE STATUS\G"
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: machine.mycompany.com
                  Master_User: repluser
                  Master_Port: 3307
                Connect_Retry: 10
              Master_Log_File: ssgbin-log.000016
          Read_Master_Log_Pos: 6587150
               Relay_Log_File: ssgrelay-bin.000002
                Relay_Log_Pos: 264
        Relay_Master_Log_File: ssgbin-log.000016
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 6587150
              Relay_Log_Space: 422
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
                  Master_Bind:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
[primary]#
```
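Once the script reports success, the slave may still be replaying events that accumulated while replication was down. A quick way to watch it catch up (a sketch; confirm that watch is available on your appliance before relying on it):

```
# Poll slave health every 5 seconds; healthy output shows both threads Yes
# and Seconds_Behind_Master trending toward 0.
watch -n 5 'mysql -e "SHOW SLAVE STATUS\G" | grep -E "Slave_(IO|SQL)_Running|Seconds_Behind_Master"'
```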
When Secondary Node Slave Fails
Restarting replication on the Secondary Node is more involved. Running restart_replication.sh may result in missing configuration details in the Secondary database. It is preferable to destroy the Secondary database and then rebuild it from the Primary.

The /opt/SecureSpan/Appliance/bin/create_slave.sh script clones the database from the Primary node.

Note: If the database is very large, the clone operation may time out. If this occurs, see "Manually Rebuilding Replication" below.

WARNINGS:
- Stop the slave on the Primary node before continuing with this procedure! Failure to do so drops the database on the Primary node.
- During the cloning of the database in the procedure below, the Gateway cluster does not process any incoming requests.

Rebuilding with create_slave.sh
To rebuild the database using the create_slave.sh script:
- On the Primary database node, run mysqladmin stop-slave and confirm:

```
[primary]# mysqladmin stop-slave
Slave stopped
[primary]# mysql -e "SHOW SLAVE STATUS\G"
...
  Slave_IO_Running: No
 Slave_SQL_Running: No
...
[primary]#
```
- On the Secondary database node, ensure that the slave is stopped and then run the create_slave.sh script. Answer "yes" to cloning the database:

```
[secondary]# mysqladmin stop-slave
Slave stopped
[secondary]# cd /opt/SecureSpan/Appliance/bin/
[secondary]# ./create_slave.sh -v
Enter hostname or IP for the MASTER: machine.mycompany.com
Enter replication user: [repluser] repluser
Enter replication password: [replpass] [password]
Enter MySQL root user: [root] root
Enter MySQL root password: [] [password]
Do you want to clone a database from machine.mycompany.com (yes or no)? [no] yes
Enter name of database to clone: [ssg] ssg
--> MASTER = machine.mycompany.com
--> DBUSER = repluser
--> DBPWD = replpass
--> ROOT = root
--> ROOT_PWD = 7layer
--> CLONE_DB = yes
--> DB = ssg
--> Stopping slave
--> File = ssgbin-log.000020
--> Position = 8405121
--> Changing MASTER settings
--> Confirming slave not running on machine.mycompany.com
--> Slave_IO_Running = No
--> Slave_SQL_Running = No
--> Master_Host = machine.mycompany.com
W A R N I N G
About to drop the ssg database on localhost
and copy from machine.mycompany.com
Are you sure you want to do this? [N] Y
--> Dropping database
--> Creating database: ssg
--> Copying database from machine.mycompany.com
--> Starting slave
--> Confirming slave startup
--> Slave_IO_Running = Yes
--> Slave_SQL_Running = Yes
Slave successfully created
Manually confirm that slave is running on machine.mycompany.com
[secondary]#
```
- Restart the replication on the Primary database node:

```
[primary]# ./restart_replication.sh
Enter hostname or IP for the MASTER: [SET ME] machine.mycompany.com
Enter replication user: [repluser] repluser
Enter replication password: [replpass] [password]
Enter MySQL root user: [root] root
Enter MySQL root password: [] [password]
Slave successfully started
[primary]#
```
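At this point both nodes should be replicating from each other again. As a final verification pass (a sketch, run on each node in turn):

```
# Run on the Primary, then on the Secondary; both threads should report Yes
# on both nodes, and Master_Host should name the opposite node.
mysql -e "SHOW SLAVE STATUS\G" | grep -E 'Slave_(IO|SQL)_Running|Master_Host'
```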
Manually Rebuilding Replication
If the cloning step times out under the "Rebuilding with create_slave.sh" method described above, do the following to recover:
- On the Primary database node, run mysqladmin stop-slave and confirm:

```
[primary]# mysqladmin stop-slave
Slave stopped
[primary]# mysql -e "SHOW SLAVE STATUS\G"
...
  Slave_IO_Running: No
 Slave_SQL_Running: No
...
[primary]#
```
- Stop replication and drop the database on the Secondary database node:

```
[secondary]# mysqladmin stop-slave
Slave stopped
[secondary]# mysqladmin drop ssg
Dropping the database is potentially a very bad thing to do.
Any data stored in the database will be destroyed.

Do you really want to drop the 'ssg' database [y/N] y
Database "ssg" dropped
[secondary]#
```
- Dump the database on the Primary database node and transfer it to the Secondary database node using scp. (The --master-data=1 option records the Primary's binary log coordinates in the dump as a CHANGE MASTER TO statement, so the Secondary resumes replication from the correct position.)

```
[primary]# mysqldump --master-data=1 --routines -r ssg.sql ssg
[primary]# scp ssg.sql ssgconfig@machine.mycompany.com:
ssgconfig@machine.mycompany.com's password: [password]
ssg.sql                                    100% 1726KB   1.7MB/s   00:00
[primary]#
```
- Create the database on the Secondary database node and load the dump file into it:

```
[secondary]# mysqladmin create ssg
[secondary]# mysql ssg < ~ssgconfig/ssg.sql
[secondary]# mysqladmin start-slave
Slave started
[secondary]# mysql -e "SHOW SLAVE STATUS\G"
...
  Slave_IO_Running: Yes
 Slave_SQL_Running: Yes
...
[secondary]#
```
- Run restart_replication.sh on the Primary database node and confirm:

```
[primary]# cd /opt/SecureSpan/Appliance/bin
[primary]# ./restart_replication.sh
Enter hostname or IP for the MASTER: [SET ME]
Enter replication user: [repluser] repluser
Enter replication password: [replpass] [password]
Enter MySQL root user: [root] root
Enter MySQL root password: [] [password]
Slave successfully started
[primary]# mysql -e 'SHOW SLAVE STATUS\G'
...
  Slave_IO_Running: Yes
 Slave_SQL_Running: Yes
...
[primary]#
```
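If you perform this manual rebuild often, the middle steps lend themselves to a small wrapper. The sketch below is not part of the product tooling; it assumes ssh access from the Primary to the Secondary as ssgconfig, that the ssgconfig account can run the MySQL administrative commands (in the transcripts above they are run as root), and that steps 1 and 2 (stopping both slaves and dropping the Secondary's ssg database) have already been done:

```
#!/bin/bash
# Sketch: automate steps 3-4 of the manual rebuild, run from the Primary node.
set -euo pipefail

SECONDARY="machine.mycompany.com"   # hypothetical hostname; substitute your Secondary node

# Dump the ssg database with the Primary's binlog coordinates embedded
mysqldump --master-data=1 --routines -r ssg.sql ssg

# Copy the dump to the Secondary node
scp ssg.sql ssgconfig@"$SECONDARY":

# Recreate the database, load the dump, and start the slave on the Secondary
ssh ssgconfig@"$SECONDARY" \
  'mysqladmin create ssg && mysql ssg < ~/ssg.sql && mysqladmin start-slave'
```

Step 5 (restart_replication.sh on the Primary) still needs to be run interactively, since the script prompts for the MASTER hostname and credentials.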