Restart Replication

When replication fails, there is a risk that the Secondary database node is not properly prepared for a failover condition when the Primary database is no longer accessible. This is not considered a critical condition, but it should be resolved promptly.
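If you are unsure whether replication has actually failed, a quick check on the database node shows the state of the two replication threads. The filtered command below is a convenience sketch (the full SHOW SLAVE STATUS output used elsewhere in this document works equally well), and the output shown is only an example of a failed state:
    # mysql -e "SHOW SLAVE STATUS\G" | egrep 'Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master'
                 Slave_IO_Running: No
                Slave_SQL_Running: No
            Seconds_Behind_Master: NULL
If either thread reports No, replication has stopped and should be restarted using the procedures below.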
The restart_replication.sh script is designed to restart the replication starting at the current point of the replay on the MASTER database. Note that there is inherent risk if the MASTER database contains pertinent information added after the replication failure, as this information is not replayed to the SLAVE.
When replication is successfully restarted, you see the message "Slave successfully started."
(1) Be sure to back up the Primary Gateway node before performing any of the procedures below. The procedures can be executed on a running Gateway. (2) Replication must already be correctly configured and previously running before you can restart it.
Virtual appliances may experience a slight delay when restarting, as the vmware-tools_reconf_once service takes a moment to prepare the VMware tools for the new OS kernel.
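The backup called for above can be taken with your organization's normal Gateway backup procedure. If you only need a quick database-level copy before proceeding, a plain dump of the ssg database is one option; this is a sketch, and the output file name and location are arbitrary:
    # mysqldump --routines ssg > /root/ssg-backup.sql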
When Primary Node Slave Fails
Since the Primary Node is considered to be the authoritative database, it is safe to run the restart_replication.sh script if this node fails.
To restart replication on the Primary Node:
  1. Run the following script on the Primary Node:
    # /opt/SecureSpan/Appliance/bin/restart_replication.sh
  2. Complete the prompts in the script:
    [primary]
    # ./restart_replication.sh
    Enter hostname or IP for the MASTER: [SET ME] 
    machine.mycompany.com
    Enter replication user: [repluser] 
    repluser
    Enter replication password: [replpass] 
    [password]
    Enter MySQL root user: [root] 
    root
    Enter MySQL root password: [] 
    [password]
    Slave successfully started
    [primary]
    # mysql -e "SHOW SLAVE STATUS\G"
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: machine.mycompany.com
                      Master_User: repluser
                      Master_Port: 3307
                    Connect_Retry: 10
                  Master_Log_File: ssgbin-log.000016
              Read_Master_Log_Pos: 6587150
                   Relay_Log_File: ssgrelay-bin.000002
                    Relay_Log_Pos: 264
            Relay_Master_Log_File: ssgbin-log.000016
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                  Replicate_Do_DB:
              Replicate_Ignore_DB:
               Replicate_Do_Table:
           Replicate_Ignore_Table:
          Replicate_Wild_Do_Table:
      Replicate_Wild_Ignore_Table:
                       Last_Errno: 0
                       Last_Error:
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 6587150
                  Relay_Log_Space: 422
                  Until_Condition: None
                   Until_Log_File:
                    Until_Log_Pos: 0
               Master_SSL_Allowed: No
               Master_SSL_CA_File:
               Master_SSL_CA_Path:
                  Master_SSL_Cert:
                Master_SSL_Cipher:
                   Master_SSL_Key:
            Seconds_Behind_Master: 0
    Master_SSL_Verify_Server_Cert: No
                    Last_IO_Errno: 0
                    Last_IO_Error:
                   Last_SQL_Errno: 0
                   Last_SQL_Error:
                      Master_Bind:
      Replicate_Ignore_Server_Ids:
                 Master_Server_Id: 2
    [primary]#
When Secondary Node Slave Fails
Restarting replication on the Secondary Node is more involved. Running restart_replication.sh may result in missing configuration details in the Secondary database. It is preferable to destroy the Secondary database and then rebuild it from the Primary.
The /opt/SecureSpan/Appliance/bin/create_slave.sh script clones the database from the Primary node.
Note: If the database is very large, the clone operation may time out. If this occurs, see "Manually Rebuilding Replication" below.
WARNINGS:
(1) Stop the slave on the Primary node before continuing with this procedure! Failure to do so drops the database on the Primary node. (2) During the cloning of the database in the procedure below, the Gateway cluster does not process any incoming requests.
Rebuilding with create_slave.sh
To rebuild the database using the create_slave.sh script:
  1. On the Primary database node, run mysqladmin stop-slave and confirm:
    [primary]
    # mysqladmin stop-slave
    Slave stopped
    [primary]
    # mysql -e "SHOW SLAVE STATUS\G"
    .
    .
    .
                 Slave_IO_Running: No
                Slave_SQL_Running: No
    .
    .
    .
    [primary]#
  2. On the Secondary database node, ensure that the slave is stopped and then run the create_slave.sh script. Answer "yes" to cloning the database:
    [secondary]
    # mysqladmin stop-slave
    Slave stopped
    [secondary]
    # cd /opt/SecureSpan/Appliance/bin/
    [secondary]
    # ./create_slave.sh -v
    Enter hostname or IP for the MASTER: machine.mycompany.com
    Enter replication user: [repluser] 
    repluser
    Enter replication password: [replpass] 
    [password]
    Enter MySQL root user: [root] 
    root
    Enter MySQL root password: [] 
    [password]
    Do you want to clone a database from machine.mycompany.com (yes or no)? [no] 
    yes
    Enter name of database to clone: [ssg] 
    ssg
    --> MASTER = machine.mycompany.com
    --> DBUSER = repluser
    --> DBPWD = replpass
    --> ROOT = root
    --> ROOT_PWD = 7layer
    --> CLONE_DB = yes
    --> DB = ssg
    --> Stopping slave
    --> File = ssgbin-log.000020
    --> Position = 8405121
    --> Changing MASTER settings
    --> Confirming slave not running on machine.mycompany.com
    --> Slave_IO_Running = No
    --> Slave_SQL_Running = No
    --> Master_Host = machine.mycompany.com
    W A R N I N G
    About to drop the ssg database on localhost
    and copy from machine.mycompany.com
    Are you sure you want to do this? [N]
     Y
    --> Dropping database
    --> Creating database: ssg
    --> Copying database from machine.mycompany.com
    --> Starting slave
    --> Confirming slave startup
    --> Slave_IO_Running = Yes
    --> Slave_SQL_Running = Yes
    Slave successfully created
    Manually confirm that slave is running on machine.mycompany.com
    [secondary]#
  3. Restart the replication on the Primary database node:
    [primary]
    # ./restart_replication.sh
    Enter hostname or IP for the MASTER: [SET ME] machine.mycompany.com
    Enter replication user: [repluser] 
    repluser
    Enter replication password: [replpass] 
    [password]
    Enter MySQL root user: [root] 
    root
    Enter MySQL root password: [] 
    [password]
    Slave successfully started
    [primary]#
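Once both slaves are running, the freshly cloned Secondary may still be replaying transactions that were committed on the Primary during the clone. You can watch it catch up with a minimal check such as the following (a sketch; the value should fall to 0 once the Secondary is current):
    [secondary]
    # mysql -e "SHOW SLAVE STATUS\G" | grep Seconds_Behind_Master
            Seconds_Behind_Master: 0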
Manually Rebuilding Replication
If the cloning step times out under the "Rebuilding with create_slave.sh" method described above, do the following to recover:
  1. On the Primary database node, run mysqladmin stop-slave and confirm:
    [primary]
    # mysqladmin stop-slave
    Slave stopped
    [primary]
    # mysql -e "SHOW SLAVE STATUS\G"
    .
    .
    .
                 Slave_IO_Running: No
                Slave_SQL_Running: No
    .
    .
    .
    [primary]#
  2. Stop replication and drop the database on the Secondary database node:
    [secondary]
    # mysqladmin stop-slave
    Slave stopped
    [secondary]
    # mysqladmin drop ssg
    Dropping the database is potentially a very bad thing to do.
    Any data stored in the database will be destroyed.
    Do you really want to drop the 'ssg' database [y/N] 
    y
    Database "ssg" dropped
    [secondary]#
  3. Dump the database on the Primary database node and transfer it to the Secondary database node using scp (see the note following this procedure):
    [primary]
    # mysqldump --master-data=1 --routines -r ssg.sql ssg
    [primary]
    ssgconfig@machine.mycompany.com's password: 
    [password]
    ssg.sql 100% 1726KB 1.7MB/s 00:00
    [primary]#
  4. Create the database on the Secondary database node and load the dump file into it:
    [secondary]
    # mysqladmin create ssg
    [secondary]
    # mysql ssg < ~ssgconfig/ssg.sql
    [secondary]
    # mysqladmin start-slave
    Slave started
    [secondary]
    # mysql -e "SHOW SLAVE STATUS\G"
    .
    .
    .
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
    .
    .
    .
    [secondary]#
  5. Run restart_replication.sh on the Primary database node and confirm:
    [primary]
    # cd /opt/SecureSpan/Appliance/bin
    [primary]
    # ./restart_replication.sh
    Enter hostname or IP for the MASTER: [SET ME] 
    Enter replication user: [repluser] 
    repluser
    Enter replication password: [replpass] 
    [password]
    Enter MySQL root user: [root] 
    root
    Enter MySQL root password: [] 
    [password]
    Slave successfully started
    [primary]
    # mysql -e 'SHOW SLAVE STATUS\G'
    .
    .
    .
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
    .
    .
    .
    [primary]#
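Note on step 3 above: the transcript elides the copy command itself; the transfer is an ordinary scp of ssg.sql into the ssgconfig home directory on the Secondary node (the destination hostname below is a placeholder). Also, the --master-data=1 option writes an active CHANGE MASTER TO statement with the Primary's binary log coordinates into the dump, which is why step 4 can simply load the file and start the slave. Both commands below are sketches, and the coordinates shown are examples:
    [primary]
    # scp ssg.sql ssgconfig@<secondary-host>:
    [primary]
    # grep "CHANGE MASTER" ssg.sql
    CHANGE MASTER TO MASTER_LOG_FILE='ssgbin-log.000020', MASTER_LOG_POS=8405121;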