1. 23 Apr, 2015 2 commits
    • Merge MDEV-8031 into 10.1 · c2dd88ac
      Kristian Nielsen authored
    • MDEV-8031: Parallel replication stops on "connection killed" error (probably... · b616991a
      Kristian Nielsen authored
      MDEV-8031: Parallel replication stops on "connection killed" error (probably incorrectly handled deadlock kill)
      
      There was a rare race, where a deadlock error might not be correctly
      handled, causing the slave to stop with something like this in the error
      log:
      
      150423 14:04:10 [ERROR] Slave SQL: Connection was killed, Gtid 0-1-2, Internal MariaDB error code: 1927
      150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927
      150423 14:04:10 [Warning] Slave: Deadlock found when trying to get lock; try restarting transaction Error_code: 1213
      150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927
      150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927
      150423 14:04:10 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'master-bin.000001' position 1234
      
      The problem was incorrect error handling. When a deadlock is detected, it
      causes a KILL CONNECTION on the offending thread. This error is then later
      converted to a deadlock error, and the transaction is retried.
      
      However, the deadlock error was not cleared at the start of the retry, nor
      was the lingering kill signal. So it was possible to get another deadlock
      kill early during retry. If this happened with particular thread
      scheduling/timing, it was possible that the new KILL CONNECTION error was
      masked by the earlier deadlock error, so that the second kill was not
      properly converted into a deadlock error and retry.
      
      This patch adds code that clears the old error and killed flag before
      starting the retry. It also adds code to handle a deadlock kill caught in a
      couple of places where it was not handled before.
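
The retry flow described above can be sketched roughly as follows. This is a Python model of the control flow only, not the server's C++ code: `WorkerState`, `deadlock_kill`, `run_with_retry` and the two sample transactions are hypothetical names introduced for illustration, and only the two error codes (1213 and 1927) come from the log excerpt. The point it shows is that the old error and kill flag are cleared before every attempt, and that a kill caused by deadlock handling is converted into a deadlock error and retried.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Error codes as they appear in the log excerpt above.
ER_LOCK_DEADLOCK = 1213
ER_CONNECTION_KILLED = 1927


@dataclass
class WorkerState:
    """Hypothetical per-worker state; not the server's actual structure."""
    error: Optional[int] = None   # last error recorded for the transaction
    killed: bool = False          # lingering kill signal
    deadlock_kill: bool = False   # True when the kill came from deadlock handling


def deadlock_kill(state: WorkerState) -> None:
    """Model a deadlock being resolved by killing the offending worker."""
    state.killed = True
    state.deadlock_kill = True
    state.error = ER_CONNECTION_KILLED


def run_with_retry(
    state: WorkerState,
    attempt: Callable[[WorkerState, int], None],
    max_retries: int = 5,
) -> Tuple[str, int]:
    """Run `attempt` until it succeeds, fails for good, or retries run out."""
    for n in range(1, max_retries + 2):
        # Clear leftovers from the previous attempt before (re)starting,
        # so that a new kill cannot be confused with the old error.
        state.error = None
        state.killed = False
        state.deadlock_kill = False

        attempt(state, n)
        if state.error is None:
            return ("committed", n)

        # A kill that originated from deadlock handling becomes a deadlock error.
        if state.error == ER_CONNECTION_KILLED and state.deadlock_kill:
            state.error = ER_LOCK_DEADLOCK
        if state.error != ER_LOCK_DEADLOCK:
            return ("stopped", n)   # a genuine error: do not retry
    return ("stopped", max_retries + 1)


def flaky_transaction(state: WorkerState, n: int) -> None:
    """Sample transaction: deadlock-killed on the first two attempts."""
    if n <= 2:
        deadlock_kill(state)


def killed_by_user(state: WorkerState, n: int) -> None:
    """Sample transaction: killed for a reason unrelated to deadlocks."""
    state.killed = True
    state.error = ER_CONNECTION_KILLED
```

In this model, `run_with_retry(WorkerState(), flaky_transaction)` commits on the third attempt, while `killed_by_user` stops immediately because its kill is not a deadlock kill.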
  2. 22 Apr, 2015 1 commit
  3. 21 Apr, 2015 2 commits
  4. 20 Apr, 2015 6 commits
  5. 19 Apr, 2015 1 commit
  6. 17 Apr, 2015 7 commits
  7. 16 Apr, 2015 1 commit
  8. 15 Apr, 2015 2 commits
  9. 14 Apr, 2015 4 commits
  10. 13 Apr, 2015 7 commits
    • Alexander Barkov's avatar
    • MDEV-7936: Assertion `!table || table->in_use == _current_thd()' failed on... · ed349270
      Kristian Nielsen authored
      MDEV-7936: Assertion `!table || table->in_use == _current_thd()' failed on parallel replication in optimistic mode
      
      Additional 10.1-specific test case.
    • Merge MDEV-7936 into 10.1 · 2de8db62
      Kristian Nielsen authored
    • Merge MDEV-7936 into 10.0. · 17aff4b1
      Kristian Nielsen authored
      Conflicts:
      	sql/sql_base.cc
    • MDEV-7936: Assertion `!table || table->in_use == _current_thd()' failed on... · 60d094ae
      Kristian Nielsen authored
      MDEV-7936: Assertion `!table || table->in_use == _current_thd()' failed on parallel replication in optimistic mode
      
      Make sure that in parallel replication, we execute wait_for_prior_commit()
      before setting table->in_use for a temporary table. Otherwise we can end up
      with two parallel replication worker threads competing with each other for
      use of a temporary table.
      
      Refactor the use of find_temporary_table() so that errors can be handled
      in the caller (wait_for_prior_commit() can return an error in case of a
      deadlock kill).
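
The ordering described above can be sketched as follows. This is a Python model under stated assumptions, not the code in sql/sql_base.cc: `TempTable`, `DeadlockKill`, `open_temporary_table` and `failing_wait` are hypothetical names, and the wait is passed in as a callable so that the caller sees any error it raises. The lookup has no side effects, the wait happens next, and only then is the table marked as in use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


class DeadlockKill(Exception):
    """Hypothetical error raised when the wait is interrupted by a deadlock kill."""


@dataclass
class TempTable:
    name: str
    in_use: Optional[str] = None   # name of the worker currently using the table


def open_temporary_table(
    tables: Dict[str, TempTable],
    name: str,
    worker: str,
    wait_for_prior_commit: Callable[[str], None],
) -> Optional[TempTable]:
    """Find a temporary table and claim it only after prior commits are done."""
    table = tables.get(name)       # pure lookup, no ownership change yet
    if table is None:
        return None                # not a temporary table: nothing to wait for
    # May raise DeadlockKill; the exception propagates to the caller and
    # in_use is left untouched, so no other worker sees a half-claimed table.
    wait_for_prior_commit(worker)
    table.in_use = worker
    return table


def failing_wait(worker: str) -> None:
    """Sample wait function that models being chosen as a deadlock victim."""
    raise DeadlockKill(f"{worker} was killed while waiting for prior commits")
```

If the wait fails, the caller catches `DeadlockKill` and can retry the transaction; the table's `in_use` field still holds whatever it held before the call.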
    • MDEV-7668: Intermediate master groups CREATE TEMPORARY with INSERT, causing... · c47fe0e9
      Kristian Nielsen authored
      MDEV-7668: Intermediate master groups CREATE TEMPORARY with INSERT, causing parallel replication failure
      
      [This commit was cherry-picked to be able to merge MDEV-7936, of which it
      is a prerequisite, into both 10.0 and 10.1.]
      
      Parallel replication depends on locking (table locks, row locks, etc.) to
      prevent two conflicting transactions from running and committing in parallel.
      But temporary tables are designed to be visible only to one thread, and have
      no such locking.
      
      In the concrete issue, an intermediate master could commit a CREATE TEMPORARY
      TABLE in the same group commit as an INSERT into that table. Thus, a
      lower-level slave could attempt to run them in parallel and get an error.
      
      More generally, we need protection from parallel replication trying to run
      transactions in parallel that access a common temporary table.
      
      This patch simply causes use of a temporary table from parallel replication
      to wait for all previous transactions to commit, serialising the replication
      at that point.
      
      (A more fine-grained locking scheme could possibly be added later. However,
      using temporary tables in statement-based replication is in any case
      normally undesirable; for example, a restart of the server will lose
      temporary tables and can break replication.)
      
      Note that row-based replication is not affected, as it does not use any
      temporary tables on the slave side.
      
      This patch also cleans up the locking that protects the list of
      temporary tables in Relay_log_info. Previously, rli->data_lock was taken
      at the end of every statement, which is very bad for concurrency. With
      this patch, the lock is not taken unless temporary tables (with
      statement-based binlogging) are in use on the slave.
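
The locking change in the last paragraph can be illustrated with a small Python sketch. It is a model of the idea only and an assumption about its shape, not the actual Relay_log_info code: `RelayLogInfoSketch`, `end_of_statement`, `saved_temp_tables` and the `lock_acquisitions` counter are hypothetical names, the counter existing purely so the effect can be observed.

```python
import threading
from typing import List


class RelayLogInfoSketch:
    """Hypothetical stand-in for the shared replication state."""

    def __init__(self) -> None:
        self.data_lock = threading.Lock()
        self.saved_temp_tables: List[str] = []
        self.lock_acquisitions = 0   # instrumentation for this sketch only

    def end_of_statement(self, stmt_temp_tables: List[str]) -> None:
        """Save temporary tables touched by a statement back to the shared list."""
        if not stmt_temp_tables:
            # Fast path: the statement used no temporary tables, so the
            # shared list is unchanged and the lock is not needed.
            return
        with self.data_lock:
            self.lock_acquisitions += 1
            for name in stmt_temp_tables:
                if name not in self.saved_temp_tables:
                    self.saved_temp_tables.append(name)
```

A workload without temporary tables never touches `data_lock` in this model, so worker threads do not contend on it at the end of each statement.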
    • Alexander Barkov's avatar
  11. 12 Apr, 2015 7 commits