1. 09 Apr, 2015 5 commits
  2. 08 Apr, 2015 19 commits
  3. 05 Apr, 2015 7 commits
  4. 03 Apr, 2015 4 commits
  5. 01 Apr, 2015 1 commit
  6. 31 Mar, 2015 2 commits
  7. 30 Mar, 2015 2 commits
      Merge MDEV-7847 and MDEV-7882 into 10.0. · f573b65e
      Kristian Nielsen authored
      Conflicts:
      	mysql-test/suite/rpl/r/rpl_parallel.result
      	sql/rpl_parallel.cc
      MDEV-7847: "Slave worker thread retried transaction 10 time(s) in vain, giving... · 880f2273
      Kristian Nielsen authored
      MDEV-7847: "Slave worker thread retried transaction 10 time(s) in vain, giving up", followed by replication hanging
      
      This patch fixes a bug in the error handling in parallel replication, when one
      worker thread gets a failure and other worker threads processing later
      transactions have to roll back and abort.
      
      The problem was with the lifetime of group_commit_orderer objects (GCOs).
      A GCO is freed when we register that its last event group has committed. This
      relies on register_wait_for_prior_commit() and wait_for_prior_commit() to
      ensure that the fact that T2 has committed implies that any earlier T1 has
      also committed, and can thus no longer execute mark_start_commit().
      
      However, in the error case, the code was skipping the
      register_wait_for_prior_commit() and wait_for_prior_commit() calls. Thus
      commit ordering was not guaranteed, and a GCO could be freed too early. A
      later mark_start_commit() would then reference a deallocated GCO, which
      could lead to a lost wakeup (causing slave threads to hang) or other
      corruption.
      
      This patch makes the error case respect commit order as well. That way, the
      error case also gets the GCO lifetime right, and the hang no longer occurs.
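
The lifetime problem described above can be illustrated with a small single-threaded toy model. This is only a sketch under stated assumptions, not the code in sql/rpl_parallel.cc: `ToyGco`, `Result`, and `simulate` are hypothetical names, and the three steps stand in for one possible interleaving of two worker threads (T1 earlier, T2 later and last in the GCO).

```cpp
#include <cassert>

// Toy model, NOT MariaDB code: every name in this block is hypothetical.
struct ToyGco {
    bool freed = false;  // set once the GCO's last event group is registered as done
};

struct Result {
    bool touched_freed_gco;  // T1's start-commit step saw an already freed GCO
    bool gco_freed_at_end;   // the GCO was eventually released
};

// Replays one interleaving: T2 fails and finishes before T1 starts to commit.
// The flag selects whether T2's error path still waits for the prior commit.
Result simulate(bool error_path_waits_for_prior_commit) {
    ToyGco gco;               // shared by T1 and T2; T2 is its last event group
    bool t2_waiting = false;  // T2 is blocked until T1 has finished

    // Step 1: T2 hits an error and rolls back while T1 is still running.
    if (error_path_waits_for_prior_commit) {
        t2_waiting = true;    // ordered behaviour: do not finish before T1
    } else {
        gco.freed = true;     // unordered behaviour: last group done, GCO freed
    }

    // Step 2: T1 reaches its start-commit step, which touches the GCO.
    bool touched_freed_gco = gco.freed;

    // Step 3: T1 is done, so a waiting T2 may now finish and free the GCO.
    if (t2_waiting) {
        gco.freed = true;
    }

    return {touched_freed_gco, gco.freed};
}
```

In this model, skipping the wait on the error path lets the earlier transaction touch a freed object, while keeping the wait defers the release until the earlier transaction is done. Real worker threads would block on a condition variable rather than set a flag, so the model only shows the ordering argument, not the synchronization itself.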