MDEV-6775: Wrong binlog order in parallel replication: Intermediate commit
The code in binlog group commit around wait_for_commit that controls commit order woke up subsequent commits too early: as soon as a following transaction was put into the group commit queue, but before any such commit had actually taken place. As a result, transactions that need to wait for a prior commit, but that do not take part in the binlog group commit for one reason or another, were woken up too early.

This patch solves the problem by moving the wakeup so that it happens only after the binlog group commit has completed. This requires a new way to ensure that transactions that arrive later than the leader can still participate in group commit.

This patch introduces a flag, wait_for_commit::commit_started. When this flag is set, a waiter can queue itself in the group commit queue. In effect, wait_for_prior_commit() is skipped only for transactions that participate in group commit, where skipping the wait is safe. Other transactions still wait as needed for correctness.
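
The decision a waiter makes under this scheme can be illustrated with a small standalone sketch. This is not the code from the patch: only the flag name commit_started comes from the description above. The struct WaitForCommitSketch, the commit_finished field, the Action enum, and next_action() are hypothetical names chosen for illustration, and the sketch assumes the flag is read from the prior transaction's record.

```cpp
#include <cassert>

// Hypothetical, simplified model of the rule described in the commit message.
// It is NOT the MariaDB implementation; only `commit_started` is a real name.
enum class Action {
    Proceed,            // prior commit is done, nothing to wait for
    JoinGroupQueue,     // skip the wait and queue up behind the prior commit
    WaitForPriorCommit  // block until the prior commit has fully completed
};

struct WaitForCommitSketch {
    bool commit_started = false;   // prior transaction has entered group commit
    bool commit_finished = false;  // assumed field: prior commit has completed
};

// Decide what a transaction should do, given the state of the transaction it
// must commit after and whether it takes part in binlog group commit itself.
Action next_action(const WaitForCommitSketch &prior,
                   bool participates_in_group_commit) {
    if (prior.commit_finished)
        return Action::Proceed;
    // Skipping the wait is only safe for group commit participants, because
    // the queue order then preserves the commit order.
    if (prior.commit_started && participates_in_group_commit)
        return Action::JoinGroupQueue;
    // Everyone else is woken only after the group commit has completed.
    return Action::WaitForPriorCommit;
}

// Small self-check covering the three cases.
void example() {
    WaitForCommitSketch prior;
    prior.commit_started = true;
    assert(next_action(prior, true) == Action::JoinGroupQueue);
    assert(next_action(prior, false) == Action::WaitForPriorCommit);
    prior.commit_finished = true;
    assert(next_action(prior, false) == Action::Proceed);
}
```

The point of the sketch is the middle branch: a transaction that does not participate in group commit keeps waiting even though the prior commit has started, which is the case the early wakeup used to get wrong.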