MDEV-6775: Wrong binlog order in parallel replication
In parallel replication, the wait_for_commit facility is used to ensure that events are written into the binlog in the correct order. This is handled in an optimised way in the binlog group commit code.

However, some statements, for example GRANT, are written directly into the binlog, outside of the group commit code. The bug was that this direct write did not wait for the prior transactions to be written first, so a statement such as GRANT could be written ahead of earlier transactions.

This patch adds the missing wait_for_prior_commit() before writing directly to the binlog.

The problem is not fully solved, although the race is now much less likely to occur. The optimised group commit code wakes up following transactions early, before the binlog write is actually done. A woken-up following transaction is then allowed to run ahead and queue up for the group commit, which ensures that the binlog writes still happen in the correct order in the end. The code for directly written events, however, currently bypasses this mechanism, so such events can still be woken up and written too early. This will be fixed properly in a later patch.
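The ordering rule that the added wait enforces can be illustrated with a small toy model. This is only a sketch in Python, not the server's code: the class `ToyCommitOrderer`, the method `direct_write`, and the helper `run_in_reverse_start_order` are hypothetical names invented for illustration, and a `threading.Event` per transaction stands in for the wait_for_commit facility.

```python
import threading


class ToyCommitOrderer:
    """Toy stand-in for the wait_for_commit idea; not MariaDB's actual code."""

    def __init__(self):
        self.binlog = []               # simulated binlog, in write order
        self._lock = threading.Lock()  # protects self.binlog

    def direct_write(self, event, prior_done, my_done):
        # Hypothetical analogue of calling wait_for_prior_commit()
        # before a direct binlog write (for example for GRANT).
        if prior_done is not None:
            prior_done.wait()
        with self._lock:
            self.binlog.append(event)
        # Wake up the following transaction only after the write is done.
        my_done.set()


def run_in_reverse_start_order(events):
    """Run one thread per event, starting the later events first."""
    orderer = ToyCommitOrderer()
    done = [threading.Event() for _ in events]
    threads = []
    for i, event in enumerate(events):
        prior = done[i - 1] if i > 0 else None
        threads.append(threading.Thread(
            target=orderer.direct_write, args=(event, prior, done[i])))
    # Start later transactions first to provoke out-of-order execution.
    for t in reversed(threads):
        t.start()
    for t in threads:
        t.join()
    return orderer.binlog
```

Even though the threads are started in reverse, each write waits for its predecessor, so the simulated binlog keeps the original order. Moving `my_done.set()` above the append in this toy would model the early wakeup described above: a follower that writes directly could then get ahead of the transaction that woke it.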