Commit 684715a2 authored by Kristian Nielsen

MDEV-6775: Wrong binlog order in parallel replication

In parallel replication, the wait_for_commit facility is used to ensure that
events are written into the binlog in the correct order. This is handled in an
optimised way in the binlogging group commit code.

However, some statements, for example GRANT, are written directly into the
binlog, outside of the group commit code. The bug was that this direct write
did not wait for the prior transactions to be written first, so a statement
such as GRANT could be written ahead of earlier transactions.

This patch adds the missing wait_for_prior_commit() before writing directly to
the binlog.
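
The ordering guarantee that the added wait provides can be sketched in
standalone C++. This is only an illustration, not the server code: the
CommitOrder struct, the std::vector standing in for the binlog, and the
run_ordered_direct_write() helper are all hypothetical names. Only the idea
of calling wait_for_prior_commit() before the direct write and checking its
int result comes from the patch.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the server's wait_for_commit facility.
struct CommitOrder {
  std::mutex m;
  std::condition_variable cv;
  bool prior_done = false;

  // Blocks until the prior transaction has signalled. Returns 0 on success,
  // mirroring the int result that the patch checks.
  int wait_for_prior_commit() {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [this] { return prior_done; });
    return 0;
  }

  // Called by the prior transaction once its own write is complete.
  void wakeup_subsequent_commits() {
    {
      std::lock_guard<std::mutex> lk(m);
      prior_done = true;
    }
    cv.notify_all();
  }
};

// Runs an earlier transaction (T1) and a later direct write (GRANT) in two
// threads and returns the resulting order of the simulated binlog.
std::vector<std::string> run_ordered_direct_write() {
  std::vector<std::string> binlog;  // simulated binlog, not the real one
  std::mutex log_mutex;             // plays the role of LOCK_log
  CommitOrder order;

  std::thread direct([&] {
    // The wait this patch adds: do not write before the prior commit.
    if (order.wait_for_prior_commit()) return;
    std::lock_guard<std::mutex> lk(log_mutex);
    binlog.push_back("GRANT");
  });
  std::thread earlier([&] {
    {
      std::lock_guard<std::mutex> lk(log_mutex);
      binlog.push_back("T1");
    }
    order.wakeup_subsequent_commits();  // signal only after the write
  });

  earlier.join();
  direct.join();
  assert(binlog.size() == 2);
  return binlog;
}
```

In this sketch the wakeup is sent only after T1 has been written, so the
direct write can never overtake it. Without the wait in the direct thread,
either order would be possible.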

However, the problem is not fully solved, although the race is now much less
likely to occur. The remaining problem is that the optimised group commit code
wakes up following transactions early, before the binlog write is actually
done. A woken-up following transaction is then allowed to run ahead and queue
up for the group commit, which ensures that the binlog writes happen in the
correct order in the end. However, the code for directly written events
currently bypasses this mechanism, so such events can still be woken up and
written too early.
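
One interleaving that leads to the remaining race can be modelled as a
single-threaded sketch. It is a deliberately simplified model under stated
assumptions, not the actual group commit implementation: the queue, the flag,
and the simulate_early_wakeup() function are hypothetical names chosen for
illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Replays one possible interleaving step by step and returns the resulting
// order of the simulated binlog.
std::vector<std::string> simulate_early_wakeup() {
  std::vector<std::string> binlog;       // simulated binlog
  std::vector<std::string> group_queue;  // transactions queued for group commit
  bool followers_woken = false;

  // Step 1: T1 queues for group commit. The optimised code wakes up the
  // following transactions now, before T1 is actually written.
  group_queue.push_back("T1");
  followers_woken = true;

  // Step 2: the direct write (e.g. GRANT) sees that its wait is satisfied.
  // It bypasses the group commit queue and writes immediately.
  if (followers_woken) {
    binlog.push_back("GRANT");
  }

  // Step 3: the group commit leader finally writes the queued transactions.
  for (const std::string &name : group_queue) {
    binlog.push_back(name);
  }
  return binlog;
}
```

In this model GRANT lands in the log ahead of T1. A follower that went through
the group commit queue instead would be written after T1, which is why only
the direct writes are affected.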

This will be fixed properly in a later patch.
parent 55791c1a
@@ -5848,7 +5848,10 @@ bool MYSQL_BIN_LOG::write(Log_event *event_info, my_bool *with_annotate)
    if (direct)
    {
      int res;
      DBUG_PRINT("info", ("direct is set"));
      if ((res= thd->wait_for_prior_commit()))
        DBUG_RETURN(res);
      file= &log_file;
      my_org_b_tell= my_b_tell(file);
      mysql_mutex_lock(&LOCK_log);
......