    Bug #38694 Race condition in replication thread shutdown · d38e6263
    Andrei Elkin authored
    The cause of this bug is an unguarded access to mi->slave_running
    by the shutdown thread calling end_slave(). This is the same problem as
    bug#29968, which unfortunately was not cross-linked with this bug.
    
    Fixed:
    
    by removing the unguarded read of the running status and reading it
    instead in terminate_slave_thread() while run_lock is held. This is mostly
    a backport of the bug#29968 fix, with some improvements over that patch
    (see the error reporting from terminate_slave_thread()).
    The issue of bug#38716 is fixed here for the 5.0 branch as well.
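
    The guarded read described above can be illustrated with a minimal,
    self-contained sketch. It is not the code from sql/slave.cc: the type
    SlaveThreadState, the function terminate_thread_sketch() and the result
    codes are hypothetical stand-ins, and std::mutex replaces the server's own
    mutex type. The only point shown is that the running flag is read while
    run_lock is held, and that "not running" is reported to the caller.

```cpp
#include <mutex>

// Hypothetical stand-in for the per-thread replication state.
// The real structure in the server is different; only the locking
// pattern is illustrated here.
struct SlaveThreadState {
  std::mutex run_lock;           // plays the role of run_lock
  bool slave_running = false;    // plays the role of mi->slave_running
  bool abort_requested = false;  // illustrative "please stop" flag
};

// Hypothetical result codes for the sketch.
enum TerminateResult { ABORT_REQUESTED = 0, NOT_RUNNING = 1 };

// Reads the running flag only while run_lock is held. If skip_lock is true,
// the caller is assumed to hold run_lock already, so it is not taken again.
TerminateResult terminate_thread_sketch(SlaveThreadState &st, bool skip_lock) {
  std::unique_lock<std::mutex> guard(st.run_lock, std::defer_lock);
  if (!skip_lock)
    guard.lock();

  // Guarded read: no other thread can change the flag under us here.
  if (!st.slave_running)
    return NOT_RUNNING;  // reported to the caller instead of being ignored

  st.abort_requested = true;  // a real implementation would also wait for exit
  return ABORT_REQUESTED;
}
```

    In this sketch the caller no longer checks the flag before calling the
    function; it calls unconditionally and acts on the returned code, so no
    read of the flag happens outside the lock.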
    
    Note:
    
    A separate problem has been identified: a race condition between
    init_slave() and end_slave(), reported as Bug#44467.
    
    mysql-test/r/rpl_bug38694.result:
      a new results file is added.
    mysql-test/t/rpl_bug38694-slave.opt:
      simulates a delay at slave thread shutdown.
    mysql-test/t/rpl_bug38694.test:
      A new test to check if a delay at the termination phase of slave threads
      could cause any issue.
    sql/slave.cc:
      The unguarded read of the running status is removed. The status is now
      read in terminate_slave_thread() while run_lock is held.
      terminate_slave_threads(skip_lock := !need_slave_mutex) is now called in
      the failing branch of start_slave_threads(), which fixes the bug#38716
      issue.
    sql/slave.h:
      terminate_slave_thread() is removed from the global interface scope.
slave.cc 177 KB