1. 08 Nov, 2013 1 commit
    • MDEV-4506: Parallel replication · 2ea0e599
      unknown authored
      Tested manually that crash in the middle of writing transaction on the master
      does correctly cause a rollback on slave, so remove the corresponding ToDo.
  2. 07 Nov, 2013 3 commits
  3. 06 Nov, 2013 6 commits
    • 71c56e6e
      Elena Stepanova authored
    • More verbose error messages · cdecd86a
      Elena Stepanova authored
    • MDEV-4506: Parallel replication · dcb3650d
      unknown authored
      MDEV-5217: Incorrect MyISAM event execution order causing incorrect parallel replication
      
      In parallel replication, if transactions A,B group-commit together on the
      master, we can execute them in parallel on a replication slave. But then, if
      transaction C follows on the master, on the slave, we need to be sure that
      both A and B have completed before starting on C to be sure to avoid
      conflicts.
      
      The necessary wait is implemented such that B waits for A to commit before it
      commits itself (thus preserving commit order). And C waits for B to commit
      before it itself can start executing. This way C does not start until both A
      and B have completed.
      
      B's wait for A to commit happens inside the commit processing. However, in
      the case of MyISAM with no binlog enabled on the slave, it appears that no
      commit processing takes place (since MyISAM is non-transactional), and thus
      B never waited for A. This allowed C to start before A had completed, which
      can lead to conflicts and incorrect replication.
      
      Fixed by doing an extra wait for A at the end of B before signalling C.
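
      The ordering described above can be sketched with plain threads. This is
      an illustrative model only, not the server code: the Txn class, the
      has_commit_step flag and the replay() helper are all hypothetical names.
      B has no commit step (as with MyISAM and no binlog), so only the extra
      wait at the end of B keeps C from starting before A has completed.

```python
import threading
import time


class Txn:
    """Hypothetical stand-in for one replicated transaction on the slave."""

    def __init__(self, name, prev=None, has_commit_step=True, delay=0.0):
        self.name = name
        self.prev = prev                      # transaction that must finish first
        self.has_commit_step = has_commit_step
        self.delay = delay                    # simulated execution time
        self.finished = threading.Event()

    def execute(self, log):
        time.sleep(self.delay)
        log.append("exec " + self.name)
        if self.prev is not None:
            if self.has_commit_step:
                self.prev.finished.wait()     # wait done inside commit processing
            self.prev.finished.wait()         # extra wait before signalling successors
        log.append("done " + self.name)
        self.finished.set()


def replay():
    log = []
    a = Txn("A", delay=0.05)                      # A is slow
    b = Txn("B", prev=a, has_commit_step=False)   # MyISAM-like: no commit step
    c = Txn("C", prev=b)

    def run_c():
        b.finished.wait()                     # C must not start before B is done
        c.execute(log)

    threads = [threading.Thread(target=a.execute, args=(log,)),
               threading.Thread(target=b.execute, args=(log,)),
               threading.Thread(target=run_c)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

      Removing the second wait() in execute() models the bug: B would then
      finish and release C while A is still running.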
    • MDEV-4506: Parallel replication · c90f4f02
      unknown authored
      MDEV-5217: Unlock of de-allocated mutex
      
      There was a race in the code for wait_for_commit::wakeup().
      
      Since the waiter does a dirty read of the waiting_for_commit
      flag, it was possible for the waiter to complete and deallocate
      the wait_for_commit object while the waitee was still running
      inside wakeup(). This would cause the waitee to access invalid
      memory.
      
      Fixed by putting an extra lock/unlock in the destructor for
      wait_for_commit, to ensure that waitee has finished with the
      object before it is deallocated.
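
      The effect of the extra lock/unlock can be modelled as below. Python has
      no manual deallocation, so a `destroyed` flag stands in for freeing the
      object; the class and its fields are hypothetical and only mirror the
      description above, not the real wait_for_commit implementation.

```python
import threading
import time


class WaitForCommit:
    """Hypothetical model; a `destroyed` flag stands in for deallocation."""

    def __init__(self):
        self.lock = threading.Lock()
        self.waiting = True
        self.destroyed = False
        self.used_after_destroy = False

    def wakeup(self):
        # Called by the waitee.
        with self.lock:
            self.waiting = False          # the waiter may observe this at once
            time.sleep(0.05)              # waitee is still inside wakeup()
            if self.destroyed:
                self.used_after_destroy = True

    def destroy(self):
        # Extra lock/unlock: blocks until any running wakeup() has finished.
        with self.lock:
            pass
        self.destroyed = True


def run():
    w = WaitForCommit()

    def waiter():
        while w.waiting:                  # dirty read, no lock taken
            time.sleep(0.001)
        w.destroy()

    threads = [threading.Thread(target=waiter),
               threading.Thread(target=w.wakeup)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w
```

      Without the `with self.lock: pass` in destroy(), the waiter could mark
      the object destroyed while wakeup() is still using it.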
    • MDEV-4506: Parallel replication · bdbf90b9
      unknown authored
      MDEV-5217: Incorrect event pos update leading to corruption of reading of events from relay log
      
      The rli->event_relay_log_pos was sometimes updated incorrectly when using
      parallel replication, especially around relay log rotates. This could cause
      the SQL thread to seek into an invalid position in the relay log, resulting in
      errors about invalid events or even random corruption in some cases.
  4. 05 Nov, 2013 2 commits
    • MDEV-4506: Parallel replication. · b0391d1b
      unknown authored
      MDEV-5217: Last_sql_error lost in parallel replication.
      
      For some reason, the query execution code in log_event.cc calls
      rli->clear_error for each event (part of clear_all_errors()).
      This causes a problem in parallel replication, where the
      execution in one worker thread could clear the error set by
      another thread, causing the SQL thread to stop but leaving no
      error visible in SHOW SLAVE STATUS.
      
      There seems to be no reason to clear the global error code
      in Relay_log_info for each event execution, from code review
      and from running the test suite. So remove this clearing of
      the error code to make things work also in the parallel case.
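
      A small model of the lost error, assuming a single shared error slot.
      RelayLogInfo here is a hypothetical stand-in, and apply_events() simply
      replays an interleaving of events from two workers.

```python
class RelayLogInfo:
    """Hypothetical stand-in for the shared error slot."""

    def __init__(self):
        self.last_error = None

    def report(self, message):
        self.last_error = message

    def clear_error(self):
        self.last_error = None


def apply_events(rli, events, clear_before_each):
    # `events` is the interleaved order in which workers run their events.
    for worker, ok in events:
        if clear_before_each:
            rli.clear_error()             # old behaviour: clear for every event
        if not ok:
            rli.report("worker %d failed" % worker)
    return rli.last_error


# Worker 1 fails, then worker 2 executes an unrelated event.
interleaving = [(1, False), (2, True)]
```

      With clear_before_each=True the second worker wipes the first worker's
      error; with False the error survives for the status output.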
    • MDEV-4506: Parallel replication · c834242a
      unknown authored
      MDEV-5217: SQL thread hangs during stop if error occurs in the middle of an event group
      
      Normally, when we stop the slave SQL thread in parallel replication, we want
      the worker threads to continue processing events until the end of the current
      event group. But if we stop due to an error that prevents further events from
      being queued, such as an error reading the relay log, no more events can be
      queued for the workers, so they have to abort even if they are in the middle
      of an event group. There was a bug that caused a deadlock: the workers
      waited for more events to be queued for the event group, while the SQL
      thread had stopped and was waiting for the workers to complete their
      current event group before exiting.
      
      Fixed by now signalling from the SQL thread to all workers when it is about
      to exit, and cleaning up in all workers when so signalled.
      
      This patch fixes one of multiple problems reported in MDEV-5217.
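
      The stop signalling can be sketched with a queue per worker. The STOP
      sentinel, worker() and sql_thread_stops_on_error() are hypothetical
      names for illustration; the real server uses its own queues and
      condition variables.

```python
import queue
import threading

STOP = object()   # hypothetical "SQL thread is exiting" signal


def worker(events, outcome):
    group = []
    while True:
        ev = events.get()                 # blocks until something is queued
        if ev is STOP:
            outcome.append(("aborted", list(group)))   # clean up partial group
            return
        group.append(ev)
        if ev == "COMMIT":
            outcome.append(("applied", list(group)))
            group = []


def sql_thread_stops_on_error(partial_group, n_workers=2):
    queues = [queue.Queue() for _ in range(n_workers)]
    outcomes = [[] for _ in range(n_workers)]
    threads = [threading.Thread(target=worker, args=(q, o))
               for q, o in zip(queues, outcomes)]
    for t in threads:
        t.start()
    for ev in partial_group:              # worker 0 gets an incomplete group
        queues[0].put(ev)
    for q in queues:                      # relay log error: signal all workers
        q.put(STOP)
    for t in threads:
        t.join(timeout=5)
    return outcomes, [t.is_alive() for t in threads]
```

      Without the STOP signal, worker 0 would block forever in events.get()
      waiting for the rest of its group, which is the deadlock described above.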
  5. 04 Nov, 2013 4 commits
    • increase the initial ibdata1 size, as explained in MySQL-5.6... · bf603250
      Sergei Golubchik authored
      increase the initial ibdata1 size, as explained in MySQL-5.6 revid:kevin.lewis@oracle.com-20120802192452-kmikiz990xzje18b
      
      "
        A maximum size of 10 Mb works in 5.1 because the initial
        required size of ibdata1 was less than 10M.  But in 5.5, a
        change was made to allocate all 128 rollback segments at
        bootstrap.  Since then, the initial size has been 10M + the
        default autoextend size of 8M. 
      
        In 5.6, worklog 6216 changes the autoextend size from 8M to
        64M.  This changes the initial size of ibdata1 from 18M in
        5.5 and earlier releases of 5.6 to 74M in the current
        mysql-5.6 and mysql-trunk.  So this change is especially
        needed in 5.6.
      "
      
      12M is enough to avoid autoextending during bootstrap
    • MDEV-5080 Assertion `strcmp(share->unique_file_name,filename) ||... · 1ef87c55
      Sergei Golubchik authored
      MDEV-5080 Assertion `strcmp(share->unique_file_name,filename) || share->last_version' fails at /storage/myisam/mi_open.c:67
      
      extend table names discovery (ha_discover_table_names() and Discovered_table_list) to return
      or optionally filter out temporary tables ("#sql..."). SHOW commands and I_S tables
      typically want temp tables filtered out, while DROP DATABASE wants to see them too.
      
      additionally, remove the suppression for the warning "Invalid (old?) table or database name"
      from mtr, and add it to .test files as needed (we need to test that this warning
      does *not* happen in drop.test)
    • restore the condition in filename_to_tablename() · 032a61fc
      Sergei Golubchik authored
      (broken in the revid:sergii@pisem.net-20130615170931-bn2h8j30vu5bfp0t)
    • MDEV-5232 SET ROLE checks privileges differently from check_access() · 79d2e6c8
      Sergei Golubchik authored
      use the same inconsistent priv_user@host pair for SET ROLE privilege checks,
      just as check_access() does
  6. 03 Nov, 2013 4 commits
  7. 02 Nov, 2013 2 commits
  8. 01 Nov, 2013 1 commit
  9. 31 Oct, 2013 1 commit
    • MDEV-5206: Incorrect slave old-style position in MDEV-4506, parallel replication. · 39df665a
      unknown authored
      In parallel replication, there are two kinds of events which are
      executed in different ways.
      
      Normal events that are part of event groups/transactions are executed
      asynchronously by being queued for a worker thread.
      
      Other events like format description and rotate and such are executed
      directly in the driver SQL thread.
      
      If the direct execution of the other events were to update the old-style
      position, then the position gets updated too far ahead, before the normal
      events that have been queued for a worker thread have been executed. So
      this patch adds some special cases to prevent such position updates ahead
      of time, and instead queues dummy events for the worker threads, so that
      they will at an appropriate time do the position updates instead.
      
      (Also fix a race in a test case that happened to trigger while running
      tests for this patch).
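
      The idea of queueing a dummy event for the position update can be
      modelled with a single queue. The Slave class and the "position_only"
      marker are hypothetical; the sketch only shows that the position never
      runs ahead of the queued event groups.

```python
from collections import deque


class Slave:
    """Hypothetical model of the driver SQL thread plus one worker."""

    def __init__(self):
        self.queue = deque()
        self.position = 0                 # old-style replication position

    def driver_receive(self, kind, end_pos):
        if kind == "group":
            self.queue.append(("apply", end_pos))
        else:
            # e.g. a rotate event: do not update the position here, queue a
            # dummy entry so the worker updates it in the right order.
            self.queue.append(("position_only", end_pos))

    def worker_step(self):
        action, end_pos = self.queue.popleft()
        # An "apply" entry would execute its event group at this point.
        self.position = end_pos
        return action
```

      After the driver has received a group ending at 100 and a rotate ending
      at 120, the position stays at 0 until the worker has processed them, and
      then advances to 100 and 120 in that order.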
  10. 30 Oct, 2013 1 commit
    • MDEV-5196: Server hangs or assertion `!thd->wait_for_commit_ptr' fails on... · 9c8da4ed
      unknown authored
      MDEV-5196: Server hangs or assertion `!thd->wait_for_commit_ptr' fails on MASTER_POS_WAIT with slave-parallel-threads > 0
      
      Fix a couple of issues in MDEV-4506, Parallel replication:
      
       - Missing mysql_cond_signal(), which could cause hangs.
      
       - Fix incorrect update of old-style replication position.
      
       - Change assertion to error handling (can trigger on manipulated/
         corrupt binlog).
  11. 29 Oct, 2013 5 commits
    • merge 5.5->10.0-base · f4d5d849
      unknown authored
    • Merge 5.3->5.5 · 52dea410
      unknown authored
    • MariaDB made compilable with gcc 4.8.1 · 5ce11d8b
      unknown authored

        There were 2 problems:
        1) copying/moving of the same type (usually casting) as sizeof() (solved in different ways depending on the cause);
        2) using 'const' in SSL_CTX::getVerifyCallback(), which returns an object (not a reference), so a copy of the object is created and 'const' makes no sense.
    • MDEV-5195: Race when switching relay log causing crash · f2799c68
      unknown authored
      In parallel replication, when the IO thread switches relay log,
      the SQL thread re-opens the current relaylog and seeks to the
      current position. There was a race that would cause it to
      sometimes seek to the wrong position, causing corruption and
      crash.
    • MDEV-5104 crash in Item_field::used_tables with broken order by · 883af99e
      timour@askmonty.org authored
      Analysis:
      st_select_lex_unit::prepare() computes can_skip_order_by as TRUE.
      As a result join->prepare() gets called with order == NULL, and
      doesn't do name resolution for the inner ORDER clause. Due to this
      the prepare phase doesn't detect that the query references a non-existing
      function and field.
        
      Later join->optimize() calls update_used_tables() for a non-resolved
      Item_field, which understandably has no Field object. This call results
      in a crash.
      
      Solution:
      Resolve unnecessary ORDER BY clauses to detect if they reference non-existing
      objects. Then remove such clauses from the JOIN object.
  12. 28 Oct, 2013 2 commits
    • MDEV-4506: Parallel replication. · 2fbd1c73
      unknown authored
      MDEV-5189: Error handling in parallel replication.
      
      Fix error handling in parallel worker threads when a query fails:
      
       - Report the error to the error log.
      
       - Return the error back, and set rli->abort_slave.
      
       - Stop executing more events after the error.
    • Don't allow authentication clauses for roles, in particular: · fef41669
      Sergei Golubchik authored
        GRANT ... IDENTIFIED BY [ PASSWORD ] ...
        GRANT ... IDENTIFIED VIA ... [ USING ... ]
        GRANT ... REQUIRE ...
        GRANT ... MAX_xxx ...
        SET PASSWORD FOR ... = ...
  13. 27 Oct, 2013 1 commit
  14. 26 Oct, 2013 2 commits
  15. 25 Oct, 2013 2 commits
    • MDEV-5189: Incorrect parallel apply in parallel replication · 6a38b594
      unknown authored
      Two problems were fixed:
      
      1. When not in GTID mode (master_use_gtid=no), then we must not apply events
         in different domains in parallel (in non-GTID mode we are not capable of
         restarting at different points in different domains).
      
      2. When transactions B and C group commit together, but after and separate
         from A, we can apply B and C in parallel, but both B and C must not start
         until A has committed. Fix sub_id to be globally increasing (not just
         per-domain increasing) so that this wait (which is based on sub_id) can be
         done correctly.
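
      Why a per-domain sub_id is not enough for such a wait can be seen from a
      toy allocator comparison. Both allocator classes are hypothetical and do
      not reproduce the server's wait logic; they only show the numbering.

```python
import itertools


class PerDomainSubId:
    """Hypothetical allocator: one counter per replication domain."""

    def __init__(self):
        self.counters = {}

    def next(self, domain_id):
        self.counters[domain_id] = self.counters.get(domain_id, 0) + 1
        return self.counters[domain_id]


class GlobalSubId:
    """Hypothetical allocator: one counter shared by all domains."""

    def __init__(self):
        self.counter = itertools.count(1)

    def next(self, domain_id):
        return next(self.counter)


def assign(allocator, domains):
    # Allocate sub_ids in the order the transactions arrive.
    return [allocator.next(d) for d in domains]


# Domains of transactions A, B and C, in arrival order.
arrival = [0, 1, 0]
```

      With per-domain counters A and B both receive the same sub_id, so a wait
      keyed on "all smaller sub_ids have committed" cannot order B after A.
      With a global counter the sub_ids follow arrival order across domains.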
    • MDEV-4506: Parallel replication. · 80d0dd7b
      unknown authored
      Do not update relay-log.info and master.info on disk after every event
      when using GTID mode:
      
       - relay-log.info and master.info are not crash-safe, and are not used
         when slave restarts in GTID mode (slave connects with GTID position
         instead and immediately rewrites the file with the new, correct
         information found).
      
       - When using GTID and parallel replication, the position in
         relay-log.info is misleading at best and simply wrong at worst.
      
       - When using parallel replication, the fact that every single
         transaction needs to do a write() syscall to the same file is
         likely to become a serious bottleneck.
      
      The files are still written at normal slave stop.
      
      In non-GTID mode, the files are written as normal (this is needed to
      be able to restart after slave crash, even if such restart is then not
      crash-safe, no change).
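
      The resulting write pattern can be sketched as follows. PositionFile is
      a hypothetical stand-in for relay-log.info/master.info handling and only
      counts writes; it performs no real file I/O.

```python
class PositionFile:
    """Hypothetical model that counts how often the position file is written."""

    def __init__(self, using_gtid):
        self.using_gtid = using_gtid
        self.pending = None               # latest position not yet on disk
        self.writes = 0

    def after_event(self, pos):
        self.pending = pos
        if not self.using_gtid:
            self._write()                 # non-GTID mode: write as before

    def stop(self):
        self._write()                     # always written at normal slave stop

    def _write(self):
        self.writes += 1


def run(using_gtid, positions):
    f = PositionFile(using_gtid)
    for pos in positions:
        f.after_event(pos)
    f.stop()
    return f.writes
```

      For three events, this model performs one write in GTID mode and four
      in non-GTID mode.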
  16. 24 Oct, 2013 3 commits