1. 06 Nov, 2013 4 commits
    • More verbose error messages · cdecd86a
      Elena Stepanova authored
    • MDEV-4506: Parallel replication · dcb3650d
      unknown authored
      MDEV-5217: Incorrect MyISAM event execution order causing incorrect parallel replication
      
      In parallel replication, if transactions A,B group-commit together on the
      master, we can execute them in parallel on a replication slave. But then, if
      transaction C follows on the master, on the slave, we need to be sure that
      both A and B have completed before starting on C to be sure to avoid
      conflicts.
      
      The necessary wait is implemented such that B waits for A to commit before it
      commits itself (thus preserving commit order). And C waits for B to commit
      before it itself can start executing. This way C does not start until both A
      and B have completed.
      
      B's wait for A's commit happens inside the commit processing. However, in
      the case of MyISAM with no binlog enabled on the slave, it appears that no
      commit processing takes place (since MyISAM is non-transactional), and thus
      the wait of B for A was not done. This allowed C to start before A, which can
      lead to conflicts and incorrect replication.
      
      Fixed by doing an extra wait for A at the end of B before signalling C.
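      The ordering described above can be sketched as follows. This is a minimal
      illustration only, not the server code: `Txn`, `mark_done()`, `wait_done()`
      and `run_abc()` are hypothetical names, and plain C++ threads stand in for
      the replication worker threads. B waits for A at the end of its own work
      before it signals C, so C cannot start ahead of A even when no commit
      processing takes place.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a replicated transaction (not wait_for_commit).
struct Txn {
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    void mark_done() {
        {
            std::lock_guard<std::mutex> g(m);
            done = true;
        }
        cv.notify_all();
    }
    void wait_done() {
        std::unique_lock<std::mutex> g(m);
        cv.wait(g, [this] { return done; });
    }
};

// Runs A and B in parallel, then C; returns the order of completion.
std::vector<std::string> run_abc() {
    Txn a, b;
    std::mutex log_m;
    std::vector<std::string> log;
    auto record = [&](const char *name) {
        std::lock_guard<std::mutex> g(log_m);
        log.push_back(name);
    };

    std::thread ta([&] { record("A"); a.mark_done(); });
    std::thread tb([&] {
        // B's own work would run here and may finish before A's.
        a.wait_done();   // extra wait: B never completes ahead of A
        record("B");
        b.mark_done();   // only now may C start
    });
    std::thread tc([&] {
        b.wait_done();   // C waits for B, and therefore also for A
        record("C");
    });
    ta.join();
    tb.join();
    tc.join();
    assert(log.size() == 3);
    return log;
}
```

      Because C only waits for B, the guarantee that A has also finished rests
      entirely on B's wait for A, which is why that wait must not be skipped.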
    • MDEV-4506: Parallel replication · c90f4f02
      unknown authored
      MDEV-5217: Unlock of de-allocated mutex
      
      There was a race in the code for wait_for_commit::wakeup().
      
      Since the waiter does a dirty read of the waiting_for_commit
      flag, it was possible for the waiter to complete and deallocate
      the wait_for_commit object while the waitee was still running
      inside wakeup(). This would cause the waitee to access invalid
      memory.
      
      Fixed by putting an extra lock/unlock in the destructor for
      wait_for_commit, to ensure that waitee has finished with the
      object before it is deallocated.
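      A minimal sketch of the destructor technique, under stated assumptions:
      `WaitForCommit` here is a simplified, hypothetical stand-in for the real
      object, the waiter polls instead of sleeping on a condition, and an atomic
      flag is used so that the unlocked read is well defined in standard C++.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical, simplified stand-in for the server's wait_for_commit object.
struct WaitForCommit {
    std::mutex lock;
    std::atomic<bool> waiting_for_commit{true};

    // Called by the waitee once it has committed.
    void wakeup() {
        std::lock_guard<std::mutex> guard(lock);
        waiting_for_commit.store(false);
        // The waitee keeps using `lock` until `guard` goes out of scope.
    }

    ~WaitForCommit() {
        // Extra lock/unlock: blocks until a concurrent wakeup() has released
        // the mutex, so the object is not freed underneath the waitee.
        lock.lock();
        lock.unlock();
    }
};

// Runs `rounds` wakeups, each against a freshly allocated object that the
// waiter frees as soon as its unlocked read sees the flag cleared.
int run_rounds(int rounds) {
    int completed = 0;
    for (int i = 0; i < rounds; ++i) {
        WaitForCommit *w = new WaitForCommit;
        std::thread waitee([w] { w->wakeup(); });
        // The waiter checks the flag without taking the lock.
        while (w->waiting_for_commit.load())
            std::this_thread::yield();
        delete w;        // destructor waits for wakeup() to leave the mutex
        waitee.join();
        ++completed;
    }
    assert(completed == rounds);
    return completed;
}
```

      Without the lock/unlock in the destructor, `delete w` could run while the
      waitee still holds the mutex inside `wakeup()`.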
    • MDEV-4506: Parallel replication · bdbf90b9
      unknown authored
      MDEV-5217: Incorrect event pos update leading to corruption of reading of events from relay log
      
      The rli->event_relay_log_pos was sometimes updated incorrectly when using
      parallel replication, especially around relay log rotates. This could cause
      the SQL thread to seek into an invalid position in the relay log, resulting in
      errors about invalid events or even random corruption in some cases.
  2. 05 Nov, 2013 2 commits
    • MDEV-4506: Parallel replication. · b0391d1b
      unknown authored
      MDEV-5217: Last_sql_error lost in parallel replication.
      
      For some reason, the query execution code in log_event.cc calls
      rli->clear_error for each event (part of clear_all_errors()).
      This causes a problem in parallel replication, where the
      execution in one worker thread could clear the error set by
      another thread, causing the SQL thread to stop but leaving no
      error visible in SHOW SLAVE STATUS.
      
      There seems to be no reason to clear the global error code
      in Relay_log_info for each event execution, from code review
      and from running the test suite. So remove this clearing of
      the error code to make things work also in the parallel case.
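      The effect can be reproduced in a small single-threaded simulation. The
      names below (`RelayInfo`, `execute_event`, `error_after_two_workers`) are
      hypothetical and only model a shared error slot; they are not the actual
      Relay_log_info interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the shared error slot in Relay_log_info.
struct RelayInfo {
    std::string last_error;
    void clear_error() { last_error.clear(); }
};

// Executes one event; `clear_first` mimics the old per-event clearing.
void execute_event(RelayInfo &rli, bool fails, bool clear_first) {
    if (clear_first)
        rli.clear_error();
    if (fails)
        rli.last_error = "event failed";
}

// Worker 1 hits an error, then worker 2 executes a successful event.
// Returns the error that would remain visible afterwards.
std::string error_after_two_workers(bool clear_first) {
    RelayInfo rli;
    execute_event(rli, /*fails=*/true, clear_first);   // worker 1
    execute_event(rli, /*fails=*/false, clear_first);  // worker 2
    assert(clear_first || !rli.last_error.empty());
    return rli.last_error;
}
```

      With per-event clearing the second worker wipes the first worker's error;
      without it the error stays visible.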
    • MDEV-4506: Parallel replication · c834242a
      unknown authored
      MDEV-5217: SQL thread hangs during stop if error occurs in the middle of an event group
      
      Normally, when we stop the slave SQL thread in parallel replication, we want
      the worker threads to continue processing events until the end of the current
      event group. But if we stop due to an error that prevents further events from
      being queued, such as an error reading the relay log, no more events can be
      queued for the workers, so they have to abort even if they are in the middle
      of an event group. There was a bug where this would deadlock: the workers
      were waiting for more events to be queued for the event group, while the SQL
      thread had stopped and was waiting for the workers to complete their current
      event group before exiting.
      
      Fixed by now signalling from the SQL thread to all workers when it is about
      to exit, and cleaning up in all workers when so signalled.
      
      This patch fixes one of multiple problems reported in MDEV-5217.
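      A minimal sketch of the signalling idea, assuming a hypothetical
      `WorkerQueue` with an explicit "SQL thread is exiting" flag; the real
      worker and queue structures differ. The worker's wait condition covers
      both "an event arrived" and "the SQL thread is exiting", so it can no
      longer block forever in the middle of an event group.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

enum class Event { Row, EndOfGroup };

// Hypothetical per-worker queue; not the server's actual data structure.
struct WorkerQueue {
    std::mutex m;
    std::condition_variable cv;
    std::queue<Event> events;
    bool sql_thread_exiting = false;
};

// Returns true if the worker completed its event group, false if it had to
// abort in the middle of the group because the SQL thread is exiting.
bool run_worker(WorkerQueue &q) {
    for (;;) {
        std::unique_lock<std::mutex> g(q.m);
        q.cv.wait(g, [&q] { return q.sql_thread_exiting || !q.events.empty(); });
        if (q.events.empty())
            return false;            // signalled to stop: clean up and exit
        Event ev = q.events.front();
        q.events.pop();
        if (ev == Event::EndOfGroup)
            return true;
    }
}

// Queues only the first event of a group, then signals exit as the SQL
// thread would after an error reading the relay log.
bool stop_in_middle_of_group() {
    WorkerQueue q;
    bool completed = true;
    std::thread worker([&] { completed = run_worker(q); });
    {
        std::lock_guard<std::mutex> g(q.m);
        q.events.push(Event::Row);
    }
    q.cv.notify_one();
    {
        std::lock_guard<std::mutex> g(q.m);
        q.sql_thread_exiting = true;
    }
    q.cv.notify_all();
    worker.join();                   // returns instead of deadlocking
    assert(q.events.empty());
    return completed;
}
```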
  3. 04 Nov, 2013 4 commits
    • increase the initial ibdata1 size, as explained in MySQL-5.6... · bf603250
      Sergei Golubchik authored
      increase the initial ibdata1 size, as explained in MySQL-5.6 revid:kevin.lewis@oracle.com-20120802192452-kmikiz990xzje18b
      
      "
        A maximum size of 10 Mb works in 5.1 because the initial
        required size of ibdata1 was less than 10M.  But in 5.5, a
        change was made to allocate all 128 rollback segments at
        bootstrap.  Since then, the initial size has been 10M + the
        default autoextend size of 8M. 
      
        In 5.6, worklog 6216 changes the autoextend size from 8M to
        64M.  This changes the initial size of ibdata1 from 18M in
        5.5 and earlier releases of 5.6 to 74M in the current
        mysql-5.6 and mysql-trunk.  So this change is especially
        needed in 5.6.
      "
      
      12M is enough to avoid autoextending during bootstrap
    • MDEV-5080 Assertion `strcmp(share->unique_file_name,filename) ||... · 1ef87c55
      Sergei Golubchik authored
      MDEV-5080 Assertion `strcmp(share->unique_file_name,filename) || share->last_version' fails at /storage/myisam/mi_open.c:67
      
      extend table names discovery (ha_discover_table_names() and Discovered_table_list) to return
      or optionally filter out temporary tables ("#sql..."). SHOW commands and I_S tables
      typically want temp tables filtered out, while DROP DATABASE wants to see them too.
      
      additionally, remove the suppression for the warning "Invalid (old?) table or database name"
      from mtr, and add it to .test files as needed (we need to test that this warning
      does *not* happen in drop.test)
    • restore the condition in filename_to_tablename() · 032a61fc
      Sergei Golubchik authored
      (broken in the revid:sergii@pisem.net-20130615170931-bn2h8j30vu5bfp0t)
    • MDEV-5232 SET ROLE checks privileges differently from check_access() · 79d2e6c8
      Sergei Golubchik authored
      use the same inconsistent priv_user@host pair for SET ROLE privilege checks,
      just as check_access() does
  4. 03 Nov, 2013 4 commits
  5. 02 Nov, 2013 2 commits
  6. 01 Nov, 2013 1 commit
  7. 31 Oct, 2013 1 commit
    • MDEV-5206: Incorrect slave old-style position in MDEV-4506, parallel replication. · 39df665a
      unknown authored
      In parallel replication, there are two kinds of events which are
      executed in different ways.
      
      Normal events that are part of event groups/transactions are executed
      asynchronously by being queued for a worker thread.
      
      Other events like format description and rotate and such are executed
      directly in the driver SQL thread.
      
      If the direct execution of the other events were to update the old-style
      position, then the position gets updated too far ahead, before the normal
      events that have been queued for a worker thread have been executed. So
      this patch adds some special cases to prevent such position updates ahead
      of time, and instead queues dummy events for the worker threads, so that
      they will at an appropriate time do the position updates instead.
      
      (Also fix a race in a test case that happened to trigger while running
      tests for this patch).
  8. 30 Oct, 2013 1 commit
    • MDEV-5196: Server hangs or assertion `!thd->wait_for_commit_ptr' fails on... · 9c8da4ed
      unknown authored
      MDEV-5196: Server hangs or assertion `!thd->wait_for_commit_ptr' fails on MASTER_POS_WAIT with slave-parallel-threads > 0
      
      Fix a couple of issues in MDEV-4506, Parallel replication:
      
       - Missing mysql_cond_signal(), which could cause hangs.
      
       - Fix incorrect update of old-style replication position.
      
       - Change assertion to error handling (can trigger on manipulated/
         corrupt binlog).
  9. 29 Oct, 2013 5 commits
    • merge 5.5->10.0-base · f4d5d849
      unknown authored
    • Merge 5.3->5.5 · 52dea410
      unknown authored
    • MariaDB made compilable with gcc 4.8.1 · 5ce11d8b
      unknown authored
        
        There were 2 problems:
        1) copying/moving of the same type (usually casting) as sizeof() (solved in different ways depending on the cause);
        2) using 'const' in SSL_CTX::getVerifyCallback(), which returns an object (not a reference), so a copy of the object is created and 'const' makes no sense.
    • MDEV-5195: Race when switching relay log causing crash · f2799c68
      unknown authored
      In parallel replication, when the IO thread switches relay log,
      the SQL thread re-opens the current relaylog and seeks to the
      current position. There was a race that would cause it to
      sometimes seek to the wrong position, causing corruption and
      crash.
    • MDEV-5104 crash in Item_field::used_tables with broken order by · 883af99e
      timour@askmonty.org authored
      Analysis:
      st_select_lex_unit::prepare() computes can_skip_order_by as TRUE.
      As a result join->prepare() gets called with order == NULL, and
      doesn't do name resolution for the inner ORDER clause. Due to this
      the prepare phase doesn't detect that the query references a non-existing
      function and field.
        
      Later join->optimize() calls update_used_tables() for a non-resolved
      Item_field, which understandably has no Field object. This call results
      in a crash.
      
      Solution:
      Resolve unnecessary ORDER BY clauses to detect if they reference non-existing
      objects. Then remove such clauses from the JOIN object.
  10. 28 Oct, 2013 2 commits
    • MDEV-4506: Parallel replication. · 2fbd1c73
      unknown authored
      MDEV-5189: Error handling in parallel replication.
      
      Fix error handling in parallel worker threads when a query fails:
      
       - Report the error to the error log.
      
       - Return the error back, and set rli->abort_slave.
      
       - Stop executing more events after the error.
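      The three steps above can be sketched in a simplified worker loop. The
      names `SlaveState` and `apply_events` are hypothetical, a vector of
      booleans stands in for real events, and `abort_slave` only mirrors the
      role of rli->abort_slave.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the state shared with the SQL thread.
struct SlaveState {
    bool abort_slave = false;
    std::vector<std::string> error_log;
};

// Each element of `event_ok` says whether that event applies cleanly.
// Returns the number of events that were executed.
std::size_t apply_events(const std::vector<bool> &event_ok, SlaveState &state) {
    std::size_t executed = 0;
    for (std::size_t i = 0; i < event_ok.size(); ++i) {
        if (state.abort_slave)
            break;                                  // stop after an error
        ++executed;
        if (!event_ok[i]) {
            // Report the error to the (simulated) error log.
            state.error_log.push_back("event " + std::to_string(i) + " failed");
            // Hand the error back by flagging the slave for abort.
            state.abort_slave = true;
        }
    }
    assert(executed <= event_ok.size());
    return executed;
}
```

      For the input {ok, fail, ok, ok} the loop executes two events, logs one
      error, and leaves the remaining events untouched.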
    • Don't allow authentication clauses for roles, in particular: · fef41669
      Sergei Golubchik authored
        GRANT ... IDENTIFIED BY [ PASSWORD ] ...
        GRANT ... IDENTIFIED VIA ... [ USING ... ]
        GRANT ... REQUIRE ...
        GRANT ... MAX_xxx ...
        SET PASSWORD FOR ... = ...
  11. 27 Oct, 2013 1 commit
  12. 26 Oct, 2013 2 commits
  13. 25 Oct, 2013 2 commits
    • MDEV-5189: Incorrect parallel apply in parallel replication · 6a38b594
      unknown authored
      Two problems were fixed:
      
      1. When not in GTID mode (master_use_gtid=no), then we must not apply events
         in different domains in parallel (in non-GTID mode we are not capable of
         restarting at different points in different domains).
      
      2. When transactions B and C group commit together, but after and separate
         from A, we can apply B and C in parallel, but both B and C must not start
         until A has committed. Fix sub_id to be globally increasing (not just
         per-domain increasing) so that this wait (which is based on sub_id) can be
         done correctly.
    • MDEV-4506: Parallel replication. · 80d0dd7b
      unknown authored
      Do not update relay-log.info and master.info on disk after every event
      when using GTID mode:
      
       - relay-log.info and master.info are not crash-safe, and are not used
         when slave restarts in GTID mode (slave connects with GTID position
         instead and immediately rewrites the file with the new, correct
         information found).
      
       - When using GTID and parallel replication, the position in
         relay-log.info is misleading at best and simply wrong at worst.
      
       - When using parallel replication, the fact that every single
         transaction needs to do a write() syscall to the same file is
         likely to become a serious bottleneck.
      
      The files are still written at normal slave stop.
      
      In non-GTID mode, the files are written as normal (this is needed to
      be able to restart after slave crash, even if such restart is then not
      crash-safe, no change).
  14. 24 Oct, 2013 4 commits
  15. 23 Oct, 2013 5 commits
    • MDEV-5176 Server crashes in fill_schema_applicable_roles on select from... · 65eee0be
      Sergei Golubchik authored
      MDEV-5176 Server crashes in fill_schema_applicable_roles on select from APPLICABLE_ROLES after a suicide
      
      Don't assume that thd->security_ctx->priv_user is an actually existing user account
    • MDEV-5170 Assertion `(&(&acl_cache->lock)->m_mutex)->count > 0 &&... · 7761a278
      Sergei Golubchik authored
      MDEV-5170 Assertion `(&(&acl_cache->lock)->m_mutex)->count > 0 && pthread_equal(pthread_self(), (&(&acl_cache->lock)->m_mutex)->thread)' fails after restarting server with a pre-created role grants
      
      lock acl_cache->lock mutex for the duration of acl_load
    • MDEV-4506: Parallel replication. · a09d2b10
      unknown authored
      Fix some more parts of old-style position updates.
      Now we save in rgi some coordinates for master log and relay log, so
      that in do_update_pos() we can use the right set of coordinates with
      the right events.
      
      The Rotate_log_event::do_update_pos() is fixed in the parallel case
      to not directly update relay-log.info (as Rotate event runs directly
      in the driver SQL thread, ahead of actual event execution). Instead,
      group_master_log_file is updated as part of do_update_pos() in each
      event execution.
      
      In the parallel case, position updates happen in parallel without
      any ordering, but taking care that position is not updated backwards.
      Since position update happens only after event execution this leads
      to the right result.
      
      Also fix an access-after-free introduced in an earlier commit.
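      The "never update the position backwards" rule can be illustrated with a
      small sketch. `Position` and `PositionTracker` are hypothetical names, and
      a numeric `file_no` is assumed in place of a log file name so that
      positions can be compared directly.

```cpp
#include <cassert>
#include <mutex>
#include <tuple>

// Hypothetical log coordinate: file sequence number plus byte offset.
struct Position {
    unsigned long file_no;
    unsigned long long offset;
};

// Accepts position updates from several workers in any order, but only
// ever moves the stored position forwards.
class PositionTracker {
    std::mutex m;
    Position cur{0, 0};

public:
    void update(Position p) {
        std::lock_guard<std::mutex> g(m);
        // Compare (file_no, offset) lexicographically; ignore older positions.
        if (std::tie(p.file_no, p.offset) > std::tie(cur.file_no, cur.offset))
            cur = p;
    }
    Position get() {
        std::lock_guard<std::mutex> g(m);
        return cur;
    }
};
```

      A worker that finishes late with an older position leaves the stored
      position unchanged, so unordered updates still converge on the newest
      executed coordinates.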
    • MDEV-5133: Test suite tests *_func_view fail in time zones East of UTC+3 · e6ac94a6
      unknown authored
      Test time increased so that the tests work in all time zones.
    • Sergei Golubchik's avatar