Commits · dcb3650d6305483c002477804c57162c466ce397 · Kirill Smelkov / mariadb

06 Nov, 2013 3 commits

MDEV-4506: Parallel replication · dcb3650d

unknown authored Nov 06, 2013

MDEV-5217: Incorrect MyISAM event execution order causing incorrect parallel replication

In parallel replication, if transactions A,B group-commit together on the
master, we can execute them in parallel on a replication slave. But then, if
transaction C follows on the master, on the slave, we need to be sure that
both A and B have completed before starting on C to be sure to avoid
conflicts.

The necessary wait is implemented such that B waits for A to commit before it
commits itself (thus preserving commit order). And C waits for B to commit
before it itself can start executing. This way C does not start until both A
and B have completed.

B's wait for A to commit happens inside the commit processing. However, in
the case of MyISAM with no binlog enabled on the slave, it appears that no
commit processing takes place (since MyISAM is non-transactional), and thus
the wait of B for A was not done. This allowed C to start before A, which can
lead to conflicts and incorrect replication.

Fixed by doing an extra wait for A at the end of B before signalling C.
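
The ordering described above can be sketched as a small thread simulation. This is only an illustrative model, not the server code: the `Txn` class, `run()` and `simulate()` are hypothetical names, and the "extra wait" is modelled as an explicit wait on the prior transaction at the end of execution, independent of any commit processing.

```python
import threading
import time


class Txn:
    """Hypothetical stand-in for one replicated transaction."""

    def __init__(self, name, work_seconds, prior=None, start_after=None):
        self.name = name
        self.work_seconds = work_seconds
        self.prior = prior              # must finish before we report completion
        self.start_after = start_after  # must finish before we begin executing
        self.finished = threading.Event()


def run(txn, log, log_lock):
    if txn.start_after is not None:
        txn.start_after.finished.wait()
    with log_lock:
        log.append(("start", txn.name))
    time.sleep(txn.work_seconds)  # simulated event execution
    # The extra wait: done explicitly here, so it happens even when no
    # commit processing runs for this transaction.
    if txn.prior is not None:
        txn.prior.finished.wait()
    with log_lock:
        log.append(("finish", txn.name))
    txn.finished.set()


def simulate():
    a = Txn("A", 0.05)
    b = Txn("B", 0.0, prior=a)        # B executes faster but must wait for A
    c = Txn("C", 0.0, start_after=b)  # C may not start until B has signalled
    log, log_lock = [], threading.Lock()
    threads = [threading.Thread(target=run, args=(t, log, log_lock))
               for t in (c, b, a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

In this model C can only start after B has signalled, and B only signals after A has finished, so C never overtakes A even though B's own execution completes first.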

dcb3650d

MDEV-4506: Parallel replication · c90f4f02

unknown authored Nov 06, 2013

MDEV-5217: Unlock of de-allocated mutex

There was a race in the code for wait_for_commit::wakeup().

Since the waiter does a dirty read of the waiting_for_commit
flag, it was possible for the waiter to complete and deallocate
the wait_for_commit object while the waitee was still running
inside wakeup(). This would cause the waitee to access invalid
memory.

Fixed by putting an extra lock/unlock in the destructor for
wait_for_commit, to ensure that waitee has finished with the
object before it is deallocated.

c90f4f02

MDEV-4506: Parallel replication · bdbf90b9

unknown authored Nov 06, 2013

MDEV-5217: Incorrect event pos update leading to corruption of reading of events from relay log

The rli->event_relay_log_pos was sometimes updated incorrectly when using
parallel replication, especially around relay log rotates. This could cause
the SQL thread to seek into an invalid position in the relay log, resulting in
errors about invalid events or even random corruption in some cases.

bdbf90b9

05 Nov, 2013 2 commits

MDEV-4506: Parallel replication. · b0391d1b

unknown authored Nov 05, 2013

MDEV-5217: Last_sql_error lost in parallel replication.

For some reason, the query execution code in log_event.cc calls
rli->clear_error for each event (part of clear_all_errors()).
This causes a problem in parallel replication, where the
execution in one worker thread could clear the error set by
another thread, causing the SQL thread to stop but leaving no
error visible in SHOW SLAVE STATUS.

There seems to be no reason to clear the global error code
in Relay_log_info for each event execution, from code review
and from running the test suite. So remove this clearing of
the error code to make things work also in the parallel case.

b0391d1b

MDEV-4506: Parallel replication · c834242a

unknown authored Nov 05, 2013

MDEV-5217: SQL thread hangs during stop if error occurs in the middle of an event group

Normally, when we stop the slave SQL thread in parallel replication, we want
the worker threads to continue processing events until the end of the current
event group. But if we stop due to an error that prevents further events from
being queued, such as an error reading the relay log, no more events can be
queued for the workers, so they have to abort even if they are in the middle
of an event group. There was a bug where this would deadlock: the workers
were waiting for more events to be queued for the event group, while the SQL
thread had stopped and was waiting for the workers to complete their current
event group before exiting.

Fixed by now signalling from the SQL thread to all workers when it is about
to exit, and cleaning up in all workers when so signalled.
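
A minimal sketch of the fix, assuming a condition-variable-protected queue per worker. `WorkerQueue`, `signal_stop()` and `worker()` are hypothetical names for illustration and do not correspond to the server's actual structures.

```python
import threading
from collections import deque


class WorkerQueue:
    """Hypothetical per-worker event queue with a stop signal."""

    def __init__(self):
        self.cond = threading.Condition()
        self.events = deque()
        self.stop = False

    def enqueue(self, event):
        with self.cond:
            self.events.append(event)
            self.cond.notify_all()

    def signal_stop(self):
        # Called by the SQL thread when it is about to exit.
        with self.cond:
            self.stop = True
            self.cond.notify_all()

    def next_event(self):
        with self.cond:
            # Without the stop check, a worker in the middle of an event
            # group would wait here forever once queuing has ended.
            while not self.events and not self.stop:
                self.cond.wait()
            return self.events.popleft() if self.events else None


def worker(queue, result):
    applied = []
    while True:
        event = queue.next_event()
        if event is None:
            # Stop signalled mid-group: clean up instead of waiting.
            result["state"] = "aborted"
            result["rolled_back"] = applied
            return
        applied.append(event)
        if event == "COMMIT":
            result["state"] = "committed"
            return
```

If the SQL thread calls `signal_stop()` after queuing only part of a group, the worker drains what was queued, sees the stop signal, and exits rather than blocking.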

This patch fixes one of multiple problems reported in MDEV-5217.

c834242a

04 Nov, 2013 4 commits

increase the initial ibdata1 size, as explained in MySQL-5.6... · bf603250

Sergei Golubchik authored Nov 04, 2013

increase the initial ibdata1 size, as explained in MySQL-5.6 revid:kevin.lewis@oracle.com-20120802192452-kmikiz990xzje18b

"
  A maximum size of 10 Mb works in 5.1 because the initial
  required size of ibdata1 was less than 10M.  But in 5.5, a
  change was made to allocate all 128 rollback segments at
  bootstrap.  Since then, the initial size has been 10M + the
  default autoextend size of 8M. 

  In 5.6, worklog 6216 changes the autoextend size from 8M to
  64M.  This changes the initial size of ibdata1 from 18M in
  5.5 and earlier releases of 5.6 to 74M in the current
  mysql-5.6 and mysql-trunk.  So this change is especially
  needed in 5.6.
"

12M is enough to avoid autoextending during bootstrap
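
The sizes quoted above follow from adding the autoextend increment to the 10M base. A small arithmetic check (the function name is hypothetical, values are in megabytes as given in the quote):

```python
def initial_size_mb(base_mb, autoextend_mb):
    """Initial ibdata1 size when bootstrap triggers one autoextend step."""
    return base_mb + autoextend_mb


SIZE_WITH_8M_AUTOEXTEND = initial_size_mb(10, 8)    # 5.5 and early 5.6
SIZE_WITH_64M_AUTOEXTEND = initial_size_mb(10, 64)  # current 5.6 / trunk
```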

bf603250

MDEV-5080 Assertion `strcmp(share->unique_file_name,filename) ||... · 1ef87c55

Sergei Golubchik authored Nov 04, 2013

MDEV-5080 Assertion `strcmp(share->unique_file_name,filename) || share->last_version' fails at /storage/myisam/mi_open.c:67

extend table names discovery (ha_discover_table_names() and Discovered_table_list) to return
or optionally filter out temporary tables ("#sql..."). SHOW commands and I_S tables
typically want temp tables filtered out, while DROP DATABASE wants to see them too.

additionally, remove the suppression for the warning "Invalid (old?) table or database name"
from mtr, and add it to .test files as needed (we need to test that this warning
does *not* happen in drop.test)
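
The optional filtering can be illustrated with a short sketch. It is not the signature of ha_discover_table_names(); the function and parameter names below are hypothetical, and only the "#sql" prefix comes from the commit message.

```python
TEMP_PREFIX = "#sql"  # prefix of temporary table names, as in the message above


def discover_table_names(names, include_temp=False):
    """Return table names, optionally filtering out temporary tables."""
    return [n for n in names
            if include_temp or not n.startswith(TEMP_PREFIX)]


found = ["t1", "#sql-1a2b_3", "t2"]
for_show_tables = discover_table_names(found)                       # temp hidden
for_drop_database = discover_table_names(found, include_temp=True)  # temp kept
```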

1ef87c55

restore the condition in filename_to_tablename() · 032a61fc
Sergei Golubchik authored Nov 04, 2013
```
(broken in the revid:sergii@pisem.net-20130615170931-bn2h8j30vu5bfp0t)
```
032a61fc
MDEV-5232 SET ROLE checks privileges differently from check_access() · 79d2e6c8
Sergei Golubchik authored Nov 04, 2013
```
use the same inconsistent priv_user@host pair for SET ROLE privilege checks,
just as check_access() does
```
79d2e6c8

03 Nov, 2013 4 commits
- merge mdev-4506-base into 10.0-base · 00ba6191
  Sergei Golubchik authored Nov 03, 2013
  
  00ba6191
- MDEV-4332 Increase username length from 16 characters · 5c9d2c6c
  Sergei Golubchik authored Nov 03, 2013
```
10.0 part of the task, fix system tables
```
  5c9d2c6c
- remove hostname-dependent part of the test · da122e85
  Sergei Golubchik authored Nov 03, 2013
  
  da122e85
- Fixed number of keys to be 64 bit safe · 679c682d
  Michael Widenius authored Nov 03, 2013
  
  679c682d
02 Nov, 2013 2 commits
- grant/revoke ... to/from current_role · 320b8528
  Sergei Golubchik authored Nov 02, 2013
  
  320b8528
- MDEV-5225 Server crashes on CREATE USER|ROLE CURRENT_ROLE or DROP ROLE CURRENT_ROLE · 1f036865
  Sergei Golubchik authored Nov 02, 2013
  
  1f036865
01 Nov, 2013 1 commit
- Merge MDEV-4506: Parallel replication into 10.0-base. · cb86ce60
  unknown authored Nov 01, 2013
  
  cb86ce60
31 Oct, 2013 1 commit

MDEV-5206: Incorrect slave old-style position in MDEV-4506, parallel replication. · 39df665a

unknown authored Oct 31, 2013

In parallel replication, there are two kinds of events which are
executed in different ways.

Normal events that are part of event groups/transactions are executed
asynchronously by being queued for a worker thread.

Other events like format description and rotate and such are executed
directly in the driver SQL thread.

If the direct execution of the other events were to update the old-style
position, then the position gets updated too far ahead, before the normal
events that have been queued for a worker thread have been executed. So
this patch adds some special cases to prevent such position updates ahead
of time, and instead queues dummy events for the worker threads, so that
they will at an appropriate time do the position updates instead.
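
A single-threaded sketch of the idea, under the assumption that one queue feeds one worker. `ParallelSlave` and its methods are hypothetical names used only to show why the dummy event keeps the position from running ahead.

```python
from collections import deque


class ParallelSlave:
    """Hypothetical model of the driver SQL thread and one worker."""

    def __init__(self):
        self.queue = deque()    # work queued for the worker
        self.position = None    # old-style position, updated only by the worker
        self.run_directly = []  # events the driver executed itself

    def driver_receive(self, kind, end_pos):
        if kind == "group":
            self.queue.append(("execute", end_pos))
        else:
            # Rotate, format description, etc. run in the driver thread,
            # but the position update is deferred through a dummy event.
            self.run_directly.append(kind)
            self.queue.append(("position_only", end_pos))

    def worker_step(self):
        action, end_pos = self.queue.popleft()
        # For "execute" the real work would happen here; "position_only"
        # is the dummy event that carries just the position.
        self.position = end_pos
        return action
```

Because the dummy event sits behind the earlier group in the queue, the position only advances past the rotate after the group has been processed.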

(Also fix a race in a test case that happened to trigger while running
tests for this patch).

39df665a

30 Oct, 2013 1 commit

MDEV-5196: Server hangs or assertion `!thd->wait_for_commit_ptr' fails on... · 9c8da4ed

unknown authored Oct 30, 2013

MDEV-5196: Server hangs or assertion `!thd->wait_for_commit_ptr' fails on MASTER_POS_WAIT with slave-parallel-threads > 0

Fix a couple of issues in MDEV-4506, Parallel replication:

 - Missing mysql_cond_signal(), which could cause hangs.

 - Fix incorrect update of old-style replication position.

 - Change assertion to error handling (can trigger on manipulated/
   corrupt binlog).

9c8da4ed

29 Oct, 2013 5 commits

merge 5.5->10.0-base · f4d5d849

unknown authored Oct 29, 2013

f4d5d849

Merge 5.3->5.5 · 52dea410

unknown authored Oct 29, 2013

52dea410

MariaDB made to be compilable by gcc 4.8.1 · 5ce11d8b

unknown authored Oct 29, 2013

There were 2 problems:
1) copying/moving of the same type (usually casting) as sizeof() (solved in different ways depending on the cause);
2) using 'const' in SSL_CTX::getVerifyCallback(), which returns an object (not a reference), so a copy of the object is created and 'const' makes no sense.

5ce11d8b

MDEV-5195: Race when switching relay log causing crash · f2799c68

unknown authored Oct 29, 2013

In parallel replication, when the IO thread switches relay log,
the SQL thread re-opens the current relaylog and seeks to the
current position. There was a race that would cause it to
sometimes seek to the wrong position, causing corruption and
crash.

f2799c68

MDEV-5104 crash in Item_field::used_tables with broken order by · 883af99e

timour@askmonty.org authored Oct 29, 2013

Analysis:
st_select_lex_unit::prepare() computes can_skip_order_by as TRUE.
As a result join->prepare() gets called with order == NULL, and
doesn't do name resolution for the inner ORDER clause. Due to this
the prepare phase doesn't detect that the query references a non-existing
function and field.
  
Later join->optimize() calls update_used_tables() for a non-resolved
Item_field, which understandably has no Field object. This call results
in a crash.

Solution:
Resolve unnecessary ORDER BY clauses to detect if they reference non-existing
objects. Then remove such clauses from the JOIN object.

883af99e

28 Oct, 2013 2 commits

MDEV-4506: Parallel replication. · 2fbd1c73

unknown authored Oct 28, 2013

MDEV-5189: Error handling in parallel replication.

Fix error handling in parallel worker threads when a query fails:

 - Report the error to the error log.

 - Return the error back, and set rli->abort_slave.

 - Stop executing more events after the error.
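
The three points above can be sketched as a worker loop. This is an assumption-laden model: `RelayLogInfo` here is a hypothetical stand-in holding only an `abort_slave` flag, and events are plain callables.

```python
import logging

logger = logging.getLogger("parallel_worker")  # hypothetical logger name


class RelayLogInfo:
    """Hypothetical stand-in carrying only the abort flag."""

    def __init__(self):
        self.abort_slave = False


def apply_events(events, rli):
    """Apply events in order; on failure log, flag the abort, and stop."""
    applied = 0
    for event in events:
        try:
            event()
        except Exception as exc:
            logger.error("event %d failed: %s", applied + 1, exc)  # report
            rli.abort_slave = True                                  # flag abort
            return applied, exc                                     # stop here
        applied += 1
    return applied, None
```

Events after the failing one are never executed, and the caller receives both the count of applied events and the error.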

2fbd1c73

Don't allow authentication clauses for roles, in particular: · fef41669

Sergei Golubchik authored Oct 28, 2013

  GRANT ... IDENTIFIED BY [ PASSWORD ] ...
  GRANT ... IDENTIFIED VIA ... [ USING ... ]
  GRANT ... REQUIRE ...
  GRANT ... MAX_xxx ...
  SET PASSWORD FOR ... = ...

fef41669

27 Oct, 2013 1 commit
- post-review cleanup · d5c97122
  Sergei Golubchik authored Oct 27, 2013
  
  d5c97122
26 Oct, 2013 2 commits
- remove inherited routine grants when a routine is dropped · e46eea86
  Sergei Golubchik authored Oct 26, 2013
  
  e46eea86
- Implemented REVOKE ALL FROM for Roles and role grants. · 2eed3b7d
  Vicențiu Ciorbaru authored Oct 26, 2013
  
  2eed3b7d
25 Oct, 2013 2 commits

MDEV-5189: Incorrect parallel apply in parallel replication · 6a38b594

unknown authored Oct 25, 2013

Two problems were fixed:

1. When not in GTID mode (master_use_gtid=no), then we must not apply events
   in different domains in parallel (in non-GTID mode we are not capable of
   restarting at different points in different domains).

2. When transactions B and C group commit together, but after and separate
   from A, we can apply B and C in parallel, but both B and C must not start
   until A has committed. Fix sub_id to be globally increasing (not just
   per-domain increasing) so that this wait (which is based on sub_id) can be
   done correctly.
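
Why a per-domain counter is not enough for this wait can be seen in a small sketch. How the server actually assigns sub_id is not shown here; both functions are hypothetical and only contrast the two numbering schemes.

```python
import itertools
from collections import defaultdict


def assign_per_domain(domains):
    """sub_id increases separately within each domain."""
    counters = defaultdict(int)
    out = []
    for d in domains:
        counters[d] += 1
        out.append(counters[d])
    return out


def assign_global(domains):
    """sub_id increases across all domains."""
    counter = itertools.count(1)
    return [next(counter) for _ in domains]


# Transactions arriving in domains 0, 1, 0:
per_domain = assign_per_domain([0, 1, 0])  # ids repeat across domains
global_ids = assign_global([0, 1, 0])      # ids are unique and ordered
```

With per-domain numbering two different transactions share the same id, so "wait until everything before sub_id N has committed" is ambiguous; with global numbering the order is total.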

6a38b594

MDEV-4506: Parallel replication. · 80d0dd7b

unknown authored Oct 25, 2013

Do not update relay-log.info and master.info on disk after every event
when using GTID mode:

 - relay-log.info and master.info are not crash-safe, and are not used
   when slave restarts in GTID mode (slave connects with GTID position
   instead and immediately rewrites the file with the new, correct
   information found).

 - When using GTID and parallel replication, the position in
   relay-log.info is misleading at best and simply wrong at worst.

 - When using parallel replication, the fact that every single
   transaction needs to do a write() syscall to the same file is
   likely to become a serious bottleneck.

The files are still written at normal slave stop.

In non-GTID mode, the files are written as normal (this is needed to
be able to restart after slave crash, even if such restart is then not
crash-safe, no change).

80d0dd7b

24 Oct, 2013 4 commits
- MDEV-4506: Parallel replication. · 7a22b6a6
  unknown authored Oct 24, 2013
```
Fix uninitialised variable.
```
  7a22b6a6
- MDEV-4506: Parallel replication. · ee8a8162
  unknown authored Oct 24, 2013
```
Implement --slave-parallel-max-queue to limit memory usage
of SQL thread read-ahead in the relay log.
```
  ee8a8162
- MDEV-5102 : MySQL Bug 69851 · 86901216
  Sergey Petrunya authored Oct 24, 2013
```
- Backport MySQL's fix: do set ha_partition::m_pkey_is_clustered for ha_partition 
  objects created with handler->clone() call.
- Also, include a testcase.
```
  86901216
- MDEV-4506: Parallel replication: Update some comments. · 96a4f1f6
  unknown authored Oct 24, 2013
  
  96a4f1f6
23 Oct, 2013 6 commits

MDEV-5176 Server crashes in fill_schema_applicable_roles on select from... · 65eee0be

Sergei Golubchik authored Oct 23, 2013

MDEV-5176 Server crashes in fill_schema_applicable_roles on select from APPLICABLE_ROLES after a suicide

Don't assume that thd->security_ctx->priv_user is an actually existing user account

65eee0be

MDEV-5170 Assertion `(&(&acl_cache->lock)->m_mutex)->count > 0 &&... · 7761a278

Sergei Golubchik authored Oct 23, 2013

MDEV-5170 Assertion `(&(&acl_cache->lock)->m_mutex)->count > 0 && pthread_equal(pthread_self(), (&(&acl_cache->lock)->m_mutex)->thread)' fails after restarting server with a pre-created role grants

lock acl_cache->lock mutex for the duration of acl_load

7761a278

MDEV-4506: Parallel replication. · a09d2b10

unknown authored Oct 23, 2013

Fix some more parts of old-style position updates.
Now we save in rgi some coordinates for master log and relay log, so
that in do_update_pos() we can use the right set of coordinates with
the right events.

The Rotate_log_event::do_update_pos() is fixed in the parallel case
to not directly update relay-log.info (as Rotate event runs directly
in the driver SQL thread, ahead of actual event execution). Instead,
group_master_log_file is updated as part of do_update_pos() in each
event execution.

In the parallel case, position updates happen in parallel without
any ordering, but taking care that position is not updated backwards.
Since position update happens only after event execution this leads
to the right result.
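
A minimal sketch of "never update backwards", assuming a position can be represented as a comparable (log file index, offset) pair. `GroupPosition` is a hypothetical name, not a server structure.

```python
import threading


class GroupPosition:
    """Hypothetical shared position that only ever moves forward."""

    def __init__(self):
        self._lock = threading.Lock()
        self.pos = (0, 0)  # (log file index, offset): assumed representation

    def update(self, candidate):
        # Workers may call this in any order after executing an event.
        with self._lock:
            if candidate > self.pos:
                self.pos = candidate
                return True
            return False  # stale update from a slower worker, ignored


p = GroupPosition()
p.update((1, 400))
p.update((1, 250))  # arrives late, must not move the position back
p.update((2, 4))    # a later log file wins even with a smaller offset
```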

Also fix an access-after-free introduced in an earlier commit.

a09d2b10

MDEV-5133: Test suite tests *_func_view fail in time zones East of UTC+3 · e6ac94a6
unknown authored Oct 23, 2013
```
Test time increased so that the test works in all time zones.
```
e6ac94a6
reset the db privilege cache when revoking db privileges on DROP ROLE · f6b8f6d1
Sergei Golubchik authored Oct 23, 2013

f6b8f6d1

MDEV-5172 safe_mutex: Trying to lock mutex when the mutex was already locked... · 61447892

Sergei Golubchik authored Oct 23, 2013

MDEV-5172 safe_mutex: Trying to lock mutex when the mutex was already locked on using a role and I_S role tables

don't forget to unlock if the current role isn't found

61447892