Commits · 17aff4b17b832f2eddee8e5088743b3101c1c2e2 · Kirill Smelkov / mariadb

13 Apr, 2015 3 commits

Merge MDEV-7936 into 10.0. · 17aff4b1
Kristian Nielsen authored Apr 13, 2015
```
Conflicts:
	sql/sql_base.cc
```
17aff4b1

MDEV-7936: Assertion `!table || table->in_use == _current_thd()' failed on... · 60d094ae

Kristian Nielsen authored Apr 13, 2015

MDEV-7936: Assertion `!table || table->in_use == _current_thd()' failed on parallel replication in optimistic mode

Make sure that in parallel replication, we execute wait_for_prior_commit()
before setting table->in_use for a temporary table. Otherwise we can end up
with two parallel replication worker threads competing with each other for
use of a temporary table.

Re-factor the use of find_temporary_table() to be able to handle errors
in the caller (as wait_for_prior_commit() can return error in case of
deadlock kill).
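
The ordering described above can be modelled in a few lines. This is only a sketch in Python, not the server code: `TempTable`, `DeadlockKilled` and `claim_temporary_table` are hypothetical names, and the wait is passed in as a callable so that its failure can reach the caller.

```python
class DeadlockKilled(Exception):
    """Hypothetical error: the waiting worker was picked as deadlock victim."""


class TempTable:
    def __init__(self, name):
        self.name = name
        self.in_use = None  # worker that currently owns the table, if any


def claim_temporary_table(tables, name, worker, wait_for_prior_commit):
    """Return the named temporary table, marked as in use by `worker`.

    Returns None when the table does not exist. An exception raised by the
    wait propagates to the caller before ownership is touched.
    """
    table = tables.get(name)
    if table is None:
        return None
    # Wait first: an earlier transaction may still be using the table.
    wait_for_prior_commit()
    assert table.in_use is None or table.in_use == worker
    table.in_use = worker
    return table
```

Because the wait runs before `in_use` is assigned, a failed wait leaves the table unclaimed and the caller decides how to handle the error.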

60d094ae

MDEV-7668: Intermediate master groups CREATE TEMPORARY with INSERT, causing... · c47fe0e9

Kristian Nielsen authored Mar 09, 2015

MDEV-7668: Intermediate master groups CREATE TEMPORARY with INSERT, causing parallel replication failure

[This commit cherry-picked to be able to merge MDEV-7936, of which it
is a pre-requisite, into both 10.0 and 10.1.]

Parallel replication depends on locking (table locks, row locks, etc.) to
prevent two conflicting transactions from running and committing in parallel.
But temporary tables are designed to be visible only to one thread, and have
no such locking.

In the concrete issue, an intermediate master could commit a CREATE TEMPORARY
TABLE in the same group commit as an INSERT into that table. Thus, a
lower-level master could attempt to run them in parallel and get an error.

More generally, we need protection from parallel replication trying to run
transactions in parallel that access a common temporary table.

This patch simply causes use of a temporary table from parallel replication
to wait for all previous transactions to commit, serialising the replication
at that point.

(A more fine-grained locking could be added later, possibly. However,
using temporary tables in statement-based replication is in any case
normally undesirable; for example a restart of the server will lose
temporary tables and can break replication).

Note that row-based replication is not affected, as it does not use any
temporary tables on the slave side.

This patch also cleans up the locking around protecting the list of
temporary tables in Relay_log_info. This used to take the
rli->data_lock at the end of every statement, which is very bad for
concurrency. With this patch, the lock is not taken unless temporary
tables (with statement-based binlogging) are in use on the slave.

c47fe0e9

09 Apr, 2015 2 commits

Merge MDEV-7940 into 10.0 · 50d98e9c
Kristian Nielsen authored Apr 09, 2015

50d98e9c

MDEV-7940: Sporadic failure in rpl.rpl_gtid_until · 15a2b5aa

Kristian Nielsen authored Apr 09, 2015

Fix a race in the test case. When we do start_slave.inc immediately
followed by stop_slave.inc, it is possible to kill the IO thread while
it is still running inside get_master_version_and_clock(), and this
gives warnings in the error log that cause the test to fail.

15a2b5aa

08 Apr, 2015 4 commits

Merge MDEV-7910 into 10.0 · 670d4dd8
Kristian Nielsen authored Apr 08, 2015

670d4dd8

MDEV-7910: innodb.binlog_consistent fails sporadically in buildbot · b3c7c8cd

Kristian Nielsen authored Apr 08, 2015

The test case was missing --source include/wait_for_binlog_checkpoint.inc.
So it could occasionally fail if the checkpoint managed to occur just at the
right point in time between fetching the two binlog positions to compare.

b3c7c8cd

Merge MDEV-7888 and MDEV-7929 into 10.0. · accdabd6
Kristian Nielsen authored Apr 08, 2015

accdabd6

MDEV-7888, MDEV-7929: Parallel replication hangs sometimes on ANALYZE TABLE or DDL · 3b961347

Kristian Nielsen authored Apr 08, 2015

The hangs occur when the group_commit_orderer object is freed before the last
mark_start_commit() call on it - this loses the wakeup to other waiting worker
threads, causing them to hang until killed manually.

The object was freed because wakeup_subsequent_commits() was called too early
in two places. For MDEV-7888, during ANALYZE TABLE, and for MDEV-7929 during
record_gtid() after processing a DDL event. The group_commit_orderer object
can be freed when its last transaction has called wait_for_prior_commit().

Fix by implementing a suspend/resume mechanism for wakeup_subsequent_commits()
that can be used in places where a transaction is committed without this being
the commit of the actual replication event group.

Also add a protection mechanism (that asserts in debug builds) which can
prevent the too-early free and hang if other similar bugs should remain in
other parts of the code.
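
A toy model of the suspend/resume idea, written in Python for illustration only. The class and its fields are assumptions and do not mirror the server's data structures: while suspended, an intermediate commit cannot deliver the wakeup, so only the commit of the real event group does.

```python
class CommitWakeup:
    """Toy model: wakeups are skipped while the mechanism is suspended."""

    def __init__(self):
        self.suspended = False
        self.delivered = 0  # wakeups actually passed on to waiting workers

    def suspend(self):
        self.suspended = True

    def resume(self):
        self.suspended = False

    def wakeup_subsequent_commits(self):
        # An internal commit (for example inside ANALYZE TABLE) lands here
        # while suspended and must not wake later transactions yet.
        if self.suspended:
            return
        self.delivered += 1
```

A caller would bracket the internal commit with `suspend()` and `resume()`, and let the final commit of the event group deliver the wakeup.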

3b961347

06 Apr, 2015 1 commit

MDEV-7908: assertion in innobase_release_savepoint · e9c10f99

Jan Lindström authored Apr 06, 2015

Problem was that in XA prepared state we should still be able to
release a savepoint, but assertions were too strict.

e9c10f99

31 Mar, 2015 2 commits

MDEV-7367: Updating a virtual column corrupts table which crashes server · b53bcd43

Jan Lindström authored Mar 30, 2015

Analysis: The MySQL table definition also contains virtual columns, and the
index fieldnr references MySQL table fields. However, the InnoDB table definition
does not contain virtual columns. Therefore, when matching a MySQL key fieldnr
we need to use the actual column name to find the referenced InnoDB dictionary
column.

Fix: Add new function to match MySQL index key columns to InnoDB dictionary.
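
The mismatch can be shown with a small Python sketch. The function name and the list-based table layout are assumptions made for illustration; the real fix lives in the InnoDB handler code.

```python
def map_key_columns(mysql_fields, key_fieldnrs, innodb_columns):
    """Map MySQL key field numbers to InnoDB column positions by name.

    mysql_fields:   column names in MySQL order, virtual columns included
    key_fieldnrs:   MySQL field numbers used by the index
    innodb_columns: column names known to InnoDB (no virtual columns)
    """
    positions = {name: i for i, name in enumerate(innodb_columns)}
    return [positions[mysql_fields[nr]] for nr in key_fieldnrs]
```

With MySQL fields `["a", "v", "b"]`, where `v` is virtual, and InnoDB columns `["a", "b"]`, a key on MySQL field 2 maps to InnoDB position 1. Using the field number directly would point past the end of the InnoDB column list.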

b53bcd43

MDEV-7754: innodb assert "array->n_elems < array->max_elems" on a huge blob update · 0563f49b
Jan Lindström authored Mar 17, 2015
```
Replace static array of thread sync levels with std::vector.
```
0563f49b

30 Mar, 2015 3 commits

Merge MDEV-7847 and MDEV-7882 into 10.0. · c41e4d3b

Kristian Nielsen authored Mar 30, 2015

Conflicts:
	mysql-test/suite/rpl/r/rpl_parallel.result
	mysql-test/suite/rpl/t/rpl_parallel.test

c41e4d3b

MDEV-7847: "Slave worker thread retried transaction 10 time(s) in vain, giving... · 880f2273

Kristian Nielsen authored Mar 30, 2015

MDEV-7847: "Slave worker thread retried transaction 10 time(s) in vain, giving up", followed by replication hanging

This patch fixes a bug in the error handling in parallel replication, when one
worker thread gets a failure and other worker threads processing later
transactions have to rollback and abort.

The problem was with the lifetime of group_commit_orderer objects (GCOs).
A GCO is freed when we register that its last event group has committed. This
relies on register_wait_for_prior_commit() and wait_for_prior_commit() to
ensure that the fact that T2 has committed implies that any earlier T1 has
also committed, and can thus no longer execute mark_start_commit().

However, in the error case, the code was skipping the
register_wait_for_prior_commit() and wait_for_prior_commit() calls. Thus
commit ordering was not guaranteed, and a GCO could be freed too early. Then a
later mark_start_commit() would reference a deallocated GCO, which could lead to
a lost wakeup (causing slave threads to hang) or other corruption.

This patch makes the error case respect commit order as well. This way, the
error case gets the GCO lifetime correct, and the hang no longer occurs.
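
The ordering guarantee can be sketched with plain Python threads. This is a model of the idea only, with hypothetical names; it does not reproduce the GCO bookkeeping. The point is that a failed job still waits for its predecessor before it signals completion.

```python
import threading


def run_in_commit_order(jobs):
    """Run jobs in parallel threads, completing them in list order.

    A job that raises is recorded as failed, but it still waits for the
    prior job before signalling its own completion.
    Returns a list of (index, ok) tuples in completion order.
    """
    done = [threading.Event() for _ in jobs]
    finished = []
    lock = threading.Lock()

    def worker(i):
        ok = True
        try:
            jobs[i]()
        except Exception:
            ok = False
        if i > 0:
            done[i - 1].wait()  # taken on the error path as well
        with lock:
            finished.append((i, ok))
        done[i].set()

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(len(jobs))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return finished
```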

880f2273

MDEV-7882: Excessive transaction retry in parallel replication · a4082918

Kristian Nielsen authored Mar 30, 2015

When a transaction in parallel replication needs to retry (eg. because of
deadlock kill), first wait for all prior transactions to commit before doing
the retry. This way, we avoid the retry once again conflicting with a prior
transaction, requiring yet another retry.

Without this patch, we saw "in the wild" that transactions had to be retried
more than 10 times to succeed, which exceeds the default
--slave_transaction_retries value and is in any case undesirable.

(We already do this in 10.1 in "optimistic" parallel replication mode; this
patch just makes the code use the same logic for "conservative" mode (only
mode in 10.0)).
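
A minimal Python sketch of the retry policy described above. `TemporaryFailure`, `execute` and `wait_for_prior_commits` are hypothetical stand-ins, and the default of 10 retries is taken from the message above.

```python
class TemporaryFailure(Exception):
    """Stands in for a deadlock kill or another retryable error."""


def run_with_retry(execute, wait_for_prior_commits, max_retries=10):
    """Run `execute`, retrying on TemporaryFailure up to `max_retries` times.

    Before each retry, wait until all prior transactions have committed, so
    that the retry cannot conflict with any of them again.
    """
    attempts = 0
    while True:
        try:
            return execute()
        except TemporaryFailure:
            attempts += 1
            if attempts > max_retries:
                raise
            wait_for_prior_commits()
```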

a4082918

25 Mar, 2015 1 commit
- Backport from 10.1 to 10.0: Merge pull request #33 from k0da/mdev-7839 · 323a7e93
  Sergei Petrunia authored Mar 25, 2015
```
Fix BigEndian build for Cassandra SE
```
  323a7e93
18 Mar, 2015 3 commits
- Better and more correct comment. · 1020d569
  Jan Lindström authored Mar 18, 2015
  
  1020d569
- Fix assertion failure seen on Buildbot win32-debug · 2bdbfd33
  Jan Lindström authored Mar 18, 2015
```
There is a bug in Visual Studio 2010.
Visual Studio has a feature "Checked Iterators". In a debug build, every
iterator operation is checked at runtime for errors, e.g. out of range.
Disable this "Checked Iterators" for Windows and Debug if defined.
```
  2bdbfd33
- Make sure that sync level vector is emptied. · c14d9c21
  Jan Lindström authored Mar 18, 2015
  
  c14d9c21
17 Mar, 2015 2 commits
- MDEV-7754: innodb assert "array->n_elems < array->max_elems" on a huge blob update · 99a2c061
  Jan Lindström authored Mar 17, 2015
```
Problem was that static array was used for storing thread mutex sync levels.
Fixed by using std::vector instead.

Does not contain test case to avoid too big memory/disk space usage
on buildbot VMs.
```
  99a2c061
- Fix embarrassing bug in test case that caused sporadic test failures. · 3d485015
  Kristian Nielsen authored Mar 17, 2015
  
  3d485015
16 Mar, 2015 1 commit
- MDEV-7785: errorneous -> erroneous spelling mistake · 2e82a823
  Kristian Nielsen authored Mar 16, 2015
  
  2e82a823
13 Mar, 2015 2 commits

MDEV-7249: Performance problem in parallel replication with multi-level slaves · 184f718f

Kristian Nielsen authored Mar 13, 2015

Parallel replication (in 10.0 / "conservative" mode) relies on binlog group
commits to group transactions that can be safely run in parallel on the
slave. The --binlog-commit-wait-count and --binlog-commit-wait-usec options
exist to increase the number of commits per group. But in case of conflicts
between transactions, this can cause unnecessary delay and reduced throughput,
especially on a slave where commit order is fixed.

This patch adds a heuristic to reduce this problem. When transaction T1 goes
to commit, it will first wait for N transactions to queue up for a group
commit. However, if we detect that another transaction T2 is waiting for a row
lock held by T1, then we will skip the wait and let T1 commit immediately,
releasing locks and letting T2 continue.

On a slave, this avoids the unfortunate situation where T1 is waiting for T2
to join the group commit, but T2 is waiting for T1 to release locks, causing
no work to be done for the duration of the --binlog-commit-wait-usec timeout.

(The heuristic seems reasonable on the master as well, so it is enabled for
all transactions, not just replication transactions).
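
The decision can be written as a small predicate. This Python sketch uses hypothetical parameter names; the server tracks the same information in its own structures.

```python
def should_wait_for_group(queued, wait_count, lock_waiters):
    """Decide whether a committing transaction should keep waiting.

    queued:       transactions already queued for this group commit
    wait_count:   target group size (cf. --binlog-commit-wait-count)
    lock_waiters: transactions blocked on row locks held by the committer
    """
    if queued >= wait_count:
        return False  # the group is already large enough
    if lock_waiters > 0:
        return False  # waiting would only stall the blocked transactions
    return True
```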

184f718f

MDEV-7387 [PATCH] Alter table xxx CHARACTER SET utf8, CONVERT TO CHARACTER SET latin1 should fail · bc902a2b
Alexander Barkov authored Mar 13, 2015
```
A contribution from Daniel Black, with minor additional enhancements.
```
bc902a2b

12 Mar, 2015 1 commit
- MDEV-7714: Make possible to get innodb internal primary key for wrapper · 702fdc52
  Jan Lindström authored Mar 12, 2015
```
type storage engine.

Authored by: Kentoku Shiba
```
  702fdc52
11 Mar, 2015 1 commit

MDEV-5289: master server starts slave parallel threads · ed04c40b

Kristian Nielsen authored Mar 11, 2015

Delay spawning parallel replication worker threads until a slave SQL
thread is running, and de-spawn them when the last SQL thread stops.

This is especially useful to avoid needless threads on a master in a
setup where the same my.cnf is used on masters and slaves.
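
The lifecycle amounts to reference counting on running SQL threads. A Python sketch with a hypothetical `WorkerPool` class, in which worker threads are represented by plain strings:

```python
class WorkerPool:
    """Toy model: workers exist only while some SQL thread is running."""

    def __init__(self, size):
        self.size = size
        self.running_sql_threads = 0
        self.workers = []

    def sql_thread_started(self):
        self.running_sql_threads += 1
        if self.running_sql_threads == 1:
            # First SQL thread: spawn the workers now, not at server start.
            self.workers = ["worker-%d" % i for i in range(self.size)]

    def sql_thread_stopped(self):
        self.running_sql_threads -= 1
        if self.running_sql_threads == 0:
            # Last SQL thread gone: release the workers again.
            self.workers = []
```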

ed04c40b

09 Mar, 2015 4 commits

MDEV-7685: MariaDB - server crashes when inserting more rows than · a7fd11b3

Jan Lindström authored Mar 09, 2015

available space on disk

Add error handling for the disk-full situation and
intentionally bring the server down with a stack trace, because
in all cases InnoDB can't continue anyway.

a7fd11b3

MDEV-7107 Sporadic test failure in multi_source.multisource · ec16d1b6

Elena Stepanova authored Mar 09, 2015

Extend show_slave_status.inc to run SHOW ALL SLAVES STATUS and
SHOW SLAVE 'name' STATUS on demand, and make the test use
the include file instead of direct SHOW statements

ec16d1b6

MDEV-7668: Intermediate master groups CREATE TEMPORARY with INSERT, causing... · 96784eb1

Kristian Nielsen authored Mar 09, 2015

MDEV-7668: Intermediate master groups CREATE TEMPORARY with INSERT, causing parallel replication failure

Parallel replication depends on locking (table locks, row locks, etc.) to
prevent two conflicting transactions from running and committing in parallel.
But temporary tables are designed to be visible only to one thread, and have
no such locking.

In the concrete issue, an intermediate master could commit a CREATE TEMPORARY
TABLE in the same group commit as an INSERT into that table. Thus, a
lower-level master could attempt to run them in parallel and get an error.

More generally, we need protection from parallel replication trying to run
transactions in parallel that access a common temporary table.

This patch simply causes use of a temporary table from parallel replication
to wait for all previous transactions to commit, serialising the replication
at that point.

(A more fine-grained locking could be added later, possibly. However,
using temporary tables in statement-based replication is in any case
normally undesirable; for example a restart of the server will lose
temporary tables and can break replication).

Note that row-based replication is not affected, as it does not use any
temporary tables on the slave side.

This patch also cleans up the locking around protecting the list of
temporary tables in Relay_log_info. This used to take the
rli->data_lock at the end of every statement, which is very bad for
concurrency. With this patch, the lock is not taken unless temporary
tables (with statement-based binlogging) are in use on the slave.

96784eb1

MDEV-7627: Some symbols in table name can cause Error Code: 1050 · 040027c8

Jan Lindström authored Mar 09, 2015

when created FK

Analysis: The table name is in the filename charset but foreign key
identifiers are not. This led to an incorrect foreign key
identifier number being used.

Fix: Convert foreign key identifier to filename charset before
comparing it to table name when largest foreign key identifier
number is resolved.
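
A Python sketch of why the conversion matters. The encoder below is a simplified stand-in for the filename charset (it writes every character outside ASCII letters, digits and underscore as `@` plus four hex digits), and the `_ibfk_N` naming without a database prefix is an assumption made to keep the example short.

```python
def to_filename_charset(name):
    """Simplified stand-in for the filename charset encoding."""
    out = []
    for ch in name:
        if ch.isascii() and (ch.isalnum() or ch == "_"):
            out.append(ch)
        else:
            out.append("@%04x" % ord(ch))
    return "".join(out)


def largest_fk_number(table_name_fs, fk_ids):
    """Return the largest N among ids of the form <table>_ibfk_<N>.

    table_name_fs is already in the filename charset, so each identifier is
    converted before its prefix is compared with the table name.
    """
    prefix = table_name_fs + "_ibfk_"
    largest = 0
    for fk_id in fk_ids:
        encoded = to_filename_charset(fk_id)
        suffix = encoded[len(prefix):]
        if encoded.startswith(prefix) and suffix.isdigit():
            largest = max(largest, int(suffix))
    return largest
```

For a table named `a-b`, the encoded name is `a@002db`. Comparing unconverted identifiers against it would match nothing, so the next generated number could collide with an existing one.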

040027c8

08 Mar, 2015 1 commit

MDEV-7187 perfschema.aggregate fails sporadically in buildbot · 6fc0a8af

Elena Stepanova authored Mar 08, 2015

During slow execution, e.g. under valgrind, there was a chance
that Aria checkpoint would happen while P_S tables were being
queried; it could cause different data in joined P_S, and
thus combinations of results that the test did not expect.

Fixed by disabling Aria checkpoints for the test.

6fc0a8af

06 Mar, 2015 7 commits

fix connect.json_udf test for static builds · d61573d3
Sergei Golubchik authored Mar 06, 2015

d61573d3
MDEV-7669 tmp_table_count-7586 fails in ps and embedded · c0af8213
Sergei Golubchik authored Mar 06, 2015
```
disable a broken test, pending a proper fix
```
c0af8213
Merge branch '5.5' into 10.0 · 5f510a91
Sergei Golubchik authored Mar 06, 2015

5f510a91
after innodb/xtradb merge: use the correct visibility for internal functions · 17a37796
Sergei Golubchik authored Mar 06, 2015
```
otherwise innodb plugin might invoke xtradb function with the same name,
and that might crash (./mtr --emb innodb.strict_mode)
```
17a37796
MDEV-6838 Using too big key for internal temp tables · d7d19071
Sergei Golubchik authored Mar 06, 2015
```
update test results after the fix
```
d7d19071

MDEV-7659 buildbot may leave stale mysqld · 12d87c3b

Sergei Golubchik authored Mar 06, 2015

safe_process puts its children (mysqld, in this case) into a separate
process group, to be able to kill it all at once.

buildslave kills mtr's process group when it loses connection to
the master.

result? buildslave kills mtr and safe_process, but leaves stale
mysqld processes in their own process groups.

fix: put safe_process itself into a separate process group, then
buildslave won't kill it and safe_process will kill mysqld
and itself when it notices that the parent mtr no longer exists.
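
The process-group separation can be illustrated in Python on a POSIX system. This sketch is not safe_process itself: `spawn_in_own_group` is a hypothetical helper, and `start_new_session=True` goes slightly further than a bare process-group change because it also starts a new session.

```python
import os
import subprocess
import sys


def spawn_in_own_group(argv):
    """Start argv in a new session, and hence a new process group (POSIX).

    A signal sent to the parent's process group will not reach the child.
    """
    return subprocess.Popen(argv, start_new_session=True)
```

The child becomes the leader of its own group, so `os.getpgid(child.pid)` equals `child.pid` and differs from the parent's group.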

12d87c3b

MDEV-7672: Crash creating an InnoDB table with foreign keys · 206b111b

Jan Lindström authored Mar 06, 2015

Analysis: after a red-black tree lookup we use the node without
checking whether the lookup succeeded. This led to a situation
where a NULL pointer was used.

Fix: Add an additional check that the node found in the red-black tree
is valid.

206b111b

05 Mar, 2015 2 commits

MDEV-7148 - Recurring: InnoDB: Failing assertion: !lock->recursive · e13459a1

Sergey Vojtovich authored Mar 05, 2015

Re-applied a revision that was lost in the merge:
commit ed313e8a
Author: Sergey Vojtovich <svoj@mariadb.org>
Date:   Mon Dec 1 14:58:29 2014 +0400

    MDEV-7148 - Recurring: InnoDB: Failing assertion: !lock->recursive

    On PPC64 high-loaded server may crash due to assertion failure in InnoDB
    rwlocks code.

    This happened because load order between "recursive" and "writer_thread"
    wasn't properly enforced.

e13459a1

MDEV-7578: Slave is ~10x slower to execute set of statements compared to master when using RBR · f66fbe8c

Jan Lindström authored Mar 05, 2015

Analysis: On the master, when executing (single/multi) row INSERTs/REPLACEs,
InnoDB falls back to old-style autoinc locks (table locks)
only if another transaction has already acquired the AUTOINC lock.
On the slave, however, we are executing log_events and sql_command
is not correctly set, so InnoDB does not use new-style autoinc
locks when it could.

Fix: Use new style autoinc locks also when
thd_sql_command(user_thd) == SQLCOM_END i.e. this is RBR event.
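
The resulting decision, sketched in Python. The string constants are stand-ins for the server's command codes and the function name is hypothetical; only the shape of the condition is taken from the message above.

```python
# Stand-ins for the server's SQLCOM_* values (assumed for this sketch).
SQLCOM_INSERT = "insert"
SQLCOM_REPLACE = "replace"
SQLCOM_END = "end"      # what a row-based replication event reports
SQLCOM_UPDATE = "update"


def use_new_style_autoinc_lock(sql_command, autoinc_table_lock_held):
    """Return True when the lightweight autoinc lock may be used.

    The old-style table lock is needed only if another transaction already
    holds the AUTOINC table lock, or the statement is not a plain
    INSERT/REPLACE (or an RBR event, after the fix).
    """
    if sql_command not in (SQLCOM_INSERT, SQLCOM_REPLACE, SQLCOM_END):
        return False
    return not autoinc_table_lock_held
```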

f66fbe8c