Commit 8bc44536 authored by unknown's avatar unknown

MWL#116: Efficient group commit

Tweak the commit_ordered() semantics. Now it is only called for transactions
that go through 2-phase commit. This avoids forcing engines to make commits
visible before they are durable.

Also take LOCK_commit_ordered() around START TRANSACTION WITH CONSISTENT
SNAPSHOT, to get a truly consistent snapshot.
parent 498f10a2
......@@ -1251,32 +1251,7 @@ int ha_commit_one_phase(THD *thd, bool all)
enclosing 'all' transaction is rolled back.
*/
bool is_real_trans=all || thd->transaction.all.ha_list == 0;
Ha_trx_info *ha_info= trans->ha_list;
DBUG_ENTER("ha_commit_one_phase");
#ifdef USING_TRANSACTIONS
if (ha_info)
{
if (is_real_trans)
{
bool locked= false;
for (; ha_info; ha_info= ha_info->next())
{
handlerton *ht= ha_info->ht();
if (ht->commit_ordered)
{
if (ha_info->is_trx_read_write() && !locked)
{
pthread_mutex_lock(&LOCK_commit_ordered);
locked= 1;
}
ht->commit_ordered(ht, thd, all);
}
}
if (locked)
pthread_mutex_unlock(&LOCK_commit_ordered);
}
}
#endif /* USING_TRANSACTIONS */
DBUG_RETURN(commit_one_phase_2(thd, all, trans, is_real_trans));
}
......@@ -1901,7 +1876,13 @@ int ha_start_consistent_snapshot(THD *thd)
{
bool warn= true;
/*
Holding the LOCK_commit_ordered mutex ensures that for any transaction
we either see it committed in all engines, or in none.
*/
pthread_mutex_lock(&LOCK_commit_ordered);
plugin_foreach(thd, snapshot_handlerton, MYSQL_STORAGE_ENGINE_PLUGIN, &warn);
pthread_mutex_unlock(&LOCK_commit_ordered);
/*
Same idea as when one wants to CREATE TABLE in one engine which does not
......
......@@ -667,6 +667,11 @@ struct handlerton
full transaction is committed, not for each commit of statement
transaction in a multi-statement transaction.
Not that like prepare(), commit_ordered() is only called when 2-phase
commit takes place. Ie. when no binary log and only a single engine
participates in a transaction, one commit() is called, no
commit_orderd(). So engines must be prepared for this.
The calls to commit_ordered() in multiple parallel transactions is
guaranteed to happen in the same order in every participating
handler. This can be used to ensure the same commit order among multiple
......@@ -684,11 +689,9 @@ struct handlerton
doing any time-consuming or blocking operations in commit_ordered() will
limit scalability.
Handlers can rely on commit_ordered() calls for transactions that updated
data to be serialised (no two calls can run in parallel, so no extra
locking on the handler part is required to ensure this). However, calls
for SELECT-only transactions are not serialised, so can occur in parallel
with each other and with at most one write-transaction.
Handlers can rely on commit_ordered() calls to be serialised (no two
calls can run in parallel, so no extra locking on the handler part is
required to ensure this).
Note that commit_ordered() can be called from a different thread than the
one handling the transaction! So it can not do anything that depends on
......@@ -700,7 +703,8 @@ struct handlerton
must be saved and returned from the commit() method instead.
The commit_ordered method is optional, and can be left unset if not
needed in a particular handler.
needed in a particular handler (then there will be no ordering guarantees
wrt. other engines and binary log).
*/
void (*commit_ordered)(handlerton *hton, THD *thd, bool all);
int (*rollback)(handlerton *hton, THD *thd, bool all);
......
This diff is collapsed.
......@@ -511,9 +511,10 @@ struct trx_struct{
in that case we must flush the log
in trx_commit_complete_for_mysql() */
ulint duplicates; /*!< TRX_DUP_IGNORE | TRX_DUP_REPLACE */
ulint active_trans; /*!< 1 - if a transaction in MySQL
is active. 2 - if prepare_commit_mutex
was taken */
ulint active_trans; /*!< TRX_ACTIVE_IN_MYSQL - set if a
transaction in MySQL is active.
TRX_ACTIVE_COMMIT_ORDERED - set if
innobase_commit_ordered has run */
ulint has_search_latch;
/* TRUE if this trx has latched the
search system latch in S-mode */
......@@ -824,6 +825,10 @@ Multiple flags can be combined with bitwise OR. */
#define TRX_SIG_OTHER_SESS 1 /* sent by another session (which
must hold rights to this) */
/* Flag bits for trx_struct.active_trans */
#define TRX_ACTIVE_IN_MYSQL (1<<0)
#define TRX_ACTIVE_COMMIT_ORDERED (1<<1)
/** Commit node states */
enum commit_node_state {
COMMIT_NODE_SEND = 1, /*!< about to send a commit signal to
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment