Commit 7532ad19 authored by Satya B's avatar Satya B

Applying InnoDB Plugin 1.0.5 snapshot, part 7

From revisions r5792 to r5864

Detailed revision comments:

r5792 | vasil | 2009-09-09 08:35:58 -0500 (Wed, 09 Sep 2009) | 32 lines
branches/zip:

Fix a bug in manipulating the variable innodb_old_blocks_pct:

for any value assigned it got that value -1, except for 75. When
assigned 75, it got 75.

  mysql> set global innodb_old_blocks_pct=15;
  Query OK, 0 rows affected (0.00 sec)
  
  mysql> show variables like 'innodb_old_blocks_pct';
  +-----------------------+-------+
  | Variable_name         | Value |
  +-----------------------+-------+
  | innodb_old_blocks_pct | 14    | 
  +-----------------------+-------+
  1 row in set (0.00 sec)
  
  mysql> set global innodb_old_blocks_pct=75;
  Query OK, 0 rows affected (0.00 sec)
  
  mysql> show variables like 'innodb_old_blocks_pct';
  +-----------------------+-------+
  | Variable_name         | Value |
  +-----------------------+-------+
  | innodb_old_blocks_pct | 75    | 
  +-----------------------+-------+

After the fix it gets exactly what was assigned.

Approved by:	Marko (via IM)

r5798 | calvin | 2009-09-09 10:28:10 -0500 (Wed, 09 Sep 2009) | 5 lines
branches/zip:

HA_ERR_TOO_MANY_CONCURRENT_TRXS is added in 5.1.38.
But the plugin should still work with previous versions
of MySQL.
r5804 | marko | 2009-09-10 00:29:31 -0500 (Thu, 10 Sep 2009) | 1 line
branches/zip: trx_cleanup_at_db_startup(): Fix a typo in comment.
r5822 | marko | 2009-09-10 05:10:20 -0500 (Thu, 10 Sep 2009) | 1 line
branches/zip: buf_page_release(): De-stutter the function comment.
r5825 | marko | 2009-09-10 05:47:09 -0500 (Thu, 10 Sep 2009) | 20 lines
branches/zip: Reduce mutex contention that was introduced when
addressing Bug #45015 (Issue #316), in r5703.

buf_page_set_accessed_make_young(): New auxiliary function, called by
buf_page_get_zip(), buf_page_get_gen(),
buf_page_optimistic_get_func(). Call ut_time_ms() outside of
buf_pool_mutex. Use cached access_time.

buf_page_set_accessed(): Add the parameter time_ms, so that
ut_time_ms() need not be called while holding buf_pool_mutex.

buf_page_optimistic_get_func(), buf_page_get_known_nowait(): Read
buf_page_t::access_time without holding buf_pool_mutex. This should be
OK, because the field is only used for heuristic purposes.

buf_page_peek_if_too_old(): If buf_pool->freed_page_clock == 0, return
FALSE, so that we will not waste time moving blocks in the LRU list in
the warm-up phase or when the workload fits in the buffer pool.

rb://156 approved by Sunny Bains
r5826 | marko | 2009-09-10 06:29:46 -0500 (Thu, 10 Sep 2009) | 12 lines
branches/zip: Roll back recovered dictionary transactions before
dropping incomplete indexes (Issue #337).

trx_rollback_or_clean_recovered(ibool all): New function, split from
trx_rollback_or_clean_all_recovered().  all==FALSE will only roll back
dictionary transactions.

recv_recovery_from_checkpoint_finish(): Call
trx_rollback_or_clean_recovered(FALSE) before
row_merge_drop_temp_indexes().

rb://158 approved by Sunny Bains
r5858 | vasil | 2009-09-11 12:46:47 -0500 (Fri, 11 Sep 2009) | 4 lines
branches/zip:

Fix the indentation of the closing bracket.

r5863 | vasil | 2009-09-12 02:07:08 -0500 (Sat, 12 Sep 2009) | 10 lines
branches/zip:

Check that pthread_t can indeed be passed to Solaris atomic functions, instead
of assuming that it can be passed if 0 can be assigned to it. It could be that:
* 0 can be assigned, but pthread_t cannot be passed and
* 0 cannot be assigned but pthread_t can be passed

Better to check what we are interested in, not something else and make
assumptions.

r5864 | vasil | 2009-09-12 02:22:55 -0500 (Sat, 12 Sep 2009) | 4 lines
branches/zip:

Include string.h which is needed for memset().
parent 963d2d43
......@@ -1489,7 +1489,7 @@ buf_pool_resize(void)
/********************************************************************//**
Moves a page to the start of the buffer pool LRU list. This high-level
function can be used to prevent an important page from from slipping out of
function can be used to prevent an important page from slipping out of
the buffer pool. */
UNIV_INTERN
void
......@@ -1506,6 +1506,36 @@ buf_page_make_young(
buf_pool_mutex_exit();
}
/********************************************************************//**
Sets the time of the first access of a page and moves a page to the
start of the buffer pool LRU list if it is too old. This high-level
function can be used to prevent an important page from slipping
out of the buffer pool. */
static
void
buf_page_set_accessed_make_young(
/*=============================*/
buf_page_t* bpage, /*!< in/out: buffer block of a
file page */
unsigned access_time) /*!< in: bpage->access_time
read under mutex protection,
or 0 if unknown */
{
ut_ad(!buf_pool_mutex_own());
ut_a(buf_page_in_file(bpage));
if (buf_page_peek_if_too_old(bpage)) {
buf_pool_mutex_enter();
buf_LRU_make_block_young(bpage);
buf_pool_mutex_exit();
} else if (!access_time) {
ulint time_ms = ut_time_ms();
buf_pool_mutex_enter();
buf_page_set_accessed(bpage, time_ms);
buf_pool_mutex_exit();
}
}
/********************************************************************//**
Resets the check_index_page_at_flush field of a page if found in the buffer
pool. */
......@@ -1637,6 +1667,7 @@ buf_page_get_zip(
buf_page_t* bpage;
mutex_t* block_mutex;
ibool must_read;
unsigned access_time;
#ifndef UNIV_LOG_DEBUG
ut_ad(!ibuf_inside());
......@@ -1704,17 +1735,14 @@ err_exit:
got_block:
must_read = buf_page_get_io_fix(bpage) == BUF_IO_READ;
if (buf_page_peek_if_too_old(bpage)) {
buf_LRU_make_block_young(bpage);
}
buf_page_set_accessed(bpage);
access_time = buf_page_is_accessed(bpage);
buf_pool_mutex_exit();
mutex_exit(block_mutex);
buf_page_set_accessed_make_young(bpage, access_time);
#ifdef UNIV_DEBUG_FILE_ACCESSES
ut_a(!bpage->file_page_was_freed);
#endif
......@@ -2244,14 +2272,10 @@ wait_until_unfixed:
access_time = buf_page_is_accessed(&block->page);
if (buf_page_peek_if_too_old(&block->page)) {
buf_LRU_make_block_young(&block->page);
}
buf_page_set_accessed(&block->page);
buf_pool_mutex_exit();
buf_page_set_accessed_make_young(&block->page, access_time);
#ifdef UNIV_DEBUG_FILE_ACCESSES
ut_a(!block->page.file_page_was_freed);
#endif
......@@ -2353,18 +2377,13 @@ buf_page_optimistic_get_func(
mutex_exit(&block->mutex);
buf_pool_mutex_enter();
/* Check if this is the first access to the page.
We do a dirty read on purpose, to avoid mutex contention.
This field is only used for heuristic purposes; it does not
affect correctness. */
/* Check if this is the first access to the page */
access_time = buf_page_is_accessed(&block->page);
if (buf_page_peek_if_too_old(&block->page)) {
buf_LRU_make_block_young(&block->page);
}
buf_page_set_accessed(&block->page);
buf_pool_mutex_exit();
buf_page_set_accessed_make_young(&block->page, access_time);
ut_ad(!ibuf_inside()
|| ibuf_page(buf_block_get_space(block),
......@@ -2477,15 +2496,21 @@ buf_page_get_known_nowait(
mutex_exit(&block->mutex);
buf_pool_mutex_enter();
if (mode == BUF_MAKE_YOUNG && buf_page_peek_if_too_old(&block->page)) {
buf_pool_mutex_enter();
buf_LRU_make_block_young(&block->page);
}
buf_page_set_accessed(&block->page);
buf_pool_mutex_exit();
} else if (!buf_page_is_accessed(&block->page)) {
/* Above, we do a dirty read on purpose, to avoid
mutex contention. The field buf_page_t::access_time
is only used for heuristic purposes. Writes to the
field must be protected by mutex, however. */
ulint time_ms = ut_time_ms();
buf_pool_mutex_exit();
buf_pool_mutex_enter();
buf_page_set_accessed(&block->page, time_ms);
buf_pool_mutex_exit();
}
ut_ad(!ibuf_inside() || (mode == BUF_KEEP_OLD));
......@@ -2917,6 +2942,7 @@ buf_page_create(
buf_frame_t* frame;
buf_block_t* block;
buf_block_t* free_block = NULL;
ulint time_ms = ut_time_ms();
ut_ad(mtr);
ut_ad(space || !zip_size);
......@@ -3000,7 +3026,7 @@ buf_page_create(
rw_lock_x_unlock(&block->lock);
}
buf_page_set_accessed(&block->page);
buf_page_set_accessed(&block->page, time_ms);
buf_pool_mutex_exit();
......
......@@ -1866,7 +1866,9 @@ buf_LRU_old_ratio_update(
buf_LRU_old_ratio = ratio;
}
return(ratio * 100 / BUF_LRU_OLD_RATIO_DIV);
/* the reverse of
ratio = old_pct * BUF_LRU_OLD_RATIO_DIV / 100 */
return((uint) (ratio * 100 / (double) BUF_LRU_OLD_RATIO_DIV + 0.5));
}
/********************************************************************//**
......
......@@ -868,17 +868,14 @@ convert_error_code_to_mysql(
return(ER_PRIMARY_CANT_HAVE_NULL);
case DB_TOO_MANY_CONCURRENT_TRXS:
/* Once MySQL add the appropriate code to errmsg.txt then
we can get rid of this #ifdef. NOTE: The code checked by
the #ifdef is the suggested name for the error condition
and the actual error code name could very well be different.
This will require some monitoring, ie. the status
of this request on our part.*/
#ifdef ER_TOO_MANY_CONCURRENT_TRXS
return(ER_TOO_MANY_CONCURRENT_TRXS);
#else
/* New error code HA_ERR_TOO_MANY_CONCURRENT_TRXS is only
available in 5.1.38 and later, but the plugin should still
work with previous versions of MySQL. */
#ifdef HA_ERR_TOO_MANY_CONCURRENT_TRXS
return(HA_ERR_TOO_MANY_CONCURRENT_TRXS);
#else /* HA_ERR_TOO_MANY_CONCURRENT_TRXS */
return(HA_ERR_RECORD_FILE_FULL);
#endif
#endif /* HA_ERR_TOO_MANY_CONCURRENT_TRXS */
case DB_UNSUPPORTED:
return(HA_ERR_UNSUPPORTED);
}
......
......@@ -346,7 +346,7 @@ buf_page_release(
mtr_t* mtr); /*!< in: mtr */
/********************************************************************//**
Moves a page to the start of the buffer pool LRU list. This high-level
function can be used to prevent an important page from from slipping out of
function can be used to prevent an important page from slipping out of
the buffer pool. */
UNIV_INTERN
void
......@@ -821,7 +821,8 @@ UNIV_INLINE
void
buf_page_set_accessed(
/*==================*/
buf_page_t* bpage) /*!< in/out: control block */
buf_page_t* bpage, /*!< in/out: control block */
ulint time_ms) /*!< in: ut_time_ms() */
__attribute__((nonnull));
/*********************************************************************//**
Gets the buf_block_t handle of a buffered file block if an uncompressed
......
......@@ -72,10 +72,16 @@ buf_page_peek_if_too_old(
/*=====================*/
const buf_page_t* bpage) /*!< in: block to make younger */
{
if (buf_LRU_old_threshold_ms && bpage->old) {
if (UNIV_UNLIKELY(buf_pool->freed_page_clock == 0)) {
/* If eviction has not started yet, do not update the
statistics or move blocks in the LRU list. This is
either the warm-up phase or an in-memory workload. */
return(FALSE);
} else if (buf_LRU_old_threshold_ms && bpage->old) {
unsigned access_time = buf_page_is_accessed(bpage);
if (access_time && ut_time_ms() - access_time
if (access_time > 0
&& (ut_time_ms() - access_time)
>= buf_LRU_old_threshold_ms) {
return(TRUE);
}
......@@ -85,10 +91,10 @@ buf_page_peek_if_too_old(
} else {
/* FIXME: bpage->freed_page_clock is 31 bits */
return((buf_pool->freed_page_clock & ((1UL << 31) - 1))
> bpage->freed_page_clock
+ (buf_pool->curr_size
* (BUF_LRU_OLD_RATIO_DIV - buf_LRU_old_ratio)
/ (BUF_LRU_OLD_RATIO_DIV * 4)));
> ((ulint) bpage->freed_page_clock
+ (buf_pool->curr_size
* (BUF_LRU_OLD_RATIO_DIV - buf_LRU_old_ratio)
/ (BUF_LRU_OLD_RATIO_DIV * 4))));
}
}
......@@ -490,14 +496,15 @@ UNIV_INLINE
void
buf_page_set_accessed(
/*==================*/
buf_page_t* bpage) /*!< in/out: control block */
buf_page_t* bpage, /*!< in/out: control block */
ulint time_ms) /*!< in: ut_time_ms() */
{
ut_a(buf_page_in_file(bpage));
ut_ad(buf_pool_mutex_own());
if (!bpage->access_time) {
/* Make this the time of the first access. */
bpage->access_time = ut_time_ms();
bpage->access_time = time_ms;
}
}
......
......@@ -133,6 +133,17 @@ trx_rollback(
Rollback or clean up any incomplete transactions which were
encountered in crash recovery. If the transaction already was
committed, then we clean up a possible insert undo log. If the
transaction was not yet committed, then we roll it back. */
UNIV_INTERN
void
trx_rollback_or_clean_recovered(
/*============================*/
ibool all); /*!< in: FALSE=roll back dictionary transactions;
TRUE=roll back all non-PREPARED transactions */
/*******************************************************************//**
Rollback or clean up any incomplete transactions which were
encountered in crash recovery. If the transaction already was
committed, then we clean up a possible insert undo log. If the
transaction was not yet committed, then we roll it back.
Note: this is done in a background thread.
@return a dummy parameter */
......
......@@ -179,7 +179,7 @@ trx_commit_off_kernel(
/****************************************************************//**
Cleans up a transaction at database startup. The cleanup is needed if
the transaction already got to the middle of a commit when the database
crashed, andf we cannot roll it back. */
crashed, and we cannot roll it back. */
UNIV_INTERN
void
trx_cleanup_at_db_startup(
......
......@@ -3118,6 +3118,11 @@ recv_recovery_from_checkpoint_finish(void)
#ifndef UNIV_LOG_DEBUG
recv_sys_free();
#endif
/* Roll back any recovered data dictionary transactions, so
that the data dictionary tables will be free of any locks.
The data dictionary latch should guarantee that there is at
most one data dictionary transaction active at a time. */
trx_rollback_or_clean_recovered(FALSE);
/* Drop partially created indexes. */
row_merge_drop_temp_indexes();
......
......@@ -112,7 +112,7 @@ MYSQL_PLUGIN_ACTIONS(innodb_plugin, [
[
AC_MSG_RESULT(no)
]
)
)
AC_MSG_CHECKING(whether pthread_t can be used by GCC atomic builtins)
AC_TRY_RUN(
......@@ -142,7 +142,7 @@ MYSQL_PLUGIN_ACTIONS(innodb_plugin, [
[
AC_MSG_RESULT(no)
]
)
)
# Try using solaris atomics on SunOS if GCC atomics are not available
AC_CHECK_DECLS(
......
......@@ -532,28 +532,26 @@ trx_rollback_active(
Rollback or clean up any incomplete transactions which were
encountered in crash recovery. If the transaction already was
committed, then we clean up a possible insert undo log. If the
transaction was not yet committed, then we roll it back.
Note: this is done in a background thread.
@return a dummy parameter */
transaction was not yet committed, then we roll it back. */
UNIV_INTERN
os_thread_ret_t
trx_rollback_or_clean_all_recovered(
/*================================*/
void* arg __attribute__((unused)))
/*!< in: a dummy parameter required by
os_thread_create */
void
trx_rollback_or_clean_recovered(
/*============================*/
ibool all) /*!< in: FALSE=roll back dictionary transactions;
TRUE=roll back all non-PREPARED transactions */
{
trx_t* trx;
mutex_enter(&kernel_mutex);
if (UT_LIST_GET_FIRST(trx_sys->trx_list)) {
if (!UT_LIST_GET_FIRST(trx_sys->trx_list)) {
goto leave_function;
}
if (all) {
fprintf(stderr,
"InnoDB: Starting in background the rollback"
" of uncommitted transactions\n");
} else {
goto leave_function;
}
mutex_exit(&kernel_mutex);
......@@ -582,18 +580,42 @@ loop:
goto loop;
case TRX_ACTIVE:
mutex_exit(&kernel_mutex);
trx_rollback_active(trx);
goto loop;
if (all || trx_get_dict_operation(trx)
!= TRX_DICT_OP_NONE) {
mutex_exit(&kernel_mutex);
trx_rollback_active(trx);
goto loop;
}
}
}
ut_print_timestamp(stderr);
fprintf(stderr,
" InnoDB: Rollback of non-prepared transactions completed\n");
if (all) {
ut_print_timestamp(stderr);
fprintf(stderr,
" InnoDB: Rollback of non-prepared"
" transactions completed\n");
}
leave_function:
mutex_exit(&kernel_mutex);
}
/*******************************************************************//**
Rollback or clean up any incomplete transactions which were
encountered in crash recovery. If the transaction already was
committed, then we clean up a possible insert undo log. If the
transaction was not yet committed, then we roll it back.
Note: this is done in a background thread.
@return a dummy parameter */
UNIV_INTERN
os_thread_ret_t
trx_rollback_or_clean_all_recovered(
/*================================*/
void* arg __attribute__((unused)))
/*!< in: a dummy parameter required by
os_thread_create */
{
trx_rollback_or_clean_recovered(TRUE);
/* We count the number of threads in os_thread_exit(). A created
thread should always use that to exit and not use return() to exit. */
......
......@@ -950,7 +950,7 @@ trx_commit_off_kernel(
/****************************************************************//**
Cleans up a transaction at database startup. The cleanup is needed if
the transaction already got to the middle of a commit when the database
crashed, andf we cannot roll it back. */
crashed, and we cannot roll it back. */
UNIV_INTERN
void
trx_cleanup_at_db_startup(
......
......@@ -17,18 +17,38 @@ Place, Suite 330, Boston, MA 02111-1307 USA
*****************************************************************************/
/*****************************************************************************
If this program compiles, then pthread_t objects can be used as arguments
to Solaris libc atomic functions.
If this program compiles and returns 0, then pthread_t objects can be used as
arguments to Solaris libc atomic functions.
Created April 18, 2009 Vasil Dimov
*****************************************************************************/
#include <pthread.h>
#include <string.h>
int
main(int argc, char** argv)
{
pthread_t x = 0;
pthread_t x1;
pthread_t x2;
pthread_t x3;
memset(&x1, 0x0, sizeof(x1));
memset(&x2, 0x0, sizeof(x2));
memset(&x3, 0x0, sizeof(x3));
if (sizeof(pthread_t) == 4) {
atomic_cas_32(&x1, x2, x3);
} else if (sizeof(pthread_t) == 8) {
atomic_cas_64(&x1, x2, x3);
} else {
return(1);
}
return(0);
}
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment