Commit a5bcb63f authored by Guilhem Bichot

WL#4374 "Maria - force start if Recovery fails multiple times"

http://forge.mysql.com/worklog/task.php?id=4374
New option --maria-force-start-after-recovery-failures=N. The number of consecutive recovery failures (failures
of log reading or recovery processing, anything in [translog_init(),maria_recovery_from_log()])
is stored in the control file; if at a Maria start there are more than N of them, the logs are removed. This is for automated
systems which have to run whatever happens. As tables risk staying corrupted, --maria-recover should also
be used on them: this revision makes maria-recover work (it was disabled).
Fixed a bug in translog_is_log_files(). translog_init() now prints a message to the error log if it fails.
Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there.
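
The counting scheme described above can be sketched as follows. This is a minimal Python model, not the code from ha_maria.cc: the class names, the in-memory "control file", the toy recovery function, the choice to count an attempt as failed before recovery runs, and the reading of N=0 as "disabled" are all assumptions made for illustration. Only the names mark_recovery_start() and mark_recovery_success(), the one-byte counter, and the "more than N" rule come from the commit message.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_COUNTER = 255  # the commit stores the counter in a single byte


@dataclass
class ControlFile:
    """Hypothetical stand-in for the control file; only the counter is modelled."""
    recovery_failures: int = 0


@dataclass
class Server:
    control: ControlFile
    logs: List[str]          # names of the log files currently "on disk"
    force_start_after: int   # N; 0 is assumed here to mean "feature disabled"

    def mark_recovery_start(self) -> None:
        # More than N consecutive failures: remove the logs so startup can proceed.
        if self.force_start_after and self.control.recovery_failures > self.force_start_after:
            self.logs.clear()
        # Assumption: count this attempt as failed until recovery proves otherwise.
        self.control.recovery_failures = min(self.control.recovery_failures + 1, MAX_COUNTER)

    def mark_recovery_success(self) -> None:
        self.control.recovery_failures = 0

    def start(self, recover: Callable[[List[str]], bool]) -> bool:
        self.mark_recovery_start()
        if not recover(self.logs):
            return False  # the counter stays incremented for the next start
        self.mark_recovery_success()
        return True


def damaged_log_recovery(logs: List[str]) -> bool:
    """Toy recovery: fails as long as any (damaged) log file is present."""
    return not logs


def starts_until_success(n: int, max_starts: int = 10) -> int:
    """Return how many starts are needed before the server comes up, or -1."""
    control = ControlFile()
    logs = ["maria_log.00000001"]
    for attempt in range(1, max_starts + 1):
        # A fresh Server per attempt models a process restart; state on disk persists.
        server = Server(control=control, logs=logs, force_start_after=n)
        if server.start(damaged_log_recovery):
            return attempt
    return -1
```

Under these assumptions, starts_until_success(2) needs four starts: three failures are recorded, and the fourth start sees a count above N, drops the logs and comes up. With N=0 the model never forces a start.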

KNOWN_BUGS.txt:
  The new option --maria-force-start-after-recovery-failures fulfills the wish "we should fix that if this happens etc".
  LOAD INDEX has not been ignored for a few weeks now. The listed concurrency bugs were fixed some time ago.
  Recovery of fulltext and GIS indexes has worked for a few weeks now.
mysql-test/include/maria_make_snapshot.inc:
  configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_make_snapshot_for_comparison.inc:
  configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc:
  configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_verify_recovery.inc:
  configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/lib/mtr_report.pl:
  new test maria-recover.test generates expected corruption warnings in the error log. maria-recovery.test's corrupted table is renamed to t_corrupted1 instead of t1.
mysql-test/r/maria-preload.result:
  result update. maria_pagecache_read* values are similar to the previous version of this file, though a bit bigger
  because using the information_schema and the join leads to some internal maria temp table being used, and thus some
  blocks of it being read.
mysql-test/r/maria-purge.result:
  engine's name in SHOW ENGINE MARIA LOGS changed.
mysql-test/r/maria-recover.result:
  result for new test. We see corruption messages at the first SELECT and then none at the second SELECT, as expected.
mysql-test/r/maria-recovery.result:
  result update
mysql-test/r/maria.result:
  new variables show up
mysql-test/t/disabled.def:
  BUG#34911 is not fixed, but the test has been made independent of the bug (workaround). A new bug (crash) has popped up recently, so the test has to stay
  disabled (BUG#35107).
mysql-test/t/maria-preload.test:
  Work around BUG#34911 "FLUSH STATUS doesn't flush what it should":
  compute differences in status variables before and after relevant queries
mysql-test/t/maria-recover-master.opt:
  test --maria-recover
mysql-test/t/maria-recover.test:
  Test of the --maria-recover option (build a corrupted table and see if it is auto-repaired)
mysql-test/t/maria-recovery-big.test:
  update for new API of include/maria*.inc
mysql-test/t/maria-recovery-bitmap.test:
  update for new API of include/maria*.inc
mysql-test/t/maria-recovery.test:
  update for new API of include/maria*.inc. Corrupted table t1 renamed to t_corrupted1, so that mtr_report.pl
  does not blindly remove all corruption messages for t1 which is
  a common name.
storage/maria/ha_maria.cc:
  Enabling maria-recover.
  Adding option and global variable --maria_force_start_after_recovery_failures: ha_maria_init()
  calls mark_recovery_start() and mark_recovery_success() to keep track of failed consecutive recoveries
  and remove logs if needed.
  Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there.
storage/maria/ma_checkpoint.c:
  new prototype
storage/maria/ma_control_file.c:
  Storing the number of consecutive recovery failures in one byte of the control file.
storage/maria/ma_control_file.h:
  new prototype
storage/maria/ma_init.c:
  new prototype
storage/maria/ma_locking.c:
  Need to update open_count on disk at first write and close for transactional tables, like we already did for
  non-transactional tables, otherwise we cannot notice that the table is dubious.
storage/maria/ma_loghandler.c:
  translog_is_log_files() is made more generic to serve either to search or to delete logs (the latter is
  for --maria-force-start-after-recovery-failures). It also had a bug (always returned FALSE).
storage/maria/ma_loghandler.h:
  export function because ha_maria::mark_recovery_start() needs it
storage/maria/ma_recovery.c:
  changing name of maria_recover() to distinguish from the maria-recover option.
storage/maria/ma_recovery.h:
  changing name of maria_recover() to distinguish from the maria-recover option.
storage/maria/ma_test_force_start.pl:
  Test of --maria-force-start-after-recovery-failures (and also, to be realistic, of --maria-recover).
  This is standalone because mysql-test-run does not support testing that multiple mysqld restarts fail as expected.
  I'll have to run it on my machine and also on a Windows machine.
storage/maria/unittest/ma_control_file-t.c:
  adding recovery_failures to the test
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  fix for compiler warning (unused variable in non-debug build)
parent 2d64cd05
......@@ -24,23 +24,9 @@ or in the worst case add it here for others to know!
Known bugs that we are working on and will be fixed shortly
===========================================================
- If the log files are damaged or inconsistent, Maria may fail to start.
We should fix that if this happens and mysqld is restarted (thanks to
mysqld_safe, instance manager or other script) it should disregard the
old logs, start anyway and automaticly repair any tables that was found
to be crashed on open.
Temporary fix is to remove or maria_log.???????? files from the data
directory, restart mysqld and run CHECK TABLE / REPAIR TABLE or
mysqlcheck on your Maria tables
- We have some instabilities in log writing that is under investigatation
This causes mainly assert to triggers in the code and sometimes
the log handler doesn't start up after restart.
- LOAD INDEX commands are for the moment ignored for Maria tables
(The code needs to be rewritten to do all reads through page cache to
avoid half-block reads)
- Some concurrency bugs in Maria's page cache which sometimes show up
under load http://bugs.mysql.com/bug.php?id=34161 and
http://bugs.mysql.com/bug.php?id=34634 .
Known bugs that are planned to be fixed before Beta
===================================================
......@@ -61,19 +47,15 @@ Known bugs that are planned to be fixed before Beta
Missing features that is planned to fix before Beta
===================================================
- We will add an maria-recover option to automaticly repair any
crashed tables on open. (This is needed for not transactional tables
and also in edge cases for transactional tables when the table
crashed because of a bug in MySQL or Maria code)
- Multiple concurrent inserts & multiple concurrent readers at same time
with full MVCC control. Note that UPDATE and DELETE will still be
blocking (as with MyISAM)
- COUNT(*) and TABLE CHECKSUM under MVCC (ie, they are instant and kept up
to date even with multiple inserter)
- Recovery of fulltext and GIS indexes.
Features planned for future releases
====================================
http://forge.mysql.com/worklog/
(you can enter "maria" in the "quick search" field there).
......@@ -10,28 +10,29 @@
# $mms_copy : to copy table from database to spare directory
# $mms_reverse : to copy it back
# $mms_compare_physically : to compare both byte-for-byte
# 2) set $mms_table_to_use to a number N: table will be mysqltest.tN
# 2) set $mms_tname to a string and set $mms_table_to_use to a number: tables
# will be mysqltest.$mms_tname$mms_table_to_use.
# 3) set $mms_purpose to say what this copy is for (influences the naming
# of the spare directory).
if ($mms_copy)
{
--echo * copied t$mms_table_to_use for $mms_purpose
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAD;
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAI;
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.frm $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.frm;
--echo * copied $mms_tname$mms_table_to_use for $mms_purpose
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAD;
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAI;
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.frm $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.frm;
}
if ($mms_reverse_copy)
{
# do not call this without flushing target table first!
--echo * copied t$mms_table_to_use back for $mms_purpose
--echo * copied $mms_tname$mms_table_to_use back for $mms_purpose
-- error 0,1
remove_file $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAD;
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAD;
remove_file $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAD;
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAD;
-- error 0,1
remove_file $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAI;
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAI;
remove_file $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAI;
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAI;
}
if ($mms_compare_physically)
......@@ -41,8 +42,8 @@ if ($mms_compare_physically)
# So, do this only when testing REDO phase.
# If UNDO phase, we nevertheless compare checksums
# (see maria_verify_recovery.inc).
--echo * compared t$mms_table_to_use to old version
diff_files $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAD;
--echo * compared $mms_tname$mms_table_to_use to old version
diff_files $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAD;
# index file not yet recovered
# diff_files $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAI;
# diff_files $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAI;
}
# Maria helper script
# Copies clean tables' data and index file to other directory
# Tables are t1...t[$mms_tables]
# Tables are $mms_tname1...$mms_tname[$mms_tables]
# They are later used as a reference to see if recovery works.
# API:
# set $mms_tables to N, the script will cover tables mysqltest.t1,...tN
# set $mms_tname to a string, and $mms_tables to a number N, the script will
# cover tables mysqltest.$mms_tname1,...$mms_tnameN
connection admin;
......@@ -22,7 +23,7 @@ eval create database mysqltest_for_$mms_purpose;
while ($mms_table_to_use)
{
# to serve as a reference, table must be in a clean state
eval flush table t$mms_table_to_use;
eval flush table $mms_tname$mms_table_to_use;
-- source include/maria_make_snapshot.inc
dec $mms_table_to_use;
}
......
# Maria helper script
# Copies tables' data and index file to other directory, and control file.
# Tables are t1...t[$mms_tables].
# Tables are $mms_tname1...$mms_tname[$mms_tables].
# Later, mysqld is shutdown, and that snapshot is put back into the
# datadir, control file too ("flashing recovery's brain"), and recovery is let
# to run on it (see maria_verify_recovery.inc).
# API:
# set $mms_tables to N, the script will cover tables mysqltest.t1,...tN
# set $mms_tname to a string, and $mms_tables to a number N, the script will
# cover tables mysqltest.$mms_tname1,...$mms_tnameN
connection admin;
......
......@@ -2,7 +2,8 @@
# Runs recovery, compare with expected table data.
# API:
# 1) set $mms_tables to N, the script will cover tables mysqltest.t1,...tN
# 1) set $mms_tname to a string, and $mms_tables to a number N, the script
# will cover tables mysqltest.$mms_tname1,...$mms_tnameN
# 2) set $mvr_debug_option to the crash way
# 3) set $mvr_crash_statement to the statement which will trigger a crash
# 4) set $mvr_restore_old_snapshot to 1 if you want recovery to run on
......@@ -77,10 +78,10 @@ let $mms_purpose=comparison;
let $mms_compare_physically=$mms_compare_physically_save;
while ($mms_table_to_use)
{
eval check table t$mms_table_to_use extended;
eval check table $mms_tname$mms_table_to_use extended;
--echo * testing that checksum after recovery is as expected
let $new_checksum=`CHECKSUM TABLE t$mms_table_to_use`;
let $old_checksum=`CHECKSUM TABLE mysqltest_for_$mms_purpose.t$mms_table_to_use`;
let $new_checksum=`CHECKSUM TABLE $mms_tname$mms_table_to_use`;
let $old_checksum=`CHECKSUM TABLE mysqltest_for_$mms_purpose.$mms_tname$mms_table_to_use`;
# the $ text variables above are of the form "db.tablename\tchecksum",
# as db differs, we use substring().
--disable_query_log
......
......@@ -405,7 +405,12 @@ sub mtr_report_stats ($) {
# maria-recovery.test has warning about missing log file
/File '.*maria_log.000.*' not found \(Errcode: 2\)/ or
# and about marked-corrupted table
/Table '.\/mysqltest\/t1' is crashed, skipping it. Please repair it with maria_chk -r/
/Table '.\/mysqltest\/t_corrupted1' is crashed, skipping it. Please repair it with maria_chk -r/ or
# maria-recover.test corrupts tables on purpose
/Checking table: '.\/mysqltest\/t_corrupted2'/ or
/Recovering table: '.\/mysqltest\/t_corrupted2'/ or
/Table '.\/mysqltest\/t_corrupted2' is marked as crashed and should be repaired/ or
/Incorrect key file for table '.\/mysqltest\/t_corrupted2.MAI'; try to repair it/
)
{
next; # Skip these lines
......
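
The reason for the t_corrupted prefix shows up when the filter from the mtr_report.pl hunk above is modelled in isolation. The sketch below is in Python rather than Perl, and the patterns are adapted from the hunk (with the dots escaped); the function name unexpected_warnings is hypothetical. It shows that a crash message for a table with the common name t1 is no longer swallowed.

```python
import re

# Patterns adapted from the mtr_report.pl hunk (Perl regexes rewritten for Python).
EXPECTED_WARNINGS = [
    re.compile(r"Table '\./mysqltest/t_corrupted1' is crashed, skipping it\. "
               r"Please repair it with maria_chk -r"),
    re.compile(r"Checking table: '\./mysqltest/t_corrupted2'"),
    re.compile(r"Recovering table: '\./mysqltest/t_corrupted2'"),
    re.compile(r"Table '\./mysqltest/t_corrupted2' is marked as crashed "
               r"and should be repaired"),
    re.compile(r"Incorrect key file for table '\./mysqltest/t_corrupted2\.MAI'; "
               r"try to repair it"),
]


def unexpected_warnings(lines):
    """Return the error-log lines that no expected-warning pattern matches."""
    return [
        line for line in lines
        if not any(pattern.search(line) for pattern in EXPECTED_WARNINGS)
    ]


log = [
    "Table './mysqltest/t_corrupted1' is crashed, skipping it. "
    "Please repair it with maria_chk -r",
    "Table './mysqltest/t1' is crashed, skipping it. "
    "Please repair it with maria_chk -r",
]
remaining = unexpected_warnings(log)
```

Here remaining holds only the t1 line, so a genuine corruption of some other test's t1 would still be reported.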
drop table if exists t1, t2;
create temporary table initial
select variable_name,variable_value from
information_schema.global_status where variable_name like "Maria_pagecache_read%";
create table t1 (
a int not null auto_increment,
b char(16) not null,
......@@ -46,24 +49,24 @@ count(*)
20672
flush tables;
flush status;
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211388
Maria_pagecache_reads 115
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 211644
MARIA_PAGECACHE_READS 3
select count(*) from t1 where b = 'test1';
count(*)
4181
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211414
Maria_pagecache_reads 122
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 211926
MARIA_PAGECACHE_READS 11
select count(*) from t1 where b = 'test1';
count(*)
4181
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211440
Maria_pagecache_reads 122
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 212208
MARIA_PAGECACHE_READS 12
flush tables;
flush status;
select @@preload_buffer_size;
......@@ -72,23 +75,23 @@ select @@preload_buffer_size;
load index into cache t1;
Table Op Msg_type Msg_text
test.t1 preload_keys status OK
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211511
Maria_pagecache_reads 193
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 212535
MARIA_PAGECACHE_READS 84
select count(*) from t1 where b = 'test1';
count(*)
4181
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211537
Maria_pagecache_reads 193
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 212817
MARIA_PAGECACHE_READS 85
flush tables;
flush status;
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211537
Maria_pagecache_reads 193
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 213073
MARIA_PAGECACHE_READS 86
set session preload_buffer_size=256*1024;
select @@preload_buffer_size;
@@preload_buffer_size
......@@ -96,23 +99,23 @@ select @@preload_buffer_size;
load index into cache t1 ignore leaves;
Table Op Msg_type Msg_text
test.t1 preload_keys status OK
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211608
Maria_pagecache_reads 264
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 213400
MARIA_PAGECACHE_READS 158
select count(*) from t1 where b = 'test1';
count(*)
4181
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211634
Maria_pagecache_reads 270
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 213682
MARIA_PAGECACHE_READS 165
flush tables;
flush status;
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211634
Maria_pagecache_reads 270
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 213938
MARIA_PAGECACHE_READS 166
set session preload_buffer_size=1*1024;
select @@preload_buffer_size;
@@preload_buffer_size
......@@ -121,52 +124,53 @@ load index into cache t1, t2 key (primary,b) ignore leaves;
Table Op Msg_type Msg_text
test.t1 preload_keys status OK
test.t2 preload_keys status OK
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211748
Maria_pagecache_reads 384
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 214308
MARIA_PAGECACHE_READS 281
select count(*) from t1 where b = 'test1';
count(*)
4181
select count(*) from t2 where b = 'test1';
count(*)
2584
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211788
Maria_pagecache_reads 387
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 214604
MARIA_PAGECACHE_READS 285
flush tables;
flush status;
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211788
Maria_pagecache_reads 387
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 214860
MARIA_PAGECACHE_READS 286
load index into cache t3, t2 key (primary,b) ;
Table Op Msg_type Msg_text
test.t3 preload_keys Error Table 'test.t3' doesn't exist
test.t3 preload_keys error Corrupt
test.t2 preload_keys status OK
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211831
Maria_pagecache_reads 430
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 215159
MARIA_PAGECACHE_READS 330
flush tables;
flush status;
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211831
Maria_pagecache_reads 430
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 215415
MARIA_PAGECACHE_READS 331
load index into cache t3 key (b), t2 key (c) ;
Table Op Msg_type Msg_text
test.t3 preload_keys Error Table 'test.t3' doesn't exist
test.t3 preload_keys error Corrupt
test.t2 preload_keys Error Key 'c' doesn't exist in table 't2'
test.t2 preload_keys status Operation failed
show status like "maria_pagecache_read%";
Variable_name Value
Maria_pagecache_read_requests 211831
Maria_pagecache_reads 430
select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
variable_name g.variable_value-i.variable_value
MARIA_PAGECACHE_READ_REQUESTS 215671
MARIA_PAGECACHE_READS 332
drop table t1, t2;
drop temporary table initial;
show status like "key_read%";
Variable_name Value
Key_read_requests 0
......
......@@ -37,13 +37,13 @@ set global maria_log_file_size=16777216;
set global maria_checkpoint_interval=30;
SHOW ENGINE maria logs;
Type Name Status
maria master-data/maria_log.00000002 in use
MARIA master-data/maria_log.00000002 in use
insert into t2 select * from t1;
insert into t1 select * from t2;
set global maria_checkpoint_interval=30;
SHOW ENGINE maria logs;
Type Name Status
maria master-data/maria_log.00000004 in use
MARIA master-data/maria_log.00000004 in use
set global maria_log_file_size=16777216;
select @@global.maria_log_file_size;
@@global.maria_log_file_size
......@@ -51,7 +51,7 @@ select @@global.maria_log_file_size;
set global maria_checkpoint_interval=30;
SHOW ENGINE maria logs;
Type Name Status
maria master-data/maria_log.00000004 in use
MARIA master-data/maria_log.00000004 in use
set global maria_log_file_size=8388608;
select @@global.maria_log_file_size;
@@global.maria_log_file_size
......@@ -61,32 +61,32 @@ insert into t1 select * from t2;
set global maria_checkpoint_interval=30;
SHOW ENGINE maria logs;
Type Name Status
maria master-data/maria_log.00000004 free
maria master-data/maria_log.00000005 free
maria master-data/maria_log.00000006 free
maria master-data/maria_log.00000007 free
maria master-data/maria_log.00000008 in use
MARIA master-data/maria_log.00000004 free
MARIA master-data/maria_log.00000005 free
MARIA master-data/maria_log.00000006 free
MARIA master-data/maria_log.00000007 free
MARIA master-data/maria_log.00000008 in use
flush logs;
SHOW ENGINE maria logs;
Type Name Status
maria master-data/maria_log.00000008 in use
MARIA master-data/maria_log.00000008 in use
set global maria_log_file_size=16777216;
set global maria_log_purge_type=external;
insert into t1 select * from t2;
set global maria_checkpoint_interval=30;
SHOW ENGINE maria logs;
Type Name Status
maria master-data/maria_log.00000008 free
maria master-data/maria_log.00000009 in use
MARIA master-data/maria_log.00000008 free
MARIA master-data/maria_log.00000009 in use
flush logs;
SHOW ENGINE maria logs;
Type Name Status
maria master-data/maria_log.00000008 free
maria master-data/maria_log.00000009 in use
MARIA master-data/maria_log.00000008 free
MARIA master-data/maria_log.00000009 in use
set global maria_log_purge_type=immediate;
insert into t1 select * from t2;
set global maria_checkpoint_interval=30;
SHOW ENGINE maria logs;
Type Name Status
maria master-data/maria_log.00000011 in use
MARIA master-data/maria_log.00000011 in use
drop table t1, t2;
select @@global.maria_recover;
@@global.maria_recover
BACKUP
set global maria_recover=off;
select @@global.maria_recover;
@@global.maria_recover
OFF
set global maria_recover=default;
select @@global.maria_recover;
@@global.maria_recover
OFF
set global maria_recover=normal;
select @@global.maria_recover;
@@global.maria_recover
NORMAL
drop database if exists mysqltest;
create database mysqltest;
use mysqltest;
create table t1 (a varchar(1000), index(a)) engine=maria;
insert into t1 values("ThursdayMorningsMarket");
flush table t1;
insert into t1 select concat(a,'b') from t1 limit 1;
select * from t_corrupted2;
a
ThursdayMorningsMarket
Warnings:
Error 145 Table './mysqltest/t_corrupted2' is marked as crashed and should be repaired
Error 1194 Table 't_corrupted2' is marked as crashed and should be repaired
Error 1034 1 client is using or hasn't closed the table properly
Error 126 Incorrect key file for table './mysqltest/t_corrupted2.MAI'; try to repair it
Error 1034 Wrong base information on indexpage at page: 1
select * from t_corrupted2;
a
ThursdayMorningsMarket
drop database mysqltest;
......@@ -362,24 +362,24 @@ Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_par
t1 1 a 1 a A 1 NULL NULL YES BTREE
drop table t1;
* TEST of recovery when OPTIMIZE has replaced the index file and crash
create table t1 (a varchar(100), key(a)) engine=maria;
insert into t1 select (rand()) from t2;
flush table t1;
* copied t1 for comparison
create table t_corrupted1 (a varchar(100), key(a)) engine=maria;
insert into t_corrupted1 select (rand()) from t2;
flush table t_corrupted1;
* copied t_corrupted1 for comparison
SET SESSION debug="+d,maria_flush_whole_log,maria_flush_whole_page_cache,maria_crash_sort_index";
* crashing mysqld intentionally
optimize table t1;
optimize table t_corrupted1;
ERROR HY000: Lost connection to MySQL server during query
* recovery happens
check table t1 extended;
check table t_corrupted1 extended;
Table Op Msg_type Msg_text
mysqltest.t1 check warning Table is marked as crashed and last repair failed
mysqltest.t1 check status OK
mysqltest.t_corrupted1 check warning Table is marked as crashed and last repair failed
mysqltest.t_corrupted1 check status OK
* testing that checksum after recovery is as expected
Checksum-check
ok
use mysqltest;
drop table t1, t2;
drop table t_corrupted1, t2;
drop database mysqltest_for_feeding_recovery;
drop database mysqltest_for_comparison;
drop database mysqltest;
......@@ -2121,6 +2121,7 @@ show variables like 'maria%';
Variable_name Value
maria_block_size 8192
maria_checkpoint_interval 30
maria_force_start_after_recovery_failures 0
maria_log_file_size 4294959104
maria_log_purge_type immediate
maria_max_sort_file_size 9223372036854775807
......@@ -2128,6 +2129,7 @@ maria_page_checksum OFF
maria_pagecache_age_threshold 300
maria_pagecache_buffer_size 8388600
maria_pagecache_division_limit 100
maria_recover OFF
maria_repair_threads 1
maria_sort_buffer_size 8388608
maria_stats_method nulls_unequal
......
......@@ -19,4 +19,4 @@ ctype_create : Bug#32965 main.ctype_create fails
status : Bug#32966 main.status fails
ps_ddl : Bug#12093 2007-12-14 pending WL#4165 / WL#4166
csv_alter_table : Bug#33696 2008-01-21 pcrews no .result file - bug allows NULL columns in CSV tables
maria-preload : Bug#34911 unrepeatable output of SHOW STATUS
maria-preload : Bug#35107 crashes
......@@ -8,6 +8,11 @@
drop table if exists t1, t2;
--enable_warnings
# Work around BUG#34911 "FLUSH STATUS doesn't flush what it should":
# compute differences in status variables before and after relevant queries
create temporary table initial
select variable_name,variable_value from
information_schema.global_status where variable_name like "Maria_pagecache_read%";
# we don't use block-format because we want page cache stats
# about indices and not data pages.
......@@ -59,50 +64,50 @@ select count(*) from t1;
select count(*) from t2;
flush tables; flush status;
show status like "maria_pagecache_read%";
let $show_stat=select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc;
eval $show_stat;
select count(*) from t1 where b = 'test1';
show status like "maria_pagecache_read%";
eval $show_stat;
select count(*) from t1 where b = 'test1';
show status like "maria_pagecache_read%";
eval $show_stat;
flush tables; flush status;
select @@preload_buffer_size;
load index into cache t1;
show status like "maria_pagecache_read%";
eval $show_stat;
select count(*) from t1 where b = 'test1';
show status like "maria_pagecache_read%";
eval $show_stat;
flush tables; flush status;
show status like "maria_pagecache_read%";
eval $show_stat;
set session preload_buffer_size=256*1024;
select @@preload_buffer_size;
load index into cache t1 ignore leaves;
show status like "maria_pagecache_read%";
eval $show_stat;
select count(*) from t1 where b = 'test1';
show status like "maria_pagecache_read%";
eval $show_stat;
flush tables; flush status;
show status like "maria_pagecache_read%";
eval $show_stat;
set session preload_buffer_size=1*1024;
select @@preload_buffer_size;
load index into cache t1, t2 key (primary,b) ignore leaves;
show status like "maria_pagecache_read%";
eval $show_stat;
select count(*) from t1 where b = 'test1';
select count(*) from t2 where b = 'test1';
show status like "maria_pagecache_read%";
eval $show_stat;
flush tables; flush status;
show status like "maria_pagecache_read%";
eval $show_stat;
load index into cache t3, t2 key (primary,b) ;
show status like "maria_pagecache_read%";
eval $show_stat;
flush tables; flush status;
show status like "maria_pagecache_read%";
eval $show_stat;
load index into cache t3 key (b), t2 key (c) ;
show status like "maria_pagecache_read%";
eval $show_stat;
drop table t1, t2;
drop temporary table initial;
# check that Maria didn't use key cache
show status like "key_read%";
--maria-recover=backup --maria-log-dir-path=../tmp
# Test of the --maria-recover option.
--source include/have_maria.inc
select @@global.maria_recover;
set global maria_recover=off;
select @@global.maria_recover;
set global maria_recover=default;
select @@global.maria_recover;
set global maria_recover=normal;
select @@global.maria_recover;
--disable_warnings
drop database if exists mysqltest;
--enable_warnings
create database mysqltest;
use mysqltest;
create table t1 (a varchar(1000), index(a)) engine=maria;
insert into t1 values("ThursdayMorningsMarket");
flush table t1; # put index page on disk
insert into t1 select concat(a,'b') from t1 limit 1;
# now t1 has its open_count>0 and so will t_corrupted2.
# It is not named t2 because the corruption messages which will be put
# in the error log need to be detected in mtr_process.pl, and we want
# a specific name to do specific detection (don't want to ignore
# any corruption messages of other tests using "t2" as table).
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t1.frm $MYSQLTEST_VARDIR/master-data/mysqltest/t_corrupted2.frm;
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t1.MAD $MYSQLTEST_VARDIR/master-data/mysqltest/t_corrupted2.MAD;
copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t1.MAI $MYSQLTEST_VARDIR/master-data/mysqltest/t_corrupted2.MAI;
# Ruin the index file.
# If maria-block-size is smaller than the default, the corruption
# messages will differ.
perl;
use strict;
use warnings;
my $fname= "$ENV{'MYSQLTEST_VARDIR'}/master-data/mysqltest/t_corrupted2.MAI";
open(FILE, "+<", $fname) or die;
my $whatever= ("\xAB" x 100);
sysseek (FILE, 8192, 0) or die;
syswrite (FILE, $whatever) or die;
close FILE;
EOF
select * from t_corrupted2; # should show corruption and repair messages
select * from t_corrupted2; # should show just rows
drop database mysqltest;
......@@ -15,6 +15,7 @@ set global maria_log_file_size=4294967295;
drop database if exists mysqltest;
--enable_warnings
create database mysqltest;
let $mms_tname=t;
# Include scripts can perform SQL. For it to not influence the main test
# they use a separate connection. This way if they use a DDL it would
......
......@@ -11,6 +11,7 @@
drop database if exists mysqltest;
--enable_warnings
create database mysqltest;
let $mms_tname=t;
# Include scripts can perform SQL. For it to not influence the main test
# they use a separate connection. This way if they use a DDL it would
......
......@@ -14,6 +14,7 @@ let $MARIA_LOG=.;
drop database if exists mysqltest;
--enable_warnings
create database mysqltest;
let $mms_tname=t;
# Include scripts can perform SQL. For it to not influence the main test
# they use a separate connection. This way if they use a DDL it would
......
......@@ -12,6 +12,7 @@ let $MARIA_LOG=../tmp;
drop database if exists mysqltest;
--enable_warnings
create database mysqltest;
let $mms_tname=t;
# Include scripts can perform SQL. For it to not influence the main test
# they use a separate connection. This way if they use a DDL it would
......@@ -297,19 +298,25 @@ show keys from t1; # should be enabled
drop table t1;
--echo * TEST of recovery when OPTIMIZE has replaced the index file and crash
create table t1 (a varchar(100), key(a)) engine=maria;
create table t_corrupted1 (a varchar(100), key(a)) engine=maria;
# we use a special name because this test portion will generate
# corruption warnings, which we tell mtr_report.pl to ignore by
# putting the message in mtr_report.pl, but we don't want it to ignore
# corruption messages of other tests, hence the special name
# 't_corrupted' and not just 't'.
let $mms_tname=t_corrupted;
let $mvr_restore_old_snapshot=0;
let $mms_compare_physically=0;
let $mvr_crash_statement= optimize table t1;
let $mvr_crash_statement= optimize table t_corrupted1;
let $mvr_debug_option="+d,maria_flush_whole_log,maria_flush_whole_page_cache,maria_crash_sort_index";
insert into t1 select (rand()) from t2;
insert into t_corrupted1 select (rand()) from t2;
-- source include/maria_make_snapshot_for_comparison.inc
# Recovery will not fix the table, but we expect to see it marked
# "crashed on repair".
# Because crash is mild, the table is actually not corrupted, so the
# "check table extended" done below fixes the table.
-- source include/maria_verify_recovery.inc
drop table t1, t2;
drop table t_corrupted1, t2;
# clean up everything
let $mms_purpose=feeding_recovery;
......
......@@ -49,13 +49,11 @@ ulong pagecache_division_limit, pagecache_age_threshold;
ulonglong pagecache_buffer_size;
/**
@todo For now there is no way for a user to set a different value of
maria_recover_options, i.e. auto-check-and-repair is always disabled.
We could enable it. As the auto-repair is initiated when opened from the
SQL layer (open_unireg_entry(), check_and_repair()), it does not happen
when Maria's Recovery internally opens the table to apply log records to
it, which is good. It would happen only after Recovery, if the table is
still corrupted.
As the auto-repair is initiated when opened from the SQL layer
(open_unireg_entry(), check_and_repair()), it does not happen when Maria's
Recovery internally opens the table to apply log records to it, which is
good. It would happen only after Recovery, if the table is still
corrupted.
*/
ulong maria_recover_options= HA_RECOVER_NONE;
handlerton *maria_hton;
......@@ -63,7 +61,14 @@ handlerton *maria_hton;
/* bits in maria_recover_options */
const char *maria_recover_names[]=
{
"DEFAULT", "BACKUP", "FORCE", "QUICK", NullS
/*
Compared to MyISAM, "default" was renamed to "normal" as it collided with
SET var=default which sets to the var's default i.e. what happens when the
var is not set i.e. HA_RECOVER_NONE.
Another change is that OFF is used to disable, not ""; this is to have OFF
display in SHOW VARIABLES which is better than "".
*/
"OFF", "NORMAL", "BACKUP", "FORCE", "QUICK", NullS
};
TYPELIB maria_recover_typelib=
{
......@@ -103,11 +108,13 @@ TYPELIB maria_sync_log_dir_typelib=
maria_sync_log_dir_names, NULL
};
/** @brief Interval between background checkpoints in seconds */
/** Interval between background checkpoints in seconds */
static ulong checkpoint_interval;
static void update_checkpoint_interval(MYSQL_THD thd,
struct st_mysql_sys_var *var,
void *var_ptr, const void *save);
/** After that many consecutive recovery failures, remove logs */
static ulong force_start_after_recovery_failures;
static void update_log_file_size(MYSQL_THD thd,
struct st_mysql_sys_var *var,
void *var_ptr, const void *save);
......@@ -124,6 +131,17 @@ static MYSQL_SYSVAR_ULONG(checkpoint_interval, checkpoint_interval,
" 'no automatic checkpoints' which makes sense only for testing.",
NULL, update_checkpoint_interval, 30, 0, UINT_MAX, 1);
static MYSQL_SYSVAR_ULONG(force_start_after_recovery_failures,
force_start_after_recovery_failures,
/*
Read-only because setting it on the fly has no useful effect,
should be set on command-line.
*/
PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
"Number of consecutive log recovery failures after which logs will be"
" automatically deleted to cure the problem; 0 (the default) disables"
" the feature.", NULL, NULL, 0, 0, UINT_MAX8, 1);
static MYSQL_SYSVAR_BOOL(page_checksum, maria_page_checksums, 0,
"Maintain page checksums (can be overridden per table "
"with PAGE_CHECKSUM clause in CREATE TABLE)", 0, 0, 1);
......@@ -175,6 +193,12 @@ static MYSQL_SYSVAR_ULONG(pagecache_division_limit, pagecache_division_limit,
"The minimum percentage of warm blocks in key cache", 0, 0,
100, 1, 100, 1);
static MYSQL_SYSVAR_ENUM(recover, maria_recover_options, PLUGIN_VAR_OPCMDARG,
"Specifies how corrupted tables should be automatically repaired."
" Possible values are \"NORMAL\" (the default), \"BACKUP\", \"FORCE\","
" \"QUICK\", or \"OFF\" which is like not using the option.",
NULL, NULL, HA_RECOVER_NONE, &maria_recover_typelib);
static MYSQL_THDVAR_ULONG(repair_threads, PLUGIN_VAR_RQCMDARG,
"Number of threads to use when repairing maria tables. The value of 1 "
"disables parallel repair.",
......@@ -186,7 +210,7 @@ static MYSQL_THDVAR_ULONG(sort_buffer_size, PLUGIN_VAR_RQCMDARG,
0, 0, 8192*1024, 4, ~0L, 1);
static MYSQL_THDVAR_ENUM(stats_method, PLUGIN_VAR_RQCMDARG,
"Specifies how maria index statistics collection code should threat "
"Specifies how maria index statistics collection code should treat "
"NULLs. Possible values are \"nulls_unequal\", \"nulls_equal\", "
"and \"nulls_ignored\".", 0, 0, 0, &maria_stats_method_typelib);
......@@ -870,6 +894,12 @@ int ha_maria::open(const char *name, int mode, uint test_if_locked)
test_if_locked|= HA_OPEN_MMAP;
#endif
if (unlikely(maria_recover_options != HA_RECOVER_NONE))
{
/* user asked to trigger a repair if table was not properly closed */
test_if_locked|= HA_OPEN_ABORT_IF_CRASHED;
}
if (!(file= maria_open(name, mode, test_if_locked | HA_OPEN_FROM_SQL_LAYER)))
return (my_errno ? my_errno : -1);
......@@ -2728,7 +2758,7 @@ bool maria_show_status(handlerton *hton,
stat_print_fn *print,
enum ha_stat_type stat)
{
char engine_name[]= "maria";
const LEX_STRING *engine_name= hton_name(hton);
switch (stat) {
case HA_ENGINE_LOGS:
{
......@@ -2745,8 +2775,8 @@ bool maria_show_status(handlerton *hton,
if (first_file == 0)
{
const char error[]= "error";
print(thd, engine_name, sizeof(engine_name),
STRING_WITH_LEN(""), error, sizeof(error));
print(thd, engine_name->str, engine_name->length,
STRING_WITH_LEN(""), error, sizeof(error) - 1);
break;
}
......@@ -2762,7 +2792,7 @@ bool maria_show_status(handlerton *hton,
if (!(stat= my_stat(file, &stat_buff, MYF(MY_WME))))
{
status= error;
status_len= sizeof(error);
status_len= sizeof(error) - 1;
length= my_snprintf(object, SHOW_MSG_LEN, "Size unknown ; %s", file);
}
else
......@@ -2770,23 +2800,23 @@ bool maria_show_status(handlerton *hton,
if (first_needed == 0)
{
status= unknown;
status_len= sizeof(unknown);
status_len= sizeof(unknown) - 1;
}
else if (i < first_needed)
{
status= unneeded;
status_len= sizeof(unneeded);
status_len= sizeof(unneeded) - 1;
}
else
{
status= needed;
status_len= sizeof(needed);
status_len= sizeof(needed) - 1;
}
length= my_snprintf(object, SHOW_MSG_LEN, "Size %12lu ; %s",
(ulong) stat->st_size, file);
}
print(thd, engine_name, sizeof(engine_name),
print(thd, engine_name->str, engine_name->length,
object, length, status, status_len);
}
break;
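The `sizeof(...) - 1` changes in the hunks above are what removes the \0 from the SHOW ENGINE MARIA LOGS output: `sizeof` on a char array counts the terminating NUL, while a (pointer, length) printer should not be given it. A minimal sketch in standard C, assuming a hypothetical helper `sketch_printable_length` and a hypothetical array `sketch_error` that are not part of Maria:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the local status strings in the diff. */
const char sketch_error[] = "error";

/*
  Length to hand to a (pointer, length) style printer: the array size
  minus the terminating NUL that sizeof includes.
*/
size_t sketch_printable_length(const char *text, size_t array_size)
{
  /* The caller must pass sizeof() of a NUL-terminated array. */
  assert(array_size > 0 && text[array_size - 1] == '\0');
  return array_size - 1;
}
```

Passing `sizeof(sketch_error)` directly would include one byte too many, which is the behaviour the commit removes.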
......@@ -2799,9 +2829,90 @@ bool maria_show_status(handlerton *hton,
return 0;
}
/**
Callback to delete all logs in directory. This is lower-level than other
functions in ma_loghandler.c which delete logs, as it does not rely on
translog_init() having been called first.
@param directory directory where file is
@param filename base name of the file to delete
*/
static my_bool translog_callback_delete_all(const char *directory,
const char *filename)
{
char complete_name[FN_REFLEN];
fn_format(complete_name, filename, directory, "", MYF(MY_UNPACK_FILENAME));
return my_delete(complete_name, MYF(MY_WME));
}
/**
Helper function for option maria-force-start-after-recovery-failures.
Deletes logs if too many failures. Otherwise, increments the counter of
failures in the control file.
Notice how this has to be called _before_ translog_init() (if log is
corrupted, translog_init() might crash the server, so we need to remove logs
before).
@param log_dir directory where logs to be deleted are
*/
static int mark_recovery_start(const char* log_dir)
{
int res;
DBUG_ENTER("mark_recovery_start");
if (unlikely(maria_recover_options == HA_RECOVER_NONE))
ma_message_no_user(ME_JUST_WARNING, "Please consider using option"
" --maria-recover[=...] to automatically check and"
" repair tables when logs are removed by option"
" --maria-force-start-after-recovery-failures=#");
if (recovery_failures >= force_start_after_recovery_failures)
{
/*
Remove logs which cause the problem; keep control file which has
critical info like uuid, max_trid (removing control file may make
correct tables look corrupted!).
*/
char msg[100];
res= translog_walk_filenames(log_dir, &translog_callback_delete_all);
my_snprintf(msg, sizeof(msg),
"%s logs after %u consecutive failures of"
" recovery from logs",
(res ? "failed to remove some" : "removed all"),
recovery_failures);
ma_message_no_user((res ? 0 : ME_JUST_WARNING), msg);
}
else
res= ma_control_file_write_and_force(last_checkpoint_lsn, last_logno,
max_trid_in_control_file,
recovery_failures + 1);
DBUG_RETURN(res);
}
/**
Helper function for option maria-force-start-after-recovery-failures.
Records in the control file that recovery was a success, so that it's not
counted for maria-force-start-after-recovery-failures.
*/
static int mark_recovery_success(void)
{
/* success of recovery, reset recovery_failures: */
int res;
DBUG_ENTER("mark_recovery_success");
res= ma_control_file_write_and_force(last_checkpoint_lsn, last_logno,
max_trid_in_control_file, 0);
DBUG_RETURN(res);
}
static int ha_maria_init(void *p)
{
int res;
const char *log_dir= maria_data_root;
maria_hton= (handlerton *)p;
maria_hton->state= SHOW_OPTION_YES;
maria_hton->db_type= DB_TYPE_UNKNOWN;
......@@ -2816,6 +2927,8 @@ static int ha_maria_init(void *p)
bzero(maria_log_pagecache, sizeof(*maria_log_pagecache));
maria_tmpdir= &mysql_tmpdir_list; /* For REDO */
res= maria_init() || ma_control_file_open(TRUE, TRUE) ||
((force_start_after_recovery_failures != 0) &&
mark_recovery_start(log_dir)) ||
!init_pagecache(maria_pagecache,
(size_t) pagecache_buffer_size, pagecache_division_limit,
pagecache_age_threshold, maria_block_size, 0) ||
......@@ -2825,7 +2938,8 @@ static int ha_maria_init(void *p)
translog_init(maria_data_root, log_file_size,
MYSQL_VERSION_ID, server_id, maria_log_pagecache,
TRANSLOG_DEFAULT_FLAGS, 0) ||
maria_recover() ||
maria_recovery_from_log() ||
((force_start_after_recovery_failures != 0) && mark_recovery_success()) ||
ma_checkpoint_init(checkpoint_interval);
maria_multi_threaded= TRUE;
return res ? HA_ERR_INITIALIZATION : 0;
......@@ -2913,6 +3027,7 @@ my_bool ha_maria::register_query_cache_table(THD *thd, char *table_name,
static struct st_mysql_sys_var* system_variables[]= {
MYSQL_SYSVAR(block_size),
MYSQL_SYSVAR(checkpoint_interval),
MYSQL_SYSVAR(force_start_after_recovery_failures),
MYSQL_SYSVAR(page_checksum),
MYSQL_SYSVAR(log_dir_path),
MYSQL_SYSVAR(log_file_size),
......@@ -2921,6 +3036,7 @@ static struct st_mysql_sys_var* system_variables[]= {
MYSQL_SYSVAR(pagecache_age_threshold),
MYSQL_SYSVAR(pagecache_buffer_size),
MYSQL_SYSVAR(pagecache_division_limit),
MYSQL_SYSVAR(recover),
MYSQL_SYSVAR(repair_threads),
MYSQL_SYSVAR(sort_buffer_size),
MYSQL_SYSVAR(stats_method),
......
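A minimal sketch of the counter logic that mark_recovery_start() and mark_recovery_success() add in the hunks above, assuming an in-memory struct in place of the control file and a flag in place of the log files. The names `sketch_control_state`, `sketch_mark_recovery_start` and `sketch_mark_recovery_success` are hypothetical; only the `>=` comparison, the increment and the reset to 0 are taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the control file plus the log directory. */
struct sketch_control_state
{
  uint8_t recovery_failures;  /* persisted counter in the real code */
  int logs_present;           /* 1 while log files exist */
};

enum sketch_start_action { SKETCH_COUNT_ATTEMPT, SKETCH_REMOVE_LOGS };

/*
  Called before log initialization. If the stored counter has reached the
  threshold, drop the logs; otherwise record one more attempt, which stays
  counted as a failure unless recovery later succeeds.
*/
enum sketch_start_action
sketch_mark_recovery_start(struct sketch_control_state *cs,
                           unsigned long threshold)
{
  assert(cs != 0 && threshold != 0);  /* 0 disables the feature */
  if (cs->recovery_failures >= threshold)
  {
    cs->logs_present= 0;
    return SKETCH_REMOVE_LOGS;
  }
  cs->recovery_failures++;
  return SKETCH_COUNT_ATTEMPT;
}

/* Called after recovery succeeded: reset the counter. */
void sketch_mark_recovery_success(struct sketch_control_state *cs)
{
  assert(cs != 0);
  cs->recovery_failures= 0;
}
```

With a threshold of 2 in this sketch, two failed starts raise the counter to 2 and the third start removes the logs.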
......@@ -245,7 +245,8 @@ static int really_execute_checkpoint(void)
that log was flushed before we write to the control file).
*/
if (unlikely(ma_control_file_write_and_force(lsn, last_logno,
max_trid_in_control_file)))
max_trid_in_control_file,
recovery_failures)))
{
translog_unlock();
goto err;
......
......@@ -39,6 +39,8 @@ Start of changeable part:
- Checksum of changeable part
- LSN of last checkpoint
- Number of last log file
- Max trid in control file (since Maria 1.5 May 2008)
- Number of consecutive recovery failures (since Maria 1.5 May 2008)
..... Here we can add new variables without changing format
The idea is that one can add new variables to the control file and still
......@@ -80,7 +82,9 @@ one should increment the control file version number.
#define CF_FILENO_SIZE 4
#define CF_MAX_TRID_OFFSET (CF_FILENO_OFFSET + CF_FILENO_SIZE)
#define CF_MAX_TRID_SIZE TRANSID_SIZE
#define CF_CHANGEABLE_TOTAL_SIZE (CF_MAX_TRID_OFFSET + CF_MAX_TRID_SIZE)
#define CF_RECOV_FAIL_OFFSET (CF_MAX_TRID_OFFSET + CF_MAX_TRID_SIZE)
#define CF_RECOV_FAIL_SIZE 1
#define CF_CHANGEABLE_TOTAL_SIZE (CF_RECOV_FAIL_OFFSET + CF_RECOV_FAIL_SIZE)
/*
The following values should not be changed, except when changing version
......@@ -108,6 +112,12 @@ uint32 last_logno= FILENO_IMPOSSIBLE;
*/
TrID max_trid_in_control_file= 0;
/**
Number of consecutive log or recovery failures. Reset to 0 after recovery's
success.
*/
uint8 recovery_failures= 0;
/**
@brief If log's lock should be asserted when writing to control file.
......@@ -188,7 +198,7 @@ static CONTROL_FILE_ERROR create_control_file(const char *name,
/* init the file with these "undefined" values */
DBUG_RETURN(ma_control_file_write_and_force(LSN_IMPOSSIBLE,
FILENO_IMPOSSIBLE, 0));
FILENO_IMPOSSIBLE, 0, 0));
}
......@@ -420,6 +430,9 @@ CONTROL_FILE_ERROR ma_control_file_open(my_bool create_if_missing,
if (new_cf_changeable_size >= (CF_MAX_TRID_OFFSET + CF_MAX_TRID_SIZE))
max_trid_in_control_file=
transid_korr(buffer + new_cf_create_time_size + CF_MAX_TRID_OFFSET);
if (new_cf_changeable_size >= (CF_RECOV_FAIL_OFFSET + CF_RECOV_FAIL_SIZE))
recovery_failures=
(buffer + new_cf_create_time_size + CF_RECOV_FAIL_OFFSET)[0];
ok:
DBUG_RETURN(0);
......@@ -436,19 +449,21 @@ err:
/*
Write information durably to the control file; stores this information into
the last_checkpoint_lsn, last_logno, max_trid_in_control_file global
variables.
the last_checkpoint_lsn, last_logno, max_trid_in_control_file,
recovery_failures global variables.
Called when we have created a new log (after syncing this log's creation),
when we have written a checkpoint (after syncing this log record), and at
shutdown (for storing trid in case logs are soon removed by user).
when we have written a checkpoint (after syncing this log record), at
shutdown (for storing trid in case logs are soon removed by user), and
before and after recovery (to store recovery_failures).
Variables last_checkpoint_lsn and last_logno must be protected by caller
using log's lock, unless this function is called at startup.
SYNOPSIS
ma_control_file_write_and_force()
checkpoint_lsn LSN of last checkpoint
logno last log file number
trid maximum transaction longid.
last_checkpoint_lsn_arg LSN of last checkpoint
last_logno_arg last log file number
max_trid_arg maximum transaction longid
recovery_failures_arg consecutive recovery failures
NOTE
We always want to do one single my_pwrite() here to be as atomic as
......@@ -459,17 +474,26 @@ err:
1 - Error
*/
int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno,
TrID trid)
int ma_control_file_write_and_force(LSN last_checkpoint_lsn_arg,
uint32 last_logno_arg,
TrID max_trid_arg,
uint8 recovery_failures_arg)
{
uchar buffer[CF_MAX_SIZE];
uint32 sum;
my_bool no_need_sync;
DBUG_ENTER("ma_control_file_write_and_force");
if ((last_checkpoint_lsn == checkpoint_lsn) &&
(last_logno == logno) &&
(max_trid_in_control_file == trid))
DBUG_RETURN(0); /* no need to write */
/*
We don't need to sync if this is just an increase of
recovery_failures: it's even good if that counter is not increased on disk
in case of power or hardware failure (less false positives when removing
logs).
*/
no_need_sync= ((last_checkpoint_lsn == last_checkpoint_lsn_arg) &&
(last_logno == last_logno_arg) &&
(max_trid_in_control_file == max_trid_arg) &&
(recovery_failures_arg > 0));
if (control_file_fd < 0)
DBUG_RETURN(1);
......@@ -479,9 +503,10 @@ int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno,
translog_lock_handler_assert_owner();
#endif
lsn_store(buffer + CF_LSN_OFFSET, checkpoint_lsn);
int4store(buffer + CF_FILENO_OFFSET, logno);
transid_store(buffer + CF_MAX_TRID_OFFSET, trid);
lsn_store(buffer + CF_LSN_OFFSET, last_checkpoint_lsn_arg);
int4store(buffer + CF_FILENO_OFFSET, last_logno_arg);
transid_store(buffer + CF_MAX_TRID_OFFSET, max_trid_arg);
(buffer + CF_RECOV_FAIL_OFFSET)[0]= recovery_failures_arg;
if (cf_changeable_size > CF_CHANGEABLE_TOTAL_SIZE)
{
......@@ -514,12 +539,13 @@ int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno,
if (my_pwrite(control_file_fd, buffer, cf_changeable_size,
cf_create_time_size, MYF(MY_FNABP | MY_WME)) ||
my_sync(control_file_fd, MYF(MY_WME)))
(!no_need_sync && my_sync(control_file_fd, MYF(MY_WME))))
DBUG_RETURN(1);
last_checkpoint_lsn= checkpoint_lsn;
last_logno= logno;
max_trid_in_control_file= trid;
last_checkpoint_lsn= last_checkpoint_lsn_arg;
last_logno= last_logno_arg;
max_trid_in_control_file= max_trid_arg;
recovery_failures= recovery_failures_arg;
cf_changeable_size= CF_CHANGEABLE_TOTAL_SIZE; /* no more warning */
DBUG_RETURN(0);
......@@ -558,7 +584,7 @@ int ma_control_file_end(void)
*/
last_checkpoint_lsn= LSN_IMPOSSIBLE;
last_logno= FILENO_IMPOSSIBLE;
max_trid_in_control_file= 0;
max_trid_in_control_file= recovery_failures= 0;
DBUG_RETURN(close_error);
}
......
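The control file hunks above read the new one-byte counter only when the stored changeable part is long enough to contain it, so a file written by an older version yields 0. A minimal sketch of that size-guarded read, assuming an illustrative offset (`SKETCH_RECOV_FAIL_OFFSET` is a made-up value, not the real CF_RECOV_FAIL_OFFSET) and a hypothetical function name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout only; the real offsets come from the CF_* macros. */
#define SKETCH_RECOV_FAIL_OFFSET 21
#define SKETCH_RECOV_FAIL_SIZE   1

/*
  Return the stored failure counter, or 0 when the changeable part on disk
  is too short to contain it (file written before the field existed).
*/
uint8_t sketch_read_recovery_failures(const unsigned char *changeable,
                                      size_t changeable_size)
{
  assert(changeable != NULL);
  if (changeable_size >= SKETCH_RECOV_FAIL_OFFSET + SKETCH_RECOV_FAIL_SIZE)
    return changeable[SKETCH_RECOV_FAIL_OFFSET];
  return 0;
}
```

Appending fields at the end and guarding each read by size is what lets the format grow without a version bump, as the file's header comment describes.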
......@@ -44,6 +44,8 @@ extern uint32 last_logno;
extern TrID max_trid_in_control_file;
extern uint8 recovery_failures;
extern my_bool maria_multi_threaded, maria_in_recovery;
typedef enum enum_control_file_error {
......@@ -63,7 +65,9 @@ typedef enum enum_control_file_error {
C_MODE_START
CONTROL_FILE_ERROR ma_control_file_open(my_bool create_if_missing,
my_bool print_error);
int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno, TrID trid);
int ma_control_file_write_and_force(LSN last_checkpoint_lsn_arg,
uint32 last_logno_arg, TrID max_trid_arg,
uint8 recovery_failures_arg);
int ma_control_file_end(void);
my_bool ma_control_file_inited(void);
C_MODE_END
......
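The fourth argument declared above also drives the new `no_need_sync` test in ma_control_file_write_and_force(): the sync is skipped when nothing but the failure counter changes and the new counter is non-zero. A minimal sketch of that predicate, assuming a hypothetical struct `sketch_cf_values` and simplified integer types in place of LSN and TrID:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bundle of the four values stored in the control file. */
struct sketch_cf_values
{
  uint64_t checkpoint_lsn;
  uint32_t logno;
  uint64_t max_trid;
  uint8_t recovery_failures;
};

/*
  Return 1 when the write may skip the sync: the other three values are
  unchanged and the counter being written is greater than 0. A reset to 0
  (recovery succeeded) is still synced.
*/
int sketch_can_skip_sync(const struct sketch_cf_values *current,
                         const struct sketch_cf_values *next)
{
  assert(current != 0 && next != 0);
  return current->checkpoint_lsn == next->checkpoint_lsn &&
         current->logno == next->logno &&
         current->max_trid == next->max_trid &&
         next->recovery_failures > 0;
}
```

The diff's own comment gives the reason: losing an increment on power failure only makes log removal less likely to trigger on a false positive.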
......@@ -86,7 +86,7 @@ void maria_end(void)
from the log, as it cannot process REDOs).
*/
(void)ma_control_file_write_and_force(last_checkpoint_lsn, last_logno,
trid);
trid, recovery_failures);
}
trnman_destroy();
if (translog_status == TRANSLOG_OK)
......
......@@ -381,7 +381,7 @@ int _ma_test_if_changed(register MARIA_HA *info)
tells us if the MARIA file wasn't properly closed. (This is true if
my_disable_locking is set).
open_count is not maintained on disk for transactional or temporary tables.
open_count is not maintained on disk for temporary tables.
*/
int _ma_mark_file_changed(MARIA_HA *info)
......@@ -400,11 +400,16 @@ int _ma_mark_file_changed(MARIA_HA *info)
share->state.open_count++;
}
/*
temp tables don't need an open_count as they are removed on crash;
transactional tables are fixed by log-based recovery, so don't need an
open_count either (and we thus avoid the disk write below).
Temp tables don't need an open_count as they are removed on crash.
In theory transactional tables are fixed by log-based recovery, so don't
need an open_count either, but if recovery has failed and logs have been
removed (by maria-force-start-after-recovery-failures), we still need to
detect dubious tables.
If we didn't maintain open_count on disk for a table, after a crash
we wouldn't know if it was closed at crash time (thus does not need a
check) or not. So we would have to check all tables: overkill.
*/
if (!(share->temporary | share->base.born_transactional))
if (!share->temporary)
{
mi_int2store(buff,share->state.open_count);
buff[2]=1; /* Mark that it's changed */
......@@ -471,7 +476,7 @@ int _ma_decrement_open_count(MARIA_HA *info)
{
share->state.open_count--;
share->changed= 1; /* We have to update state */
if (!(share->temporary | share->base.born_transactional))
if (!share->temporary)
{
mi_int2store(buff,share->state.open_count);
write_error= (int) my_pwrite(share->kfile.file, buff, sizeof(buff),
......
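The locking hunks above make open_count persistent for transactional tables too, so that after a crash a non-zero stored value marks a table that was not properly closed. A minimal sketch of that idea, assuming a two-byte field stored high byte first; the byte order and the names `sketch_store_open_count` and `sketch_was_open_at_crash` are assumptions for illustration, not Maria's helpers:

```c
#include <assert.h>
#include <stdint.h>

/* Store the counter in two bytes, high byte first (assumed order). */
void sketch_store_open_count(unsigned char out[2], uint16_t open_count)
{
  assert(out != 0);
  out[0]= (unsigned char) (open_count >> 8);
  out[1]= (unsigned char) (open_count & 0xFF);
}

/*
  After a crash, a stored counter greater than 0 means the table was open
  when the server died and is therefore a candidate for a check.
*/
int sketch_was_open_at_crash(const unsigned char stored[2])
{
  assert(stored != 0);
  return (((unsigned) stored[0] << 8) | stored[1]) != 0;
}
```

Tables whose stored counter is 0 can be skipped, which avoids checking every table after logs have been removed.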
......@@ -17,6 +17,7 @@
#include "trnman.h"
#include "ma_blockrec.h" /* for some constants and in-write hooks */
#include "ma_key_recover.h" /* For some in-write hooks */
#include "ma_checkpoint.h"
/*
On Windows, neither my_open() nor my_sync() work for directories.
......@@ -1522,7 +1523,8 @@ static my_bool translog_create_new_file()
DBUG_RETURN(1);
if (ma_control_file_write_and_force(last_checkpoint_lsn, file_no,
max_trid_in_control_file))
max_trid_in_control_file,
recovery_failures))
{
translog_stop_writing();
DBUG_RETURN(1);
......@@ -3211,21 +3213,29 @@ static my_bool translog_truncate_log(TRANSLOG_ADDRESS addr)
/**
@brief Check log files presence
Applies function 'callback' to all files (in a directory) which
name looks like a log's name (maria_log.[0-9]{8}).
If 'callback' returns TRUE this interrupts the walk and returns
TRUE. Otherwise FALSE is returned after processing all log files.
It cannot just use log_descriptor.directory because that may not yet have
been initialized.
@retval 0 no log files.
@retval 1 there is at least 1 log file in the directory
@param directory directory to scan
@param callback function to apply; is passed directory and base
name of found file
*/
my_bool translog_is_log_files()
my_bool translog_walk_filenames(const char *directory,
my_bool (*callback)(const char *,
const char *))
{
MY_DIR *dirp;
uint i;
my_bool rc= FALSE;
/* Finds and removes transaction log files */
if (!(dirp = my_dir(log_descriptor.directory, MYF(MY_DONT_SORT))))
return 1;
if (!(dirp = my_dir(directory, MYF(MY_DONT_SORT))))
return FALSE;
for (i= 0; i < dirp->number_off_files; i++)
{
......@@ -3239,14 +3249,14 @@ my_bool translog_is_log_files()
file[15] >= '0' && file[15] <= '9' &&
file[16] >= '0' && file[16] <= '9' &&
file[17] >= '0' && file[17] <= '9' &&
file[18] == '\0')
file[18] == '\0' && (*callback)(directory, file))
{
rc= TRUE;
break;
}
}
my_dirend(dirp);
return FALSE;
return rc;
}
......@@ -3269,6 +3279,19 @@ static void translog_fill_overhead_table()
}
/**
Callback to find first log in directory.
*/
static my_bool translog_callback_search_first(const char *directory
__attribute__((unused)),
const char *filename
__attribute__((unused)))
{
return TRUE;
}
/**
@brief Checks that chunk is LSN one
......@@ -3353,7 +3376,7 @@ my_bool translog_init_with_table(const char *directory,
my_init_dynamic_array(&log_descriptor.unfinished_files,
sizeof(struct st_file_counter),
10, 10))
DBUG_RETURN(1);
goto err;
log_descriptor.min_need_file= 0;
log_descriptor.min_file_number= 0;
log_descriptor.last_lsn_checked= LSN_IMPOSSIBLE;
......@@ -3367,7 +3390,7 @@ my_bool translog_init_with_table(const char *directory,
my_errno= errno;
DBUG_PRINT("error", ("Error %d during opening directory '%s'",
errno, log_descriptor.directory));
DBUG_RETURN(1);
goto err;
}
#endif
log_descriptor.in_buffers_only= LSN_IMPOSSIBLE;
......@@ -3417,7 +3440,7 @@ my_bool translog_init_with_table(const char *directory,
for (i= 0; i < TRANSLOG_BUFFERS_NO; i++)
{
if (translog_buffer_init(log_descriptor.buffers + i))
DBUG_RETURN(1);
goto err;
#ifndef DBUG_OFF
log_descriptor.buffers[i].buffer_no= (uint8) i;
#endif
......@@ -3461,7 +3484,8 @@ my_bool translog_init_with_table(const char *directory,
log_descriptor.horizon= last_page= MAKE_LSN(last_logno, 0);
if (translog_get_last_page_addr(&last_page, &pageok, no_errors))
{
if (!translog_is_log_files())
if (!translog_walk_filenames(log_descriptor.directory,
&translog_callback_search_first))
{
/*
Files were deleted, just start from the next log number, so that
......@@ -3472,7 +3496,7 @@ my_bool translog_init_with_table(const char *directory,
logs_found= 0;
}
else
DBUG_RETURN(1);
goto err;
}
else if (LSN_OFFSET(last_page) == 0)
{
......@@ -3485,7 +3509,7 @@ my_bool translog_init_with_table(const char *directory,
{
last_page-= LSN_ONE_FILE;
if (translog_get_last_page_addr(&last_page, &pageok, 0))
DBUG_RETURN(1);
goto err;
}
}
if (logs_found)
......@@ -3497,7 +3521,7 @@ my_bool translog_init_with_table(const char *directory,
if (allocate_dynamic(&log_descriptor.open_files,
log_descriptor.max_file -
log_descriptor.min_file + 1))
DBUG_RETURN(1);
goto err;
for (i = log_descriptor.max_file; i >= log_descriptor.min_file; i--)
{
/*
......@@ -3526,10 +3550,10 @@ my_bool translog_init_with_table(const char *directory,
if (file)
{
free(file);
DBUG_RETURN(1);
goto err;
}
else
DBUG_RETURN(1);
goto err;
}
translog_file_init(file, i, 1);
/* we allocated space so it can't fail */
......@@ -3543,7 +3567,7 @@ my_bool translog_init_with_table(const char *directory,
{
/* There are no logs and we are in read-only mode => nothing to read */
DBUG_PRINT("error", ("No logs and read-only mode"));
DBUG_RETURN(1);
goto err;
}
if (logs_found)
......@@ -3568,7 +3592,7 @@ my_bool translog_init_with_table(const char *directory,
TRANSLOG_ADDRESS current_file_last_page;
current_file_last_page= current_page;
if (translog_get_last_page_addr(&current_file_last_page, &pageok, 0))
DBUG_RETURN(1);
goto err;
if (!pageok)
{
DBUG_PRINT("error", ("File %lu have no complete last page",
......@@ -3585,7 +3609,7 @@ my_bool translog_init_with_table(const char *directory,
uchar *page;
data.addr= &current_page;
if ((page= translog_get_page(&data, psize_buff.buffer, NULL)) == NULL)
DBUG_RETURN(1);
goto err;
if (data.was_recovered)
{
DBUG_PRINT("error", ("file no: %lu (%d) "
......@@ -3614,7 +3638,7 @@ my_bool translog_init_with_table(const char *directory,
{
/* Panic!!! Even page which should be valid is invalid */
/* TODO: issue error */
DBUG_RETURN(1);
goto err;
}
DBUG_PRINT("info", ("Last valid page is in file: %lu "
"offset: %lu (0x%lx) "
......@@ -3639,7 +3663,7 @@ my_bool translog_init_with_table(const char *directory,
LSN_FILE_NO(log_descriptor.horizon));
if ((page= translog_get_page(&data, psize_buff.buffer, NULL)) == NULL ||
(chunk_offset= translog_get_first_chunk_offset(page)) == 0)
DBUG_RETURN(1);
goto err;
/* Puts filled part of old page in the buffer */
log_descriptor.horizon= last_valid_page;
......@@ -3654,7 +3678,7 @@ my_bool translog_init_with_table(const char *directory,
uint16 chunk_length;
if ((chunk_length=
translog_get_total_chunk_length(page, chunk_offset)) == 0)
DBUG_RETURN(1);
goto err;
DBUG_PRINT("info", ("chunk: offset: %u length: %u",
(uint) chunk_offset, (uint) chunk_length));
chunk_offset+= chunk_length;
......@@ -3690,7 +3714,7 @@ my_bool translog_init_with_table(const char *directory,
open_files,
0, TRANSLOG_FILE **))->
handler.file))
DBUG_RETURN(1);
goto err;
version_changed= (info.maria_version != TRANSLOG_VERSION_ID);
}
}
......@@ -3702,25 +3726,26 @@ my_bool translog_init_with_table(const char *directory,
MYF(0));
DBUG_PRINT("info", ("The log is not found => we will create new log"));
if (file == NULL)
DBUG_RETURN(1);
goto err;
/* Start new log system from scratch */
log_descriptor.horizon= MAKE_LSN(start_file_num,
TRANSLOG_PAGE_SIZE); /* header page */
if ((file->handler.file=
create_logfile_by_number_no_cache(start_file_num)) == -1)
DBUG_RETURN(1);
goto err;
translog_file_init(file, start_file_num, 0);
if (insert_dynamic(&log_descriptor.open_files, (uchar*)&file))
DBUG_RETURN(1);
goto err;
log_descriptor.min_file= log_descriptor.max_file= start_file_num;
if (translog_write_file_header())
DBUG_RETURN(1);
goto err;
DBUG_ASSERT(log_descriptor.max_file - log_descriptor.min_file + 1 ==
log_descriptor.open_files.elements);
if (ma_control_file_write_and_force(checkpoint_lsn, start_file_num,
max_trid_in_control_file))
DBUG_RETURN(1);
max_trid_in_control_file,
recovery_failures))
goto err;
/* assign buffer 0 */
translog_start_buffer(log_descriptor.buffers, &log_descriptor.bc, 0);
translog_new_page_header(&log_descriptor.horizon, &log_descriptor.bc);
......@@ -3734,7 +3759,7 @@ my_bool translog_init_with_table(const char *directory,
log_descriptor.horizon= LSN_REPLACE_OFFSET(log_descriptor.horizon,
TRANSLOG_PAGE_SIZE);
if (translog_create_new_file())
DBUG_RETURN(1);
goto err;
/*
Buffer system left untouched after recovery => we should init it
(starting from buffer 0)
......@@ -3767,7 +3792,7 @@ my_bool translog_init_with_table(const char *directory,
id_to_share= (MARIA_SHARE **) my_malloc(SHARE_ID_MAX * sizeof(MARIA_SHARE*),
MYF(MY_WME | MY_ZEROFILL));
if (unlikely(!id_to_share))
DBUG_RETURN(1);
goto err;
id_to_share--; /* min id is 1 */
/* Check the last LSN record integrity */
......@@ -3783,7 +3808,7 @@ my_bool translog_init_with_table(const char *directory,
page_addr= (log_descriptor.horizon -
((log_descriptor.horizon - 1) % TRANSLOG_PAGE_SIZE + 1));
if (translog_scanner_init(page_addr, 1, &scanner, 1))
DBUG_RETURN(1);
goto err;
scanner.page_offset= page_overhead[scanner.page[TRANSLOG_PAGE_FLAGS]];
for (;;)
{
......@@ -3797,7 +3822,7 @@ my_bool translog_init_with_table(const char *directory,
if (translog_get_next_chunk(&scanner))
{
translog_destroy_scanner(&scanner);
DBUG_RETURN(1);
goto err;
}
if (scanner.page != END_OF_LOG)
chunk_1byte= scanner.page[scanner.page_offset];
......@@ -3808,7 +3833,7 @@ my_bool translog_init_with_table(const char *directory,
if (translog_get_next_chunk(&scanner))
{
translog_destroy_scanner(&scanner);
DBUG_RETURN(1);
goto err;
}
if (scanner.page == END_OF_LOG)
break; /* it was the last record */
......@@ -3845,7 +3870,7 @@ my_bool translog_init_with_table(const char *directory,
}
translog_destroy_scanner(&scanner);
if (translog_scanner_init(page_addr, 1, &scanner, 1))
DBUG_RETURN(1);
goto err;
scanner.page_offset= page_overhead[scanner.page[TRANSLOG_PAGE_FLAGS]];
}
translog_destroy_scanner(&scanner);
......@@ -3872,7 +3897,7 @@ my_bool translog_init_with_table(const char *directory,
else if (translog_truncate_log(last_lsn))
{
translog_free_record_header(&rec);
DBUG_RETURN(1);
goto err;
}
}
else
......@@ -3898,7 +3923,7 @@ my_bool translog_init_with_table(const char *directory,
else if (translog_truncate_log(last_lsn))
{
translog_free_record_header(&rec);
DBUG_RETURN(1);
goto err;
}
}
}
......@@ -3907,6 +3932,9 @@ my_bool translog_init_with_table(const char *directory,
}
}
DBUG_RETURN(0);
err:
ma_message_no_user(0, "log initialization failed");
DBUG_RETURN(1);
}
......
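The hunks above replace each early `DBUG_RETURN(1)` in `translog_init_with_table()` with `goto err`, so that every failure path reaches one label that logs a message before returning. The following is a minimal, standalone sketch of that single-exit pattern. The names `init_sketch`, `report_init_failure` and `init_failures_reported` are hypothetical and exist only for illustration; they are not part of the Maria sources.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter, only here so the effect of the err: label is observable. */
static int init_failures_reported = 0;

/* Hypothetical stand-in for the error-log call made at the err: label. */
static void report_init_failure(void)
{
  fprintf(stderr, "log initialization failed\n");
  init_failures_reported++;
}

/*
  Runs three pretend initialization steps. If fail_at_step is 1..3, that
  step "fails" and control jumps to the shared err: label; 0 means no failure.
*/
int init_sketch(int fail_at_step)
{
  int step;
  assert(fail_at_step >= 0);
  for (step = 1; step <= 3; step++)
  {
    if (step == fail_at_step)
      goto err;                 /* instead of returning 1 silently */
  }
  return 0;
err:
  report_init_failure();        /* one place to report, whatever step failed */
  return 1;
}
```

The point of the design is that a new failure path only needs a `goto err` to get the same reporting as the existing ones.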
......@@ -317,6 +317,10 @@ extern void translog_deassign_id_from_share(struct st_maria_share *share);
extern void
translog_assign_id_to_share_from_recovery(struct st_maria_share *share,
uint16 id);
extern my_bool translog_walk_filenames(const char *directory,
my_bool (*callback)(const char *,
const char *));
enum enum_translog_status
{
TRANSLOG_UNINITED, /* no initialization done or error during initialization */
......
......@@ -191,12 +191,12 @@ static void print_preamble()
@retval !=0 Error
*/
int maria_recover(void)
int maria_recovery_from_log(void)
{
int res= 1;
FILE *trace_file;
uint warnings_count;
DBUG_ENTER("maria_recover");
DBUG_ENTER("maria_recovery_from_log");
DBUG_ASSERT(!maria_in_recovery);
maria_in_recovery= TRUE;
......@@ -462,7 +462,12 @@ end:
"Maria recovery failed. Please run maria_chk -r on all maria "
"tables and delete all maria_log.######## files", MYF(0));
procent_printed= 0;
/* we don't cleanly close tables if we hit some error (may corrupt them) */
/*
We don't cleanly close tables if we hit some error (may corrupt them by
flushing some wrong blocks made from wrong REDOs). It also leaves their
open_count>0, which ensures that --maria-recover, if used, will try to
repair them.
*/
DBUG_RETURN(error);
}
......@@ -1224,6 +1229,12 @@ static int new_table(uint16 sid, const char *name, LSN lsn_of_file_id)
" maria_chk -r", share->open_file_name);
error= -1; /* not fatal, try with other tables */
goto end;
/*
Note that if a first recovery fails to apply a REDO, it marks the table
corrupted and stops the entire recovery. A second recovery will find the
table is marked corrupted and skip it (and thus possibly handle other
tables).
*/
}
/* don't log any records for this work */
_ma_tmp_disable_logging_for_table(info, FALSE);
......
......@@ -25,7 +25,7 @@
C_MODE_START
enum maria_apply_log_way
{ MARIA_LOG_APPLY, MARIA_LOG_DISPLAY_HEADER, MARIA_LOG_CHECK };
int maria_recover(void);
int maria_recovery_from_log(void);
int maria_apply_log(LSN lsn, enum maria_apply_log_way apply,
FILE *trace_file,
my_bool execute_undo_phase, my_bool skip_DDLs,
......
#!/usr/bin/env perl
use strict;
use warnings;
my $usage= <<EOF;
This program tests that the options
--maria-force-start-after-recovery-failures --maria-recover work as
expected.
It has to be run from directory mysql-test, and works with non-debug
and debug binaries.
Pass it option -d or -i (to test corruption of data or index file).
EOF
# -d currently exhibits BUG#36578
# "Maria: maria-recover may fail to autorepair a table"
die($usage) if (@ARGV == 0);
my $corrupt_index;
if ($ARGV[0] eq '-d')
{
$corrupt_index= 0;
}
elsif ($ARGV[0] eq '-i')
{
$corrupt_index= 1;
}
else
{
die($usage);
}
my $force_after= 3;
my $corrupt_file= $corrupt_index ? "MAI" : "MAD";
my $corrupt_message=
"\\[ERROR\\] mysqld: Table '.\/test\/t1' is marked as crashed and should be repaired";
my $sql_name= "./var/tmp/create_table.sql";
my $error_log_name= "./var/log/master.err";
my @cmd_output;
my $whatever; # garbage data
my $base_server_cmd= "perl mysql-test-run.pl --mem --mysqld=--maria-force-start-after-recovery-failures=$force_after maria-recover";
my $server_cmd;
my $client_cmd= "../client/mysql -u root -S var/tmp/master.sock test < $sql_name";
my $server_pid_name="./var/run/master.pid";
my $server_pid;
my $i; # count of server restarts
sub kill_server;
print "starting mysqld\n";
$server_cmd= $base_server_cmd . " --start-and-exit 2>&1";
@cmd_output=`$server_cmd`;
die if $?;
open(FILE, ">", $sql_name) or die;
# To exhibit BUG#36578 with -d, we don't create an index if -d. This is
# because the presence of an index will cause repair-by-sort to be used,
# where sort_get_next_record() is only called inside
# _ma_create_index_by_sort(), so the latter function fails and in this
# case retry_repair is set, so bug does not happen. Whereas without
# an index, repair-with-key-cache is called, which calls
# sort_get_next_record() whose failure itself does not cause a retry.
print FILE "create table t1 (a varchar(1000)".
($corrupt_index ? ", index(a)" : "") .") engine=maria;\n";
print FILE <<EOF;
insert into t1 values("ThursdayMorningsMarket");
# If Recovery executes REDO_INDEX_NEW_PAGE it will overwrite our
# intentional corruption; we make Recovery skip this record by bumping
# create_rename_lsn using OPTIMIZE TABLE. This also makes sure to put
# the pages on disk, so that we can corrupt them.
optimize table t1;
# mark table open, so that --maria-recover repairs it
insert into t1 select concat(a,'b') from t1 limit 1;
EOF
close FILE;
print "creating table\n";
`$client_cmd`;
die if $?;
print "killing mysqld hard\n";
kill_server(9);
print "ruining " .
($corrupt_index ? "first page of keys" : "bitmap page") .
" in table to test maria-recover\n";
open(FILE, "+<", "./var/master-data/test/t1.$corrupt_file") or die;
$whatever= ("\xAB" x 100);
sysseek (FILE, $corrupt_index ? 8192 : (8192-100-100), 0) or die;
syswrite (FILE, $whatever) or die;
close FILE;
print "ruining log to make recovery fail; mysqld should fail the $force_after first restarts\n";
open(FILE, "+<", "./var/tmp/maria_log.00000001") or die;
$whatever= ("\xAB" x 8192);
sysseek (FILE, 99, 0) or die;
syswrite (FILE, $whatever) or die;
close FILE;
$server_cmd= $base_server_cmd . " --start-dirty 2>&1";
for($i= 1; $i <= $force_after; $i= $i + 1)
{
print "mysqld restart number $i... ";
unlink($error_log_name) or die;
`$server_cmd`;
# mysqld should exit with status 1 when it can't read the log
die unless (($? >> 8) == 1);
open(FILE, "<", $error_log_name) or die;
@cmd_output= <FILE>;
close FILE;
die unless grep(/\[ERROR\] mysqld: Maria engine: log initialization failed/, @cmd_output);
die unless grep(/\[ERROR\] Plugin 'MARIA' init function returned error./, @cmd_output);
print "failed - ok\n";
}
print "mysqld restart number $i... ";
unlink($error_log_name) or die;
@cmd_output=`$server_cmd`;
die if $?;
open(FILE, "<", $error_log_name) or die;
@cmd_output= <FILE>;
close FILE;
die unless grep(/\[Warning\] mysqld: Maria engine: removed all logs after [\d]+ consecutive failures of recovery from logs/, @cmd_output);
die unless grep(/\[ERROR\] mysqld: File '..\/tmp\/maria_log.00000001' not found \(Errcode: 2\)/, @cmd_output);
print "success - ok\n";
open(FILE, ">", $sql_name) or die;
print FILE <<EOF;
set global maria_recover=normal;
insert into t1 values('aaa');
EOF
close FILE;
# verify corruption has not yet been noticed
open(FILE, "<", $error_log_name) or die;
@cmd_output= <FILE>;
close FILE;
die if grep(/$corrupt_message/, @cmd_output);
print "inserting in table\n";
`$client_cmd`;
die if $?;
print "table is usable - ok\n";
open(FILE, "<", $error_log_name) or die;
@cmd_output= <FILE>;
close FILE;
die unless grep(/$corrupt_message/, @cmd_output);
die unless grep(/\[Warning\] Recovering table: '.\/test\/t1'/, @cmd_output);
print "was corrupted and automatically repaired - ok\n";
# remove our traces
kill_server(15);
print "TEST ALL OK\n";
# kills mysqld with signal given in parameter
sub kill_server
{
my ($sig)= @_;
my $wait_count= 0;
open(FILE, "<", $server_pid_name) or die;
@cmd_output= <FILE>;
close FILE;
$server_pid= $cmd_output[0];
die unless $server_pid > 0;
kill($sig, $server_pid) or die;
while (kill (0, $server_pid))
{
print "waiting for mysqld to die\n" if ($wait_count > 30);
$wait_count= $wait_count + 1;
select(undef, undef, undef, 0.1);
}
}
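The script above expects mysqld to fail its first `$force_after` restarts and then to remove the logs and start on the next one. A minimal sketch of a counter policy that produces this sequence is given below. It is an assumption inferred from the test's expected behaviour, not a copy of the engine code: the type `control_file_sketch`, the function `engine_start_sketch`, and the exact comparison (`>=`) are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the failure counter persisted in the control file. */
typedef struct { uint8_t recovery_failures; } control_file_sketch;

enum start_outcome { START_OK, START_FAILED, START_FORCED };

/*
  One engine start.
  force_after:     value of the force-start option (0 disables forcing).
  log_is_readable: whether log reading and recovery would succeed this time.
*/
enum start_outcome engine_start_sketch(control_file_sketch *cf,
                                       unsigned force_after,
                                       int log_is_readable)
{
  assert(cf != 0);
  if (force_after != 0 && cf->recovery_failures >= force_after)
  {
    /* Assumed: logs are removed and the engine starts from scratch. */
    cf->recovery_failures = 0;
    return START_FORCED;
  }
  /* Count the attempt before trying, so a crash mid-recovery is counted too. */
  if (cf->recovery_failures < UINT8_MAX)
    cf->recovery_failures++;
  if (!log_is_readable)
    return START_FAILED;
  cf->recovery_failures = 0;    /* success ends the run of consecutive failures */
  return START_OK;
}
```

With `force_after` set to 3 and an unreadable log, three calls return `START_FAILED` and the fourth returns `START_FORCED`, which matches the restart sequence the script checks for.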
......@@ -45,6 +45,7 @@ char file_name[FN_REFLEN];
LSN expect_checkpoint_lsn;
uint32 expect_logno;
TrID expect_max_trid;
uint8 expect_recovery_failures;
static int delete_file(myf my_flags);
/*
......@@ -55,10 +56,11 @@ static int close_file(void); /* wraps ma_control_file_end */
/* wraps ma_control_file_open_or_create */
static int open_file(void);
/* wraps ma_control_file_write_and_force */
static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid);
static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid,
uint8 rec_failures);
/* Tests */
static int test_one_log(void);
static int test_one_log_and_recovery_failures(void);
static int test_five_logs_and_max_trid(void);
static int test_3_checkpoints_and_2_logs(void);
static int test_binary_content(void);
......@@ -135,7 +137,8 @@ int main(int argc,char *argv[])
RET_ERR_UNLESS(0 == delete_file(0)); /* if fails, can't continue */
diag("Tests of normal conditions");
ok(0 == test_one_log(), "test of creating one log");
ok(0 == test_one_log_and_recovery_failures(),
"test of creating one log and recording recovery failures");
ok(0 == test_five_logs_and_max_trid(),
"test of creating five logs and many transactions");
ok(0 == test_3_checkpoints_and_2_logs(),
......@@ -167,7 +170,7 @@ static int delete_file(myf my_flags)
my_delete(file_name, my_flags);
expect_checkpoint_lsn= LSN_IMPOSSIBLE;
expect_logno= FILENO_IMPOSSIBLE;
expect_max_trid= 0;
expect_max_trid= expect_recovery_failures= 0;
return 0;
}
......@@ -181,6 +184,7 @@ static int verify_module_values_match_expected(void)
RET_ERR_UNLESS(last_logno == expect_logno);
RET_ERR_UNLESS(last_checkpoint_lsn == expect_checkpoint_lsn);
RET_ERR_UNLESS(max_trid_in_control_file == expect_max_trid);
RET_ERR_UNLESS(recovery_failures == expect_recovery_failures);
return 0;
}
......@@ -215,21 +219,28 @@ static int open_file(void)
return 0;
}
static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid)
static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid,
uint8 rec_failures)
{
RET_ERR_UNLESS(ma_control_file_write_and_force(checkpoint_lsn, logno, trid)
RET_ERR_UNLESS(ma_control_file_write_and_force(checkpoint_lsn, logno, trid,
rec_failures)
== 0);
/* Check that the module reports expected information */
RET_ERR_UNLESS(verify_module_values_match_expected() == 0);
return 0;
}
static int test_one_log(void)
static int test_one_log_and_recovery_failures(void)
{
RET_ERR_UNLESS(open_file() == CONTROL_FILE_OK);
expect_logno= 123;
RET_ERR_UNLESS(write_file(last_checkpoint_lsn, expect_logno,
max_trid_in_control_file) == 0);
max_trid_in_control_file,
recovery_failures) == 0);
expect_recovery_failures= 158;
RET_ERR_UNLESS(write_file(last_checkpoint_lsn, expect_logno,
max_trid_in_control_file,
expect_recovery_failures) == 0);
RET_ERR_UNLESS(close_file() == 0);
return 0;
}
......@@ -245,7 +256,8 @@ static int test_five_logs_and_max_trid(void)
{
expect_logno*= 3;
RET_ERR_UNLESS(write_file(last_checkpoint_lsn, expect_logno,
expect_max_trid) == 0);
expect_max_trid,
recovery_failures) == 0);
}
RET_ERR_UNLESS(close_file() == 0);
return 0;
......@@ -260,23 +272,28 @@ static int test_3_checkpoints_and_2_logs(void)
RET_ERR_UNLESS(open_file() == CONTROL_FILE_OK);
expect_checkpoint_lsn= MAKE_LSN(5, 10000);
RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno,
max_trid_in_control_file) == 0);
max_trid_in_control_file,
recovery_failures) == 0);
expect_logno= 17;
RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno,
max_trid_in_control_file) == 0);
max_trid_in_control_file,
recovery_failures) == 0);
expect_checkpoint_lsn= MAKE_LSN(17, 20000);
RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno,
max_trid_in_control_file) == 0);
max_trid_in_control_file,
recovery_failures) == 0);
expect_checkpoint_lsn= MAKE_LSN(17, 45000);
RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno,
max_trid_in_control_file) == 0);
max_trid_in_control_file,
recovery_failures) == 0);
expect_logno= 19;
RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno,
max_trid_in_control_file) == 0);
max_trid_in_control_file,
recovery_failures) == 0);
RET_ERR_UNLESS(close_file() == 0);
return 0;
}
......
......@@ -129,10 +129,10 @@ static my_bool read_and_check_content(TRANSLOG_HEADER_BUFFER *rec,
}
static const char *load_default_groups[]= {"ma_unit_loghandler", 0};
#if defined(__WIN__)
static const char *default_dbug_option= "d:t:i:O,\\ma_test_loghandler.trace";
#else
static const char *default_dbug_option= "d:t:i:o,/tmp/ma_test_loghandler.trace";
#ifndef DBUG_OFF
static const char *default_dbug_option=
IF_WIN("d:t:i:O,\\ma_test_loghandler.trace",
"d:t:i:o,/tmp/ma_test_loghandler.trace");
#endif
static const char *opt_wfile= NULL;
static const char *opt_rfile= NULL;
......