Fix for BUG#39363 "Concurent inserts in the same table lead to hang in maria engine"
(need a mutex when modifying bitmap->non_flushable), which I hit when running maria_bulk_insert.yy.

After fixing this, I hit an assertion in check_and_set_lsn() saying that the page was PAGECACHE_PLAIN_PAGE. This could be caused by pages left behind by an operation which had transactions disabled (like a bulk insert with repair): in this patch we remove those pages from the cache when we re-enable transactions.

After fixing this, I get page cache deadlocks; pushbuild2 also has some, to be looked at.

No testcase: it requires concurrency and running for 15 minutes, but it is automatically tested by pushbuild2.

storage/maria/ma_bitmap.c:
  Doing bitmap->non_flushable++ without a mutex was wrong. If this ++ happened while another ++ or -- was happening in another thread, one ++ or -- could be missed and the bitmap code would behave wrongly. For example, if a ++ was missed, the DBUG_ASSERT(((int) (bitmap->non_flushable)) >= 0) in _ma_bitmap_release_unused() could fire. I saw this assertion happen in practice in maria_bulk_insert.yy. Adding this mutex lock eliminated the assertion problem. The >=0 was wrong, it should be >0 (or the variable could go negative).

storage/maria/ma_recovery.c:
  When we re-enable transactionality, as we may have created pages of type PAGECACHE_PLAIN_PAGE before, we need to remove them from the cache (FLUSH_RELEASE). Otherwise they would stay this way, and later when we maria_write() to them, we would try to tag them with an LSN (ma_unpin_all_pages()), which is incorrect for a plain page (and causes an assertion in the page cache at the start of check_and_set_lsn()). I saw the assertion fire with maria_bulk_insert.yy, and this seems to cure it.
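
The lost-update problem described for bitmap->non_flushable can be sketched outside the server with plain pthreads. This is only an illustration, not the Maria code: demo_bitmap, demo_pin(), demo_unpin() and demo_run() are hypothetical names, and the real structure and mutex in ma_bitmap.c differ. Each ++ and -- is done under the mutex, and the assertion before the decrement uses > 0, so the counter can neither lose an update nor go negative.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the bitmap state; NOT the real Maria structure. */
typedef struct {
  pthread_mutex_t lock;        /* protects non_flushable */
  unsigned int non_flushable;  /* number of users forbidding a flush */
} demo_bitmap;

#define DEMO_THREADS 4
#define DEMO_ITERATIONS 100000

/* Increment under the mutex so that a concurrent ++ or -- is never lost. */
static void demo_pin(demo_bitmap *b)
{
  pthread_mutex_lock(&b->lock);
  b->non_flushable++;
  pthread_mutex_unlock(&b->lock);
}

/* Decrement under the mutex; the counter must be > 0 before the --. */
static void demo_unpin(demo_bitmap *b)
{
  pthread_mutex_lock(&b->lock);
  assert(b->non_flushable > 0);
  b->non_flushable--;
  pthread_mutex_unlock(&b->lock);
}

static void *demo_worker(void *arg)
{
  demo_bitmap *b = arg;
  for (int i = 0; i < DEMO_ITERATIONS; i++) {
    demo_pin(b);
    demo_unpin(b);
  }
  return NULL;
}

/* Runs the workers concurrently and returns the final counter value. */
static unsigned int demo_run(void)
{
  demo_bitmap b;
  pthread_t threads[DEMO_THREADS];
  int started = 0;

  pthread_mutex_init(&b.lock, NULL);
  b.non_flushable = 0;

  for (int i = 0; i < DEMO_THREADS; i++) {
    if (pthread_create(&threads[started], NULL, demo_worker, &b) != 0)
      break;                   /* join only the threads that did start */
    started++;
  }
  for (int i = 0; i < started; i++)
    pthread_join(threads[i], NULL);

  pthread_mutex_destroy(&b.lock);
  return b.non_flushable;
}
```

Calling demo_run() from a test program built with -pthread should return 0, since every pin is matched by an unpin. Without the lock and unlock calls, the same loop can lose updates and trip the assertion, which is the kind of failure described above.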