Commit 010228e4 authored by Guoqing Jiang's avatar Guoqing Jiang Committed by Shaohua Li

md-cluster: clear another node's suspend_area after the copy is finished

When one node leaves cluster or stops the resyncing
(resync or recovery) array, then other nodes need to
call recover_bitmaps to continue the unfinished task.

But we need to clear suspend_area later after other
nodes copy the resync information to their bitmap
(by call bitmap_copy_from_slot). Otherwise, all nodes
could write to the suspend_area even the suspend_area
is not handled by any node, because area_resyncing
returns 0 at the beginning of raid1_write_request.
Which means one node could write suspend_area while
another node is resyncing the same area, then data
could be inconsistent.

So let's clear suspend_area later to avoid above issue
with the protection of bm lock. Also it is straightforward
to clear suspend_area after nodes have copied the resync
info to bitmap.
Signed-off-by: default avatarGuoqing Jiang <gqjiang@suse.com>
Reviewed-by: default avatarNeilBrown <neilb@suse.com>
Signed-off-by: default avatarShaohua Li <shli@fb.com>
parent 06c85639
...@@ -304,15 +304,6 @@ static void recover_bitmaps(struct md_thread *thread) ...@@ -304,15 +304,6 @@ static void recover_bitmaps(struct md_thread *thread)
while (cinfo->recovery_map) { while (cinfo->recovery_map) {
slot = fls64((u64)cinfo->recovery_map) - 1; slot = fls64((u64)cinfo->recovery_map) - 1;
/* Clear suspend_area associated with the bitmap */
spin_lock_irq(&cinfo->suspend_lock);
list_for_each_entry_safe(s, tmp, &cinfo->suspend_list, list)
if (slot == s->slot) {
list_del(&s->list);
kfree(s);
}
spin_unlock_irq(&cinfo->suspend_lock);
snprintf(str, 64, "bitmap%04d", slot); snprintf(str, 64, "bitmap%04d", slot);
bm_lockres = lockres_init(mddev, str, NULL, 1); bm_lockres = lockres_init(mddev, str, NULL, 1);
if (!bm_lockres) { if (!bm_lockres) {
...@@ -331,6 +322,16 @@ static void recover_bitmaps(struct md_thread *thread) ...@@ -331,6 +322,16 @@ static void recover_bitmaps(struct md_thread *thread)
pr_err("md-cluster: Could not copy data from bitmap %d\n", slot); pr_err("md-cluster: Could not copy data from bitmap %d\n", slot);
goto clear_bit; goto clear_bit;
} }
/* Clear suspend_area associated with the bitmap */
spin_lock_irq(&cinfo->suspend_lock);
list_for_each_entry_safe(s, tmp, &cinfo->suspend_list, list)
if (slot == s->slot) {
list_del(&s->list);
kfree(s);
}
spin_unlock_irq(&cinfo->suspend_lock);
if (hi > 0) { if (hi > 0) {
if (lo < mddev->recovery_cp) if (lo < mddev->recovery_cp)
mddev->recovery_cp = lo; mddev->recovery_cp = lo;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment