• Josef Bacik's avatar
    btrfs: flush write bio if we loop in extent_write_cache_pages · 42ffb0bf
    Josef Bacik authored
    There exists a deadlock with range_cyclic that has existed forever.  If
    we loop around with a bio already built we could deadlock with a writer
    who has the page locked that we're attempting to write but is waiting on
    a page in our bio to be written out.  The task traces are as follows
    
      PID: 1329874  TASK: ffff889ebcdf3800  CPU: 33  COMMAND: "kworker/u113:5"
       #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f
       #1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3
       #2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42
       #3 [ffffc900297bb708] __lock_page at ffffffff811f145b
       #4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502
       #5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684
       #6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff
       #7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0
       #8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2
       #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd
    
      PID: 2167901  TASK: ffff889dc6a59c00  CPU: 14  COMMAND:
      "aio-dio-invalid"
       #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f
       #1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3
       #2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42
       #3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6
       #4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7
       #5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359
       #6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933
       #7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8
       #8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d
       #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032
    
    I used drgn to find the respective pages we were stuck on
    
    page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901
    page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874
    
    As you can see the kworker is waiting for bit 0 (PG_locked) on index
    7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index
    8148.  aio-dio-invalid has 7680, and the kworker epd looks like the
    following
    
      crash> struct extent_page_data ffffc900297bbbb0
      struct extent_page_data {
        bio = 0xffff889f747ed830,
        tree = 0xffff889eed6ba448,
        extent_locked = 0,
        sync_io = 0
      }
    
    Probably worth mentioning as well that it waits for writeback of the
    page to complete while holding a lock on it (at prepare_pages()).
    
    Using drgn I walked the bio pages looking for page
    0xffffea00fbfc7500 which is the one we're waiting for writeback on
    
      bio = Object(prog, 'struct bio', address=0xffff889f747ed830)
      for i in range(0, bio.bi_vcnt.value_()):
          bv = bio.bi_io_vec[i]
          if bv.bv_page.value_() == 0xffffea00fbfc7500:
    	  print("FOUND IT")
    
    which validated what I suspected.
    
    The fix for this is simple, flush the epd before we loop back around to
    the beginning of the file during writeout.
    
    Fixes: b293f02e ("Btrfs: Add writepages support")
    CC: stable@vger.kernel.org # 4.4+
    Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
    Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    42ffb0bf
extent_io.c 154 KB