    btrfs: set start on clone before calling copy_extent_buffer_full · 53e24158
    Josef Bacik authored
    Our subpage testing started hanging on generic/560 and I bisected it
    down to 1cab1375 ("btrfs: reuse cloned extent buffer during
    fiemap to avoid re-allocations").  This is subtle because we use
    eb->start to figure out where in the folio we're copying to when we're
    subpage, as our ->start may refer to an area inside of the folio.
    
    For example, assume a 16K page size machine with a 4K node size, and
    assume that we already have a cloned extent buffer left over from the
    previous search.
    
    copy_extent_buffer_full() will do the following when copying the extent
    buffer path->nodes[0] (src) into cloned (dest):
    
      src->start = 8k; // this is the new leaf we're cloning
      cloned->start = 4k; // this is left over from the previous clone
    
      src_addr = folio_address(src->folios[0]);
      dest_addr = folio_address(dest->folios[0]);
    
      memcpy(dest_addr + get_eb_offset_in_folio(dest, 0),
    	 src_addr + get_eb_offset_in_folio(src, 0), src->len);
    
    Now get_eb_offset_in_folio() is where the problems occur. With a
    sub-pagesize blocksize we can have multiple eb's per folio, so the
    offset inside the folio is derived from ->start. The code for this is
    as follows:
    
      size_t get_eb_offset_in_folio(eb, offset) {
    	  return (eb->start + offset) & (folio_size(eb->folios[0]) - 1);
      }
    
    So in the above example we are copying into offset 4K inside the folio.
    However once we update cloned->start to 8K to match the src the math for
    get_eb_offset_in_folio() changes, and any subsequent reads (i.e.
    btrfs_item_key_to_cpu()) will start reading from the offset 8K instead
    of 4K where we copied to, giving us garbage.
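
    The masking above can be checked with a small userspace sketch. The
    helper name offset_in_folio() and its explicit folio_size parameter are
    assumptions made for illustration; this is not the kernel helper, only
    the same arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace helper (not the kernel function): offset of
 * (start + offset) inside a folio whose size is a power of two.
 */
static size_t offset_in_folio(uint64_t start, size_t offset, size_t folio_size)
{
	/* The mask trick is only valid for power-of-two folio sizes. */
	assert(folio_size != 0 && (folio_size & (folio_size - 1)) == 0);

	return (size_t)((start + offset) & (folio_size - 1));
}
```

    With a 16K folio, a start of 4K maps to offset 4096 and a start of 8K
    maps to offset 8192, which is why changing ->start after the copy moves
    the place later reads look at.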
    
    Fix this by setting start before we call copy_extent_buffer_full() to make
    sure that we're copying into the same offset inside of the folio that we
    will read from later.
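
    The effect of the ordering can be sketched with a userspace model of
    the 16K page / 4K node example. Everything here (struct model_eb, the
    model_* functions, the 0xAB fill byte) is hypothetical and only mimics
    the offset math described in this message; it is not the btrfs code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FOLIO_SIZE (16 * 1024)	/* 16K page size, as in the example */
#define NODE_SIZE  (4 * 1024)	/* 4K node size */

/* Hypothetical stand-in for an extent buffer backed by a single folio. */
struct model_eb {
	uint64_t start;		/* logical start of this buffer */
	unsigned char *folio;	/* backing memory of FOLIO_SIZE bytes */
};

/* Same masking as the pseudocode above, with a fixed folio size. */
static size_t model_offset_in_folio(const struct model_eb *eb, size_t offset)
{
	return (size_t)((eb->start + offset) & (FOLIO_SIZE - 1));
}

/* Model of a full copy: both offsets are derived from ->start. */
static void model_copy_full(struct model_eb *dest, const struct model_eb *src)
{
	assert(dest->folio && src->folio);
	memcpy(dest->folio + model_offset_in_folio(dest, 0),
	       src->folio + model_offset_in_folio(src, 0), NODE_SIZE);
}

/*
 * Clone a leaf at 8K into a buffer whose ->start is left over at 4K.
 * Returns 1 if a later read through the clone sees the copied data.
 */
static int model_clone_leaf(int set_start_first)
{
	static unsigned char src_folio[FOLIO_SIZE], clone_folio[FOLIO_SIZE];
	struct model_eb src = { .start = 8 * 1024, .folio = src_folio };
	struct model_eb cloned = { .start = 4 * 1024, .folio = clone_folio };

	memset(src_folio, 0, sizeof(src_folio));
	memset(clone_folio, 0, sizeof(clone_folio));
	memset(src_folio + model_offset_in_folio(&src, 0), 0xAB, NODE_SIZE);

	if (set_start_first) {
		/* Fixed order: copy and later reads agree on offset 8K. */
		cloned.start = src.start;
		model_copy_full(&cloned, &src);
	} else {
		/* Old order: copy lands at 4K, later reads look at 8K. */
		model_copy_full(&cloned, &src);
		cloned.start = src.start;
	}

	return clone_folio[model_offset_in_folio(&cloned, 0)] == 0xAB;
}
```

    In this model, model_clone_leaf(1) finds the copied bytes while
    model_clone_leaf(0) reads from a region the copy never touched.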
    
    All other sites of copy_extent_buffer_full() are correct because we
    either set ->start beforehand or we simply don't change it in the case
    of the tree-log usage.
    
    With this fix we now pass generic/560 on our subpage tests.
    
    Fixes: 1cab1375 ("btrfs: reuse cloned extent buffer during fiemap to avoid re-allocations")
    Reviewed-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: Josef Bacik <josef@toxicpanda.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
extent_io.c 146 KB