    Btrfs: teach backref walking about backrefs with underflowed offset values · d6589101
    Filipe Manana authored
    When cloning/deduplicating file extents (through the clone and extent_same
    ioctls) we can get data back references with offset values that are a
    result of an unsigned integer arithmetic underflow, that is, values that
    are much larger than they could be otherwise.
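
The wrap-around can be reproduced in isolation. The sketch below is not kernel code: `backref_offset` is a hypothetical helper that only mirrors the `file_offset - data_offset` subtraction on unsigned 64-bit values.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): the offset stored in a data
 * back reference is the file offset minus the data offset into the extent.
 * Both operands are unsigned 64-bit values, so when data_offset is larger
 * than file_offset the result wraps around modulo 2^64.
 */
static uint64_t backref_offset(uint64_t file_offset, uint64_t data_offset)
{
    return file_offset - data_offset;
}
```

With the values used by the reproducer in this message, `backref_offset(0, 16 * 1024)` yields 18446744073709535232 and `backref_offset(16 * 1024, 48 * 1024)` yields 18446744073709518848.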
    
    This is not a problem when decrementing or dropping the back references
    (which happens when we overwrite the extents or punch a hole, for example,
    through __btrfs_drop_extents()), since we compute the same too-large
    offset value. It is a problem for the backref walking code, used by
    incremental send and by the ioctls behind the btrfs tool's
    "inspect-internal" commands: the search key is set for an extent item
    that starts at an offset matching the exceptionally large offset value of
    the data back reference, so the walk misses the corresponding file extent
    items. For an incremental send this causes the send ioctl to fail with
    -EIO.
    
    So teach the backref walking code to deal with these cases by setting the
    search key's offset to 0 if the backref's offset value is larger than
    LLONG_MAX (the largest possible file offset). This makes sure the backref
    walking code finds the corresponding file extent items at the expense of
    scanning more items and leaves in the btree.
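
The rule described above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the actual patch: `search_key_offset` is a hypothetical name, and the real change lives inside the backref walking code in backref.c.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the actual patch): choose the offset to place in
 * the search key used to look up file extent items. A backref offset above
 * LLONG_MAX cannot be a real file offset, so it is treated as the result of
 * an underflow and the search starts at offset 0 instead.
 */
static uint64_t search_key_offset(uint64_t backref_offset)
{
    if (backref_offset > (uint64_t)LLONG_MAX)
        return 0;
    return backref_offset;
}
```

Starting at 0 means the caller has to iterate over more file extent items before reaching the matching one, which is the extra scanning cost mentioned above.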
    
    Fixing the clone/dedup ioctls to not produce such underflowed results would
    require major changes breaking backward compatibility, updating user space
    tools, etc.
    
    Simple reproducer case for fstests:
    
      seq=`basename $0`
      seqres=$RESULT_DIR/$seq
      echo "QA output created by $seq"
    
      tmp=/tmp/$$
      status=1	# failure is the default!
      trap "_cleanup; exit \$status" 0 1 2 3 15
    
      _cleanup()
      {
          rm -fr $send_files_dir
          rm -f $tmp.*
      }
    
      # get standard environment, filters and checks
      . ./common/rc
      . ./common/filter
    
      # real QA test starts here
      _supported_fs btrfs
      _supported_os Linux
      _require_scratch
      _require_cloner
      _need_to_be_root
    
      send_files_dir=$TEST_DIR/btrfs-test-$seq
    
      rm -f $seqres.full
      rm -fr $send_files_dir
      mkdir $send_files_dir
    
      _scratch_mkfs >>$seqres.full 2>&1
      _scratch_mount
    
      # Create our test file with a single extent of 64K starting at file
      # offset 128K.
      $XFS_IO_PROG -f -c "pwrite -S 0xaa 128K 64K" $SCRATCH_MNT/foo \
          | _filter_xfs_io
    
      _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
          $SCRATCH_MNT/mysnap1
    
      # Now clone parts of the original extent into lower offsets of the file.
      #
      # The first clone operation adds a file extent item to file offset 0
      # that points to our initial extent with a data offset of 16K. The
      # corresponding data back reference in the extent tree has an offset of
      # 18446744073709535232, which is the result of file_offset - data_offset
      # = 0 - 16K.
      #
      # The second clone operation adds a file extent item to file offset 16K
      # that points to our initial extent with a data offset of 48K. The
      # corresponding data back reference in the extent tree has an offset of
      # 18446744073709518848, which is the result of file_offset - data_offset
      # = 16K - 48K.
      #
      # Those large back reference offsets (result of unsigned arithmetic
      # underflow) confused the back reference walking code (used by an
      # incremental send and the multiple inspect-internal ioctls) and made it
      # miss the back references. An incremental send then failed with -EIO
      # and printed a message like the following to dmesg:
      #
      # "BTRFS error (device sdc): did not find backref in send_root. \
      #  inode=257, offset=0, disk_byte=12845056 found extent=12845056"
      #
      $CLONER_PROG -s $(((128 + 16) * 1024)) -d 0 -l $((16 * 1024)) \
          $SCRATCH_MNT/foo $SCRATCH_MNT/foo
      $CLONER_PROG -s $(((128 + 48) * 1024)) -d $((16 * 1024)) \
          -l $((16 * 1024)) $SCRATCH_MNT/foo $SCRATCH_MNT/foo
    
      _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
          $SCRATCH_MNT/mysnap2
    
      _run_btrfs_util_prog send $SCRATCH_MNT/mysnap1 -f $send_files_dir/1.snap
      _run_btrfs_util_prog send -p $SCRATCH_MNT/mysnap1 $SCRATCH_MNT/mysnap2 \
          -f $send_files_dir/2.snap
    
      echo "File digest in the original filesystem:"
      md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch
    
      # Now recreate the filesystem by receiving both send streams and verify
      # we get the same file contents that the original filesystem had.
      _scratch_unmount
      _scratch_mkfs >>$seqres.full 2>&1
      _scratch_mount
    
      _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/1.snap
      _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/2.snap
    
      echo "File digest in the new filesystem:"
      md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch
    
      status=0
      exit
    
    The test's expected golden output is:
    
      wrote 65536/65536 bytes at offset 131072
      XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
      File digest in the original filesystem:
      6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
      File digest in the new filesystem:
      6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
    
    But it failed with:
    
        (...)
        @@ -1,7 +1,5 @@
         QA output created by 097
         wrote 65536/65536 bytes at offset 131072
         XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
        -File digest in the original filesystem:
        -6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
        -File digest in the new filesystem:
        -6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
        ...
    
      $ cat /home/fdmanana/git/hub/xfstests/results//btrfs/097.full
      (...)
      ERROR: send ioctl failed with -5: Input/output error
    
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: Chris Mason <clm@fb.com>
backref.c 52.3 KB