  1. 11 Oct, 2011 1 commit
    • Btrfs: make sure not to defrag extents past i_size · f7f43cc8
      Chris Mason authored
      The btrfs file defrag code will loop through the extents and
      force COW on them.  But if there is a concurrent truncate in the middle
      of the defrag, it might end up defragging the same range over and over
      again.
      
      The problem is that writepage won't go through and do anything on pages
      past i_size, so the COW won't happen, so the file will appear to still
      be fragmented.  defrag will end up hitting the same extents again and
      again.
      
      In the worst case, the truncate can actually livelock with the defrag
      because the defrag keeps creating new ordered extents which the truncate
      code keeps waiting on.
      
      The fix here is to make defrag check for i_size inside the main loop,
      instead of just once before the looping starts.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
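      The idea of checking i_size inside the main loop can be sketched in
      user-space C as below.  This is only an illustration, not the kernel
      code: struct sketch_inode, sketch_defrag() and the 4096-byte page
      size are assumptions made for the example, and the callback merely
      emulates a truncate that shrinks the file while the loop runs.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL   /* assumed page size, for illustration only */

/* Hypothetical stand-in for an inode; only the size matters here. */
struct sketch_inode {
    uint64_t i_size;               /* file size in bytes; may shrink while we loop */
};

/* Per-page hook; it may modify inode->i_size to emulate a concurrent truncate. */
typedef void (*sketch_page_fn)(struct sketch_inode *inode, uint64_t index);

/* Walk the file page by page and return how many pages were processed. */
uint64_t sketch_defrag(struct sketch_inode *inode, sketch_page_fn on_page)
{
    uint64_t processed = 0;

    for (uint64_t index = 0; ; index++) {
        /* Re-read the size on every pass instead of once before the loop. */
        uint64_t npages =
            (inode->i_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

        if (index >= npages)
            break;                 /* nothing left at or before i_size */

        if (on_page != NULL)
            on_page(inode, index);
        processed++;
    }
    return processed;
}

/* Emulated truncate: shrink the file to two pages while page 1 is handled. */
void sketch_truncate_during_defrag(struct sketch_inode *inode, uint64_t index)
{
    if (index == 1)
        inode->i_size = 2 * SKETCH_PAGE_SIZE;
}
```

      With a ten-page file and the emulated truncate, the loop stops after
      two pages rather than walking the stale range past the new i_size.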
  2. 10 Oct, 2011 1 commit
    • Btrfs: fix recursive auto-defrag · 2a0f7f57
      Li Zefan authored
      Follow these steps:
      
        # mount -o autodefrag /dev/sda7 /mnt
        # dd if=/dev/urandom of=/mnt/tmp bs=200K count=1
        # sync
        # dd if=/dev/urandom of=/mnt/tmp bs=8K count=1 conv=notrunc
      
      and then it'll go into a loop: writeback -> defrag -> writeback ...
      
      It's because writeback writes [8K, 200K] and then writes [0, 8K].
      
      I tried to make writeback know if the pages are dirtied by defrag,
      but the patch was a bit intrusive. Here I simply set writeback_index
      when we defrag a file.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  3. 30 Sep, 2011 1 commit
    • Btrfs: force a page fault if we have a shorty copy on a page boundary · b6316429
      Josef Bacik authored
      A user reported a problem where ceph was getting into 100% cpu usage while doing
      some writing.  It turns out it's because we were doing a short write on a
      not-uptodate page, which means we'd fall back to copying one page at a time and
      fault the page in.  The problem is our position is on the page boundary, so our
      fault-in logic wasn't actually reading the page, so we'd just spin forever or
      until the page got read in by somebody else.  This patch forces a readpage if we
      end up doing a short copy.  Alexandre could reproduce this easily with ceph and
      reports it fixes his problem.  I also wrote a reproducer that no longer hangs my
      box with this patch.  Thanks,
      Reported-and-tested-by: Alexandre Oliva <aoliva@redhat.com>
      Signed-off-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  4. 20 Sep, 2011 1 commit
  5. 18 Sep, 2011 4 commits
    • Btrfs: don't change inode flag of the dest clone file · dde820fb
      Li Zefan authored
      The dst file will have the same inode flags as the src file after
      file clone, and I think it's unexpected.
      
      For example, the dst file will suddenly become immutable after
      getting some share of data with src file, if the src is immutable.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: don't make a file partly checksummed through file clone · 0e7b824c
      Li Zefan authored
      To reproduce the bug:
      
        # mount /dev/sda7 /mnt
        # dd if=/dev/zero of=/mnt/src bs=4K count=1
        # umount /mnt
      
        # mount -o nodatasum /dev/sda7 /mnt
        # dd if=/dev/zero of=/mnt/dst bs=4K count=1
        # clone_range -s 4K -l 4K /mnt/src /mnt/dst
      
        # echo 3 > /proc/sys/vm/drop_caches
        # cat /mnt/dst
        # dmesg
        ...
        btrfs no csum found for inode 258 start 0
        btrfs csum failed ino 258 off 0 csum 2566472073 private 0
      
      It's because part of the file is checksummed and the other part is not,
      so btrfs complains that no checksum is found when we read the file.
      
      Disallow file clone if the src and dst files have different checksum
      flags, so we ensure a file is completely checksummed or unchecksummed.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
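      The rule described above (refuse the clone when only one side carries
      the no-checksum flag) might look roughly like the sketch below.  The
      flag names and bit values are made up for the example and are not
      the kernel's definitions; sketch_clone_allowed() is a hypothetical
      helper, not a btrfs function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; names and values are assumptions for this sketch. */
#define SKETCH_INODE_NODATASUM (1u << 0)   /* data is written without checksums */
#define SKETCH_INODE_IMMUTABLE (1u << 1)   /* unrelated flag, must not matter */

/*
 * Allow a clone only when both files agree on the checksum flag, so the
 * destination never ends up partly checksummed and partly not.
 */
bool sketch_clone_allowed(uint32_t src_flags, uint32_t dst_flags)
{
    return (src_flags & SKETCH_INODE_NODATASUM) ==
           (dst_flags & SKETCH_INODE_NODATASUM);
}
```

      Other flags, such as the immutable bit in this sketch, do not affect
      the decision; only the checksum bit is compared.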
    • Btrfs: fix pages truncation in btrfs_ioctl_clone() · 71ef0786
      Li Zefan authored
      It's a bug in commit f81c9cdc
      (Btrfs: truncate pages from clone ioctl target range)
      
      We should pass the dest range to the truncate function, not the
      src range.
      
      Also move the function before locking extent state.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
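      As a rough illustration of "dest range, not src range", the sketch
      below computes the byte range of cached pages to drop from the
      destination offset and length only.  The helper names, the 4096-byte
      page size and the rounding of the end to a page boundary are
      assumptions for the example; len is assumed to be greater than zero.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL   /* assumed page size, for illustration only */

/* Inclusive byte range [start, end]. */
struct sketch_range {
    uint64_t start;
    uint64_t end;
};

/* Round x up to the next multiple of the page size. */
uint64_t sketch_page_align(uint64_t x)
{
    return (x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Range of destination pages to invalidate for a clone of len bytes.
 * src_off is accepted only to show that it plays no part in the result.
 */
struct sketch_range sketch_dest_truncate_range(uint64_t src_off,
                                               uint64_t dest_off,
                                               uint64_t len)
{
    struct sketch_range r;

    (void)src_off;                 /* the source offset must not be used here */
    r.start = dest_off;
    r.end = sketch_page_align(dest_off + len) - 1;
    return r;
}
```

      For the reproducer-style call with a 4K destination offset and a 4K
      length, this yields the range 4096 to 8191 whatever the source
      offset is.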
    • btrfs: fix d_off in the first dirent · 3765fefa
      Hidetoshi Seto authored
      Since the d_off in the first dirent for "." (that originates from
      the 4th argument "offset" of filldir() for the 2nd dirent for "..")
      is wrongly assigned in btrfs_real_readdir(), telldir returns the same
      offset for different locations.
      
       | # mkfs.btrfs /dev/sdb1
       | # mount /dev/sdb1 fs0
       | # cd fs0
       | # touch file0 file1
       | # ../test
       | telldir: 0
       | readdir: d_off = 2, d_name = "."
       | telldir: 2
       | readdir: d_off = 2, d_name = ".."
       | telldir: 2
       | readdir: d_off = 3, d_name = "file0"
       | telldir: 3
       | readdir: d_off = 2147483647, d_name = "file1"
       | telldir: 2147483647
      
      To fix this problem, pass filp->f_pos (which is loff_t) instead.
      
       | # ../test
       | telldir: 0
       | readdir: d_off = 1, d_name = "."
       | telldir: 1
       | readdir: d_off = 2, d_name = ".."
       | telldir: 2
       | readdir: d_off = 3, d_name = "file0"
       :
      
      At the moment the "offset" for "." is unused because there is no
      preceding dirent; however, it is better to pass filp->f_pos to follow
      the intended usage.
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
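      The ../test program used above is not included in the commit.  A
      minimal sketch of a helper that prints the same kind of output could
      look like this; the function name sketch_dump_dir_offsets() is
      hypothetical, and the d_off member of struct dirent is assumed to be
      available as it is on Linux with glibc.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* for telldir() and d_off on glibc */
#endif
#include <dirent.h>
#include <stdio.h>

/*
 * Print telldir() before the first entry and after every readdir() call,
 * together with each entry's d_off and d_name.
 * Returns the number of entries read, or -1 if the directory cannot be opened.
 */
int sketch_dump_dir_offsets(const char *path, FILE *out)
{
    DIR *dir = opendir(path);
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;

    fprintf(out, "telldir: %ld\n", telldir(dir));
    while ((entry = readdir(dir)) != NULL) {
        fprintf(out, "readdir: d_off = %lld, d_name = \"%s\"\n",
                (long long)entry->d_off, entry->d_name);
        fprintf(out, "telldir: %ld\n", telldir(dir));
        count++;
    }

    closedir(dir);
    return count;
}
```

      Calling sketch_dump_dir_offsets(".", stdout) from main() inside the
      mounted directory prints one readdir/telldir pair per entry, which
      makes repeated offsets easy to spot.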
  6. 11 Sep, 2011 11 commits
  7. 18 Aug, 2011 1 commit
  8. 17 Aug, 2011 10 commits
  9. 05 Aug, 2011 1 commit
  10. 01 Aug, 2011 9 commits