1. 01 Feb, 2013 7 commits
    • Btrfs: reduce lock contention on extent buffer locks · 242e18c7
      Chris Mason authored
      The extent buffers have a refs_lock which we use to coordinate freeing
      the extent buffer with operations on the radix tree.  On tree roots and
      other extent buffers that are very cache hot, this can be highly contended.
      
      These are also the extent buffers that are basically pinned in memory.
      This commit adds code to cmpxchg our way through the ref modifications,
      and as long as the result of the reference change is still pinned in
      ram, we skip the expensive spinlock.
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
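The lock-free fast path described in this commit message can be sketched in userspace C11 as follows. This is a minimal illustration, not the kernel code: `eb_sketch`, `put_ref_fast`, and the `PINNED_FLOOR` threshold are hypothetical names and values, and C11 atomics stand in for the kernel's cmpxchg primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical threshold: while the count stays above this value after
 * the drop, the buffer is assumed to remain pinned in memory. */
#define PINNED_FLOOR 3

/* Hypothetical stand-in for an extent buffer; only the refcount matters here. */
struct eb_sketch {
    atomic_int refs;
};

/*
 * Try to drop one reference without taking a lock.
 * Returns true if the reference was dropped on the fast path.
 * Returns false if the count is at or below PINNED_FLOOR, in which case
 * the caller would fall back to the locked slow path (not shown).
 */
static bool put_ref_fast(struct eb_sketch *eb)
{
    int old = atomic_load(&eb->refs);

    while (old > PINNED_FLOOR) {
        /* On failure, 'old' is reloaded with the current value and the
         * loop condition is re-checked before retrying. */
        if (atomic_compare_exchange_weak(&eb->refs, &old, old - 1))
            return true;
    }
    return false;
}
```

The point of the design is that the compare-and-swap only succeeds when nobody else changed the count in between, so the decision "still pinned after this change" is always made against the value that was actually replaced.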
    • Btrfs: fix cluster alignment for mount -o ssd · 8de972b4
      Chris Mason authored
      With the new raid56 code, we want to make sure we're
      properly aligning our allocation clusters with -o ssd
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • Btrfs: add a plugging callback to raid56 writes · 6ac0f488
      Chris Mason authored
      Buffered writes and DIRECT_IO writes will often break up
      big contiguous changes to the file into sub-stripe writes.
      
      This adds a plugging callback to gather those smaller writes into full
      stripe writes.
      
      Example on flash:
      
      fio job to do 64K writes in batches of 3 (which makes a full stripe):
      
      With plugging: 450MB/s
      Without plugging: 220MB/s
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
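The idea of gathering sub-stripe writes until they form a full stripe can be sketched as below. This is only an illustration of the bookkeeping, not the plugging callback itself: `plug_sketch` and its helpers are hypothetical, the 64KiB per-disk stripe length is taken from the RAID5/6 commit in this series, and three data stripes per full stripe is an assumption based on the fio example above.

```c
#include <stdbool.h>
#include <stdint.h>

#define STRIPE_LEN (64u * 1024u)  /* per-disk stripe length */
#define NR_DATA    3u             /* assumed number of data stripes */

/* Hypothetical record of which per-disk stripes of one full stripe
 * have been queued while the plug is held. */
struct plug_sketch {
    uint64_t stripe_start;  /* logical offset where the full stripe begins */
    unsigned covered;       /* bit i set: data stripe i has been queued */
};

static void plug_init(struct plug_sketch *p, uint64_t stripe_start)
{
    p->stripe_start = stripe_start;
    p->covered = 0;
}

/* Queue a write covering exactly one per-disk stripe.
 * Returns false if the write does not belong to this full stripe. */
static bool plug_add(struct plug_sketch *p, uint64_t off, uint32_t len)
{
    if (len != STRIPE_LEN || off < p->stripe_start)
        return false;
    if ((off - p->stripe_start) % STRIPE_LEN != 0)
        return false;

    uint64_t idx = (off - p->stripe_start) / STRIPE_LEN;
    if (idx >= NR_DATA)
        return false;

    p->covered |= 1u << idx;
    return true;
}

/* True once every data stripe is queued, so parity could be computed
 * without reading anything back from disk. */
static bool plug_is_full_stripe(const struct plug_sketch *p)
{
    return p->covered == (1u << NR_DATA) - 1u;
}
```

When the plug is released with all bits set, the batch can be issued as one full stripe write; otherwise it would fall back to read/modify/write.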
    • Btrfs: Add a stripe cache to raid56 · 4ae10b3a
      Chris Mason authored
      The stripe cache allows us to avoid extra read/modify/write cycles
      by caching the pages we read off the disk.  Pages are cached when:
      
      * They are read in during a read/modify/write cycle
      
      * They are written during a read/modify/write cycle
      
      * They are involved in a parity rebuild
      
      Pages are not cached if we're doing a full stripe write.  We're
      assuming that a full stripe write won't be followed by another
      partial stripe write any time soon.
      
      This provides a substantial boost in performance for workloads that
      synchronously modify adjacent offsets in the file, and for the parity
      rebuild use case in general.
      
      The size of the stripe cache isn't tunable (yet) and is set at 1024
      entries.
      
      Example on flash: dd if=/dev/zero of=/mnt/xxx bs=4K oflag=direct
      
      Without the stripe cache: 2.1MB/s
      With the stripe cache: 21MB/s
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • Btrfs: RAID5 and RAID6 · 53b381b3
      David Woodhouse authored
      This builds on David Woodhouse's original Btrfs raid5/6 implementation.
      The code has changed quite a bit, blame Chris Mason for any bugs.
      
      Read/modify/write is done after the higher levels of the filesystem have
      prepared a given bio.  This means the higher layers are not responsible
      for building full stripes, and they don't need to query for the topology
      of the extents that may get allocated during delayed allocation runs.
      It also means different files can easily share the same stripe.
      
      But, it does expose us to incorrect parity if we crash or lose power
      while doing a read/modify/write cycle.  This will be addressed in a
      later commit.
      
      Scrub is unable to repair crc errors on raid5/6 chunks.
      
      Discard does not work on raid5/6 (yet)
      
      The stripe size is fixed at 64KiB per disk.  This will be tunable
      in a later commit.
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
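To make the read/modify/write exposure concrete, here is a minimal sketch of RAID5-style XOR parity. It is generic parity arithmetic, not code from this commit, and the function names are hypothetical. The incremental update shown is one common way to do read/modify/write; the commit may instead recompute parity from the whole stripe.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute parity as the bytewise XOR of all data stripes. */
static void xor_parity(uint8_t *parity, const uint8_t *const *data,
                       size_t nr_data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t p = 0;
        for (size_t d = 0; d < nr_data; d++)
            p ^= data[d][i];
        parity[i] = p;
    }
}

/*
 * Read/modify/write update of parity for one changed data stripe:
 *   new_parity = old_parity ^ old_data ^ new_data
 * The data stripe and the parity stripe are written separately, so a
 * crash between the two writes leaves parity that no longer matches
 * the data on disk.
 */
static void rmw_update_parity(uint8_t *parity, const uint8_t *old_data,
                              const uint8_t *new_data, size_t len)
{
    for (size_t i = 0; i < len; i++)
        parity[i] ^= old_data[i] ^ new_data[i];
}
```

The incremental update gives the same parity as recomputing it from scratch, which is why the two writes must both land for the stripe to stay consistent.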
    • Btrfs: add rw argument to merge_bio_hook() · 64a16701
      David Woodhouse authored
      We'll want to merge writes so they can fill a full RAID[56] stripe, but
      not necessarily reads.
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • btrfs: don't try to notify udev about missing devices · 3c911608
      Eric Sandeen authored
      If we remove a missing device, bdev is null, and if we
      send that off to btrfs_kobject_uevent we'll panic.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
  2. 19 Dec, 2012 1 commit
  3. 18 Dec, 2012 1 commit
  4. 17 Dec, 2012 31 commits