1. 04 Jun, 2009 2 commits
    • Btrfs: Fix oops and use after free during space balancing · 44fb5511
      Chris Mason authored
      The btrfs allocator uses list_for_each to walk the available block
      groups when searching for free blocks.  It starts off with a hint
      to help find the best block group for a given allocation.
      
      The hint is resolved into a block group, but we don't properly check
      to make sure the block group we find isn't in the middle of being
      freed due to filesystem shrinking or balancing.  If it is being
      freed, the list pointers in it are bogus and can't be trusted.  But,
      the code happily goes along and uses them in the list_for_each loop,
      leading to all kinds of fun.
      
      The fix used here is to check to make sure the block group we find really
      is on the list before we use it.  list_del_init is used when removing
      it from the list, so we can do a proper check.
      
      The allocation clustering code has a similar bug where it will trust
      the block group in the current free space cluster.  If our allocation
      flags have changed (going from single spindle dup to raid1 for example)
      because the drives in the FS have changed, we're not allowed to use
      the old block group any more.
      
      The fix used here is to check the current cluster against the
      current allocation flags.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
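The two checks described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions, not the btrfs code: `struct block_group`, `hint_is_usable`, `group_matches_flags`, and `pick_start` are hypothetical names, and the list helpers are small stand-ins for the kernel's `list_head` API. The point it shows is that an entry removed with a `list_del_init`-style helper points back at itself, so an emptiness test on the entry tells us whether it is still linked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* True when the node points at itself, i.e. it is not linked anywhere. */
static bool list_is_empty(const struct list_head *h)
{
    return h->next == h;
}

static void list_add_tail_entry(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Unlink the entry and re-initialise it so it points at itself again. */
static void list_del_init_entry(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    init_list_head(entry);
}

/* Hypothetical block group: only the fields this sketch needs. */
struct block_group {
    struct list_head list;
    unsigned int flags; /* allocation profile bits, made up for the sketch */
    int id;
};

/* A hinted group may only be used if it is still linked on a list. */
static bool hint_is_usable(const struct block_group *bg)
{
    return bg != NULL && !list_is_empty(&bg->list);
}

/* A cached group may only be reused if it has every requested flag. */
static bool group_matches_flags(const struct block_group *bg, unsigned int want)
{
    return (bg->flags & want) == want;
}

/* Start from the hint when it is safe, else from the head of the list. */
static struct block_group *pick_start(struct list_head *groups,
                                      struct block_group *hint)
{
    if (hint_is_usable(hint))
        return hint;
    if (list_is_empty(groups))
        return NULL;
    return (struct block_group *)((char *)groups->next -
                                  offsetof(struct block_group, list));
}
```

A caller would run `pick_start` before walking the list, and `group_matches_flags` before trusting a group remembered by an allocation cluster.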
    • Btrfs: set device->total_disk_bytes when adding new device · 2cc3c559
      Yan Zheng authored
      It was not being properly initialized, and so the size saved to
      disk was not correct.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  2. 14 May, 2009 6 commits
  3. 27 Apr, 2009 8 commits
  4. 24 Apr, 2009 6 commits
  5. 21 Apr, 2009 1 commit
    • Btrfs: fix btrfs fallocate oops and deadlock · 546888da
      Chris Mason authored
      Btrfs fallocate was incorrectly starting a transaction with a lock held
      on the extent_io tree for the file, which could deadlock.  Strictly
      speaking it was using join_transaction, which would be safe, but it is better
      to move the transaction outside of the lock.
      
      When preallocated extents are overwritten, btrfs_mark_buffer_dirty was
      being called on an unlocked buffer.  This was triggering an assertion and
      oops because the lock is supposed to be held.
      
      The bug was calling btrfs_mark_buffer_dirty on a leaf after btrfs_del_item had
      been run.  btrfs_del_item takes care of dirtying things, so the solution is
      to skip the btrfs_mark_buffer_dirty call in this case.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  6. 20 Apr, 2009 4 commits
    • Btrfs: use the right node in reada_for_balance · 8c594ea8
      Chris Mason authored
      reada_for_balance was using the wrong index into the path node array,
      so it wasn't reading the right blocks.  We never directly used the
      results of the read done by this function because the btree search is
      started over at the end.
      
      This fixes reada_for_balance to do readahead in the correct node and to
      avoid searching past the last slot in the node.  It also makes sure to
      hold the parent lock while we are finding the nodes to read.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
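The "avoid searching past the last slot" part of this fix comes down to a bounds check when choosing which neighbours of a slot to read ahead. The sketch below shows the idea under assumptions: `struct parent_node` and `siblings_to_read` are hypothetical, the node holds a small fixed array of block pointers, and the locking mentioned in the commit is left out.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SLOTS 8

/* Hypothetical parent node: an item count and one block pointer per slot. */
struct parent_node {
    int nritems;
    unsigned long long blockptr[SKETCH_MAX_SLOTS];
};

/*
 * Collect the block pointers of the left and right neighbours of `slot`.
 * Returns how many pointers were written to `out` (0, 1 or 2).
 * Neighbours outside [0, nritems) are never touched.
 */
static int siblings_to_read(const struct parent_node *parent, int slot,
                            unsigned long long out[2])
{
    int n = 0;

    if (slot < 0 || slot >= parent->nritems)
        return 0;
    if (slot > 0)
        out[n++] = parent->blockptr[slot - 1];
    if (slot + 1 < parent->nritems)
        out[n++] = parent->blockptr[slot + 1];
    return n;
}
```

The first and last slots each have only one neighbour, so only one block is queued for them.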
    • Btrfs: fix oops on page->mapping->host during writepage · 11c8349b
      Chris Mason authored
      The extent_io writepage call updates the writeback index in the inode
      as it makes progress.  But, it was doing the update after unlocking the page,
      which isn't legal because page->mapping can't be trusted once the page
      is unlocked.
      
      This led to an oops, especially common with compression turned on.  The
      fix here is to update the writeback index before unlocking the page.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
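The ordering rule behind this fix (touch page->mapping only while the page is still locked) can be illustrated with a toy model. Everything here is an assumption made for the sketch: the structs are cut-down stand-ins, the lock is a plain flag rather than a real page lock, and `finish_writepage` is a hypothetical helper, not a function from extent_io.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures, for illustration only. */
struct mapping {
    unsigned long writeback_index;
};

struct page {
    struct mapping *mapping;
    unsigned long index;
    bool locked;
};

/* page->mapping is only stable while the page lock is held. */
static void update_writeback_index(struct page *page, unsigned long nr_written)
{
    assert(page->locked);
    page->mapping->writeback_index = page->index + nr_written;
}

static void unlock_page_sketch(struct page *page)
{
    page->locked = false;
}

/* Correct order: record progress first, drop the lock second. */
static void finish_writepage(struct page *page, unsigned long nr_written)
{
    update_writeback_index(page, nr_written);
    unlock_page_sketch(page);
}
```

Swapping the two calls in `finish_writepage` would trip the assertion, which mirrors the bug: after the unlock, the mapping pointer can no longer be trusted.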
    • Btrfs: add a priority queue to the async thread helpers · d313d7a3
      Chris Mason authored
      Btrfs is using WRITE_SYNC_PLUG to send down synchronous IOs with a
      higher priority.  But, the checksumming helper threads prevent it
      from being fully effective.
      
      There are two problems.  First, a big queue of pending checksumming
      will delay the synchronous IO behind other lower priority writes.  Second,
      the checksumming uses an ordered async work queue.  The ordering makes sure
      that IOs are sent to the block layer in the same order they are sent
      to the checksumming threads.  Usually this gives us less seeky IO.
      
      But, when we start mixing IO priorities, the lower priority IO can delay
      the higher priority IO.
      
      This patch solves both problems by adding a high priority list to the async
      helper threads, and a new btrfs_set_work_high_prio(), which is used
      to put a new async work item onto the higher priority list.
      
      The ordering is still done on high priority IO, but all of the high
      priority bios are ordered separately from the low priority bios.  This
      ordering is purely an IO optimization; it is not involved in data
      or metadata integrity.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
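A minimal sketch of the two-list idea, assuming a single worker and no locking: high priority items are always picked first, and each list keeps its own submission order. The names `struct work_item`, `struct worker_queues`, `set_work_high_prio`, `queue_work_item`, and `next_work_item` are hypothetical and only loosely modelled on the helper named in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work item; `next` links items inside one FIFO. */
struct work_item {
    int id;
    int high_prio;
    struct work_item *next;
};

/* Simple singly linked FIFO. */
struct fifo {
    struct work_item *head, *tail;
};

/* One worker's pending work, split by priority. */
struct worker_queues {
    struct fifo prio;
    struct fifo normal;
};

static void fifo_push(struct fifo *q, struct work_item *w)
{
    w->next = NULL;
    if (q->tail)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
}

static struct work_item *fifo_pop(struct fifo *q)
{
    struct work_item *w = q->head;

    if (!w)
        return NULL;
    q->head = w->next;
    if (!q->head)
        q->tail = NULL;
    w->next = NULL;
    return w;
}

/* Mark an item before queueing it, in the spirit of the new helper. */
static void set_work_high_prio(struct work_item *w)
{
    w->high_prio = 1;
}

static void queue_work_item(struct worker_queues *wq, struct work_item *w)
{
    fifo_push(w->high_prio ? &wq->prio : &wq->normal, w);
}

/* High priority work is drained first; order within a list is preserved. */
static struct work_item *next_work_item(struct worker_queues *wq)
{
    struct work_item *w = fifo_pop(&wq->prio);

    return w ? w : fifo_pop(&wq->normal);
}
```

With items 0 to 3 queued in order and items 1 and 3 marked high priority, the worker would see them as 1, 3, 0, 2.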
    • Btrfs: use WRITE_SYNC for synchronous writes · ffbd517d
      Chris Mason authored
      Part of reducing fsync/O_SYNC/O_DIRECT latencies is using WRITE_SYNC for
      writes we plan on waiting on in the near future.  This patch
      mirrors recent changes in other filesystems and the generic code to
      use WRITE_SYNC when WB_SYNC_ALL is passed and to use WRITE_SYNC for
      other latency critical writes.
      
      Btrfs uses async worker threads for checksumming before the write is done,
      and then again to actually submit the bios.  The bio submission code just
      runs a per-device list of bios that need to be sent down the pipe.
      
      This list is split into low priority and high priority lists so the
      WRITE_SYNC IO happens first.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
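The flag selection described in this commit can be sketched as a small mapping from the writeback mode to a write flag, which in turn decides which per-device list a bio lands on. The enums and function names below are assumptions for illustration (they are prefixed with `SKETCH_` or `sketch_` to avoid implying they are the kernel's definitions), and the per-device lists are reduced to simple counters.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins, not the kernel's own constants. */
enum sketch_sync_mode { SKETCH_WB_SYNC_NONE, SKETCH_WB_SYNC_ALL };
enum sketch_write_flag { SKETCH_WRITE, SKETCH_WRITE_SYNC };

/* Writes the caller will wait on soon get the synchronous flag. */
static enum sketch_write_flag pick_write_flag(enum sketch_sync_mode mode)
{
    return mode == SKETCH_WB_SYNC_ALL ? SKETCH_WRITE_SYNC : SKETCH_WRITE;
}

/* Per-device pending bios, reduced to counters for the sketch. */
struct sketch_device {
    int pending_sync;
    int pending_regular;
};

/* Synchronous bios go on their own list so they can be sent first. */
static void sketch_queue_bio(struct sketch_device *dev,
                             enum sketch_write_flag flag)
{
    if (flag == SKETCH_WRITE_SYNC)
        dev->pending_sync++;
    else
        dev->pending_regular++;
}
```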
  7. 14 Apr, 2009 13 commits