1. 06 Jul, 2011 3 commits
    • btrfs: fix oops when doing space balance · 149e2d76
      Miao Xie authored
      We need to make sure the data relocation inode doesn't go through
      the delayed metadata updates, otherwise we get an oops during balance:
      
      kernel BUG at fs/btrfs/relocation.c:4303!
      [SNIP]
      Call Trace:
       [<ffffffffa03143fd>] ? update_ref_for_cow+0x22d/0x330 [btrfs]
       [<ffffffffa0314951>] __btrfs_cow_block+0x451/0x5e0 [btrfs]
       [<ffffffffa031355d>] ? read_block_for_search+0x14d/0x4d0 [btrfs]
       [<ffffffffa0314beb>] btrfs_cow_block+0x10b/0x240 [btrfs]
       [<ffffffffa031acae>] btrfs_search_slot+0x49e/0x7a0 [btrfs]
       [<ffffffffa032d8af>] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
       [<ffffffff8147bf0e>] ? mutex_lock+0x1e/0x50
       [<ffffffffa0380cf1>] btrfs_update_delayed_inode+0x71/0x160 [btrfs]
       [<ffffffffa037ff27>] ? __btrfs_release_delayed_node+0x67/0x190 [btrfs]
       [<ffffffffa0381cf8>] btrfs_run_delayed_items+0xe8/0x120 [btrfs]
       [<ffffffffa03365e0>] btrfs_commit_transaction+0x250/0x850 [btrfs]
       [<ffffffff810f91d9>] ? find_get_pages+0x39/0x130
       [<ffffffffa0336cd5>] ? join_transaction+0x25/0x250 [btrfs]
       [<ffffffff81081de0>] ? wake_up_bit+0x40/0x40
       [<ffffffffa03785fa>] prepare_to_relocate+0xda/0xf0 [btrfs]
       [<ffffffffa037f2bb>] relocate_block_group+0x4b/0x620 [btrfs]
       [<ffffffffa0334cf5>] ? btrfs_clean_old_snapshots+0x35/0x150 [btrfs]
       [<ffffffffa037fa43>] btrfs_relocate_block_group+0x1b3/0x2e0 [btrfs]
       [<ffffffffa0368ec0>] ? btrfs_tree_unlock+0x50/0x50 [btrfs]
       [<ffffffffa035e39b>] btrfs_relocate_chunk+0x8b/0x670 [btrfs]
       [<ffffffffa031303d>] ? btrfs_set_path_blocking+0x3d/0x50 [btrfs]
       [<ffffffffa03577d8>] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
       [<ffffffffa031bea1>] ? btrfs_previous_item+0xb1/0x150 [btrfs]
       [<ffffffffa03577d8>] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
       [<ffffffffa035f5aa>] btrfs_balance+0x21a/0x2b0 [btrfs]
       [<ffffffffa0368898>] btrfs_ioctl+0x798/0xd20 [btrfs]
       [<ffffffff8111e358>] ? handle_mm_fault+0x148/0x270
       [<ffffffff814809e8>] ? do_page_fault+0x1d8/0x4b0
       [<ffffffff81160d6a>] do_vfs_ioctl+0x9a/0x540
       [<ffffffff811612b1>] sys_ioctl+0xa1/0xb0
       [<ffffffff81484ec2>] system_call_fastpath+0x16/0x1b
      [SNIP]
      RIP  [<ffffffffa037c1cc>] btrfs_reloc_cow_block+0x22c/0x270 [btrfs]
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: don't panic if we get an error while balancing V2 · 508794eb
      Josef Bacik authored
      A user reported an error where if we try to balance an fs after a device has
      been removed it will blow up.  This is because we get an EIO back and this is
      where BUG_ON(ret) bites us in the ass.  To fix we just exit.  Thanks,
      Reported-by: Anand Jain <Anand.Jain@oracle.com>
      Signed-off-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
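      As a rough illustration of 508794eb above (not the actual patch), the
      sketch below stops at the first failure and returns the error to the
      caller instead of asserting that the call cannot fail. The names
      relocate_chunk_sketch and balance_sketch are hypothetical, and the -EIO
      condition is simulated with a flag per chunk.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for relocating one chunk: fails with -EIO
 * when the backing device is gone (simulated by a flag). */
static int relocate_chunk_sketch(int device_present)
{
    return device_present ? 0 : -EIO;
}

/* Stop at the first failure and hand the error back to the caller,
 * rather than asserting on it (the BUG_ON(ret) pattern). */
static int balance_sketch(const int *device_present, int nr_chunks)
{
    for (int i = 0; i < nr_chunks; i++) {
        int ret = relocate_chunk_sketch(device_present[i]);

        if (ret) {
            fprintf(stderr, "chunk %d: error %d, stopping balance\n", i, ret);
            return ret;
        }
    }
    return 0;
}
```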
    • btrfs: add missing options displayed in mount output · 0942caa3
      David Sterba authored
      Three user-settable mount options are currently missing from the
      mount output. Display them.
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  2. 27 Jun, 2011 1 commit
    • btrfs: fix inconsonant inode information · 2f7e33d4
      Miao Xie authored
      When iputting the inode, we may leave its delayed node in place if it
      still has delayed items that have not been dealt with. So when the inode
      is read again, we must look up the corresponding delayed node and use the
      information in it to initialize the inode. Otherwise we get inconsistent
      inode information, which may cause the same directory index number to be
      allocated again and trigger the following oops:
      
      [ 5447.554187] err add delayed dir index item(name: pglog_0.965_0) into the
      insertion tree of the delayed node(root id: 262, inode id: 258, errno: -17)
      [ 5447.569766] ------------[ cut here ]------------
      [ 5447.575361] kernel BUG at fs/btrfs/delayed-inode.c:1301!
      [SNIP]
      [ 5447.790721] Call Trace:
      [ 5447.793191]  [<ffffffffa0641c4e>] btrfs_insert_dir_item+0x189/0x1bb [btrfs]
      [ 5447.800156]  [<ffffffffa0651a45>] btrfs_add_link+0x12b/0x191 [btrfs]
      [ 5447.806517]  [<ffffffffa0651adc>] btrfs_add_nondir+0x31/0x58 [btrfs]
      [ 5447.812876]  [<ffffffffa0651d6a>] btrfs_create+0xf9/0x197 [btrfs]
      [ 5447.818961]  [<ffffffff8111f840>] vfs_create+0x72/0x92
      [ 5447.824090]  [<ffffffff8111fa8c>] do_last+0x22c/0x40b
      [ 5447.829133]  [<ffffffff8112076a>] path_openat+0xc0/0x2ef
      [ 5447.834438]  [<ffffffff810c58e2>] ? __perf_event_task_sched_out+0x24/0x44
      [ 5447.841216]  [<ffffffff8103ecdd>] ? perf_event_task_sched_out+0x59/0x67
      [ 5447.847846]  [<ffffffff81121a79>] do_filp_open+0x3d/0x87
      [ 5447.853156]  [<ffffffff811e126c>] ? strncpy_from_user+0x43/0x4d
      [ 5447.859072]  [<ffffffff8111f1f5>] ? getname_flags+0x2e/0x80
      [ 5447.864636]  [<ffffffff8111f179>] ? do_getname+0x14b/0x173
      [ 5447.870112]  [<ffffffff8111f1b7>] ? audit_getname+0x16/0x26
      [ 5447.875682]  [<ffffffff8112b1ab>] ? spin_lock+0xe/0x10
      [ 5447.880882]  [<ffffffff81112d39>] do_sys_open+0x69/0xae
      [ 5447.886153]  [<ffffffff81112db1>] sys_open+0x20/0x22
      [ 5447.891114]  [<ffffffff813b9aab>] system_call_fastpath+0x16/0x1b
      
      Fix it by reusing the old delayed node.
      Reported-by: Jim Schutt <jaschut@sandia.gov>
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Tested-by: Jim Schutt <jaschut@sandia.gov>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  3. 25 Jun, 2011 2 commits
  4. 24 Jun, 2011 1 commit
  5. 17 Jun, 2011 7 commits
    • Btrfs: avoid delayed metadata items during commits · e999376f
      Chris Mason authored
      Snapshot creation has two phases.  One is the initial snapshot setup,
      and the second is done during commit, while nobody is allowed to modify
      the root we are snapshotting.
      
      The delayed metadata insertion code can break that rule: it does a
      delayed inode update on the inode of the parent of the snapshot,
      and a delayed directory item insertion.
      
      This makes sure to run the pending delayed operations before we
      record the snapshot root, which avoids corruptions.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • btrfs: fix uninitialized return value · 35a30d7c
      David Sterba authored
      When allocation fails in btrfs_read_fs_root_no_name, ret is not set
      although it is returned, holding a garbage value.
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
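      The bug class described in 35a30d7c above can be sketched in userspace
      C as follows. This is only an illustration under assumptions:
      read_root_sketch, struct root_sketch and the simulate_alloc_failure
      flag are hypothetical, and the real function's signature is not
      reproduced here. The point is that the failure path must assign the
      error code before jumping to the common exit.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical structure standing in for a filesystem root. */
struct root_sketch {
    int objectid;
};

/* Allocate and fill a root. On allocation failure the error code is
 * assigned explicitly before the jump to the common exit label;
 * leaving that assignment out is the bug class being illustrated. */
static int read_root_sketch(int simulate_alloc_failure, struct root_sketch **out)
{
    struct root_sketch *root;
    int ret = 0;

    root = simulate_alloc_failure ? NULL : malloc(sizeof(*root));
    if (!root) {
        ret = -ENOMEM;
        goto out;
    }
    root->objectid = 5;
    *out = root;
out:
    return ret;
}
```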
    • btrfs: fix wrong reservation when doing delayed inode operations · 19fd2949
      Miao Xie authored
      We have migrated the space for the delayed inode items from
      trans_block_rsv to global_block_rsv, but we forgot to set trans->block_rsv to
      global_block_rsv when doing delayed inode operations, and the following
      warning was triggered:
      
      [ 9792.654889] ------------[ cut here ]------------
      [ 9792.654898] WARNING: at fs/btrfs/extent-tree.c:5681
      btrfs_alloc_free_block+0xca/0x27c [btrfs]()
      [ 9792.654899] Hardware name: To Be Filled By O.E.M.
      [ 9792.654900] Modules linked in: btrfs zlib_deflate libcrc32c
      ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables
      arc4 rt61pci rt2x00pci rt2x00lib snd_hda_codec_hdmi mac80211
      snd_hda_codec_realtek cfg80211 snd_hda_intel edac_core snd_seq rfkill
      pcspkr serio_raw snd_hda_codec eeprom_93cx6 edac_mce_amd sp5100_tco
      i2c_piix4 k10temp snd_hwdep snd_seq_device snd_pcm floppy r8169 xhci_hcd
      mii snd_timer snd soundcore snd_page_alloc ipv6 firewire_ohci pata_acpi
      ata_generic firewire_core pata_via crc_itu_t radeon ttm drm_kms_helper
      drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
      [ 9792.654919] Pid: 2762, comm: rm Tainted: G        W   2.6.39+ #1
      [ 9792.654920] Call Trace:
      [ 9792.654922]  [<ffffffff81053c4a>] warn_slowpath_common+0x83/0x9b
      [ 9792.654925]  [<ffffffff81053c7c>] warn_slowpath_null+0x1a/0x1c
      [ 9792.654933]  [<ffffffffa038e747>] btrfs_alloc_free_block+0xca/0x27c [btrfs]
      [ 9792.654945]  [<ffffffffa03b8562>] ? map_extent_buffer+0x6e/0xa8 [btrfs]
      [ 9792.654953]  [<ffffffffa038189b>] __btrfs_cow_block+0xfc/0x30c [btrfs]
      [ 9792.654963]  [<ffffffffa0396aa6>] ? btrfs_buffer_uptodate+0x47/0x58 [btrfs]
      [ 9792.654970]  [<ffffffffa0382e48>] ? read_block_for_search+0x94/0x368 [btrfs]
      [ 9792.654978]  [<ffffffffa0381ba9>] btrfs_cow_block+0xfe/0x146 [btrfs]
      [ 9792.654986]  [<ffffffffa03848b0>] btrfs_search_slot+0x14d/0x4b6 [btrfs]
      [ 9792.654997]  [<ffffffffa03b8562>] ? map_extent_buffer+0x6e/0xa8 [btrfs]
      [ 9792.655022]  [<ffffffffa03938e8>] btrfs_lookup_inode+0x2f/0x8f [btrfs]
      [ 9792.655025]  [<ffffffff8147afac>] ? _cond_resched+0xe/0x22
      [ 9792.655027]  [<ffffffff8147b892>] ? mutex_lock+0x29/0x50
      [ 9792.655039]  [<ffffffffa03d41b1>] btrfs_update_delayed_inode+0x72/0x137 [btrfs]
      [ 9792.655051]  [<ffffffffa03d4ea2>] btrfs_run_delayed_items+0x90/0xdb [btrfs]
      [ 9792.655062]  [<ffffffffa039a69b>] btrfs_commit_transaction+0x228/0x654 [btrfs]
      [ 9792.655064]  [<ffffffff8106e8da>] ? remove_wait_queue+0x3a/0x3a
      [ 9792.655075]  [<ffffffffa03a2fa5>] btrfs_evict_inode+0x14d/0x202 [btrfs]
      [ 9792.655077]  [<ffffffff81132bd6>] evict+0x71/0x111
      [ 9792.655079]  [<ffffffff81132de0>] iput+0x12a/0x132
      [ 9792.655081]  [<ffffffff8112aa3a>] do_unlinkat+0x106/0x155
      [ 9792.655083]  [<ffffffff81127b83>] ? path_put+0x1f/0x23
      [ 9792.655085]  [<ffffffff8109c53c>] ? audit_syscall_entry+0x145/0x171
      [ 9792.655087]  [<ffffffff81128410>] ? putname+0x34/0x36
      [ 9792.655090]  [<ffffffff8112b441>] sys_unlinkat+0x29/0x2b
      [ 9792.655092]  [<ffffffff81482c42>] system_call_fastpath+0x16/0x1b
      [ 9792.655093] ---[ end trace 02b696eb02b3f768 ]---
      
      This patch fixes it by setting the reservation of the transaction handle to the
      correct one.
      Reported-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • btrfs: Remove unused sysfs code · 9fe6a50f
      Maarten Lankhorst authored
      Removes code no longer used. The sysfs file itself is kept, because the
      btrfs developers expressed interest in putting new entries to sysfs.
      Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • btrfs: fix dereference of ERR_PTR value · 3ed4498c
      David Sterba authored
      smatch reports:
      
      btrfs_recover_log_trees error: 'wc.replay_dest' dereferencing
      possible ERR_PTR()
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
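      For readers unfamiliar with the warning in 3ed4498c above, here is a
      minimal userspace sketch of the ERR_PTR convention and the check that
      must precede a dereference. The three helpers imitate the kernel's
      ERR_PTR/PTR_ERR/IS_ERR but are re-implemented here; lookup_root_sketch,
      replay_sketch and struct root_sketch are hypothetical and do not
      reproduce the patched function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel helpers: an error code is
 * encoded as a pointer value in the last page of the address space. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical root structure and lookup that can fail. */
struct root_sketch {
    int objectid;
};

static struct root_sketch demo_root = { 42 };

static struct root_sketch *lookup_root_sketch(int fail)
{
    return fail ? ERR_PTR(-ENOENT) : &demo_root;
}

/* Check the returned pointer before touching its fields; without the
 * IS_ERR() test the error value would be dereferenced as a pointer. */
static int replay_sketch(int fail)
{
    struct root_sketch *dest = lookup_root_sketch(fail);

    if (IS_ERR(dest))
        return (int)PTR_ERR(dest);
    return dest->objectid;
}
```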
    • Merge branch 'for-chris' of... · e038dca8
      Chris Mason authored
      Merge branch 'for-chris' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work into for-linus
      
      Conflicts:
      	fs/btrfs/transaction.c
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: fix relocation races · 7585717f
      Chris Mason authored
      The recent commit to get rid of our trans_mutex introduced
      some races with block group relocation.  The problem is that relocation
      needs to do some record keeping about each root, and it was relying
      on the transaction mutex to coordinate things in subtle ways.
      
      This fix adds a mutex just for the relocation code and makes sure
      it doesn't have a big impact on normal operations.  The race is
      really fixed in btrfs_record_root_in_trans, which is where we
      step back and wait for the relocation code to finish accounting
      setup.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  6. 15 Jun, 2011 3 commits
    • Btrfs: set no_trans_join after trying to expand the transaction · ed0ca140
      Josef Bacik authored
      We can lock up if we try to allow new writers to join the transaction and we have
      flushoncommit set or have a pending snapshot.  This is because we set
      no_trans_join and then loop around and try to wait for ordered extents again.
      The problem is the ordered endio stuff needs to join the transaction, which it
      can't do because no_trans_join is set.  So instead wait until after this loop to
      set no_trans_join and then make sure to wait for num_writers == 1 in case
      anybody got started in between us exiting the loop and setting no_trans_join.
      This could easily be reproduced by mounting -o flushoncommit and running xfstest
      13.  It cannot be reproduced with this patch.  Thanks,
      Reported-by: Jim Schutt <jaschut@sandia.gov>
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: protect the pending_snapshots list with trans_lock · 8351583e
      Josef Bacik authored
      Currently there is nothing protecting the pending_snapshots list on the
      transaction.  We only hold the directory mutex that we are snapshotting and a
      read lock on the subvol_sem, so we could race with somebody else creating a
      snapshot in a different directory and end up with list corruption.  So protect
      this list with the trans_lock.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: fix path leakage on subvol deletion · 71d7aed0
      Josef Bacik authored
      The delayed ref patch accidentally removed the btrfs_free_path in
      btrfs_unlink_subvol. This puts it back so we don't leak a path.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
  7. 13 Jun, 2011 2 commits
  8. 11 Jun, 2011 1 commit
  9. 10 Jun, 2011 12 commits
  10. 09 Jun, 2011 2 commits
  11. 08 Jun, 2011 6 commits
    • Btrfs: fix duplicate checking logic · f6a39829
      Josef Bacik authored
      When merging my code into the integration test the second check for duplicate
      entries got screwed up.  This patch fixes it by dropping ret2 and just using ret
      for the return value, and checking if we got an error before adding the bitmap
      to the local list.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: fix the allocator loop logic · 723bda20
      Josef Bacik authored
      I was testing with empty_cluster = 0 to try and reproduce a problem and kept
      hitting early enospc panics.  This was because our loop logic was a little
      confused.  So this is what I did
      
      1) Make the loop variable the ultimate decider on whether we should loop again
      instead of checking to see if we had an uncached bg, empty size or empty
      cluster.
      
      2) Increment loop before checking to see what we are on to make the loop
      definitions make more sense.
      
      3) If we are on the chunk alloc loop don't set empty_size/empty_cluster to 0
      unless we didn't actually allocate a chunk.  If we did allocate a chunk we
      should be able to easily setup a new cluster so clearing
      empty_size/empty_cluster makes us less efficient.
      
      This kept me from hitting panics while trying to reproduce the other problem.
      Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
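      The three points listed in 723bda20 above can be sketched roughly as
      follows. This is an assumption-laden illustration, not the patched
      allocator: the stage names, struct alloc_state_sketch and
      advance_loop_sketch are hypothetical, and the chunk allocation result
      is passed in as a flag. It also assumes the final stage is the one that
      clears empty_size and empty_cluster.

```c
#include <assert.h>

/* Hypothetical search stages, ordered from cheapest to last resort. */
enum stage_sketch {
    STAGE_CACHING_NOWAIT,
    STAGE_CACHING_WAIT,
    STAGE_ALLOC_CHUNK,
    STAGE_NO_EMPTY_SIZE,
};

struct alloc_state_sketch {
    int loop;
    unsigned long empty_size;
    unsigned long empty_cluster;
};

/* Returns 1 if the caller should search again, 0 if it should give up. */
static int advance_loop_sketch(struct alloc_state_sketch *s, int chunk_allocated)
{
    /* (1) The loop counter alone decides whether to retry. */
    if (s->loop >= STAGE_NO_EMPTY_SIZE)
        return 0;

    /* (2) Advance first, then act on the stage just entered. */
    s->loop++;

    /* (3) At the chunk allocation stage, keep empty_size and
     * empty_cluster when a chunk was allocated; otherwise skip
     * straight to the last stage, which clears them. */
    if (s->loop == STAGE_ALLOC_CHUNK && !chunk_allocated)
        s->loop = STAGE_NO_EMPTY_SIZE;

    if (s->loop == STAGE_NO_EMPTY_SIZE) {
        s->empty_size = 0;
        s->empty_cluster = 0;
    }
    return 1;
}
```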
    • Btrfs: fix bitmap regression · 2cdc342c
      Josef Bacik authored
      In cleaning up the clustering code I accidentally introduced a regression by
      adding bitmap entries to the cluster rb tree.  The problem is if we've maxed out
      the number of bitmaps we can have for the block group we can only add free space
      to the bitmaps, but since the bitmap is on the cluster we can't find it and we
      try to create another one.  This would result in a panic because the total
      bitmaps was bigger than the max bitmaps that were allowed.  This patch fixes
      this by checking to see if we have a cluster, and then looking at the cluster rb
      tree to see if it has a bitmap entry and if it does and that space belongs to
      that bitmap, go ahead and add it to that bitmap.
      
      I could hit this panic every time with an fs_mark test within a couple of
      minutes.  With this patch I no longer hit the panic and fs_mark goes to
      completion.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: don't commit the transaction if we dont have enough pinned bytes · f2bb8f5c
      Josef Bacik authored
      I noticed when running an enospc test that we would get stuck committing the
      transaction in check_data_space even though we truly didn't have enough space.
      So check to see if bytes_pinned is bigger than num_bytes, if it's not don't
      commit the transaction.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: noinline the cluster searching functions · 3de85bb9
      Josef Bacik authored
      When profiling the find cluster code it's hard to tell where we are spending our
      time because the bitmap and non-bitmap functions get inlined by the compiler, so
      make that not happen.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
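      As a small illustration of the idea in 3de85bb9 above, the sketch below
      marks two helpers with the GCC/Clang noinline attribute so that each
      keeps its own symbol and shows up separately in a profile. The macro
      noinline_sketch and all function names are hypothetical, and the helper
      bodies are placeholders rather than the real cluster search.

```c
#include <assert.h>
#include <stddef.h>

/* GCC/Clang attribute that forbids inlining of the marked function. */
#define noinline_sketch __attribute__((noinline))

/* Placeholder for the bitmap path: count the set bits in a bitmap. */
static noinline_sketch size_t scan_bitmap_sketch(const unsigned char *bits, size_t len)
{
    size_t set = 0;

    for (size_t i = 0; i < len; i++)
        for (int b = 0; b < 8; b++)
            set += (bits[i] >> b) & 1u;
    return set;
}

/* Placeholder for the non-bitmap path: count extents of at least min_size. */
static noinline_sketch size_t scan_extents_sketch(const size_t *sizes, size_t n,
                                                  size_t min_size)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        if (sizes[i] >= min_size)
            hits++;
    return hits;
}

/* Caller: with both helpers kept out of line, a profiler attributes
 * time to each helper instead of folding it all into this function. */
static size_t find_cluster_sketch(const size_t *sizes, size_t n, size_t min_size,
                                  const unsigned char *bits, size_t len)
{
    return scan_extents_sketch(sizes, n, min_size) + scan_bitmap_sketch(bits, len);
}
```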
    • Btrfs: cache bitmaps when searching for a cluster · 86d4a77b
      Josef Bacik authored
      If we are looking for a cluster in a particularly sparse or fragmented block
      group, we will do a lot of looping through the free space tree looking for
      various things, and if we need to look at bitmaps we will end up doing the whole
      dance twice.  So instead add the bitmap entries to a temporary list so if we
      have to do the bitmap search we can just look through the list of entries we've
      found quickly instead of having to loop through the entire tree again.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>