1. 14 Apr, 2015 1 commit
    • Ryusuke Konishi's avatar
      nilfs2: fix deadlock of segment constructor over I_SYNC flag · 28cd54f2
      Ryusuke Konishi authored
      commit 7ef3ff2f
      
       upstream.
      
      Nilfs2 eventually hangs in a stress test with fsstress program.  This
      issue was caused by the following deadlock over I_SYNC flag between
      nilfs_segctor_thread() and writeback_sb_inodes():
      
        nilfs_segctor_thread()
          nilfs_segctor_thread_construct()
            nilfs_segctor_unlock()
              nilfs_dispose_list()
                iput()
                  iput_final()
                    evict()
                      inode_wait_for_writeback()  * wait for I_SYNC flag
      
        writeback_sb_inodes()
           * set I_SYNC flag on inode->i_state
          __writeback_single_inode()
            do_writepages()
              nilfs_writepages()
                nilfs_construct_dsync_segment()
                  nilfs_segctor_sync()
                     * wait for completion of segment constructor
          inode_sync_complete()
             * clear I_SYNC flag after __writeback_single_inode() completed
      
      writeback_sb_inodes() calls do_writepages() for dirty inodes after
      setting I_SYNC flag on inode->i_state.  do_writepages() in turn calls
      nilfs_writepages(), which can run segment constructor and wait for its
      completion.  On the other hand, segment constructor calls iput(), which
      can call evict() and wait for the I_SYNC flag on
      inode_wait_for_writeback().
      
      Since segment constructor doesn't know when I_SYNC will be set, it
      cannot know whether iput() will block or not unless inode->i_nlink has a
      non-zero count.  We can prevent evict() from being called in iput() by
      implementing sop->drop_inode(), but it's not preferable to leave inodes
      with i_nlink == 0 for long periods because it even defers file
      truncation and inode deallocation.  So, this instead resolves the
      deadlock by calling iput() asynchronously with a workqueue for inodes
      with i_nlink == 0.
      Signed-off-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      28cd54f2
  2. 04 Jan, 2012 1 commit
  3. 01 Nov, 2011 1 commit
  4. 21 Jul, 2011 1 commit
    • Josef Bacik's avatar
      fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers · 02c24a82
      Josef Bacik authored
      
      Btrfs needs to be able to control how filemap_write_and_wait_range() is called
      in fsync to make it less of a painful operation, so push down taking i_mutex and
      the calling of filemap_write_and_wait() down into the ->fsync() handlers.  Some
      file systems can drop taking the i_mutex altogether it seems, like ext3 and
      ocfs2.  For correctness sake I just pushed everything down in all cases to make
      sure that we keep the current behavior the same for everybody, and then each
      individual fs maintainer can make up their mind about what to do from there.
      Thanks,
      Acked-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarJosef Bacik <josef@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      02c24a82
  5. 20 Jul, 2011 1 commit
  6. 27 May, 2011 1 commit
    • Christoph Hellwig's avatar
      fs: pass exact type of data dirties to ->dirty_inode · aa385729
      Christoph Hellwig authored
      
      Tell the filesystem if we just updated timestamp (I_DIRTY_SYNC) or
      anything else, so that the filesystem can track internally if it
      needs to push out a transaction for fdatasync or not.
      
      This is just the prototype change with no user for it yet.  I plan
      to push large XFS changes for the next merge window, and getting
      this trivial infrastructure in this window would help a lot to avoid
      tree interdependencies.
      
      Also remove incorrect comments that ->dirty_inode can't block.  That
      has been changed a long time ago, and many implementations rely on it.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      aa385729
  7. 10 May, 2011 2 commits
  8. 30 Mar, 2011 1 commit
  9. 09 Mar, 2011 2 commits
    • Ryusuke Konishi's avatar
      nilfs2: get rid of nilfs_sb_info structure · e3154e97
      Ryusuke Konishi authored
      
      This directly uses sb->s_fs_info to keep a nilfs filesystem object and
      fully removes the intermediate nilfs_sb_info structure.  With this
      change, the hierarchy of on-memory structures of nilfs will be
      simplified as follows:
      
      Before:
        super_block
             -> nilfs_sb_info
                   -> the_nilfs
                         -> cptree --+-> nilfs_root (current file system)
                                     +-> nilfs_root (snapshot A)
                                     +-> nilfs_root (snapshot B)
                                     :
                   -> nilfs_sc_info (log writer structure)
      After:
        super_block
             -> the_nilfs
                   -> cptree --+-> nilfs_root (current file system)
                               +-> nilfs_root (snapshot A)
                               +-> nilfs_root (snapshot B)
                               :
                   -> nilfs_sc_info (log writer structure)
      
      The reason why we didn't design so from the beginning is because the
      initial shape also differed from the above.  The early hierachy was
      composed of "per-mount-point" super_block -> nilfs_sb_info pairs and a
      shared nilfs object.  On the kernel 2.6.37, it was changed to the
      current shape in order to unify super block instances into one per
      device, and this cleanup became applicable as the result.
      Signed-off-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      e3154e97
    • Ryusuke Konishi's avatar
      nilfs2: use sb instance instead of nilfs_sb_info struct · f7545144
      Ryusuke Konishi authored
      
      This replaces sbi uses with direct reference to sb instance.
      Signed-off-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      f7545144
  10. 08 Mar, 2011 3 commits
  11. 10 Jan, 2011 3 commits
  12. 07 Jan, 2011 1 commit
  13. 23 Oct, 2010 9 commits
  14. 09 Aug, 2010 1 commit
  15. 23 Jul, 2010 6 commits
  16. 28 May, 2010 1 commit
  17. 03 Mar, 2010 1 commit
  18. 01 Oct, 2009 1 commit
  19. 22 Sep, 2009 2 commits
  20. 24 Jun, 2009 1 commit