  1. 04 Sep, 2015 1 commit
    • ocfs2: fix race between dio and recover orphan · 512f62ac
      Joseph Qi authored
      During direct io the inode is first added to the orphan dir and then
      deleted from it.  There is a race window in which the orphan entry can
      be deleted twice, which triggers the BUG when validating
      OCFS2_DIO_ORPHANED_FL in ocfs2_del_inode_from_orphan.
      
      ocfs2_direct_IO_write
          ...
          ocfs2_add_inode_to_orphan
          >>>>>>>> race window.
                   1) another node may rm the file and then go down; this node
                   then takes care of orphan recovery and clears the
                   OCFS2_DIO_ORPHANED_FL flag.
                   2) since the rw lock is unlocked, it may race with another
                   orphan recovery and an append dio.
          ocfs2_del_inode_from_orphan
      
      So take the inode mutex when recovering orphans, and do the rw unlock at
      the end of the aio write in the append dio case.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Reported-by: Yiwen Jiang <jiangyiwen@huawei.com>
      Cc: Weiwei Wang <wangww631@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 21 Apr, 2015 1 commit
    • Revert "ocfs2: incorrect check for debugfs returns" · 8f443e23
      Linus Torvalds authored
      This reverts commit e2ac55b6.
      
      Huang Ying reports that this causes a hang at boot with debugfs disabled.
      
      It is true that the debugfs error checks are kind of confusing, and this
      code certainly merits more cleanup and thinking about it, but there's
      something wrong with the trivial "check not just for NULL, but for error
      pointers too" patch.
      
      Yes, with debugfs disabled, we will end up setting the o2hb_debug_dir
      pointer variable to an error pointer (-ENODEV), and then continue as if
      everything was fine.  But since debugfs is disabled, all the _users_ of
      that pointer end up being compiled away, so even though the pointer can
      not be dereferenced, that's still fine.
      
      So it's confusing and somewhat questionable, but the "more correct"
      error checks end up causing more trouble than they fix.
      Reported-by: Huang Ying <ying.huang@intel.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Chengyu Song <csong84@gatech.edu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
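
      The distinction between a NULL check and an error-pointer check can be
      illustrated with a small userspace sketch.  It is only an analogy, not
      the ocfs2 code: err_ptr(), ptr_err() and is_err() are simplified
      stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), and
      create_dir_stub() is a hypothetical substitute for debugfs_create_dir().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno value that is encoded in a pointer (same bound the kernel uses). */
#define MAX_ERRNO 4095

/* Simplified stand-in for ERR_PTR(): encode a negative errno as a pointer. */
static void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Simplified stand-in for PTR_ERR(): recover the errno from an error pointer. */
static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Simplified stand-in for IS_ERR(): the top MAX_ERRNO addresses are errors. */
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical substitute for debugfs_create_dir(): with "debugfs" disabled
 * it returns an error pointer, which is not NULL.
 */
static void *create_dir_stub(bool debugfs_enabled)
{
	static int fake_dentry;

	return debugfs_enabled ? (void *)&fake_dentry : err_ptr(-ENODEV);
}

/* The NULL-only check: an error pointer slips through as "success". */
static bool null_only_check_fails(const void *dir)
{
	return dir == NULL;
}

/* The stricter check that e2ac55b6 introduced and this commit reverts. */
static bool strict_check_fails(const void *dir)
{
	return dir == NULL || is_err(dir);
}
```

      With the stub disabled, null_only_check_fails() reports success while
      strict_check_fails() reports failure, which is the behavioural change
      that turned a harmless error pointer into a boot-time failure path.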
  3. 14 Apr, 2015 4 commits
    • cleancache: zap uuid arg of cleancache_init_shared_fs · 9de16262
      Vladimir Davydov authored
      Use super_block->s_uuid instead.  Every shared filesystem using cleancache
      must now initialize super_block->s_uuid before calling
      cleancache_init_shared_fs.  The only such filesystem in the tree, ocfs2,
      already meets this requirement.
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Stefan Hengelein <ilendir@googlemail.com>
      Cc: Florian Schmaus <fschmaus@gmail.com>
      Cc: Andor Daam <andor.daam@googlemail.com>
      Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
      Cc: Bob Liu <lliubbo@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: copy fs uuid to superblock · 58be19dc
      Vladimir Davydov authored
      Currently, the maximum number of cleancache-enabled filesystems is 32,
      which is insufficient nowadays, because a Linux host can have hundreds
      of containers on board, each of which might want its own filesystem.
      This patch set aims to remove this limitation - see patch 4 for
      more details.  Patches 1-3 prepare the code for this change.
      
      This patch (of 4):
      
      This will allow us to remove the uuid argument from
      cleancache_init_shared_fs.
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Stefan Hengelein <ilendir@googlemail.com>
      Cc: Florian Schmaus <fschmaus@gmail.com>
      Cc: Andor Daam <andor.daam@googlemail.com>
      Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
      Cc: Bob Liu <lliubbo@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: logging: remove static buffer, use vsprintf extension %pV · 1543306e
      Joe Perches authored
      Use the vsprintf %pV extension to avoid using a static buffer and remove
      the now unnecessary buffer.
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: incorrect check for debugfs returns · e2ac55b6
      Chengyu Song authored
      debugfs_create_dir and debugfs_create_file may return -ENODEV when debugfs
      is not configured, so the return value should be checked against
      ERROR_VALUE as well, otherwise the later dereference of the dentry pointer
      would crash the kernel.
      
      This patch tries to solve this problem by fixing certain checks. However,
      I have found that other call sites are protected by #ifdef CONFIG_DEBUG_FS.
      In the current implementation, if CONFIG_DEBUG_FS is defined, then the above
      two functions will never return any ERROR_VALUE. So another possible
      fix is to surround all the buggy checks/functions with the same
      #ifdef CONFIG_DEBUG_FS. But I'm not sure if this would break any functionality,
      as only OCFS2_FS_STATS declares a dependency on DEBUG_FS.
      Signed-off-by: Chengyu Song <csong84@gatech.edu>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 17 Feb, 2015 1 commit
  5. 10 Feb, 2015 1 commit
  6. 30 Jan, 2015 1 commit
    • ocfs2: Use generic helpers for quotaon and quotaoff · 664dbd5f
      Jan Kara authored
      Ocfs2 can just use the generic helpers provided by quota code for
      turning quotas on and off when quota files are stored as system inodes.
      The only difference is the feature test in ocfs2_quota_on() and that is
      covered by dquot_quota_enable() checking whether usage tracking is
      enabled (which can happen only if the filesystem has the quota feature
      set).
      Signed-off-by: Jan Kara <jack@suse.cz>
  7. 11 Dec, 2014 1 commit
  8. 10 Nov, 2014 1 commit
  9. 26 Sep, 2014 1 commit
  10. 17 Sep, 2014 1 commit
    • ocfs2: Don't use MAXQUOTAS value · 52362810
      Jan Kara authored
      MAXQUOTAS value defines maximum number of quota types VFS supports.
      This isn't necessarily the number of types ocfs2 supports and with
      addition of project quotas these two numbers stop matching. So make
      ocfs2 use its private definition.
      
      CC: Mark Fasheh <mfasheh@suse.com>
      CC: Joel Becker <jlbec@evilplan.org>
      CC: ocfs2-devel@oss.oracle.com
      Signed-off-by: Jan Kara <jack@suse.cz>
  11. 23 Jun, 2014 1 commit
  12. 04 Jun, 2014 2 commits
  13. 03 Apr, 2014 6 commits
    • ocfs2: avoid system inode ref confusion by adding mutex lock · 43b10a20
      jiangyiwen authored
      The following case may lead to reference count confusion on the same
      system inode.
      
      A thread                            B thread
      ocfs2_get_system_file_inode
      ->get_local_system_inode
      ->_ocfs2_get_system_file_inode
                                          because of *arr == NULL,
                                          ocfs2_get_system_file_inode
                                          ->get_local_system_inode
                                          ->_ocfs2_get_system_file_inode
      gets first ref thru
      _ocfs2_get_system_file_inode,
      gets second ref thru igrab and
      set *arr = inode
                                          at the moment, B thread also gets
                                          two refs, so lead to one more
                                          inode ref.
      
      So add a mutex lock so that multiple threads cannot each take the extra
      inode ref at the same time.
      Signed-off-by: jiangyiwen <jiangyiwen@huawei.com>
      Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
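
      The check-then-set pattern behind this race, and the effect of holding
      a mutex across it, can be sketched in userspace as follows.  This is an
      illustration under assumptions, not the ocfs2 implementation: struct
      fake_inode, get_system_inode() and put_system_inode() are hypothetical
      names, "cached" plays the role of *arr, and a pthread mutex stands in
      for the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode with a reference count. */
struct fake_inode {
	int refcount;
};

static struct fake_inode the_inode;
static struct fake_inode *cached;	/* plays the role of *arr */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Return the system inode with one reference for the caller.  The cache
 * slot holds one additional reference, which must be taken exactly once.
 * Holding cache_lock across the NULL check and the assignment prevents two
 * threads from both seeing cached == NULL and both taking the cache's
 * reference.
 */
static struct fake_inode *get_system_inode(void)
{
	struct fake_inode *inode;

	pthread_mutex_lock(&cache_lock);
	if (cached == NULL) {
		the_inode.refcount++;	/* reference owned by the cache slot */
		cached = &the_inode;
	}
	inode = cached;
	inode->refcount++;		/* reference owned by the caller */
	pthread_mutex_unlock(&cache_lock);

	return inode;
}

/* Drop a caller's reference. */
static void put_system_inode(struct fake_inode *inode)
{
	pthread_mutex_lock(&cache_lock);
	assert(inode->refcount > 0);
	inode->refcount--;
	pthread_mutex_unlock(&cache_lock);
}
```

      Without the lock, two threads that both observe cached == NULL would
      each take the cache's reference, leaving one reference that nothing
      ever drops.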
    • ocfs2: revert iput deferring code in ocfs2_drop_dentry_lock · 8ed6b237
      Goldwyn Rodrigues authored
      The following patches are reverted in this patch because these patches
      caused performance regression in the remote unlink() calls.
      
        ea455f8a - ocfs2: Push out dropping of dentry lock to ocfs2_wq
        f7b1aa69 - ocfs2: Fix deadlock on umount
        5fd13189 - ocfs2: Don't oops in ocfs2_kill_sb on a failed mount
      
      Previous patches in this series removed the possible deadlocks from
      downconvert thread so the above patches shouldn't be needed anymore.
      
      The regression is caused because these patches delay the iput() in case
      of dentry unlocks.  This also delays the unlocking of the open lockres.
      The open lockresource is required to test if the inode can be wiped from
      disk or not.  When the deleting node does not get the open lock, it
      marks it as orphan (even though it is not in use by another
      node/process) and causes a journal checkpoint.  This delays operations
      following the inode eviction.  This also moves the inode to the orphaned
      inode which further causes more I/O and a lot of unnecessary orphans.
      
      The following script can be used to generate the load causing issues:
      
        declare -a create
        declare -a copy
        declare -a remove
        declare -a iterations=(1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384)
        unique="`mktemp -u XXXXX`"
        script="/tmp/idontknow-${unique}.sh"
        cat <<EOF > "${script}"
        #!/bin/bash
        for n in {1..8}; do mkdir -p test/dir\${n}
          eval touch test/dir\${n}/foo{1.."\$1"}
        done
        EOF
        chmod 700 "${script}"
      
        function fcreate ()
        {
          exec 2>&1 /usr/bin/time --format=%E "${script}" "$1"
        }
      
        function fremove ()
        {
          exec 2>&1 /usr/bin/time --format=%E ssh node2 "cd `pwd`; rm -Rf test*"
        }
      
        function fcp ()
        {
          exec 2>&1 /usr/bin/time --format=%E ssh node3 "cd `pwd`; cp -R test test.new"
        }
      
        echo -------------------------------------------------
        echo "| # files | create #s | copy #s | remove #s |"
        echo -------------------------------------------------
        for ((x=0; x < ${#iterations[*]} ; x++)) do
          create[$x]="`fcreate ${iterations[$x]}`"
          copy[$x]="`fcp ${iterations[$x]}`"
          remove[$x]="`fremove`"
          printf "| %8d | %9s | %9s | %9s |\n" ${iterations[$x]} ${create[$x]} ${copy[$x]} ${remove[$x]}
        done
        rm "${script}"
        echo "------------------------"
      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: implement delayed dropping of last dquot reference · e3a767b6
      Jan Kara authored
      We cannot drop last dquot reference from downconvert thread as that
      creates the following deadlock:
      
      NODE 1                                  NODE2
      holds dentry lock for 'foo'
      holds inode lock for GLOBAL_BITMAP_SYSTEM_INODE
                                              dquot_initialize(bar)
                                                ocfs2_dquot_acquire()
                                                  ocfs2_inode_lock(USER_QUOTA_SYSTEM_INODE)
                                                  ...
      downconvert thread (triggered from another
      node or a different process from NODE2)
        ocfs2_dentry_post_unlock()
          ...
          iput(foo)
            ocfs2_evict_inode(foo)
              ocfs2_clear_inode(foo)
                dquot_drop(inode)
                  ...
                  ocfs2_dquot_release()
                    ocfs2_inode_lock(USER_QUOTA_SYSTEM_INODE)
                     - blocks
                                                  finds we need more space in
                                                  quota file
                                                  ...
                                                  ocfs2_extend_no_holes()
                                                    ocfs2_inode_lock(GLOBAL_BITMAP_SYSTEM_INODE)
                                                      - deadlocks waiting for
                                                        downconvert thread
      
      We solve the problem by postponing dropping of the last dquot reference to
      a workqueue if it happens from the downconvert thread.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file · 2931cdcb
      Darrick J. Wong authored
      Currently, ocfs2_sync_file grabs i_mutex and forces the current journal
      transaction to complete.  This isn't terribly efficient, since sync_file
      really only needs to wait for the last transaction involving that inode
      to complete, and this doesn't require i_mutex.
      
      Therefore, implement the necessary bits to track the newest tid
      associated with an inode, and teach sync_file to wait for that instead
      of waiting for everything in the journal to commit.  Furthermore, only
      issue the flush request to the drive if jbd2 hasn't already done so.
      
      This also eliminates the deadlock between ocfs2_file_aio_write() and
      ocfs2_sync_file().  aio_write takes i_mutex then calls
      ocfs2_aiodio_wait() to wait for unaligned dio writes to finish.
      However, if that dio completion involves calling fsync, then we can get
      into trouble when some ocfs2_sync_file tries to take i_mutex.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
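
      The idea of tracking the newest transaction id per inode, so that fsync
      waits only when that transaction is still uncommitted, can be sketched
      with a toy model.  Everything here is a hypothetical illustration, not
      the ocfs2 or jbd2 code: toy_tid_t, struct toy_journal, struct toy_inode
      and the toy_* functions are invented names, and the wraparound-safe
      comparison is only modelled on how transaction ids are usually compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical transaction id type; ids are allowed to wrap around. */
typedef uint32_t toy_tid_t;

/* True if transaction a is newer than b, tolerating counter wraparound. */
static bool tid_newer(toy_tid_t a, toy_tid_t b)
{
	return (int32_t)(a - b) > 0;
}

struct toy_journal {
	toy_tid_t running;	/* transaction currently accepting updates */
	toy_tid_t committed;	/* newest transaction known to be on disk */
};

struct toy_inode {
	toy_tid_t sync_tid;	/* newest transaction that touched this inode */
};

/* Record that the inode was modified under the running transaction. */
static void toy_inode_modified(struct toy_inode *inode,
			       const struct toy_journal *journal)
{
	inode->sync_tid = journal->running;
}

/* Commit the running transaction and open the next one. */
static void toy_commit(struct toy_journal *journal)
{
	assert(tid_newer(journal->running, journal->committed));
	journal->committed = journal->running;
	journal->running++;
}

/* fsync has to wait only if the inode's newest transaction is uncommitted. */
static bool toy_fsync_must_wait(const struct toy_inode *inode,
				const struct toy_journal *journal)
{
	return tid_newer(inode->sync_tid, journal->committed);
}
```

      In this model an fsync on an inode whose last transaction has already
      committed returns without waiting, even while newer transactions from
      other inodes are still running.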
    • ocfs2: remove unused variable uuid_net_key in ocfs2_initialize_super · a75fe48c
      joyce.xue authored
      Variable uuid_net_key in ocfs2_initialize_super() is not used.  Clean it
      up.
      Signed-off-by: joyce.xue <xuejiufei@huawei.com>
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Acked-by: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: change ip_unaligned_aio to of type mutex from atomic_t · c18ceab0
      Wengang Wang authored
      There is a problem that waitqueue_active() may check stale data and thus
      miss a wakeup of threads waiting on ip_unaligned_aio.
      
      The only valid values of ip_unaligned_aio are 0 and 1, so we can change
      it to a mutex and thus avoid the above problem.  Another benefit is that
      a mutex, which works as FIFO, is fairer than wake_up_all().
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 13 Mar, 2014 1 commit
    • fs: push sync_filesystem() down to the file system's remount_fs() · 02b9984d
      Theodore Ts'o authored
      Previously, the no-op "mount -o remount /dev/xxx" operation when the
      file system is already mounted read-write causes an implied,
      unconditional syncfs().  This seems pretty stupid, and it's certainly
      not documented or guaranteed to do this, nor is it particularly useful,
      except in the case where the file system was mounted rw and is getting
      remounted read-only.
      
      However, it's possible that there might be some file systems that are
      actually depending on this behavior.  In most file systems, it's
      probably fine to only call sync_filesystem() when transitioning from
      read-write to read-only, and there are some file systems where this is
      not needed at all (for example, for a pseudo-filesystem or something
      like romfs).
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Artem Bityutskiy <dedekind1@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Evgeniy Dushistov <dushistov@mail.ru>
      Cc: Jan Kara <jack@suse.cz>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Anders Larsen <al@alarsen.net>
      Cc: Phillip Lougher <phillip@squashfs.org.uk>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Cc: Petr Vandrovec <petr@vandrovec.name>
      Cc: xfs@oss.sgi.com
      Cc: linux-btrfs@vger.kernel.org
      Cc: linux-cifs@vger.kernel.org
      Cc: samba-technical@lists.samba.org
      Cc: codalist@coda.cs.cmu.edu
      Cc: linux-ext4@vger.kernel.org
      Cc: linux-f2fs-devel@lists.sourceforge.net
      Cc: fuse-devel@lists.sourceforge.net
      Cc: cluster-devel@redhat.com
      Cc: linux-mtd@lists.infradead.org
      Cc: jfs-discussion@lists.sourceforge.net
      Cc: linux-nfs@vger.kernel.org
      Cc: linux-nilfs@vger.kernel.org
      Cc: linux-ntfs-dev@lists.sourceforge.net
      Cc: ocfs2-devel@oss.oracle.com
      Cc: reiserfs-devel@vger.kernel.org
  15. 22 Jan, 2014 3 commits
  16. 13 Nov, 2013 1 commit
  17. 25 Sep, 2013 1 commit
  18. 29 Aug, 2013 1 commit
  19. 03 Jul, 2013 1 commit
  20. 07 Mar, 2013 1 commit
  21. 22 Feb, 2013 1 commit
  22. 03 Oct, 2012 1 commit
  23. 21 Mar, 2012 3 commits
  24. 07 Jan, 2012 1 commit
  25. 04 Jan, 2012 1 commit
    • vfs: fix the stupidity with i_dentry in inode destructors · 6b520e05
      Al Viro authored
      Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
      it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
      the cost of taking it into inode_init_always() will be negligible for pipes
      and sockets and negative for everything else.  Not to mention the removal of
      boilerplate code from ->destroy_inode() instances...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  26. 17 Nov, 2011 1 commit
  27. 28 Jul, 2011 1 commit
    • ocfs2: serialize unaligned aio · a11f7e63
      Mark Fasheh authored
      Fix a corruption that can happen when we have (two or more) outstanding
      aio's to an overlapping unaligned region.  Ext4 (e9e3bcec) and xfs
      recently had to fix similar issues.
      
      In our case what happens is that we can have an outstanding aio on a region
      and if a write comes in with some bytes overlapping the original aio we may
      decide to read that region into a page before continuing (typically because
      of buffered-io fallback).  Since we have no ordering guarantees with the
      aio, we can read stale or bad data into the page and then write it back out.
      
      If the i/o is page and block aligned, then we avoid this issue as there
      won't be any need to read data from disk.
      
      I took the same approach as Eric in the ext4 patch and introduced some
      serialization of unaligned async direct i/o.  I don't expect this to have an
      effect on the most common cases of AIO.  Unaligned aio will be slower
      though, but that's far more acceptable than data corruption.
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
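
      The alignment test that decides whether a write would need serializing
      could look roughly like the sketch below.  It is a simplified
      illustration and not the code from this commit: io_is_unaligned() and
      its parameters are hypothetical names, only block alignment of the
      start and end offsets is checked, and page alignment and the current
      file size are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if a write of `count` bytes at offset `pos` does not start
 * and end on a block boundary.  `blocksize` must be a power of two.
 */
static bool io_is_unaligned(uint64_t pos, uint64_t count, uint32_t blocksize)
{
	uint64_t mask = (uint64_t)blocksize - 1;

	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);

	return (pos & mask) != 0 || ((pos + count) & mask) != 0;
}
```

      Writes for which this returns false never need a read-modify-write of
      a partial block, which is why only the unaligned case has to pay for
      serialization.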