1. 31 Jul, 2008 10 commits
    • [PATCH 2/2] ocfs2: Fix race between mount and recovery · 539d8264
      Sunil Mushran authored
      As the fs recovery is asynchronous, there is a small chance that another
      node can mount (and thus recover) the slot before the recovery thread
      gets to it.
      
      If this happens, the recovery thread will block indefinitely on the
      journal/slot lock as that lock will be held for the duration of the mount
      (by design) by the node assigned to that slot.
      
      The solution implemented is to keep track of the journal replays using
      a recovery generation in the journal inode, which will be incremented by the
      thread replaying that journal. The recovery thread, before attempting the
      blocking lock on the journal/slot lock, will compare the generation on disk
      with what it has cached and skip recovery if it does not match.
      
      This bug appears to have been inadvertently introduced during the mount/umount
      vote removal by mainline commit 34d024f8. In the
      mount voting scheme, the messaging would indirectly indicate that the slot
      was being recovered.
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
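The generation check described in this commit can be sketched roughly as follows. This is a minimal userspace illustration, not the ocfs2 code: `struct journal_inode`, `struct slot_cache`, and all three functions are hypothetical stand-ins, and the cluster locking is omitted entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the on-disk journal inode. */
struct journal_inode {
    uint32_t recovery_generation;   /* bumped by whoever replays the journal */
};

/* Hypothetical stand-in for what a node cached about a slot. */
struct slot_cache {
    uint32_t cached_generation;     /* generation seen when recovery was queued */
};

/* Replaying the journal increments the on-disk generation. */
static void replay_journal(struct journal_inode *on_disk)
{
    /* ... the actual journal replay would happen here ... */
    on_disk->recovery_generation++;
}

/* A mismatch means some other node already replayed this journal. */
static bool slot_needs_recovery(const struct slot_cache *cache,
                                const struct journal_inode *on_disk)
{
    return cache->cached_generation == on_disk->recovery_generation;
}

/* Returns true if this caller replayed the journal, false if it skipped. */
static bool recover_slot(struct slot_cache *cache,
                         struct journal_inode *on_disk)
{
    if (!slot_needs_recovery(cache, on_disk))
        return false;   /* skip instead of blocking on the journal/slot lock */

    replay_journal(on_disk);
    cache->cached_generation = on_disk->recovery_generation;
    return true;
}
```

A second caller that still holds the old cached generation sees the mismatch and skips recovery, which is the case the commit message describes for a slot that another node has already mounted.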
    • [PATCH 1/2] ocfs2: Add counter in struct ocfs2_dinode to track journal replays · c69991aa
      Sunil Mushran authored
      This patch renames ij_pad to ij_recovery_generation in struct ocfs2_dinode.
      This will be used to keep count of journal replays after an unclean shutdown.
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • [PATCH] configfs: Convenience macros for attribute definition. · ecb3d28c
      Joel Becker authored
      Sysfs has the _ATTR() and _ATTR_RO() macros to make defining extended
      form attributes easier.  configfs should have something similar.
      
      - _CONFIGFS_ATTR() and _CONFIGFS_ATTR_RO() are the counterparts to the
        sysfs macros.
      - CONFIGFS_ATTR_STRUCT() creates the extended form attribute structure.
      - CONFIGFS_ATTR_OPS() defines the show_attribute()/store_attribute()
        operations that call the show()/store() operations of the extended
        form configfs_attributes.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
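The macro pattern listed above can be illustrated with a simplified userspace analogue. The `DEMO_*` macros, `struct demo_attr`, and `struct widget` below are hypothetical names invented for this sketch; they are not the definitions in linux/configfs.h, and the conversion from a config_item to the containing structure is left out.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical generic attribute, standing in for the basic attribute type. */
struct demo_attr {
    const char *name;
    int mode;
};

/* Analogue of CONFIGFS_ATTR_STRUCT(): declares struct <item>_attribute. */
#define DEMO_ATTR_STRUCT(_item)                                   \
struct _item##_attribute {                                        \
    struct demo_attr attr;                                        \
    ssize_t (*show)(struct _item *, char *);                      \
    ssize_t (*store)(struct _item *, const char *, size_t);       \
}

/* Analogue of _CONFIGFS_ATTR_RO(): initializer for a read-only attribute. */
#define DEMO_ATTR_RO(_name, _show)                                \
    { .attr = { .name = #_name, .mode = 0444 }, .show = _show }

/* Analogue of CONFIGFS_ATTR_OPS(): generic show calls the typed show(). */
#define DEMO_ATTR_OPS(_item)                                      \
static ssize_t _item##_attr_show(struct _item *item,              \
                                 struct demo_attr *attr,          \
                                 char *page)                      \
{                                                                 \
    struct _item##_attribute *typed = (struct _item##_attribute *)\
        ((char *)attr - offsetof(struct _item##_attribute, attr));\
    return typed->show ? typed->show(item, page) : -1;            \
}

/* Example item type using the macros. */
struct widget {
    int size;
};

DEMO_ATTR_STRUCT(widget);
DEMO_ATTR_OPS(widget)

static ssize_t widget_size_show(struct widget *w, char *page)
{
    return sprintf(page, "%d\n", w->size);
}

static struct widget_attribute widget_attr_size =
    DEMO_ATTR_RO(size, widget_size_show);
```

Calling `widget_attr_show(&w, &widget_attr_size.attr, page)` recovers the extended attribute from the embedded generic one and dispatches to `widget_size_show()`, which is the role the commit message assigns to the generated show_attribute() operation.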
    • [PATCH] configfs: Pin configfs subsystems separately from new config_items. · 70526b67
      Joel Becker authored
      configfs_mkdir() creates a new item by calling its parent's
      ->make_item/group() functions.  Once that object is created,
      configfs_mkdir() calls try_module_get() on the new item's module.  If it
      succeeds, the module owning the new item cannot be unloaded, and
      configfs is safe to reference the item.
      
      If the item and the subsystem it belongs to are part of the same module,
      the subsystem is also pinned.  This is the common case.
      
      However, if the subsystem is made up of multiple modules, this may not
      pin the subsystem.  It would then be possible to unload the toplevel
      subsystem module while there is still a child item.  To prevent this, we
      now also try_module_get() the subsystem's module.  This only really
      affects children of the toplevel subsystem group.  Deeper children
      already have their parents pinned.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
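The separate pinning described in this commit might look roughly like the sketch below. It is a userspace model only: `struct demo_module`, the `demo_*` functions, and the `-EINVAL` return value are assumptions made for illustration, not the kernel's module API or the error code configfs actually returns.

```c
#include <errno.h>

/* Hypothetical module with a reference count. */
struct demo_module {
    int refcount;
    int unloading;      /* nonzero once unload has started */
};

/* Model of try_module_get(): fails if the module is going away. */
static int demo_try_module_get(struct demo_module *m)
{
    if (m->unloading)
        return 0;
    m->refcount++;
    return 1;
}

static void demo_module_put(struct demo_module *m)
{
    m->refcount--;
}

/*
 * Pin the subsystem's module and the new item's module separately.
 * When both belong to the same module it simply gets two references.
 */
static int demo_mkdir_pin(struct demo_module *subsys_owner,
                          struct demo_module *item_owner)
{
    if (!demo_try_module_get(subsys_owner))
        return -EINVAL;             /* placeholder error code */

    if (!demo_try_module_get(item_owner)) {
        demo_module_put(subsys_owner);
        return -EINVAL;             /* placeholder error code */
    }
    return 0;
}
```

With the subsystem and the item in different modules, each ends up holding its own reference, so neither can be unloaded while the child item exists.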
    • [PATCH] configfs: Fix open directory making rmdir() fail · 99cefda4
      Louis Rilling authored
      When checking for user-created elements under an item to be removed by rmdir(),
      configfs_detach_prep() counts fake configfs_dirents created by dir_open() as
      user-created and fails when finding one. It is however perfectly valid to remove
      a directory that is open.
      
      Simply make configfs_detach_prep() skip fake configfs_dirents, like it already
      does for attributes, and like detach_groups() does.
      Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
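The skip described in this commit can be sketched as a loop over a directory's entries. The flag names, `struct demo_dirent`, and `demo_detach_prep()` are hypothetical; the real configfs_detach_prep() also handles default groups and other state that this sketch leaves out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical entry types. */
#define DEMO_USER_DIR  0x01     /* directory created by userspace */
#define DEMO_ATTR      0x02     /* attribute file */
#define DEMO_FAKE      0x04     /* fake entry added when the directory is opened */

struct demo_dirent {
    unsigned int type;
};

/* Returns 0 if the directory may be removed, -ENOTEMPTY otherwise. */
static int demo_detach_prep(const struct demo_dirent *children, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Attributes and fake entries do not count as user-created. */
        if (children[i].type & (DEMO_ATTR | DEMO_FAKE))
            continue;
        return -ENOTEMPTY;
    }
    return 0;
}
```

A directory holding only attributes and a fake entry from an open file descriptor passes the check, while one with a user-created child still fails.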
    • [PATCH] configfs: Lock new directory inodes before removing on cleanup after failure · 2e2ce171
      Louis Rilling authored
      Once a new configfs directory is created by configfs_attach_item() or
      configfs_attach_group(), a failure in the remaining initialization steps leads
      to removing a directory whose inode the VFS may have already accessed.
      
      This commit adds the necessary inode locking to safely remove configfs
      directories while cleaning up after a failure. As an advantage, the locking
      rules of populate_groups() and detach_groups() become the same: the caller must
      have the group's inode mutex locked.
      Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • [PATCH] configfs: Prevent userspace from creating new entries under attaching directories · 2a109f2a
      Louis Rilling authored
      process 1: 					process 2:
      configfs_mkdir("A")
        attach_group("A")
          attach_item("A")
            d_instantiate("A")
          populate_groups("A")
            mutex_lock("A")
            attach_group("A/B")
              attach_item("A/B")
                d_instantiate("A/B")
      						mkdir("A/B/C")
      						  do_path_lookup("A/B/C", LOOKUP_PARENT)
      						    ok
      						  lookup_create("A/B/C")
      						    mutex_lock("A/B")
      						    ok
      						  configfs_mkdir("A/B/C")
      						    ok
            attach_group("A/C")
              attach_item("A/C")
                d_instantiate("A/C")
              populate_groups("A/C")
                mutex_lock("A/C")
                attach_group("A/C/D")
                  attach_item("A/C/D")
                    failure
                mutex_unlock("A/C")
                detach_groups("A/C")
                  nothing to do
      						mkdir("A/C/E")
      						  do_path_lookup("A/C/E", LOOKUP_PARENT)
      						    ok
      						  lookup_create("A/C/E")
      						    mutex_lock("A/C")
      						    ok
      						  configfs_mkdir("A/C/E")
      						    ok
              detach_item("A/C")
              d_delete("A/C")
            mutex_unlock("A")
            detach_groups("A")
              mutex_lock("A/B")
              detach_group("A/B")
                detach_groups("A/B")
                  nothing since no _default_ group
                detach_item("A/B")
              mutex_unlock("A/B")
              d_delete("A/B")
          detach_item("A")
          d_delete("A")
      
      Two bugs:
      
      1/ "A/B/C" and "A/C/E" are created, but never removed, while their parents are
         removed in the end. The same could happen with symlink() instead of mkdir().
      
      2/ "A" and "A/C" inodes are not locked while detach_item() is called on them,
         which may confuse the VFS.
      
      This commit fixes 1/, tagging new directories with CONFIGFS_USET_CREATING before
      building the inode and instantiating the dentry, and validating the whole
      group+default groups hierarchy in a second pass by clearing
      CONFIGFS_USET_CREATING.
      	mkdir(), symlink(), lookup(), and dir_open() simply return -ENOENT if
      called in (or linking to) a directory tagged with CONFIGFS_USET_CREATING. This
      does not prevent userspace from calling stat() successfully on such directories,
      but it does prevent userspace from adding children to, symlinking from/to,
      reading/writing attributes of, or listing the contents of items that have not
      yet been validated. In other words, userspace will not interact with the
      subsystem on a new item until the new item creation completes correctly.
      	It was first proposed to re-use CONFIGFS_USET_IN_MKDIR instead of a new
      flag CONFIGFS_USET_CREATING, but this generated conflicts when checking the
      target of a new symlink: a valid target directory in the middle of attaching
      a new user-created child item could be wrongly detected as being attached.
      
      2/ is fixed by the next commit.
      Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
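The two-pass scheme from this commit (tag directories while they are being attached, then validate the whole hierarchy at once) can be modelled as below. `DEMO_CREATING`, `struct demo_dir`, and the `demo_*` functions are hypothetical stand-ins; inode creation, dentries, and locking are not modelled.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIGFS_USET_CREATING. */
#define DEMO_CREATING 0x1u

struct demo_dir {
    unsigned int flags;
    struct demo_dir *children[4];   /* default groups, in this model */
    size_t nr_children;
};

/* Pass 1: tag a directory before it becomes visible to lookups. */
static void demo_dir_tag_creating(struct demo_dir *d)
{
    d->flags |= DEMO_CREATING;
}

/* Pass 2: validate the group and its default groups by clearing the tag. */
static void demo_dir_validate(struct demo_dir *d)
{
    d->flags &= ~DEMO_CREATING;
    for (size_t i = 0; i < d->nr_children; i++)
        demo_dir_validate(d->children[i]);
}

/* Check made by a userspace mkdir()/symlink() against the parent directory. */
static int demo_may_create_in(const struct demo_dir *parent)
{
    return (parent->flags & DEMO_CREATING) ? -ENOENT : 0;
}
```

If attachment fails before the second pass runs, the tag is never cleared, so no userspace entry can have been created under the half-built hierarchy.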
    • [PATCH] configfs: Fix failing symlink() making rmdir() fail · 9a73d78c
      Louis Rilling authored
      Following a pattern similar to mkdir() vs rmdir(), a failing symlink() may make
      rmdir() fail for the symlink's parent and the symlink's target as well.
      
      failing symlink() making target's rmdir() fail:
      
      	process 1:				process 2:
      	symlink("A/S" -> "B")
      	  allow_link()
      	  create_link()
      	    attach to "B" links list
      						rmdir("B")
      						  detach_prep("B")
      						    error because of new link
      	    configfs_create_link("A", "S")
      	      error (eg -ENOMEM)
      
      failing symlink() making parent's rmdir() fail:
      
      	process 1:				process 2:
      	symlink("A/D/S" -> "B")
      	  allow_link()
      	  create_link()
      	    attach to "B" links list
      	    configfs_create_link("A/D", "S")
      	      make_dirent("A/D", "S")
      						rmdir("A")
      						  detach_prep("A")
      						    detach_prep("A/D")
      						      error because of "S"
      	      create("S")
      	        error (eg -ENOMEM)
      
      We cannot use the same solution as for mkdir() vs rmdir(), since rmdir() on the
      target cannot wait on the i_mutex of the new symlink's parent without risking a
      deadlock (with other symlink() or sys_rename()). Instead we define a global
      mutex protecting all configfs symlink attachment, so that rmdir() can avoid the
      races above.
      Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • [PATCH] configfs: Fix symlink() to a removing item · 4768e9b1
      Louis Rilling authored
      The rule for configfs symlinks is that symlinks always point to valid
      config_items, and prevent the target from being removed. However,
      configfs_symlink() only checks that it can grab a reference on the target item,
      without ensuring that it remains alive until the symlink is correctly attached.
      
      This patch makes configfs_symlink() fail whenever the target is being removed,
      using the CONFIGFS_USET_DROPPING flag set by configfs_detach_prep() and
      protected by configfs_dirent_lock.
      
      This patch introduces a (weird?) behavior similar to that of mkdir failures making
      rmdir fail: if symlink() races with rmdir() of the parent directory (or its
      youngest user-created ancestor if parent is a default group) or rmdir() of the
      target directory, and then fails in configfs_create(), this can make the racing
      rmdir() fail despite the concerned directory having no user-created entry (resp.
      no symlink pointing to it or one of its default groups) in the end.
      This behavior is fixed in later patches.
      Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
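The check this commit adds can be sketched as a flag test on the symlink target. Everything here is a simplified model: `DEMO_DROPPING` stands in for CONFIGFS_USET_DROPPING, the structure and function names are hypothetical, the error codes are assumptions, and the lock that protects the flag in the kernel is omitted because the sketch is single-threaded.

```c
#include <errno.h>

/* Hypothetical stand-in for CONFIGFS_USET_DROPPING. */
#define DEMO_DROPPING 0x1u

struct demo_item {
    unsigned int flags;
    int nr_links;       /* symlinks currently pointing at this item */
};

/* rmdir() side: refuse if symlinks exist, otherwise mark as being removed. */
static int demo_rmdir_prep(struct demo_item *target)
{
    if (target->nr_links > 0)
        return -EBUSY;              /* assumed error code */
    target->flags |= DEMO_DROPPING;
    return 0;
}

/* symlink() side: fail if the target is already being removed. */
static int demo_symlink(struct demo_item *target)
{
    if (target->flags & DEMO_DROPPING)
        return -ENOENT;             /* assumed error code */
    target->nr_links++;
    return 0;
}
```

Whichever operation reaches the target first wins: an existing link blocks removal, and a removal in progress blocks new links.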
    • [PATCH] configfs: Include linux/err.h in linux/configfs.h · dacdd0e0
      Joel Becker authored
      We now use PTR_ERR() in the ->make_item() and ->make_group() operations.
      Folks including configfs.h need err.h.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
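For readers unfamiliar with the error-pointer convention that linux/err.h provides, the sketch below models it in userspace. The `demo_*` helpers mirror the idea behind ERR_PTR(), PTR_ERR(), and IS_ERR() but are written for illustration, and `demo_make_item()` is a hypothetical callback, not a configfs operation.

```c
#include <errno.h>
#include <stdlib.h>

/* Error codes are encoded in the top 4095 values of the address space. */
#define DEMO_MAX_ERRNO 4095

static inline void *demo_err_ptr(long error)
{
    return (void *)error;
}

static inline long demo_ptr_err(const void *ptr)
{
    return (long)ptr;
}

static inline int demo_is_err(const void *ptr)
{
    return (unsigned long)ptr >= (unsigned long)-DEMO_MAX_ERRNO;
}

struct demo_item {
    int id;
};

/* Hypothetical ->make_item() style callback returning a pointer or an error. */
static struct demo_item *demo_make_item(const char *name)
{
    struct demo_item *item;

    if (!name || !name[0])
        return demo_err_ptr(-EINVAL);

    item = malloc(sizeof(*item));
    if (!item)
        return demo_err_ptr(-ENOMEM);

    item->id = 1;
    return item;
}
```

A caller checks the result with `demo_is_err()` and, on failure, extracts the errno with `demo_ptr_err()`; code that does this in the kernel needs the err.h declarations, which is why the header now includes them.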
  2. 30 Jul, 2008 30 commits