  1. 10 Oct, 2018 1 commit
  2. 17 Feb, 2016 1 commit
  3. 30 Dec, 2015 1 commit
    • ocfs2/dlm: clear migration_pending when migration target goes down · cc28d6d8
      xuejiufei authored
      We have found a BUG on res->migration_pending when migrating lock
      resources.  The situation is as follows.
      
      dlm_mark_lockres_migrating
        res->migration_pending = 1;
        __dlm_lockres_reserve_ast
        dlm_lockres_release_ast returns with res->migration_pending still set
            because other threads have reserved asts
        wait until dlm_migration_can_proceed returns 1
        >>>>>>> o2hb finds that the target has gone down and removes it
                from domain_map
        dlm_migration_can_proceed returns 1
        dlm_mark_lockres_migrating returns -ESHUTDOWN with
            res->migration_pending still set.
      
      When dlm_mark_lockres_migrating() is reentered, it triggers the BUG_ON()
      on res->migration_pending.  So clear migration_pending when the target
      goes down.
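
      A minimal sketch of the idea, under the assumption that the reset sits in
      the error path of dlm_mark_lockres_migrating() once the target is found to
      be missing from dlm->domain_map; the exact placement and the surrounding
      variables are illustrative, not the upstream diff:

          /* Illustrative fragment: if the migration target has left the
           * domain, clear the pending flag under the lockres spinlock before
           * bailing out, so a later retry of dlm_mark_lockres_migrating()
           * does not hit BUG_ON(res->migration_pending). */
          spin_lock(&dlm->spinlock);
          if (!test_bit(target, dlm->domain_map)) {    /* target went down */
               spin_lock(&res->spinlock);
               res->migration_pending = 0;             /* assumed reset point */
               spin_unlock(&res->spinlock);
               ret = -ESHUTDOWN;
          }
          spin_unlock(&dlm->spinlock);
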
      Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
      Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 23 Oct, 2015 1 commit
  5. 22 Sep, 2015 1 commit
    • ocfs2/dlm: fix deadlock when dispatch assert master · 012572d4
      Joseph Qi authored
      The order of the following three spinlocks should be:
      dlm_domain_lock < dlm_ctxt->spinlock < dlm_lock_resource->spinlock
      
      But dlm_dispatch_assert_master() is called while holding
      dlm_ctxt->spinlock and dlm_lock_resource->spinlock, and it then calls
      dlm_grab(), which takes dlm_domain_lock.

      If another thread (for example, dlm_query_join_handler) has already
      taken dlm_domain_lock and then tries to take dlm_ctxt->spinlock, a
      deadlock occurs.
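
      A hedged sketch of the ordering rule, assuming the dlm reference is taken
      with dlm_grab() before the inner spinlocks; the surrounding code and the
      exact shape of the upstream fix are not reproduced here:

          /* Illustrative fragment: honour the order dlm_domain_lock ->
           * dlm_ctxt->spinlock -> dlm_lock_resource->spinlock by taking the
           * dlm reference (which internally uses dlm_domain_lock) before the
           * inner spinlocks are held. */
          if (!dlm_grab(dlm))                /* takes/drops dlm_domain_lock */
               return;                       /* domain is going away */

          spin_lock(&dlm->spinlock);
          spin_lock(&res->spinlock);
          /* ... dispatch the assert-master work ... */
          spin_unlock(&res->spinlock);
          spin_unlock(&dlm->spinlock);

          dlm_put(dlm);                      /* drop the reference from dlm_grab() */
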
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: "Junxiao Bi" <junxiao.bi@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 04 Sep, 2015 1 commit
  7. 06 May, 2015 1 commit
    • ocfs2: dlm: fix race between purge and get lock resource · b1432a2a
      Junxiao Bi authored
      There is a race window in dlm_get_lock_resource() which may return a
      lock resource that has already been purged.  This causes the process to
      hang forever in dlmlock(), as the AST message cannot be handled because
      its lock resource no longer exists.
      
          dlm_get_lock_resource {
              ...
              spin_lock(&dlm->spinlock);
              tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
              if (tmpres) {
                   spin_unlock(&dlm->spinlock);
                   >>>>>>>> race window, dlm_run_purge_list() may run and purge
                                    the lock resource
                   spin_lock(&tmpres->spinlock);
                   ...
                   spin_unlock(&tmpres->spinlock);
              }
          }
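
      One way to close the window, sketched under the assumption that the
      lockres spinlock may nest inside dlm->spinlock and that a purge in flight
      is flagged with DLM_LOCK_RES_DROPPING_REF; the lookup label and the
      control flow are illustrative:

          /* Illustrative fragment: take tmpres->spinlock while dlm->spinlock
           * is still held, so dlm_run_purge_list() cannot free the resource
           * in between; if a purge is already in flight, back off and redo
           * the lookup. */
      lookup:
          spin_lock(&dlm->spinlock);
          tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
          if (tmpres) {
               spin_lock(&tmpres->spinlock);   /* nests inside dlm->spinlock */
               if (tmpres->state & DLM_LOCK_RES_DROPPING_REF) {
                    spin_unlock(&tmpres->spinlock);
                    spin_unlock(&dlm->spinlock);
                    /* wait for the purge to finish, then retry the lookup */
                    goto lookup;
               }
               spin_unlock(&dlm->spinlock);
               /* ... use tmpres safely ... */
               spin_unlock(&tmpres->spinlock);
          } else {
               spin_unlock(&dlm->spinlock);
          }
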
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 19 Dec, 2014 1 commit
  9. 11 Dec, 2014 1 commit
    • ocfs2: o2dlm: fix a race between purge and master query · cb79662b
      Srinivas Eeda authored
      Node A sends a master query request to node B, which is the master.  At
      this time the lockres happens to be on the purge list.
      dlm_master_request_handler takes the dlm spinlock, finds the resource and
      releases the dlm spinlock.  Right at this moment the dlm_thread on node B
      could purge the lockres.  dlm_master_request_handler can then acquire the
      lockres spinlock and reply to node A that node B is the master, even
      though the lockres on node B has been purged.
      
      The above scenario makes node A falsely think that node B is the master,
      which is inconsistent.  Further, if another node C then tries to master
      the same resource, every node responds that it is not the master.  Node C
      then masters the resource and sends an assert master to all nodes, which
      makes node A crash with the following message.
      
      dlm_assert_master_handler:1831 ERROR: DIE! Mastery assert from 9, but current
      owner is 10!
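
      A sketch of a fix along the lines the title describes, in
      dlm_master_request_handler(), assuming the handler can answer
      DLM_MASTER_RESP_ERROR so that node A retries; the response variable and
      the send_response label are illustrative:

          /* Illustrative fragment: grab res->spinlock before dropping
           * dlm->spinlock so dlm_thread cannot purge the lockres in between,
           * and refuse to answer as master while the resource is being
           * dropped. */
          spin_lock(&dlm->spinlock);
          res = __dlm_lookup_lockres(dlm, name, namelen, hash);
          if (res) {
               spin_lock(&res->spinlock);    /* before dlm->spinlock is dropped */
               spin_unlock(&dlm->spinlock);

               if (res->state & DLM_LOCK_RES_DROPPING_REF) {
                    spin_unlock(&res->spinlock);
                    response = DLM_MASTER_RESP_ERROR;  /* assumed: node A retries */
                    goto send_response;
               }
               /* ... normal master response path ... */
          }
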
      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
      Tested-by: Joseph Qi <joseph.qi@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 10 Oct, 2014 1 commit
  11. 02 Oct, 2014 1 commit
  12. 26 Sep, 2014 1 commit
  13. 07 Aug, 2014 1 commit
    • ocfs2: race between umount and unfinished remastering during recovery · bba1cb17
      Tariq Saeed authored
      Orabug: 19074140
      
      When umount is issued during recovery on the new master that has not
      finished remastering locks, it triggers BUG() in
      dlm_send_mig_lockres_msg().  Here is the situation:
      
       1) node A has a lock on resource X mastered by node B.
      
       2) node B dies ->  node A sets recovering flag for res X
      
       3) Node C becomes the new master for resources owned by the
          dead node and is remastering locks of the dead node but
          has not finished the remastering process yet.
      
       4) umount is issued on node C.
      
       5) During processing of umount, ignoring the unfinished recovery,
          node C attempts to migrate resource X to node A.
      
       6) node A finds res X in DLM_LOCK_RES_RECOVERING state, considers
          it a logic error and sends back -EFAULT.
      
       7) node C asserts BUG() upon seeing the -EFAULT response from node A.
      
      The fix is to delay migrating res X until remastering is finished, at
      which point the recovering flag will be cleared on both A and C.
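
      A minimal sketch of the delay, assuming the check sits in the umount-time
      migration path and that the caller simply retries the lockres on a later
      pass:

          /* Illustrative fragment: during umount, skip a lockres that is
           * still marked DLM_LOCK_RES_RECOVERING and come back to it once
           * remastering has finished and the flag has been cleared. */
          spin_lock(&res->spinlock);
          if (res->state & DLM_LOCK_RES_RECOVERING) {
               spin_unlock(&res->spinlock);
               return 0;          /* assumed: caller retries this lockres */
          }
          spin_unlock(&res->spinlock);
          /* ... proceed with dlm_migrate_lockres() ... */
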
      Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 23 Jun, 2014 2 commits
  15. 04 Jun, 2014 1 commit
  16. 23 May, 2014 1 commit
  17. 13 Nov, 2013 2 commits
  18. 11 Sep, 2013 1 commit
  19. 26 Feb, 2013 1 commit
  20. 24 Jul, 2011 3 commits
  21. 26 May, 2011 1 commit
    • ocfs2/dlm: Do not migrate resource to a node that is leaving the domain · 66effd3c
      Sunil Mushran authored
      During dlm domain shutdown, o2dlm has to free all of its lock resources.
      Those with no locks and no references are freed.  Those with locks and/or
      references are migrated to another node.
      
      The first task in migration is finding a target. Currently we scan the lock
      resource and find one node that either has a lock or a reference. This is not
      very efficient in a parallel umount case as we might end up migrating the
      lock resource to a node which itself may have to migrate it to a third node.
      
      The patch scans the dlm->exit_domain_map to ensure the target node is not
      leaving the domain. If no valid target node is found, o2dlm does not migrate
      the resource but instead waits for the unlock and deref messages that will
      allow it to free the resource.
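
      A minimal sketch of the target scan, assuming a simple bitmap walk and
      treating res->refmap as the set of interested nodes; the loop shape and
      variable names are illustrative:

          /* Illustrative fragment: pick a migration target that still has an
           * interest in the resource and is not itself leaving the domain. */
          unsigned int node, target = O2NM_MAX_NODES;

          for (node = 0; node < O2NM_MAX_NODES; node++) {
               if (node == dlm->node_num)
                    continue;
               if (!test_bit(node, res->refmap))         /* no lock/reference */
                    continue;
               if (test_bit(node, dlm->exit_domain_map))
                    continue;                            /* node is leaving */
               target = node;
               break;
          }
          if (target == O2NM_MAX_NODES) {
               /* no valid target: keep the lockres and wait for the
                * unlock/deref messages that let us free it locally */
          }
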
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
  22. 24 May, 2011 1 commit
  23. 13 May, 2011 1 commit
  24. 31 Mar, 2011 1 commit
  25. 21 Feb, 2011 1 commit
    • ocfs2: Remove ENTRY from masklog. · ef6b689b
      Tao Ma authored
      ENTRY is used to record the entry of a function.  But because it is added
      to so many functions, enabling it fills the system log quickly and causes
      too much I/O, so in practice nobody can enable it on a production system,
      or even for a test.
      
      So mlog_entry_void() is simply removed, and mlog_entry(...) is replaced
      with mlog(0, ...); these will be replaced by trace events later.
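
      An illustrative before/after of the conversion (the format string and
      arguments are made up for the example):

          /* before */
          mlog_entry_void();
          mlog_entry("(inode=%p, blkno=%llu)", inode, (unsigned long long)blkno);

          /* after: mlog_entry_void() is dropped, mlog_entry(...) becomes */
          mlog(0, "(inode=%p, blkno=%llu)", inode, (unsigned long long)blkno);
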
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
  26. 09 Dec, 2010 1 commit
    • ocfs2/dlm: Migrate lockres with no locks if it has a reference · 388c4bcb
      Sunil Mushran authored
      o2dlm was not migrating resources with zero locks because it assumed that
      the resource would get purged by dlm_thread.  However, some usage patterns
      involve creating and dropping locks at a high rate, leading to the migrate
      thread seeing zero locks but the purge thread seeing an active reference.
      When this happens, dlm_thread cannot purge the resource and the migrate
      thread sees no reason to migrate it.  The spell is broken only when the
      migrate thread catches the resource with a lock.
      
      The fix is to make the migrate thread also consider the reference map.
      
      This usage pattern can be triggered by userspace on userdlm locks and flocks.
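
      A hedged sketch of the migrate-thread check, assuming the generic
      bitmap_empty() helper is used on res->refmap and that a local migrate
      flag drives the decision; the exact test in the upstream patch may differ:

          /* Illustrative fragment: a lockres is a migration candidate not
           * only when it still has locks queued, but also when its reference
           * map is non-empty. */
          int has_locks = !list_empty(&res->granted) ||
                          !list_empty(&res->converting) ||
                          !list_empty(&res->blocked);
          int has_refs  = !bitmap_empty(res->refmap, O2NM_MAX_NODES);

          if (has_locks || has_refs)
               migrate = 1;        /* consider this lockres for migration */
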
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
  27. 23 Sep, 2010 1 commit
  28. 07 Aug, 2010 2 commits
    • ocfs2/dlm: remove potential deadlock -V3 · b11f1f1a
      Wengang Wang authored
      When we need to take both dlm_domain_lock and dlm->spinlock, we should take
      them in order of: dlm_domain_lock then dlm->spinlock.
      
      There are paths that disobey this order, namely calling dlm_lockres_put()
      with dlm->spinlock held in dlm_run_purge_list().  dlm_lockres_put() calls
      dlm_put() when the last reference is dropped, and dlm_put() takes
      dlm_domain_lock.
      
      Fix:
      Don't grab/put the dlm when initialising/releasing a lockres.
      That grab is not required because we don't call dlm_unregister_domain()
      based on refcount.
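
      A minimal sketch of the release side, assuming the lockres holds its dlm
      reference in a kref named refs; the removed dlm_put() call is shown only
      as a comment:

          /* Illustrative fragment: dlm_init_lockres() no longer takes a dlm
           * reference and dlm_lockres_release() no longer drops one, so
           * releasing a lockres with dlm->spinlock held can never end up
           * taking dlm_domain_lock. */
          static void dlm_lockres_release(struct kref *kref)
          {
               struct dlm_lock_resource *res;

               res = container_of(kref, struct dlm_lock_resource, refs);
               /* ... sanity checks and freeing of the lockres ... */
               /* dlm_put(dlm);   <-- removed: would take dlm_domain_lock */
          }
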
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2/dlm: fix a dead lock · 6d98c3cc
      Wengang Wang authored
      When we have to take both dlm->master_lock and lockres->spinlock, take
      them in this order: lockres->spinlock first, then dlm->master_lock.
      
      The patch fixes a violation of this rule.  We can simply move taking
      dlm->master_lock to after res->spinlock has been dropped, since accessing
      res->state and freeing the mle memory do not need master_lock's
      protection.
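
      A sketch of the reordering, assuming the two critical sections can be
      separated as the description argues:

          /* Illustrative fragment: finish the work that needs res->spinlock,
           * drop it, and only then take dlm->master_lock for the mle
           * cleanup, so the two locks are never held together. */
          spin_lock(&res->spinlock);
          /* ... inspect/update res->state ... */
          spin_unlock(&res->spinlock);

          spin_lock(&dlm->master_lock);
          /* ... look up and free the mle ... */
          spin_unlock(&dlm->master_lock);
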
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
  29. 15 Jul, 2010 1 commit
  30. 18 May, 2010 1 commit
  31. 06 May, 2010 1 commit
  32. 24 Mar, 2010 1 commit
    • ocfs2: Fix a race in o2dlm lockres mastery · 14741472
      Srinivas Eeda authored
      In o2dlm, the master of a lock resource keeps a map of all interested
      nodes.  This prevents the master from purging the resource before an
      interested node can create a lock.
      
      A race between the mastery thread and the mastery handler allowed an
      interested node to discover who the master is without informing the
      master directly.  This is easily fixed by holding the dlm spinlock a
      little longer in the mastery handler.
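
      A hedged sketch of "holding the dlm spinlock a little longer" in the
      mastery handler; the exact boundaries of the critical section are an
      assumption:

          /* Illustrative fragment: keep dlm->spinlock held across the lookup
           * and the point where the requesting node's interest is recorded,
           * so the master cannot purge the lockres before it knows about the
           * requester. */
          spin_lock(&dlm->spinlock);
          res = __dlm_lookup_lockres(dlm, name, namelen, hash);
          if (res) {
               spin_lock(&res->spinlock);
               /* ... note the requesting node and build the response ... */
               spin_unlock(&res->spinlock);
          }
          spin_unlock(&dlm->spinlock);      /* dropped only after the above */
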
      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
  33. 26 Jan, 2010 1 commit
  34. 04 Dec, 2009 1 commit
  35. 24 Sep, 2009 1 commit