• Ben Hutchings's avatar
    dcache: Fix locking bugs in backported "deal with deadlock in d_walk()" · 20defcec
    Ben Hutchings authored
    Steven Rostedt reported:
    > Porting -rt to the latest 3.2 stable tree I triggered this bug:
    > 
    > =====================================
    > [ BUG: bad unlock balance detected! ]
    > -------------------------------------
    > rm/1638 is trying to release lock (rcu_read_lock) at:
    > [<c04fde6c>] rcu_read_unlock+0x0/0x23
    > but there are no more locks to release!
    > 
    > other info that might help us debug this:
    > 2 locks held by rm/1638:
    >  #0:  (&sb->s_type->i_mutex_key#9/1){+.+.+.}, at: [<c04f93eb>] do_rmdir+0x5f/0xd2
    >  #1:  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<c04f9329>] vfs_rmdir+0x49/0xac
    > 
    > stack backtrace:
    > Pid: 1638, comm: rm Not tainted 3.2.66-test-rt96+ #2
    > Call Trace:
    >  [<c083f390>] ? printk+0x1d/0x1f
    >  [<c0463cdf>] print_unlock_inbalance_bug+0xc3/0xcd
    >  [<c04653a8>] lock_release_non_nested+0x98/0x1ec
    >  [<c046228d>] ? trace_hardirqs_off_caller+0x18/0x90
    >  [<c0456f1c>] ? local_clock+0x2d/0x50
    >  [<c04fde6c>] ? d_hash+0x2f/0x2f
    >  [<c04fde6c>] ? d_hash+0x2f/0x2f
    >  [<c046568e>] lock_release+0x192/0x1ad
    >  [<c04fde83>] rcu_read_unlock+0x17/0x23
    >  [<c04ff344>] shrink_dcache_parent+0x227/0x270
    >  [<c04f9348>] vfs_rmdir+0x68/0xac
    >  [<c04f9424>] do_rmdir+0x98/0xd2
    >  [<c04f03ad>] ? fput+0x1a3/0x1ab
    >  [<c084dd42>] ? sysenter_exit+0xf/0x1a
    >  [<c0465b58>] ? trace_hardirqs_on_caller+0x118/0x149
    >  [<c04fa3e0>] sys_unlinkat+0x2b/0x35
    >  [<c084dd13>] sysenter_do_call+0x12/0x12
    > 
    > 
    > 
    > 
    > There's a path to calling rcu_read_unlock() without calling
    > rcu_read_lock() in have_submounts().
    > 
    > 	goto positive;
    > 
    > positive:
    > 	if (!locked && read_seqretry(&rename_lock, seq))
    > 		goto rename_retry;
    > 
    > rename_retry:
    > 	rcu_read_unlock();
    > 
    > in the above path, rcu_read_lock() is never done before calling
    > rcu_read_unlock();
    
    I reviewed locking contexts in all three functions that I changed when
    backporting "deal with deadlock in d_walk()".  It's actually worse
    than this:
    
    - We don't hold this_parent->d_lock at the 'positive' label in
      have_submounts(), but it is unlocked after 'rename_retry'.
    - There is an rcu_read_unlock() after the 'out' label in
      select_parent(), but it's not held at the 'goto out'.
    
    Fix all three lock imbalances.
    Reported-by: default avatarSteven Rostedt <rostedt@goodmis.org>
    Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
    Tested-by: default avatarSteven Rostedt <rostedt@goodmis.org>
    20defcec
dcache.c 77.9 KB