• Eric W. Biederman's avatar
    proc/sysctl: Don't grab i_lock under sysctl_lock. · 631f93a6
    Eric W. Biederman authored
    commit ace0c791 upstream.
    
    Konstantin Khlebnikov <khlebnikov@yandex-team.ru> writes:
    > This patch has locking problem. I've got lockdep splat under LTP.
    >
    > [ 6633.115456] ======================================================
    > [ 6633.115502] [ INFO: possible circular locking dependency detected ]
    > [ 6633.115553] 4.9.10-debug+ #9 Tainted: G             L
    > [ 6633.115584] -------------------------------------------------------
    > [ 6633.115627] ksm02/284980 is trying to acquire lock:
    > [ 6633.115659]  (&sb->s_type->i_lock_key#4){+.+...}, at: [<ffffffff816bc1ce>] igrab+0x1e/0x80
    > [ 6633.115834] but task is already holding lock:
    > [ 6633.115882]  (sysctl_lock){+.+...}, at: [<ffffffff817e379b>] unregister_sysctl_table+0x6b/0x110
    > [ 6633.116026] which lock already depends on the new lock.
    > [ 6633.116026]
    > [ 6633.116080]
    > [ 6633.116080] the existing dependency chain (in reverse order) is:
    > [ 6633.116117]
    > -> #2 (sysctl_lock){+.+...}:
    > -> #1 (&(&dentry->d_lockref.lock)->rlock){+.+...}:
    > -> #0 (&sb->s_type->i_lock_key#4){+.+...}:
    >
    > d_lock nests inside i_lock
    > sysctl_lock nests inside d_lock in d_compare
    >
    > This patch adds i_lock nesting inside sysctl_lock.
    
    Al Viro <viro@ZenIV.linux.org.uk> replied:
    > Once ->unregistering is set, you can drop sysctl_lock just fine.  So I'd
    > try something like this - use rcu_read_lock() in proc_sys_prune_dcache(),
    > drop sysctl_lock() before it and regain after.  Make sure that no inodes
    > are added to the list ones ->unregistering has been set and use RCU list
    > primitives for modifying the inode list, with sysctl_lock still used to
    > serialize its modifications.
    >
    > Freeing struct inode is RCU-delayed (see proc_destroy_inode()), so doing
    > igrab() is safe there.  Since we don't drop inode reference until after we'd
    > passed beyond it in the list, list_for_each_entry_rcu() should be fine.
    
    I agree with Al Viro's analsysis of the situtation.
    
    Fixes: d6cffbbe ("proc/sysctl: prune stale dentries during unregistering")
    Reported-by: default avatarKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
    Tested-by: default avatarKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
    Suggested-by: default avatarAl Viro <viro@ZenIV.linux.org.uk>
    Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    631f93a6
proc_sysctl.c 40 KB