    sysfs: driver core: Fix glue dir race condition by gdp_mutex · 6ae35c94
    Yijing Wang authored
    commit e4a60d13 upstream.
    
    There is a race condition when removing a glue directory.
    It can be reproduced with the following test:
    
    path1: Add first child device
    device_add()
        get_device_parent()
                /*find parent from glue_dirs.list*/
                list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)
                        if (k->parent == parent_kobj) {
                                kobj = kobject_get(k);
                                break;
                        }
                ....
                class_dir_create_and_add()
    
    path2: Remove last child device under glue dir
    device_del()
        cleanup_device_parent()
                cleanup_glue_dir()
                        kobject_put(glue_dir);
    
    If path2 has entered cleanup_glue_dir() but has not yet
    called kobject_put(glue_dir), the glue dir is still
    in the parent's kset list. Meanwhile, path1 finds the glue
    dir in glue_dirs.list. Path2 may then release the glue dir
    before path1 calls kobject_get(), so the kernel reports
    the warning and BUG_ON.
    
    This is a "classic" problem we have of a kref in a list
    that can be found while the last instance could be removed
    at the same time.
    
    This patch reuses gdp_mutex to fix this race condition.
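
    As an illustration only, the user-space sketch below shows the general
    pattern behind the fix: the list lookup plus reference grab and the
    final reference drop plus unlink are serialized by one mutex, so a
    lookup can never find an entry whose last reference is being dropped.
    This is not the kernel code. struct glue_dir, glue_dir_get(),
    glue_dir_put(), glue_dir_count() and list_lock are hypothetical names
    made up for the sketch, a pthread mutex stands in for gdp_mutex, and
    a plain counter stands in for the kref.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a glue directory kobject (not kernel code). */
struct glue_dir {
    int refcount;           /* protected by list_lock in this sketch */
    const void *parent;     /* lookup key, like k->parent */
    struct glue_dir *next;  /* singly linked list of live entries */
};

/* Plays the role of gdp_mutex: one lock for lookup, get, put and unlink. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct glue_dir *glue_dirs;

/* Find the entry for parent, or create it; returns with a reference held. */
struct glue_dir *glue_dir_get(const void *parent)
{
    struct glue_dir *k;

    pthread_mutex_lock(&list_lock);
    for (k = glue_dirs; k; k = k->next) {
        if (k->parent == parent) {
            /* Any entry still on the list must be alive. */
            assert(k->refcount > 0);
            k->refcount++;
            pthread_mutex_unlock(&list_lock);
            return k;
        }
    }
    k = calloc(1, sizeof(*k));
    if (k) {
        k->refcount = 1;
        k->parent = parent;
        k->next = glue_dirs;
        glue_dirs = k;
    }
    pthread_mutex_unlock(&list_lock);
    return k;
}

/* Drop a reference; the last put unlinks and frees under the same lock. */
void glue_dir_put(struct glue_dir *dir)
{
    struct glue_dir **pp;

    pthread_mutex_lock(&list_lock);
    assert(dir->refcount > 0);
    if (--dir->refcount == 0) {
        for (pp = &glue_dirs; *pp; pp = &(*pp)->next) {
            if (*pp == dir) {
                *pp = dir->next;
                break;
            }
        }
        free(dir);
    }
    pthread_mutex_unlock(&list_lock);
}

/* Count the entries currently on the list. */
int glue_dir_count(void)
{
    struct glue_dir *k;
    int n = 0;

    pthread_mutex_lock(&list_lock);
    for (k = glue_dirs; k; k = k->next)
        n++;
    pthread_mutex_unlock(&list_lock);
    return n;
}
```

    In the sketch, moving the decrement in glue_dir_put() outside the
    lock would reopen the window described above: the entry would still
    be on the list while its count is already zero.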
    
    The following call trace was captured on kernel 3.4, but
    the latest kernel still has this bug.
    
    -----------------------------------------------------
    <4>[ 3965.441471] WARNING: at ...include/linux/kref.h:41 kobject_get+0x33/0x40()
    <4>[ 3965.441474] Hardware name: Romley
    <4>[ 3965.441475] Modules linked in: isd_iop(O) isd_xda(O)...
    ...
    <4>[ 3965.441605] Call Trace:
    <4>[ 3965.441611]  [<ffffffff8103717a>] warn_slowpath_common+0x7a/0xb0
    <4>[ 3965.441615]  [<ffffffff810371c5>] warn_slowpath_null+0x15/0x20
    <4>[ 3965.441618]  [<ffffffff81215963>] kobject_get+0x33/0x40
    <4>[ 3965.441624]  [<ffffffff812d1e45>] get_device_parent.isra.11+0x135/0x1f0
    <4>[ 3965.441627]  [<ffffffff812d22d4>] device_add+0xd4/0x6d0
    <4>[ 3965.441631]  [<ffffffff812d0dbc>] ? dev_set_name+0x3c/0x40
    ....
    <2>[ 3965.441912] kernel BUG at ..../fs/sysfs/group.c:65!
    <4>[ 3965.441915] invalid opcode: 0000 [#1] SMP
    ...
    <4>[ 3965.686743]  [<ffffffff811a677e>] sysfs_create_group+0xe/0x10
    <4>[ 3965.686748]  [<ffffffff810cfb04>] blk_trace_init_sysfs+0x14/0x20
    <4>[ 3965.686753]  [<ffffffff811fcabb>] blk_register_queue+0x3b/0x120
    <4>[ 3965.686756]  [<ffffffff812030bc>] add_disk+0x1cc/0x490
    ....
    -------------------------------------------------------
    Signed-off-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Weng Meiling <wengmeiling.weng@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
core.c 54.2 KB