• Vaibhav Jain's avatar
    kernfs: Check KERNFS_HAS_RELEASE before calling kernfs_release_file() · 966fa72a
    Vaibhav Jain authored
    Recently started seeing a kernel oops when a module tries removing a
    memory mapped sysfs bin_attribute. On closer investigation the root
    cause seems to be kernfs_release_file() trying to call
    kernfs_op.release() callback that's NULL for such sysfs
    bin_attributes. The oops occurs when kernfs_release_file() is called from
    kernfs_drain_open_files() to cleanup any open handles with active
    memory mappings.
    
    The patch fixes this by checking for flag KERNFS_HAS_RELEASE before
    calling kernfs_release_file() in function kernfs_drain_open_files().
    
    On ppc64-le arch with cxl module the oops back-trace is of the
    form below:
    [  861.381126] Unable to handle kernel paging request for instruction fetch
    [  861.381360] Faulting instruction address: 0x00000000
    [  861.381428] Oops: Kernel access of bad area, sig: 11 [#1]
    ....
    [  861.382481] NIP: 0000000000000000 LR: c000000000362c60 CTR:
    0000000000000000
    ....
    Call Trace:
    [c000000f1680b750] [c000000000362c34] kernfs_drain_open_files+0x104/0x1d0 (unreliable)
    [c000000f1680b790] [c00000000035fa00] __kernfs_remove+0x260/0x2c0
    [c000000f1680b820] [c000000000360da0] kernfs_remove_by_name_ns+0x60/0xe0
    [c000000f1680b8b0] [c0000000003638f4] sysfs_remove_bin_file+0x24/0x40
    [c000000f1680b8d0] [c00000000062a164] device_remove_bin_file+0x24/0x40
    [c000000f1680b8f0] [d000000009b7b22c] cxl_sysfs_afu_remove+0x144/0x170 [cxl]
    [c000000f1680b940] [d000000009b7c7e4] cxl_remove+0x6c/0x1a0 [cxl]
    [c000000f1680b990] [c00000000052f694] pci_device_remove+0x64/0x110
    [c000000f1680b9d0] [c0000000006321d4] device_release_driver_internal+0x1f4/0x2b0
    [c000000f1680ba20] [c000000000525cb0] pci_stop_bus_device+0xa0/0xd0
    [c000000f1680ba60] [c000000000525e80] pci_stop_and_remove_bus_device+0x20/0x40
    [c000000f1680ba90] [c00000000004a6c4] pci_hp_remove_devices+0x84/0xc0
    [c000000f1680bad0] [c00000000004a688] pci_hp_remove_devices+0x48/0xc0
    [c000000f1680bb10] [c0000000009dfda4] eeh_reset_device+0xb0/0x290
    [c000000f1680bbb0] [c000000000032b4c] eeh_handle_normal_event+0x47c/0x530
    [c000000f1680bc60] [c000000000032e64] eeh_handle_event+0x174/0x350
    [c000000f1680bd10] [c000000000033228] eeh_event_handler+0x1e8/0x1f0
    [c000000f1680bdc0] [c0000000000d384c] kthread+0x14c/0x190
    [c000000f1680be30] [c00000000000b5a0] ret_from_kernel_thread+0x5c/0xbc
    
    Fixes: f83f3c51 ("kernfs: fix locking around kernfs_ops->release() callback")
    Signed-off-by: default avatarVaibhav Jain <vaibhav@linux.vnet.ibm.com>
    Acked-by: default avatarTejun Heo <tj@kernel.org>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    966fa72a
file.c 25.2 KB