    cxlflash: Resolve oops in wait_port_offline · 20ab6948
    Manoj Kumar authored
    [ Upstream commit b45cdbaf ]
    
    If an async error interrupt is generated and the error requires the FC
    link to be reset, the reset cannot be performed in interrupt context, so
    a work element is scheduled to complete the link reset in process
    context. If either an EEH event or an escalation occurs between the
    time the interrupt is generated and the time the scheduled work starts,
    the MMIO space may no longer be available. This causes an oops in the
    worker thread.
    
    [  606.806583] NIP kthread_data+0x28/0x40
    [  606.806633] LR wq_worker_sleeping+0x30/0x100
    [  606.806694] Call Trace:
    [  606.806721] 0x50 (unreliable)
    [  606.806796] wq_worker_sleeping+0x30/0x100
    [  606.806884] __schedule+0x69c/0x8a0
    [  606.806959] schedule+0x44/0xc0
    [  606.807034] do_exit+0x770/0xb90
    [  606.807109] die+0x300/0x460
    [  606.807185] bad_page_fault+0xd8/0x150
    [  606.807259] handle_page_fault+0x2c/0x30
    [  606.807338] wait_port_offline.constprop.12+0x60/0x130 [cxlflash]
    
    To prevent the problem space area from being unmapped while work is
    pending, a mapcount (using the kref mechanism) is held. The mapcount
    is released only when the work is completed. The last reference
    release is tied to the unmapping service.
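
    The reference-counting idea described above can be illustrated with a
    small userspace sketch. This is not the driver's code: struct fake_afu
    and every fake_* function are hypothetical names, and a C11 atomic
    counter stands in for the kernel's kref. It only shows that the mapping
    is torn down by whoever drops the last reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter state; not the cxlflash structure. */
struct fake_afu {
    atomic_int mapcount;  /* plays the role of the kref */
    bool mmio_mapped;     /* true while the simulated MMIO window is usable */
};

static void fake_afu_init(struct fake_afu *afu)
{
    atomic_init(&afu->mapcount, 1);  /* initial reference held by the mapper */
    afu->mmio_mapped = true;
}

static void fake_afu_get(struct fake_afu *afu)
{
    atomic_fetch_add(&afu->mapcount, 1);
}

/*
 * Drop one reference. The unmap happens only when the last reference
 * goes away. Returns true if this call performed the unmap.
 */
static bool fake_afu_put(struct fake_afu *afu)
{
    int old = atomic_fetch_sub(&afu->mapcount, 1);

    assert(old > 0);  /* a put without a matching get is a bug */
    if (old == 1) {
        afu->mmio_mapped = false;  /* simulated unmapping service */
        return true;
    }
    return false;
}

/* Interrupt path: the reset cannot run here, so pin the mapping and defer. */
static void fake_schedule_link_reset(struct fake_afu *afu)
{
    fake_afu_get(afu);
}

/*
 * Worker path (process context): the reference taken at schedule time
 * keeps the window mapped until the work is done. Returns whether the
 * window was still mapped when the worker touched it.
 */
static bool fake_link_reset_worker(struct fake_afu *afu)
{
    bool touched_mapped = afu->mmio_mapped;  /* stand-in for register access */

    fake_afu_put(afu);  /* release only after the work is completed */
    return touched_mapped;
}
```

    In this sketch, if teardown calls fake_afu_put() while a reset is still
    pending, the window stays mapped; the unmap runs when the worker drops
    its reference.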

    Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
    Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
    Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>