    nvme: check for valid nvme_identify_ns() before using it · d8b90d60
    Ewan D. Milne authored
    When scanning namespaces, it is possible to get valid data from the first
    call to nvme_identify_ns() in nvme_alloc_ns(), but not from the second
    call in nvme_update_ns_info_block().  In particular, if the NSID becomes
    inactive between the two commands, a storage device may return a buffer
    filled with zeros as per 4.1.5.1.  In this case, we can get a kernel crash
    due to a divide-by-zero in blk_stack_limits() because ns->lba_shift will
    be set to zero.
    
    PID: 326      TASK: ffff95fec3cd8000  CPU: 29   COMMAND: "kworker/u98:10"
     #0 [ffffad8f8702f9e0] machine_kexec at ffffffff91c76ec7
     #1 [ffffad8f8702fa38] __crash_kexec at ffffffff91dea4fa
     #2 [ffffad8f8702faf8] crash_kexec at ffffffff91deb788
     #3 [ffffad8f8702fb00] oops_end at ffffffff91c2e4bb
     #4 [ffffad8f8702fb20] do_trap at ffffffff91c2a4ce
     #5 [ffffad8f8702fb70] do_error_trap at ffffffff91c2a595
     #6 [ffffad8f8702fbb0] exc_divide_error at ffffffff928506e6
     #7 [ffffad8f8702fbd0] asm_exc_divide_error at ffffffff92a00926
        [exception RIP: blk_stack_limits+434]
        RIP: ffffffff92191872  RSP: ffffad8f8702fc80  RFLAGS: 00010246
        RAX: 0000000000000000  RBX: ffff95efa0c91800  RCX: 0000000000000001
        RDX: 0000000000000000  RSI: 0000000000000001  RDI: 0000000000000001
        RBP: 00000000ffffffff   R8: ffff95fec7df35a8   R9: 0000000000000000
        R10: 0000000000000000  R11: 0000000000000001  R12: 0000000000000000
        R13: 0000000000000000  R14: 0000000000000000  R15: ffff95fed33c09a8
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
     #8 [ffffad8f8702fce0] nvme_update_ns_info_block at ffffffffc06d3533 [nvme_core]
     #9 [ffffad8f8702fd18] nvme_scan_ns at ffffffffc06d6fa7 [nvme_core]
    
    This happened when the check for valid data was moved out of nvme_identify_ns()
    into one of the callers.  Fix this by checking in both callers.
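
The idea of validating the identify data in both callers can be sketched in user-space C as below. This is only an illustration, not the patch itself: the struct, the function names, and the use of a zero capacity field (NCAP) as the sign of an all-zero response are assumptions made for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily trimmed stand-in for the Identify Namespace data. */
struct sketch_id_ns {
	uint64_t nsze;  /* namespace size, in logical blocks */
	uint64_t ncap;  /* namespace capacity; zero in an all-zero buffer */
	uint8_t  lbads; /* log2 of the LBA data size of the format in use */
};

/*
 * Hypothetical helper: treat a response with zero capacity as coming
 * from an inactive NSID (assumption: an all-zero buffer implies ncap == 0).
 */
static int sketch_check_id_ns(const struct sketch_id_ns *id, bool *is_removed)
{
	if (id->ncap == 0) {
		*is_removed = true;
		return -ENODEV;
	}
	*is_removed = false;
	return 0;
}

/*
 * Stand-in for the second caller. It re-validates the data instead of
 * trusting the result of an earlier identify command, and it leaves
 * *lba_shift untouched when the data is rejected.
 */
static int sketch_update_ns_info(const struct sketch_id_ns *id,
				 unsigned int *lba_shift, bool *is_removed)
{
	int ret = sketch_check_id_ns(id, is_removed);

	if (ret)
		return ret;

	*lba_shift = id->lbads;
	return 0;
}
```

With a zero-filled struct, sketch_update_ns_info() returns -ENODEV and never copies the zero shift value, so later code has nothing invalid to compute with.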
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=218186
    Fixes: 0dd6fff2 ("nvme: bring back auto-removal of deleted namespaces during sequential scan")
    Cc: stable@vger.kernel.org
    Signed-off-by: Ewan D. Milne <emilne@redhat.com>
    Signed-off-by: Keith Busch <kbusch@kernel.org>