1. 14 Feb, 2023 1 commit
    • cxl/pmem: Fix nvdimm registration races · f57aec44
      Dan Williams authored
      A loop of the form:
      
          while true; do modprobe cxl_pci; modprobe -r cxl_pci; done
      
      ...fails with the following crash signature:
      
          BUG: kernel NULL pointer dereference, address: 0000000000000040
          [..]
          RIP: 0010:cxl_internal_send_cmd+0x5/0xb0 [cxl_core]
          [..]
          Call Trace:
           <TASK>
           cxl_pmem_ctl+0x121/0x240 [cxl_pmem]
           nvdimm_get_config_data+0xd6/0x1a0 [libnvdimm]
           nd_label_data_init+0x135/0x7e0 [libnvdimm]
           nvdimm_probe+0xd6/0x1c0 [libnvdimm]
           nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm]
           really_probe+0xde/0x380
           __driver_probe_device+0x78/0x170
           driver_probe_device+0x1f/0x90
           __device_attach_driver+0x85/0x110
           bus_for_each_drv+0x7d/0xc0
           __device_attach+0xb4/0x1e0
           bus_probe_device+0x9f/0xc0
           device_add+0x445/0x9c0
           nd_async_device_register+0xe/0x40 [libnvdimm]
           async_run_entry_fn+0x30/0x130
      
      ...namely that the bottom half of async nvdimm device registration runs
      after CXL has already torn down the context that cxl_pmem_ctl() needs.
      The ACPI NFIT case benefits from launching the nvdimm device
      registrations listed in the table in parallel, but CXL is already
      marked PROBE_PREFER_ASYNCHRONOUS, so a second level of asynchrony
      gains nothing there. So provide for a synchronous registration path
      to preclude this scenario.
      
      Fixes: 21083f51 ("cxl/pmem: Register 'pmem' / cxl_nvdimm devices")
      Cc: <stable@vger.kernel.org>
      Reported-by: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2. 11 Feb, 2023 26 commits
3. 09 Feb, 2023 1 commit
4. 07 Feb, 2023 4 commits
5. 30 Jan, 2023 3 commits
6. 29 Jan, 2023 5 commits
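
The ordering problem described in the cxl/pmem commit above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it is single-threaded, it replaces the kernel's async machinery with a one-slot deferred queue, and every name in it (fake_ctx, fake_nvdimm, register_async, register_sync, run_cycle, and so on) is hypothetical and not a kernel or libnvdimm API. It records the missing context as an error code instead of dereferencing NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver context that cxl_pmem_ctl() needs. */
struct fake_ctx {
    int label_size;
};

/* Hypothetical stand-in for an nvdimm device being registered. */
struct fake_nvdimm {
    struct fake_ctx *ctx;  /* cleared when the parent driver unbinds */
    int probe_result;      /* 0 = probe saw a valid context, -1 = context gone */
};

/* "Bottom half" of registration: the probe step that needs the context. */
static void fake_probe(struct fake_nvdimm *nv)
{
    assert(nv != NULL);
    /* The real failure is a NULL dereference; the sketch only records it. */
    nv->probe_result = (nv->ctx != NULL) ? 0 : -1;
}

/* One-slot deferred-work queue that models asynchronous scheduling. */
static struct fake_nvdimm *pending;

/* Async path: the probe is queued and runs at some later point. */
static void register_async(struct fake_nvdimm *nv)
{
    pending = nv;
}

/* Sync path: the probe completes before registration returns. */
static void register_sync(struct fake_nvdimm *nv)
{
    fake_probe(nv);
}

/* Run whatever deferred probe is still queued. */
static void flush_async(void)
{
    if (pending != NULL) {
        fake_probe(pending);
        pending = NULL;
    }
}

/* Models the driver unbind that happens on "modprobe -r". */
static void teardown(struct fake_nvdimm *nv)
{
    nv->ctx = NULL;
}

/* One load/unload cycle; returns what the probe observed. */
int run_cycle(int synchronous)
{
    struct fake_ctx ctx = { .label_size = 128 };
    struct fake_nvdimm nv = { .ctx = &ctx, .probe_result = 0 };

    if (synchronous)
        register_sync(&nv);
    else
        register_async(&nv);

    teardown(&nv);   /* unload starts right after registration returns */
    flush_async();   /* a deferred probe only gets to run now */
    return nv.probe_result;
}
```

In this model, run_cycle(0) takes the deferred path and the probe finds the context already cleared, while run_cycle(1) finishes the probe before teardown can begin. The real fix lives in the kernel's registration path; the sketch only shows why ordering the probe before teardown closes the window.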