- 23 May, 2023 2 commits
-
-
Davidlohr Bueso authored
Factor out common functionality/semantics for cxl shared interrupts into a new helper on top of devm_request_irq().

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/20230523170927.20685-4-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
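A minimal sketch of what such a helper can look like, assuming the field and type names (e.g. 'struct cxl_dev_id') from the driver's context:

    /* Sketch: one place to encode the shared-irq semantics for cxl_pci */
    struct cxl_dev_id {
            struct cxl_dev_state *cxlds;
    };

    static int cxl_request_irq(struct cxl_dev_state *cxlds, int irq,
                               irq_handler_t thread_fn)
    {
            struct device *dev = cxlds->dev;
            struct cxl_dev_id *dev_id;

            /* dev_id must be unique per handler for shared interrupts */
            dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
            if (!dev_id)
                    return -ENOMEM;
            dev_id->cxlds = cxlds;

            return devm_request_threaded_irq(dev, irq, NULL, thread_fn,
                                             IRQF_SHARED | IRQF_ONESHOT,
                                             NULL, dev_id);
    }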
-
Davidlohr Bueso authored
Move the cxl_alloc_irq_vectors() call earlier in the probe sequence in order to allow for mailbox interrupt usage. No change in semantics.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/20230523170927.20685-3-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 18 May, 2023 1 commit
-
-
Dave Jiang authored
Move cxl_await_media_ready() to cxl_pci probe, before the driver starts issuing IDENTIFY and retrieving memory device information, to ensure that the device is ready to provide the information. Allow cxl_pci_probe() to succeed even if the media is not ready. Cache the media failure in cxlds and don't ask the device for any media information.

The rationale for proceeding in the !media_ready case is to allow mailbox operations to interrogate and/or remediate the device. After the media is repaired, rebinding the cxl_pci driver is expected to restart the capacity scan.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Fixes: b39cb105 ("cxl/mem: Register CXL memX devices")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168445310026.3251520.8124296540679268206.stgit@djiang5-mobl3
[djbw: fixup cxl_test]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
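A hedged sketch of the probe-time flow this describes, assuming the 'media_ready' field name on the device state:

    /* Sketch (probe context): media failure is cached, not fatal */
    rc = cxl_await_media_ready(cxlds);
    if (rc)
            dev_warn(&pdev->dev, "Media not active (%d)\n", rc);
    else
            cxlds->media_ready = true;

    /*
     * Later, capacity enumeration (IDENTIFY, partition info) checks
     * cxlds->media_ready and is skipped when false, leaving a
     * mailbox-only device available for remediation.
     */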
-
- 23 Apr, 2023 1 commit
-
-
Alison Schofield authored
Driver reads of the poison list are synchronized to ensure that a reader does not get an incomplete list because its request overlapped with (was interrupted or preceded by) another read request of the same DPA range (CXL 3.0 Section 8.2.9.8.4.1). The driver maintains state information to achieve this goal.

To initialize the state, first recognize the poison commands in the CEL (Command Effects Log). If the device supports Get Poison List, allocate a single buffer for the poison list and protect it with a lock.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/9078d180769be28a5087288b38cdfc827cae58bf.1681838291.git.alison.schofield@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
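A sketch of that initialization, with helper and field names assumed from context; the single lock-protected buffer is the point:

    /* Sketch: set up poison-list state once the CEL has been parsed */
    static int cxl_poison_state_init(struct cxl_dev_state *cxlds)
    {
            mutex_init(&cxlds->poison.lock);

            /* Only allocate if Get Poison List appeared in the CEL */
            if (!test_bit(CXL_POISON_ENABLED_LIST,
                          cxlds->poison.enabled_cmds))
                    return 0;

            /* One buffer, sized to the mailbox payload, serves all readers */
            cxlds->poison.list_out = kvmalloc(cxlds->payload_size, GFP_KERNEL);
            if (!cxlds->poison.list_out)
                    return -ENOMEM;

            return 0;
    }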
-
- 18 Apr, 2023 1 commit
-
-
Lukas Wunner authored
The PCI core has just been amended to create a pci_doe_mb struct for every DOE instance on device enumeration. Drop creation of a (duplicate) CDAT DOE mailbox on cxl probing in favor of the one already created by the PCI core.

Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/becaf70e8faf9681d474200117d62d7eaac46cca.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
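On the CXL side the duplicate setup then reduces to a lookup of the core-managed mailbox, roughly as below (the CXL constant names are assumptions from the driver headers):

    /* Sketch: reuse the DOE mailbox the PCI core created at enumeration */
    struct pci_doe_mb *doe_mb;

    doe_mb = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL,
                                  CXL_DOE_PROTOCOL_TABLE_ACCESS);
    if (!doe_mb)
            dev_dbg(&pdev->dev, "No CDAT mailbox found\n");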
-
- 14 Feb, 2023 2 commits
-
-
Dave Jiang authored
By default the CXL RAS mask register bits are set to 1 and suppress all error reporting. If the kernel has negotiated ownership of error handling for CXL, then unmask the mask registers by writing 0s. The PCI_EXP_DEVCTL capability is checked to see whether the uncorrectable or correctable error reporting bits are set before unmasking the respective errors.

Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_regs.h
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167639402301.778884.12556849214955646539.stgit@djiang5-mobl3.local
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
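A sketch of the unmask flow, with the CXL_RAS_* offset names assumed from the driver's register definitions:

    /* Sketch: unmask RAS errors only where the OS owns error reporting */
    static int cxl_pci_ras_unmask(struct pci_dev *pdev)
    {
            struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
            void __iomem *addr;
            u16 cap;
            int rc;

            rc = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &cap);
            if (rc)
                    return rc;

            if (cap & PCI_EXP_DEVCTL_URRE) {   /* uncorrectable enabled */
                    addr = cxlds->regs.ras + CXL_RAS_UNCORRECTABLE_MASK_OFFSET;
                    writel(0, addr);           /* 0 == report everything */
            }
            if (cap & PCI_EXP_DEVCTL_CERE) {   /* correctable enabled */
                    addr = cxlds->regs.ras + CXL_RAS_CORRECTABLE_MASK_OFFSET;
                    writel(0, addr);
            }
            return 0;
    }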
-
Dave Jiang authored
With this [1] commit upstream, the driver no longer needs to call pci_enable_pcie_error_reporting(). Remove the call and related cleanups.

[1]: f26e58bf ("PCI/AER: Enable error reporting when AER is native")

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167632012093.4153151.5360778069735064322.stgit@djiang5-mobl3.local
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 30 Jan, 2023 2 commits
-
-
Dan Williams authored
The IRQ core expects that users of the default hardirq handler specify IRQF_ONESHOT to keep interrupts disabled until the threaded handler runs. That meets the CXL driver's expectations since the interrupt is an edge-triggered MSI, and this flag would have been passed by default had pci_request_irq() been used instead of devm_request_threaded_irq().

Fixes: a49aa814 ("cxl/mem: Wire up event interrupts")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
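In diff terms the fix is essentially one flags change; this is an illustrative reconstruction rather than the verbatim hunk:

    -        return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
    -                                         IRQF_SHARED, NULL, dev_id);
    +        return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
    +                                         IRQF_SHARED | IRQF_ONESHOT, NULL,
    +                                         dev_id);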
-
Jonathan Cameron authored
CXL r3.0 section 8.2.9.4.2 "Set Timestamp" recommends that the host set the timestamp after every Conventional or CXL Reset to ensure accurate timestamps; that includes the initial boot. The time base being set is used by the device for the poison list overflow timestamp and all event timestamps. Note that the command is optional; if it is not supported and the device cannot return accurate timestamps, it fills the fields in with an appropriate marker (see the specification description of each timestamp).

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230130151327.32415-1-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
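A hedged sketch of such a probe-time call; the opcode and send-command helper names are assumptions based on the driver's mailbox conventions:

    /* Sketch: best-effort Set Timestamp at probe; failure is non-fatal */
    static int cxl_set_timestamp(struct cxl_dev_state *cxlds)
    {
            struct cxl_mbox_set_timestamp_in pi;
            struct cxl_mbox_cmd mbox_cmd;
            int rc;

            pi.timestamp = cpu_to_le64(ktime_get_real_ns());
            mbox_cmd = (struct cxl_mbox_cmd) {
                    .opcode = CXL_MBOX_OP_SET_TIMESTAMP,
                    .size_in = sizeof(pi),
                    .payload_in = &pi,
            };

            rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
            /* The command is optional; a device may legitimately reject it */
            if (rc == -EOPNOTSUPP || rc == -EINVAL)
                    return 0;
            return rc;
    }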
-
- 27 Jan, 2023 1 commit
-
-
Davidlohr Bueso authored
Currently the only CXL features targeted for irq support require their message numbers to be within the first 16 entries. The device may, however, support fewer than 16 entries. Attempt to allocate these 16 irq vectors; if the device supports fewer, the PCI infrastructure will allocate that number. Upon successful allocation, users can plug in their respective isr at any point thereafter.

CXL device events are signaled via interrupts, and each event log may have a different interrupt message number. These message numbers are reported in the Get Event Interrupt Policy mailbox command.

Add interrupt support for event logs. Interrupts are allocated as shared interrupts, so all or some event logs can share the same message number. In addition, all logs are queried on any interrupt, in order from most to least severe based on the status register.

Finally, place all event configuration logic into cxl_event_config(). Previously the logic was a simple 'read all' on start up, but interrupts must be configured prior to any reads to ensure no events are missed. A single event configuration function results in a cleaner overall implementation.

Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-2-2316a5c8f7d8@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
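A sketch of the vector allocation described above; the hard-coded 16 stands in for whatever constant the driver defines:

    /* Sketch: ask for 16 vectors, accept whatever the device provides */
    static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
    {
            int nvecs;

            /* Message numbers for current CXL features fit in entries 0-15 */
            nvecs = pci_alloc_irq_vectors(pdev, 1, 16,
                                          PCI_IRQ_MSIX | PCI_IRQ_MSI);
            if (nvecs < 1) {
                    dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n",
                            nvecs);
                    return -ENXIO;
            }
            return 0;
    }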
-
- 26 Jan, 2023 1 commit
-
-
Ira Weiny authored
CXL devices have multiple event logs which can be queried for CXL event records. Devices are required to support the storage of at least one event record in each event log type. Devices track event log overflow by incrementing a counter and tracking the time of the first and last overflow event seen.

Software queries events via the Get Event Records mailbox command (CXL rev 3.0 section 8.2.9.2.2) and clears events via the Clear Event Records mailbox command (CXL rev 3.0 section 8.2.9.2.3).

If the result of negotiating CXL Error Reporting Control is OS control, read and clear all event logs on driver load to ensure a clean slate of events. The status register is not used because a device may continue to trigger events, and the only requirement is to empty the log at least once; this allows for the required transition from empty to non-empty for interrupt generation. Handling of interrupts is in a follow-on patch.

The device can return up to 1MB worth of event records per query, so allocate a shared large buffer to handle the max number of records based on the mailbox payload size.

This patch traces a raw event record and leaves specific event record type tracing to subsequent patches. Macros are created to aid in tracing the common CXL Event header fields.

Each record is cleared explicitly. A clear-all bit is specified but is only valid when the log overflows.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-1-2316a5c8f7d8@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
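A condensed sketch of the read-and-clear loop for one log; the helper names here are placeholders for illustration, not the patch's actual functions:

    /* Sketch: drain one event log until the device reports it empty */
    do {
            /* Get Event Records (CXL 3.0 8.2.9.2.2) for this log type */
            rc = cxl_get_event_records(cxlds, log_type, payload);
            if (rc)
                    break;

            nr = le16_to_cpu(payload->record_count);
            if (!nr)
                    break;          /* log has transitioned to empty */

            trace_raw_records(payload);     /* trace each raw record */

            /* Clear Event Records (8.2.9.2.3) by handle, not clear-all */
            rc = cxl_clear_event_records(cxlds, log_type, payload);
            /* loop again: more records may have arrived meanwhile */
    } while (!rc);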
-
- 25 Jan, 2023 1 commit
-
-
Robert Richter authored
For debugging it is very helpful to see which commands are sent. Add it to the debug message.

Signed-off-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/20230103210151.1126873-1-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 09 Jan, 2023 1 commit
-
-
Dave Jiang authored
The 'addr' variable holding the RAS UE status register address gets reassigned to the RAS_CAP_CONTROL offset when there are multiple UE errors. Use a different variable for that lookup to avoid clobbering the status address.

Fixes: 2905cb52 ("cxl/pci: Add (hopeful) error handling support")
Reported-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167302318779.580155.15233596744650706167.stgit@djiang5-mobl3.local
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
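Schematically, with the offset and field names assumed, the bug and fix look like:

    /* before: 'addr' silently repointed away from the UE status register */
    addr = cxlds->regs.ras + CXL_RAS_UNCORRECTABLE_STATUS_OFFSET;
    status = readl(addr);
    if (hweight32(status) > 1)
            addr = cxlds->regs.ras + CXL_RAS_CAP_CONTROL_OFFSET; /* clobbers */

    /* after: a dedicated variable keeps the status address intact */
    if (hweight32(status) > 1) {
            void __iomem *rcc_addr =
                    cxlds->regs.ras + CXL_RAS_CAP_CONTROL_OFFSET;

            fe = BIT(FIELD_GET(CXL_RAS_CAP_CONTROL_FE_MASK,
                               readl(rcc_addr)));
    }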
-
- 05 Jan, 2023 1 commit
-
-
Dan Williams authored
CXL is using tracepoints for reporting RAS capability register payloads for AER events, and has plans to use tracepoints for the output payload of Get Poison List and Get Event Records commands. For organization purposes it would be nice to keep those all under a single + local CXL trace system. This organization also potentially helps in the future when CXL drivers expand beyond generic memory expanders; however, that would also entail a move away from the expander-specific cxl_dev_state context, so save that for later.

Note that the powerpc-specific drivers/misc/cxl/ also defines a 'cxl' trace system; however, it is unlikely that a single platform will ever load both drivers simultaneously.

Cc: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 06 Dec, 2022 2 commits
-
-
Dan Williams authored
readl() already handles endian conversion; that is the main difference between readl() and __raw_readl(). This is benign on little-endian systems, but big-endian systems will end up byte-swapping twice.

Fixes: 2905cb52 ("cxl/pci: Add (hopeful) error handling support")
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167030092025.4045167.10651070153523351093.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
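The distinction, as a two-line illustration:

    u32 raw = __raw_readl(addr); /* raw load: no byte-swap, no barriers */
    u32 val = readl(addr);       /* LE register value in CPU byte order */

    /*
     * Bug pattern: wrapping readl() in another conversion is a no-op on
     * little-endian but swaps a second time on big-endian.
     */
    u32 bad = le32_to_cpu((__force __le32)readl(addr)); /* double-swap on BE */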
-
Dan Williams authored
The first argument to the CXL AER trace points is the source device. Pass a 'const struct device *' rather than a 'const char *' for more type precision / safety.

Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167030091477.4045167.15174636482098463885.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 05 Dec, 2022 1 commit
-
-
Dan Williams authored
Unlike a CXL memory expander in a VH topology that has at least one intervening 'struct cxl_port' instance between itself and the CXL root device, an RCD attaches one level higher. For example:

    VH
              ┌──────────┐
              │ ACPI0017 │
              │  root0   │
              └─────┬────┘
                    │
              ┌─────┴────┐
              │  dport0  │
        ┌─────┤ ACPI0016 ├─────┐
        │     │  port1   │     │
        │     └────┬─────┘     │
        │          │           │
     ┌──┴───┐   ┌──┴───┐   ┌───┴──┐
     │dport0│   │dport1│   │dport2│
     │ RP0  │   │ RP1  │   │ RP2  │
     └──────┘   └──┬───┘   └──────┘
                   │
               ┌───┴─────┐
               │endpoint0│
               │  port2  │
               └─────────┘

    ...vs:

    RCH
              ┌──────────┐
              │ ACPI0017 │
              │  root0   │
              └────┬─────┘
                   │
               ┌───┴────┐
               │ dport0 │
               │ACPI0016│
               └───┬────┘
                   │
              ┌────┴─────┐
              │endpoint0 │
              │  port1   │
              └──────────┘

So arrange for the endpoint port in the RCH/RCD case to appear directly connected to the host-bridge in its singular role as a dport. Compare that to the VH case where the host-bridge serves a dual role as a 'cxl_dport' for the CXL root device *and* a 'cxl_port' upstream port for the Root Ports in the Root Complex that are modeled as 'cxl_dport' instances in the CXL topology.

Another deviation from the VH case is that RCDs may need to look up their component registers from the Root Complex Register Block (RCRB). That platform-firmware-specified RCRB area is cached by the cxl_acpi driver and conveyed via the host-bridge dport to the cxl_mem driver to perform the cxl_rcrb_to_component() lookup for the endpoint port (see 9.11.8 "CXL Devices Attached to an RCH" for the lookup of the upstream port component registers).

Tested-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/166993045621.1882361.1730100141527044744.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 03 Dec, 2022 7 commits
-
-
Dave Jiang authored
Add an AER error handler callback to read the RAS capability structure correctable error (CE) status register for the CXL device. Log the error as a trace event and clear the error. For CXL devices, the driver also needs to write back to the status register to clear the unmasked correctable errors. See CXL spec rev 3.0 section 8.2.4.16 for the RAS capability structure CE Status Register.

Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166985287203.2871899.13605149073500556137.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
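A hedged sketch of such a callback; offset, mask, and trace point names are assumed from the driver's conventions:

    /* Sketch: AER correctable-error callback over the RAS capability */
    static void cxl_cor_error_detected(struct pci_dev *pdev)
    {
            struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
            void __iomem *addr;
            u32 status;

            if (!cxlds->regs.ras)
                    return;

            addr = cxlds->regs.ras + CXL_RAS_CORRECTABLE_STATUS_OFFSET;
            status = readl(addr);
            if (status & CXL_RAS_CORRECTABLE_STATUS_MASK) {
                    /* status bits are write-1-to-clear */
                    writel(status & CXL_RAS_CORRECTABLE_STATUS_MASK, addr);
                    trace_cxl_aer_correctable_error(cxlds->dev, status);
            }
    }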
-
Dan Williams authored
Add nominal error handling that tears down CXL.mem in response to error notifications that imply a device reset. Given that some CXL.mem may be operating as System RAM, there is a high likelihood that these error events are fatal. However, if the system survives the notification, the expectation is that the driver behavior is equivalent to a hot-unplug and re-plug of an endpoint.

Note that this does not change the mask values from the default. That awaits CXL _OSC support to determine whether platform firmware is in control of the mask registers.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166974413966.1608150.15522782911404473932.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Dave Jiang authored
Add tracepoint events for recording the CXL uncorrectable and correctable errors. For uncorrectable errors, there is an additional 512B of data from the header log register (CXL spec rev 3.0 section 8.2.4.16.7); the trace event takes a dynamic array that dumps the entire header log. If multiple errors are set in the status register, then the 'first error' field (CXL spec rev 3.0 section 8.2.4.16.6) is read from the Error Capabilities and Control Register in order to determine the error. This implementation does not include CXL IDE error details.

Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/166974413388.1608150.5875712482260436188.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
The RAS Capability Structure has some ancillary information that may be relevant with respect to AER events: the link and protocol error status registers. Map the RAS Capability Registers in support of defining a 'struct pci_error_handlers' instance for the cxl_pci driver.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166974412803.1608150.7096566580400947001.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
There is no need to carry the barno and the block offset through the stack; just convert them to a resource base immediately.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166974411035.1608150.8605988708101648442.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
The component registers are currently unused by the cxl_pci driver; only the physical address base of the component registers is conveyed to the cxl_mem driver. Just call cxl_map_device_registers() directly.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166974410443.1608150.15855499736133349600.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
The three objects 'struct cxl_nvdimm_bridge', 'struct cxl_nvdimm', and 'struct cxl_pmem_region' manage CXL persistent memory resources. The bridge represents base platform resources, the nvdimm represents one or more endpoints, and the region is a collection of nvdimms that contribute to an assembled address range. Their relationship is such that a region is torn down if any component endpoints are removed, and all regions and endpoints are torn down if the foundational bridge device goes down.

A workqueue was deployed to manage these interdependencies, but it is difficult to reason about, and fragile. A recent attempt to take the CXL root device lock in the cxl_mem driver was reported by lockdep as colliding with the flush_work() in the cxl_pmem flows.

Instead of the workqueue, arrange for all pmem/nvdimm devices to be torn down immediately and hierarchically. A similar change is made to both the 'cxl_nvdimm' and 'cxl_pmem_region' objects; for bisect-ability both changes are made in the same patch, which unfortunately makes the patch bigger than desired.

Arrange for cxl_memdev and cxl_region to register a cxl_nvdimm and cxl_pmem_region as a devres release action of the bridge device. Additionally, include a devres release action of the cxl_memdev or cxl_region device that triggers the bridge's release action if an endpoint exits before the bridge. I.e. this allows either unplugging the bridge, or unplugging an endpoint, to result in the same cleanup actions.

To keep the patch smaller, the cleanup of the now-defunct workqueue infrastructure is saved for a follow-on patch.

Tested-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/166993041773.1882361.16444301376147207609.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
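The shape of that lifecycle, hedged; the action function names here are invented for illustration:

    /* Sketch: nvdimm teardown rides the bridge's devres stack... */
    rc = devm_add_action_or_reset(&cxl_nvb->dev, unregister_nvdimm, cxl_nvd);
    if (rc)
            return rc;

    /*
     * ...and an endpoint exiting first triggers the bridge-registered
     * action early via its own release action, so both unplug orders
     * converge on the same cleanup.
     */
    return devm_add_action_or_reset(&cxlmd->dev, detach_nvdimm, cxl_nvd);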
-
- 14 Nov, 2022 1 commit
-
-
Ira Weiny authored
The PCIe Data Object Exchange (DOE) mailbox is a protocol run over configuration cycles, and it assumes one initiator at a time. While the kernel has control of the mailbox, user space writes could interfere with that access. Mark the DOE mailbox config space exclusive when iterated by the CXL driver.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220926215711.2893286-3-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 19 Jul, 2022 1 commit
-
-
Ira Weiny authored
DOE mailbox objects will be needed for various mailbox communications with each memory device. Iterate each DOE mailbox capability and create PCI DOE mailbox objects as found.

It is not anticipated that this is the final resting place for the iteration of the DOE devices. Support for switch ports will drive this code into the PCIe side; in that imagined architecture the CXL port driver would then query the PCI device for the DOE mailbox array. For now, creating the mailboxes in the CXL port is good enough for the endpoints. Later, PCIe ports will need this to support switch ports more generically.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20220719205249.566684-5-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 10 Jul, 2022 1 commit
-
-
Dan Williams authored
To date the per-device-partition DPA range information has only been used for enumeration purposes. In preparation for allocating regions from available DPA capacity, convert those ranges into DPA-type resource trees. With resources and the new add_dpa_res() helper, some open-coded end-address calculations and debug prints can be cleaned up.

The 'cxlds->pmem_res' and 'cxlds->ram_res' resources are child resources of the total-device DPA space, and they in turn will host DPA allocations from cxl_endpoint_decoder instances (tracked by cxled->dpa_res).

Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165603878921.551046.8127845916514734142.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
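A sketch of what add_dpa_res() plausibly looks like, treating the exact signature as an assumption:

    /* Sketch: publish a partition as a child of the device's DPA space */
    static int add_dpa_res(struct device *dev, struct resource *parent,
                           struct resource *res, resource_size_t start,
                           resource_size_t size, const char *type)
    {
            int rc;

            *res = (struct resource) {
                    .name = type,
                    .start = start,
                    .end = start + size - 1,
                    .flags = IORESOURCE_MEM,
            };
            if (resource_size(res) == 0) {
                    dev_dbg(dev, "DPA(%s): no capacity\n", res->name);
                    return 0;
            }
            rc = request_resource(parent, res);
            if (rc) {
                    dev_err(dev, "DPA(%s): failed to track %pr (%d)\n",
                            res->name, res, rc);
                    return rc;
            }
            dev_dbg(dev, "DPA(%s): %pr\n", res->name, res);
            return 0;
    }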
-
- 19 May, 2022 4 commits
-
-
Dan Williams authored
In preparation for fixing the setting of the 'mem_enabled' bit in the CXL DVSEC Control register, move all CXL DVSEC range enumeration into the same source file.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291688886.1426646.15046138604010482084.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
Allow cxl_await_media_ready() to be mocked for testing purposes rather than carrying the maintenance burden of an indirect function call in the mainline driver. With the move, cxl_await_media_ready() can no longer reuse the mailbox timeout override, so add a media_ready_timeout module parameter to the core to backfill.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291688340.1426646.4755627801983775011.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
A check of mem_info_valid already happens in __cxl_dvsec_ranges(); rely on that instead of calling wait_for_valid() again.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291686632.1426646.7479581732894574486.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
Now that wait_for_media() does nothing supplemental to wait_for_media_ready(), just promote wait_for_media_ready() to a common helper and drop wait_for_media().

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291686046.1426646.4390664747934592185.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 13 Apr, 2022 2 commits
-
-
Dan Williams authored
cxl_dvsec_ranges(), the helper for enumerating the presence of an active legacy CXL.mem configuration on a CXL 2.0 Memory Expander, is not fatal for cxl_pci because there is still value in enabling mailbox operations even if CXL.mem operation is disabled. Recall that the reason cxl_pci does this initialization, and not cxl_mem, is to preserve the useful property (for unit testing) that cxl_mem is cxl_memdev + mmio generic and does not require access to a 'struct pci_dev' to issue config cycles.

Update 'struct cxl_endpoint_dvsec_info' to carry either a positive number of non-zero-size legacy CXL DVSEC ranges, or the negative error code from __cxl_dvsec_ranges(), in its @ranges member.

Reported-by: Krzysztof Zach <krzysztof.zach@intel.com>
Fixes: 560f7855 ("cxl/pci: Retrieve CXL DVSEC memory info")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164730735869.3806189.4032428192652531946.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
In preparation for not treating DVSEC range initialization failures as fatal to cxl_pci_probe(), add individual dev_dbg() statements for each of the major failure reasons in cxl_dvsec_ranges(). The rationale for cxl_dvsec_ranges() failure not being fatal is that there is still value for cxl_pci to enable mailbox operations even if CXL.mem operation is disabled.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164730734812.3806189.2726330688692684104.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams <dan.j.williams@intel.com>
-
- 12 Apr, 2022 3 commits
-
-
Davidlohr Bueso authored
Use the global cxl_mbox_cmd_rc table to improve debug messaging in __cxl_pci_mbox_send_cmd() and allow cxl_mbox_send_cmd() to map to proper kernel-style errno codes. This patch continues to use -ENXIO only, so there is no change in semantics.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/20220404021216.66841-5-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Davidlohr Bueso authored
Upon a completed command, the caller is still expected to check the actual return_code register to ensure it succeeded. This adds, per the spec, the potential command return codes. It maps the hardware return codes to the kernel's errno style, and by default continues to use -ENXIO (command completed, but the device reported an error).

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/20220404021216.66841-4-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
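A trimmed sketch of such a mapping table; the enum entries and strings here are illustrative, not the full spec-defined set:

    /* Sketch: hardware return codes mapped to errno plus a log string */
    struct cxl_mbox_cmd_rc {
            int err;
            const char *desc;
    };

    static const struct cxl_mbox_cmd_rc cxl_mbox_cmd_rctable[] = {
            [CXL_MBOX_CMD_RC_SUCCESS]    = { 0, NULL },
            [CXL_MBOX_CMD_RC_BACKGROUND] = { -ENXIO,
                    "background cmd started successfully" },
            [CXL_MBOX_CMD_RC_INPUT]      = { -ENXIO, "cmd input was invalid" },
            [CXL_MBOX_CMD_RC_INTERNAL]   = { -ENXIO, "internal error" },
            /* ... one entry per return code the spec defines ... */
    };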
-
Davidlohr Bueso authored
Also mention the need for the caller to check return_code for any errors reported by the hardware.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/20220404021216.66841-3-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 08 Apr, 2022 1 commit
-
-
Dan Williams authored
0day reports that wait_for_media_ready() declares an @rc variable twice:

    >> drivers/cxl/pci.c:439:7: warning: Local variable 'rc' shadows outer variable [shadowVariable]
        int rc;
            ^
       drivers/cxl/pci.c:431:6: note: Shadowed declaration
        int rc, i;
            ^
       drivers/cxl/pci.c:439:7: note: Shadow variable
        int rc;
            ^

Cc: Randy Dunlap <rdunlap@infradead.org>
Fixes: 523e594d ("cxl/pci: Implement wait for media active")
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/164944636936.455177.14136200464724208233.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 09 Feb, 2022 3 commits
-
-
Dan Williams authored
Per the CXL specification (8.1.12.2 Memory Device PCIe Capabilities and Extended Capabilities) the Device Serial Number capability is mandatory. Emit it for user tooling to identify devices.

It is reasonable to ask whether the attribute should be added to the list of PCI sysfs device attributes. The PCI layer can optionally emit it too, but the CXL subsystem is aiming to preserve its independence and the possibility of CXL topologies with non-PCI devices in it. To date that has only proven useful for the 'cxl_test' model, but as can be seen with ACPI0016 devices, sometimes all that is needed is a platform firmware table to point to CXL Component Registers in MMIO space to define a "CXL" device.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164366608838.196598.16856227191534267098.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Ben Widawsky authored
CXL 2.0 8.1.3.8.2 states:

    Memory_Active: When set, indicates that the CXL Range 1 memory is
    fully initialized and available for software use. Must be set within
    Range 1 Memory_Active_Timeout of deassertion of reset to CXL device
    if CXL.mem HwInit Mode=1

Unfortunately, Memory_Active can take quite a long time depending on media size (up to 256s per the 2.0 spec). Provide a callback for the eventual establishment of CXL.mem operations via the 'cxl_mem' driver and the 'struct cxl_memdev'. The implementation waits for 60s by default for now and can be overridden by the mbox_ready_timeout module parameter.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
[djbw: switch to sleeping wait]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298427373.3018233.9309741847039301834.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
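A hedged sketch of the sleeping wait, with the DVSEC register macros and timeout variable names assumed:

    /* Sketch: sleeping wait for Memory_Active in the DVSEC range register */
    static int wait_for_media_ready(struct cxl_dev_state *cxlds)
    {
            struct pci_dev *pdev = to_pci_dev(cxlds->dev);
            int d = cxlds->cxl_dvsec;
            int rc, i;

            for (i = mbox_ready_timeout; i; i--) {
                    u32 temp;

                    rc = pci_read_config_dword(
                            pdev, d + CXL_DVSEC_RANGE_SIZE_LOW(0), &temp);
                    if (rc)
                            return rc;

                    if (FIELD_GET(CXL_DVSEC_MEM_ACTIVE, temp))
                            return 0;
                    msleep(1000);   /* sleeping wait, not a spin */
            }
            return -ETIMEDOUT;
    }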
-
Ben Widawsky authored
Before the CXL 2.0 HDM Decoder Capability mechanisms can be utilized in a device, the driver must determine that the device is ready for CXL.mem operation and that platform firmware, or some other agent, has established an active decode via the legacy CXL 1.1 decoder mechanism. This legacy mechanism is defined in the CXL DVSEC as a set of range registers and status bits that take time to settle after a reset.

Validate the CXL memory decode setup via the DVSEC and cache it for later consideration by the cxl_mem driver (to be added). Failure to validate is not fatal to the cxl_pci driver, since that is only providing CXL command support over PCI.mmio, and that support might be needed to rectify CXL DVSEC validation problems.

Any potential ranges that the device is already claiming via DVSEC need to be reconciled with the dynamic provisioning ranges provided by platform firmware (like ACPI CEDT.CFMWS). Leave that reconciliation to the cxl_mem driver.

[djbw: shorten defines]
[djbw: change precise spin wait to generous msleep]
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
[djbw: clarify changelog]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164375911821.559935.7375160041663453400.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-