• Lukas Wunner's avatar
    PCI: pciehp: Avoid slot access during reset · 5b3f7b7d
    Lukas Wunner authored
    The ->reset_slot callback introduced by commits:
    
      2e35afae ("PCI: pciehp: Add reset_slot() method") and
      06a8d89a ("PCI: pciehp: Disable link notification across slot reset")
    
    disables notification of Presence Detect Changed and Data Link Layer
    State Changed events for the duration of a secondary bus reset.
    
    However a bus reset not only triggers these events, but may also clear
    the Presence Detect State bit in the Slot Status register and the Data
    Link Layer Link Active bit in the Link Status register momentarily.
    According to Sinan Kaya:
    
     "I know for a fact that bus reset clears the Data Link Layer Active bit
      as soon as link goes down.  It gets set again following link up.
      Presence detect depends on the HW implementation.  QDT root ports
      don't change presence detect for instance since nobody actually
      removed the card.  If an implementation supports in-band presence
      detect, the answer is yes.  As soon as the link goes down, presence
      detect bit will get cleared until recovery."
      https://lkml.kernel.org/r/42e72f83-3b24-f7ef-e5bc-290fae99259a@codeaurora.org
    
      In-band presence detect is also covered in Table 4-15 in PCIe r4.0,
      sec 4.2.6.
    
    pciehp should therefore ensure that any parts of the driver that access
    those bits do not run concurrently to a bus reset.  The only precaution
    the commits took to that effect was to halt interrupt polling.  They
    made no effort to drain the slot workqueue, cancel an outstanding
    Attention Button work, or block slot enable/disable requests via sysfs
    and in the ->probe hook.
    
    Now that pciehp is converted to enable/disable the slot exclusively from
    the IRQ thread, the only places accessing the two above-mentioned bits
    are the IRQ thread and the ->probe hook.  Add locking to serialize them
    with a bus reset.  This obviates the need to halt interrupt polling.
    Do not add locking to the ->get_adapter_status sysfs callback to afford
    users unfettered access to that bit.  Use an rw_semaphore in lieu of a
    regular mutex to allow parallel execution of the non-reset code paths
    accessing the critical bits, i.e. the IRQ thread and the ->probe hook.
    Signed-off-by: default avatarLukas Wunner <lukas@wunner.de>
    Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
    Cc: Rajat Jain <rajatja@google.com>
    Cc: Alex Williamson <alex.williamson@redhat.com>
    Cc: Sinan Kaya <okaya@kernel.org>
    5b3f7b7d
pciehp.h 7.97 KB