    vfio/pci: Mask INTx during runtime suspend · 4813724c
    Abhishek Sahu authored
    This patch adds INTx handling during runtime suspend/resume.
    All the suspend/resume related code for the user to put the device
    into the low power state will be added in subsequent patches.
    
    The INTx lines may be shared among devices. Whenever any INTx
    interrupt comes for the VFIO devices, then vfio_intx_handler() will be
    called for each device sharing the interrupt. Inside vfio_intx_handler(),
    it calls pci_check_and_mask_intx() and checks if the interrupt has
    been generated for the current device. If the device is already in
    the D3cold state, the config space cannot be read. Attempting to read
    the config space in the D3cold state can cause system unresponsiveness
    on some systems. To prevent this, mask INTx in the runtime suspend
    callback and unmask it in the runtime resume callback. If INTx has already
    been masked, no handling is needed in the runtime suspend/resume callbacks.
    'pm_intx_masked' tracks this, and vfio_pci_intx_mask() has been updated
    to return true if the INTx vfio_pci_irq_ctx.masked value is changed
    inside this function.
    
    For a runtime suspend that is triggered when the VFIO device has no
    user, 'irq_type' will be VFIO_PCI_NUM_IRQS and these callbacks
    won't do anything.
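
    The suspend/resume bookkeeping described above can be sketched as a small
    user-space model. This is only an illustration under stated assumptions:
    the model_* names and struct layouts are hypothetical stand-ins, not the
    kernel's definitions, and the real vfio_pci_intx_mask() also takes locks
    and touches the hardware, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Index values mirror the VFIO uAPI enum; everything else is a stand-in. */
enum { VFIO_PCI_INTX_IRQ_INDEX = 0, VFIO_PCI_NUM_IRQS = 5 };

struct model_irq_ctx {
    bool masked;            /* models vfio_pci_irq_ctx.masked */
};

struct model_vdev {
    int irq_type;           /* VFIO_PCI_NUM_IRQS when the device has no user */
    bool pm_intx_masked;    /* set only if runtime suspend did the masking */
    struct model_irq_ctx ctx;
};

/* Returns true only if this call changed ctx.masked from false to true. */
bool model_intx_mask(struct model_vdev *vdev)
{
    assert(vdev != NULL);
    if (vdev->irq_type != VFIO_PCI_INTX_IRQ_INDEX || vdev->ctx.masked)
        return false;
    vdev->ctx.masked = true;
    return true;
}

void model_intx_unmask(struct model_vdev *vdev)
{
    assert(vdev != NULL);
    if (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)
        vdev->ctx.masked = false;
}

/* Runtime suspend: mask INTx and remember whether we were the ones to do it. */
void model_runtime_suspend(struct model_vdev *vdev)
{
    vdev->pm_intx_masked = vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX &&
                           model_intx_mask(vdev);
}

/* Runtime resume: undo the mask only if suspend applied it. */
void model_runtime_resume(struct model_vdev *vdev)
{
    if (vdev->pm_intx_masked) {
        model_intx_unmask(vdev);
        vdev->pm_intx_masked = false;
    }
}
```

    In this model, a mask that the user applied before suspend is left in
    place on resume, and a device whose 'irq_type' is VFIO_PCI_NUM_IRQS passes
    through both callbacks unchanged.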
    
    MSI/MSI-X interrupts are not shared, so similar handling should not be
    needed for them. vfio_msihandler() triggers eventfd_signal()
    without doing any device-specific config access. When the user performs
    any config access or IOCTL after receiving the eventfd notification,
    then the device will be moved to the D0 state first before
    servicing any request.
    
    Another option was to check this flag 'pm_intx_masked' inside
    vfio_intx_handler() instead of masking the interrupts. This flag
    is being set inside the runtime_suspend callback but the device
    can be in a non-D3cold state (for example, if the user has disabled D3cold
    explicitly through sysfs, or D3cold is not supported on the platform).
    Also, when D3cold is supported, the device will be in D0 until the
    PCI core moves the device into D3cold. In this case, there is
    a possibility that the device can generate an interrupt. Adding a check
    in the IRQ handler would not clear the IRQ status, and the interrupt
    line would still be asserted. This can cause interrupt flooding.
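
    The flooding concern can be illustrated with a toy model of a
    level-triggered line. Everything here is an assumption made for
    illustration: the model_* names are hypothetical, 'intx_disabled' stands
    in for whatever masking the suspend callback applies, and the loop counts
    how often a handler that returns early on a flag would be re-entered.

```c
#include <assert.h>
#include <stdbool.h>

struct model_line {
    bool device_asserting;  /* device has a pending INTx condition */
    bool intx_disabled;     /* INTx masked, so the line is not driven */
};

/* A level-triggered line stays asserted until the source is silenced. */
bool model_line_asserted(const struct model_line *line)
{
    return line->device_asserting && !line->intx_disabled;
}

/*
 * Counts handler invocations, capped at 'limit'.
 * mask_in_suspend == true:  INTx is masked before the interrupt can fire.
 * mask_in_suspend == false: the handler only checks a flag and returns,
 *                           so nothing ever deasserts the line.
 */
int model_deliver(struct model_line *line, bool mask_in_suspend, int limit)
{
    int calls = 0;

    assert(limit >= 0);
    if (mask_in_suspend)
        line->intx_disabled = true;

    while (model_line_asserted(line) && calls < limit)
        calls++;    /* flag-only handler returns without clearing anything */

    return calls;
}
```

    With the flag-only variant the count always reaches the cap, while the
    masked variant never enters the handler at all.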
    
    Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com>
    Link: https://lore.kernel.org/r/20220829114850.4341-4-abhsahu@nvidia.com
    Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
vfio_pci_priv.h 3.04 KB