- 13 May, 2022 5 commits
-
-
Jason Gunthorpe authored
Following patches will change the APIs to use the struct file as the handle instead of the vfio_group, so hang on to a reference to it with the same duration of as the vfio_group. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v3-f7729924a7ea+25e33-vfio_kvm_no_group_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
To make it easier to read and change in following patches. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v3-f7729924a7ea+25e33-vfio_kvm_no_group_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
Now that the iommu core takes care of isolation there is no race between driver attach and container unset. Once iommu_group_release_dma_owner() returns the device can immediately be re-used. Remove this mechanism. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/0-v1-a1e8791d795b+6b-vfio_container_q_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Alex Williamson authored
Merge IOMMU dependencies for vfio. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe via iommu authored
Once the group enters 'owned' mode it can never be assigned back to the default_domain or to a NULL domain. It must always be actively assigned to a current domain. If the caller hasn't provided a domain then the core must provide an explicit DMA blocking domain that has no DMA map. Lazily create a group-global blocking DMA domain when iommu_group_claim_dma_owner is first called and immediately assign the group to it. This ensures that DMA is immediately fully isolated on all IOMMU drivers. If the user attaches/detaches while owned then detach will set the group back to the blocking domain. Slightly reorganize the call chains so that __iommu_group_set_core_domain() is the function that removes any caller configured domain and sets the domains back a core owned domain with an appropriate lifetime. __iommu_group_set_domain() is the worker function that can change the domain assigned to a group to any target domain, including NULL. Add comments clarifying how the NULL vs detach_dev vs default_domain works based on Robin's remarks. This fixes an oops with VFIO and SMMUv3 because VFIO will call iommu_detach_group() and then immediately iommu_domain_free(), but SMMUv3 has no way to know that the domain it is holding a pointer to has been freed. Now the iommu_detach_group() will assign the blocking domain and SMMUv3 will no longer hold a stale domain reference. Fixes: 1ea2a07a ("iommu: Add DMA ownership management interfaces") Reported-by: Qian Cai <quic_qiancai@quicinc.com> Tested-by: Baolu Lu <baolu.lu@linux.intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Co-developed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> -- Just minor polishing as discussed v3: - Change names to __iommu_group_set_domain() / __iommu_group_set_core_domain() - Clarify comments - Call __iommu_group_set_domain() directly in iommu_group_release_dma_owner() since we know it is always selecting the default_domain - Remove redundant detach_dev ops check in __iommu_detach_device and make the added WARN_ON fail instead - Check for blocking_domain in __iommu_attach_group() so VFIO can actually attach a new group - Update comments and spelling - Fix missed change to new_domain in iommu_group_do_detach_device() v2: https://lore.kernel.org/r/0-v2-f62259511ac0+6-iommu_dma_block_jgg@nvidia.com v1: https://lore.kernel.org/r/0-v1-6e9d2d0a759d+11b-iommu_dma_block_jgg@nvidia.comReviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/0-v3-db7f0785022b+149-iommu_dma_block_jgg@nvidia.comSigned-off-by: Joerg Roedel <jroedel@suse.de>
-
- 11 May, 2022 15 commits
-
-
Jason Gunthorpe authored
The last user of this function is in PCI callbacks that want to convert their struct pci_dev to a vfio_device. Instead of searching use the vfio_device available trivially through the drvdata. When a callback in the device_driver is called, the caller must hold the device_lock() on dev. The purpose of the device_lock is to prevent remove() from being called (see __device_release_driver), and allow the driver to safely interact with its drvdata without races. The PCI core correctly follows this and holds the device_lock() when calling error_detected (see report_error_detected) and sriov_configure (see sriov_numvfs_store). Further, since the drvdata holds a positive refcount on the vfio_device any access of the drvdata, under the device_lock(), from a driver callback needs no further protection or refcounting. Thus the remark in the vfio_device_get_from_dev() comment does not apply here, VFIO PCI drivers all call vfio_unregister_group_dev() from their remove callbacks under the device_lock() and cannot race with the remaining callers. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v4-c841817a0349+8f-vfio_get_from_dev_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
Having a consistent pointer in the drvdata will allow the next patch to make use of the drvdata from some of the core code helpers. Use a WARN_ON inside vfio_pci_core_register_device() to detect drivers that miss this. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v4-c841817a0349+8f-vfio_get_from_dev_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
When the open_device() op is called the container_users is incremented and held incremented until close_device(). Thus, so long as drivers call functions within their open_device()/close_device() region they do not need to worry about the container_users. These functions can all only be called between open_device() and close_device(): vfio_pin_pages() vfio_unpin_pages() vfio_dma_rw() vfio_register_notifier() vfio_unregister_notifier() Eliminate the calls to vfio_group_add_container_user() and add vfio_assert_device_open() to detect driver mis-use. This causes the close_device() op to check device->open_count so always leave it elevated while calling the op. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/7-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
Now that callers have been updated to use the vfio_device APIs the driver facing group interface is no longer used, delete it: - vfio_group_get_external_user_from_dev() - vfio_group_pin_pages() - vfio_group_unpin_pages() - vfio_group_iommu_domain() -- Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
Use the existing vfio_device versions of vfio_(un)pin_pages(). There is no reason to use a group interface here, kvmgt has easy access to a vfio_device. Delete kvmgt_vdev::vfio_group since these calls were the last users. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Zhi Wang <zhi.a.wang@intel.com> Link: https://lore.kernel.org/r/5-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
Every caller has a readily available vfio_device pointer, use that instead of passing in a generic struct device. Change vfio_dma_rw() to take in the struct vfio_device and move the container users that would have been held by vfio_group_get_external_user_from_dev() to vfio_dma_rw() directly, like vfio_pin/unpin_pages(). Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
Every caller has a readily available vfio_device pointer, use that instead of passing in a generic struct device. The struct vfio_device already contains the group we need so this avoids complexity, extra refcountings, and a confusing lifecycle model. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
The next patch wants the vfio_device instead. There is no reason to store a pointer here since we can container_of back to the vfio_device. Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
All callers have a struct vfio_device trivially available, pass it in directly and avoid calling the expensive vfio_group_get_from_dev(). Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Robin Murphy authored
IOMMU groups have been mandatory for some time now, so a device without one is necessarily a device without any usable IOMMU, therefore the iommu_present() check is redundant (or at best unhelpful). Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/537103bbd7246574f37f2c88704d7824a3a889f2.1649160714.git.robin.murphy@arm.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Alex Williamson authored
Merge GVT-g dependencies for vfio. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Alex Williamson authored
Merge tag 'mlx5-lm-parallel' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into v5.19/vfio/next Improve mlx5 live migration driver From Yishai: This series improves mlx5 live migration driver in few aspects as of below. Refactor to enable running migration commands in parallel over the PF command interface. To achieve that we exposed from mlx5_core an API to let the VF be notified before that the PF command interface goes down/up. (e.g. PF reload upon health recovery). Once having the above functionality in place mlx5 vfio doesn't need any more to obtain the global PF lock upon using the command interface but can rely on the above mechanism to be in sync with the PF. This can enable parallel VFs migration over the PF command interface from kernel driver point of view. In addition, Moved to use the PF async command mode for the SAVE state command. This enables returning earlier to user space upon issuing successfully the command and improve latency by let things run in parallel. Alex, as this series touches mlx5_core we may need to send this in a pull request format to VFIO to avoid conflicts before acceptance. Link: https://lore.kernel.org/all/20220510090206.90374-1-yishaih@nvidia.comSigned-of-by: Leon Romanovsky <leonro@nvidia.com>
-
Yishai Hadas authored
Use the PF asynchronous command mode for the SAVE state command. This enables returning earlier to user space upon issuing successfully the command and improve latency by let things run in parallel. Link: https://lore.kernel.org/r/20220510090206.90374-5-yishaih@nvidia.comSigned-off-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
-
Yishai Hadas authored
Refactor to enable different VFs to run their commands over the PF command interface in parallel and to not block one each other. This is done by not using the global PF lock that was used before but relying on the VF attach/detach mechanism to sync. Link: https://lore.kernel.org/r/20220510090206.90374-4-yishaih@nvidia.comSigned-off-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
-
Yishai Hadas authored
Manage the VF attach/detach callback from the PF. This lets the driver to enable parallel VFs migration as will be introduced in the next patch. As part of this, reorganize the VF is migratable code to be in a separate function and rename it to be set_migratable() to match its functionality. Link: https://lore.kernel.org/r/20220510090206.90374-3-yishaih@nvidia.comSigned-off-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
-
- 10 May, 2022 1 commit
-
-
Yishai Hadas authored
Expose mlx5_sriov_blocking_notifier_register / unregister APIs to let a VF register to be notified for its enablement / disablement by the PF. Upon VF probe it will call mlx5_sriov_blocking_notifier_register() with its notifier block and upon VF remove it will call mlx5_sriov_blocking_notifier_unregister() to drop its registration. This can give a VF the ability to clean some resources upon disable before that the command interface goes down and on the other hand sets some stuff before that it's enabled. This may be used by a VF which is migration capable in few cases.(e.g. PF load/unload upon an health recovery). Link: https://lore.kernel.org/r/20220510090206.90374-2-yishaih@nvidia.comSigned-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
-
- 08 May, 2022 19 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds authored
Pull parisc architecture fixes from Helge Deller: "Some reverts of existing patches, which were necessary because of boot issues due to wrong CPU clock handling and cache issues which led to userspace segfaults with 32bit kernels. Dave has a whole bunch of upcoming cache fixes which I then plan to push in the next merge window. Other than that just small updates and fixes, e.g. defconfig updates, spelling fixes, a clocksource fix, boot topology fixes and a fix for /proc/cpuinfo output to satisfy lscpu" * tag 'for-5.18/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: Revert "parisc: Increase parisc_cache_flush_threshold setting" parisc: Mark cr16 clock unstable on all SMP machines parisc: Fix typos in comments parisc: Change MAX_ADDRESS to become unsigned long long parisc: Merge model and model name into one line in /proc/cpuinfo parisc: Re-enable GENERIC_CPU_DEVICES for !SMP parisc: Update 32- and 64-bit defconfigs parisc: Only list existing CPUs in cpu_possible_mask Revert "parisc: Fix patch code locking and flushing" Revert "parisc: Mark sched_clock unstable only if clocks are not syncronized" Revert "parisc: Mark cr16 CPU clocksource unstable on all SMP machines"
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds authored
Pull powerpc fixes from Michael Ellerman: - Fix the DWARF CFI in our VDSO time functions, allowing gdb to backtrace through them correctly. - Fix a buffer overflow in the papr_scm driver, only triggerable by hypervisor input. - A fix in the recently added QoS handling for VAS (used for communicating with coprocessors). Thanks to Alan Modra, Haren Myneni, Kajol Jain, and Segher Boessenkool. * tag 'powerpc-5.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/papr_scm: Fix buffer overflow issue with CONFIG_FORTIFY_SOURCE powerpc/vdso: Fix incorrect CFI in gettimeofday.S powerpc/pseries/vas: Use QoS credits from the userspace
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fix from Thomas Gleixner: "A fix and an email address update: - Prevent FPU state corruption. The condition in irq_fpu_usable() grants FPU usage when the FPU is not used in the kernel. That's just wrong as it does not take the fpregs_lock()'ed regions into account. If FPU usage happens within such a region from interrupt context, then the FPU state gets corrupted. That's a long standing bug, which got unearthed by the recent changes to the random code. - Josh wants to use his kernel.org email address" * tag 'x86-urgent-2022-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Prevent FPU state corruption MAINTAINERS: Update Josh Poimboeuf's email address
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull timer fix from Thomas Gleixner: "A fix and an email address update: - Mark the NMI safe time accessors notrace to prevent tracer recursion when they are selected as trace clocks. - John Stultz has a new email address" * tag 'timers-urgent-2022-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Mark NMI safe time accessors as notrace MAINTAINERS: Update email address for John Stultz
-
Helge Deller authored
This reverts commit a58e9d09. Triggers segfaults with 32-bit kernels on PA8500 machines. Signed-off-by: Helge Deller <deller@gmx.de>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq fix from Thomas Gleixner: "A fix for the threaded interrupt core. A quick sequence of request/free_irq() can result in a hang because the interrupt thread did not reach the thread function and got stopped in the kthread core already. That leaves a state active counter arround which makes a invocation of synchronized_irq() on that interrupt hang forever. Ensure that the thread reached the thread function in request_irq() to prevent that" * tag 'irq-urgent-2022-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Synchronize interrupt thread startup
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull locking fixlet from Thomas Gleixner: "Just a email address update for MAINTAINERS and mailmap" * tag 'locking-urgent-2022-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: MAINTAINERS, .mailmap: Update André's email address
-
Helge Deller authored
The cr16 interval timers are not synchronized across CPUs, even with just one dual-core CPU. This becomes visible if the machines have a longer uptime. Signed-off-by: Helge Deller <deller@gmx.de>
-
Julia Lawall authored
Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Helge Deller <deller@gmx.de>
-
Helge Deller authored
Dave noticed that for the 32-bit kernel MAX_ADDRESS should be a ULL, otherwise this define would become 0: MAX_ADDRESS (1UL << MAX_ADDRBITS) It has no real effect on the kernel. Signed-off-by: Helge Deller <deller@gmx.de> Noticed-by: John David Anglin <dave.anglin@bell.net>
-
Helge Deller authored
The Linux tool "lscpu" shows the double amount of CPUs if we have "model" and "model name" in two different lines in /proc/cpuinfo. This change combines the model and the model name into one line. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org
-
Helge Deller authored
In commit 62773112 ("parisc: Switch from GENERIC_CPU_DEVICES to GENERIC_ARCH_TOPOLOGY") GENERIC_CPU_DEVICES was unconditionally turned off, but this triggers a warning in topology_add_dev(). Turning it back on for the !SMP case avoids this warning. Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Fixes: 62773112 ("parisc: Switch from GENERIC_CPU_DEVICES to GENERIC_ARCH_TOPOLOGY") Signed-off-by: Helge Deller <deller@gmx.de>
-
Helge Deller authored
Enable CONFIG_CGROUPS=y on 32-bit defconfig for systemd-support, and enable CONFIG_NAMESPACES and CONFIG_USER_NS. Signed-off-by: Helge Deller <deller@gmx.de>
-
Helge Deller authored
The inventory knows which CPUs are in the system, so this bitmask should be in cpu_possible_mask instead of the bitmask based on CONFIG_NR_CPUS. Reset the cpu_possible_mask before scanning the system for CPUs, and mark each existing CPU as possible during initialization of that CPU. This avoids those warnings later on too: register_cpu_capacity_sysctl: too early to get CPU4 device! Signed-off-by: Helge Deller <deller@gmx.de> Noticed-by: John David Anglin <dave.anglin@bell.net>
-
Helge Deller authored
This reverts commit a9fe7fa7. Leads to segfaults on 32bit kernel. Signed-off-by: Helge Deller <deller@gmx.de>
-
Helge Deller authored
This reverts commit d97180ad. It triggers RCU stalls at boot with a 32-bit kernel. Signed-off-by: Helge Deller <deller@gmx.de> Noticed-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # v5.15+
-
Helge Deller authored
This reverts commit afdb4a5b. It triggers RCU stalls at boot with a 32-bit kernel. Signed-off-by: Helge Deller <deller@gmx.de> Noticed-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # v5.16+
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull PASID fix from Thomas Gleixner: "A single bugfix for the PASID management code, which freed the PASID too early. The PASID needs to be tied to the mm lifetime, not to the address space lifetime" * tag 'core-urgent-2022-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: mm: Fix PASID use-after-free issue
-