Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single"

This reverts commit 2c3fc8d2. This commit broke on x86 PV because entries in the generic SWIOTLB are indexed using (pseudo-)physical address not DMA address and these are not the same in a x86 PV guest.

Signed-off-by:
David Vrabel <david.vrabel@citrix.com> Reviewed-by:
Stefano Stabellini <stefano.stabellini@eu.citrix.com>

xen: annotate xen_set_identity_and_remap_chunk() with __init

Commit 5b8e7d80 removed the __init annotation from xen_set_identity_and_remap_chunk(). Add it again.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen: introduce helper functions to do safe read and write accesses

Introduce two helper functions to safely read and write unsigned long values from or to memory when the access may fault because the mapping is non-present or read-only.

These helpers can be used instead of open-coded uses of __get_user() and __put_user(), avoiding the need for casts to fix sparse warnings.

Use the helpers in page.h and p2m.c. This fixes the sparse warnings reported by "make C=1".

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen: Speed up set_phys_to_machine() by using read-only mappings

Instead of checking at each call of set_phys_to_machine() whether a new p2m page has to be allocated due to writing an entry in a large invalid or identity area, just map those areas read-only and react to a page fault on write by allocating the new page.

This change makes the common path with no allocation much faster, as it only requires a single write of the new mfn instead of walking the address translation tables and checking for the special cases.

Suggested-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen: switch to linear virtual mapped sparse p2m list

At start of day the Xen hypervisor presents a contiguous mfn list to a pv-domain. In order to support sparse memory this mfn list is accessed via a three-level p2m tree built early in the boot process. Whenever the system needs the mfn associated with a pfn this tree is used to find the mfn.

Instead of using a software-walked tree for accessing a specific mfn list entry, this patch creates a virtual address area for the entire possible mfn list including memory holes. The holes are covered by mapping a pre-defined page consisting only of "invalid mfn" entries. Access to an mfn entry is possible by just using the virtual base address of the mfn list and the pfn as index into that list. This speeds up the (hot) path of determining the mfn of a pfn.

Kernel build on a Dell Latitude E6440 (2 cores, HT) in 64 bit Dom0 showed the following improvements:

Elapsed time: 32:50 -> 32:35
System: 18:07 -> 17:47
User: 104:00 -> 103:30

Tested with the following configurations:
- 64 bit dom0, 8GB RAM
- 64 bit dom0, 128 GB RAM, PCI-area above 4 GB
- 32 bit domU, 512 MB, 8 GB, 43 GB (more wouldn't work even without the patch)
- 32 bit domU, ballooning up and down
- 32 bit domU, save and restore
- 32 bit domU with PCI passthrough
- 64 bit domU, 8 GB, 2049 MB, 5000 MB
- 64 bit domU, ballooning up and down
- 64 bit domU, save and restore
- 64 bit domU with PCI passthrough

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen: Hide get_phys_to_machine() to be able to tune common path

Today get_phys_to_machine() is always called when the mfn for a pfn is to be obtained. Add a wrapper __pfn_to_mfn() as an inline function so that calling get_phys_to_machine() can be avoided where possible once the switch to a linear mapped p2m list has been done.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

x86: Introduce function to get pmd entry pointer

Introduces lookup_pmd_address() to get the address of the pmd entry related to a virtual address in the current address space. This function is needed for support of a virtual mapped sparse p2m list in xen pv domains, as we need the address of the pmd entry, not the one of the pte in that case.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen: Delay invalidating extra memory

When the physical memory configuration is initialized, the p2m entries for memory pages that are not populated are set to "invalid". As those pages are beyond the hypervisor-built p2m list, the p2m tree has to be extended.

This patch delays processing the extra memory related p2m entries during the boot process until some more basic memory management functions are callable. This removes the need to create new p2m entries until virtual memory management is available.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen: Delay m2p_override initialization

The m2p overrides are used to be able to find the local pfn for a foreign mfn mapped into the domain. They are used by driver backends having to access frontend data.

As this functionality isn't used in early boot it makes no sense to initialize the m2p override functions very early. It can be done later without doing any harm, removing the need for allocating memory via extend_brk().

While at it make some m2p override functions static as they are only used internally.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen: Delay remapping memory of pv-domain

Early in the boot process the memory layout of a pv-domain is changed to match the E820 map (either the host one for Dom0 or the Xen one) regarding placement of RAM and PCI holes. This requires removing memory pages initially located at positions not suitable for RAM and adding them later at higher addresses where no restrictions apply.

To be able to operate on the hypervisor-supported p2m list until a virtual mapped linear p2m list can be constructed, remapping must be delayed until virtual memory management is initialized, as the initial p2m list can't be extended without limit at physical memory initialization time due to its fixed structure.

A further advantage is the reduction in complexity and code volume as we don't have to be careful regarding memory restrictions during p2m updates.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen: use common page allocation function in p2m.c

In arch/x86/xen/p2m.c three different allocation functions for obtaining a memory page are used: extend_brk(), alloc_bootmem_align() or __get_free_page(). Which of those functions is used depends on the progress of the boot process of the system.

Introduce a common allocation routine that dynamically selects the allocation routine to call based on the boot progress. This allows moving initialization steps without having to care about changing allocation calls.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen: Make functions static

Some functions in arch/x86/xen/p2m.c are used locally only. Make them static. Rearrange the functions in p2m.c to avoid forward declarations.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen: fix some style issues in p2m.c

The source arch/x86/xen/p2m.c has some coding style issues. Fix them.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen/pci: Use APIC directly when APIC virtualization hardware is available

When hardware supports APIC/x2APIC virtualization we don't need to use pirqs for MSI handling and can instead use APIC, since most APIC accesses (MMIO or MSR) will now be processed without VMEXITs.

As an example, netperf on the original code produces this profile (collected with 'xentrace -e 0x0008ffff -T 5'):

    342 cpu_change
    260 CPUID
    34638 HLT
    64067 INJ_VIRQ
    28374 INTR
    82733 INTR_WINDOW
    10 NPF
    24337 TRAP
    370610 vlapic_accept_pic_intr
    307528 VMENTRY
    307527 VMEXIT
    140998 VMMCALL
    127 wrap_buffer

After applying this patch the same test shows:

    230 cpu_change
    260 CPUID
    36542 HLT
    174 INJ_VIRQ
    27250 INTR
    222 INTR_WINDOW
    20 NPF
    24999 TRAP
    381812 vlapic_accept_pic_intr
    166480 VMENTRY
    166479 VMEXIT
    77208 VMMCALL
    81 wrap_buffer

ApacheBench results (ab -n 10000 -c 200) improve by about 10%.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen/pci: Defer initialization of MSI ops on HVM guests

If the hardware supports APIC virtualization we may decide not to use pirqs and instead use APIC/x2APIC directly, meaning that we don't want to set x86_msi.setup_msi_irqs and x86_msi.teardown_msi_irq to Xen-specific routines. However, x2APIC is not set up by the time pci_xen_hvm_init() is called so we need to postpone setting these ops until later, when we know which APIC mode is used.

(Note that currently x2APIC is never initialized on HVM guests. This may change in the future.)

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen-pciback: drop SR-IOV VFs when PF driver unloads

When a PF driver unloads, it may find it necessary to leave the VFs around simply because pciback has marked them as assigned to a guest. Utilize a suitable notification to let go of the VFs, thus allowing the PF to go back into the state it was in before its driver loaded (which in particular allows the driver to be loaded again and create the VFs anew, but which also allows the PF to then be passed through instead of the VFs).

Don't do this however for any VFs currently in active use by a guest.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
[v2: Removed the switch statement, moved it about]
[v3: Redid it a bit differently]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen/pciback: Restore configuration space when detaching from a guest.

The commit "xen/pciback: Don't deadlock when unbinding." was using the version of pci_reset_function which would lock the device lock. That is no good as we can deadlock. As such we swapped to using the lock-less version and requiring that the callers of 'pcistub_put_pci_dev' take the device lock. And as such this bug got exposed.

Using the lock-less version is OK, except that we tried to use 'pci_restore_state' after the lock-less version of __pci_reset_function_locked, which won't work as 'state_saved' is set to false. Said 'state_saved' is a toggle boolean that is to be used by the sequence of
a) pci_save_state/pci_restore_state or
b) pci_load_and_free_saved_state/pci_restore_state.

We don't want to use a) as the guest might have messed up the PCI configuration space and we want it to revert to the state when the PCI device was bound to us. Therefore we pick b) to restore the configuration space.

We restore from our 'golden' version of PCI configuration space when a:
- Device is unbound from pciback
- Device is detached from a guest.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

PCI: Expose pci_load_saved_state for public consumption.

We have pci_load_and_free_saved_state and pci_store_saved_state, but are missing the functionality to just load the state multiple times into the PCI device without having to free/save the state.

This patch makes it possible to use this function.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen/pciback: Remove tons of dereferences

A little cleanup. No functional difference.

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen/pciback: Print out the domain owning the device.

We had been printing it only if the device was built with debug enabled. But this information is useful in the field to troubleshoot.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen/pciback: Include the domain id if removing the device whilst still in use

Clean up the function a bit, and also include the id of the domain that is using the device.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

driver core: Provide an wrapper around the mutex to do lockdep warnings

Instead of open-coding it in drivers that want to double check that their functions are indeed holding the device lock.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Suggested-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

xen/pciback: Don't deadlock when unbinding.

As commit 0a9fd015 'xen/pciback: Document the entry points for 'pcistub_put_pci_dev'' explained, there are four entry points in this function. Two of them are when the user fiddles in the SysFS to unbind a device which might be in use by a guest or not.

Both 'unbind' states will cause a deadlock as the PCI lock has already been taken, which then pci_device_reset tries to take.

We can simplify this by requiring that all callers of pcistub_put_pci_dev MUST hold the device lock. And then we can just call the lockless version of pci_device_reset.

To make it even simpler we will modify xen_pcibk_release_pci_dev to qualify whether it should take a lock or not - as it ends up calling xen_pcibk_release_pci_dev and needs to hold the lock.

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by:
David Vrabel <david.vrabel@citrix.com>

swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single

Need to pass the pointer within the swiotlb internal buffer to the swiotlb library, which in the case of xen_unmap_single is dev_addr, not paddr.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org
(cherry picked from commit dbdd7476 4ef8e3f3 2c3fc8d2)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
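
The revert at the top of this log turns on the difference between a buffer's (pseudo-)physical address and its DMA address in an x86 PV guest. The following is a minimal, self-contained C sketch of that difference, not the kernel's code: the table toy_p2m, the pool layout, the slot size, and the functions toy_phys_to_dma() and toy_slot_index() are all hypothetical names and values invented for illustration. It models a bounce pool whose slots are indexed by pseudo-physical address and shows that an address translated through a p2m-style table no longer identifies a slot.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT  12                      /* 4 KiB pages */
#define TOY_SLOT_SHIFT  11                      /* 2 KiB slots (illustrative value) */
#define TOY_POOL_PAGES  4                       /* size of the toy bounce pool */
#define TOY_P2M_ENTRIES 8
#define TOY_POOL_START  (2ull << TOY_PAGE_SHIFT) /* pool begins at pfn 2 */

/* Hypothetical p2m table: pseudo-physical frame -> machine frame.
 * The machine frames are deliberately scattered, as they may be in a PV guest. */
static const uint64_t toy_p2m[TOY_P2M_ENTRIES] = {
    0x900, 0x17, 0x4a2, 0x33, 0x801, 0x12, 0x7ff, 0x5
};

/* Translate a pseudo-physical address to a DMA (machine) address. */
static uint64_t toy_phys_to_dma(uint64_t paddr)
{
    uint64_t pfn = paddr >> TOY_PAGE_SHIFT;
    uint64_t off = paddr & ((1ull << TOY_PAGE_SHIFT) - 1);

    assert(pfn < TOY_P2M_ENTRIES);
    return (toy_p2m[pfn] << TOY_PAGE_SHIFT) | off;
}

/* Look up a bounce slot. The pool is indexed by pseudo-physical address;
 * returns the slot index, or -1 if the address is outside the pool. */
static long toy_slot_index(uint64_t addr)
{
    uint64_t pool_end = TOY_POOL_START +
                        ((uint64_t)TOY_POOL_PAGES << TOY_PAGE_SHIFT);

    if (addr < TOY_POOL_START || addr >= pool_end)
        return -1;
    return (long)((addr - TOY_POOL_START) >> TOY_SLOT_SHIFT);
}
```

With paddr = TOY_POOL_START + (3ull << TOY_SLOT_SHIFT), toy_slot_index(paddr) returns 3, while toy_slot_index(toy_phys_to_dma(paddr)) returns -1 because the translated address lies outside the pool. In this toy model, handing the DMA address to a lookup keyed by pseudo-physical address fails in the same way the revert message describes.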