- 18 Jul, 2016 10 commits
-
-
Andre Przywara authored
Introduce a new KVM device that represents an ARM Interrupt Translation Service (ITS) controller. Since there can be multiple of this per guest, we can't piggy back on the existing GICv3 distributor device, but create a new type of KVM device. On the KVM_CREATE_DEVICE ioctl we allocate and initialize the ITS data structure and store the pointer in the kvm_device data. Upon an explicit init ioctl from userland (after having setup the MMIO address) we register the handlers with the kvm_io_bus framework. Any reference to an ITS thus has to go via this interface. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Andre Przywara authored
The ARM GICv3 ITS emulation code goes into a separate file, but needs to be connected to the GICv3 emulation, of which it is an option. The ITS MMIO handlers require the respective ITS pointer to be passed in, so we amend the existing VGIC MMIO framework to let it cope with that. Also we introduce the basic ITS data structure and initialize it, but don't return any success yet, as we are not yet ready for the show. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Andre Przywara authored
In the GICv3 redistributor there are the PENDBASER and PROPBASER registers which we did not emulate so far, as they only make sense when having an ITS. In preparation for that emulate those MMIO accesses by storing the 64-bit data written into it into a variable which we later read in the ITS emulation. We also sanitise the registers, making sure RES0 regions are respected and checking for valid memory attributes. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Andre Przywara authored
arm-gic-v3.h contains bit and register definitions for the GICv3 and ITS, at least for the bits the we currently care about. The ITS emulation needs more definitions, so add them and refactor the memory attribute #defines to be more universally usable. To avoid changing all users, we still provide some of the old definitons defined with the help of the new macros. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Andre Przywara authored
In the moment our struct vgic_irq's are statically allocated at guest creation time. So getting a pointer to an IRQ structure is trivial and safe. LPIs are more dynamic, they can be mapped and unmapped at any time during the guest's _runtime_. In preparation for supporting LPIs we introduce reference counting for those structures using the kernel's kref infrastructure. Since private IRQs and SPIs are statically allocated, we avoid actually refcounting them, since they would never be released anyway. But we take provisions to increase the refcount when an IRQ gets onto a VCPU list and decrease it when it gets removed. Also this introduces vgic_put_irq(), which wraps kref_put and hides the release function from the callers. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Andre Przywara authored
The kvm_io_bus framework is a nice place of holding information about various MMIO regions for kernel emulated devices. Add a call to retrieve the kvm_io_device structure which is associated with a certain MMIO address. This avoids to duplicate kvm_io_bus' knowledge of MMIO regions without having to fake MMIO calls if a user needs the device a certain MMIO address belongs to. This will be used by the ITS emulation to get the associated ITS device when someone triggers an MSI via an ioctl from userspace. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Andre Przywara authored
KVM capabilities can be a per-VM property, though ARM/ARM64 currently does not pass on the VM pointer to the architecture specific capability handlers. Add a "struct kvm*" parameter to those function to later allow proper per-VM capability reporting. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Andre Przywara authored
The ARM GICv3 ITS MSI controller requires a device ID to be able to assign the proper interrupt vector. On real hardware, this ID is sampled from the bus. To be able to emulate an ITS controller, extend the KVM MSI interface to let userspace provide such a device ID. For PCI devices, the device ID is simply the 16-bit bus-device-function triplet, which should be easily available to the userland tool. Also there is a new KVM capability which advertises whether the current VM requires a device ID to be set along with the MSI data. This flag is still reported as not available everywhere, later we will enable it when ITS emulation is used. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Andre Przywara authored
kvm_register_device_ops() can return an error, so lets check its return value and propagate this up the call chain. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Andre Przywara authored
Logically a GICv3 redistributor is assigned to a (v)CPU, so we should aim to keep redistributor related variables out of our struct vgic_dist. Let's start by replacing the redistributor related kvm_io_device array with two members in our existing struct vgic_cpu, which are naturally per-VCPU and thus don't require any allocation / freeing. So apart from the better fit with the redistributor design this saves some code as well. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 14 Jul, 2016 1 commit
-
-
Dan Carpenter authored
My static checker complains that this condition looks like it should be == instead of =. This isn't a fast path, so we don't need to be fancy. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 03 Jul, 2016 19 commits
-
-
Marc Zyngier authored
We have both KERN_TO_HYP and kern_hyp_va, which do the exact same thing. Let's standardize on the latter. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
This is more of a safety measure than anything else: If we end-up with an idmap page that intersect with the range picked for the the HYP VA space, abort the KVM setup, as it is unsafe to go further. I cannot imagine it happening on 64bit (we have a mechanism to work around it), but could potentially occur on a 32bit system with the kernel loaded high enough in memory so that in conflicts with the kernel VA. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
We can now remove a number of dead #defines, thanks to the trampoline code being gone. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
So far, KVM was getting in the way of kexec on 32bit (and the arm64 kexec hackers couldn't be bothered to fix it on 32bit...). With simpler page tables, tearing KVM down becomes very easy, so let's just do it. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
Just like for arm64, we can now make the HYP setup a lot simpler, and we can now initialise it in one go (instead of the two phases we currently have). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
There is no way to free the boot PGD, because it doesn't exist anymore as a standalone entity. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
Since we now only have one set of page tables, the concept of boot_pgd is useless and can be removed. We still keep it as an element of the "extended idmap" thing. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
Now that we only have the "merged page tables" case to deal with, there is a bunch of things we can simplify in the HYP code (both at init and teardown time). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
We're in a position where we can now always have "merged" page tables, where both the runtime mapping and the idmap coexist. This results in some code being removed, but there is more to come. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
Add the code that enables the switch to the lower HYP VA range. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
Declare the __hyp_text_start/end symbols in asm/virt.h so that they can be reused without having to declare them locally. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
As we move towards a selectable HYP VA range, it is obvious that we don't want to test a variable to find out if we need to use the bottom VA range, the top VA range, or use the address as is (for VHE). Instead, we can expand our current helper to generate the right mask or nop with code patching. We default to using the top VA space, with alternatives to switch to the bottom one or to nop out the instructions. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
Define the two possible HYP VA regions in terms of VA_BITS, and keep HYP_PAGE_OFFSET_MASK as a temporary compatibility definition. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
As we need to indicate to the rest of the kernel which region of the HYP VA space is safe to use, add a capability that will indicate that KVM should use the [VA_BITS-2:0] range. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
HYP_PAGE_OFFSET is not massively useful. And the way we use it in KERN_HYP_VA is inconsistent with the equivalent operation in EL2, where we use a mask instead. Let's replace the uses of HYP_PAGE_OFFSET with HYP_PAGE_OFFSET_MASK, and get rid of the pointless macro. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
hyp_kern_va is now completely unused, so let's remove it entirely. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
__hyp_panic_string is passed via the HYP panic code to the panic function, and is being "upgraded" to a kernel address, as it is referenced by the HYP code (in a PC-relative way). This is a bit silly, and we'd be better off obtaining the kernel address and not mess with it at all. This patch implements this with a tiny bit of asm glue, by forcing the string pointer to be read from the literal pool. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
Since dealing with VA ranges tends to hurt my brain badly, let's start with a bit of documentation that will hopefully help understanding what comes next... Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
I don't think any single piece of the KVM/ARM code ever generated as much hatred as the GIC emulation. It was written by someone who had zero experience in modeling hardware (me), was riddled with design flaws, should have been scrapped and rewritten from scratch long before having a remote chance of reaching mainline, and yet we supported it for a good three years. No need to mention the names of those who suffered, the git log is singing their praises. Thankfully, we now have a much more maintainable implementation, and we can safely put the grumpy old GIC to rest. Fellow hackers, please raise your glass in memory of the GIC: The GIC is dead, long live the GIC! Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
- 29 Jun, 2016 5 commits
-
-
Marc Zyngier authored
Structures that can be generally written to don't have any requirement to be executable (quite the opposite). This includes the kvm and vcpu structures, as well as the stacks. Let's change the default to incorporate the XN flag. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
There should be no reason for mapping the HYP text read/write. As such, let's have a new set of flags (PAGE_HYP_EXEC) that allows execution, but makes the page as read-only, and update the two call sites that deal with mapping code. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
In order to be able to use C code in HYP, we're now mapping the kernel's rodata in HYP. It works absolutely fine, except that we're mapping it RWX, which is not what it should be. Add a new HYP_PAGE_RO protection, and pass it as the protection flags when mapping the rodata section. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
EL2 page tables can be configured to deny code from being executed, which is done by setting bit 54 in the page descriptor. It is the same bit as PTE_UXN, but the "USER" reference felt odd in the hypervisor code. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
Currently, create_hyp_mappings applies a "one size fits all" page protection (PAGE_HYP). As we're heading towards separate protections for different sections, let's make this protection a parameter, and let the callers pass their prefered protection (PAGE_HYP for everyone for the time being). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
- 21 Jun, 2016 5 commits
-
-
Paolo Bonzini authored
Merge tag 'kvm-s390-next-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: vSIE (nested virtualization) feature for 4.8 (kvm/next) With an updated QEMU this allows to create nested KVM guests (KVM under KVM) on s390. s390 memory management changes from Martin Schwidefsky or acked by Martin. One common code memory management change (pageref) acked by Andrew Morton. The feature has to be enabled with the nested medule parameter.
-
David Hildenbrand authored
Let's be careful first and allow nested virtualization only if enabled by the system administrator. In addition, user space still has to explicitly enable it via SCLP features for it to work. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
David Hildenbrand authored
We have certain SIE features that we cannot support for now. Let's add these features, so user space can directly prepare to enable them, so we don't have to update yet another component. In addition, add a comment block, telling why it is for now not possible to forward/enable these features. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
David Hildenbrand authored
Guest 2 sets up the epoch of guest 3 from his point of view. Therefore, we have to add the guest 2 epoch to the guest 3 epoch. We also have to take care of guest 2 epoch changes on STP syncs. This will work just fine by also updating the guest 3 epoch when a vsie_block has been set for a VCPU. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
David Hildenbrand authored
Whenever a SIGP external call is injected via the SIGP external call interpretation facility, the VCPU is not kicked. When a VCPU is currently in the VSIE, the external call might not be processed immediately. Therefore we have to provoke partial execution exceptions, which leads to a kick of the VCPU and therefore also kick out of VSIE. This is done by simulating the WAIT state. This bit has no other side effects. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-