1. 04 Mar, 2016 7 commits
    • KVM: i8254: remove unnecessary uses of PIT state lock · b39c90b6
      Radim Krčmář authored
      - kvm_create_pit had to lock only because it exposed kvm->arch.vpit very
        early, but initialization doesn't use kvm->arch.vpit since the last
        patch, so we can drop locking.
      - kvm_free_pit is only run after there are no users of KVM and therefore
        is the sole actor.
      - Locking in kvm_vm_ioctl_reinject doesn't do anything, because reinject
        is only protected at that place.
      - kvm_pit_reset isn't used anywhere and its locking can be dropped if we
        hide it.
      
      Removing the useless locking makes it possible to see what is actually
      being protected by the PIT state lock (values accessible from the guest).
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b39c90b6
    • KVM: i8254: pass struct kvm_pit instead of kvm in PIT · 09edea72
      Radim Krčmář authored
      This patch passes struct kvm_pit into internal PIT functions.
      Those functions used to get PIT through kvm->arch.vpit, even though most
      of them never used *kvm for other purposes.  Another benefit is that we
      don't need to set kvm->arch.vpit during initialization.
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      09edea72
    • KVM: i8254: tone down WARN_ON pit.state_lock · b69d920f
      Radim Krčmář authored
      If the guest could hit this, it would hang the host kernel because of
      the sheer number of those reports.  Internal callers have to be sensible
      anyway, so we now only check for it in an API function.
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b69d920f
    • KVM: i8254: use atomic_t instead of pit.inject_lock · ddf54503
      Radim Krčmář authored
      The lock was overkill; the same can be done with atomics.
      
      A mb() was added in kvm_pit_ack_irq, to pair with implicit barrier
      between pit_timer_fn and pit_do_work.  The mb() prevents a race that
      could happen if pending == 0 and irq_ack == 0:
      
        kvm_pit_ack_irq:                | pit_timer_fn:
         p = atomic_read(&ps->pending); |
                                        |  atomic_inc(&ps->pending);
                                        |  queue_work(pit_do_work);
                                        | pit_do_work:
                                        |  atomic_xchg(&ps->irq_ack, 0);
                                        |  return;
         atomic_set(&ps->irq_ack, 1);   |
         if (p == 0) return;            |
      
      where the interrupt would not be delivered in this tick of pit_timer_fn.
      PIT would have eventually delivered the interrupt, but we sacrifice
      performance to make sure that interrupts are not needlessly delayed.
      
      sfence isn't enough: atomic_dec_if_positive does atomic_read first and
      x86 can reorder loads before stores.  lfence isn't enough: store can
      pass lfence, turning it into a nop.  A compiler barrier would be more
      than enough as CPU needs to stall for unbelievably long to use fences.
      
      This patch doesn't do anything in kvm_pit_reset_reinject, because any
      order of resets can race, but the result differs by at most one
      interrupt, which is ok, because it's the same result as if the reset
      happened at a slightly different time.  (Original code didn't protect
      the reset path with a proper lock, so users have to be robust.)
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ddf54503
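The ordering argument in this commit message can be tried out in userspace. What follows is only a minimal sketch using C11 atomics, not the kernel code: the struct and function names (pit_state_sketch, timer_tick, do_work, ack_irq) are hypothetical stand-ins, and the sequentially consistent fence is assumed to play the role that mb() plays in kvm_pit_ack_irq.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the PIT reinjection state. */
struct pit_state_sketch {
    atomic_int pending;  /* timer ticks that were not injected yet */
    atomic_int irq_ack;  /* 1 once the guest acked the last interrupt */
};

/* Decrement *v only while it is positive; returns the old value minus 1. */
static int dec_if_positive(atomic_int *v)
{
    int old = atomic_load(v);

    while (old > 0 && !atomic_compare_exchange_weak(v, &old, old - 1))
        ;  /* a failed exchange reloads 'old', so just retry */
    return old - 1;
}

/* Both fields are always reset together (cf. kvm_pit_reset_reinject). */
static void reset_reinject(struct pit_state_sketch *ps)
{
    atomic_store(&ps->pending, 0);
    atomic_store(&ps->irq_ack, 1);
}

/* Timer side: account one more tick (the real code then queues the work). */
static void timer_tick(struct pit_state_sketch *ps)
{
    atomic_fetch_add(&ps->pending, 1);
}

/* Worker side: returns true if an interrupt would be injected now. */
static bool do_work(struct pit_state_sketch *ps)
{
    return atomic_exchange(&ps->irq_ack, 0) != 0;
}

/* Ack side: returns true if the worker should be queued again. */
static bool ack_irq(struct pit_state_sketch *ps)
{
    atomic_store_explicit(&ps->irq_ack, 1, memory_order_relaxed);
    /* Full fence: the store to irq_ack must be visible before pending is read. */
    atomic_thread_fence(memory_order_seq_cst);
    return dec_if_positive(&ps->pending) > 0;
}
```

In this sketch the fence sits between the store to irq_ack and the load hidden inside dec_if_positive, which is the store-then-load pair that neither a store-only nor a load-only fence would order.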
    • KVM: i8254: add kvm_pit_reset_reinject · fd700a00
      Radim Krčmář authored
      pit_state.pending and pit_state.irq_ack are always reset at the same
      time.  Create a function for them.
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      fd700a00
    • KVM: i8254: simplify atomics in kvm_pit_ack_irq · f6e0a0c1
      Radim Krčmář authored
      We already have a helper that does the same thing.
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f6e0a0c1
    • KVM: i8254: change PIT discard tick policy · 7dd0fdff
      Radim Krčmář authored
      Discard policy uses ack_notifiers to prevent injection of PIT interrupts
      before EOI from the last one.
      
      This patch changes the policy to always try to deliver the interrupt,
      which makes a difference when its vector is in ISR.
      The old implementation would drop the interrupt, but the proposed one
      injects it into IRR, like real hardware would.
      
      The old policy breaks legacy NMI watchdogs, where PIT is used through
      virtual wire (LVT0): PIT never sends an interrupt before receiving EOI,
      thus a guest deadlock with disabled interrupts will stop NMIs.
      
      Note that NMI doesn't do EOI, so PIT also had to send a normal interrupt
      through IOAPIC.  (KVM's PIT is deeply rotten and luckily not used much
      in modern systems.)
      
      Even though there is a chance of regressions, I think we can fix the
      LVT0 NMI bug without introducing a new tick policy.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Yuki Shibuya <shibuya.yk@ncos.nec.co.jp>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7dd0fdff
  2. 03 Mar, 2016 15 commits
  3. 01 Mar, 2016 4 commits
  4. 29 Feb, 2016 9 commits
    • KVM: PPC: Book3S HV: Add tunable to control H_IPI redirection · 520fe9c6
      Suresh E. Warrier authored
      Redirecting the wakeup of a VCPU from the H_IPI hypercall to
      a core running in the host is usually a good idea; most workloads
      seemed to benefit. However, in one heavily interrupt-driven SMT1
      workload, some regression was observed. This patch adds a kvm_hv
      module parameter called h_ipi_redirect to control this feature.
      
      The default value for this tunable is 1, that is, the feature is enabled.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      520fe9c6
    • KVM: PPC: Book3S HV: Send IPI to host core to wake VCPU · e17769eb
      Suresh E. Warrier authored
      This patch adds support to real-mode KVM to search for a core
      running in the host partition and send it an IPI message with
      VCPU to be woken. This avoids having to switch to the host
      partition to complete an H_IPI hypercall when the VCPU which
      is the target of the H_IPI is not loaded (is not running
      in the guest).
      
      The patch also includes the support in the IPI handler running
      in the host to do the wakeup by calling kvmppc_xics_ipi_action
      for the PPC_MSG_RM_HOST_ACTION message.
      
      When a guest is being destroyed, we need to ensure that there
      are no pending IPIs waiting to wake up a VCPU before we free
      the VCPUs of the guest. This is accomplished by:
      - Forcing a PPC_MSG_CALL_FUNCTION IPI to be completed by all CPUs
        before freeing any VCPUs in kvm_arch_destroy_vm().
      - Ensuring that any PPC_MSG_RM_HOST_ACTION messages are executed
        before any other PPC_MSG_CALL_FUNCTION messages.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      e17769eb
    • KVM: PPC: Book3S HV: Host side kick VCPU when poked by real-mode KVM · 0c2a6606
      Suresh Warrier authored
      This patch adds the support for the kick VCPU operation for
      kvmppc_host_rm_ops. The kvmppc_xics_ipi_action() function
      provides the function to be invoked for a host side operation
      when poked by the real mode KVM. This is initiated by KVM by
      sending an IPI to any free host core.
      
      KVM real mode must set the rm_action to XICS_RM_KICK_VCPU and
      rm_data to point to the VCPU to be woken up before sending the IPI.
      Note that we have allocated one kvmppc_host_rm_core structure
      per core. The above values need to be set in the structure
      corresponding to the core to which the IPI will be sent.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      0c2a6606
    • KVM: PPC: Book3S HV: kvmppc_host_rm_ops - handle offlining CPUs · 6f3bb809
      Suresh Warrier authored
      The kvmppc_host_rm_ops structure keeps track of which cores are
      in the host by maintaining a bitmask of active/runnable
      online CPUs that have not entered the guest. This patch adds
      support to manage the bitmask when a CPU is offlined or onlined
      in the host.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      6f3bb809
    • KVM: PPC: Book3S HV: Manage core host state · b8e6a87c
      Suresh Warrier authored
      Update the core host state in kvmppc_host_rm_ops whenever
      the primary thread of the core enters the guest or returns
      back.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      b8e6a87c
    • KVM: PPC: Book3S HV: Host-side RM data structures · 79b6c247
      Suresh Warrier authored
      This patch defines the data structures to support the setting up
      of host side operations while running in real mode in the guest,
      and also the functions to allocate and free it.
      
      The operations are for now limited to virtual XICS operations.
      Currently, we have only defined one operation in the data
      structure:
               - Wake up a VCPU sleeping in the host when it
                 receives a virtual interrupt
      
      The operations are assigned at the core level because PowerKVM
      requires that the host run in SMT off mode. For each core,
      we will need to manage its state atomically - where the state
      is defined by:
      1. Is the core running in the host?
      2. Is there a Real Mode (RM) operation pending on the host?
      
      Currently, core state is only managed at the whole-core level
      even when the system is in split-core mode. This just limits
      the number of free or "available" cores in the host to perform
      any host-side operations.
      
      The kvmppc_host_rm_core.rm_data allows any data to be passed by
      KVM in real mode to the host core along with the operation to
      be performed.
      
      The kvmppc_host_rm_ops structure is allocated the very first time
      a guest VM is started. Initial core state is also set - all online
      cores are in the host. This structure is never deleted, not even
      when there are no active guests. However, it needs to be freed
      when the module is unloaded because the kvmppc_host_rm_ops_hv
      can contain function pointers to kvm-hv.ko functions for the
      different supported host operations.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      79b6c247
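The per-core state described above (is the core in the host, and is a real-mode operation already pending) lends itself to a single compare-and-exchange. The following is a rough userspace sketch under stated assumptions, not the kernel's data structure: the type host_core_sketch, the bit layout of the state word, and the helper names are all hypothetical, and SKETCH_ACTION_KICK_VCPU merely stands in for XICS_RM_KICK_VCPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding: high word = "in host", low word = pending action. */
#define SKETCH_IN_HOST          (UINT64_C(1) << 32)
#define SKETCH_ACTION_MASK      UINT64_C(0xffffffff)
#define SKETCH_ACTION_KICK_VCPU 1u  /* stands in for XICS_RM_KICK_VCPU */

struct host_core_sketch {
    _Atomic uint64_t state;  /* both questions are answered by one word */
    void *data;              /* stands in for rm_data */
};

/* Claim a core only if it is in the host and has no action pending. */
static int claim_core(struct host_core_sketch *c, uint32_t action, void *data)
{
    uint64_t expected = SKETCH_IN_HOST;

    if (!atomic_compare_exchange_strong(&c->state, &expected,
                                        SKETCH_IN_HOST | action))
        return 0;
    c->data = data;  /* set before the IPI would be sent to this core */
    return 1;
}

/* Scan the cores and return the index of the one claimed, or -1. */
static int find_free_core(struct host_core_sketch *cores, size_t n,
                          uint32_t action, void *data)
{
    for (size_t i = 0; i < n; i++)
        if (claim_core(&cores[i], action, data))
            return (int)i;
    return -1;
}

/* Host side: take the data and clear the pending action. */
static void *complete_action(struct host_core_sketch *c)
{
    void *data = c->data;

    c->data = NULL;
    atomic_fetch_and(&c->state, ~SKETCH_ACTION_MASK);
    return data;
}
```

Packing both answers into one word is what lets the claim succeed or fail as a unit: a core that enters the guest, or that already carries an action, makes the exchange fail and the scan moves on.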
    • powerpc/xics: Add icp_native_cause_ipi_rm · ec13e9b6
      Suresh Warrier authored
      Function to cause an IPI by directly updating the MFRR register
      in the XICS. The function is meant for real-mode callers since
      they cannot use the smp_ops->cause_ipi function which uses an
      ioremapped address.
      
      Normal usage is for the KVM real mode code to set the IPI message
      using smp_muxed_ipi_set_message and then invoke icp_native_cause_ipi_rm
      to cause the actual IPI.
      
      The function requires kvm_hstate.xics_phys to have been initialized
      with the physical address of XICS.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      ec13e9b6
    • powerpc/smp: Add smp_muxed_ipi_set_message · 31639c77
      Suresh Warrier authored
      smp_muxed_ipi_message_pass() invokes smp_ops->cause_ipi, which
      uses an ioremapped address to access registers on the XICS
      interrupt controller to cause the IPI. Because of this real
      mode callers cannot call smp_muxed_ipi_message_pass() for IPI
      messaging.
      
      This patch creates a separate function smp_muxed_ipi_set_message
      just to set the IPI message without the cause_ipi routine.
      After calling this function to set the IPI message, real
      mode callers must cause the IPI by writing to the XICS registers
      directly.
      
      As part of this, we also change smp_muxed_ipi_message_pass
      to call smp_muxed_ipi_set_message to set the message instead
      of doing it directly inside the routine.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      31639c77
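The refactoring in this commit separates "record the message" from "ring the doorbell". A small userspace sketch of that split might look as follows; every name here is hypothetical (the sketch_ prefix marks stand-ins for the kernel functions named above), and the two counters only model the two ways of triggering the IPI, through the ioremapped address or through the physical one.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4
#define SKETCH_NR_MSGS 8

/* One flag per (cpu, message) pair; a stand-in for the per-CPU message word. */
static volatile unsigned char msg_flags[SKETCH_NR_CPUS][SKETCH_NR_MSGS];

/* Counters that model the two ways of triggering the IPI. */
static int doorbells_virt[SKETCH_NR_CPUS];  /* via the ioremapped address */
static int doorbells_real[SKETCH_NR_CPUS];  /* via the physical address */

/* Only records the message; safe for both kinds of callers. */
static void sketch_set_message(int cpu, int msg)
{
    assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
    assert(msg >= 0 && msg < SKETCH_NR_MSGS);
    msg_flags[cpu][msg] = 1;
}

static void sketch_cause_ipi(int cpu)    { doorbells_virt[cpu]++; }
static void sketch_cause_ipi_rm(int cpu) { doorbells_real[cpu]++; }

/* Ordinary callers: set the message, then trigger through the usual path. */
static void sketch_message_pass(int cpu, int msg)
{
    sketch_set_message(cpu, msg);
    sketch_cause_ipi(cpu);
}

/* Real-mode callers: same message step, different trigger. */
static void sketch_rm_send(int cpu, int msg)
{
    sketch_set_message(cpu, msg);
    sketch_cause_ipi_rm(cpu);
}
```

Both paths share the message step, so the receiving side does not need to know which kind of caller sent the IPI.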
    • powerpc/smp: Support more IPI messages · bd7f561f
      Suresh Warrier authored
      This patch increases the number of demuxed messages for a
      controller with a single IPI to 8 for 64-bit systems.
      
      This is required because we want to use the IPI mechanism
      to send messages from a CPU running in KVM real mode in a
      guest to a CPU in the host to take some action. Currently,
      we only support 4 messages and all 4 are already taken.
      
      Define a fifth message PPC_MSG_RM_HOST_ACTION for this
      purpose.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      bd7f561f
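One way to see why the limit moves from 4 to 8 is to assume that each message occupies one byte of a machine word, so a 32-bit word holds 4 slots and a 64-bit word holds 8. The sketch below works under that assumption only; the slot numbers, the enum names, and the helpers set_msg and demux are illustrative and not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed slot numbers; only the two names from the commit messages are used. */
enum {
    SKETCH_MSG_CALL_FUNCTION  = 0,
    SKETCH_MSG_RM_HOST_ACTION = 4,  /* the fifth message needs a fifth byte */
};

#define SKETCH_MSG_SLOTS sizeof(uint64_t)  /* 8 one-byte slots in 64 bits */

/* Mark one message as pending by setting its byte in the word. */
static void set_msg(uint64_t *word, unsigned msg)
{
    unsigned char bytes[sizeof *word];

    assert(msg < sizeof bytes);
    memcpy(bytes, word, sizeof bytes);
    bytes[msg] = 1;
    memcpy(word, bytes, sizeof bytes);
}

/* Take every pending message in one go; returns a bitmask of slot numbers. */
static unsigned demux(uint64_t *word)
{
    unsigned char bytes[sizeof *word];
    unsigned pending = 0;

    memcpy(bytes, word, sizeof bytes);
    *word = 0;  /* concurrent code would need an atomic exchange here */
    for (unsigned msg = 0; msg < sizeof bytes; msg++)
        if (bytes[msg])
            pending |= 1u << msg;
    return pending;
}
```

Going through memcpy in both directions keeps the sketch independent of byte order, since a message is always written to and read from the same byte.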
  5. 23 Feb, 2016 5 commits