    Merge tag 'x86-fred-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 720c8579
    Linus Torvalds authored
    Pull x86 FRED support from Thomas Gleixner:
     "Support for x86 Fast Return and Event Delivery (FRED).
    
      FRED is a replacement for IDT event delivery on x86 and addresses most
      of the technical nightmares which IDT exposes:
    
       1) Exception cause registers like CR2 need to be manually preserved
          in nested exception scenarios.
    
       2) Hardware interrupt stack switching is suboptimal for nested
          exceptions as the interrupt stack mechanism rewinds the stack on
          each entry, which requires a massive effort in the low level #NMI
          entry code to handle this.
    
       3) No hardware distinction between entry from kernel or from user
          which makes establishing kernel context more complex than it needs
          to be especially for unconditionally nestable exceptions like NMI.
    
       4) NMI nesting caused by IRET unconditionally reenabling NMIs, which
          is a problem when the perf NMI takes a fault when collecting a
          stack trace.
    
       5) Partial restore of ESP when returning to a 16-bit segment
    
       6) Limitation of the vector space which can cause vector exhaustion
          on large systems.
    
       7) Inability to differentiate NMI sources
    
      FRED addresses these shortcomings by:
    
       1) An extended exception stack frame which the CPU uses to save
          exception cause registers. This ensures that the meta information
          for each exception is preserved on stack and avoids the extra
          complexity of preserving it in software.
    
       2) Hardware interrupt stack switching is non-rewinding if a nested
          exception uses the current interrupt stack.
    
       3) The entry points for kernel and user context are separate and GS
          BASE handling which is required to establish kernel context for
          per CPU variable access is done in hardware.
    
       4) NMIs are now nesting protected. They are only reenabled on the
          return from NMI.
    
       5) FRED guarantees full restore of ESP
    
       6) FRED does not put a limitation on the vector space by design
          because it uses central entry points for kernel and user space
          and the CPU stores the entry type (exception, trap, interrupt,
          syscall) on the entry stack along with the vector number. The
          entry code has to demultiplex this information, but this removes
          the vector space restriction.
    
          The first hardware implementations will still have the current
          restricted vector space because lifting this limitation requires
          further changes to the local APIC.
    
       7) FRED stores the vector number and meta information on stack which
          allows having more than one NMI vector in future hardware when the
          required local APIC changes are in place.
    
      The series implements the initial FRED support by:
    
       - Reworking the existing entry and IDT handling infrastructure to
         accommodate the alternative entry mechanism.
    
       - Expanding the stack frame to accommodate the extra 16 bytes FRED
         requires to store context and meta information
    
       - Providing FRED specific C entry points for events which have
         information pushed to the extended stack frame, e.g. #PF and #DB.
    
       - Providing FRED specific C entry points for #NMI and #MCE
    
       - Implementing the FRED specific ASM entry points and the C code to
         demultiplex the events
    
       - Providing detection and initialization mechanisms and the necessary
         tweaks in context switching, GS BASE handling etc.
    
      The FRED integration aims for maximum code reuse with the existing IDT
      implementation to the extent possible, and the deviations in hot paths
      like context switching are handled with alternatives to minimize the
      impact. The low level entry and exit paths are separate due to the
      extended stack frame and the hardware based GS BASE switching and
      therefore have no impact on IDT based systems.
    
      It has been extensively tested on existing systems and on the FRED
      simulation and as of now there are no outstanding problems"
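
The demultiplexing step described in point 6 above can be illustrated with a minimal user-space sketch. Everything here is an assumption made for illustration: `enum event_type`, `struct event_info`, `enum handler` and `demux_event()` are hypothetical names, the enum values are not the hardware encoding, and the struct is not the real FRED stack frame layout. The only facts taken from x86 are that vector 14 is #PF and vector 1 is #DB.

```c
#include <stdint.h>

/* Hypothetical event classes mirroring the four entry types named in the
 * message. The numeric values are illustrative, NOT the hardware encoding. */
enum event_type { EV_EXCEPTION, EV_TRAP, EV_INTERRUPT, EV_SYSCALL };

/* Hypothetical stand-in for the type and vector the CPU saves on the
 * entry stack. The real frame layout is different. */
struct event_info {
	enum event_type type;
	uint32_t vector;	/* deliberately wider than 8 bits */
};

/* Hypothetical handler identifiers returned by the dispatcher. */
enum handler {
	H_PAGE_FAULT, H_DEBUG, H_GENERIC_EXCEPTION,
	H_TRAP, H_IRQ, H_SYSCALL, H_UNKNOWN
};

/* A single entry point: software inspects the saved type first, then the
 * vector, instead of relying on one hardware table slot per vector. */
static enum handler demux_event(const struct event_info *ev)
{
	switch (ev->type) {
	case EV_EXCEPTION:
		switch (ev->vector) {
		case 14: return H_PAGE_FAULT;	/* #PF */
		case 1:  return H_DEBUG;	/* #DB */
		default: return H_GENERIC_EXCEPTION;
		}
	case EV_TRAP:
		return H_TRAP;
	case EV_INTERRUPT:
		/* The vector only selects a handler in software, so the
		 * dispatcher itself imposes no 256-entry limit. */
		return H_IRQ;
	case EV_SYSCALL:
		return H_SYSCALL;
	}
	return H_UNKNOWN;
}
```

In this sketch an event of type `EV_INTERRUPT` with vector 1000 still dispatches to `H_IRQ`, which is the sense in which a central entry point removes the vector space restriction on the software side.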
    
    * tag 'x86-fred-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
      x86/fred: Fix init_task thread stack pointer initialization
      MAINTAINERS: Add a maintainer entry for FRED
      x86/fred: Fix a build warning with allmodconfig due to 'inline' failing to inline properly
      x86/fred: Invoke FRED initialization code to enable FRED
      x86/fred: Add FRED initialization functions
      x86/syscall: Split IDT syscall setup code into idt_syscall_init()
      KVM: VMX: Call fred_entry_from_kvm() for IRQ/NMI handling
      x86/entry: Add fred_entry_from_kvm() for VMX to handle IRQ/NMI
      x86/entry/calling: Allow PUSH_AND_CLEAR_REGS being used beyond actual entry code
      x86/fred: Fixup fault on ERETU by jumping to fred_entrypoint_user
      x86/fred: Let ret_from_fork_asm() jmp to asm_fred_exit_user when FRED is enabled
      x86/traps: Add sysvec_install() to install a system interrupt handler
      x86/fred: FRED entry/exit and dispatch code
      x86/fred: Add a machine check entry stub for FRED
      x86/fred: Add a NMI entry stub for FRED
      x86/fred: Add a debug fault entry stub for FRED
      x86/idtentry: Incorporate definitions/declarations of the FRED entries
      x86/fred: Make exc_page_fault() work for FRED
      x86/fred: Allow single-step trap and NMI when starting a new task
      x86/fred: No ESPFIX needed when FRED is enabled
      ...