1. 19 May, 2015 40 commits
    • x86/fpu: Simplify FPU handling by embedding the fpstate in task_struct (again) · 7366ed77
      Ingo Molnar authored
      So 6 years ago we made the FPU fpstate dynamically allocated:
      
        aa283f49 ("x86, fpu: lazy allocation of FPU area - v5")
        61c4628b ("x86, fpu: split FPU state from task struct - v5")
      
      In hindsight this was a mistake:
      
         - it complicated context allocation failure handling, such as:
      
      		/* kthread execs. TODO: cleanup this horror. */
      		if (WARN_ON(fpstate_alloc_init(fpu)))
      			force_sig(SIGKILL, tsk);
      
         - it caused us to enable irqs in fpu__restore():
      
                      local_irq_enable();
                      /*
                       * does a slab alloc which can sleep
                       */
                      if (fpstate_alloc_init(fpu)) {
                              /*
                               * ran out of memory!
                               */
                              do_group_exit(SIGKILL);
                              return;
                      }
                      local_irq_disable();
      
         - it (slightly) slowed down task creation/destruction by adding
           slab allocation/free patterns.
      
         - it made access to context contents (slightly) slower by adding
           one more pointer dereference.
      
      The motivation for the dynamic allocation was two-fold:
      
         - reduce memory consumption by non-FPU tasks
      
         - allocate and handle only the necessary amount of context for
           various XSAVE processors that have varying hardware frame
           sizes.
      
      These days, with glibc using SSE memcpy by default and GCC optimizing
      for SSE/AVX by default, the scope of FPU using apps on an x86 system is
      much larger than it was 6 years ago.
      
      For example on a freshly installed Fedora 21 desktop system, with a
      recent kernel, all non-kthread tasks have used the FPU shortly after
      bootup.
      
      Also, even modern embedded x86 CPUs try to support the latest vector
      instruction set - so they too will often use the larger xstate frame
      sizes.
      
      So remove the dynamic allocation complication by embedding the FPU
      fpstate in task_struct again. This should make the FPU a lot more
      accessible to all sorts of atomic contexts.
      
      We could still optimize for the xstate frame size in the future,
      by moving the state structure to the last element of task_struct,
      and allocating only a part of that.
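      As a rough illustration of that possible future optimization (this is
      not code from the patch), the user-space sketch below puts a
      variable-sized state buffer at the end of a structure and allocates
      only as many bytes as the frame size requires. struct demo_task,
      demo_task_bytes() and demo_task_alloc() are hypothetical stand-ins for
      task_struct and its allocation, and the sizes are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for task_struct: state[] must be the last member. */
struct demo_task {
	int pid;
	size_t state_size;	/* bytes actually allocated for state[] */
	unsigned char state[];	/* flexible array member, sized at allocation */
};

/* Bytes needed for a task whose state frame is state_size bytes long. */
static size_t demo_task_bytes(size_t state_size)
{
	return offsetof(struct demo_task, state) + state_size;
}

/* Allocate the fixed part plus only the needed amount of state. */
static struct demo_task *demo_task_alloc(int pid, size_t state_size)
{
	struct demo_task *t = malloc(demo_task_bytes(state_size));

	if (!t)
		return NULL;
	t->pid = pid;
	t->state_size = state_size;
	memset(t->state, 0, state_size);
	return t;
}
```

      A caller would pick state_size once, from whatever frame size the CPU
      reports, and every task allocation would then cost exactly that much.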
      
      This change is kept minimal by still keeping the ctx_alloc()/free()
      routines (that now do nothing substantial) - we'll remove them in
      the following patches.
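      To make the trade-off concrete, here is a minimal user-space sketch,
      not kernel code. Both structures and the demo_ helpers are
      hypothetical, and the fixed 512-byte buffer merely stands in for the
      fpstate. With the pointer variant every first use can fail and needs
      an error path; with the embedded variant there is nothing to allocate.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_STATE_SIZE 512	/* assumed size, for illustration only */

struct demo_state {
	unsigned char bytes[DEMO_STATE_SIZE];
};

/* Old layout: the state lives in a separate, lazily made allocation. */
struct demo_task_dynamic {
	struct demo_state *state;	/* NULL until first use */
};

/* New layout: the state is simply part of the task. */
struct demo_task_embedded {
	struct demo_state state;
};

/* Returns 0 on success, -1 if the allocation failed. */
static int demo_dynamic_init(struct demo_task_dynamic *t)
{
	if (!t->state) {
		t->state = calloc(1, sizeof(*t->state));
		if (!t->state)
			return -1;	/* caller needs an error path here */
	}
	return 0;
}

/* Cannot fail: nothing to allocate, and one pointer dereference fewer. */
static void demo_embedded_init(struct demo_task_embedded *t)
{
	memset(&t->state, 0, sizeof(t->state));
}
```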
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Optimize copy_fpregs_to_fpstate() by removing the FNCLEX synchronization with FP exceptions · 1bc6b056
      Ingo Molnar authored
      So we have the following ancient code in copy_fpregs_to_fpstate():
      
      	if (unlikely(fpu->state->fxsave.swd & X87_FSW_ES)) {
      		asm volatile("fnclex");
      		goto drop_fpregs;
      	}
      
      which clears pending FPU exceptions and then drops registers, which
      causes the next FP instruction of the saved context to re-load the
      saved FPU state, with all pending exceptions marked properly, and
      will re-start the exception handling mechanism in the hardware.
      
      Since FPU exceptions are always issued on instruction boundaries,
      in particular on the next FP instruction following the exception
      generating instruction, there's no fear of getting an FP exception
      asynchronously.
      
      They were truly asynchronous back in the IRQ13 days, when the FPU was
      a weird and expensive co-processor that did its own processing, and we
      had to synchronize with them, but that code is not working anymore:
      we don't have IRQ13 mapped in the IDT anymore.
      
      With the introduction of optimized XSAVE support there's a new
      complication: if the xstate features bit indicates that a particular
      state component is unused (in 'init state'), then the hardware does
      not guarantee that the XSAVE (et al) instruction keeps the underlying
      FPU state image in memory valid and current. In practice this means
      that the hardware won't write it, and the exceptions flag in the
      state might be an older version, with it still being set. This
      meant that we had to check the xfeatures flag as well, adding
      another memory load and branch to a critical hot path of the scheduler.
      
      So optimize all this by removing both the old quirk and the new check,
      and straight-line optimizing the most common cases with likely()
      hints. Quite a bit of code gets removed this way:
      
        arch/x86/kernel/process_64.o:
      
          text    data     bss     dec     filename
          5484       8       0    5492     process_64.o.before
          5416       8       0    5424     process_64.o.after
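      The behavioral difference for the FXSAVE case can be modeled in user
      space as follows. This is only a sketch: struct demo_fxsave and the
      demo_ functions are hypothetical, nothing here touches real FPU
      registers, and the return value just says whether the registers may
      stay loaded after the save. X87_FSW_ES is bit 7 of the x87 status
      word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X87_FSW_ES (1u << 7)	/* x87 status word: exception summary bit */

/* Hypothetical stand-in for the saved FXSAVE image; only swd is modeled. */
struct demo_fxsave {
	uint16_t swd;
};

/*
 * Model of the old quirk: returns true if the registers may stay loaded
 * after the save, false if they were dropped and must be reloaded.
 */
static bool demo_old_keep_fpregs(const struct demo_fxsave *fx)
{
	if (fx->swd & X87_FSW_ES)
		return false;	/* old code: fnclex, then drop the registers */
	return true;
}

/* Model of the behavior after the quirk is removed: ES is not consulted. */
static bool demo_new_keep_fpregs(const struct demo_fxsave *fx)
{
	(void)fx;
	return true;
}
```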
      
      Now there's also a chance that some weird behavior or erratum was
      masked by our IRQ13 handling quirk (or that I misunderstood the
      nature of the quirk), and that this change triggers some badness.
      
      There's no real good way to protect against that possibility other
      than keeping this change well isolated, well commented and well
      bisectable. If you bisect a weird (or not so weird) breakage to
      this commit then please let us know!
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Rename fpu_save_init() to copy_fpregs_to_fpstate() · 4f836347
      Ingo Molnar authored
      So fpu_save_init() is a historic name that dates from when the only
      way to save the FPU state was FNSAVE, which cleared (well, destroyed)
      the FPU state after saving it.
      
      Nowadays the name is misleading, because ever since the introduction of
      FXSAVE (and more modern FPU saving instructions) the 'we need to reload
      the FPU state' part is only true if there's a pending FPU exception [*],
      which is almost never the case.
      
      So rename it to copy_fpregs_to_fpstate() to make it clear what's
      happening. Also add a few comments about why we cannot keep registers
      in certain cases.
      
      Also clean up the control flow a bit, to make it more apparent when
      we are dropping/keeping FP registers, and to optimize the common
      case (of keeping fpregs) some more.
      
      [*] Probably not true anymore, modern instructions always leave the FPU
          state intact, even if exceptions are pending: because pending FP
          exceptions are posted on the next FP instruction, not asynchronously.
      
          They were truly asynchronous back in the IRQ13 case, and we had to
          synchronize with them, but that code is not working anymore: we don't
          have IRQ13 mapped in the IDT anymore.
      
          But a cleanup patch is obviously not the place to change subtle behavior.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Uninline the irq_ts_save()/restore() functions · 91066588
      Ingo Molnar authored
      Especially the irq_ts_save() function is pretty bloaty, generating
      over a dozen instructions, so uninline them.
      
      Even though the API is used rarely, the space savings are measurable:
      
             text     data      bss       dec      hex  filename
         13331995  2572920  1634304  17539219  10ba093  vmlinux.before
         13331739  2572920  1634304  17538963  10b9f93  vmlinux.after
      
      ( This also allows the removal of an include file inclusion from fpu/api.h,
        speeding up the kernel build slightly. )
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Move various internal function prototypes to fpu/internal.h · 952f07ec
      Ingo Molnar authored
      There are a number of FPU internal function prototypes and an inline function
      in fpu/api.h, mostly placed so historically as the code grew over the years.
      
      Move them over into fpu/internal.h where they belong. (Add sched.h include
      to stackprotector.h which incorrectly relied on getting it from fpu/api.h.)
      
      fpu/api.h is now a pure file that only contains FPU APIs intended for driver
      use.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Uninline kernel_fpu_begin()/end() · d63e79b1
      Ingo Molnar authored
      Both inline functions call an inline function unconditionally, so we
      already pay the function call based clobbering cost. Uninline them.
      
      This saves quite a bit of code in various performance sensitive
      code paths:
      
             text     data      bss       dec      hex  filename
         13321334  2569888  1634304  17525526  10b6b16  vmlinux.before
         13320246  2569888  1634304  17524438  10b66d6  vmlinux.after
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Move fpu__save() to fpu/internal.h · e2295375
      Ingo Molnar authored
      It's an internal method, not a driver API, so move it from fpu/api.h
      to fpu/internal.h.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Add more comments to the FPU init code · ae02679c
      Ingo Molnar authored
      Extend the comments of the FPU init code, and fix old ones.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Reorder init methods · 41e78410
      Ingo Molnar authored
      Reorder init methods in order of their relationship and usage, to
      form coherent blocks throughout the whole file.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Rename fpstate_xstate_init_size() to fpu__init_system_xstate_size_legacy() · 7638b74b
      Ingo Molnar authored
      To bring it in line with the other init_system*() methods.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Remove the extra fpu__detect() layer · c66e3f28
      Ingo Molnar authored
      Now that fpu__detect() has become an empty layer around
      fpu__init_system(), eliminate it and make fpu__init_system()
      the main system initialization routine.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Move fpu__init_system_early_generic() out of fpu__detect() · dd863880
      Ingo Molnar authored
      Move the fpu__init_system_early_generic() call into fpu__init_system(),
      which hosts all the system init calls.
      
      Expose fpu__init_system() to other modules - this will be our main and only
      system init function.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Make check_fpu() init ordering independent · 71eb3c6d
      Ingo Molnar authored
      check_fpu() currently relies on being called early in the init sequence,
      when CR0::TS has not been set up yet.
      
      Save/restore CR0::TS across this function, to make it invariant to
      init ordering. This way we'll be able to move the generic FPU setup
      routines earlier in the init sequence.
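      The save/restore pattern can be sketched in user space like this. The
      control register is simulated by a plain variable, demo_fpu_probe()
      is a hypothetical stand-in for the actual FPU test, and only the TS
      bit (bit 3 of CR0) is modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define X86_CR0_TS 0x8ul	/* CR0 bit 3: task switched */

/* Simulated control register; real code would use privileged accessors. */
static unsigned long demo_cr0;

/* Set by the probe: did it observe TS clear while it ran? */
static bool demo_probe_ran_with_ts_clear;

/* Hypothetical FPU probe that needs TS clear while it runs. */
static void demo_fpu_probe(void)
{
	demo_probe_ran_with_ts_clear = !(demo_cr0 & X86_CR0_TS);
}

/* Run the probe with TS clear, then put TS back the way it was. */
static void demo_check_fpu(void)
{
	unsigned long saved_ts = demo_cr0 & X86_CR0_TS;

	demo_cr0 &= ~X86_CR0_TS;	/* models CLTS */
	demo_fpu_probe();
	demo_cr0 |= saved_ts;		/* restore the caller's TS state */
}
```

      Because the caller's TS state is preserved either way, the function
      no longer cares whether it runs before or after TS has been set up.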
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Factor out FPU bug checks into fpu/bugs.c · 0bf23f3d
      Ingo Molnar authored
      Create separate fpu/bugs.c code so that if we read generic FPU code
      we don't have to wade through all the bugcheck related code first.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Move !FPU check into fpu__init_system_early_generic() · e83ab9ad
      Ingo Molnar authored
      There's a !FPU related sanity check in fpu__init_cpu_generic(),
      which is executed on every CPU onlining - even though we should do
      this only once, and during system init.
      
      Move this check to fpu__init_system_early_generic().
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Factor out fpu__init_system_early_generic() · 2e2f3da7
      Ingo Molnar authored
      Move the generic bits of fpu__detect() into fpu__init_system_early_generic().
      
      We'll move some other code here too in a followup patch.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Factor out fpu__init_system_generic() · 7218e8b7
      Ingo Molnar authored
      Factor out the generic bits from fpu__init_system().
      
      Rename mxcsr_feature_mask_init() to fpu__init_system_mxcsr()
      to bring it in line with the rest of the nomenclature.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Factor out fpu__init_cpu_generic() · b11316ed
      Ingo Molnar authored
      Factor out the generic bits from fpu__init_cpu(), to create
      a flat sequence of per CPU initialization function calls:
      
      	fpu__init_cpu_generic();
      	fpu__init_cpu_xstate();
      	fpu__init_cpu_ctx_switch();
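      The shape of that flat sequence can be tried out with stubs. In the
      sketch below the three function names are taken from this message,
      but their bodies are placeholders that only record the call order;
      demo_trace and demo_note() are hypothetical helpers.

```c
#include <assert.h>
#include <string.h>

/* Records the order in which the stubs run, one letter per call. */
static char demo_trace[8];

static void demo_note(char c)
{
	size_t n = strlen(demo_trace);

	/* Keep the last byte as the terminating NUL. */
	if (n + 1 < sizeof(demo_trace))
		demo_trace[n] = c;
}

/* Placeholder bodies: the real functions program the hardware. */
static void fpu__init_cpu_generic(void)    { demo_note('g'); }
static void fpu__init_cpu_xstate(void)     { demo_note('x'); }
static void fpu__init_cpu_ctx_switch(void) { demo_note('c'); }

/* Flat per-CPU init: one call per concern, no nesting between them. */
static void fpu__init_cpu(void)
{
	fpu__init_cpu_generic();
	fpu__init_cpu_xstate();
	fpu__init_cpu_ctx_switch();
}
```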
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Simplify fpu__cpu_init() · 21c4cd10
      Ingo Molnar authored
      After the latest round of cleanups, fpu__cpu_init() has become
      a simple call to fpu__init_cpu().
      
      Rename fpu__init_cpu() to fpu__cpu_init() and remove the
      extra layer.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Remove fpu__init_cpu_ctx_switch() call from fpu__init_system() · 7202ab46
      Ingo Molnar authored
      We are now doing the fpu__init_cpu_ctx_switch() call from fpu__init_cpu(),
      so there's no need to call it from fpu__init_system() anymore.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Do system-wide setup from fpu__detect() · 067051cc
      Ingo Molnar authored
      fpu__cpu_init() is called on every CPU, so it is the wrong place
      to call fpu__init_system() from. Call it from fpu__detect():
      this is early CPU init code, but we already have CPU features detected,
      so we can call the system-wide FPU init code from here.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Call fpu__init_cpu_ctx_switch() from fpu__init_cpu() · 3960fccf
      Ingo Molnar authored
      fpu__init_cpu_ctx_switch() is currently called from fpu__init_system(),
      which is the wrong place for it: call it from the proper high level
      per CPU init function, fpu__init_cpu().
      
      Note, we still keep the old call site as well, because it depends
      on having proper CR0::TS setup. We'll fix this in the next patch.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Move the fpstate_xstate_init_size() call into fpu__init_system() · 997578b1
      Ingo Molnar authored
      The fpstate_xstate_init_size() function sets up a basic xstate_size and
      is currently called during fpu__detect().
      
      Its real dependency is to be called before fpu__init_system_xstate().
      
      So move the function call site into fpu__init_system(), to right before the
      fpu__init_system_xstate() call.
      
      Also add a once-per-boot flag to fpstate_xstate_init_size(), we'll remove
      this quirk later once we've cleaned up the init dependencies.
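      A once-per-boot flag of this kind usually looks something like the
      sketch below. It is a user-space model under assumptions: the demo_
      names are hypothetical, and the flag is called on_boot_cpu only
      because that is the name used elsewhere in this series.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int demo_xstate_size;	/* hypothetical global */
static int demo_size_init_calls;	/* counts real initializations */

/* Runs its body only on the first call; later calls return early. */
static void demo_xstate_init_size(unsigned int size)
{
	static bool on_boot_cpu = true;

	if (!on_boot_cpu)
		return;
	on_boot_cpu = false;

	demo_xstate_size = size;
	demo_size_init_calls++;
}
```

      The guard makes repeated calls from per-CPU paths harmless until the
      call site itself is moved to a place that runs only once.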
      
      This moves the two related functions closer to each other and makes them
      both part of the _init_system() functionality.
      
      Currently we do the fpstate_xstate_init_size()
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Do CLTS in fpu__init_system() · 530b37e4
      Ingo Molnar authored
      mxcsr_feature_mask_init() depends on TS being cleared, as it executes
      an FXSAVE instruction.
      
      After later changes we will move the TS setup into fpu__init_cpu(),
      which will interact with this - so clear the TS flag explicitly.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Split fpu__ctx_switch_init() into _cpu() and _system() portions · 011545b5
      Ingo Molnar authored
      So fpu__ctx_switch_init() has two aspects: a once per bootup functionality
      that sets up a capability flag, and a per CPU functionality that sets CR0::TS.
      
      Split the function.
      
      Note that at this stage we still have duplicate calls into these methods, as
      both the _system() and the _cpu() methods are run on all CPUs, with lower
      level on_boot_cpu flags filtering out the duplicates where needed. So add
      TS flag clearing as well, to handle the aftermath of early CPU init sequences
      that might call in without having eager-fpu set - don't assume the TS flag
      is cleared.
      
      Calling each from its respective init level will happen later on.
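      The split can be modeled roughly as follows. This is a user-space
      sketch with simulated per-CPU CR0 values; all demo_ names are
      hypothetical, and the mapping "lazy sets TS, eager clears TS" is the
      assumption the model is built on.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NR_CPUS 4
#define X86_CR0_TS 0x8ul	/* CR0 bit 3: task switched */

static bool demo_eager_fpu;	/* hypothetical capability flag */
static int demo_system_runs;	/* how often the system part really ran */
static unsigned long demo_cr0[DEMO_NR_CPUS];	/* simulated per-CPU CR0 */

/* Once-per-boot part: decide the context switch mode. */
static void demo_ctx_switch_init_system(bool want_eager)
{
	static bool on_boot_cpu = true;

	if (!on_boot_cpu)
		return;
	on_boot_cpu = false;

	demo_eager_fpu = want_eager;
	demo_system_runs++;
}

/* Per-CPU part: set TS for lazy switching, clear it for eager. */
static void demo_ctx_switch_init_cpu(int cpu)
{
	if (demo_eager_fpu)
		demo_cr0[cpu] &= ~X86_CR0_TS;
	else
		demo_cr0[cpu] |= X86_CR0_TS;
}
```

      Calling both parts on every CPU, as happens at this stage of the
      series, still runs the system part only once thanks to the flag.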
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Clean up eager_fpu_init() and rename it to fpu__ctx_switch_init() · 064e51e3
      Ingo Molnar authored
      It's not an xsave specific function anymore, so rename it accordingly
      and also clean it up a bit:
      
       - remove the obsolete __init_refok, as the code paths are not
         mixed anymore
      
       - rename it from eager_fpu_init() to fpu__ctx_switch_init()
      
       - remove stray 'return;'
      
       - make it static to its only user
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Move eager_fpu_init() to fpu/init.c · 6f5d265a
      Ingo Molnar authored
      Move eager_fpu_init() and the 'eagerfpu' boot parameter handling function
      to the generic FPU init file: it's generic FPU functionality.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Move all eager-fpu setup code to eager_fpu_init() · 89abbe01
      Ingo Molnar authored
      The FPU context switch type (lazy or eager) setup code is split into
      two places currently - move it all to eager_fpu_init().
      
      Note that the code we move will now be executed on non-xstate CPUs
      as well, but this should be safe: both xfeatures_mask and
      cpu_has_xsaveopt are 0 there.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Remove setup_init_fpu_buf() call from eager_fpu_init() · a5cb56e9
      Ingo Molnar authored
      It's a pure xstate method now, no need for this duplicate call.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Set up the legacy FPU init image from fpu__init_system() · 2507e1c0
      Ingo Molnar authored
      The legacy FPU init image is used on older CPUs that don't run xstate init.
      But the init code is called within setup_init_fpu_buf(), an xstate method.
      
      Move this legacy init out of the xstate code and put it into fpu/init.c.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Do fpu__init_system_xstate only from fpu__init_system() · 429ced50
      Ingo Molnar authored
      Only call xstate system setup routines from fpu__init_system().
      
      Likewise, don't call fpu__init_cpu_xstate() from fpu__init_system().
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Remove xsave_init() · c42103b2
      Ingo Molnar authored
      Replace the xsave_init() calls with direct calls to
      fpu__init_system_xstate() and fpu__init_cpu_xstate().
      
      (This will allow us to call the proper versions in higher level FPU init code
      later on.)
      
      No change in functionality.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Propagate once per boot quirk into fpu__init_system_xstate() · 62db6871
      Ingo Molnar authored
      Linearize the call sequence in xsave_init():
      
      	fpu__init_system_xstate();
      	fpu__init_cpu_xstate();
      
      We do this by propagating the boot-once quirk into
      fpu__init_system_xstate(). fpu__init_cpu_xstate() is
      safe to be called multiple times.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Move legacy check to fpu__init_system_xstate() · e9dbfd67
      Ingo Molnar authored
      Now that legacy code can execute fpu__init_cpu_xstate() in
      xsave_init(), we can move the once per boot legacy check into
      fpu__init_system_xstate(), where it belongs.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Move CPU capability check into fpu__init_cpu_xstate() · e84611fc
      Ingo Molnar authored
      fpu__init_system_xstate() does an FPU capability check that is better
      done in fpu__init_cpu_xstate(). This will allow us to call
      fpu__init_cpu_xstate() directly on legacy CPUs as well.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Make the system/cpu init distinction clear in the xstate code as well · 55cc4678
      Ingo Molnar authored
      Rename existing xstate init functions along the system/cpu init principles:
      
      	fpu__init_system_xstate(): called once per system bootup
      	fpu__init_cpu_xstate():    called per CPU onlining
      
      Also make the fpu__init_cpu_xstate() early code invariant:
      if xfeatures_mask is not set yet then don't crash but return.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
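The early-return invariant described in the commit above (55cc4678) could look roughly like the following user-space sketch. Every `sim_` name, the four-CPU array, and the mask value used to exercise it are hypothetical stand-ins for illustration; this is not the kernel's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the system-wide feature mask; 0 means "not set up yet". */
static uint64_t sim_xfeatures_mask;

/* Hypothetical per-CPU record of the features that were enabled on each CPU. */
static uint64_t sim_cpu_enabled[4];

/* Once per system bootup: record the feature set shared by all CPUs. */
static void sim_init_system_xstate(uint64_t detected)
{
	sim_xfeatures_mask = detected;
}

/*
 * Per CPU onlining: if the system-wide mask is not set yet, return
 * without touching any state instead of failing hard.
 */
static bool sim_init_cpu_xstate(int cpu)
{
	if (!sim_xfeatures_mask)
		return false;

	sim_cpu_enabled[cpu] = sim_xfeatures_mask;
	return true;
}
```

In this sketch a per-CPU call made before the system-level call is a harmless no-op, which is what allows the per-CPU function to be invoked from early code paths.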
    • x86/fpu: Split fpu__cpu_init() into early-boot and cpu-boot parts · e35f6f14
      Ingo Molnar authored
      There are two kinds of FPU initialization sequences necessary to bring FPU
      functionality up: once per system bootup activities, such as detection,
      feature initialization, etc. of attributes that are shared by all CPUs
      in the system - and per cpu initialization sequences run when a CPU is
      brought online (either during bootup or during CPU hotplug onlining),
      such as CR0/CR4 register setting, etc.
      
      The FPU code is mixing these roles together, with no clear distinction.
      
      Start sorting this out by splitting the main FPU detection routine
      (fpu__cpu_init()) into two parts: fpu__init_system() for
      once per system init activities, and fpu__init_cpu() for the
      per CPU onlining init activities.
      
      Note that xstate_init() is called from both variants for the time being,
      because it has a dual nature as well. We'll fix that in upcoming patches.
      
      Just do the split and call it as we used to before, don't introduce any
      change in initialization behavior yet, beyond duplicate (and harmless)
      fpu__init_cpu() and xstate_init() calls - which we'll fix in later
      patches.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
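The system/CPU split that commit e35f6f14 starts can be pictured with a small user-space model. The `sim_` functions, the counters, and the fixed CPU count are assumptions made for illustration; the real routines do hardware detection and control-register setup that this sketch only counts.

```c
#define SIM_NR_CPUS 4

static int sim_system_inits;            /* times the once-per-boot part ran */
static int sim_cpu_inits[SIM_NR_CPUS];  /* times the per-CPU part ran, per CPU */

/* Once per system bootup: detection and setup of attributes shared by all CPUs. */
static void sim_fpu_init_system(void)
{
	sim_system_inits++;
}

/* Per CPU onlining (bootup or hotplug): e.g. control register setup. */
static void sim_fpu_init_cpu(int cpu)
{
	sim_cpu_inits[cpu]++;
}

/* Hypothetical boot flow: the boot CPU does both parts, the others only the per-CPU part. */
static void sim_boot(void)
{
	sim_fpu_init_system();
	sim_fpu_init_cpu(0);

	for (int cpu = 1; cpu < SIM_NR_CPUS; cpu++)
		sim_fpu_init_cpu(cpu);
}
```

Onlining a CPU again later (hotplug) would call only `sim_fpu_init_cpu()`, so the shared setup is never repeated.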
    • x86/fpu: Remove 'init_xstate_buf' bootmem allocation · 3e5e1267
      Ingo Molnar authored
      Make init_xstate_buf allocated statically at build time.
      
      This structure's maximum size is around 1KB - and it's allocated even on
      most modern embedded x86 CPUs which strive for FPU instruction set parity
      with desktop and server CPUs, so it's not like we can save much on smaller
      systems.
      
      This removes the last bootmem allocation from the FPU init path, allowing
      it to be called earlier in the boot sequence.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Make setup_init_fpu_buf() run-once explicitly · 26b1f5d0
      Ingo Molnar authored
      Remove the dependency on the init_xstate_buf == NULL check to
      implement once-per-bootup logic in eager_fpu_init(), by making
      setup_init_fpu_buf() run once per bootup explicitly.
      
      This is in preparation to make init_xstate_buf statically
      allocated.
      
      The various boot-once quirks in the FPU init code will be removed
      in a later cleanup stage.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
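One common way to make a function run once explicitly, as commit 26b1f5d0 describes, is a static flag inside the function rather than a test on whether a buffer pointer is still NULL. The sketch below assumes that approach; `sim_setup_init_buf()` and `sim_setup_runs` are hypothetical names, and the kernel code may be structured differently.

```c
#include <stdbool.h>

static int sim_setup_runs;  /* counts how often the setup body actually executed */

/*
 * Explicit run-once guard: the function no longer depends on a
 * "buffer pointer is still NULL" check, so the buffer itself can
 * become a statically allocated object later.
 */
static void sim_setup_init_buf(void)
{
	static bool done;

	if (done)
		return;
	done = true;

	/* One-time setup work would go here. */
	sim_setup_runs++;
}
```

The guard matters because a statically allocated buffer is never NULL, so a pointer check could no longer tell the first call from later ones.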
    • x86/fpu: Remove xsave_init() bootmem allocations · 966ece61
      Ingo Molnar authored
      There are only 8 xstate bits at the moment, and it's not like we
      can support unknown bits - so put xstate_offsets[] and
      xstate_sizes[] into static allocation.
      
      This is in preparation to be able to call the FPU init code
      earlier, when there's no bootmem available yet.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
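The static allocation that commit 966ece61 describes can be sketched as two fixed-size arrays indexed by xstate bit number. The limit of 8 comes from the commit message; the `SIM_`/`sim_` names, the helper function, and its bounds check are illustrative assumptions rather than the kernel's code.

```c
/* The commit message states there are only 8 xstate bits at the moment. */
#define SIM_XFEATURES_NR_MAX 8

/* Statically allocated, so usable before any boot-time allocator is available. */
static unsigned int sim_xstate_offsets[SIM_XFEATURES_NR_MAX];
static unsigned int sim_xstate_sizes[SIM_XFEATURES_NR_MAX];

/*
 * Record the offset and size of one feature's save area.
 * Unknown bits (beyond the supported range) are rejected with -1.
 */
static int sim_record_xfeature(unsigned int nr, unsigned int offset,
			       unsigned int size)
{
	if (nr >= SIM_XFEATURES_NR_MAX)
		return -1;

	sim_xstate_offsets[nr] = offset;
	sim_xstate_sizes[nr] = size;
	return 0;
}
```

The cost is a few dozen bytes of static data in exchange for removing an allocation from the init path.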