- 26 Oct, 2021 12 commits
-
-
Chang S. Bae authored
Intel's eXtended Feature Disable (XFD) feature is an extension of the XSAVE architecture. XFD allows the kernel to enable a feature state in XCR0 and to receive a #NM trap when a task uses instructions accessing that state. This is going to be used to postpone the allocation of a larger XSTATE buffer for a task to the point where it is actually using a related instruction after the permission to use that facility has been granted. XFD is not used by the kernel, but only applied to userspace. This is a matter of policy as the kernel knows how a fpstate is reallocated and the XFD state. The compacted XSAVE format is adjustable for dynamic features. Make XFD depend on XSAVES. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-13-chang.seok.bae@intel.com
-
Chang S. Bae authored
On exec(), extended register states saved in the buffer is cleared. With dynamic features, each task carries variables besides the register states. The struct fpu has permission information and struct fpstate contains buffer size and feature masks. They are all dynamically updated with dynamic features. Reset the current task's entire FPU data before an exec() so that the new task starts with default permission and fpstate. Rename the register state reset function because the old naming confuses as it does not reset struct fpstate. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-12-chang.seok.bae@intel.com
-
Thomas Gleixner authored
The default portion of the parent's FPU state is saved in a child task. With dynamic features enabled, the non-default portion is not saved in a child's fpstate because these register states are defined to be caller-saved. The new task's fpstate is therefore the default buffer. Fork inherits the permission of the parent. Also, do not use memcpy() when TIF_NEED_FPU_LOAD is set because it is invalid when the parent has dynamic features. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-11-chang.seok.bae@intel.com
-
Chang S. Bae authored
The software reserved portion of the fxsave frame in the signal frame is copied from structures which have been set up at boot time. With dynamically enabled features the content of these structures is no longer correct because the xfeatures and size can be different per task. Calculate the software reserved portion at runtime and fill in the xfeatures and size values from the tasks active fpstate. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-10-chang.seok.bae@intel.com
-
Thomas Gleixner authored
Use the current->group_leader->fpu to check for pending permissions to use extended features and validate against the resulting user space size which is stored in the group leaders fpu struct as well. This prevents a task from installing a too small sized sigaltstack after permissions to use dynamically enabled features have been granted, but the task has not (yet) used a related instruction. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-9-chang.seok.bae@intel.com
-
Thomas Gleixner authored
To allow building up the infrastructure required to support dynamically enabled FPU features, add: - XFEATURES_MASK_DYNAMIC This constant will hold xfeatures which can be dynamically enabled. - fpu_state_size_dynamic() A static branch for 64-bit and a simple 'return false' for 32-bit. This helper allows to add dynamic-feature-specific changes to common code which is shared between 32-bit and 64-bit without #ifdeffery. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-8-chang.seok.bae@intel.com
-
Chang S. Bae authored
Dynamically enabled XSTATE features are by default disabled for all processes. A process has to request permission to use such a feature. To support this implement a architecture specific prctl() with the options: - ARCH_GET_XCOMP_SUPP Copies the supported feature bitmap into the user space provided u64 storage. The pointer is handed in via arg2 - ARCH_GET_XCOMP_PERM Copies the process wide permitted feature bitmap into the user space provided u64 storage. The pointer is handed in via arg2 - ARCH_REQ_XCOMP_PERM Request permission for a feature set. A feature set can be mapped to a facility, e.g. AMX, and can require one or more XSTATE components to be enabled. The feature argument is the number of the highest XSTATE component which is required for a facility to work. The request argument is not a user supplied bitmap because that makes filtering harder (think seccomp) and even impossible because to support 32bit tasks the argument would have to be a pointer. The permission mechanism works this way: Task asks for permission for a facility and kernel checks whether that's supported. If supported it does: 1) Check whether permission has already been granted 2) Compute the size of the required kernel and user space buffer (sigframe) size. 3) Validate that no task has a sigaltstack installed which is smaller than the resulting sigframe size 4) Add the requested feature bit(s) to the permission bitmap of current->group_leader->fpu and store the sizes in the group leaders fpu struct as well. If that is successful then the feature is still not enabled for any of the tasks. The first usage of a related instruction will result in a #NM trap. The trap handler validates the permission bit of the tasks group leader and if permitted it installs a larger kernel buffer and transfers the permission and size info to the new fpstate container which makes all the FPU functions which require per task information aware of the extended feature set. [ tglx: Adopted to new base code, added missing serialization, massaged namings, comments and changelog ] Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-7-chang.seok.bae@intel.com
-
Thomas Gleixner authored
The upcoming prctl() which is required to request the permission for a dynamically enabled feature will also provide an option to retrieve the supported features. If the CPU does not support XSAVE, the supported features would be 0 even when the CPU supports FP and SSE. Provide separate storage for the legacy feature set to avoid that and fill in the bits in the legacy init function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-6-chang.seok.bae@intel.com
-
Thomas Gleixner authored
Dynamically enabled features can be requested by any thread of a running process at any time. The request does neither enable the feature nor allocate larger buffers. It just stores the permission to use the feature by adding the features to the permission bitmap and by calculating the required sizes for kernel and user space. The reallocation of the kernel buffer happens when the feature is used for the first time which is caught by an exception. The permission bitmap is then checked and if the feature is permitted, then it becomes fully enabled. If not, the task dies similarly to a task which uses an undefined instruction. The size information is precomputed to allow proper sigaltstack size checks once the feature is permitted, but not yet in use because otherwise this would open race windows where too small stacks could be installed causing a later fail on signal delivery. Initialize them to the default feature set and sizes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-5-chang.seok.bae@intel.com
-
Chang S. Bae authored
Split out the size calculation from the paranoia check so it can be used for recalculating buffer sizes when dynamically enabled features are supported. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> [ tglx: Adopted to changed base code ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-4-chang.seok.bae@intel.com
-
Thomas Gleixner authored
For historical reasons MINSIGSTKSZ is a constant which became already too small with AVX512 support. Add a mechanism to enforce strict checking of the sigaltstack size against the real size of the FPU frame. The strict check can be enabled via a config option and can also be controlled via the kernel command line option 'strict_sas_size' independent of the config switch. Enabling it might break existing applications which allocate a too small sigaltstack but 'work' because they never get a signal delivered. Though it can be handy to filter out binaries which are not yet aware of AT_MINSIGSTKSZ. Also the upcoming support for dynamically enabled FPU features requires a strict sanity check to ensure that: - Enabling of a dynamic feature, which changes the sigframe size fits into an enabled sigaltstack - Installing a too small sigaltstack after a dynamic feature has been added is not possible. Implement the base check which is controlled by config and command line options. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-3-chang.seok.bae@intel.com
-
Thomas Gleixner authored
New x86 FPU features will be very large, requiring ~10k of stack in signal handlers. These new features require a new approach called "dynamic features". The kernel currently tries to ensure that altstacks are reasonably sized. Right now, on x86, sys_sigaltstack() requires a size of >=2k. However, that 2k is a constant. Simply raising that 2k requirement to >10k for the new features would break existing apps which have a compiled-in size of 2k. Instead of universally enforcing a larger stack, prohibit a process from using dynamic features without properly-sized altstacks. This must be enforced in two places: * A dynamic feature can not be enabled without an large-enough altstack for each process thread. * Once a dynamic feature is enabled, any request to install a too-small altstack will be rejected The dynamic feature enabling code must examine each thread in a process to ensure that the altstacks are large enough. Add a new lock (sigaltstack_lock()) to ensure that threads can not race and change their altstack after being examined. Add the infrastructure in form of a config option and provide empty stubs for architectures which do not need dynamic altstack size checks. This implementation will be fleshed out for x86 in a future patch called x86/arch_prctl: Add controls for dynamic XSTATE components [dhansen: commit message. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-2-chang.seok.bae@intel.com
-
- 23 Oct, 2021 4 commits
-
-
Thomas Gleixner authored
No more users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211022185313.074853631@linutronix.de
-
Thomas Gleixner authored
For the upcoming AMX support it's necessary to do a proper integration with KVM. Currently KVM allocates two FPU structs which are used for saving the user state of the vCPU thread and restoring the guest state when entering vcpu_run() and doing the reverse operation before leaving vcpu_run(). With the new fpstate mechanism this can be reduced to one extra buffer by swapping the fpstate pointer in current::thread::fpu. This makes the upcoming support for AMX and XFD simpler because then fpstate information (features, sizes, xfd) are always consistent and it does not require any nasty workarounds. Convert the KVM FPU code over to this new scheme. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211022185313.019454292@linutronix.de
-
Thomas Gleixner authored
For the upcoming AMX support it's necessary to do a proper integration with KVM. Currently KVM allocates two FPU structs which are used for saving the user state of the vCPU thread and restoring the guest state when entering vcpu_run() and doing the reverse operation before leaving vcpu_run(). With the new fpstate mechanism this can be reduced to one extra buffer by swapping the fpstate pointer in current::thread::fpu. This makes the upcoming support for AMX and XFD simpler because then fpstate information (features, sizes, xfd) are always consistent and it does not require any nasty workarounds. Provide: - An allocator which initializes the state properly - A replacement for the existing FPU swap mechanim Aside of the reduced memory footprint, this also makes state switching more efficient when TIF_FPU_NEED_LOAD is set. It does not require a memcpy as the state is already correct in the to be swapped out fpstate. The existing interfaces will be removed once KVM is converted over. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211022185312.954684740@linutronix.de
-
Thomas Gleixner authored
For the upcoming AMX support it's necessary to do a proper integration with KVM. To avoid more nasty hackery in KVM which violate encapsulation extend struct fpu and fpstate so the fpstate switching can be consolidated and simplified. Currently KVM allocates two FPU structs which are used for saving the user state of the vCPU thread and restoring the guest state when entering vcpu_run() and doing the reverse operation before leaving vcpu_run(). With the new fpstate mechanism this can be reduced to one extra buffer by swapping the fpstate pointer in current::thread::fpu. This makes the upcoming support for AMX and XFD simpler because then fpstate information (features, sizes, xfd) are always consistent and it does not require any nasty workarounds. Add fpu::__task_fpstate to save the regular fpstate pointer while the task is inside vcpu_run(). Add some state fields to fpstate to indicate the nature of the state. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211022185312.896403942@linutronix.de
-
- 22 Oct, 2021 3 commits
-
-
Thomas Gleixner authored
Now that everything is mopped up, move all the helpers and prototypes into the core header. They are not required by the outside. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.514095101@linutronix.de
-
Thomas Gleixner authored
xfeatures_mask_fpstate() is no longer valid when dynamically enabled features come into play. Rework restore_regs_from_fpstate() so it takes a constant mask which will then be applied against the maximum feature set so that the restore operation brings all features which are not in the xsave buffer xfeature bitmap into init state. This ensures that if the previous task used a dynamically enabled feature that the task which restores has all unused components properly initialized. Cleanup the last user of xfeatures_mask_fpstate() as well and remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.461348278@linutronix.de
-
Thomas Gleixner authored
Use the new fpu_user_cfg to retrieve the information instead of xfeatures_mask_uabi() which will be no longer correct when dynamically enabled features become available. Using fpu_user_cfg is appropriate when setting XCOMP_BV in the init_fpstate since it has space allocated for "max_features". But, normal fpstates might only have space for default xfeatures. Since XRSTOR* derives the format of the XSAVE buffer from XCOMP_BV, this can lead to XRSTOR reading out of bounds. So when copying actively used fpstate, simply read the XCOMP_BV features bits directly out of the fpstate instead. This correction courtesy of Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.408879849@linutronix.de
-
- 21 Oct, 2021 15 commits
-
-
Thomas Gleixner authored
Move the feature mask storage to the kernel and user config structs. Default and maximum feature set are the same for now. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.352041752@linutronix.de
-
Thomas Gleixner authored
Use the new kernel and user space config storage to store and retrieve the XSTATE buffer sizes. The default and the maximum size are the same for now, but will change when support for dynamically enabled features is added. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.296830097@linutronix.de
-
Thomas Gleixner authored
The size calculations are partially unreadable gunk. Clean them up. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.241223689@linutronix.de
-
Thomas Gleixner authored
Clean the function up before making changes. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.184014242@linutronix.de
-
Thomas Gleixner authored
Provide a struct to store information about the maximum supported and the default feature set and buffer sizes for both user and kernel space. This allows quick retrieval of this information for the upcoming support for dynamically enabled features. [ bp: Add vertical spacing between the struct members. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.126107370@linutronix.de
-
Thomas Gleixner authored
For dynamically enabled features it's required to get the features which are enabled for that context when restoring from sigframe. The same applies for all signal frame size calculations. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/87ilxz5iew.ffs@tglx
-
Thomas Gleixner authored
Prepare for dynamically enabled states per task. The function needs to retrieve the features and sizes which are valid in a fpstate context. Retrieve them from fpstate. Move the function declarations to the core header as they are not required anywhere else. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.233529986@linutronix.de
-
Thomas Gleixner authored
With dynamically enabled features the copy function must know the features and the size which is valid for the task. Retrieve them from fpstate. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.181495492@linutronix.de
-
Thomas Gleixner authored
Straight forward conversion. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.129699950@linutronix.de
-
Thomas Gleixner authored
With dynamically enabled features the sigframe code must know the features which are enabled for the task. Get them from fpstate. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.077781448@linutronix.de
-
Thomas Gleixner authored
With variable feature sets XSAVE[S] requires to know the feature set for which the buffer is valid. Retrieve it from fpstate. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.025695590@linutronix.de
-
Thomas Gleixner authored
Make use of fpstate::size in various places which require the buffer size information for sanity checks or memcpy() sizing. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145322.973518954@linutronix.de
-
Thomas Gleixner authored
Add state size and feature mask information to the fpstate container. This will be used for runtime checks with the upcoming support for dynamically enabled features and dynamically sized buffers. That avoids conditionals all over the place as the required information is accessible for both default and extended buffers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145322.921388806@linutronix.de
-
Thomas Gleixner authored
In preparation for dynamically enabled FPU features move the function out of line as the goal is to expose less and not more information. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145322.869001791@linutronix.de
-
Thomas Gleixner authored
If fork fails early then the copied task struct would carry the fpstate pointer of the parent task. Not a problem right now, but later when dynamically allocated buffers are available, keeping the pointer might result in freeing the parent's buffer. Set it to NULL which prevents that. If fork reaches clone_thread(), the pointer will be correctly set to the new task context. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145322.817101108@linutronix.de
-
- 20 Oct, 2021 6 commits
-
-
Thomas Gleixner authored
All users converted. Remove it along with the sanity checks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145322.765063318@linutronix.de
-
Thomas Gleixner authored
Convert math emulation code to the new register storage mechanism in preparation for dynamically sized buffers. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145322.711347464@linutronix.de
-
Thomas Gleixner authored
Convert the rest of the core code to the new register storage mechanism in preparation for dynamically sized buffers. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145322.659456185@linutronix.de
-
Thomas Gleixner authored
Convert signal related code to the new register storage mechanism in preparation for dynamically sized buffers. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145322.607370221@linutronix.de
-
Thomas Gleixner authored
Convert regset related code to the new register storage mechanism in preparation for dynamically sized buffers. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145322.555239736@linutronix.de
-
Thomas Gleixner authored
Convert FPU tracing code to the new register storage mechanism in preparation for dynamically sized buffers. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145322.503327333@linutronix.de
-