- 23 Feb, 2020 15 commits
-
-
Ard Biesheuvel authored
We currently parse the command non-destructively, to avoid having to allocate memory for a copy before passing it to the standard parsing routines that are used by the core kernel, and which modify the input to delineate the parsed tokens with NUL characters. Instead, we call strstr() and strncmp() to go over the input multiple times, and match prefixes rather than tokens, which implies that we would match, e.g., 'nokaslrfoo' in the stub and disable KASLR, while the kernel would disregard the option and run with KASLR enabled. In order to avoid having to reason about whether and how this behavior may be abused, let's clean up the parsing routines, and rebuild them on top of the existing helpers. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
On x86, the preferred load address of the initrd is still below 4 GB, even though in some cases, we can cope with an initrd that is loaded above that. To simplify the code, and to make it more straightforward to introduce other ways to load the initrd, pass the soft and hard memory limits at the same time, and let the code handling the initrd= command line option deal with this. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
The file I/O routine that is used to load initrd or dtb files from the EFI system partition suffers from a few issues: - it converts the u8[] command line back to a UTF-16 string, which is pointless since we only handle initrd or dtb arguments provided via the loaded image protocol anyway, which is where we got the UTF-16[] command line from in the first place when booting via the PE entry point, - in the far majority of cases, only a single initrd= option is present, but it optimizes for multiple options, by going over the command line twice, allocating heap buffers for dynamically sized arrays, etc. - the coding style is hard to follow, with few comments, and all logic including string parsing etc all combined in a single routine. Let's fix this by rewriting most of it, based on the idea that in the case of multiple initrds, we can just allocate a new, bigger buffer and copy over the data before freeing the old one. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Split off the file I/O support code into a separate source file so it ends up in a separate object file in the static library, allowing the linker to omit it if the routines are not used. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
get_dram_base() is only called from arm-stub.c so move it into the same source file as its caller. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
efi_random_alloc() is only used on arm64, but as it shares a source file with efi_random_get_seed(), the latter will pull in the former on other architectures as well. Let's take advantage of the fact that libstub is a static library, and so the linker will only incorporate objects that are needed to satisfy dependencies in other objects. This means we can move the random alloc code to a separate source file that gets built unconditionally, but only used when needed. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
We now support cmdline data that is located in memory that is not 32-bit addressable, so relax the allocation limit on systems where this feature is enabled. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Move all the declarations that are only used in stub code from linux/efi.h to efistub.h which is only included locally. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
We now support bootparams structures that are located in memory that is not 32-bit addressable, so relax the allocation limit on systems where this feature is enabled. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Align the naming of efi_file_io_interface_t and efi_file_handle_t with the UEFI spec, and call them efi_simple_file_system_protocol_t and efi_file_protocol_t, respectively, using the same convention we use for all other type definitions that originate in the UEFI spec. While at it, move the definitions to efistub.h, so they are only seen by code that needs them. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Most of the EFI stub source files of all architectures reside under drivers/firmware/efi/libstub, where they share a Makefile with special CFLAGS and an include file with declarations that are only relevant for stub code. Currently, we carry a lot of stub specific stuff in linux/efi.h only because eboot.c in arch/x86 needs them as well. So let's move eboot.c into libstub/, and move the contents of eboot.h that we still care about into efistub.h Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
The implementation of efi_high_alloc() uses a complicated way of traversing the memory map to find an available region that is located as close as possible to the provided upper limit, and calls AllocatePages subsequently to create the allocation at that exact address. This is precisely what the EFI_ALLOCATE_MAX_ADDRESS allocation type argument to AllocatePages() does, and considering that EFI_ALLOC_ALIGN only exceeds EFI_PAGE_SIZE on arm64, let's use AllocatePages() directly and implement the alignment using code that the compiler can remove if it does not exceed EFI_PAGE_SIZE. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Create a new source file mem.c to keep the routines involved in memory allocation and deallocation and manipulation of the EFI memory map. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
The arm64 kernel no longer requires the FDT blob to fit inside a naturally aligned 2 MB memory block, so remove the code that aligns the allocation to 2 MB. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Instead of setting the visibility pragma for a small set of symbol declarations that could result in absolute references that we cannot support in the stub, declare hidden visibility for all code in the EFI stub, which is more robust and future proof. To ensure that the #pragma is taken into account before any other includes are processed, put it in a header file of its own and include it via the compiler command line using the -include option. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
- 22 Feb, 2020 16 commits
-
-
Ard Biesheuvel authored
When using the native PE entry point (as opposed to the EFI handover protocol entry point that is used more widely), we set code32_start, which is a 32-bit wide field, to the effective symbol address of startup_32, which could overflow given that the EFI loader may have located the running image anywhere in memory, and we haven't reached the point yet where we relocate ourselves. Since we relocate ourselves if code32_start != pref_address, this isn't likely to lead to problems in practice, given how unlikely it is that the truncated effective address of startup_32 happens to equal pref_address. But it is better to defer the assignment of code32_start to after the relocation, when it is guaranteed to fit. While at it, move the call to efi_relocate_kernel() to an earlier stage so it is more likely that our preferred offset in memory has not been occupied by other memory allocations done in the mean time. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
We have some code in the EFI stub entry point that takes the address of the apm_bios_info struct in the newly allocated and zeroed out boot_params structure, only to zero it out again. This is pointless so remove it. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Gustavo A. R. Silva authored
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertenly introduced[3] to the codebase from now on. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732 ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200211231421.GA15697@embeddedorSigned-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
The UEFI spec defines (and deprecates) a misguided and shortlived memory protection feature that is based on splitting memory regions covering PE/COFF executables into separate code and data regions, without annotating them as belonging to the same executable image. When the OS assigns the virtual addresses of these regions, it may move them around arbitrarily, without taking into account that the PE/COFF code sections may contain relative references into the data sections, which means the relative placement of these segments has to be preserved or the executable image will be corrupted. The original workaround on arm64 was to ensure that adjacent regions of the same type were mapped adjacently in the virtual mapping, but this requires sorting of the memory map, which we would prefer to avoid. Considering that the native physical mapping of the PE/COFF images does not suffer from this issue, let's preserve it at runtime, and install it as the virtual mapping as well. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Hans de Goede authored
Some (somewhat older) laptops have a correct BGRT table, except that the version field is 0 instead of 1. This has been seen on several Ivy Bridge based Lenovo models. For now the spec. only defines version 1, so it is reasonably safe to assume that tables with a version of 0 really are version 1 too, which is what this commit does so that the BGRT table will be accepted by the kernel on laptop models with this issue. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20200131130623.33875-1-hdegoede@redhat.comSigned-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Arvind Sankar authored
This function is only called from efi_main in the same source file. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Link: https://lore.kernel.org/r/20200130222004.1932152-1-nivedita@alum.mit.eduSigned-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Arvind Sankar authored
Rearrange the instructions a bit to use a 32-bit displacement once instead of 2/3 times. This saves 8 bytes of machine code. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Link: https://lore.kernel.org/r/20200202171353.3736319-8-nivedita@alum.mit.eduSigned-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Arvind Sankar authored
The limit value for the GDTR should be such that adding it to the base address gives the address of the last byte of the GDT, i.e. it should be one less than the size, not the size. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Link: https://lore.kernel.org/r/20200202171353.3736319-7-nivedita@alum.mit.eduSigned-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Arvind Sankar authored
The 64-bit kernel will already load a GDT in startup_64, which is the next function to execute after return from efi_main. Add GDT setup code to the 32-bit kernel's startup_32 as well. Doing it in the head code has the advantage that we can avoid potentially corrupting the GDT during copy/decompression. This also removes dependence on having a specific GDT layout setup by the bootloader. Both startup_32 and startup_64 now clear interrupts on entry, so we can remove that from efi_main as well. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Link: https://lore.kernel.org/r/20200202171353.3736319-6-nivedita@alum.mit.eduSigned-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Arvind Sankar authored
startup_32 already clears these flags on entry, do it in startup_64 as well for consistency. The direction flag in particular is not specified to be cleared in the boot protocol documentation, and we currently call into C code (paging_prepare) without explicitly clearing it. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Link: https://lore.kernel.org/r/20200202171353.3736319-5-nivedita@alum.mit.eduSigned-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Arvind Sankar authored
The GDT may get overwritten during the copy or during extract_kernel, which will cause problems if any segment register is touched before the GDTR is reloaded by the decompressed kernel. For safety update the GDTR to point to the GDT within the copied kernel. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Link: https://lore.kernel.org/r/20200202171353.3736319-4-nivedita@alum.mit.eduSigned-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Arvind Sankar authored
When booting in mixed mode, the firmware's GDT is still installed at handover entry in efi32_stub_entry. We save the GDTR for later use in __efi64_thunk but we are assuming that descriptor 2 (__KERNEL_CS) is a valid 32-bit code segment descriptor and that descriptor 3 (__KERNEL_DS/__BOOT_DS) is a valid data segment descriptor. This happens to be true for OVMF (it actually uses descriptor 1 for data segments, but descriptor 3 is also setup as data), but we shouldn't depend on this being the case. Fix this by saving the code and data selectors in addition to the GDTR in efi32_stub_entry, and restoring them in __efi64_thunk before calling the firmware. The UEFI specification guarantees that selectors will be flat, so using the DS selector for all the segment registers should be enough. We also need to install our own GDT before initializing segment registers in startup_32, so move the GDT load up to the beginning of the function. [ardb: mention mixed mode in the commit log] Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Link: https://lore.kernel.org/r/20200202171353.3736319-3-nivedita@alum.mit.eduSigned-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Arvind Sankar authored
Commit a24e7851 ("i386: paravirt boot sequence") added this flag for use by paravirtualized environments such as Xen. However, Xen never made use of this flag [1], and it was only ever used by lguest [2]. Commit ecda85e7 ("x86/lguest: Remove lguest support") removed lguest, so KEEP_SEGMENTS has lost its last user. [1] https://lore.kernel.org/lkml/4D4B097C.5050405@goop.org [2] https://www.mail-archive.com/lguest@lists.ozlabs.org/msg00469.htmlSigned-off-by: Arvind Sankar <nivedita@alum.mit.edu> Link: https://lore.kernel.org/r/20200202171353.3736319-2-nivedita@alum.mit.eduSigned-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Expose efi_entry() as the PE/COFF entrypoint directly, instead of jumping into a wrapper that fiddles with stack buffers and other stuff that the compiler is much better at. The only reason this code exists is to obtain a pointer to the base of the image, but we can get the same value from the loaded_image protocol, which we already need for other reasons anyway. Update the return type as well, to make it consistent with what is required for a PE/COFF executable entrypoint. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
In preparation for turning the decompressor's cache clean/flush operations into proper by-VA maintenance for v7 cores, pass the start and end addresses of the regions that need cache maintenance into cache_clean_flush in registers r0 and r1. Currently, all implementations of cache_clean_flush ignore these values, so no functional change is expected as a result of this patch. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
The EFI stub executes within the context of the zImage as it was loaded by the firmware, which means it is treated as an ordinary PE/COFF executable, which is loaded into memory, and cleaned to the PoU to ensure that it can be executed safely while the MMU and caches are on. When the EFI stub hands over to the decompressor, we clean the caches by set/way and disable the MMU and D-cache, to comply with the Linux boot protocol for ARM. However, cache maintenance by set/way is not sufficient to ensure that subsequent instruction fetches and data accesses done with the MMU off see the correct data. This means that proceeding as we do currently is not safe, especially since we also perform data accesses with the MMU off, from a literal pool as well as the stack. So let's kick this can down the road a bit, and jump into the relocated zImage before disabling the caches. This removes the requirement to perform any by-VA cache maintenance on the original PE/COFF executable, but it does require that the relocated zImage is cleaned to the PoC, which is currently not the case. This will be addressed in a subsequent patch. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
- 10 Feb, 2020 2 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuildLinus Torvalds authored
Pull more Kbuild updates from Masahiro Yamada: - fix randconfig to generate a sane .config - rename hostprogs-y / always to hostprogs / always-y, which are more natual syntax. - optimize scripts/kallsyms - fix yes2modconfig and mod2yesconfig - make multiple directory targets ('make foo/ bar/') work * tag 'kbuild-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: make multiple directory targets work kconfig: Invalidate all symbols after changing to y or m. kallsyms: fix type of kallsyms_token_table[] scripts/kallsyms: change table to store (strcut sym_entry *) scripts/kallsyms: rename local variables in read_symbol() kbuild: rename hostprogs-y/always to hostprogs/always-y kbuild: fix the document to use extra-y for vmlinux.lds kconfig: fix broken dependency in randconfig-generated .config
-
- 09 Feb, 2020 7 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefsLinus Torvalds authored
Pull new zonefs file system from Damien Le Moal: "Zonefs is a very simple file system exposing each zone of a zoned block device as a file. Unlike a regular file system with native zoned block device support (e.g. f2fs or the on-going btrfs effort), zonefs does not hide the sequential write constraint of zoned block devices to the user. As a result, zonefs is not a POSIX compliant file system. Its goal is to simplify the implementation of zoned block devices support in applications by replacing raw block device file accesses with a richer file based API, avoiding relying on direct block device file ioctls which may be more obscure to developers. One example of this approach is the implementation of LSM (log-structured merge) tree structures (such as used in RocksDB and LevelDB) on zoned block devices by allowing SSTables to be stored in a zone file similarly to a regular file system rather than as a range of sectors of a zoned device. The introduction of the higher level construct "one file is one zone" can help reducing the amount of changes needed in the application while at the same time allowing the use of zoned block devices with various programming languages other than C. Zonefs IO management implementation uses the new iomap generic code. Zonefs has been successfully tested using a functional test suite (available with zonefs userland format tool on github) and a prototype implementation of LevelDB on top of zonefs" * tag 'zonefs-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: Add documentation fs: New zonefs file system
-
Marc Zyngier authored
In order to allow the GICv4 code to link properly on 32bit ARM, make sure we don't use 64bit divisions when it isn't strictly necessary. Fixes: 4e6437f1 ("irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull cifs fixes from Steve French: "13 cifs/smb3 patches, most from testing at the SMB3 plugfest this week: - Important fix for multichannel and for modefromsid mounts. - Two reconnect fixes - Addition of SMB3 change notify support - Backup tools fix - A few additional minor debug improvements (tracepoints and additional logging found useful during testing this week)" * tag '5.6-rc-smb3-plugfest-patches' of git://git.samba.org/sfrench/cifs-2.6: smb3: Add defines for new information level, FileIdInformation smb3: print warning once if posix context returned on open smb3: add one more dynamic tracepoint missing from strict fsync path cifs: fix mode bits from dir listing when mounted with modefromsid cifs: fix channel signing cifs: add SMB3 change notification support cifs: make multichannel warning more visible cifs: fix soft mounts hanging in the reconnect code cifs: Add tracepoints for errors on flush or fsync cifs: log warning message (once) if out of disk space cifs: fail i/o on soft mounts if sessionsetup errors out smb3: fix problem with null cifs super block with previous patch SMB3: Backup intent flag missing from some more ops
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vboxfs from Al Viro: "This is the VirtualBox guest shared folder support by Hans de Goede, with fixups for fs_parse folded in to avoid bisection hazards from those API changes..." * 'work.vboxsf' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Add VirtualBox guest shared folder (vboxsf) support
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Thomas Gleixner: "A set of fixes for X86: - Ensure that the PIT is set up when the local APIC is disable or configured in legacy mode. This is caused by an ordering issue introduced in the recent changes which skip PIT initialization when the TSC and APIC frequencies are already known. - Handle malformed SRAT tables during early ACPI parsing which caused an infinite loop anda boot hang. - Fix a long standing race in the affinity setting code which affects PCI devices with non-maskable MSI interrupts. The problem is caused by the non-atomic writes of the MSI address (destination APIC id) and data (vector) fields which the device uses to construct the MSI message. The non-atomic writes are mandated by PCI. If both fields change and the device raises an interrupt after writing address and before writing data, then the MSI block constructs a inconsistent message which causes interrupts to be lost and subsequent malfunction of the device. The fix is to redirect the interrupt to the new vector on the current CPU first and then switch it over to the new target CPU. This allows to observe an eventually raised interrupt in the transitional stage (old CPU, new vector) to be observed in the APIC IRR and retriggered on the new target CPU and the new vector. The potential spurious interrupts caused by this are harmless and can in the worst case expose a buggy driver (all handlers have to be able to deal with spurious interrupts as they can and do happen for various reasons). - Add the missing suspend/resume mechanism for the HYPERV hypercall page which prevents resume hibernation on HYPERV guests. This change got lost before the merge window. - Mask the IOAPIC before disabling the local APIC to prevent potentially stale IOAPIC remote IRR bits which cause stale interrupt lines after resume" * tag 'x86-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Mask IOAPIC entries when disabling the local APIC x86/hyperv: Suspend/resume the hypercall page for hibernation x86/apic/msi: Plug non-maskable MSI affinity race x86/boot: Handle malformed SRAT tables during early ACPI parsing x86/timer: Don't skip PIT setup when APIC is disabled or in legacy mode
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull SMP fixes from Thomas Gleixner: "Two fixes for the SMP related functionality: - Make the UP version of smp_call_function_single() match SMP semantics when called for a not available CPU. Instead of emitting a warning and assuming that the function call target is CPU0, return a proper error code like the SMP version does. - Remove a superfluous check in smp_call_function_many_cond()" * tag 'smp-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp/up: Make smp_call_function_single() match SMP semantics smp: Remove superfluous cond_func check in smp_call_function_many_cond()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fixes from Thomas Gleixner: "A set of fixes and improvements for the perf subsystem: Kernel fixes: - Install cgroup events to the correct CPU context to prevent a potential list double add - Prevent an integer underflow in the perf mlock accounting - Add a missing prototype for arch_perf_update_userpage() Tooling: - Add a missing unlock in the error path of maps__insert() in perf maps. - Fix the build with the latest libbfd - Fix the perf parser so it does not delete parse event terms, which caused a regression for using perf with the ARM CoreSight as the sink configuration was missing due to the deletion. - Fix the double free in the perf CPU map merging test case - Add the missing ustring support for the perf probe command" * tag 'perf-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf maps: Add missing unlock to maps__insert() error case perf probe: Add ustring support for perf probe command perf: Make perf able to build with latest libbfd perf test: Fix test case Merge cpu map perf parse: Copy string to perf_evsel_config_term perf parse: Refactor 'struct perf_evsel_config_term' kernel/events: Add a missing prototype for arch_perf_update_userpage() perf/cgroups: Install cgroup events to correct cpuctx perf/core: Fix mlock accounting in perf_mmap()
-