1. 06 Jun, 2023 3 commits
    • Kirill A. Shutemov's avatar
      efi/libstub: Implement support for unaccepted memory · 745e3ed8
      Kirill A. Shutemov authored
      UEFI Specification version 2.9 introduces the concept of memory
      acceptance: Some Virtual Machine platforms, such as Intel TDX or AMD
      SEV-SNP, requiring memory to be accepted before it can be used by the
      guest. Accepting happens via a protocol specific for the Virtual
      Machine platform.
      
      Accepting memory is costly and it makes VMM allocate memory for the
      accepted guest physical address range. It's better to postpone memory
      acceptance until memory is needed. It lowers boot time and reduces
      memory overhead.
      
      The kernel needs to know what memory has been accepted. Firmware
      communicates this information via memory map: a new memory type --
      EFI_UNACCEPTED_MEMORY -- indicates such memory.
      
      Range-based tracking works fine for firmware, but it gets bulky for
      the kernel: e820 (or whatever the arch uses) has to be modified on every
      page acceptance. It leads to table fragmentation and there's a limited
      number of entries in the e820 table.
      
      Another option is to mark such memory as usable in e820 and track if the
      range has been accepted in a bitmap. One bit in the bitmap represents a
      naturally aligned power-2-sized region of address space -- unit.
      
      For x86, unit size is 2MiB: 4k of the bitmap is enough to track 64GiB or
      physical address space.
      
      In the worst-case scenario -- a huge hole in the middle of the
      address space -- It needs 256MiB to handle 4PiB of the address
      space.
      
      Any unaccepted memory that is not aligned to unit_size gets accepted
      upfront.
      
      The bitmap is allocated and constructed in the EFI stub and passed down
      to the kernel via EFI configuration table. allocate_e820() allocates the
      bitmap if unaccepted memory is present, according to the size of
      unaccepted region.
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarBorislav Petkov (AMD) <bp@alien8.de>
      Reviewed-by: default avatarArd Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/20230606142637.5171-4-kirill.shutemov@linux.intel.com
      745e3ed8
    • Kirill A. Shutemov's avatar
      efi/x86: Get full memory map in allocate_e820() · 2e9f46ee
      Kirill A. Shutemov authored
      Currently allocate_e820() is only interested in the size of map and size
      of memory descriptor to determine how many e820 entries the kernel
      needs.
      
      UEFI Specification version 2.9 introduces a new memory type --
      unaccepted memory. To track unaccepted memory, the kernel needs to
      allocate a bitmap. The size of the bitmap is dependent on the maximum
      physical address present in the system. A full memory map is required to
      find the maximum address.
      
      Modify allocate_e820() to get a full memory map.
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarBorislav Petkov (AMD) <bp@alien8.de>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarTom Lendacky <thomas.lendacky@amd.com>
      Acked-by: default avatarArd Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/20230606142637.5171-3-kirill.shutemov@linux.intel.com
      2e9f46ee
    • Kirill A. Shutemov's avatar
      mm: Add support for unaccepted memory · dcdfdd40
      Kirill A. Shutemov authored
      UEFI Specification version 2.9 introduces the concept of memory
      acceptance. Some Virtual Machine platforms, such as Intel TDX or AMD
      SEV-SNP, require memory to be accepted before it can be used by the
      guest. Accepting happens via a protocol specific to the Virtual Machine
      platform.
      
      There are several ways the kernel can deal with unaccepted memory:
      
       1. Accept all the memory during boot. It is easy to implement and it
          doesn't have runtime cost once the system is booted. The downside is
          very long boot time.
      
          Accept can be parallelized to multiple CPUs to keep it manageable
          (i.e. via DEFERRED_STRUCT_PAGE_INIT), but it tends to saturate
          memory bandwidth and does not scale beyond the point.
      
       2. Accept a block of memory on the first use. It requires more
          infrastructure and changes in page allocator to make it work, but
          it provides good boot time.
      
          On-demand memory accept means latency spikes every time kernel steps
          onto a new memory block. The spikes will go away once workload data
          set size gets stabilized or all memory gets accepted.
      
       3. Accept all memory in background. Introduce a thread (or multiple)
          that gets memory accepted proactively. It will minimize time the
          system experience latency spikes on memory allocation while keeping
          low boot time.
      
          This approach cannot function on its own. It is an extension of #2:
          background memory acceptance requires functional scheduler, but the
          page allocator may need to tap into unaccepted memory before that.
      
          The downside of the approach is that these threads also steal CPU
          cycles and memory bandwidth from the user's workload and may hurt
          user experience.
      
      Implement #1 and #2 for now. #2 is the default. Some workloads may want
      to use #1 with accept_memory=eager in kernel command line. #3 can be
      implemented later based on user's demands.
      
      Support of unaccepted memory requires a few changes in core-mm code:
      
        - memblock accepts memory on allocation. It serves early boot memory
          allocations and doesn't limit them to pre-accepted pool of memory.
      
        - page allocator accepts memory on the first allocation of the page.
          When kernel runs out of accepted memory, it accepts memory until the
          high watermark is reached. It helps to minimize fragmentation.
      
      EFI code will provide two helpers if the platform supports unaccepted
      memory:
      
       - accept_memory() makes a range of physical addresses accepted.
      
       - range_contains_unaccepted_memory() checks anything within the range
         of physical addresses requires acceptance.
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarBorislav Petkov (AMD) <bp@alien8.de>
      Reviewed-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: Mike Rapoport <rppt@linux.ibm.com>	# memblock
      Link: https://lore.kernel.org/r/20230606142637.5171-2-kirill.shutemov@linux.intel.com
      dcdfdd40
  2. 04 Jun, 2023 9 commits
    • Linus Torvalds's avatar
      Linux 6.4-rc5 · 9561de3a
      Linus Torvalds authored
      9561de3a
    • Linus Torvalds's avatar
      Merge tag 'irq_urgent_for_v6.4_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6f64a5eb
      Linus Torvalds authored
      Pull irq fix from Borislav Petkov:
      
       - Fix open firmware quirks validation so that they don't get applied
         wrongly
      
      * tag 'irq_urgent_for_v6.4_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic: Correctly validate OF quirk descriptors
      6f64a5eb
    • Linus Torvalds's avatar
      Merge tag 'media/v6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 5e89d62e
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       "Some driver fixes:
         - a regression fix for the verisilicon driver
         - uvcvideo: don't expose unsupported video formats to userspace
         - camss-video: don't zero subdev format after init
         - mediatek: some fixes for 4K decoder formats
         - fix a Sphinx build warning (missing doc for client_caps)
         - some fixes for imx and atomisp staging drivers
      
        And two CEC core fixes:
         - don't set last_initiator if TX in progress
         - disable adapter in cec_devnode_unregister"
      
      * tag 'media/v6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        media: uvcvideo: Don't expose unsupported formats to userspace
        media: v4l2-subdev: Fix missing kerneldoc for client_caps
        media: staging: media: imx: initialize hs_settle to avoid warning
        media: v4l2-mc: Drop subdev check in v4l2_create_fwnode_links_to_pad()
        media: staging: media: atomisp: init high & low vars
        media: cec: core: don't set last_initiator if tx in progress
        media: cec: core: disable adapter in cec_devnode_unregister
        media: mediatek: vcodec: Only apply 4K frame sizes on decoder formats
        media: camss: camss-video: Don't zero subdev format again after initialization
        media: verisilicon: Additional fix for the crash when opening the driver
      5e89d62e
    • Linus Torvalds's avatar
      Merge tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 209835e8
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are a bunch of tiny char/misc/other driver fixes for 6.4-rc5 that
        resolve a number of reported issues. Included in here are:
      
         - iio driver fixes
      
         - fpga driver fixes
      
         - test_firmware bugfixes
      
         - fastrpc driver tiny bugfixes
      
         - MAINTAINERS file updates for some subsystems
      
        All of these have been in linux-next this past week with no reported
        issues"
      
      * tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (34 commits)
        test_firmware: fix the memory leak of the allocated firmware buffer
        test_firmware: fix a memory leak with reqs buffer
        test_firmware: prevent race conditions by a correct implementation of locking
        firmware_loader: Fix a NULL vs IS_ERR() check
        MAINTAINERS: Vaibhav Gupta is the new ipack maintainer
        dt-bindings: fpga: replace Ivan Bornyakov maintainership
        MAINTAINERS: update Microchip MPF FPGA reviewers
        misc: fastrpc: reject new invocations during device removal
        misc: fastrpc: return -EPIPE to invocations on device removal
        misc: fastrpc: Reassign memory ownership only for remote heap
        misc: fastrpc: Pass proper scm arguments for secure map request
        iio: imu: inv_icm42600: fix timestamp reset
        iio: adc: ad_sigma_delta: Fix IRQ issue by setting IRQ_DISABLE_UNLAZY flag
        dt-bindings: iio: adc: renesas,rcar-gyroadc: Fix adi,ad7476 compatible value
        iio: dac: mcp4725: Fix i2c_master_send() return value handling
        iio: accel: kx022a fix irq getting
        iio: bu27034: Ensure reset is written
        iio: dac: build ad5758 driver when AD5758 is selected
        iio: addac: ad74413: fix resistance input processing
        iio: light: vcnl4035: fixed chip ID check
        ...
      209835e8
    • Linus Torvalds's avatar
      Merge tag 'driver-core-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core · 41f3ab2d
      Linus Torvalds authored
      Pull driver core fixes from Greg KH:
       "Here are two small driver core cacheinfo fixes for 6.4-rc5 that
        resolve a number of reported issues with that file. These changes have
        been in linux-next this past week with no reported problems"
      
      * tag 'driver-core-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        drivers: base: cacheinfo: Update cpu_map_populated during CPU Hotplug
        drivers: base: cacheinfo: Fix shared_cpu_map changes in event of CPU hotplug
      41f3ab2d
    • Linus Torvalds's avatar
      Merge tag 'tty-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 12c2f77b
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are some small tty/serial driver fixes for 6.4-rc5 that have all
        been in linux-next this past week with no reported problems. Included
        in here are:
      
         - 8250_tegra driver bugfix
      
         - fsl uart driver bugfixes
      
         - Kconfig fix for dependancy issue
      
         - dt-bindings fix for the 8250_omap driver"
      
      * tag 'tty-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        dt-bindings: serial: 8250_omap: add rs485-rts-active-high
        serial: cpm_uart: Fix a COMPILE_TEST dependency
        soc: fsl: cpm1: Fix TSA and QMC dependencies in case of COMPILE_TEST
        tty: serial: fsl_lpuart: use UARTCTRL_TXINV to send break instead of UARTCTRL_SBK
        serial: 8250_tegra: Fix an error handling path in tegra_uart_probe()
      12c2f77b
    • Linus Torvalds's avatar
      Merge tag 'usb-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 8b435e40
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some USB driver and core fixes for 6.4-rc5. Most of these are
        tiny driver fixes, including:
      
         - udc driver bugfix
      
         - f_fs gadget driver bugfix
      
         - cdns3 driver bugfix
      
         - typec bugfixes
      
        But the "big" thing in here is a fix yet-again for how the USB buffers
        are handled from userspace when dealing with DMA issues. The changes
        were discussed a lot, and tested a lot, on the list, and acked by the
        relevant mm maintainers and have been in linux-next all this past week
        with no reported problems"
      
      * tag 'usb-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: typec: tps6598x: Fix broken polling mode after system suspend/resume
        mm: page_table_check: Ensure user pages are not slab pages
        mm: page_table_check: Make it dependent on EXCLUSIVE_SYSTEM_RAM
        usb: usbfs: Use consistent mmap functions
        usb: usbfs: Enforce page requirements for mmap
        dt-bindings: usb: snps,dwc3: Fix "snps,hsphy_interface" type
        usb: gadget: udc: fix NULL dereference in remove()
        usb: gadget: f_fs: Add unbind event before functionfs_unbind
        usb: cdns3: fix NCM gadget RX speed 20x slow than expection at iMX8QM
      8b435e40
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · b066935b
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "ARM:
      
         - Address some fallout of the locking rework, this time affecting the
           way the vgic is configured
      
         - Fix an issue where the page table walker frees a subtree and then
           proceeds with walking what it has just freed...
      
         - Check that a given PA donated to the guest is actually memory (only
           affecting pKVM)
      
         - Correctly handle MTE CMOs by Set/Way
      
         - Fix the reported address of a watchpoint forwarded to userspace
      
         - Fix the freeing of the root of stage-2 page tables
      
         - Stop creating spurious PMU events to perform detection of the
           default PMU and use the existing PMU list instead
      
        x86:
      
         - Fix a memslot lookup bug in the NX recovery thread that could
           theoretically let userspace bypass the NX hugepage mitigation
      
         - Fix a s/BLOCKING/PENDING bug in SVM's vNMI support
      
         - Account exit stats for fastpath VM-Exits that never leave the super
           tight run-loop
      
         - Fix an out-of-bounds bug in the optimized APIC map code, and add a
           regression test for the race"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: selftests: Add test for race in kvm_recalculate_apic_map()
        KVM: x86: Bail from kvm_recalculate_phys_map() if x2APIC ID is out-of-bounds
        KVM: x86: Account fastpath-only VM-Exits in vCPU stats
        KVM: SVM: vNMI pending bit is V_NMI_PENDING_MASK not V_NMI_BLOCKING_MASK
        KVM: x86/mmu: Grab memslot for correct address space in NX recovery worker
        KVM: arm64: Document default vPMU behavior on heterogeneous systems
        KVM: arm64: Iterate arm_pmus list to probe for default PMU
        KVM: arm64: Drop last page ref in kvm_pgtable_stage2_free_removed()
        KVM: arm64: Populate fault info for watchpoint
        KVM: arm64: Reload PTE after invoking walker callback on preorder traversal
        KVM: arm64: Handle trap of tagged Set/Way CMOs
        arm64: Add missing Set/Way CMO encodings
        KVM: arm64: Prevent unconditional donation of unmapped regions from the host
        KVM: arm64: vgic: Fix a comment
        KVM: arm64: vgic: Fix locking comment
        KVM: arm64: vgic: Wrap vgic_its_create() with config_lock
        KVM: arm64: vgic: Fix a circular locking issue
      b066935b
    • Linus Torvalds's avatar
      Merge tag 'powerpc-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 9455b4b6
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix link errors in new aes-gcm-p10 code when built-in with other
         drivers
      
       - Limit number of TCEs passed to H_STUFF_TCE hcall as per spec
      
       - Use KSYM_NAME_LEN in xmon array size to avoid possible OOB write
      
      Thanks to Gaurav Batra and Maninder Singh Vishal Chourasia.
      
      * tag 'powerpc-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/xmon: Use KSYM_NAME_LEN in array size
        powerpc/iommu: Limit number of TCEs to 512 for H_STUFF_TCE hcall
        powerpc/crypto: Fix aes-gcm-p10 link errors
      9455b4b6
  3. 03 Jun, 2023 10 commits
  4. 02 Jun, 2023 18 commits