Commits · 82c3cefb9f1652e7470f442ff96c613e8c8ed8f4 · Kirill Smelkov / linux

04 Mar, 2021 3 commits

iommu: Don't use lazy flush for untrusted device · 82c3cefb

Lu Baolu authored Feb 25, 2021

The lazy IOTLB flushing setup leaves a time window, in which the device
can still access some system memory, which has already been unmapped by
the device driver. It's not suitable for untrusted devices. A malicious
device might use this to attack the system by obtaining data that it
shouldn't obtain.

Fixes: c588072b ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210225061454.2864009-1-baolu.lu@linux.intel.comSigned-off-by: Joerg Roedel <jroedel@suse.de>

82c3cefb

iommu/tegra-smmu: Fix mc errors on tegra124-nyan · 765a9d1d

Nicolin Chen authored Feb 18, 2021

Commit 25938c73 ("iommu/tegra-smmu: Rework tegra_smmu_probe_device()")
removed certain hack in the tegra_smmu_probe() by relying on IOMMU core to
of_xlate SMMU's SID per device, so as to get rid of tegra_smmu_find() and
tegra_smmu_configure() that are typically done in the IOMMU core also.

This approach works for both existing devices that have DT nodes and other
devices (like PCI device) that don't exist in DT, on Tegra210 and Tegra3
upon testing. However, Page Fault errors are reported on tegra124-Nyan:

tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40:
EMEM address decode error (SMMU translation error [--S])
tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40:
Page fault (SMMU translation error [--S])

After debugging, I found that the mentioned commit changed some function
callback sequence of tegra-smmu's, resulting in enabling SMMU for display
client before display driver gets initialized. I couldn't reproduce exact
same issue on Tegra210 as Tegra124 (arm-32) differs at arch-level code.

Actually this Page Fault is a known issue, as on most of Tegra platforms,
display gets enabled by the bootloader for the splash screen feature, so
it keeps filling the framebuffer memory. A proper fix to this issue is to
1:1 linear map the framebuffer memory to IOVA space so the SMMU will have
the same address as the physical address in its page table. Yet, Thierry
has been working on the solution above for a year, and it hasn't merged.

Therefore, let's partially revert the mentioned commit to fix the errors.

The reason why we do a partial revert here is that we can still set priv
in ->of_xlate() callback for PCI devices. Meanwhile, devices existing in
DT, like display, will go through tegra_smmu_configure() at the stage of
bus_set_iommu() when SMMU gets probed(), as what it did before we merged
the mentioned commit.

Once we have the linear map solution for framebuffer memory, this change
can be cleaned away.

[Big thank to Guillaume who reported and helped debugging/verification]

Fixes: 25938c73 ("iommu/tegra-smmu: Rework tegra_smmu_probe_device()")
Reported-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Tested-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20210218220702.1962-1-nicoleotsuka@gmail.comSigned-off-by: Joerg Roedel <jroedel@suse.de>

765a9d1d

iommu/amd: Fix sleeping in atomic in increase_address_space() · 140456f9

Andrey Ryabinin authored Feb 17, 2021

increase_address_space() calls get_zeroed_page(gfp) under spin_lock with
disabled interrupts. gfp flags passed to increase_address_space() may allow
sleeping, so it comes to this:

 BUG: sleeping function called from invalid context at mm/page_alloc.c:4342
 in_atomic(): 1, irqs_disabled(): 1, pid: 21555, name: epdcbbf1qnhbsd8

 Call Trace:
  dump_stack+0x66/0x8b
  ___might_sleep+0xec/0x110
  __alloc_pages_nodemask+0x104/0x300
  get_zeroed_page+0x15/0x40
  iommu_map_page+0xdd/0x3e0
  amd_iommu_map+0x50/0x70
  iommu_map+0x106/0x220
  vfio_iommu_type1_ioctl+0x76e/0x950 [vfio_iommu_type1]
  do_vfs_ioctl+0xa3/0x6f0
  ksys_ioctl+0x66/0x70
  __x64_sys_ioctl+0x16/0x20
  do_syscall_64+0x4e/0x100
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix this by moving get_zeroed_page() out of spin_lock/unlock section.

Fixes: 754265bc ("iommu/amd: Fix race in increase_address_space()")
Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210217143004.19165-1-arbn@yandex-team.comSigned-off-by: Joerg Roedel <jroedel@suse.de>

140456f9

12 Feb, 2021 2 commits

Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'x86/vt-d' and 'core' into next · 45e606f2
Joerg Roedel authored Feb 12, 2021

45e606f2

iommu/amd: Fix performance counter initialization · 6778ff5b

Suravee Suthikulpanit authored Feb 08, 2021

Certain AMD platforms enable power gating feature for IOMMU PMC,
which prevents the IOMMU driver from updating the counter while
trying to validate the PMC functionality in the init_iommu_perf_ctr().
This results in disabling PMC support and the following error message:

"AMD-Vi: Unable to read/write to IOMMU perf counter"

To workaround this issue, disable power gating temporarily by programming
the counter source to non-zero value while validating the counter,
and restore the prior state afterward.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Tj (Elloe Linux) <ml.linux@elloe.vision>
Link: https://lore.kernel.org/r/20210208122712.5048-1-suravee.suthikulpanit@amd.com
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753Signed-off-by: Joerg Roedel <jroedel@suse.de>

6778ff5b

08 Feb, 2021 2 commits

MAINTAINERS: repair file pattern in MEDIATEK IOMMU DRIVER · cc6e70bd

Lukas Bulwahn authored Feb 08, 2021

Commit 6af48738 ("MAINTAINERS: Add entry for MediaTek IOMMU") mentions
the pattern 'drivers/iommu/mtk-iommu*', but the files are actually named
with an underscore, not with a hyphen.

Hence, ./scripts/get_maintainer.pl --self-test=patterns complains:

warning: no file matches F: drivers/iommu/mtk-iommu*

Repair this minor typo in the file pattern.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Yong Wu <yong.wu@mediatek.com>
Link: https://lore.kernel.org/r/20210208061025.29198-1-lukas.bulwahn@gmail.comSigned-off-by: Joerg Roedel <jroedel@suse.de>

cc6e70bd

iommu/mediatek: Fix error code in probe() · a92a90ac

Dan Carpenter authored Feb 05, 2021

This error path is supposed to return -EINVAL. It used to return
directly but we added some clean up and accidentally removed the
error code. Also I fixed a typo in the error message.

Fixes: c0b57581 ("iommu/mediatek: Add power-domain operation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Yong Wu <yong.wu@mediatek.com>
Link: https://lore.kernel.org/r/YB0+GU5akSdu29Vu@mwandaSigned-off-by: Joerg Roedel <jroedel@suse.de>

a92a90ac

07 Feb, 2021 9 commits

Linux 5.11-rc7 · 92bf2261
Linus Torvalds authored Feb 07, 2021

92bf2261

Merge tag 'libnvdimm-fixes-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · b75dba7f

Linus Torvalds authored Feb 07, 2021

Pull libnvdimm fixes from Dan Williams:
 "A fix for a crash scenario that has been present since the initial
  merge, a minor regression in sysfs attribute visibility, and a fix for
  some flexible array warnings.

  The bulk of this pull is an update to the libnvdimm unit test
  infrastructure to test non-ACPI platforms. Given there is zero
  regression risk for test updates, and the tests enable validation of
  bits headed towards the next merge window, I saw no reason to hold the
  new tests back. Santosh originally submitted this before the v5.11
  window opened.

  Summary:

   - Fix a crash when sysfs accesses race 'dimm' driver probe/remove.

   - Fix a regression in 'resource' attribute visibility necessary for
     mapping badblocks and other physical address interrogations.

   - Fix some flexible array warnings

   - Expand the unit test infrastructure for non-ACPI platforms"

* tag 'libnvdimm-fixes-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm/dimm: Avoid race between probe and available_slots_show()
  ndtest: Add papr health related flags
  ndtest: Add nvdimm control functions
  ndtest: Add regions and mappings to the test buses
  ndtest: Add dimm attributes
  ndtest: Add dimms to the two buses
  ndtest: Add compatability string to treat it as PAPR family
  testing/nvdimm: Add test module for non-nfit platforms
  libnvdimm/namespace: Fix visibility of namespace resource attribute
  libnvdimm/pmem: Remove unused header
  ACPI: NFIT: Fix flexible_array.cocci warnings

b75dba7f

Merge tag 'dma-mapping-5.11-2' of git://git.infradead.org/users/hch/dma-mapping · ff92acb2

Linus Torvalds authored Feb 07, 2021

Pull dma-mapping fix from Christoph Hellwig:
 "Fix a 32 vs 64-bit padding issue in the new benchmark code (Barry
  Song)"

* tag 'dma-mapping-5.11-2' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: benchmark: use u8 for reserved field in uAPI structure

ff92acb2

Merge tag 'irq_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fc6c0ae5

Linus Torvalds authored Feb 07, 2021

Pull irq fixes from Borislav Petkov:

 - Prevent device managed IRQ allocation helpers from returning IRQ 0

 - A fix for MSI activation of PCI endpoints with multiple MSIs

* tag 'irq_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Prevent [devm_]irq_alloc_desc from returning irq 0
  genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set

fc6c0ae5

Merge tag 'core_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c6792d44

Linus Torvalds authored Feb 07, 2021

Pull syscall entry fixes from Borislav Petkov:

 - For syscall user dispatch, separate prctl operation from syscall
   redirection range specification before the API has been made official
   in 5.11.

 - Ensure tasks using the generic syscall code do trap after returning
   from a syscall when single-stepping is requested.

* tag 'core_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  entry: Use different define for selector variable in SUD
  entry: Ensure trap after single-step on system call return

c6792d44

Merge tag 'sched_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6fed85df

Linus Torvalds authored Feb 07, 2021

Pull scheduler fix from Borislav Petkov:
 "Revert an attempt to not spread IRQ threads on isolated CPUs which has
  a bunch of problems"

* tag 'sched_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "lib: Restrict cpumask_local_spread to houskeeping CPUs"

6fed85df

Merge tag 'timers_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 814daadb

Linus Torvalds authored Feb 07, 2021

Pull timer fixes from Borislav Petkov:
 "Two more timers-related fixes for v5.11:

   - Use a freezable workqueue for RTC sync because the sync can happen
     at any time and trigger suspend assertion checks in the i2c
     subsystem.

   - Correct a previous RTC validation change to check only bit 6 in
     register D because some Intel machines use bits 0-5"

* tag 'timers_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ntp: Use freezable workqueue for RTC synchronization
  rtc: mc146818: Dont test for bit 0-5 in Register D

814daadb

Merge tag 'x86_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e24f9c5f

Linus Torvalds authored Feb 07, 2021

Pull x86 fixes from Borislav Petkov:
 "I hope this is the last batch of x86/urgent updates for this round:

   - Remove superfluous EFI PGD range checks which lead to those
     assertions failing with certain kernel configs and LLVM.

   - Disable setting breakpoints on facilities involved in #DB exception
     handling to avoid infinite loops.

   - Add extra serialization to non-serializing MSRs (IA32_TSC_DEADLINE
     and x2 APIC MSRs) to adhere to SDM's recommendation and avoid any
     theoretical issues.

   - Re-add the EPB MSR reading on turbostat so that it works on older
     kernels which don't have the corresponding EPB sysfs file.

   - Add Alder Lake to the list of CPUs which support split lock.

   - Fix %dr6 register handling in order to be able to set watchpoints
     with gdb again.

   - Disable CET instrumentation in the kernel so that gcc doesn't add
     ENDBR64 to kernel code and thus confuse tracing"

* tag 'x86_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Remove EFI PGD build time checks
  x86/debug: Prevent data breakpoints on cpu_dr7
  x86/debug: Prevent data breakpoints on __per_cpu_offset
  x86/apic: Add extra serialization for non-serializing MSRs
  tools/power/turbostat: Fallback to an MSR read for EPB
  x86/split_lock: Enable the split lock feature on another Alder Lake CPU
  x86/debug: Fix DR6 handling
  x86/build: Disable CET instrumentation in the kernel

e24f9c5f

Merge tag 'kbuild-fixes-v5.11-2' of... · 2db138bb

Linus Torvalds authored Feb 07, 2021

Merge tag 'kbuild-fixes-v5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Use the 'python3' command to invoke python scripts because some
   distributions do not provide the 'python' command any more.

 - Clean-up and update documents

 - Use pkg-config to search libcrypto

 - Fix duplicated debug flags

 - Ignore some more stubs in scripts/kallsyms.c

* tag 'kbuild-fixes-v5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kallsyms: fix nonconverging kallsyms table with lld
  kbuild: fix duplicated flags in DEBUG_CFLAGS
  scripts/clang-tools: switch explicitly to Python 3
  kbuild: remove PYTHON variable
  Documentation/llvm: Add a section about supported architectures
  Revert "checkpatch: add check for keyword 'boolean' in Kconfig definitions"
  scripts: use pkg-config to locate libcrypto
  kconfig: mconf: fix HOSTCC call
  doc: gcc-plugins: update gcc-plugins.rst
  kbuild: simplify GCC_PLUGINS enablement in dummy-tools/gcc
  Documentation/Kbuild: Remove references to gcc-plugin.sh
  scripts: switch explicitly to Python 3

2db138bb

06 Feb, 2021 10 commits

Merge tag '5.11-rc6-smb3' of git://git.samba.org/sfrench/cifs-2.6 · 825b5991

Linus Torvalds authored Feb 06, 2021

Pull cifs fixes from Steve French:
 "Three small smb3 fixes for stable"

* tag '5.11-rc6-smb3' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: report error instead of invalid when revalidating a dentry fails
  smb3: fix crediting for compounding when only one request in flight
  smb3: Fix out-of-bounds bug in SMB2_negotiate()

825b5991

Merge tag 'riscv-for-linus-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · f7455e5d

Linus Torvalds authored Feb 06, 2021

Pull RISC-V fixes from Palmer Dabbelt:
 "A handful of fixes for this week:

   - A fix to avoid evalating the VA twice in virt_addr_valid, which
     fixes some WARNs under DEBUG_VIRTUAL.

   - Two fixes related to STRICT_KERNEL_RWX: one that fixes some
     permissions when strict is disabled, and one to fix some alignment
     issues when strict is enabled.

   - A fix to disallow the selection of MAXPHYSMEM_2GB on RV32, which
     isn't valid any more but may still show up in some oldconfigs.

  We still have the HiFive Unleashed ethernet phy reset regression, so
  there will likely be something coming next week"

* tag 'riscv-for-linus-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Define MAXPHYSMEM_1GB only for RV32
  riscv: Align on L1_CACHE_BYTES when STRICT_KERNEL_RWX
  RISC-V: Fix .init section permission update
  riscv: virt_addr_valid must check the address belongs to linear mapping

f7455e5d

Merge tag 'powerpc-5.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · f06279ea

Linus Torvalds authored Feb 06, 2021

Pull powerpc fixes from Michael Ellerman:

 - A fix for a change we made to __kernel_sigtramp_rt64() which confused
   glibc's backtrace logic, and also changed the semantics of that
   symbol, which was arguably an ABI break.

 - A fix for a stack overwrite in our VSX instruction emulation.

 - A couple of fixes for the Makefile logic in the new C VDSO.

Thanks to Masahiro Yamada, Naveen N.  Rao, Raoni Fassina Firmino, and
Ravi Bangoria.

* tag 'powerpc-5.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64/signal: Fix regression in __kernel_sigtramp_rt64() semantics
  powerpc/vdso64: remove meaningless vgettimeofday.o build rule
  powerpc/vdso: fix unnecessary rebuilds of vgettimeofday.o
  powerpc/sstep: Fix array out of bound warning

f06279ea

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 4a7859ea

Linus Torvalds authored Feb 06, 2021

Pull ARM fixes from Russell King:

 - Fix latent bug with DC21285 (Footbridge PCI bridge) configuration
   accessors that affects GCC >= 4.9.2

 - Fix misplaced tegra_uart_config in decompressor

 - Ensure signal page contents are initialised

 - Fix kexec oops

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: kexec: fix oops after TLB are invalidated
  ARM: ensure the signal page contains defined contents
  ARM: 9043/1: tegra: Fix misplaced tegra_uart_config in decompressor
  ARM: footbridge: fix dc21285 PCI configuration accessors

4a7859ea

Merge tag 'usb-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 368afecb

Linus Torvalds authored Feb 06, 2021

Pull USB fixes from Greg KH:
 "Here are some small, last-minute, USB driver fixes for 5.11-rc7

  They all resolve issues reported, or are a few new device ids for some
  drivers. They include:

   - new device ids for some usb-serial drivers

   - xhci fixes for a variety of reported problems

   - dwc3 driver bugfixes

   - dwc2 driver bugfixes

   - usblp driver bugfix

   - thunderbolt bugfix

   - few other tiny fixes

  All have been in linux-next with no reported issues"

* tag 'usb-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: dwc2: Fix endpoint direction check in ep_from_windex
  usb: dwc3: fix clock issue during resume in OTG mode
  xhci: fix bounce buffer usage for non-sg list case
  usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 3720
  usb: xhci-mtk: break loop when find the endpoint to drop
  usb: xhci-mtk: skip dropping bandwidth of unchecked endpoints
  usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
  USB: gadget: legacy: fix an error code in eth_bind()
  thunderbolt: Fix possible NULL pointer dereference in tb_acpi_add_link()
  USB: serial: option: Adding support for Cinterion MV31
  usb: xhci-mtk: fix unreleased bandwidth data
  usb: gadget: aspeed: add missing of_node_put
  USB: usblp: don't call usb_set_interface if there's a single alt
  USB: serial: cp210x: add pid/vid for WSDA-200-USB
  USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000

368afecb

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 7c2d1835

Linus Torvalds authored Feb 06, 2021

Pull input fixes from Dmitry Torokhov:
 "Nothing terribly interesting, just a few fixups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: xpad - sync supported devices with fork on GitHub
  Input: ariel-pwrbutton - remove unused variable ariel_pwrbutton_id_table
  Input: goodix - add support for Goodix GT9286 chip
  dt-bindings: input: touchscreen: goodix: Add binding for GT9286 IC
  dt-bindings: input: adc-keys: clarify description
  Input: ili210x - implement pressure reporting for ILI251x
  Input: i8042 - unbreak Pegatron C15B
  Input: st1232 - wait until device is ready before reading resolution
  Input: st1232 - do not read more bytes than needed
  Input: st1232 - fix off-by-one error in resolution handling

7c2d1835

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 964d069f

Linus Torvalds authored Feb 06, 2021

Pull SCSI fix from James Bottomley:
 "One fix in drivers (lpfc) that stops an oops on resource exhaustion"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: lpfc: Fix EEH encountering oops with NVMe traffic

964d069f

Merge tag 'block-5.11-2021-02-05' of git://git.kernel.dk/linux-block · eec79181

Linus Torvalds authored Feb 06, 2021

Pull block fixes from Jens Axboe:
 "A few small regression fixes:

   - NVMe pull request from Christoph:
       - more quirks for buggy devices (Thorsten Leemhuis, Claus Stovgaard)
       - update the email address for Keith (Keith Busch)
       - fix an out of bounds access in nvmet-tcp (Sagi Grimberg)

   - Regression fix for BFQ shallow depth calculations introduced in
     this merge window (Lin)"

* tag 'block-5.11-2021-02-05' of git://git.kernel.dk/linux-block:
  nvmet-tcp: fix out-of-bounds access when receiving multiple h2cdata PDUs
  bfq-iosched: Revert "bfq: Fix computation of shallow depth"
  update the email address for Keith Bush
  nvme-pci: ignore the subsysem NQN on Phison E16
  nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs

eec79181

Merge tag 'io_uring-5.11-2021-02-05' of git://git.kernel.dk/linux-block · 860b45da

Linus Torvalds authored Feb 06, 2021

Pull io_uring fixes from Jens Axboe:
 "Two small fixes that should go into 5.11:

   - task_work resource drop fix (Pavel)

   - identity COW fix (Xiaoguang)"

* tag 'io_uring-5.11-2021-02-05' of git://git.kernel.dk/linux-block:
  io_uring: drop mm/files between task_work_submit
  io_uring: don't modify identity's files uncess identity is cowed

860b45da

x86/efi: Remove EFI PGD build time checks · 816ef8d7

Borislav Petkov authored Feb 05, 2021

With CONFIG_X86_5LEVEL, CONFIG_UBSAN and CONFIG_UBSAN_UNSIGNED_OVERFLOW
enabled, clang fails the build with

  x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':
  efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

which happens due to -fsanitize=unsigned-integer-overflow being enabled:

  -fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where
  the result of an unsigned integer computation cannot be represented
  in its type. Unlike signed integer overflow, this is not undefined
  behavior, but it is often unintentional. This sanitizer does not check
  for lossy implicit conversions performed before such a computation
  (see -fsanitize=implicit-conversion).

and that fires when the (intentional) EFI_VA_START/END defines overflow
an unsigned long, leading to the assertion expressions not getting
optimized away (on GCC they do)...

However, those checks are superfluous: the runtime services mapping
code already makes sure the ranges don't overshoot EFI_VA_END as the
EFI mapping range is hardcoded. On each runtime services call, it is
switched to the EFI-specific PGD and even if mappings manage to escape
that last PGD, this won't remain unnoticed for long.

So rip them out.

See https://github.com/ClangBuiltLinux/linux/issues/256 for more info.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: http://lkml.kernel.org/r/20210107223424.4135538-1-arnd@kernel.org

816ef8d7

05 Feb, 2021 14 commits

entry: Use different define for selector variable in SUD · 36a6c843

Gabriel Krisman Bertazi authored Feb 05, 2021

Michael Kerrisk suggested that, from an API perspective, it is a bad
idea to share the PR_SYS_DISPATCH_ defines between the prctl operation
and the selector variable.

Therefore, define two new constants to be used by SUD's selector variable
and update the corresponding documentation and test cases.

While this changes the API syscall user dispatch has never been part of a
Linux release, it will show up for the first time in 5.11.
Suggested-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210205184321.2062251-1-krisman@collabora.com

36a6c843

entry: Ensure trap after single-step on system call return · 6342adca

Gabriel Krisman Bertazi authored Feb 03, 2021

Commit 29915524 ("entry: Drop usage of TIF flags in the generic syscall
code") introduced a bug on architectures using the generic syscall entry
code, in which processes stopped by PTRACE_SYSCALL do not trap on syscall
return after receiving a TIF_SINGLESTEP.

The reason is that the meaning of TIF_SINGLESTEP flag is overloaded to
cause the trap after a system call is executed, but since the above commit,
the syscall call handler only checks for the SYSCALL_WORK flags on the exit
work.

Split the meaning of TIF_SINGLESTEP such that it only means single-step
mode, and create a new type of SYSCALL_WORK to request a trap immediately
after a syscall in single-step mode. In the current implementation, the
SYSCALL_WORK flag shadows the TIF_SINGLESTEP flag for simplicity.

Update x86 to flip this bit when a tracer enables single stepping.

Fixes: 29915524 ("entry: Drop usage of TIF flags in the generic syscall code")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kyle Huey <me@kylehuey.com>
Link: https://lore.kernel.org/r/87h7mtc9pr.fsf_-_@collabora.com

6342adca

Revert "lib: Restrict cpumask_local_spread to houskeeping CPUs" · 2452483d

Thomas Gleixner authored Feb 05, 2021

This reverts commit 1abdfe70.

This change is broken and not solving any problem it claims to solve.

Robin reported that cpumask_local_spread() now returns any cpu out of
cpu_possible_mask in case that NOHZ_FULL is disabled (runtime or compile
time). It can also return any offline or not-present CPU in the
housekeeping mask. Before that it was returning a CPU out of
online_cpu_mask.

While the function is racy against CPU hotplug if the caller does not
protect against it, the actual use cases are not caring much about it as
they use it mostly as hint for:

 - the user space affinity hint which is unused by the kernel
 - memory node selection which is just suboptimal
 - network queue affinity which might fail but is handled gracefully

But the occasional fail vs. hotplug is very different from returning
anything from possible_cpu_mask which can have a large amount of offline
CPUs obviously.

The changelog of the commit claims:

 "The current implementation of cpumask_local_spread() does not respect
  the isolated CPUs, i.e., even if a CPU has been isolated for Real-Time
  task, it will return it to the caller for pinning of its IRQ
  threads. Having these unwanted IRQ threads on an isolated CPU adds up
  to a latency overhead."

The only correct part of this changelog is:

 "The current implementation of cpumask_local_spread() does not respect
  the isolated CPUs."

Everything else is just disjunct from reality.
Reported-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Nitesh Narayan Lal <nitesh@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: abelits@marvell.com
Cc: davem@davemloft.net
Link: https://lore.kernel.org/r/87y2g26tnt.fsf@nanos.tec.linutronix.de

2452483d

Merge branch 'akpm' (patches from Andrew) · 1e0d27fc

Linus Torvalds authored Feb 05, 2021

Merge misc fixes from Andrew Morton:
 "18 patches.

  Subsystems affected by this patch series: mm (hugetlb, compaction,
  vmalloc, shmem, memblock, pagecache, kasan, and hugetlb), mailmap,
  gcov, ubsan, and MAINTAINERS"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  MAINTAINERS/.mailmap: use my @kernel.org address
  mm: hugetlb: fix missing put_page in gather_surplus_pages()
  ubsan: implement __ubsan_handle_alignment_assumption
  kasan: make addr_has_metadata() return true for valid addresses
  kasan: add explicit preconditions to kasan_report()
  mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked()
  mailmap: add entries for Manivannan Sadhasivam
  mailmap: fix name/email for Viresh Kumar
  memblock: do not start bottom-up allocations with kernel_end
  mm: thp: fix MADV_REMOVE deadlock on shmem THP
  init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov
  mm/vmalloc: separate put pages and flush VM flags
  mm, compaction: move high_pfn to the for loop scope
  mm: migrate: do not migrate HugeTLB page whose refcount is one
  mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
  mm: hugetlb: fix a race between isolating and freeing page
  mm: hugetlb: fix a race between freeing and dissolving the page
  mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page

1e0d27fc

genirq: Prevent [devm_]irq_alloc_desc from returning irq 0 · 4c7bcb51

Hans de Goede authored Dec 21, 2020

Since commit a85a6c86 ("driver core: platform: Clarify that IRQ 0
is invalid"), having a linux-irq with number 0 will trigger a WARN()
when calling platform_get_irq*() to retrieve that linux-irq.

Since [devm_]irq_alloc_desc allocs a single irq and since irq 0 is not used
on some systems, it can return 0, triggering that WARN(). This happens
e.g. on Intel Bay Trail and Cherry Trail devices using the LPE audio engine
for HDMI audio:

 0 is an invalid IRQ number
 WARNING: CPU: 3 PID: 472 at drivers/base/platform.c:238 platform_get_irq_optional+0x108/0x180
 Modules linked in: snd_hdmi_lpe_audio(+) ...

 Call Trace:
  platform_get_irq+0x17/0x30
  hdmi_lpe_audio_probe+0x4a/0x6c0 [snd_hdmi_lpe_audio]

 ---[ end trace ceece38854223a0b ]---

Change the 'from' parameter passed to __[devm_]irq_alloc_descs() by the
[devm_]irq_alloc_desc macros from 0 to 1, so that these macros will no
longer return 0.

Fixes: a85a6c86 ("driver core: platform: Clarify that IRQ 0 is invalid")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201221185647.226146-1-hdegoede@redhat.com

4c7bcb51

cifs: report error instead of invalid when revalidating a dentry fails · 21b200d0

Aurelien Aptel authored Feb 05, 2021

Assuming
- //HOST/a is mounted on /mnt
- //HOST/b is mounted on /mnt/b

On a slow connection, running 'df' and killing it while it's
processing /mnt/b can make cifs_get_inode_info() returns -ERESTARTSYS.

This triggers the following chain of events:
=> the dentry revalidation fail
=> dentry is put and released
=> superblock associated with the dentry is put
=> /mnt/b is unmounted

This patch makes cifs_d_revalidate() return the error instead of 0
(invalid) when cifs_revalidate_dentry() fails, except for ENOENT (file
deleted) and ESTALE (file recreated).
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Suggested-by: Shyam Prasad N <nspmangalore@gmail.com>
Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com>
CC: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>

21b200d0

x86/debug: Prevent data breakpoints on cpu_dr7 · 3943abf2

Lai Jiangshan authored Feb 04, 2021

local_db_save() is called at the start of exc_debug_kernel(), reads DR7 and
disables breakpoints to prevent recursion.

When running in a guest (X86_FEATURE_HYPERVISOR), local_db_save() reads the
per-cpu variable cpu_dr7 to check whether a breakpoint is active or not
before it accesses DR7.

A data breakpoint on cpu_dr7 therefore results in infinite #DB recursion.

Disallow data breakpoints on cpu_dr7 to prevent that.

Fixes: 84b6a349("x86/entry: Optimize local_db_save() for virt")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210204152708.21308-2-jiangshanlai@gmail.com

3943abf2

x86/debug: Prevent data breakpoints on __per_cpu_offset · c4bed4b9

Lai Jiangshan authored Feb 04, 2021

When FSGSBASE is enabled, paranoid_entry() fetches the per-CPU GSBASE value
via __per_cpu_offset or pcpu_unit_offsets.

When a data breakpoint is set on __per_cpu_offset[cpu] (read-write
operation), the specific CPU will be stuck in an infinite #DB loop.

RCU will try to send an NMI to the specific CPU, but it is not working
either since NMI also relies on paranoid_entry(). Which means it's
undebuggable.

Fixes: eaad9812("x86/entry/64: Introduce the FIND_PERCPU_BASE macro")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210204152708.21308-1-jiangshanlai@gmail.com

c4bed4b9

MAINTAINERS/.mailmap: use my @kernel.org address · 654eb3f2

Nathan Chancellor authored Feb 04, 2021

Use my @kernel.org for all points of contact so that I am always
accessible.

Link: https://lkml.kernel.org/r/20210126212730.2097108-1-nathan@kernel.orgSigned-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

654eb3f2

mm: hugetlb: fix missing put_page in gather_surplus_pages() · e558464b

Muchun Song authored Feb 04, 2021

The VM_BUG_ON_PAGE avoids the generation of any code, even if that
expression has side-effects when !CONFIG_DEBUG_VM.

Link: https://lkml.kernel.org/r/20210126031009.96266-1-songmuchun@bytedance.com
Fixes: e5dfaceb ("mm/hugetlb.c: just use put_page_testzero() instead of page_count()")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e558464b

ubsan: implement __ubsan_handle_alignment_assumption · 28abcc96

Nathan Chancellor authored Feb 04, 2021

When building ARCH=mips 32r2el_defconfig with CONFIG_UBSAN_ALIGNMENT:

ld.lld: error: undefined symbol: __ubsan_handle_alignment_assumption
referenced by slab.h:557 (include/linux/slab.h:557)
main.o:(do_initcalls) in archive init/built-in.a
referenced by slab.h:448 (include/linux/slab.h:448)
do_mounts_rd.o:(rd_load_image) in archive init/built-in.a
referenced by slab.h:448 (include/linux/slab.h:448)
do_mounts_rd.o:(identify_ramdisk_image) in archive init/built-in.a
referenced 1579 more times

Implement this for the kernel based on LLVM's
handleAlignmentAssumptionImpl because the kernel is not linked against
the compiler runtime.

Link: https://github.com/ClangBuiltLinux/linux/issues/1245
Link: https://github.com/llvm/llvm-project/blob/llvmorg-11.0.1/compiler-rt/lib/ubsan/ubsan_handlers.cpp#L151-L190
Link: https://lkml.kernel.org/r/20210127224451.2587372-1-nathan@kernel.orgSigned-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

28abcc96

kasan: make addr_has_metadata() return true for valid addresses · b99acdcb

Vincenzo Frascino authored Feb 04, 2021

Currently, addr_has_metadata() returns true for every address.  An
invalid address (e.g.  NULL) passed to the function when, KASAN_HW_TAGS
is enabled, leads to a kernel panic.

Make addr_has_metadata() return true for valid addresses only.

Note: KASAN_HW_TAGS support for vmalloc will be added with a future
patch.

Link: https://lkml.kernel.org/r/20210126134409.47894-3-vincenzo.frascino@arm.com
Fixes: 2e903b91 ("kasan, arm64: implement HW_TAGS runtime")
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b99acdcb

kasan: add explicit preconditions to kasan_report() · 49c6631d

Vincenzo Frascino authored Feb 04, 2021

Patch series "kasan: Fix metadata detection for KASAN_HW_TAGS", v5.

With the introduction of KASAN_HW_TAGS, kasan_report() currently assumes
that every location in memory has valid metadata associated.  This is
due to the fact that addr_has_metadata() returns always true.

As a consequence of this, an invalid address (e.g.  NULL pointer
address) passed to kasan_report() when KASAN_HW_TAGS is enabled, leads
to a kernel panic.

Example below, based on arm64:

   BUG: KASAN: invalid-access in 0x0
   Read at addr 0000000000000000 by task swapper/0/1
   Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
   Mem abort info:
     ESR = 0x96000004
     EC = 0x25: DABT (current EL), IL = 32 bits
     SET = 0, FnV = 0
     EA = 0, S1PTW = 0
   Data abort info:
     ISV = 0, ISS = 0x00000004
     CM = 0, WnR = 0

  ...

   Call trace:
    mte_get_mem_tag+0x24/0x40
    kasan_report+0x1a4/0x410
    alsa_sound_last_init+0x8c/0xa4
    do_one_initcall+0x50/0x1b0
    kernel_init_freeable+0x1d4/0x23c
    kernel_init+0x14/0x118
    ret_from_fork+0x10/0x34
   Code: d65f03c0 9000f021 f9428021 b6cfff61 (d9600000)
   ---[ end trace 377c8bb45bdd3a1a ]---
   hrtimer: interrupt took 48694256 ns
   note: swapper/0[1] exited with preempt_count 1
   Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
   SMP: stopping secondary CPUs
   Kernel Offset: 0x35abaf140000 from 0xffff800010000000
   PHYS_OFFSET: 0x40000000
   CPU features: 0x0a7e0152,61c0a030
   Memory Limit: none
   ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

This series fixes the behavior of addr_has_metadata() that now returns
true only when the address is valid.

This patch (of 2):

With the introduction of KASAN_HW_TAGS, kasan_report() accesses the
metadata only when addr_has_metadata() succeeds.

Add a comment to make sure that the preconditions to the function are
explicitly clarified.

Link: https://lkml.kernel.org/r/20210126134409.47894-1-vincenzo.frascino@arm.com
Link: https://lkml.kernel.org/r/20210126134409.47894-2-vincenzo.frascino@arm.comSigned-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

49c6631d

mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked() · da74240e

Waiman Long authored Feb 04, 2021

Commit 3fea5a49 ("mm: memcontrol: convert page cache to a new
mem_cgroup_charge() API") introduced a bug in __add_to_page_cache_locked()
causing the following splat:

  page dumped because: VM_BUG_ON_PAGE(page_memcg(page))
  pages's memcg:ffff8889a4116000
  ------------[ cut here ]------------
  kernel BUG at mm/memcontrol.c:2924!
  invalid opcode: 0000 [#1] SMP KASAN PTI
  CPU: 35 PID: 12345 Comm: cat Tainted: G S      W I       5.11.0-rc4-debug+ #1
  Hardware name: HP HP Z8 G4 Workstation/81C7, BIOS P60 v01.25 12/06/2017
  RIP: commit_charge+0xf4/0x130
  Call Trace:
    mem_cgroup_charge+0x175/0x770
    __add_to_page_cache_locked+0x712/0xad0
    add_to_page_cache_lru+0xc5/0x1f0
    cachefiles_read_or_alloc_pages+0x895/0x2e10 [cachefiles]
    __fscache_read_or_alloc_pages+0x6c0/0xa00 [fscache]
    __nfs_readpages_from_fscache+0x16d/0x630 [nfs]
    nfs_readpages+0x24e/0x540 [nfs]
    read_pages+0x5b1/0xc40
    page_cache_ra_unbounded+0x460/0x750
    generic_file_buffered_read_get_pages+0x290/0x1710
    generic_file_buffered_read+0x2a9/0xc30
    nfs_file_read+0x13f/0x230 [nfs]
    new_sync_read+0x3af/0x610
    vfs_read+0x339/0x4b0
    ksys_read+0xf1/0x1c0
    do_syscall_64+0x33/0x40
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Before that commit, there was a try_charge() and commit_charge() in
__add_to_page_cache_locked().  These two separated charge functions were
replaced by a single mem_cgroup_charge().  However, it forgot to add a
matching mem_cgroup_uncharge() when the xarray insertion failed with the
page released back to the pool.

Fix this by adding a mem_cgroup_uncharge() call when insertion error
happens.

Link: https://lkml.kernel.org/r/20210125042441.20030-1-longman@redhat.com
Fixes: 3fea5a49 ("mm: memcontrol: convert page cache to a new mem_cgroup_charge() API")
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <smuchun@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

da74240e