Commits · c60767421e102dfd1f4d99ad0cc7f8ba24461eb8 · Kirill Smelkov / linux

29 Jan, 2021 1 commit

irqchip/ls-extirq: add IRQCHIP_SKIP_SET_WAKE to the irqchip flags · c6076742

Biwen Li authored Jan 29, 2021

The ls-extirq driver doesn't implement the irq_set_wake()
callback, while being wake-up capable. This results in
ugly behaviours across suspend/resume cycles.

Advertise this by adding IRQCHIP_SKIP_SET_WAKE to
the irqchip flags

Fixes: b16a1caf ("irqchip/ls-extirq: Add LS1043A, LS1088A external interrupt support")
Signed-off-by: Biwen Li <biwen.li@nxp.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210129095034.33821-1-biwen.li@oss.nxp.com

c6076742

26 Jan, 2021 2 commits

dt-bindings: qcom,pdc: Add compatible for SM8350 · 9eaad15e

Vinod Koul authored Jan 15, 2021

Add the compatible string for SM8350 SoC from Qualcomm.
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210115090941.2289416-2-vkoul@kernel.org

9eaad15e

dt-bindings: qcom,pdc: Add compatible for SM8250 · e6f93c01

Vinod Koul authored Jan 15, 2021

Add the compatible string for SM8250 SoC from Qualcomm. This compatible
is used already in DTS files but not documented yet
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210115090941.2289416-1-vkoul@kernel.org

e6f93c01

21 Jan, 2021 7 commits

irqchip/sun6i-r: Add wakeup support · 7ab365f6

Samuel Holland authored Jan 17, 2021

Maintain bitmaps of wake-enabled IRQs and mux inputs, and program them
to the hardware during the syscore phase of suspend and shutdown. Then
restore the original set of enabled IRQs (only the NMI) during resume.

This serves two purposes. First, it lets power management firmware
running on the ARISC coprocessor know which wakeup sources Linux wants
to have enabled. That way, it can avoid turning them off when it shuts
down the remainder of the clock tree. Second, it preconfigures the
coprocessor's interrupt controller, so the firmware's wakeup logic
is as simple as waiting for an interrupt to arrive.

The suspend/resume logic is not conditional on PM_SLEEP because it is
identical to the init/shutdown logic. Wake IRQs may be enabled during
shutdown to allow powering the board back on. As an example, see
commit a5c5e50c ("Input: gpio-keys - add shutdown callback").
Acked-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210118055040.21910-5-samuel@sholland.org

7ab365f6

irqchip/sun6i-r: Use a stacked irqchip driver · 4e346146

Samuel Holland authored Jan 17, 2021

The R_INTC in the A31 and newer sun8i/sun50i SoCs is more similar to the
original sun4i interrupt controller than the sun7i/sun9i NMI controller.
It is used for two distinct purposes:
 - To control the trigger, latch, and mask for the NMI input pin
 - To provide the interrupt input for the ARISC coprocessor

As this interrupt controller is not documented, information about it
comes from vendor-provided firmware blobs and from experimentation.

Differences from the sun4i interrupt controller appear to be:
 - It only has one or two registers of each kind (max 32 or 64 IRQs)
 - Multiplexing logic is added to support additional inputs
 - There is no FIQ-related logic
 - There is no interrupt priority logic

In order to fulfill its two purposes, this hardware block combines four
types of IRQs. First, the NMI pin is routed to the "IRQ 0" input on this
chip, with a trigger type controlled by the NMI_CTRL_REG. The "IRQ 0
pending" output from this chip, if enabled, is then routed to a SPI IRQ
input on the GIC. In other words, bit 0 of IRQ_ENABLE_REG *does* affect
the NMI IRQ seen at the GIC.

The NMI is followed by a contiguous block of 15 "direct" (my name for
them) IRQ inputs that are connected in parallel to both R_INTC and the
GIC. Or in other words, these bits of IRQ_ENABLE_REG *do not* affect the
IRQs seen at the GIC.

Following the direct IRQs are the ARISC's copy of banked IRQs for shared
peripherals. These are not relevant to Linux. The remaining IRQs are
connected to a multiplexer and provide access to the first (up to) 128
SPIs from the ARISC. This range of SPIs overlaps with the direct IRQs.

Because of the 1:1 correspondence between R_INTC and GIC inputs, this is
a perfect scenario for using a stacked irqchip driver. We want to hook
into setting the NMI trigger type, but not actually handle any IRQ here.

To allow access to all multiplexed IRQs, this driver requires a new
binding where the interrupt number matches the GIC interrupt number.
(This moves the NMI from number 0 to 32 or 96, depending on the SoC.)
For simplicity, copy the three-cell GIC binding; this disambiguates
interrupt 0 in the old binding (the NMI) from interrupt 0 in the new
binding (SPI 0) by the number of cells.

Since R_INTC is in the always-on power domain, and its output is visible
to the power management coprocessor, a stacked irqchip driver provides a
simple way to add wakeup support to any of its IRQs. That is the next
patch; for now, just the NMI is moved over.

This commit mostly reverts commit 173bda53 ("irqchip/sunxi-nmi:
Support sun6i-a31-r-intc compatible").
Acked-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210118055040.21910-4-samuel@sholland.org

4e346146

dt-bindings: irq: sun6i-r: Add a compatible for the H3 · 6436eb44

Samuel Holland authored Jan 17, 2021

The Allwinner H3 SoC contains an R_INTC that is, as far as we know,
compatible with the R_INTC present in other sun8i SoCs starting with
the A31. Since the R_INTC hardware is undocumented, introduce a new
compatible for the R_INTC variant in this SoC, in case there turns out
to be some difference.
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210118055040.21910-3-samuel@sholland.org

6436eb44

dt-bindings: irq: sun6i-r: Split the binding from sun7i-nmi · ad6b47cd

Samuel Holland authored Jan 17, 2021

The R_INTC in the A31 and newer sun8i/sun50i SoCs has additional
functionality compared to the sun7i/sun9i NMI controller. Among other
things, it multiplexes access to up to 128 interrupts corresponding to
(and in parallel to) the first 128 GIC SPIs. This means the NMI is no
longer the lowest-numbered hwirq at this irqchip, since it is SPI 32 or
96 (depending on SoC). hwirq 0 now corresponds to SPI 0, usually UART0.

To allow access to all multiplexed IRQs, the R_INTC requires a new
binding where the interrupt number matches the GIC interrupt number.
Otherwise, interrupts with hwirq numbers below the NMI would not be
representable in the device tree.

For simplicity, copy the three-cell GIC binding; this disambiguates
interrupt 0 in the old binding (the NMI) from interrupt 0 in the new
binding (SPI 0) by the number of cells.

Because the H6 R_INTC has a different mapping from multiplexed IRQs to
top-level register bits, it is no longer compatible with the A31 R_INTC.
Acked-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210118055040.21910-2-samuel@sholland.org

ad6b47cd

irqchip/gic-v3: Fix typos in PMR/RPR SCR_EL3.FIQ handling explanation · d4034114

Lorenzo Pieralisi authored Jan 21, 2021

The GICv3 driver explanation related to PMR/RPR and SCR_EL3.FIQ
secure/non-secure priority handling contains a couple of typos.

Fix them.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210121182252.29320-1-lorenzo.pieralisi@arm.com

d4034114

irqchip: Remove sirfsoc driver · 5c1ea0d8

Arnd Bergmann authored Jan 20, 2021

The CSR SiRF prima2/atlas platforms are getting removed, so this driver
is no longer needed.

Cc: Barry Song <baohua@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Barry Song <baohua@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210120133008.2421897-3-arnd@kernel.org

5c1ea0d8

irqchip: Remove sigma tango driver · 00e772c4

Arnd Bergmann authored Jan 20, 2021

The tango platform is getting removed, so the driver is no
longer needed.

Cc: Marc Gonzalez <marc.w.gonzalez@free.fr>
Cc: Mans Rullgard <mans@mansr.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210120133008.2421897-2-arnd@kernel.org

00e772c4

18 Jan, 2021 1 commit
- Linux 5.11-rc4 · 19c329f6
  Linus Torvalds authored Jan 17, 2021
  
  19c329f6
17 Jan, 2021 4 commits

Merge tag 'perf-tools-fixes-2021-01-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux · e2da7836

Linus Torvalds authored Jan 17, 2021

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix 'CPU too large' error in Intel PT

 - Correct event attribute sizes in 'perf inject'

 - Sync build_bug.h and kvm.h kernel copies

 - Fix bpf.h header include directive in 5sec.c 'perf trace' bpf example

 - libbpf tests fixes

 - Fix shadow stat 'perf test' for non-bash shells

 - Take cgroups into account for shadow stats in 'perf stat'

* tag 'perf-tools-fixes-2021-01-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf inject: Correct event attribute sizes
  perf intel-pt: Fix 'CPU too large' error
  perf stat: Take cgroups into account for shadow stats
  perf stat: Introduce struct runtime_stat_data
  libperf tests: Fail when failing to get a tracepoint id
  libperf tests: If a test fails return non-zero
  libperf tests: Avoid uninitialized variable warning
  perf test: Fix shadow stat test for non-bash shells
  tools headers: Syncronize linux/build_bug.h with the kernel sources
  tools headers UAPI: Sync kvm.h headers with the kernel sources
  perf bpf examples: Fix bpf.h header include directive in 5sec.c example

e2da7836

Merge tag 'powerpc-5.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · a1339d63

Linus Torvalds authored Jan 17, 2021

Pull powerpc fixes from Michael Ellerman:
 "One fix for a lack of alignment in our linker script, that can lead to
  crashes depending on configuration etc.

  One fix for the 32-bit VDSO after the C VDSO conversion.

  Thanks to Andreas Schwab, Ariel Marcovitch, and Christophe Leroy"

* tag 'powerpc-5.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/vdso: Fix clock_gettime_fallback for vdso32
  powerpc: Fix alignment bug within the init sections

a1339d63

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · a527a2b3

Linus Torvalds authored Jan 17, 2021

Pull misc vfs fixes from Al Viro:
 "Several assorted fixes.

  I still think that audit ->d_name race is better fixed this way for
  the benefit of backports, with any possibly fancier variants done on
  top of it"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  dump_common_audit_data(): fix racy accesses to ->d_name
  iov_iter: fix the uaccess area in copy_compat_iovec_from_user
  umount(2): move the flag validity checks first

a527a2b3

mm: don't put pinned pages into the swap cache · feb889fb

Linus Torvalds authored Jan 16, 2021

So technically there is nothing wrong with adding a pinned page to the
swap cache, but the pinning obviously means that the page can't actually
be free'd right now anyway, so it's a bit pointless.

However, the real problem is not with it being a bit pointless: the real
issue is that after we've added it to the swap cache, we'll try to unmap
the page.  That will succeed, because the code in mm/rmap.c doesn't know
or care about pinned pages.

Even the unmapping isn't fatal per se, since the page will stay around
in memory due to the pinning, and we do hold the connection to it using
the swap cache.  But when we then touch it next and take a page fault,
the logic in do_swap_page() will map it back into the process as a
possibly read-only page, and we'll then break the page association on
the next COW fault.

Honestly, this issue could have been fixed in any of those other places:
(a) we could refuse to unmap a pinned page (which makes conceptual
sense), or (b) we could make sure to re-map a pinned page writably in
do_swap_page(), or (c) we could just make do_wp_page() not COW the
pinned page (which was what we historically did before that "mm:
do_wp_page() simplification" commit).

But while all of them are equally valid models for breaking this chain,
not putting pinned pages into the swap cache in the first place is the
simplest one by far.

It's also the safest one: the reason why do_wp_page() was changed in the
first place was that getting the "can I re-use this page" wrong is so
fraught with errors.  If you do it wrong, you end up with an incorrectly
shared page.

As a result, using "page_maybe_dma_pinned()" in either do_wp_page() or
do_swap_page() would be a serious bug since it is only a (very good)
heuristic.  Re-using the page requires a hard black-and-white rule with
no room for ambiguity.

In contrast, saying "this page is very likely dma pinned, so let's not
add it to the swap cache and try to unmap it" is an obviously safe thing
to do, and if the heuristic might very rarely be a false positive, no
harm is done.

Fixes: 09854ba9 ("mm: do_wp_page() simplification")
Reported-and-tested-by: Martin Raiber <martin@urbackup.org>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

feb889fb

16 Jan, 2021 12 commits

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 0da0a8a0

Linus Torvalds authored Jan 16, 2021

Pull SCSI fixes from James Bottomley:
 "Nine minor fixes, seven in drivers and two in the core SCSI disk
  driver (sd) which should be harmless involving removing an unused
  variable and quietening a spurious warning"
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sd: Remove obsolete variable in sd_remove()
  scsi: sd: Suppress spurious errors when WRITE SAME is being disabled
  scsi: scsi_debug: Fix memleak in scsi_debug_init()
  scsi: mpt3sas: Fix spelling mistake in Kconfig "compatiblity" -> "compatibility"
  scsi: qedi: Correct max length of CHAP secret
  scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback
  scsi: ufs: Relocate flush of exceptional event
  scsi: ufs: Relax the condition of UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL
  scsi: ufs: Fix possible power drain during system suspend

0da0a8a0

dump_common_audit_data(): fix racy accesses to ->d_name · d36a1dd9

Al Viro authored Jan 05, 2021

We are not guaranteed the locking environment that would prevent
dentry getting renamed right under us.  And it's possible for
old long name to be freed after rename, leading to UAF here.

Cc: stable@kernel.org # v2.6.2+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d36a1dd9

Merge tag 'block-5.11-2021-01-16' of git://git.kernel.dk/linux-block · 54c6247d

Linus Torvalds authored Jan 16, 2021

Pull block fixes from Jens Axboe:
 "Just an nvme pull request via Christoph:

   - don't initialize hwmon for discover controllers (Sagi Grimberg)

   - fix iov_iter handling in nvme-tcp (Sagi Grimberg)

   - fix a preempt warning in nvme-tcp (Sagi Grimberg)

   - fix a possible NULL pointer dereference in nvme (Israel Rukshin)"

* tag 'block-5.11-2021-01-16' of git://git.kernel.dk/linux-block:
  nvme: don't intialize hwmon for discovery controllers
  nvme-tcp: fix possible data corruption with bio merges
  nvme-tcp: Fix warning with CONFIG_DEBUG_PREEMPT
  nvmet-rdma: Fix NULL deref when setting pi_enable and traddr INADDR_ANY

54c6247d

Merge tag 'io_uring-5.11-2021-01-16' of git://git.kernel.dk/linux-block · 11c0239a

Linus Torvalds authored Jan 16, 2021

Pull io_uring fixes from Jens Axboe:
 "We still have a pending fix for a cancelation issue, but it's still
  being investigated. In the meantime:

   - Dead mm handling fix (Pavel)

   - SQPOLL setup error handling (Pavel)

   - Flush timeout sequence fix (Marcelo)

   - Missing finish_wait() for one exit case"

* tag 'io_uring-5.11-2021-01-16' of git://git.kernel.dk/linux-block:
  io_uring: ensure finish_wait() is always called in __io_uring_task_cancel()
  io_uring: flush timeouts that should already have expired
  io_uring: do sqo disable on install_fd error
  io_uring: fix null-deref in io_disable_sqo_submit
  io_uring: don't take files/mm for a dead task
  io_uring: drop mm and files after task_work_run

11c0239a

Merge tag 'riscv-for-linus-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · acda701b

Linus Torvalds authored Jan 16, 2021

Pull RISC-V fixes from Palmer Dabbelt:
"There are a few more fixes than a normal rc4, largely due to the
bubble introduced by the holiday break:

- return -ENOSYS for syscall number -1, which previously returned an
uninitialized value.

- ensure of_clk_init() has been called in time_init(), without which
clock drivers may not be initialized.

- fix sifive,uart0 driver to properly display the baud rate. A fix to
initialize MPIE that allows interrupts to be processed during
system calls.

- avoid erronously begin tracing IRQs when interrupts are disabled,
which at least triggers suprious lockdep failures.

- workaround for a warning related to calling smp_processor_id()
while preemptible. The warning itself is suprious on currently
availiable systems.

- properly include the generic time VDSO calls. A fix to our kasan
address mapping. A fix to the HiFive Unleashed device tree, which
allows the Ethernet PHY to be properly initialized by Linux (as
opposed to relying on the bootloader).

- defconfig update to include SiFive's GPIO driver, which is present
on the HiFive Unleashed and necessary to initialize the PHY.

- avoid allocating memory while initializing reserved memory.

- avoid allocating the last 4K of memory, as pointers there alias
with syscall errors.

There are also two cleanups that should have no functional effect but
do fix build warnings:

- drop a duplicated definition of PAGE_KERNEL_EXEC.

- properly declare the asm register SP shim.

- cleanup the rv32 memory size Kconfig entry, to reflect the actual
size of memory availiable"

* tag 'riscv-for-linus-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Fix maximum allowed phsyical memory for RV32
RISC-V: Set current memblock limit
RISC-V: Do not allocate memblock while iterating reserved memblocks
riscv: stacktrace: Move register keyword to beginning of declaration
riscv: defconfig: enable gpio support for HiFive Unleashed
dts: phy: add GPIO number and active state used for phy reset
dts: phy: fix missing mdio device and probe failure of vsc8541-01 device
riscv: Fix KASAN memory mapping.
riscv: Fixup CONFIG_GENERIC_TIME_VSYSCALL
riscv: cacheinfo: Fix using smp_processor_id() in preemptible
riscv: Trace irq on only interrupt is enabled
riscv: Drop a duplicated PAGE_KERNEL_EXEC
riscv: Enable interrupts during syscalls with M-Mode
riscv: Fix sifive serial driver
riscv: Fix kernel time_init()
riscv: return -ENOSYS for syscall -1

acda701b

mm: don't play games with pinned pages in clear_page_refs · 9348b73c

Linus Torvalds authored Jan 09, 2021

Turning a pinned page read-only breaks the pinning after COW. Don't do it.

The whole "track page soft dirty" state doesn't work with pinned pages
anyway, since the page might be dirtied by the pinning entity without
ever being noticed in the page tables.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9348b73c

mm: fix clear_refs_write locking · 29a951df

Linus Torvalds authored Jan 08, 2021

Turning page table entries read-only requires the mmap_sem held for
writing.

So stop doing the odd games with turning things from read locks to write
locks and back.  Just get the write lock.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

29a951df

RISC-V: Fix maximum allowed phsyical memory for RV32 · e5577937

Atish Patra authored Jan 11, 2021

Linux kernel can only map 1GB of address space for RV32 as the page offset
is set to 0xC0000000. The current description in the Kconfig is confusing
as it indicates that RV32 can support 2GB of physical memory. That is
simply not true for current kernel. In future, a 2GB split support can be
added to allow 2GB physical address space.
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>

e5577937

RISC-V: Set current memblock limit · abb8e86b

Atish Patra authored Jan 11, 2021

Currently, linux kernel can not use last 4k bytes of addressable space
because IS_ERR_VALUE macro treats those as an error. This will be an issue
for RV32 as any memblock allocator potentially allocate chunk of memory
from the end of DRAM (2GB) leading bad address error even though the
address was technically valid.

Fix this issue by limiting the memblock if available memory spans the
entire address space.
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>

abb8e86b

RISC-V: Do not allocate memblock while iterating reserved memblocks · 797f0375

Atish Patra authored Jan 11, 2021

Currently, resource tree allocates memory blocks while iterating on the
list. It leads to following kernel warning because memblock allocation
also invokes memory block reservation API.

[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: CPU: 0 PID: 0 at kernel/resource.c:795
__insert_resource+0x8e/0xd0
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted
5.10.0-00022-ge20097fb37e2-dirty #549
[    0.000000] epc: c00125c2 ra : c001262c sp : c1c01f50
[    0.000000]  gp : c1d456e0 tp : c1c0a980 t0 : ffffcf20
[    0.000000]  t1 : 00000000 t2 : 00000000 s0 : c1c01f60
[    0.000000]  s1 : ffffcf00 a0 : ffffff00 a1 : c1c0c0c4
[    0.000000]  a2 : 80c12b15 a3 : 80402000 a4 : 80402000
[    0.000000]  a5 : c1c0c0c4 a6 : 80c12b15 a7 : f5faf600
[    0.000000]  s2 : c1c0c0c4 s3 : c1c0e000 s4 : c1009a80
[    0.000000]  s5 : c1c0c000 s6 : c1d48000 s7 : c1613b4c
[    0.000000]  s8 : 00000fff s9 : 80000200 s10: c1613b40
[    0.000000]  s11: 00000000 t3 : c1d4a000 t4 : ffffffff

This is also unnecessary as we can pre-compute the total memblocks required
for each memory region and allocate it before the loop. It save precious
boot time not going through memblock allocation code every time.

Fixes: 00ab027a ("RISC-V: Add kernel image sections to the resource tree")
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>

797f0375

iov_iter: fix the uaccess area in copy_compat_iovec_from_user · a959a978

Christoph Hellwig authored Jan 11, 2021

sizeof needs to be called on the compat pointer, not the native one.

Fixes: 89cd35c5 ("iov_iter: transparently handle compat iovecs in import_iovec")
Reported-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a959a978

Merge tag 'for-5.11/dm-fixes-1' of... · 1d94330a

Linus Torvalds authored Jan 15, 2021

Merge tag 'for-5.11/dm-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Fix DM-raid's raid1 discard limits so discards work.

 - Select missing Kconfig dependencies for DM integrity and zoned
   targets.

 - Four fixes for DM crypt target's support to optionally bypass kcryptd
   workqueues.

 - Fix DM snapshot merge supports missing data flushes before committing
   metadata.

 - Fix DM integrity data device flushing when external metadata is used.

 - Fix DM integrity's maximum number of supported constructor arguments
   that user can request when creating an integrity device.

 - Eliminate DM core ioctl logging noise when an ioctl is issued without
   required CAP_SYS_RAWIO permission.

* tag 'for-5.11/dm-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm crypt: defer decryption to a tasklet if interrupts disabled
  dm integrity: fix the maximum number of arguments
  dm crypt: do not call bio_endio() from the dm-crypt tasklet
  dm integrity: fix flush with external metadata device
  dm: eliminate potential source of excessive kernel log noise
  dm snapshot: flush merged data before committing metadata
  dm crypt: use GFP_ATOMIC when allocating crypto requests from softirq
  dm crypt: do not wait for backlogged crypto request completion in softirq
  dm zoned: select CONFIG_CRC32
  dm integrity: select CRYPTO_SKCIPHER
  dm raid: fix discard limits for raid1

1d94330a

15 Jan, 2021 13 commits

Merge branch 'akpm' (patches from Andrew) · b45e2da6

Linus Torvalds authored Jan 15, 2021

Merge misc fixes from Andrew Morton:
 "10 patches.

  Subsystems affected by this patch series: MAINTAINERS and mm (slub,
  pagealloc, memcg, kasan, vmalloc, migration, hugetlb, memory-failure,
  and process_vm_access)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/process_vm_access.c: include compat.h
  mm,hwpoison: fix printing of page flags
  MAINTAINERS: add Vlastimil as slab allocators maintainer
  mm/hugetlb: fix potential missing huge page size info
  mm: migrate: initialize err in do_migrate_pages
  mm/vmalloc.c: fix potential memory leak
  arm/kasan: fix the array size of kasan_early_shadow_pte[]
  mm/memcontrol: fix warning in mem_cgroup_page_lruvec()
  mm/page_alloc: add a missing mm_page_alloc_zone_locked() tracepoint
  mm, slub: consider rest of partial list if acquire_slab() fails

b45e2da6

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 8cbe71e7

Linus Torvalds authored Jan 15, 2021

Pull rdma fixes from Jason Gunthorpe:
 "A fairly modest set of bug fixes, nothing abnormal from the merge
  window

  The ucma patch is a bit on the larger side, but given the regression
  was recently added I've opted to forward it to the rc stream.

   - Fix a ucma memory leak introduced in v5.9 while fixing the
     Syzkaller bugs

   - Don't fail when the xarray wraps for user verbs objects

   - User triggerable oops regression from the umem page size rework

   - Error unwind bugs in usnic, ocrdma, mlx5 and cma"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/cma: Fix error flow in default_roce_mode_store
  RDMA/mlx5: Fix wrong free of blue flame register on error
  IB/mlx5: Fix error unwinding when set_has_smi_cap fails
  RDMA/umem: Avoid undefined behavior of rounddown_pow_of_two()
  RDMA/ocrdma: Fix use after free in ocrdma_dealloc_ucontext_pd()
  RDMA/usnic: Fix memleak in find_free_vf_and_create_qp_grp
  RDMA/restrack: Don't treat as an error allocation ID wrapping
  RDMA/ucma: Do not miss ctx destruction steps in some cases

8cbe71e7

io_uring: ensure finish_wait() is always called in __io_uring_task_cancel() · a8d13dbc

Jens Axboe authored Jan 15, 2021

If we enter with requests pending and performm cancelations, we'll have
a different inflight count before and after calling prepare_to_wait().
This causes the loop to restart. If we actually ended up canceling
everything, or everything completed in-between, then we'll break out
of the loop without calling finish_wait() on the waitqueue. This can
trigger a warning on exit_signals(), as we leave the task state in
TASK_UNINTERRUPTIBLE.

Put a finish_wait() after the loop to catch that case.

Cc: stable@vger.kernel.org # 5.9+
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a8d13dbc

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 0bc9bc1d

Linus Torvalds authored Jan 15, 2021

Pull ext4 fixes from Ted Ts'o:
 "A number of bug fixes for ext4:

   - Fix for the new fast_commit feature

   - Fix some error handling codepaths in whiteout handling and
     mountpoint sampling

   - Fix how we write ext4_error information so it goes through the
     journal when journalling is active, to avoid races that can lead to
     lost error information, superblock checksum failures, or DIF/DIX
     features"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: remove expensive flush on fast commit
  ext4: fix bug for rename with RENAME_WHITEOUT
  ext4: fix wrong list_splice in ext4_fc_cleanup
  ext4: use IS_ERR instead of IS_ERR_OR_NULL and set inode null when IS_ERR
  ext4: don't leak old mountpoint samples
  ext4: drop ext4_handle_dirty_super()
  ext4: fix superblock checksum failure when setting password salt
  ext4: use sbi instead of EXT4_SB(sb) in ext4_update_super()
  ext4: save error info to sb through journal if available
  ext4: protect superblock modifications with a buffer lock
  ext4: drop sync argument of ext4_commit_super()
  ext4: combine ext4_handle_error() and save_error_info()

0bc9bc1d

Merge tag '5.11-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6 · 7cd3c412

Linus Torvalds authored Jan 15, 2021

Pull cifs fixes from Steve French:
 "Two small cifs fixes for stable (including an important handle leak
  fix) and three small cleanup patches"

* tag '5.11-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: style: replace one-element array with flexible-array
  cifs: connect: style: Simplify bool comparison
  fs: cifs: remove unneeded variable in smb3_fs_context_dup
  cifs: fix interrupted close commands
  cifs: check pointer before freeing

7cd3c412

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 82821be8

Linus Torvalds authored Jan 15, 2021

Pull arm64 fixes from Catalin Marinas:

 - Set the minimum GCC version to 5.1 for arm64 due to earlier compiler
   bugs.

 - Make atomic helpers __always_inline to avoid a section mismatch when
   compiling with clang.

 - Fix the CMA and crashkernel reservations to use ZONE_DMA (remove the
   arm64_dma32_phys_limit variable, no longer needed with a dynamic
   ZONE_DMA sizing in 5.11).

 - Remove redundant IRQ flag tracing that was leaving lockdep
   inconsistent with the hardware state.

 - Revert perf events based hard lockup detector that was causing
   smp_processor_id() to be called in preemptible context.

 - Some trivial cleanups - spelling fix, renaming S_FRAME_SIZE to
   PT_REGS_SIZE, function prototypes added.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: selftests: Fix spelling of 'Mismatch'
  arm64: syscall: include prototype for EL0 SVC functions
  compiler.h: Raise minimum version of GCC to 5.1 for arm64
  arm64: make atomic helpers __always_inline
  arm64: rename S_FRAME_SIZE to PT_REGS_SIZE
  Revert "arm64: Enable perf events based hard lockup detector"
  arm64: entry: remove redundant IRQ flag tracing
  arm64: Remove arm64_dma32_phys_limit and its uses

82821be8

Merge tag 'mips_fixes_5.11.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · f288c895

Linus Torvalds authored Jan 15, 2021

Pull MIPS fixes from Thomas Bogendoerfer:

 - fix coredumps on 64bit kernels

 - fix for alignment bugs preventing booting

 - fix checking for failed irq_alloc_desc calls

* tag 'mips_fixes_5.11.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: OCTEON: fix unreachable code in octeon_irq_init_ciu
  MIPS: relocatable: fix possible boot hangup with KASLR enabled
  MIPS: Fix malformed NT_FILE and NT_SIGINFO in 32bit coredumps
  MIPS: boot: Fix unaligned access with CONFIG_MIPS_RAW_APPENDED_DTB

f288c895

perf inject: Correct event attribute sizes · 648b054a

Al Grant authored Nov 24, 2020

When 'perf inject' reads a perf.data file from an older version of perf,
it writes event attributes into the output with the original size field,
but lays them out as if they had the size currently used. Readers see a
corrupt file. Update the size field to match the layout.
Signed-off-by: Al Grant <al.grant@foss.arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20201124195818.30603-1-al.grant@arm.comSigned-off-by: Denis Nikitin <denik@chromium.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

648b054a

perf intel-pt: Fix 'CPU too large' error · 5501e922

Adrian Hunter authored Jan 07, 2021

In some cases, the number of cpus (nr_cpus_online) is confused with the
maximum cpu number (nr_cpus_avail), which results in the error in the
example below:

Example on system with 8 cpus:

 Before:
   # echo 0 > /sys/devices/system/cpu/cpu2/online
   # ./perf record --kcore -e intel_pt// taskset --cpu-list 7 uname
   Linux
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.147 MB perf.data ]
   # ./perf script --itrace=e
   Requested CPU 7 too large. Consider raising MAX_NR_CPUS
   0x25908 [0x8]: failed to process type: 68 [Invalid argument]

 After:
   # ./perf script --itrace=e
   #

Fixes: 8c727469 ("perf machine: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Fixes: 7df4e36a ("perf session: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210107174159.24897-1-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

5501e922

perf stat: Take cgroups into account for shadow stats · a1bf2305

Namhyung Kim authored Jan 15, 2021

As of now it doesn't consider cgroups when collecting shadow stats and
metrics so counter values from different cgroups will be saved in a same
slot.  This resulted in incorrect numbers when those cgroups have
different workloads.

For example, let's look at the scenario below: cgroups A and C runs same
workload which burns a cpu while cgroup B runs a light workload.

  $ perf stat -a -e cycles,instructions --for-each-cgroup A,B,C  sleep 1

   Performance counter stats for 'system wide':

     3,958,116,522      cycles                A
     6,722,650,929      instructions          A #    2.53  insn per cycle
         1,132,741      cycles                B
           571,743      instructions          B #    0.00  insn per cycle
     4,007,799,935      cycles                C
     6,793,181,523      instructions          C #    2.56  insn per cycle

       1.001050869 seconds time elapsed

When I run 'perf stat' with single workload, it usually shows IPC around
1.7.  We can verify it (6,722,650,929.0 / 3,958,116,522 = 1.698) for cgroup A.

But in this case, since cgroups are ignored, cycles are averaged so it
used the lower value for IPC calculation and resulted in around 2.5.

  avg cycle: (3958116522 + 1132741 + 4007799935) / 3 = 2655683066
  IPC (A)  :  6722650929 / 2655683066 = 2.531
  IPC (B)  :      571743 / 2655683066 = 0.0002
  IPC (C)  :  6793181523 / 2655683066 = 2.557

We can simply compare cgroup pointers in the evsel and it'll be NULL
when cgroups are not specified.  With this patch, I can see correct
numbers like below:

  $ perf stat -a -e cycles,instructions --for-each-cgroup A,B,C  sleep 1

  Performance counter stats for 'system wide':

     4,171,051,687      cycles                A
     7,219,793,922      instructions          A #    1.73  insn per cycle
         1,051,189      cycles                B
           583,102      instructions          B #    0.55  insn per cycle
     4,171,124,710      cycles                C
     7,192,944,580      instructions          C #    1.72  insn per cycle

       1.007909814 seconds time elapsed
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210115071139.257042-2-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

a1bf2305

perf stat: Introduce struct runtime_stat_data · 3ff1e718

Namhyung Kim authored Jan 15, 2021

To pass more info to the saved_value in the runtime_stat, add a new
struct runtime_stat_data.  Currently it only has 'ctx' field but later
patch will add more.

Note that we intentionally pass 0 as ctx to clock-related events for
compatibility.  It was already there in a few places.  So move the code
into the saved_value_lookup() explicitly and add a comment.
Suggested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210115071139.257042-1-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

3ff1e718

libperf tests: Fail when failing to get a tracepoint id · 66dd86b2

Ian Rogers authored Jan 14, 2021

Permissions are necessary to get a tracepoint id. Fail the test when the
read fails.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210114180250.3853825-2-irogers@google.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

66dd86b2

libperf tests: If a test fails return non-zero · bba2ea17

Ian Rogers authored Jan 14, 2021

If a test fails return -1 rather than 0. This is consistent with the
return value in test-cpumap.c
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210114180250.3853825-1-irogers@google.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

bba2ea17