1. 31 Jan, 2023 2 commits
    • powerpc/kexec_file: Fix division by zero in extra size estimation · 7294194b
      Michael Ellerman authored
      In kexec_extra_fdt_size_ppc64() there's logic to estimate how much
      extra space will be needed in the device tree for some memory related
      properties.
      
      That logic uses the size of RAM divided by drmem_lmb_size() to do the
      estimation. However drmem_lmb_size() can be zero if the machine has no
      hotpluggable memory configured, which is the case when booting with qemu
      and no maxmem=x parameter is passed (the default).
      
      The division by zero is reported by UBSAN, and can also lead to an
      overflow and a warning from kvmalloc, causing kdump kernel loading to fail:
      
        WARNING: CPU: 0 PID: 133 at mm/util.c:596 kvmalloc_node+0x15c/0x160
        Modules linked in:
        CPU: 0 PID: 133 Comm: kexec Not tainted 6.2.0-rc5-03455-g07358bd97810 #223
        Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1200 0xf000005 of:SLOF,git-dd0dca pSeries
        NIP:  c00000000041ff4c LR: c00000000041fe58 CTR: 0000000000000000
        REGS: c0000000096ef750 TRAP: 0700   Not tainted  (6.2.0-rc5-03455-g07358bd97810)
        MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24248242  XER: 2004011e
        CFAR: c00000000041fed0 IRQMASK: 0
        ...
        NIP kvmalloc_node+0x15c/0x160
        LR  kvmalloc_node+0x68/0x160
        Call Trace:
          kvmalloc_node+0x68/0x160 (unreliable)
          of_kexec_alloc_and_setup_fdt+0xb8/0x7d0
          elf64_load+0x25c/0x4a0
          kexec_image_load_default+0x58/0x80
          sys_kexec_file_load+0x5c0/0x920
          system_call_exception+0x128/0x330
          system_call_vectored_common+0x15c/0x2ec
      
      To fix it, skip the calculation if drmem_lmb_size() is zero.
      
      Fixes: 2377c92e ("powerpc/kexec_file: fix FDT size estimation for kdump kernel")
      Cc: stable@vger.kernel.org # v5.12+
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230130014707.541110-1-mpe@ellerman.id.au
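
The guard described in this change can be illustrated with a minimal userspace sketch. It is not the kernel code: extra_fdt_size_estimate(), ram_size and lmb_size are hypothetical names, and the 8 bytes per entry is an assumption made only for illustration. In the kernel the divisor comes from drmem_lmb_size().

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the estimate described above.
 * ram_size and lmb_size are illustrative parameters; in the kernel the
 * LMB size comes from drmem_lmb_size(). The 8 bytes per entry is an
 * assumption made for this sketch.
 */
uint64_t extra_fdt_size_estimate(uint64_t ram_size, uint64_t lmb_size)
{
	uint64_t entries;

	/* No hotpluggable memory configured: skip the calculation. */
	if (lmb_size == 0)
		return 0;

	entries = ram_size / lmb_size;
	return entries * sizeof(uint64_t);
}
```

With a zero LMB size the function returns 0 instead of dividing, so no oversized allocation request can result from the estimate.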
    • powerpc/imc-pmu: Revert nest_init_lock to being a mutex · ad53db4a
      Michael Ellerman authored
      The recent commit 76d588dd ("powerpc/imc-pmu: Fix use of mutex in
      IRQs disabled section") fixed warnings (and possible deadlocks) in the
      IMC PMU driver by converting the locking to use spinlocks.
      
      It also converted the init-time nest_init_lock to a spinlock, even
      though it's not used at runtime in IRQ disabled sections or while
      holding other spinlocks.
      
      This leads to warnings such as:
      
        BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:49
        in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
        preempt_count: 1, expected: 0
        CPU: 7 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc2-14719-gf12cd061-dirty #1
        Hardware name: Mambo,Simulated-System POWER9 0x4e1203 opal:v6.6.6 PowerNV
        Call Trace:
          dump_stack_lvl+0x74/0xa8 (unreliable)
          __might_resched+0x178/0x1a0
          __cpuhp_setup_state+0x64/0x1e0
          init_imc_pmu+0xe48/0x1250
          opal_imc_counters_probe+0x30c/0x6a0
          platform_probe+0x78/0x110
          really_probe+0x104/0x420
          __driver_probe_device+0xb0/0x170
          driver_probe_device+0x58/0x180
          __driver_attach+0xd8/0x250
          bus_for_each_dev+0xb4/0x140
          driver_attach+0x34/0x50
          bus_add_driver+0x1e8/0x2d0
          driver_register+0xb4/0x1c0
          __platform_driver_register+0x38/0x50
          opal_imc_driver_init+0x2c/0x40
          do_one_initcall+0x80/0x360
          kernel_init_freeable+0x310/0x3b8
          kernel_init+0x30/0x1a0
          ret_from_kernel_thread+0x5c/0x64
      
      Fix it by converting nest_init_lock back to a mutex, so that we can call
      sleeping functions while holding it. There is no interaction between
      nest_init_lock and the runtime spinlocks used by the actual PMU routines.
      
      Fixes: 76d588dd ("powerpc/imc-pmu: Fix use of mutex in IRQs disabled section")
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230130014401.540543-1-mpe@ellerman.id.au
  2. 30 Jan, 2023 4 commits
  3. 11 Jan, 2023 3 commits
    • powerpc/64s/hash: Make stress_hpt_timer_fn() static · f12cd061
      Yang Yingliang authored
      stress_hpt_timer_fn() is only used in hash_utils.c, make it static.
      
      Fixes: 6b34a099 ("powerpc/64s/hash: add stress_hpt kernel boot option to increase hash faults")
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20221228093603.3166599-1-yangyingliang@huawei.com
    • powerpc/imc-pmu: Fix use of mutex in IRQs disabled section · 76d588dd
      Kajol Jain authored
      The current imc-pmu code triggers a WARNING with CONFIG_DEBUG_ATOMIC_SLEEP
      and CONFIG_PROVE_LOCKING enabled, while running a thread_imc event.
      
      Command to trigger the warning:
        # perf stat -e thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/ sleep 5
      
         Performance counter stats for 'sleep 5':
      
                         0      thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/
      
               5.002117947 seconds time elapsed
      
               0.000131000 seconds user
               0.001063000 seconds sys
      
      Below is a snippet of the warning from dmesg:
      
        BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
        in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2869, name: perf-exec
        preempt_count: 2, expected: 0
        4 locks held by perf-exec/2869:
         #0: c00000004325c540 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x64/0xa90
         #1: c00000004325c5d8 (&sig->exec_update_lock){++++}-{3:3}, at: begin_new_exec+0x460/0xef0
         #2: c0000003fa99d4e0 (&cpuctx_lock){-...}-{2:2}, at: perf_event_exec+0x290/0x510
         #3: c000000017ab8418 (&ctx->lock){....}-{2:2}, at: perf_event_exec+0x29c/0x510
        irq event stamp: 4806
        hardirqs last  enabled at (4805): [<c000000000f65b94>] _raw_spin_unlock_irqrestore+0x94/0xd0
        hardirqs last disabled at (4806): [<c0000000003fae44>] perf_event_exec+0x394/0x510
        softirqs last  enabled at (0): [<c00000000013c404>] copy_process+0xc34/0x1ff0
        softirqs last disabled at (0): [<0000000000000000>] 0x0
        CPU: 36 PID: 2869 Comm: perf-exec Not tainted 6.2.0-rc2-00011-g1247637727f2 #61
        Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV
        Call Trace:
          dump_stack_lvl+0x98/0xe0 (unreliable)
          __might_resched+0x2f8/0x310
          __mutex_lock+0x6c/0x13f0
          thread_imc_event_add+0xf4/0x1b0
          event_sched_in+0xe0/0x210
          merge_sched_in+0x1f0/0x600
          visit_groups_merge.isra.92.constprop.166+0x2bc/0x6c0
          ctx_flexible_sched_in+0xcc/0x140
          ctx_sched_in+0x20c/0x2a0
          ctx_resched+0x104/0x1c0
          perf_event_exec+0x340/0x510
          begin_new_exec+0x730/0xef0
          load_elf_binary+0x3f8/0x1e10
        ...
        do not call blocking ops when !TASK_RUNNING; state=2001 set at [<00000000fd63e7cf>] do_nanosleep+0x60/0x1a0
        WARNING: CPU: 36 PID: 2869 at kernel/sched/core.c:9912 __might_sleep+0x9c/0xb0
        CPU: 36 PID: 2869 Comm: sleep Tainted: G        W          6.2.0-rc2-00011-g1247637727f2 #61
        Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV
        NIP:  c000000000194a1c LR: c000000000194a18 CTR: c000000000a78670
        REGS: c00000004d2134e0 TRAP: 0700   Tainted: G        W           (6.2.0-rc2-00011-g1247637727f2)
        MSR:  9000000000021033 <SF,HV,ME,IR,DR,RI,LE>  CR: 48002824  XER: 00000000
        CFAR: c00000000013fb64 IRQMASK: 1
      
      The warning above is triggered because the current imc-pmu code takes a
      mutex in sections where interrupts are disabled. mutex_lock()
      internally calls __might_resched(), which triggers the warning if IRQs
      are disabled.
      
      Fix the issue by replacing the mutex with a spinlock.
      
      Fixes: 8f95faaa ("powerpc/powernv: Detect and create IMC device")
      Reported-by: Michael Petlan <mpetlan@redhat.com>
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      [mpe: Fix comments, trim oops in change log, add reported-by tags]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230106065157.182648-1-kjain@linux.ibm.com
    • powerpc/boot: Fix incorrect version calculation issue in ld_version · 3287ebd7
      Ojaswin Mujoo authored
      The ld_version() function computes the wrong version value for certain
      ld versions such as the following:
      
        $ ld --version
        GNU ld (GNU Binutils; SUSE Linux Enterprise 15)
        2.37.20211103-150100.7.37
      
      For input 2.37.20211103, the value computed is 202348030000 which is
      higher than the value for a later version like 2.39.0, which is
      239000000.
      
      This issue was highlighted because with the above ld version, the
      powerpc kernel build started failing with the ld error "unrecognized
      option --no-warn-rwx-segments". This was caused by the recent commit
      579aee9f ("powerpc: suppress some linker warnings in recent linker
      versions"), which added the --no-warn-rwx-segments linker flag if the ld
      version is greater than 2.39.
      
      Due to the bug in ld_version(), ld version 2.37.20211103 is wrongly
      calculated to be greater than 2.39 and the unsupported flag is added.
      
      To fix it, treat z as a date when the version has the form x.y.z and
      length(z) == 8: it is most probably a [yyyymmdd] stamp commonly used for
      release snapshots, not an actual new version. Ignore the date part by
      replacing it with 0.
      
      Fixes: 579aee9f ("powerpc: suppress some linker warnings in recent linker versions")
      Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
      [mpe: Tweak change log wording/formatting, add Fixes tag]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230104202437.90039-1-ojaswin@linux.ibm.com
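
The arithmetic in this change can be worked through with a small C sketch. ld_version() itself is not shown here, so this is not its implementation: ld_version_number() and its ignore_date flag are hypothetical, and the per-component weights are an assumption chosen to be consistent with the value 202348030000 quoted in the changelog.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: turn an "x.y.z" version string into one number.
 * The weights (1e8, 1e6, 1e4) are an assumption consistent with the value
 * quoted in the changelog. With ignore_date set, an 8-digit z is treated
 * as a yyyymmdd snapshot date and replaced with 0.
 */
uint64_t ld_version_number(const char *version, int ignore_date)
{
	unsigned long long major = 0, minor = 0, patch = 0;
	char patch_str[32] = "";

	/* Missing components stay at zero; parsing stops at the first '-'. */
	sscanf(version, "%llu.%llu.%31[0-9]", &major, &minor, patch_str);

	if (ignore_date && strlen(patch_str) == 8)
		patch_str[0] = '\0';	/* drop the date part */

	if (patch_str[0] != '\0')
		patch = strtoull(patch_str, NULL, 10);

	return major * 100000000ULL + minor * 1000000ULL + patch * 10000ULL;
}
```

Under these assumptions, ld_version_number("2.37.20211103", 0) yields 202348030000 and compares greater than 2.39.0, while the same input with ignore_date set yields 237000000 and compares less than 2.39.0.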
  4. 05 Jan, 2023 3 commits
    • powerpc/vmlinux.lds: Don't discard .comment · be5f95c8
      Michael Ellerman authored
      Although the powerpc linker script mentions .comment in the DISCARD
      section, that has never actually caused it to be discarded, because the
      earlier ELF_DETAILS macro (previously STABS_DEBUG) explicitly includes
      .comment.
      
      However commit 99cb0d91 ("arch: fix broken BuildID for arm64 and
      riscv") introduced an earlier use of DISCARD as part of the RO_DATA
      macro. With binutils < 2.36 that causes the DISCARD directives later in
      the script to be applied earlier, causing .comment to actually be
      discarded.
      
      It's confusing to explicitly include and discard .comment, and even more
      so if the behaviour depends on the toolchain version. So don't discard
      .comment in order to maintain the existing behaviour in all cases.
      
      Fixes: 83a092cf ("powerpc: Link warning for orphan sections")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230105132349.384666-3-mpe@ellerman.id.au
    • powerpc/vmlinux.lds: Don't discard .rela* for relocatable builds · 07b050f9
      Michael Ellerman authored
      Relocatable kernels must not discard relocations; they need to be
      processed at runtime. As such they are included for CONFIG_RELOCATABLE
      builds in the powerpc linker script (line 340).
      
      However they are also unconditionally discarded later in the
      script (line 414). Previously that worked because the earlier inclusion
      superseded the discard.
      
      However commit 99cb0d91 ("arch: fix broken BuildID for arm64 and
      riscv") introduced an earlier use of DISCARD as part of the RO_DATA
      macro (line 137). With binutils < 2.36 that causes the DISCARD
      directives later in the script to be applied earlier, causing .rela* to
      actually be discarded at link time, leading to build warnings and a
      kernel that doesn't boot:
      
        ld: warning: discarding dynamic section .rela.init.rodata
      
      Fix it by conditionally discarding .rela* only when CONFIG_RELOCATABLE
      is disabled.
      
      Fixes: 99cb0d91 ("arch: fix broken BuildID for arm64 and riscv")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230105132349.384666-2-mpe@ellerman.id.au
    • powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT · 4b9880db
      Michael Ellerman authored
      The powerpc linker script explicitly includes .exit.text, because
      otherwise the link fails due to references from __bug_table and
      __ex_table. The code is freed (discarded) at runtime along with
      .init.text and data.
      
      That has worked in the past despite powerpc not defining
      RUNTIME_DISCARD_EXIT because DISCARDS appears late in the powerpc linker
      script (line 410), and the explicit inclusion of .exit.text
      earlier (line 280) supersedes the discard.
      
      However commit 99cb0d91 ("arch: fix broken BuildID for arm64 and
      riscv") introduced an earlier use of DISCARD as part of the RO_DATA
      macro (line 136). With binutils < 2.36 that causes the DISCARD
      directives later in the script to be applied earlier [1], causing
      .exit.text to actually be discarded at link time, leading to build
      errors:
      
        '.exit.text' referenced in section '__bug_table' of crypto/algboss.o: defined in
        discarded section '.exit.text' of crypto/algboss.o
        '.exit.text' referenced in section '__ex_table' of drivers/nvdimm/core.o: defined in
        discarded section '.exit.text' of drivers/nvdimm/core.o
      
      Fix it by defining RUNTIME_DISCARD_EXIT, which causes the generic
      DISCARDS macro to not include .exit.text at all.
      
      [1]: https://lore.kernel.org/lkml/87fscp2v7k.fsf@igel.home/
      
      Fixes: 99cb0d91 ("arch: fix broken BuildID for arm64 and riscv")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230105132349.384666-1-mpe@ellerman.id.au
  5. 01 Jan, 2023 6 commits
  6. 31 Dec, 2022 2 commits
  7. 30 Dec, 2022 19 commits
  8. 29 Dec, 2022 1 commit
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 2258c2dc
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "Changes that were posted too late for 6.1, or after the release.
      
        x86:
      
         - several fixes to nested VMX execution controls
      
         - fixes and clarification to the documentation for Xen emulation
      
         - do not unnecessarily release a pmu event with zero period
      
         - MMU fixes
      
         - fix Coverity warning in kvm_hv_flush_tlb()
      
        selftests:
      
         - fixes for the ucall mechanism in selftests
      
         - other fixes mostly related to compilation with clang"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (41 commits)
        KVM: selftests: restore special vmmcall code layout needed by the harness
        Documentation: kvm: clarify SRCU locking order
        KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESET
        KVM: x86/xen: Documentation updates and clarifications
        KVM: x86/xen: Add KVM_XEN_INVALID_GPA and KVM_XEN_INVALID_GFN to uapi
        KVM: x86/xen: Simplify eventfd IOCTLs
        KVM: x86/xen: Fix SRCU/RCU usage in readers of evtchn_ports
        KVM: x86/xen: Use kvm_read_guest_virt() instead of open-coding it badly
        KVM: x86/xen: Fix memory leak in kvm_xen_write_hypercall_page()
        KVM: Delete extra block of "};" in the KVM API documentation
        kvm: x86/mmu: Remove duplicated "be split" in spte.h
        kvm: Remove the unused macro KVM_MMU_READ_{,UN}LOCK()
        MAINTAINERS: adjust entry after renaming the vmx hyperv files
        KVM: selftests: Mark correct page as mapped in virt_map()
        KVM: arm64: selftests: Don't identity map the ucall MMIO hole
        KVM: selftests: document the default implementation of vm_vaddr_populate_bitmap
        KVM: selftests: Use magic value to signal ucall_alloc() failure
        KVM: selftests: Disable "gnu-variable-sized-type-not-at-end" warning
        KVM: selftests: Include lib.mk before consuming $(CC)
        KVM: selftests: Explicitly disable builtins for mem*() overrides
        ...