1. 22 Oct, 2021 24 commits
  2. 14 Oct, 2021 2 commits
  3. 13 Oct, 2021 3 commits
  4. 12 Oct, 2021 2 commits
  5. 08 Oct, 2021 9 commits
    • powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die · f9473a65
      Nathan Lynch authored
      This comment likely refers to the obsolete DLPAR workflow where some
      resource state transitions were driven more directly from user space
      utilities, but it also seems to contradict itself: "Change isolate state to
      Isolate [...]" is at odds with the preceding sentences, and it does not
      relate at all to the code that follows.
      
      Remove it to prevent confusion.
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210927201933.76786-5-nathanl@linux.ibm.com
    • powerpc/pseries/cpuhp: delete add/remove_by_count code · fa2a5dfe
      Nathan Lynch authored
      The core DLPAR code supports two actions (add and remove) and three
      subtypes of action:
      
      * By DRC index: the action is attempted on a single specified resource.
        This is the usual case for processors.
      * By indexed count: the action is attempted on a range of resources
        beginning at the specified index. This is implemented only by the memory
        DLPAR code.
      * By count: the lower layer (CPU or memory) is responsible for locating the
        specified number of resources to which the action can be applied.
      
      I cannot find any evidence of the "by count" subtype being used by drmgr or
      qemu for processors. And when I try to exercise this code, the add case
      does not work:
      
        $ ppc64_cpu --smt ; nproc
        SMT=8
        24
        $ printf "cpu remove count 2" > /sys/kernel/dlpar
        $ nproc
        8
        $ printf "cpu add count 2" > /sys/kernel/dlpar
        -bash: printf: write error: Invalid argument
        $ dmesg | tail -2
        pseries-hotplug-cpu: Failed to find enough CPUs (1 of 2) to add
        dlpar: Could not handle DLPAR request "cpu add count 2"
        $ nproc
        8
        $ drmgr -c cpu -a -q 2         # this uses the by-index method
        Validating CPU DLPAR capability...yes.
        CPU 1
        CPU 17
        $ nproc
        24
      
      This is because find_drc_info_cpus_to_add() does not increment drc_index
      appropriately during its search.
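A minimal userspace sketch of this class of bug, assuming a range of sequential indexes described by a start value and a stride. The names find_free_indexes() and index_in_use() are hypothetical and are not taken from the kernel source. Keeping the index increment in the loop header means that no early `continue` can skip it:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: is idx already backed by a present resource? */
static bool index_in_use(const uint32_t *used, size_t n_used, uint32_t idx)
{
    for (size_t i = 0; i < n_used; i++)
        if (used[i] == idx)
            return true;
    return false;
}

/*
 * Collect up to `wanted` unused indexes from the sequence
 * start, start + inc, start + 2 * inc, ... (num_elems entries).
 * Returns the number of indexes written to `out`.
 */
size_t find_free_indexes(uint32_t start, uint32_t inc, size_t num_elems,
                         const uint32_t *used, size_t n_used,
                         uint32_t *out, size_t wanted)
{
    size_t found = 0;
    uint32_t idx = start;

    /* idx advances in the loop header, so `continue` cannot bypass it. */
    for (size_t i = 0; i < num_elems && found < wanted; i++, idx += inc) {
        if (index_in_use(used, n_used, idx))
            continue;
        out[found++] = idx;
    }
    return found;
}
```

If the increment instead sat at the bottom of the loop body, every iteration after the first in-use index would re-test the same value and the search would come up short.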
      
      This is not hard to fix. But the _by_count() functions also have the
      property that they attempt to roll back all prior operations if the entire
      request cannot be satisfied, even though the rollback itself can encounter
      errors. It's not possible to provide transaction-like behavior at this
      level, and it's undesirable to have code that can only pretend to do that.
      Any users of these functions cannot know what the state of the system is in
      the error case. And the error paths are, to my knowledge, impossible to
      test without adding custom error injection code.
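The rollback concern can be illustrated with a small simulation. This is a sketch under stated assumptions, not the removed kernel code: apply_batch(), struct batch_ops, and the demo callbacks are all hypothetical. It shows that once an undo step can fail, the caller can at best be told that the final state is unknown:

```c
#include <stddef.h>

enum batch_result {
    BATCH_OK,           /* every item was applied */
    BATCH_ROLLED_BACK,  /* an apply failed and every undo succeeded */
    BATCH_INCONSISTENT, /* an apply failed and at least one undo failed too */
};

struct batch_ops {
    int (*apply)(int item); /* returns 0 on success */
    int (*undo)(int item);  /* returns 0 on success */
};

enum batch_result apply_batch(const struct batch_ops *ops,
                              const int *items, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (ops->apply(items[i]) == 0)
            continue;

        /* Undo items[i-1] .. items[0]; an undo failure cannot be repaired. */
        enum batch_result res = BATCH_ROLLED_BACK;
        while (i-- > 0) {
            if (ops->undo(items[i]) != 0)
                res = BATCH_INCONSISTENT;
        }
        return res;
    }
    return BATCH_OK;
}

/* Demo callbacks used only to exercise the error paths. */
int apply_fails_on_3(int item) { return item == 3 ? -1 : 0; }
int undo_always_ok(int item)   { (void)item; return 0; }
int undo_fails_on_1(int item)  { return item == 1 ? -1 : 0; }
```

Note that reaching the BATCH_INCONSISTENT path here requires purpose-built failing callbacks, which mirrors the point about error injection.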
      
      Summary:
      
      * This code has not worked reliably since its introduction.
      * There is no evidence that it is used.
      * It contains questionable rollback behaviors in error paths which are
        difficult to test.
      
      So let's remove it.
      
      Fixes: ac713800 ("powerpc/pseries: Add CPU dlpar remove functionality")
      Fixes: 90edf184 ("powerpc/pseries: Add CPU dlpar add functionality")
      Fixes: b015f6bc ("powerpc/pseries: Add cpu DLPAR support for drc-info property")
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210927201933.76786-4-nathanl@linux.ibm.com
    • powerpc/cpuhp: BUG -> WARN conversion in offline path · 983f9101
      Nathan Lynch authored
      If, due to bugs elsewhere, we get into unregister_cpu_online() with a CPU
      that isn't marked hotpluggable, we can emit a warning and return an
      appropriate error instead of crashing.
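A userspace analog of the warn-and-return pattern, as a sketch only. struct cpu_state and cpu_offline_checked() are hypothetical, and the choice of -EBUSY as the error code is an assumption rather than something stated in the commit message:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-CPU bookkeeping for the illustration. */
struct cpu_state {
    int id;
    bool hotpluggable;
    bool online;
};

/*
 * Refuse to offline a CPU that is not marked hotpluggable: log a
 * warning and return an error instead of aborting the program.
 */
int cpu_offline_checked(struct cpu_state *c)
{
    if (!c->hotpluggable) {
        fprintf(stderr, "warning: cpu %d can't be offlined\n", c->id);
        return -EBUSY; /* assumed error code */
    }
    c->online = false;
    return 0;
}
```

The caller sees a failed operation and an unchanged CPU state, and the system keeps running.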
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210927201933.76786-3-nathanl@linux.ibm.com
    • powerpc/pseries/cpuhp: cache node corrections · 7edd5c9a
      Nathan Lynch authored
      On pseries, cache nodes in the device tree can be added and removed by the
      CPU DLPAR code as well as the partition migration (mobility) code. PowerVM
      partitions in dedicated processor mode typically have L2 and L3 cache
      nodes.
      
      The CPU DLPAR code has the following shortcomings:
      
      * Cache nodes returned as siblings of a new CPU node by
        ibm,configure-connector are silently discarded; only the CPU node is
        added to the device tree.
      
      * Cache nodes which become unreferenced in the processor removal path are
        not removed from the device tree. This can lead to duplicate nodes when
        the post-migration device tree update code replaces cache nodes.
      
      This is long-standing behavior. Presumably it has gone mostly unnoticed
      because the two bugs have the property of obscuring each other in common
      simple scenarios (e.g. remove a CPU and add it back). Likely you'd notice
      only if you cared to inspect the device tree or the sysfs cacheinfo
      information.
      
      Booted with two processors:
      
        $ pwd
        /sys/firmware/devicetree/base/cpus
        $ ls -1d */
        l2-cache@2010/
        l2-cache@2011/
        l3-cache@3110/
        l3-cache@3111/
        PowerPC,POWER9@0/
        PowerPC,POWER9@8/
        $ lsprop */l2-cache
        l2-cache@2010/l2-cache
                       00003110 (12560)
        l2-cache@2011/l2-cache
                       00003111 (12561)
        PowerPC,POWER9@0/l2-cache
                       00002010 (8208)
        PowerPC,POWER9@8/l2-cache
                       00002011 (8209)
        $ ls /sys/devices/system/cpu/cpu0/cache/
        index0  index1  index2  index3
      
      After DLPAR-adding PowerPC,POWER9@10, we see that its associated cache
      nodes are absent, its threads' L2+L3 cacheinfo is unpopulated, and it is
      missing a cache level in its sched domain hierarchy:
      
        $ ls -1d */
        l2-cache@2010/
        l2-cache@2011/
        l3-cache@3110/
        l3-cache@3111/
        PowerPC,POWER9@0/
        PowerPC,POWER9@10/
        PowerPC,POWER9@8/
        $ lsprop PowerPC\,POWER9@10/l2-cache
        PowerPC,POWER9@10/l2-cache
                       00002012 (8210)
        $ ls /sys/devices/system/cpu/cpu16/cache/
        index0  index1
        $ grep . /sys/kernel/debug/sched/domains/cpu{0,8,16}/domain*/name
        /sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT
        /sys/kernel/debug/sched/domains/cpu0/domain1/name:CACHE
        /sys/kernel/debug/sched/domains/cpu0/domain2/name:DIE
        /sys/kernel/debug/sched/domains/cpu8/domain0/name:SMT
        /sys/kernel/debug/sched/domains/cpu8/domain1/name:CACHE
        /sys/kernel/debug/sched/domains/cpu8/domain2/name:DIE
        /sys/kernel/debug/sched/domains/cpu16/domain0/name:SMT
        /sys/kernel/debug/sched/domains/cpu16/domain1/name:DIE
      
      When removing PowerPC,POWER9@8, we see that its cache nodes are left
      behind:
      
        $ ls -1d */
        l2-cache@2010/
        l2-cache@2011/
        l3-cache@3110/
        l3-cache@3111/
        PowerPC,POWER9@0/
      
      When DLPAR is combined with VM migration, we can get duplicate nodes. E.g.
      removing one processor, then migrating, adding a processor, and then
      migrating again can result in warnings from the OF core during
      post-migration device tree updates:
      
        Duplicate name in cpus, renamed to "l2-cache@2011#1"
        Duplicate name in cpus, renamed to "l3-cache@3111#1"
      
      and nodes with duplicated phandles in the tree, making lookup behavior
      unpredictable:
      
        $ lsprop l[23]-cache@*/ibm,phandle
        l2-cache@2010/ibm,phandle
                         00002010 (8208)
        l2-cache@2011#1/ibm,phandle
                         00002011 (8209)
        l2-cache@2011/ibm,phandle
                         00002011 (8209)
        l3-cache@3110/ibm,phandle
                         00003110 (12560)
        l3-cache@3111#1/ibm,phandle
                         00003111 (12561)
        l3-cache@3111/ibm,phandle
                         00003111 (12561)
      
      Address these issues by:
      
      * Correctly processing siblings of the node returned from
        dlpar_configure_connector().
      * Removing cache nodes in the CPU remove path when it can be determined
        that they are not associated with other CPUs or caches.
      
      Use the of_changeset API in both cases, which allows us to keep the error
      handling in this code from becoming more complex while ensuring that the
      device tree cannot become inconsistent.
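The removal rule (drop a cache node only when nothing else still references it) can be sketched in userspace as follows. This is a simplified model, not the kernel implementation and not the of_changeset API: struct dt_node and remove_cpu_and_orphans() are hypothetical, and each node is assumed to carry its own phandle plus the phandle of its next-level cache:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified node: a CPU or a cache in a flat table. */
struct dt_node {
    uint32_t phandle;    /* this node's own handle */
    uint32_t next_cache; /* handle of the next-level cache, 0 if none */
    bool present;
};

static struct dt_node *find_by_phandle(struct dt_node *nodes, size_t n,
                                       uint32_t ph)
{
    for (size_t i = 0; i < n; i++)
        if (nodes[i].present && nodes[i].phandle == ph)
            return &nodes[i];
    return NULL;
}

/* Does any node still present point at phandle ph? */
static bool is_referenced(const struct dt_node *nodes, size_t n, uint32_t ph)
{
    for (size_t i = 0; i < n; i++)
        if (nodes[i].present && nodes[i].next_cache == ph)
            return true;
    return false;
}

/*
 * Remove `cpu`, then walk its cache chain and remove each cache that
 * no remaining node references. Returns the number of nodes removed.
 */
size_t remove_cpu_and_orphans(struct dt_node *nodes, size_t n,
                              struct dt_node *cpu)
{
    size_t removed = 0;
    struct dt_node *cur = cpu;

    while (cur) {
        uint32_t next = cur->next_cache;

        cur->present = false;
        removed++;
        if (next == 0 || is_referenced(nodes, n, next))
            break; /* end of chain, or the cache is shared */
        cur = find_by_phandle(nodes, n, next);
    }
    return removed;
}
```

With private L2 and L3 caches the whole chain goes away with the CPU; with a shared L3 the walk stops at the L3 and leaves it in place.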
      
      Fixes: ac713800 ("powerpc/pseries: Add CPU dlpar remove functionality")
      Fixes: 90edf184 ("powerpc/pseries: Add CPU dlpar add functionality")
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210927201933.76786-2-nathanl@linux.ibm.com
    • powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted() · fda0eb22
      Nathan Lynch authored
      vcpu_is_preempted() can be used outside of preempt-disabled critical
      sections, yielding warnings such as:
      
      BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/185
      caller is rwsem_spin_on_owner+0x1cc/0x2d0
      CPU: 1 PID: 185 Comm: systemd-udevd Not tainted 5.15.0-rc2+ #33
      Call Trace:
      [c000000012907ac0] [c000000000aa30a8] dump_stack_lvl+0xac/0x108 (unreliable)
      [c000000012907b00] [c000000001371f70] check_preemption_disabled+0x150/0x160
      [c000000012907b90] [c0000000001e0e8c] rwsem_spin_on_owner+0x1cc/0x2d0
      [c000000012907be0] [c0000000001e1408] rwsem_down_write_slowpath+0x478/0x9a0
      [c000000012907ca0] [c000000000576cf4] filename_create+0x94/0x1e0
      [c000000012907d10] [c00000000057ac08] do_symlinkat+0x68/0x1a0
      [c000000012907d70] [c00000000057ae18] sys_symlink+0x58/0x70
      [c000000012907da0] [c00000000002e448] system_call_exception+0x198/0x3c0
      [c000000012907e10] [c00000000000c54c] system_call_common+0xec/0x250
      
      The result of vcpu_is_preempted() is always used speculatively, and the
      function does not access per-cpu resources in a (Linux) preempt-unsafe way.
      Use raw_smp_processor_id() to avoid such warnings, adding explanatory
      comments.
      
      Fixes: ca3f969d ("powerpc/paravirt: Use is_kvm_guest() in vcpu_is_preempted()")
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210928214147.312412-3-nathanl@linux.ibm.com
    • powerpc/paravirt: vcpu_is_preempted() commentary · 799f9b51
      Nathan Lynch authored
      Add comments more clearly documenting that this function determines whether
      hypervisor-level preemption of the VM has occurred.
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210928214147.312412-2-nathanl@linux.ibm.com
    • powerpc: fix unbalanced node refcount in check_kvm_guest() · 56537faf
      Nathan Lynch authored
      When check_kvm_guest() succeeds in looking up a /hypervisor OF node, it
      returns without performing a matching put for the lookup, leaving the
      node's reference count elevated.
      
      Add the necessary call to of_node_put(), rearranging the code slightly to
      avoid repetition or goto.
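A userspace sketch of a balanced lookup, assuming a toy refcounted node type. struct node, node_get(), node_put(), lookup_hypervisor() and is_kvm_guest_sketch() are hypothetical stand-ins, and the "linux,kvm" compatible string is an assumption. Structuring the function around a single result variable lets one put cover every path without repetition or goto:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy refcounted node standing in for a device tree node. */
struct node {
    const char *compatible;
    int refcount;
};

static struct node *node_get(struct node *n)
{
    if (n)
        n->refcount++;
    return n;
}

static void node_put(struct node *n)
{
    if (n)
        n->refcount--;
}

/* A lookup hands back the node with an extra reference, or NULL. */
static struct node *lookup_hypervisor(struct node *hv_in_tree)
{
    return node_get(hv_in_tree);
}

bool is_kvm_guest_sketch(struct node *hv_in_tree)
{
    struct node *hv = lookup_hypervisor(hv_in_tree);
    bool is_kvm = false;

    if (hv) {
        is_kvm = strcmp(hv->compatible, "linux,kvm") == 0;
        node_put(hv); /* matches the get taken by the lookup */
    }
    return is_kvm;
}
```

After the call the node's reference count is the same as before, whether or not the node matched.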
      
      Fixes: 107c5500 ("powerpc/pseries: Add KVM guest doorbell restrictions")
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210928124550.132020-1-nathanl@linux.ibm.com
    • video: fbdev: chipsfb: use memset_io() instead of memset() · f2719b26
      Christophe Leroy authored
      While investigating a lockup at startup on a Powerbook 3400C, it was
      found that the fbdev driver generates an alignment exception at
      startup:
      
        --- interrupt: 600 at memset+0x60/0xc0
        NIP:  c0021414 LR: c03fc49c CTR: 00007fff
        REGS: ca021c10 TRAP: 0600   Tainted: G        W          (5.14.2-pmac-00727-g12a41fa69492)
        MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 44008442  XER: 20000100
        DAR: cab80020 DSISR: 00017c07
        GPR00: 00000007 ca021cd0 c14412e0 cab80000 00000000 00100000 cab8001c 00000004
        GPR08: 00100000 00007fff 00000000 00000000 84008442 00000000 c0006fb4 00000000
        GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00100000
        GPR24: 00000000 81800000 00000320 c15fa400 c14d1878 00000000 c14d1800 c094e19c
        NIP [c0021414] memset+0x60/0xc0
        LR [c03fc49c] chipsfb_pci_init+0x160/0x580
        --- interrupt: 600
        [ca021cd0] [c03fc46c] chipsfb_pci_init+0x130/0x580 (unreliable)
        [ca021d20] [c03a3a70] pci_device_probe+0xf8/0x1b8
        [ca021d50] [c043d584] really_probe.part.0+0xac/0x388
        [ca021d70] [c043d914] __driver_probe_device+0xb4/0x170
        [ca021d90] [c043da18] driver_probe_device+0x48/0x144
        [ca021dc0] [c043e318] __driver_attach+0x11c/0x1c4
        [ca021de0] [c043ad30] bus_for_each_dev+0x88/0xf0
        [ca021e10] [c043c724] bus_add_driver+0x190/0x22c
        [ca021e40] [c043ee94] driver_register+0x9c/0x170
        [ca021e60] [c0006c28] do_one_initcall+0x54/0x1ec
        [ca021ed0] [c08246e4] kernel_init_freeable+0x1c0/0x270
        [ca021f10] [c0006fdc] kernel_init+0x28/0x11c
        [ca021f30] [c0017148] ret_from_kernel_thread+0x14/0x1c
        Instruction dump:
        7d4601a4 39490777 7d4701a4 39490888 7d4801a4 39490999 7d4901a4 39290aaa
        7d2a01a4 4c00012c 4bfffe88 0fe00000 <4bfffe80> 9421fff0 38210010 48001970
      
      This is due to the 'dcbz' instruction being used on non-cached memory.
      memset() uses 'dcbz' to zero a complete cacheline at once, and
      memset() is not expected to be used on non-cached memory.
      
      When running a 'sparse' check on the fbdev driver, it also appears
      that the use of memset() is unexpected:
      
        drivers/video/fbdev/chipsfb.c:334:17: warning: incorrect type in argument 1 (different address spaces)
        drivers/video/fbdev/chipsfb.c:334:17:    expected void *
        drivers/video/fbdev/chipsfb.c:334:17:    got char [noderef] __iomem *screen_base
        drivers/video/fbdev/chipsfb.c:334:15: warning: memset with byte count of 1048576
      
      Use fb_memset() instead of memset(). fb_memset() is defined as
      memset_io() for powerpc.
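To illustrate the idea behind an I/O-safe fill, here is a userspace sketch that writes through a volatile pointer using plain stores only. io_memset_sketch() is a hypothetical name and this is not the kernel's memset_io() implementation, which may use wider accesses. The point is simply that the fill never relies on cache-line zeroing:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Fill `count` bytes at `dst` with `value`, one ordinary byte store at
 * a time. The volatile qualifier keeps the compiler from replacing the
 * loop with a call to an optimized memset().
 */
void io_memset_sketch(volatile void *dst, int value, size_t count)
{
    volatile uint8_t *p = dst;

    while (count--)
        *p++ = (uint8_t)value;
}
```

A byte loop is slower than a cache-aware memset(), which is an acceptable trade for a one-time framebuffer clear on memory where cache operations are not valid.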
      
      Fixes: 8c870933 ("[PATCH] ppc32: Remove CONFIG_PMAC_PBOOK")
      Reported-by: Stan Johnson <userm57@yahoo.com>
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/884a54f1e5cb774c1d9b4db780209bee5d4f6718.1631712563.git.christophe.leroy@csgroup.eu
    • Vasant Hegde's avatar