1. 19 Oct, 2023 32 commits
  2. 18 Oct, 2023 3 commits
  3. 17 Oct, 2023 1 commit
    • Michael Ellerman's avatar
      powerpc/64s/radix: Don't warn on copros in radix__tlb_flush() · 20045f01
      Michael Ellerman authored
      Sachin reported a warning when running the inject-ra-err selftest:
      
        # selftests: powerpc/mce: inject-ra-err
        Disabling lock debugging due to kernel taint
        MCE: CPU19: machine check (Severe)  Real address Load/Store (foreign/control memory) [Not recovered]
        MCE: CPU19: PID: 5254 Comm: inject-ra-err NIP: [0000000010000e48]
        MCE: CPU19: Initiator CPU
        MCE: CPU19: Unknown
        ------------[ cut here ]------------
        WARNING: CPU: 19 PID: 5254 at arch/powerpc/mm/book3s64/radix_tlb.c:1221 radix__tlb_flush+0x160/0x180
        CPU: 19 PID: 5254 Comm: inject-ra-err Kdump: loaded Tainted: G   M        E      6.6.0-rc3-00055-g9ed22ae6 #4
        Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
        ...
        NIP radix__tlb_flush+0x160/0x180
        LR  radix__tlb_flush+0x104/0x180
        Call Trace:
          radix__tlb_flush+0xf4/0x180 (unreliable)
          tlb_finish_mmu+0x15c/0x1e0
          exit_mmap+0x1a0/0x510
          __mmput+0x60/0x1e0
          exit_mm+0xdc/0x170
          do_exit+0x2bc/0x5a0
          do_group_exit+0x4c/0xc0
          sys_exit_group+0x28/0x30
          system_call_exception+0x138/0x330
          system_call_vectored_common+0x15c/0x2ec
      
      And bisected it to commit e43c0a0c ("powerpc/64s/radix: combine
      final TLB flush and lazy tlb mm shootdown IPIs"), which added a warning
      in radix__tlb_flush() if mm->context.copros is still elevated.
      
      However it's possible for the copros count to be elevated if a process
      exits without first closing file descriptors that are associated with a
      copro, eg. VAS.
      
      If the process exits with a VAS file still open, the release callback
      is queued up for exit_task_work() via:
        exit_files()
          put_files_struct()
            close_files()
              filp_close()
                fput()
      
      And called via:
        exit_task_work()
          ____fput()
            __fput()
              file->f_op->release(inode, file)
                coproc_release()
                  vas_user_win_ops->close_win()
                    vas_deallocate_window()
                      mm_context_remove_vas_window()
                        mm_context_remove_copro()
      
      But that is after exit_mm() has been called from do_exit() and triggered
      the warning.
      
      Fix it by dropping the warning, and always calling __flush_all_mm().
      
      In the normal case of no copros, that will result in a call to
      _tlbiel_pid(mm->context.id, RIC_FLUSH_ALL) just as the current code
      does.
      
      If the copros count is elevated then it will cause a global flush, which
      should flush translations from any copros. Note that the process table
      entry was cleared in arch_exit_mmap(), so copros should not be able to
      fetch any new translations.
      
      Fixes: e43c0a0c ("powerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs")
      Reported-by: default avatarSachin Sant <sachinp@linux.ibm.com>
      Closes: https://lore.kernel.org/all/A8E52547-4BF1-47CE-8AEA-BC5A9D7E3567@linux.ibm.com/Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Tested-by: default avatarSachin Sant <sachinp@linux.ibm.com>
      Link: https://msgid.link/20231017121527.1574104-1-mpe@ellerman.id.au
      20045f01
  4. 15 Oct, 2023 1 commit
  5. 10 Oct, 2023 3 commits