    riscv: asid: Fixup stale TLB entry cause application crash · 82dd33fd
    Guo Ren authored
    After use_asid_allocator is enabled, userspace applications can crash
    because of stale TLB entries. Calling cpumask_clear_cpu() without
    local_flush_tlb_all() cannot guarantee that a CPU's TLB entries are
    fresh, so with set_mm_asid() a user space application may read a stale
    value through a stale TLB entry. The set_mm_noasid() path is not
    affected.
    
    Here is the symptom of the bug:
    unhandled signal 11 code 0x1 (coredump)
       0x0000003fd6d22524 <+4>:     auipc   s0,0x70
       0x0000003fd6d22528 <+8>:     ld      s0,-148(s0) # 0x3fd6d92490
    => 0x0000003fd6d2252c <+12>:    ld      a5,0(s0)
    (gdb) i r s0
    s0          0x8082ed1cc3198b21       0x8082ed1cc3198b21
    (gdb) x /2x 0x3fd6d92490
    0x3fd6d92490:   0xd80ac8a8      0x0000003f
    The core dump file shows that register s0 is wrong, but the value in
    memory is correct: 'ld s0, -148(s0)' used a stale mapping entry in the
    TLB and loaded its result from an incorrect physical address.
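    
    The arithmetic behind that reading of the dump can be checked with a
    small sketch. The helper names (auipc, ld_addr, join_words, check_dump)
    are hypothetical and only model the address computation of the two
    instructions shown above; they are not kernel or gdb APIs.

```c
#include <assert.h>
#include <stdint.h>

/* auipc rd, imm20: rd = pc + (imm20 << 12), imm20 sign-extended. */
static uint64_t auipc(uint64_t pc, int32_t imm20)
{
    return pc + ((uint64_t)(int64_t)imm20 << 12);
}

/* Effective address of "ld rd, off(rs1)": rs1 + sign-extended offset. */
static uint64_t ld_addr(uint64_t base, int32_t off)
{
    return base + (uint64_t)(int64_t)off;
}

/* Join the two 32-bit words printed by "x /2x" (little-endian order). */
static uint64_t join_words(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Re-derive the numbers quoted in the dump above. */
void check_dump(void)
{
    /* auipc at <+4>, then ld s0,-148(s0) at <+8> */
    uint64_t addr = ld_addr(auipc(0x3fd6d22524ULL, 0x70), -148);
    uint64_t in_memory = join_words(0xd80ac8a8u, 0x0000003fu);

    assert(addr == 0x3fd6d92490ULL);              /* matches the gdb comment */
    assert(in_memory != 0x8082ed1cc3198b21ULL);   /* s0 differs from memory */
}
```

    The load address matches the one annotated by gdb, and the 64-bit value
    at that address differs from what ended up in s0, which is consistent
    with the load having gone through a stale translation.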
    
    When the task ran on CPU0, it loaded (or speculatively loaded) the value
    at address 0x3fd6d92490, so the first version of the mapping entry was
    walked from the page table into CPU0's TLB.
    When the task switched from CPU0 to CPU1 (with no local_flush_tlb_all()
    because ASIDs are in use), it happened to write a value to address
    0x3fd6d92490. That caused do_page_fault -> wp_page_copy ->
    ptep_clear_flush -> ptep_get_and_clear & flush_tlb_page.
    flush_tlb_page used mm_cpumask(mm) to determine which CPUs need a TLB
    flush, but CPU0 had already cleared its own bit in mm_cpumask(mm) during
    the previous switch_mm. So only CPU1's TLB was flushed when the second
    version of the PTE mapping was set. When the task switched from CPU1
    back to CPU0, CPU0 still used the stale TLB entry, which contained the
    wrong target physical address. The bug surfaced when the task happened
    to read that value.
    
       CPU0                               CPU1
       - switch 'task' in
       - read addr (Fill stale mapping
         entry into TLB)
       - switch 'task' out (no tlb_flush)
                                          - switch 'task' in (no tlb_flush)
                                          - write addr cause pagefault
                                            do_page_fault() (change to
                                            new addr mapping)
                                              wp_page_copy()
                                                ptep_clear_flush()
                                                  ptep_get_and_clear()
                                                  & flush_tlb_page()
                                            write new value into addr
                                          - switch 'task' out (no tlb_flush)
       - switch 'task' in (no tlb_flush)
       - read addr again (Use stale
         mapping entry in TLB)
         get wrong value from old physical
         addr, BUG!
    
    The solution is to keep every CPU's footprint in mm_cpumask(mm) across
    switch_mm, which guarantees that a TLB flush invalidates all stale TLB
    entries.
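    
    The timeline above can be replayed in a toy user-space model. This is
    only a sketch under simplifying assumptions (one mm, one page, one
    cached translation per CPU); the types and helpers (toy_mm, toy_cpu,
    switch_in, switch_out, cow_write, replay) are hypothetical and are not
    the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Toy model: one mm with a single page. */
struct toy_mm {
    bool cpumask[NR_CPUS];  /* stand-in for mm_cpumask(mm) */
    int  pte;               /* current "physical page" behind the address */
};

struct toy_cpu {
    bool tlb_valid;
    int  tlb_pte;           /* translation cached in this CPU's TLB */
};

static struct toy_cpu cpus[NR_CPUS];

/* Task scheduled in on 'cpu'; ASID mode, so no local TLB flush. */
static void switch_in(struct toy_mm *mm, int cpu)
{
    mm->cpumask[cpu] = true;
}

/* Task scheduled out; 'clear_mask' models the pre-fix cpumask_clear_cpu(). */
static void switch_out(struct toy_mm *mm, int cpu, bool clear_mask)
{
    if (clear_mask)
        mm->cpumask[cpu] = false;
}

/* Read through the TLB, filling it from the "page table" on a miss. */
static int read_page(struct toy_mm *mm, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (!cpus[cpu].tlb_valid) {
        cpus[cpu].tlb_pte = mm->pte;
        cpus[cpu].tlb_valid = true;
    }
    return cpus[cpu].tlb_pte;
}

/* Copy-on-write fault: install a new PTE, flush only CPUs in the mask. */
static void cow_write(struct toy_mm *mm, int new_pte)
{
    mm->pte = new_pte;
    for (int i = 0; i < NR_CPUS; i++)
        if (mm->cpumask[i])
            cpus[i].tlb_valid = false;
}

/* Replay the timeline; returns the PTE that CPU0 uses for its last read. */
int replay(bool clear_mask)
{
    struct toy_mm mm = { .pte = 1 };

    for (int i = 0; i < NR_CPUS; i++)
        cpus[i] = (struct toy_cpu){ 0 };

    switch_in(&mm, 0);
    (void)read_page(&mm, 0);         /* CPU0 caches the first mapping */
    switch_out(&mm, 0, clear_mask);
    switch_in(&mm, 1);
    cow_write(&mm, 2);               /* fault on CPU1 installs new mapping */
    switch_out(&mm, 1, clear_mask);
    switch_in(&mm, 0);
    return read_page(&mm, 0);        /* stale if CPU0 was never flushed */
}
```

    In this model, replay(true) returns the old mapping (1) because CPU0 was
    missing from the mask when the flush was sent, while replay(false)
    returns the new mapping (2) because CPU0's bit was kept and its cached
    entry was invalidated.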
    
    Fixes: 65d4b9c5 ("RISC-V: Implement ASID allocator")
    Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
    Signed-off-by: Guo Ren <guoren@kernel.org>
    Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
    Tested-by: Zong Li <zong.li@sifive.com>
    Tested-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
    Cc: Anup Patel <apatel@ventanamicro.com>
    Cc: Palmer Dabbelt <palmer@rivosinc.com>
    Cc: stable@vger.kernel.org
    Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
    Link: https://lore.kernel.org/r/20230226150137.1919750-3-geomatsi@gmail.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>