    powerpc/tm: Fix stack pointer corruption in __tm_recheckpoint() · 6bcb8014
    Michael Neuling authored
    At the start of __tm_recheckpoint() we save the kernel stack pointer
    (r1) in SPRG SCRATCH0 (SPRG2) so that we can restore it after the
    trecheckpoint.
    
    Unfortunately, the same SPRG is used in the SLB miss handler.  If an
    SLB miss is taken between the save and restore of r1 to the SPRG, the
    SPRG is changed and hence r1 is also corrupted.  We can end up with
    the following crash when we start using r1 again after the restore
    from the SPRG:
    
      Oops: Bad kernel stack pointer, sig: 6 [#1]
      SMP NR_CPUS=2048 NUMA pSeries
      CPU: 658 PID: 143777 Comm: htm_demo Tainted: G            EL   X 4.4.13-0-default #1
      task: c0000b56993a7810 ti: c00000000cfec000 task.ti: c0000b56993bc000
      NIP: c00000000004f188 LR: 00000000100040b8 CTR: 0000000010002570
      REGS: c00000000cfefd40 TRAP: 0300   Tainted: G            EL   X  (4.4.13-0-default)
      MSR: 8000000300001033 <SF,ME,IR,DR,RI,LE>  CR: 02000424  XER: 20000000
      CFAR: c000000000008468 DAR: 00003ffd84e66880 DSISR: 40000000 SOFTE: 0
      PACATMSCRATCH: 00003ffbc865e680
      GPR00: fffffffcfabc4268 00003ffd84e667a0 00000000100d8c38 000000030544bb80
      GPR04: 0000000000000002 00000000100cf200 0000000000000449 00000000100cf100
      GPR08: 000000000000c350 0000000000002569 0000000000002569 00000000100d6c30
      GPR12: 00000000100d6c28 c00000000e6a6b00 00003ffd84660000 0000000000000000
      GPR16: 0000000000000003 0000000000000449 0000000010002570 0000010009684f20
      GPR20: 0000000000800000 00003ffd84e5f110 00003ffd84e5f7a0 00000000100d0f40
      GPR24: 0000000000000000 0000000000000000 0000000000000000 00003ffff0673f50
      GPR28: 00003ffd84e5e960 00000000003d0f00 00003ffd84e667a0 00003ffd84e5e680
      NIP [c00000000004f188] restore_gprs+0x110/0x17c
      LR [00000000100040b8] 0x100040b8
      Call Trace:
      Instruction dump:
      f8a1fff0 e8e700a8 38a00000 7ca10164 e8a1fff8 e821fff0 7c0007dd 7c421378
      7db142a6 7c3242a6 38800002 7c810164 <e9c100e0> e9e100e8 ea0100f0 ea2100f8
    
    We hit this on large memory machines (> 2TB) but it can also be hit on
    smaller machines when 1TB segments are disabled.
    
    To hit this, you also need to be virtualised to ensure SLBs are
    periodically removed by the hypervisor.
    
    This patch moves the saving of r1 to the SPRG to the region where we
    are guaranteed not to take any further SLB misses.
    
    Fixes: 98ae22e1 ("powerpc: Add helper functions for transactional memory context switching")
    Cc: stable@vger.kernel.org # v3.9+
    Signed-off-by: Michael Neuling <mikey@neuling.org>
    Acked-by: Cyril Bur <cyrilbur@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
tm.S 12.1 KB