1. 24 Apr, 2021 2 commits
  2. 23 Apr, 2021 1 commit
      irqchip/gic-v3: Do not enable irqs when handling spurious interrupts · a97709f5
      He Ying authored
      We triggered the following error while running our 4.19 kernel
      with the pseudo-NMI patches backported to it:
      
      [   14.816231] ------------[ cut here ]------------
      [   14.816231] kernel BUG at irq.c:99!
      [   14.816232] Internal error: Oops - BUG: 0 [#1] SMP
      [   14.816232] Process swapper/0 (pid: 0, stack limit = 0x(____ptrval____))
      [   14.816233] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O      4.19.95.aarch64 #14
      [   14.816233] Hardware name: evb (DT)
      [   14.816234] pstate: 80400085 (Nzcv daIf +PAN -UAO)
      [   14.816234] pc : asm_nmi_enter+0x94/0x98
      [   14.816235] lr : asm_nmi_enter+0x18/0x98
      [   14.816235] sp : ffff000008003c50
      [   14.816235] pmr_save: 00000070
      [   14.816237] x29: ffff000008003c50 x28: ffff0000095f56c0
      [   14.816238] x27: 0000000000000000 x26: ffff000008004000
      [   14.816239] x25: 00000000015e0000 x24: ffff8008fb916000
      [   14.816240] x23: 0000000020400005 x22: ffff0000080817cc
      [   14.816241] x21: ffff000008003da0 x20: 0000000000000060
      [   14.816242] x19: 00000000000003ff x18: ffffffffffffffff
      [   14.816243] x17: 0000000000000008 x16: 003d090000000000
      [   14.816244] x15: ffff0000095ea6c8 x14: ffff8008fff5ab40
      [   14.816244] x13: ffff8008fff58b9d x12: 0000000000000000
      [   14.816245] x11: ffff000008c8a200 x10: 000000008e31fca5
      [   14.816246] x9 : ffff000008c8a208 x8 : 000000000000000f
      [   14.816247] x7 : 0000000000000004 x6 : ffff8008fff58b9e
      [   14.816248] x5 : 0000000000000000 x4 : 0000000080000000
      [   14.816249] x3 : 0000000000000000 x2 : 0000000080000000
      [   14.816250] x1 : 0000000000120000 x0 : ffff0000095f56c0
      [   14.816251] Call trace:
      [   14.816251]  asm_nmi_enter+0x94/0x98
      [   14.816251]  el1_irq+0x8c/0x180                    (IRQ C)
      [   14.816252]  gic_handle_irq+0xbc/0x2e4
      [   14.816252]  el1_irq+0xcc/0x180                    (IRQ B)
      [   14.816253]  arch_timer_handler_virt+0x38/0x58
      [   14.816253]  handle_percpu_devid_irq+0x90/0x240
      [   14.816253]  generic_handle_irq+0x34/0x50
      [   14.816254]  __handle_domain_irq+0x68/0xc0
      [   14.816254]  gic_handle_irq+0xf8/0x2e4
      [   14.816255]  el1_irq+0xcc/0x180                    (IRQ A)
      [   14.816255]  arch_cpu_idle+0x34/0x1c8
      [   14.816255]  default_idle_call+0x24/0x44
      [   14.816256]  do_idle+0x1d0/0x2c8
      [   14.816256]  cpu_startup_entry+0x28/0x30
      [   14.816256]  rest_init+0xb8/0xc8
      [   14.816257]  start_kernel+0x4c8/0x4f4
      [   14.816257] Code: 940587f1 d5384100 b9401001 36a7fd01 (d4210000)
      [   14.816258] Modules linked in: start_dp(O) smeth(O)
      [   15.103092] ---[ end trace 701753956cb14aa8 ]---
      [   15.103093] Kernel panic - not syncing: Fatal exception in interrupt
      [   15.103099] SMP: stopping secondary CPUs
      [   15.103100] Kernel Offset: disabled
      [   15.103100] CPU features: 0x36,a2400218
      [   15.103100] Memory Limit: none
      
      which is caused by a 'BUG_ON(in_nmi())' in nmi_enter().
      
      From the call trace, we can find three interrupts (noted A, B, C above):
      interrupt (A) is preempted by (B), which is further interrupted by (C).
      
      Subsequent investigations show that (B) results in nmi_enter() being
      called, but that it actually is a spurious interrupt. Furthermore,
      interrupts are reenabled in the context of (B), and (C) fires with
      NMI priority. We end up with a nested NMI situation, something
      we definitely do not want to (and cannot) handle.
      
      The bug here is that spurious interrupts should never result in any
      state change, and we should just return to the interrupted context.
      Moving the handling of spurious interrupts as early as possible in
      the GICv3 handler fixes this issue.
      
      Fixes: 3f1f3234 ("irqchip/gic-v3: Switch to PMR masking before calling IRQ handler")
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: He Ying <heying24@huawei.com>
      [maz: rewrote commit message, corrected Fixes: tag]
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20210423083516.170111-1-heying24@huawei.com
      Cc: stable@vger.kernel.org
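The ordering described in the commit above (check for a spurious interrupt before changing any state) can be sketched with a toy model. This is not the GICv3 driver code: `toy_cpu`, `toy_handle_irq` and `TOY_SPURIOUS_INTID` are hypothetical names introduced for illustration, and the value 1023 is assumed here as the spurious interrupt ID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed spurious interrupt ID for this toy model. */
#define TOY_SPURIOUS_INTID 1023u

/* Hypothetical per-CPU state that an interrupt handler may modify. */
struct toy_cpu {
    bool irqs_unmasked;    /* set once the handler re-enables interrupts */
    unsigned int handled;  /* number of interrupts actually dispatched */
};

/*
 * Toy handler: the spurious check comes first, so a spurious interrupt
 * returns to the interrupted context without touching any state.
 * Returns true if the interrupt was dispatched.
 */
static bool toy_handle_irq(struct toy_cpu *cpu, uint32_t intid)
{
    if (intid == TOY_SPURIOUS_INTID)
        return false;

    /* Only a real interrupt reaches the state-changing part. */
    cpu->irqs_unmasked = true;
    cpu->handled++;
    return true;
}
```

Calling `toy_handle_irq(&cpu, TOY_SPURIOUS_INTID)` on a zero-initialized `struct toy_cpu` leaves both fields untouched, which is the property the fix relies on.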
  3. 22 Apr, 2021 8 commits
  4. 10 Apr, 2021 2 commits
      genirq: Reduce irqdebug cacheline bouncing · 7c07012e
      Nicholas Piggin authored
      note_interrupt() increments desc->irq_count for each interrupt even for
      percpu interrupt handlers, even when they are handled successfully. This
      causes cacheline bouncing and limits scalability.
      
      Instead of incrementing irq_count every time, only start incrementing it
      after seeing an unhandled irq, which should avoid the cache line
      bouncing in the common path.
      
      This actually should give better consistency in handling misbehaving
      irqs too, because instead of the first unhandled irq arriving at an
      arbitrary point in the irq_count cycle, its arrival will begin the
      irq_count cycle.
      
      Cédric reports the result of his IPI throughput test:
      
                     Millions of IPIs/s
       -----------   -----------------------------------------
                     upstream   upstream   patched
       chips  cpus   default    noirqdebug default (irqdebug)
       -----------   -----------------------------------------
       1      0-15     4.061      4.153      4.084
              0-31     7.937      8.186      8.158
              0-47    11.018     11.392     11.233
              0-63    11.460     13.907     14.022
       2      0-79     8.376     18.105     18.084
              0-95     7.338     22.101     22.266
              0-111    6.716     25.306     25.473
              0-127    6.223     27.814     28.029
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lore.kernel.org/r/20210402132037.574661-1-npiggin@gmail.com
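The counting scheme from the commit above can be illustrated with a small stand-alone model. It is a simplified sketch, not the kernel's note_interrupt(): `toy_irq_desc` and `toy_note_interrupt` are hypothetical names, and the real function's threshold and timing logic is left out.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the descriptor fields involved. */
struct toy_irq_desc {
    unsigned int irq_count;       /* shared counter that caused the bouncing */
    unsigned int irqs_unhandled;  /* number of unhandled interrupts seen */
};

/*
 * Toy version of the idea: irq_count is left alone until the first
 * unhandled interrupt has been seen, so the common (handled) path does
 * not write to the shared descriptor at all.
 */
static void toy_note_interrupt(struct toy_irq_desc *desc, bool handled)
{
    if (!handled)
        desc->irqs_unhandled++;

    /* Common path: nothing has misbehaved yet, so skip the shared write. */
    if (desc->irqs_unhandled == 0)
        return;

    /* The first unhandled interrupt starts the irq_count cycle. */
    desc->irq_count++;
}
```

In this model, any number of handled interrupts leaves `irq_count` at 0, and the first unhandled one both sets `irqs_unhandled` to 1 and starts the count at 1, matching the "its arrival will begin the irq_count cycle" behaviour described in the message.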
      kernel: Initialize cpumask before parsing · c5e3a411
      Tetsuo Handa authored
      KMSAN complains that new_value at cpumask_parse_user() from
      write_irq_affinity() from irq_affinity_proc_write() is uninitialized.
      
        [  148.133411][ T5509] =====================================================
        [  148.135383][ T5509] BUG: KMSAN: uninit-value in find_next_bit+0x325/0x340
        [  148.137819][ T5509]
        [  148.138448][ T5509] Local variable ----new_value.i@irq_affinity_proc_write created at:
        [  148.140768][ T5509]  irq_affinity_proc_write+0xc3/0x3d0
        [  148.142298][ T5509]  irq_affinity_proc_write+0xc3/0x3d0
        [  148.143823][ T5509] =====================================================
      
      Since bitmap_parse(), called from cpumask_parse_user(), calls
      find_next_bit(), any alloc_cpumask_var() + cpumask_parse_user() sequence
      may let find_next_bit() read an uninitialized cpumask variable. Fix this
      problem by replacing alloc_cpumask_var() with zalloc_cpumask_var().
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Link: https://lore.kernel.org/r/20210401055823.3929-1-penguin-kernel@I-love.SAKURA.ne.jp
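The zero-before-parse pattern from the commit above can be shown in user space. This is a sketch under stated assumptions, not the kernel cpumask API: `toy_zalloc_mask`, `toy_set_cpu`, `toy_next_cpu` and `TOY_NR_CPUS` are hypothetical names, and calloc() stands in for the zeroing allocation.

```c
#include <limits.h>
#include <stdlib.h>

#define TOY_NR_CPUS       128u
#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define TOY_MASK_WORDS \
    ((TOY_NR_CPUS + TOY_BITS_PER_WORD - 1) / TOY_BITS_PER_WORD)

/* Zeroing allocation: every bit is defined before anything reads the mask. */
static unsigned long *toy_zalloc_mask(void)
{
    return calloc(TOY_MASK_WORDS, sizeof(unsigned long));
}

/* A parser would typically only set bits; it never clears the rest. */
static void toy_set_cpu(unsigned long *mask, unsigned int cpu)
{
    mask[cpu / TOY_BITS_PER_WORD] |= 1UL << (cpu % TOY_BITS_PER_WORD);
}

/*
 * Scan for the next set bit at or after 'start'. With a non-zeroed
 * allocation this would read indeterminate memory.
 * Returns the CPU number, or -1 if no bit is set.
 */
static int toy_next_cpu(const unsigned long *mask, unsigned int start)
{
    for (unsigned int cpu = start; cpu < TOY_NR_CPUS; cpu++) {
        if (mask[cpu / TOY_BITS_PER_WORD] & (1UL << (cpu % TOY_BITS_PER_WORD)))
            return (int)cpu;
    }
    return -1;
}
```

Because the mask starts out all zero, a scan before any bit is set reliably finds nothing; with a plain malloc() the same scan would depend on whatever the memory happened to contain.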
  5. 08 Apr, 2021 1 commit
  6. 07 Apr, 2021 11 commits
  7. 30 Mar, 2021 1 commit
  8. 25 Mar, 2021 1 commit
  9. 22 Mar, 2021 1 commit
  10. 19 Mar, 2021 1 commit
  11. 17 Mar, 2021 11 commits