1. 17 Nov, 2020 6 commits
  2. 16 Nov, 2020 13 commits
  3. 15 Nov, 2020 17 commits
  4. 14 Nov, 2020 4 commits
    • lan743x: prevent entire kernel HANG on open, for some platforms · 796a2665
      Sven Van Asbroeck authored
      On arm imx6, when opening the chip's netdev, the whole Linux
      kernel intermittently hangs/freezes.
      
      This is caused by a bug in the driver code which tests if pcie
      interrupts are working correctly, using the software interrupt:
      
      1. open: enable the software interrupt
      2. open: tell the chip to assert the software interrupt
      3. open: wait for flag
      4. ISR: acknowledge s/w interrupt, set flag
      5. open: notice flag, disable the s/w interrupt, continue
      
      Unfortunately the ISR only acknowledges the s/w interrupt, but
      does not disable it. This will re-trigger the ISR in a tight
      loop.
      
      On some (lucky) platforms, open proceeds to disable the s/w
      interrupt even while the ISR is 'spinning'. On arm imx6,
      the spinning ISR does not allow open to proceed, resulting
      in a hung Linux kernel.
      
      Fix minimally by disabling the s/w interrupt in the ISR, which
      will prevent it from spinning. This won't break anything because
      the s/w interrupt is used as a one-shot interrupt.
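
The spinning behaviour described above can be sketched with a small user-space model. This is only an illustration, not driver code: the struct, the function names and the "status re-latches while enabled" behaviour are assumptions made for the sketch, and none of it reflects the real lan743x register layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a one-shot software interrupt. All names are
 * hypothetical; this is not the lan743x register layout. */
struct sw_irq_model {
    bool enabled; /* interrupt-enable bit for the s/w interrupt */
    bool pending; /* status bit, as seen by the interrupt controller */
    bool flag;    /* set by the ISR, polled by open() */
};

/* The IRQ line is high while the source is both enabled and pending. */
static bool irq_line_high(const struct sw_irq_model *m)
{
    return m->enabled && m->pending;
}

/* One ISR invocation. With disable_in_isr set, this models the fix. */
static void model_isr(struct sw_irq_model *m, bool disable_in_isr)
{
    if (disable_in_isr)
        m->enabled = false; /* the fix: treat the interrupt as one-shot */

    m->pending = false;     /* acknowledge the s/w interrupt */
    m->flag = true;         /* tell open() that the interrupt arrived */

    /* ASSUMPTION for this sketch: while still enabled, the source
     * re-latches its status bit, so the line goes high again. */
    if (m->enabled)
        m->pending = true;
}

/* Count ISR invocations until the line goes quiet, capped at budget. */
static int count_isr_runs(bool disable_in_isr, int budget)
{
    struct sw_irq_model m = { .enabled = true, .pending = true, .flag = false };
    int runs = 0;

    assert(budget > 0);
    while (irq_line_high(&m) && runs < budget) {
        model_isr(&m, disable_in_isr);
        runs++;
    }
    return runs;
}
```

In this model, count_isr_runs(true, 1000) returns 1, while count_isr_runs(false, 1000) stops only because the budget of 1000 runs out, which is the user-space analogue of the ISR never letting open proceed.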
      
      Note that this is a minimal fix, overlooking many possible
      cleanups, e.g.:
      - lan743x_intr_software_isr() is completely redundant and reads
        INT_STS twice for no apparent reason
      - disabling the s/w interrupt in lan743x_intr_test_isr() is now
        redundant, but harmless
      - waiting on software_isr_flag can be converted from a sleeping
        poll loop to wait_event_timeout()
      
      Fixes: 23f0703c ("lan743x: Add main source files for new lan743x driver")
      Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # arm imx6 lan7430
      Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
      Link: https://lore.kernel.org/r/20201112204741.12375-1-TheSven73@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • lan743x: fix issue causing intermittent kernel log warnings · e35df62e
      Sven Van Asbroeck authored
      When running this chip on arm imx6, we intermittently observe
      the following kernel warning in the log, especially when the
      system is under high load:
      
      [   50.119484] ------------[ cut here ]------------
      [   50.124377] WARNING: CPU: 0 PID: 303 at kernel/softirq.c:169 __local_bh_enable_ip+0x100/0x184
      [   50.132925] IRQs not enabled as expected
      [   50.159250] CPU: 0 PID: 303 Comm: rngd Not tainted 5.7.8 #1
      [   50.164837] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
      [   50.171395] [<c0111a38>] (unwind_backtrace) from [<c010be28>] (show_stack+0x10/0x14)
      [   50.179162] [<c010be28>] (show_stack) from [<c05b9dec>] (dump_stack+0xac/0xd8)
      [   50.186408] [<c05b9dec>] (dump_stack) from [<c0122e40>] (__warn+0xd0/0x10c)
      [   50.193391] [<c0122e40>] (__warn) from [<c0123238>] (warn_slowpath_fmt+0x98/0xc4)
      [   50.200892] [<c0123238>] (warn_slowpath_fmt) from [<c012b010>] (__local_bh_enable_ip+0x100/0x184)
      [   50.209860] [<c012b010>] (__local_bh_enable_ip) from [<bf09ecbc>] (destroy_conntrack+0x48/0xd8 [nf_conntrack])
      [   50.220038] [<bf09ecbc>] (destroy_conntrack [nf_conntrack]) from [<c0ac9b58>] (nf_conntrack_destroy+0x94/0x168)
      [   50.230160] [<c0ac9b58>] (nf_conntrack_destroy) from [<c0a4aaa0>] (skb_release_head_state+0xa0/0xd0)
      [   50.239314] [<c0a4aaa0>] (skb_release_head_state) from [<c0a4aadc>] (skb_release_all+0xc/0x24)
      [   50.247946] [<c0a4aadc>] (skb_release_all) from [<c0a4b4cc>] (consume_skb+0x74/0x17c)
      [   50.255796] [<c0a4b4cc>] (consume_skb) from [<c081a2dc>] (lan743x_tx_release_desc+0x120/0x124)
      [   50.264428] [<c081a2dc>] (lan743x_tx_release_desc) from [<c081a98c>] (lan743x_tx_napi_poll+0x5c/0x18c)
      [   50.273755] [<c081a98c>] (lan743x_tx_napi_poll) from [<c0a6b050>] (net_rx_action+0x118/0x4a4)
      [   50.282306] [<c0a6b050>] (net_rx_action) from [<c0101364>] (__do_softirq+0x13c/0x53c)
      [   50.290157] [<c0101364>] (__do_softirq) from [<c012b29c>] (irq_exit+0x150/0x17c)
      [   50.297575] [<c012b29c>] (irq_exit) from [<c0196a08>] (__handle_domain_irq+0x60/0xb0)
      [   50.305423] [<c0196a08>] (__handle_domain_irq) from [<c05d44fc>] (gic_handle_irq+0x4c/0x90)
      [   50.313790] [<c05d44fc>] (gic_handle_irq) from [<c0100ed4>] (__irq_usr+0x54/0x80)
      [   50.321287] Exception stack(0xecd99fb0 to 0xecd99ff8)
      [   50.326355] 9fa0:                                     1cf1aa74 00000001 00000001 00000000
      [   50.334547] 9fc0: 00000001 00000000 00000000 00000000 00000000 00000000 00004097 b6d17d14
      [   50.342738] 9fe0: 00000001 b6d17c60 00000000 b6e71f94 800b0010 ffffffff
      [   50.349364] irq event stamp: 2525027
      [   50.352955] hardirqs last  enabled at (2525026): [<c0a6afec>] net_rx_action+0xb4/0x4a4
      [   50.360892] hardirqs last disabled at (2525027): [<c0d6d2fc>] _raw_spin_lock_irqsave+0x1c/0x50
      [   50.369517] softirqs last  enabled at (2524660): [<c01015b4>] __do_softirq+0x38c/0x53c
      [   50.377446] softirqs last disabled at (2524693): [<c012b29c>] irq_exit+0x150/0x17c
      [   50.385027] ---[ end trace c0b571db4bc8087d ]---
      
      The driver is calling dev_kfree_skb() from code inside a spinlock,
      where h/w interrupts are disabled. This is forbidden, as documented
      in include/linux/netdevice.h. The correct function to use is
      dev_kfree_skb_irq() or dev_kfree_skb_any().
      
      Fix by using the correct dev_kfree_skb_xxx() functions:
      
      in lan743x_tx_release_desc():
        called by lan743x_tx_release_completed_descriptors()
          called by lan743x_tx_napi_poll()
          which holds a spinlock
        called by lan743x_tx_release_all_descriptors()
          called by lan743x_tx_close()
          which can-sleep
      conclusion: use dev_kfree_skb_any()
      
      in lan743x_tx_xmit_frame():
        which holds a spinlock
      conclusion: use dev_kfree_skb_irq()
      
      in lan743x_tx_close():
        which can-sleep
      conclusion: use dev_kfree_skb()
      
      in lan743x_rx_release_ring_element():
        called by lan743x_rx_close()
          which can-sleep
        called by lan743x_rx_open()
          which can-sleep
      conclusion: use dev_kfree_skb()
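
The per-function reasoning above follows one rule, which can be sketched as a small user-space decision helper. The enum and function names are hypothetical and exist only for this illustration; only the three dev_kfree_skb*() names come from the kernel.

```c
#include <assert.h>

/* Contexts a freeing function may be reached from (hypothetical names). */
enum caller_ctx {
    CALLER_CAN_SLEEP = 1 << 0, /* process context, h/w interrupts enabled */
    CALLER_IRQS_OFF  = 1 << 1, /* e.g. under a spinlock taken with irqsave */
};

enum free_fn { FREE_SKB, FREE_SKB_IRQ, FREE_SKB_ANY };

/* Pick the skb-freeing variant from the set of contexts of all callers. */
static enum free_fn pick_free_fn(unsigned int callers)
{
    assert(callers != 0);
    assert((callers & ~(unsigned int)(CALLER_CAN_SLEEP | CALLER_IRQS_OFF)) == 0);

    if (callers == CALLER_CAN_SLEEP)
        return FREE_SKB;     /* every caller can sleep */
    if (callers == CALLER_IRQS_OFF)
        return FREE_SKB_IRQ; /* every caller has h/w interrupts disabled */
    return FREE_SKB_ANY;     /* mixed callers: decide at run time */
}

/* Map the choice back to the kernel function name. */
static const char *free_fn_name(enum free_fn fn)
{
    switch (fn) {
    case FREE_SKB:
        return "dev_kfree_skb";
    case FREE_SKB_IRQ:
        return "dev_kfree_skb_irq";
    default:
        return "dev_kfree_skb_any";
    }
}
```

For lan743x_tx_release_desc(), which is reached both from the NAPI poll path under a spinlock and from the close path, pick_free_fn(CALLER_CAN_SLEEP | CALLER_IRQS_OFF) yields FREE_SKB_ANY, matching the conclusion above.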
      
      Fixes: 23f0703c ("lan743x: Add main source files for new lan743x driver")
      Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
      Link: https://lore.kernel.org/r/20201112185949.11315-1-TheSven73@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • netlabel: fix an uninitialized warning in netlbl_unlabel_staticlist() · 1ba86d43
      Paul Moore authored
      Static checking revealed that a previous fix to
      netlbl_unlabel_staticlist() leaves a stack variable uninitialized;
      this patch fixes that.
      
      Fixes: 866358ec ("netlabel: fix our progress tracking in netlbl_unlabel_staticlist()")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Reviewed-by: James Morris <jamorris@linux.microsoft.com>
      Link: https://lore.kernel.org/r/160530304068.15651.18355773009751195447.stgit@sifl
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • sctp: change to hold/put transport for proto_unreach_timer · 057a10fa
      Xin Long authored
      A call trace was found in Hangbin's Codenomicon testing with debug kernel:
      
        [ 2615.981988] ODEBUG: free active (active state 0) object type: timer_list hint: sctp_generate_proto_unreach_event+0x0/0x3a0 [sctp]
        [ 2615.995050] WARNING: CPU: 17 PID: 0 at lib/debugobjects.c:328 debug_print_object+0x199/0x2b0
        [ 2616.095934] RIP: 0010:debug_print_object+0x199/0x2b0
        [ 2616.191533] Call Trace:
        [ 2616.194265]  <IRQ>
        [ 2616.202068]  debug_check_no_obj_freed+0x25e/0x3f0
        [ 2616.207336]  slab_free_freelist_hook+0xeb/0x140
        [ 2616.220971]  kfree+0xd6/0x2c0
        [ 2616.224293]  rcu_do_batch+0x3bd/0xc70
        [ 2616.243096]  rcu_core+0x8b9/0xd00
        [ 2616.256065]  __do_softirq+0x23d/0xacd
        [ 2616.260166]  irq_exit+0x236/0x2a0
        [ 2616.263879]  smp_apic_timer_interrupt+0x18d/0x620
        [ 2616.269138]  apic_timer_interrupt+0xf/0x20
        [ 2616.273711]  </IRQ>
      
      This happens because the code holds asoc when
      transport->proto_unreach_timer starts and puts asoc when the timer
      stops. Since the transport itself is not held, it can be freed
      while the timer is still running.
      
      So fix it by holding/putting transport instead for proto_unreach_timer
      in transport, just like other timers in transport.
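
The hold/put pattern can be sketched in user space as follows. This is a toy model under stated assumptions, not the sctp code: every name is hypothetical, "freed" is just a flag rather than a real kfree(), and the timer is reduced to a pending bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a refcounted object that owns a timer. */
struct toy_transport {
    int refcnt;         /* number of outstanding references */
    bool timer_pending; /* models a timer that has been armed */
    bool freed;         /* set when the last reference is dropped */
};

static void toy_hold(struct toy_transport *t)
{
    assert(!t->freed);
    t->refcnt++;
}

static void toy_put(struct toy_transport *t)
{
    assert(t->refcnt > 0);
    if (--t->refcnt == 0)
        t->freed = true; /* a real implementation would free here */
}

/* Arm the timer; take a reference only if it was not already pending. */
static void toy_timer_start(struct toy_transport *t)
{
    if (!t->timer_pending) {
        t->timer_pending = true;
        toy_hold(t);
    }
}

/* Timer handler: do the work, then drop the timer's reference. */
static void toy_timer_fire(struct toy_transport *t)
{
    assert(t->timer_pending && !t->freed);
    t->timer_pending = false;
    toy_put(t);
}

/* Cancel the timer; drop its reference only if it was still pending. */
static void toy_timer_stop(struct toy_transport *t)
{
    if (t->timer_pending) {
        t->timer_pending = false;
        toy_put(t);
    }
}
```

Because the armed timer owns a reference to the object it will touch, the owner dropping its own reference cannot free the object underneath the handler; the object goes away only after the timer fires or is stopped.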
      
      v1->v2:
        - Also use sctp_transport_put() for the "out_unlock:" path in
          sctp_generate_proto_unreach_event(), as Marcelo noticed.
      
      Fixes: 50b5d6ad ("sctp: Fix a race between ICMP protocol unreachable and connect()")
      Reported-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Link: https://lore.kernel.org/r/102788809b554958b13b95d33440f5448113b8d6.1605331373.git.lucien.xin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>