1. 10 Sep, 2020 5 commits
  2. 09 Sep, 2020 19 commits
    • David S. Miller's avatar
      Merge branch 'net-qed-disable-aRFS-in-NPAR-and-100G' · 9b29e26f
      David S. Miller authored
      Igor Russkikh says:
      
      ====================
      net: qed disable aRFS in NPAR and 100G
      
      This patchset fixes some recent issues found by customers.
      
      v3:
        resending on Dmitry's behalf
      
      v2:
        correct hash in Fixes tag
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9b29e26f
    • Dmitry Bogdanov's avatar
      net: qed: RDMA personality shouldn't fail VF load · ce1cf9e5
      Dmitry Bogdanov authored
      Fix the assert during VF driver installation when the personality is iWARP.
      
      Fixes: 1fe614d1 ("qed: Relax VF firmware requirements")
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
      Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ce1cf9e5
    • Dmitry Bogdanov's avatar
      net: qede: Disable aRFS for NPAR and 100G · 0367f058
      Dmitry Bogdanov authored
      In some configurations aRFS cannot be used, so disable it if the device
      is not capable.
      
      Fixes: e4917d46 ("qede: Add aRFS support")
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
      Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0367f058
    • Dmitry Bogdanov's avatar
      net: qed: Disable aRFS for NPAR and 100G · 2d2fe843
      Dmitry Bogdanov authored
      In CMT and NPAR the PF is unknown when the GFS block processes the
      packet. Therefore the searcher cannot be used, as it has a per-PF
      database, and thus aRFS must be disabled.
      
      Fixes: d51e4af5 ("qed: aRFS infrastructure support")
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
      Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2d2fe843
    • David S. Miller's avatar
      Merge tag 'wireless-drivers-2020-09-09' of... · a19454b6
      David S. Miller authored
      Merge tag 'wireless-drivers-2020-09-09' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for v5.9
      
      First set of fixes for v5.9, small but important.
      
      brcmfmac
      
      * fix a throughput regression on bcm4329
      
      mt76
      
      * fix a regression with stations reconnecting on mt7616
      
      * properly free tx skbs, it was working by accident before
      
      mwifiex
      
      * fix a regression with 256 bit encryption keys
      
      wlcore
      
      * revert AES CMAC support as it caused a regression
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a19454b6
    • David S. Miller's avatar
      Merge branch 'wireguard-fixes' · 99dc4a5d
      David S. Miller authored
      Jason A. Donenfeld says:
      
      ====================
      wireguard fixes for 5.9-rc5
      
      Yesterday, Eric reported a race condition found by syzbot. This series
      contains two commits, one that fixes the direct issue, and another that
      addresses the more general issue, as a defense in depth.
      
      1) The basic problem syzbot unearthed was that one particular mutation
         of handshake->entry was not protected by the handshake mutex like the
         other cases, so this patch basically just reorders a line to make
         sure the mutex is actually taken at the right point. Most of the work
         here went into making sure the race was fully understood and making a
         reproducer (which syzbot was unable to do itself, due to the rarity
         of the race).
      
      2) Eric's initial suggestion for fixing this was taking a spinlock
         around the hash table replace function where the null ptr deref was
         happening. This doesn't address the main problem in the most precise
         possible way like (1) does, but it is a good suggestion for
         defense-in-depth, in case related issues come up in the future, and
         basically costs nothing from a performance perspective. I thought it
         aided in implementing a good general rule: all mutators of that hash
         table take the table lock. So that's part of this series as a
         companion.
      
      Both of these contain Fixes: tags and are good candidates for stable.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      99dc4a5d
    • Jason A. Donenfeld's avatar
      wireguard: peerlookup: take lock before checking hash in replace operation · 6147f7b1
      Jason A. Donenfeld authored
      Eric's suggested fix for the previous commit's mentioned race condition
      was to simply take the table->lock in wg_index_hashtable_replace(). The
      table->lock of the hash table is supposed to protect the bucket heads,
      not the entries, but actually, since all the mutator functions are
      already taking it, it makes sense to take it too for the test to
      hlist_unhashed, as a defense in depth measure, so that it no longer
      races with deletions, regardless of what other locks are protecting
      individual entries. This is sensible from a performance perspective
      because, as Eric pointed out, the case of being unhashed is already the
      unlikely case, so this won't add common contention. And comparing
      instructions, this basically doesn't make much of a difference other
      than pushing and popping %r13, used by the new `bool ret`. More
      generally, I like the idea of locking consistency across table mutator
      functions, and this might let me rest slightly easier at night.
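
      The locking rule described above (every mutator, including the
      "is it still hashed?" test in replace, runs under the table lock) can
      be sketched in userspace C. This is only an illustration under stated
      assumptions: struct entry, struct table and the table_*() helpers are
      hypothetical stand-ins built on a pthread mutex, not the kernel's
      hlist/RCU primitives or the actual WireGuard code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's hlist_node or WireGuard types. */
struct entry {
	struct entry *next;
	struct entry **pprev;	/* NULL means "not in the table" */
};

struct table {
	pthread_mutex_t lock;
	struct entry *head;
};

static bool entry_unhashed(const struct entry *e)
{
	return e->pprev == NULL;
}

static void table_init(struct table *t)
{
	pthread_mutex_init(&t->lock, NULL);
	t->head = NULL;
}

static void table_insert(struct table *t, struct entry *e)
{
	pthread_mutex_lock(&t->lock);
	e->next = t->head;
	if (t->head)
		t->head->pprev = &e->next;
	e->pprev = &t->head;
	t->head = e;
	pthread_mutex_unlock(&t->lock);
}

static void table_remove(struct table *t, struct entry *e)
{
	pthread_mutex_lock(&t->lock);
	if (!entry_unhashed(e)) {
		*e->pprev = e->next;
		if (e->next)
			e->next->pprev = e->pprev;
		e->next = NULL;
		e->pprev = NULL;
	}
	pthread_mutex_unlock(&t->lock);
}

/*
 * The unhashed test happens only after the lock is taken, so a concurrent
 * table_remove() cannot clear old->pprev between the test and the swap.
 */
static bool table_replace(struct table *t, struct entry *old,
			  struct entry *repl)
{
	bool ret;

	pthread_mutex_lock(&t->lock);
	ret = !entry_unhashed(old);
	if (ret) {
		repl->next = old->next;
		repl->pprev = old->pprev;
		*repl->pprev = repl;
		if (repl->next)
			repl->next->pprev = &repl->next;
		old->next = NULL;
		old->pprev = NULL;
	}
	pthread_mutex_unlock(&t->lock);
	return ret;
}
```

      In this sketch a replace of an already-removed entry simply returns
      false instead of dereferencing a NULL pprev.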
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/wireguard/20200908145911.4090480-1-edumazet@google.com/
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6147f7b1
    • Jason A. Donenfeld's avatar
      wireguard: noise: take lock when removing handshake entry from table · 9179ba31
      Jason A. Donenfeld authored
      Eric reported that syzkaller found a race of this variety:
      
      CPU 1                                       CPU 2
      -------------------------------------------|---------------------------------------
      wg_index_hashtable_replace(old, ...)       |
        if (hlist_unhashed(&old->index_hash))    |
                                                 | wg_index_hashtable_remove(old)
                                                 |   hlist_del_init_rcu(&old->index_hash)
                                                 |     old->index_hash.pprev = NULL
        hlist_replace_rcu(&old->index_hash, ...) |
          *old->index_hash.pprev                 |
      
      Syzbot wasn't actually able to reproduce this more than once or create a
      reproducer, because the race window between checking "hlist_unhashed" and
      calling "hlist_replace_rcu" is just so small. Adding an mdelay(5) or
      similar there helps make this demonstrable using this simple script:
      
          #!/bin/bash
          set -ex
          trap 'kill $pid1; kill $pid2; ip link del wg0; ip link del wg1' EXIT
          ip link add wg0 type wireguard
          ip link add wg1 type wireguard
          wg set wg0 private-key <(wg genkey) listen-port 9999
          wg set wg1 private-key <(wg genkey) peer $(wg show wg0 public-key) endpoint 127.0.0.1:9999 persistent-keepalive 1
          wg set wg0 peer $(wg show wg1 public-key)
          ip link set wg0 up
          yes link set wg1 up | ip -force -batch - &
          pid1=$!
          yes link set wg1 down | ip -force -batch - &
          pid2=$!
          wait
      
      The fundamental underlying problem is that we permit calls to
      wg_index_hashtable_remove(handshake.entry) without requiring the caller
      to take the handshake mutex that is intended to protect members of
      handshake during mutations. The mutex is consistently taken around
      calls to wg_index_hashtable_insert(handshake.entry) and
      wg_index_hashtable_replace(handshake.entry), but it's missing from a
      pertinent callsite of wg_index_hashtable_remove(handshake.entry). So,
      this patch makes sure that mutex is taken.
      
      The original code was a little bit funky though, in the form of:
      
          remove(handshake.entry)
          lock(), memzero(handshake.some_members), unlock()
          remove(handshake.entry)
      
      The original intention of that double removal pattern outside the lock
      appears to be some attempt to prevent insertions that might happen while
      locks are dropped during expensive crypto operations, but actually, all
      callers of wg_index_hashtable_insert(handshake.entry) take the write
      lock and then explicitly check handshake.state, as they should, which
      the aforementioned memzero clears, which means an insertion should
      already be impossible. And regardless, the original intention was
      necessarily racy, since it wasn't guaranteed that something else would
      run after the unlock() instead of after the remove(). So, from a
      soundness perspective, it seems positive to remove what looks like a
      hack at best.
      
      The crash from both syzbot and from the script above is as follows:
      
        general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
        KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
        CPU: 0 PID: 7395 Comm: kworker/0:3 Not tainted 5.9.0-rc4-syzkaller #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Workqueue: wg-kex-wg1 wg_packet_handshake_receive_worker
        RIP: 0010:hlist_replace_rcu include/linux/rculist.h:505 [inline]
        RIP: 0010:wg_index_hashtable_replace+0x176/0x330 drivers/net/wireguard/peerlookup.c:174
        Code: 00 fc ff df 48 89 f9 48 c1 e9 03 80 3c 01 00 0f 85 44 01 00 00 48 b9 00 00 00 00 00 fc ff df 48 8b 45 10 48 89 c6 48 c1 ee 03 <80> 3c 0e 00 0f 85 06 01 00 00 48 85 d2 4c 89 28 74 47 e8 a3 4f b5
        RSP: 0018:ffffc90006a97bf8 EFLAGS: 00010246
        RAX: 0000000000000000 RBX: ffff888050ffc4f8 RCX: dffffc0000000000
        RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88808e04e010
        RBP: ffff88808e04e000 R08: 0000000000000001 R09: ffff8880543d0000
        R10: ffffed100a87a000 R11: 000000000000016e R12: ffff8880543d0000
        R13: ffff88808e04e008 R14: ffff888050ffc508 R15: ffff888050ffc500
        FS:  0000000000000000(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00000000f5505db0 CR3: 0000000097cf7000 CR4: 00000000001526f0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
        wg_noise_handshake_begin_session+0x752/0xc9a drivers/net/wireguard/noise.c:820
        wg_receive_handshake_packet drivers/net/wireguard/receive.c:183 [inline]
        wg_packet_handshake_receive_worker+0x33b/0x730 drivers/net/wireguard/receive.c:220
        process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
        worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
        kthread+0x3b5/0x4a0 kernel/kthread.c:292
        ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reported-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/wireguard/20200908145911.4090480-1-edumazet@google.com/
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9179ba31
    • Ye Bin's avatar
      hsr: avoid newline at end of message in NL_SET_ERR_MSG_MOD · b87f9fe1
      Ye Bin authored
      Clean up the following coccicheck warnings:
      net//hsr/hsr_netlink.c:94:8-42: WARNING avoid newline at end of message
      in NL_SET_ERR_MSG_MOD
      net//hsr/hsr_netlink.c:87:30-57: WARNING avoid newline at end of message
      in NL_SET_ERR_MSG_MOD
      net//hsr/hsr_netlink.c:79:29-53: WARNING avoid newline at end of message
      in NL_SET_ERR_MSG_MOD
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b87f9fe1
    • David S. Miller's avatar
      Merge branch 'net-skb_put_padto-fixes' · 0ddaa278
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      net: skb_put_padto() fixes
      
      syzbot reported a bug in qrtr leading to use-after-free.
      
      First patch fixes the issue.
      
      Second patch adds the __must_check attribute to avoid similar
      issues in the future.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0ddaa278
    • Eric Dumazet's avatar
      net: add __must_check to skb_put_padto() · 4a009cb0
      Eric Dumazet authored
      skb_put_padto() and __skb_put_padto() callers
      must check return values or risk use-after-free.
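
      As a rough userspace illustration of why the return value matters,
      the sketch below mimics the contract stated above: on failure the
      padding helper frees the buffer, so the caller must stop using it.
      struct buf, buf_put_padto() and enqueue_padded() are hypothetical
      names, the must_check macro is a local stand-in for the kernel's
      __must_check, and failing when the target length exceeds a fixed
      capacity is a simplification, not how the real helper decides to fail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in for the kernel's __must_check annotation. */
#define must_check __attribute__((warn_unused_result))

/* Hypothetical buffer type, not struct sk_buff. */
struct buf {
	unsigned char *data;
	size_t len;
	size_t cap;
};

static struct buf *buf_alloc(size_t len, size_t cap)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->data = malloc(cap);
	if (!b->data) {
		free(b);
		return NULL;
	}
	memset(b->data, 0xff, len);	/* pretend payload */
	b->len = len;
	b->cap = cap;
	return b;
}

/*
 * Zero-pad the buffer up to len bytes. On failure the buffer is freed and
 * -1 is returned, so the caller must not touch it afterwards.
 */
static must_check int buf_put_padto(struct buf *b, size_t len)
{
	if (b->len >= len)
		return 0;
	if (len > b->cap) {
		free(b->data);
		free(b);
		return -1;
	}
	memset(b->data + b->len, 0, len - b->len);
	b->len = len;
	return 0;
}

/* Caller pattern: bail out on error without queueing the freed buffer. */
static int enqueue_padded(struct buf *b, size_t len, struct buf **slot)
{
	if (buf_put_padto(b, len))
		return -1;	/* b is gone; do not queue or free it again */
	*slot = b;
	return 0;
}
```

      With the attribute in place, a caller that ignores the result of
      buf_put_padto() gets a compiler warning instead of a latent
      use-after-free.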
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4a009cb0
    • Eric Dumazet's avatar
      net: qrtr: check skb_put_padto() return value · 3ca1a42a
      Eric Dumazet authored
      If skb_put_padto() returns an error, skb has been freed.
      Better not touch it anymore, as reported by syzbot [1].
      
      Note to qrtr maintainers : this suggests qrtr_sendmsg()
      should adjust sock_alloc_send_skb() second parameter
      to account for the potential added alignment to avoid
      reallocation.
      
      [1]
      
      BUG: KASAN: use-after-free in __skb_insert include/linux/skbuff.h:1907 [inline]
      BUG: KASAN: use-after-free in __skb_queue_before include/linux/skbuff.h:2016 [inline]
      BUG: KASAN: use-after-free in __skb_queue_tail include/linux/skbuff.h:2049 [inline]
      BUG: KASAN: use-after-free in skb_queue_tail+0x6b/0x120 net/core/skbuff.c:3146
      Write of size 8 at addr ffff88804d8ab3c0 by task syz-executor.4/4316
      
      CPU: 1 PID: 4316 Comm: syz-executor.4 Not tainted 5.9.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1d6/0x29e lib/dump_stack.c:118
       print_address_description+0x66/0x620 mm/kasan/report.c:383
       __kasan_report mm/kasan/report.c:513 [inline]
       kasan_report+0x132/0x1d0 mm/kasan/report.c:530
       __skb_insert include/linux/skbuff.h:1907 [inline]
       __skb_queue_before include/linux/skbuff.h:2016 [inline]
       __skb_queue_tail include/linux/skbuff.h:2049 [inline]
       skb_queue_tail+0x6b/0x120 net/core/skbuff.c:3146
       qrtr_tun_send+0x1a/0x40 net/qrtr/tun.c:23
       qrtr_node_enqueue+0x44f/0xc00 net/qrtr/qrtr.c:364
       qrtr_bcast_enqueue+0xbe/0x140 net/qrtr/qrtr.c:861
       qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg net/socket.c:671 [inline]
       sock_write_iter+0x317/0x470 net/socket.c:998
       call_write_iter include/linux/fs.h:1882 [inline]
       new_sync_write fs/read_write.c:503 [inline]
       vfs_write+0xa96/0xd10 fs/read_write.c:578
       ksys_write+0x11b/0x220 fs/read_write.c:631
       do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45d5b9
      Code: 5d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f84b5b81c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000038b40 RCX: 000000000045d5b9
      RDX: 0000000000000055 RSI: 0000000020001240 RDI: 0000000000000003
      RBP: 00007f84b5b81ca0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000f
      R13: 00007ffcbbf86daf R14: 00007f84b5b829c0 R15: 000000000118cf4c
      
      Allocated by task 4316:
       kasan_save_stack mm/kasan/common.c:48 [inline]
       kasan_set_track mm/kasan/common.c:56 [inline]
       __kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461
       slab_post_alloc_hook+0x3e/0x290 mm/slab.h:518
       slab_alloc mm/slab.c:3312 [inline]
       kmem_cache_alloc+0x1c1/0x2d0 mm/slab.c:3482
       skb_clone+0x1b2/0x370 net/core/skbuff.c:1449
       qrtr_bcast_enqueue+0x6d/0x140 net/qrtr/qrtr.c:857
       qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg net/socket.c:671 [inline]
       sock_write_iter+0x317/0x470 net/socket.c:998
       call_write_iter include/linux/fs.h:1882 [inline]
       new_sync_write fs/read_write.c:503 [inline]
       vfs_write+0xa96/0xd10 fs/read_write.c:578
       ksys_write+0x11b/0x220 fs/read_write.c:631
       do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Freed by task 4316:
       kasan_save_stack mm/kasan/common.c:48 [inline]
       kasan_set_track+0x3d/0x70 mm/kasan/common.c:56
       kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355
       __kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422
       __cache_free mm/slab.c:3418 [inline]
       kmem_cache_free+0x82/0xf0 mm/slab.c:3693
       __skb_pad+0x3f5/0x5a0 net/core/skbuff.c:1823
       __skb_put_padto include/linux/skbuff.h:3233 [inline]
       skb_put_padto include/linux/skbuff.h:3252 [inline]
       qrtr_node_enqueue+0x62f/0xc00 net/qrtr/qrtr.c:360
       qrtr_bcast_enqueue+0xbe/0x140 net/qrtr/qrtr.c:861
       qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg net/socket.c:671 [inline]
       sock_write_iter+0x317/0x470 net/socket.c:998
       call_write_iter include/linux/fs.h:1882 [inline]
       new_sync_write fs/read_write.c:503 [inline]
       vfs_write+0xa96/0xd10 fs/read_write.c:578
       ksys_write+0x11b/0x220 fs/read_write.c:631
       do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The buggy address belongs to the object at ffff88804d8ab3c0
       which belongs to the cache skbuff_head_cache of size 224
      The buggy address is located 0 bytes inside of
       224-byte region [ffff88804d8ab3c0, ffff88804d8ab4a0)
      The buggy address belongs to the page:
      page:00000000ea8cccfb refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88804d8abb40 pfn:0x4d8ab
      flags: 0xfffe0000000200(slab)
      raw: 00fffe0000000200 ffffea0002237ec8 ffffea00029b3388 ffff88821bb66800
      raw: ffff88804d8abb40 ffff88804d8ab000 000000010000000b 0000000000000000
      page dumped because: kasan: bad access detected
      
      Fixes: ce57785b ("net: qrtr: fix len of skb_put_padto in qrtr_node_enqueue")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Carl Huang <cjhuang@codeaurora.org>
      Cc: Wen Gong <wgong@codeaurora.org>
      Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
      Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3ca1a42a
    • Wei Wang's avatar
      ip: fix tos reflection in ack and reset packets · ba9e04a7
      Wei Wang authored
      Currently, in tcp_v4_reqsk_send_ack() and tcp_v4_send_reset(), we
      echo the TOS value of the received packets in the response.
      However, we do not want to echo the lower 2 ECN bits in accordance
      with RFC 3168 6.1.5 robustness principles.
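
      The masking this describes amounts to clearing the two low-order
      bits of the TOS byte before reflecting it. A minimal sketch,
      assuming a hypothetical helper reply_tos(); ECN_MASK is defined
      locally with the same value (0x03) as the kernel's INET_ECN_MASK,
      and the patch's actual call sites are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Low two bits of the TOS byte carry ECN; same value as INET_ECN_MASK. */
#define ECN_MASK 0x03

/* Hypothetical helper: reflect the received TOS without its ECN bits. */
static uint8_t reply_tos(uint8_t rx_tos)
{
	return rx_tos & (uint8_t)~ECN_MASK;
}
```

      For example, a received TOS of 0xB9 (DSCP 46 with ECT(1) set) would
      be reflected as 0xB8.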
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Wei Wang <weiwan@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ba9e04a7
    • David S. Miller's avatar
      Merge tag 'ieee802154-for-davem-2020-09-08' of... · 6fd40d32
      David S. Miller authored
      Merge tag 'ieee802154-for-davem-2020-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
      
      Stefan Schmidt says:
      
      ====================
      pull-request: ieee802154 for net 2020-09-08
      
      An update from ieee802154 for your *net* tree.
      
      A potential memory leak fix for ca8210 from Liu Jian,
      a check on the return for a register read in adf7242
      and finally a use-after-free fix in the softmac tx
      function from Eric found by syzkaller.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6fd40d32
    • Jakub Kicinski's avatar
      MAINTAINERS: remove John Allen from ibmvnic · 2a154988
      Jakub Kicinski authored
      John's email has bounced and Thomas confirms he no longer
      works on ibmvnic.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2a154988
    • Brian Vazquez's avatar
      fib: fix fib_rule_ops indirect call wrappers when CONFIG_IPV6=m · 923f614c
      Brian Vazquez authored
      If CONFIG_IPV6=m, the IPV6 functions won't be found by the linker:
      
      ld: net/core/fib_rules.o: in function `fib_rules_lookup':
      fib_rules.c:(.text+0x606): undefined reference to `fib6_rule_match'
      ld: fib_rules.c:(.text+0x611): undefined reference to `fib6_rule_match'
      ld: fib_rules.c:(.text+0x68c): undefined reference to `fib6_rule_action'
      ld: fib_rules.c:(.text+0x693): undefined reference to `fib6_rule_action'
      ld: fib_rules.c:(.text+0x6aa): undefined reference to `fib6_rule_suppress'
      ld: fib_rules.c:(.text+0x6bc): undefined reference to `fib6_rule_suppress'
      make: *** [Makefile:1166: vmlinux] Error 1
      Reported-by: Sven Joachim <svenjoac@gmx.de>
      Fixes: b9aaec8f ("fib: use indirect call wrappers in the most common fib_rules_ops")
      Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
      Signed-off-by: Brian Vazquez <brianvv@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      923f614c
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 2650be2c
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ===================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Allow conntrack entries with l3num == NFPROTO_IPV4 or == NFPROTO_IPV6
         only via ctnetlink, from Will McVicker.
      
      2) Batch notifications to userspace to improve netlink socket receive
         utilization.
      
      3) Restore mark based dump filtering via ctnetlink, from Martin Willi.
      
      4) nf_conncount_init() fails with -EPROTO with CONFIG_IPV6, from
         Eelco Chaudron.
      
      5) Containers fail to match on meta skuid and skgid, use socket user_ns
         to retrieve meta skuid and skgid.
      ===================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2650be2c
    • Eric Dumazet's avatar
      ipv6: avoid lockdep issue in fib6_del() · 843d926b
      Eric Dumazet authored
      syzbot twice reported a lockdep issue in fib6_del() [1]
      which I think is caused by net->ipv6.fib6_null_entry
      having a NULL fib6_table pointer.
      
      fib6_del() already checks for the fib6_null_entry special
      case; we only need to return earlier.
      
      The bug seems to occur very rarely, so I have chosen
      a 'bug origin' that makes backports not too complex.
      
      [1]
      WARNING: suspicious RCU usage
      5.9.0-rc4-syzkaller #0 Not tainted
      -----------------------------
      net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      4 locks held by syz-executor.5/8095:
       #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401
       #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline]
       #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312
       #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613
       #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline]
       #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245
      
      stack backtrace:
      CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x198/0x1fd lib/dump_stack.c:118
       fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996
       fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180
       fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102
       fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150
       fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230
       __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246
       fib6_clean_all net/ipv6/ip6_fib.c:2257 [inline]
       fib6_run_gc+0x113/0x2d0 net/ipv6/ip6_fib.c:2320
       ndisc_netdev_event+0x217/0x350 net/ipv6/ndisc.c:1805
       notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
       call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2033
       call_netdevice_notifiers_extack net/core/dev.c:2045 [inline]
       call_netdevice_notifiers net/core/dev.c:2059 [inline]
       dev_close_many+0x30b/0x650 net/core/dev.c:1634
       rollback_registered_many+0x3a8/0x1210 net/core/dev.c:9261
       rollback_registered net/core/dev.c:9329 [inline]
       unregister_netdevice_queue+0x2dd/0x570 net/core/dev.c:10410
       unregister_netdevice include/linux/netdevice.h:2774 [inline]
       ppp_release+0x216/0x240 drivers/net/ppp/ppp_generic.c:403
       __fput+0x285/0x920 fs/file_table.c:281
       task_work_run+0xdd/0x190 kernel/task_work.c:141
       tracehook_notify_resume include/linux/tracehook.h:188 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:163 [inline]
       exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190
       syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 421842ed ("net/ipv6: Add fib6_null_entry")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      843d926b
    • Vladimir Oltean's avatar
      net: dsa: link interfaces with the DSA master to get rid of lockdep warnings · 2f1e8ea7
      Vladimir Oltean authored
      Since commit 845e0ebb ("net: change addr_list_lock back to static
      key"), cascaded DSA setups (DSA switch port as DSA master for another
      DSA switch port) are emitting this lockdep warning:
      
      ============================================
      WARNING: possible recursive locking detected
      5.8.0-rc1-00133-g923e4b5032dd-dirty #208 Not tainted
      --------------------------------------------
      dhcpcd/323 is trying to acquire lock:
      ffff000066dd4268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
      
      but task is already holding lock:
      ffff00006608c268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&dsa_master_addr_list_lock_key/1);
        lock(&dsa_master_addr_list_lock_key/1);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      3 locks held by dhcpcd/323:
       #0: ffffdbd1381dda18 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x24/0x30
       #1: ffff00006614b268 (_xmit_ETHER){+...}-{2:2}, at: dev_set_rx_mode+0x28/0x48
       #2: ffff00006608c268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
      
      stack backtrace:
      Call trace:
       dump_backtrace+0x0/0x1e0
       show_stack+0x20/0x30
       dump_stack+0xec/0x158
       __lock_acquire+0xca0/0x2398
       lock_acquire+0xe8/0x440
       _raw_spin_lock_nested+0x64/0x90
       dev_mc_sync+0x44/0x90
       dsa_slave_set_rx_mode+0x34/0x50
       __dev_set_rx_mode+0x60/0xa0
       dev_mc_sync+0x84/0x90
       dsa_slave_set_rx_mode+0x34/0x50
       __dev_set_rx_mode+0x60/0xa0
       dev_set_rx_mode+0x30/0x48
       __dev_open+0x10c/0x180
       __dev_change_flags+0x170/0x1c8
       dev_change_flags+0x2c/0x70
       devinet_ioctl+0x774/0x878
       inet_ioctl+0x348/0x3b0
       sock_do_ioctl+0x50/0x310
       sock_ioctl+0x1f8/0x580
       ksys_ioctl+0xb0/0xf0
       __arm64_sys_ioctl+0x28/0x38
       el0_svc_common.constprop.0+0x7c/0x180
       do_el0_svc+0x2c/0x98
       el0_sync_handler+0x9c/0x1b8
       el0_sync+0x158/0x180
      
      Since DSA never made use of the netdev API for describing links between
      upper devices and lower devices, the dev->lower_level value of a DSA
      switch interface would be 1, which would warn when it is a DSA master.
      
      We can use netdev_upper_dev_link() to describe the relationship between
      a DSA slave and a DSA master. To be precise, a DSA "slave" (switch port)
      is an "upper" to a DSA "master" (host port). The relationship is "many
      uppers to one lower", like in the case of VLAN. So, for that reason, we
      use the same function as VLAN uses.
      
      There might be a chance that somebody will try to take hold of this
      interface and use it immediately after register_netdev() and before
      netdev_upper_dev_link(). To avoid that, we do the registration and
      linkage while holding the RTNL, and we use the RTNL-locked cousin of
      register_netdev(), which is register_netdevice().
      
      Since this warning was not there when lockdep was using dynamic keys for
      addr_list_lock, we are blaming the lockdep patch itself. The network
      stack _has_ been using static lockdep keys before, and it _is_ likely
      that stacked DSA setups have been triggering these lockdep warnings
      since forever, however I can't test very old kernels on this particular
      stacked DSA setup, to ensure I'm not in fact introducing regressions.
      
      Fixes: 845e0ebb ("net: change addr_list_lock back to static key")
      Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2f1e8ea7
  3. 08 Sep, 2020 7 commits
    • Eric Dumazet's avatar
      mac802154: tx: fix use-after-free · 0ff4628f
      Eric Dumazet authored
      syzbot reported a bug in ieee802154_tx() [1]
      
      A similar issue in ieee802154_xmit_worker() is also fixed in this patch.
      
      [1]
      BUG: KASAN: use-after-free in ieee802154_tx+0x3d2/0x480 net/mac802154/tx.c:88
      Read of size 4 at addr ffff8880251a8c70 by task syz-executor.3/928
      
      CPU: 0 PID: 928 Comm: syz-executor.3 Not tainted 5.9.0-rc3-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x198/0x1fd lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
       __kasan_report mm/kasan/report.c:513 [inline]
       kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
       ieee802154_tx+0x3d2/0x480 net/mac802154/tx.c:88
       ieee802154_subif_start_xmit+0xbe/0xe4 net/mac802154/tx.c:130
       __netdev_start_xmit include/linux/netdevice.h:4634 [inline]
       netdev_start_xmit include/linux/netdevice.h:4648 [inline]
       dev_direct_xmit+0x4e9/0x6e0 net/core/dev.c:4203
       packet_snd net/packet/af_packet.c:2989 [inline]
       packet_sendmsg+0x2413/0x5290 net/packet/af_packet.c:3014
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:671
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2353
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2407
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2440
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45d5b9
      Code: 5d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fc98e749c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 000000000002ccc0 RCX: 000000000045d5b9
      RDX: 0000000000000000 RSI: 0000000020007780 RDI: 000000000000000b
      RBP: 000000000118d020 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118cfec
      R13: 00007fff690c720f R14: 00007fc98e74a9c0 R15: 000000000118cfec
      
      Allocated by task 928:
       kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
       kasan_set_track mm/kasan/common.c:56 [inline]
       __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:461
       slab_post_alloc_hook mm/slab.h:518 [inline]
       slab_alloc_node mm/slab.c:3254 [inline]
       kmem_cache_alloc_node+0x136/0x3e0 mm/slab.c:3574
       __alloc_skb+0x71/0x550 net/core/skbuff.c:198
       alloc_skb include/linux/skbuff.h:1094 [inline]
       alloc_skb_with_frags+0x92/0x570 net/core/skbuff.c:5771
       sock_alloc_send_pskb+0x72a/0x880 net/core/sock.c:2348
       packet_alloc_skb net/packet/af_packet.c:2837 [inline]
       packet_snd net/packet/af_packet.c:2932 [inline]
       packet_sendmsg+0x19fb/0x5290 net/packet/af_packet.c:3014
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:671
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2353
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2407
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2440
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Freed by task 928:
       kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
       kasan_set_track+0x1c/0x30 mm/kasan/common.c:56
       kasan_set_free_info+0x1b/0x30 mm/kasan/generic.c:355
       __kasan_slab_free+0xd8/0x120 mm/kasan/common.c:422
       __cache_free mm/slab.c:3418 [inline]
       kmem_cache_free.part.0+0x74/0x1e0 mm/slab.c:3693
       kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:622
       __kfree_skb net/core/skbuff.c:679 [inline]
       consume_skb net/core/skbuff.c:838 [inline]
       consume_skb+0xcf/0x160 net/core/skbuff.c:832
       __dev_kfree_skb_any+0x9c/0xc0 net/core/dev.c:3107
       fakelb_hw_xmit+0x20e/0x2a0 drivers/net/ieee802154/fakelb.c:81
       drv_xmit_async net/mac802154/driver-ops.h:16 [inline]
       ieee802154_tx+0x282/0x480 net/mac802154/tx.c:81
       ieee802154_subif_start_xmit+0xbe/0xe4 net/mac802154/tx.c:130
       __netdev_start_xmit include/linux/netdevice.h:4634 [inline]
       netdev_start_xmit include/linux/netdevice.h:4648 [inline]
       dev_direct_xmit+0x4e9/0x6e0 net/core/dev.c:4203
       packet_snd net/packet/af_packet.c:2989 [inline]
       packet_sendmsg+0x2413/0x5290 net/packet/af_packet.c:3014
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:671
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2353
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2407
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2440
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The buggy address belongs to the object at ffff8880251a8c00
       which belongs to the cache skbuff_head_cache of size 224
      The buggy address is located 112 bytes inside of
       224-byte region [ffff8880251a8c00, ffff8880251a8ce0)
      The buggy address belongs to the page:
      page:0000000062b6a4f1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x251a8
      flags: 0xfffe0000000200(slab)
      raw: 00fffe0000000200 ffffea0000435c88 ffffea00028b6c08 ffff8880a9055d00
      raw: 0000000000000000 ffff8880251a80c0 000000010000000c 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8880251a8b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880251a8b80: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff8880251a8c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                   ^
       ffff8880251a8c80: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
       ffff8880251a8d00: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
      
      Fixes: 409c3b0c ("mac802154: tx: move stats tx increment")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Alexander Aring <alex.aring@gmail.com>
      Cc: Stefan Schmidt <stefan@datenfreihafen.org>
      Cc: linux-wpan@vger.kernel.org
      Link: https://lore.kernel.org/r/20200908104025.4009085-1-edumazet@google.com
      Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
      0ff4628f
    • Pablo Neira Ayuso's avatar
      netfilter: nft_meta: use socket user_ns to retrieve skuid and skgid · 0c92411b
      Pablo Neira Ayuso authored
      ... instead of using init_user_ns.
      
      Fixes: 96518518 ("netfilter: add nftables")
      Tested-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      0c92411b
    • Eelco Chaudron's avatar
      netfilter: conntrack: nf_conncount_init is failing with IPv6 disabled · 526e81b9
      Eelco Chaudron authored
      The openvswitch module fails initialization when used in a kernel
      without IPv6 enabled. nf_conncount_init() fails because the ct code
      unconditionally tries to initialize the netns IPv6 related bit,
      regardless of the build option. The change below ignores the IPv6
      part if not enabled.
      
      Note that the corresponding _put() function already has this IPv6
      configuration check.
      
      Fixes: 11efd5cb ("openvswitch: Support conntrack zone limit")
      Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      526e81b9
    • Martin Willi's avatar
      netfilter: ctnetlink: fix mark based dump filtering regression · 6c0d95d1
      Martin Willi authored
      conntrack mark based dump filtering may falsely skip entries if a mask
      is given: If the mask-based check does not filter out the entry, the
      else-if check is always true and compares the mark without considering
      the mask. The if/else-if logic seems wrong.
      
      Given that the mask during filter setup is implicitly set to 0xffffffff
      if not specified explicitly, the mark filtering flags seem to just
      complicate things. Restore the previously used approach by always
      matching against a zero mask if no filter mark is given.
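      The masked comparison described above can be sketched in plain C. This
      is only an illustration of the idea, not the ctnetlink code: the struct
      and function names are hypothetical, and the assumption is that a filter
      with no mark stores val = 0 and mask = 0, so every entry matches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical filter record; the names are illustrative, not the kernel's. */
struct mark_filter {
	uint32_t val;	/* mark requested by userspace, 0 if none was given */
	uint32_t mask;	/* 0xffffffff if a mark came without a mask, 0 if no mark */
};

/* Returns true when an entry with this mark should be included in the dump. */
bool mark_matches(uint32_t ct_mark, const struct mark_filter *f)
{
	/* A single masked comparison covers all three cases:
	 * no filter (mask 0, val 0), exact mark, and mark with mask.
	 */
	return (ct_mark & f->mask) == f->val;
}
```

      With a zero mask and zero value the comparison is always true, so no
      separate "is a filter set" flag is needed in this sketch.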
      
      Fixes: cb8aa9a3 ("netfilter: ctnetlink: add kernel side filtering for dump")
      Signed-off-by: Martin Willi <martin@strongswan.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      6c0d95d1
    • Pablo Neira Ayuso's avatar
      netfilter: nf_tables: coalesce multiple notifications into one skbuff · 67cc570e
      Pablo Neira Ayuso authored
      On x86_64, each notification results in one skbuff allocation which
      consumes at least 768 bytes due to the skbuff overhead.
      
      This patch coalesces several notifications into one single skbuff, so
      each notification consumes at least ~211 bytes, that is ~3.5 times less
      memory consumption. As a result, this reduces the chances of exhausting
      the netlink socket receive buffer.
      
      Rule of thumb is that each notification batch only contains netlink
      messages whose report flag is the same, nfnetlink_send() requires this
      to do appropriate delivery to userspace, either via unicast (echo
      mode) or multicast (monitor mode).
      
      The skbuff control buffer is used to annotate the report flag for later
      handling at the new coalescing routine.
      
      The batch skbuff notification size is NLMSG_GOODSIZE, using a larger
      skbuff would allow for more socket receiver buffer savings (to amortize
      the cost of the skbuff even more), however, going over that size might
      break userspace applications, so let's be conservative and stick to
      NLMSG_GOODSIZE.
      
      Reported-by: Phil Sutter <phil@nwl.cc>
      Acked-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      67cc570e
    • Will McVicker's avatar
      netfilter: ctnetlink: add a range check for l3/l4 protonum · 1cc5ef91
      Will McVicker authored
      The indexes to the nf_nat_l[34]protos arrays come from userspace. So
      check the tuple's family, e.g. l3num, when creating the conntrack in
      order to prevent an OOB memory access during setup.  Here is an example
      kernel panic on 4.14.180 when userspace passes in an index greater than
      NFPROTO_NUMPROTO.
      
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      Modules linked in:...
      Process poc (pid: 5614, stack limit = 0x00000000a3933121)
      CPU: 4 PID: 5614 Comm: poc Tainted: G S      W  O    4.14.180-g051355490483
      Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM
      task: 000000002a3dfffe task.stack: 00000000a3933121
      pc : __cfi_check_fail+0x1c/0x24
      lr : __cfi_check_fail+0x1c/0x24
      ...
      Call trace:
      __cfi_check_fail+0x1c/0x24
      name_to_dev_t+0x0/0x468
      nfnetlink_parse_nat_setup+0x234/0x258
      ctnetlink_parse_nat_setup+0x4c/0x228
      ctnetlink_new_conntrack+0x590/0xc40
      nfnetlink_rcv_msg+0x31c/0x4d4
      netlink_rcv_skb+0x100/0x184
      nfnetlink_rcv+0xf4/0x180
      netlink_unicast+0x360/0x770
      netlink_sendmsg+0x5a0/0x6a4
      ___sys_sendmsg+0x314/0x46c
      SyS_sendmsg+0xb4/0x108
      el0_svc_naked+0x34/0x38
      
      This crash no longer happens on 5.4 and later; however, ctnetlink still
      allows creating entries with an unsupported layer 3 protocol number.
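      As a rough illustration of the kind of range check discussed here, the
      sketch below validates a userspace-supplied index before it is used to
      index a table. The table, its size (EXAMPLE_NUMPROTO) and the function
      name are hypothetical, and the error code is chosen for illustration;
      this is not the check as it appears in the patch.

```c
#include <errno.h>

/* Hypothetical per-family table, indexed by a number supplied by userspace. */
#define EXAMPLE_NUMPROTO 4
static const int example_l3_table[EXAMPLE_NUMPROTO] = { 10, 20, 30, 40 };

/* Stores the table entry for l3num in *out, or rejects an out-of-range index. */
int example_lookup(unsigned int l3num, int *out)
{
	/* Reject the index before it can be used for an out-of-bounds read. */
	if (l3num >= EXAMPLE_NUMPROTO)
		return -EOPNOTSUPP;

	*out = example_l3_table[l3num];
	return 0;
}
```

      Doing the check at the point where the value enters from userspace keeps
      every later array access within bounds without further tests.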
      
      Fixes: c1d10adb ("[NETFILTER]: Add ctnetlink port for nf_conntrack")
      Signed-off-by: Will McVicker <willmcvicker@google.com>
      [pablo@netfilter.org: rebased original patch on top of nf.git]
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      1cc5ef91
    • Dexuan Cui's avatar
      hv_netvsc: Fix hibernation for mlx5 VF driver · 19162fd4
      Dexuan Cui authored
      mlx5_suspend()/resume() keep the network interface, so during hibernation
      netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence
      netvsc_resume() should call netvsc_vf_changed() to switch the data path
      back to the VF after hibernation. Note: after we close and re-open the
      vmbus channel of the netvsc NIC in netvsc_suspend() and netvsc_resume(),
      the data path is implicitly switched to the netvsc NIC. Similarly,
      netvsc_suspend() should not call netvsc_unregister_vf(), otherwise the VF
      can no longer be used after hibernation.
      
      For mlx4, since the VF network interface is explicitly destroyed and
      re-created during hibernation (see mlx4_suspend()/resume()), hv_netvsc
      already explicitly switches the data path from and to the VF automatically
      via netvsc_register_vf() and netvsc_unregister_vf(), so mlx4 doesn't need
      this fix. Note: mlx4 can still work with the fix because in
      netvsc_suspend()/resume() ndev_ctx->vf_netdev is NULL for mlx4.
      
      Fixes: 0efeea5f ("hv_netvsc: Add the support of hibernation")
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      19162fd4
  4. 07 Sep, 2020 8 commits
    • Taehee Yoo's avatar
      Revert "netns: don't disable BHs when locking "nsid_lock"" · e1f469cd
      Taehee Yoo authored
      This reverts commit 8d7e5dee.
      
      To protect netns id, the nsid_lock is used when netns id is being
      allocated and removed by peernet2id_alloc() and unhash_nsid().
      The nsid_lock can be used in BH context but only spin_lock() is used
      in this code.
      Using spin_lock() instead of spin_lock_bh() can result in a deadlock in
      the following scenario reported by lockdep.
      In order to avoid a deadlock, the spin_lock_bh() should be used instead
      of spin_lock() to acquire nsid_lock.
      
      Test commands:
          ip netns del nst
          ip netns add nst
          ip link add veth1 type veth peer name veth2
          ip link set veth1 netns nst
          ip netns exec nst ip link add name br1 type bridge vlan_filtering 1
          ip netns exec nst ip link set dev br1 up
          ip netns exec nst ip link set dev veth1 master br1
          ip netns exec nst ip link set dev veth1 up
          ip netns exec nst ip link add macvlan0 link br1 up type macvlan
      
      Splat looks like:
      [   33.615860][  T607] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
      [   33.617194][  T607] 5.9.0-rc1+ #665 Not tainted
      [ ... ]
      [   33.670615][  T607] Chain exists of:
      [   33.670615][  T607]   &mc->mca_lock --> &bridge_netdev_addr_lock_key --> &net->nsid_lock
      [   33.670615][  T607]
      [   33.673118][  T607]  Possible interrupt unsafe locking scenario:
      [   33.673118][  T607]
      [   33.674599][  T607]        CPU0                    CPU1
      [   33.675557][  T607]        ----                    ----
      [   33.676516][  T607]   lock(&net->nsid_lock);
      [   33.677306][  T607]                                local_irq_disable();
      [   33.678517][  T607]                                lock(&mc->mca_lock);
      [   33.679725][  T607]                                lock(&bridge_netdev_addr_lock_key);
      [   33.681166][  T607]   <Interrupt>
      [   33.681791][  T607]     lock(&mc->mca_lock);
      [   33.682579][  T607]
      [   33.682579][  T607]  *** DEADLOCK ***
      [ ... ]
      [   33.922046][  T607] stack backtrace:
      [   33.922999][  T607] CPU: 3 PID: 607 Comm: ip Not tainted 5.9.0-rc1+ #665
      [   33.924099][  T607] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [   33.925714][  T607] Call Trace:
      [   33.926238][  T607]  dump_stack+0x78/0xab
      [   33.926905][  T607]  check_irq_usage+0x70b/0x720
      [   33.927708][  T607]  ? iterate_chain_key+0x60/0x60
      [   33.928507][  T607]  ? check_path+0x22/0x40
      [   33.929201][  T607]  ? check_noncircular+0xcf/0x180
      [   33.930024][  T607]  ? __lock_acquire+0x1952/0x1f20
      [   33.930860][  T607]  __lock_acquire+0x1952/0x1f20
      [   33.931667][  T607]  lock_acquire+0xaf/0x3a0
      [   33.932366][  T607]  ? peernet2id_alloc+0x3a/0x170
      [   33.933147][  T607]  ? br_port_fill_attrs+0x54c/0x6b0 [bridge]
      [   33.934140][  T607]  ? br_port_fill_attrs+0x5de/0x6b0 [bridge]
      [   33.935113][  T607]  ? kvm_sched_clock_read+0x14/0x30
      [   33.935974][  T607]  _raw_spin_lock+0x30/0x70
      [   33.936728][  T607]  ? peernet2id_alloc+0x3a/0x170
      [   33.937523][  T607]  peernet2id_alloc+0x3a/0x170
      [   33.938313][  T607]  rtnl_fill_ifinfo+0xb5e/0x1400
      [   33.939091][  T607]  rtmsg_ifinfo_build_skb+0x8a/0xf0
      [   33.939953][  T607]  rtmsg_ifinfo_event.part.39+0x17/0x50
      [   33.940863][  T607]  rtmsg_ifinfo+0x1f/0x30
      [   33.941571][  T607]  __dev_notify_flags+0xa5/0xf0
      [   33.942376][  T607]  ? __irq_work_queue_local+0x49/0x50
      [   33.943249][  T607]  ? irq_work_queue+0x1d/0x30
      [   33.943993][  T607]  ? __dev_set_promiscuity+0x7b/0x1a0
      [   33.944878][  T607]  __dev_set_promiscuity+0x7b/0x1a0
      [   33.945758][  T607]  dev_set_promiscuity+0x1e/0x50
      [   33.946582][  T607]  br_port_set_promisc+0x1f/0x40 [bridge]
      [   33.947487][  T607]  br_manage_promisc+0x8b/0xe0 [bridge]
      [   33.948388][  T607]  __dev_set_promiscuity+0x123/0x1a0
      [   33.949244][  T607]  __dev_set_rx_mode+0x68/0x90
      [   33.950021][  T607]  dev_uc_add+0x50/0x60
      [   33.950720][  T607]  macvlan_open+0x18e/0x1f0 [macvlan]
      [   33.951601][  T607]  __dev_open+0xd6/0x170
      [   33.952269][  T607]  __dev_change_flags+0x181/0x1d0
      [   33.953056][  T607]  rtnl_configure_link+0x2f/0xa0
      [   33.953884][  T607]  __rtnl_newlink+0x6b9/0x8e0
      [   33.954665][  T607]  ? __lock_acquire+0x95d/0x1f20
      [   33.955450][  T607]  ? lock_acquire+0xaf/0x3a0
      [   33.956193][  T607]  ? is_bpf_text_address+0x5/0xe0
      [   33.956999][  T607]  rtnl_newlink+0x47/0x70
      
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Fixes: 8d7e5dee ("netns: don't disable BHs when locking "nsid_lock"")
      Reported-by: syzbot+3f960c64a104eaa2c813@syzkaller.appspotmail.com
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e1f469cd
    • Jakub Kicinski's avatar
      ibmvnic: add missing parenthesis in do_reset() · 8ae4dff8
      Jakub Kicinski authored
      Indentation and logic clearly show that this code is missing
      parenthesis.
      
      Fixes: 9f134573 ("ibmvnic fix NULL tx_pools and rx_tools issue at do_reset")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8ae4dff8
    • Randy Dunlap's avatar
      netdevice.h: fix xdp_state kernel-doc warning · ffa59b0b
      Randy Dunlap authored
      Fix kernel-doc warning in <linux/netdevice.h>:
      
      ../include/linux/netdevice.h:2158: warning: Function parameter or member 'xdp_state' not described in 'net_device'
      
      Fixes: 7f0a8382 ("bpf, xdp: Maintain info on attached XDP BPF programs in net_device")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ffa59b0b
    • Randy Dunlap's avatar
      netdevice.h: fix proto_down_reason kernel-doc warning · eb02d39a
      Randy Dunlap authored
      Fix kernel-doc warning in <linux/netdevice.h>:
      
      ../include/linux/netdevice.h:2158: warning: Function parameter or member 'proto_down_reason' not described in 'net_device'
      
      Fixes: 829eb208 ("rtnetlink: add support for protodown reason")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      eb02d39a
    • Jakub Kicinski's avatar
      Merge branch 'bnxt_en-Two-bug-fixes' · 72bbee2a
      Jakub Kicinski authored
      Michael Chan says:
      
      ====================
      bnxt_en: Two bug fixes.
      
      The first patch fixes AER recovery by reducing the time from several
      minutes to a more reasonable 20 - 30 seconds.  The second patch fixes
      a possible NULL pointer crash during firmware reset.
      ====================
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      72bbee2a
    • Vasundhara Volam's avatar
      bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task() · b16939b5
      Vasundhara Volam authored
      bnxt_fw_reset_task(), which runs from a workqueue, can race with
      bnxt_remove_one(), for example when a firmware reset and a VF FLR
      happen at about the same time.
      
      bnxt_remove_one() already cancels the workqueue and waits for it
      to finish, but we need to do this earlier before the devlink
      reporters are destroyed.  This will guarantee that
      the devlink reporters will always be valid when bnxt_fw_reset_task()
      is still running.
      
      Fixes: b148bb23 ("bnxt_en: Fix possible crash in bnxt_fw_reset_task().")
      Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b16939b5
    • Vasundhara Volam's avatar
      bnxt_en: Avoid sending firmware messages when AER error is detected. · b340dc68
      Vasundhara Volam authored
      When the driver goes through PCIe AER reset in error state, all
      firmware messages will time out because the PCIe bus is no longer
      accessible.  This can lead to AER reset taking many minutes to
      complete as each firmware command takes time to time out.
      
      Define a new macro BNXT_NO_FW_ACCESS() to skip these firmware messages
      when either firmware is in fatal error state or when
      pci_channel_offline() is true.  It now takes a more reasonable 20 to
      30 seconds to complete AER recovery.
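      The early-return idea can be sketched in userspace C as follows. This
      is a simplified model, not the driver code: the struct, its fields, the
      EXAMPLE_NO_FW_ACCESS() macro and the -ENODEV return value are all
      assumptions made for illustration, with a boolean standing in for
      pci_channel_offline().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state; field names are illustrative only. */
struct example_dev {
	bool fw_fatal;			/* firmware is in a fatal error state */
	bool pci_channel_offline;	/* stand-in for pci_channel_offline() */
	int fw_msgs_sent;		/* counts messages actually issued */
};

/* True when talking to firmware would only end in a timeout. */
#define EXAMPLE_NO_FW_ACCESS(d) ((d)->fw_fatal || (d)->pci_channel_offline)

/* Sends one firmware message unless firmware is known to be unreachable. */
int example_send_fw_msg(struct example_dev *d)
{
	/* Fail fast instead of waiting for each command to time out. */
	if (EXAMPLE_NO_FW_ACCESS(d))
		return -ENODEV;

	d->fw_msgs_sent++;
	return 0;
}
```

      Because the check sits at the single entry point for sending messages,
      every caller skips the wait without needing its own test.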
      
      Fixes: b4fff207 ("bnxt_en: Do not send firmware messages if firmware is in error state.")
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b340dc68
    • Mauro Carvalho Chehab's avatar
      Revert "wlcore: Adding suppoprt for IGTK key in wlcore driver" · 1264c1e0
      Mauro Carvalho Chehab authored
      This patch causes a regression between Kernel 5.7 and 5.8 at wlcore:
      with it applied, WiFi stops working, and the Kernel starts printing
      this message every second:
      
         wlcore: PHY firmware version: Rev 8.2.0.0.242
         wlcore: firmware booted (Rev 8.9.0.0.79)
         wlcore: ERROR command execute failure 14
         ------------[ cut here ]------------
         WARNING: CPU: 0 PID: 133 at drivers/net/wireless/ti/wlcore/main.c:795 wl12xx_queue_recovery_work.part.0+0x6c/0x74 [wlcore]
         Modules linked in: wl18xx wlcore mac80211 libarc4 cfg80211 rfkill snd_soc_hdmi_codec crct10dif_ce wlcore_sdio adv7511 cec kirin9xx_drm(C) kirin9xx_dw_drm_dsi(C) drm_kms_helper drm ip_tables x_tables ipv6 nf_defrag_ipv6
         CPU: 0 PID: 133 Comm: kworker/0:1 Tainted: G        WC        5.8.0+ #186
         Hardware name: HiKey970 (DT)
         Workqueue: events_freezable ieee80211_restart_work [mac80211]
         pstate: 60000005 (nZCv daif -PAN -UAO BTYPE=--)
         pc : wl12xx_queue_recovery_work.part.0+0x6c/0x74 [wlcore]
         lr : wl12xx_queue_recovery_work+0x24/0x30 [wlcore]
         sp : ffff8000126c3a60
         x29: ffff8000126c3a60 x28: 00000000000025de
         x27: 0000000000000010 x26: 0000000000000005
         x25: ffff0001a5d49e80 x24: ffff8000092cf580
         x23: ffff0001b7c12623 x22: ffff0001b6fcf2e8
         x21: ffff0001b7e46200 x20: 00000000fffffffb
         x19: ffff0001a78e6400 x18: 0000000000000030
         x17: 0000000000000001 x16: 0000000000000001
         x15: ffff0001b7e46670 x14: ffffffffffffffff
         x13: ffff8000926c37d7 x12: ffff8000126c37e0
         x11: ffff800011e01000 x10: ffff8000120526d0
         x9 : 0000000000000000 x8 : 3431206572756c69
         x7 : 6166206574756365 x6 : 0000000000000c2c
         x5 : 0000000000000000 x4 : ffff0001bf1361e8
         x3 : ffff0001bf1790b0 x2 : 0000000000000000
         x1 : ffff0001a5d49e80 x0 : 0000000000000001
         Call trace:
          wl12xx_queue_recovery_work.part.0+0x6c/0x74 [wlcore]
          wl12xx_queue_recovery_work+0x24/0x30 [wlcore]
          wl1271_cmd_set_sta_key+0x258/0x25c [wlcore]
          wl1271_set_key+0x7c/0x2dc [wlcore]
          wlcore_set_key+0xe4/0x360 [wlcore]
          wl18xx_set_key+0x48/0x1d0 [wl18xx]
          wlcore_op_set_key+0xa4/0x180 [wlcore]
          ieee80211_key_enable_hw_accel+0xb0/0x2d0 [mac80211]
          ieee80211_reenable_keys+0x70/0x110 [mac80211]
          ieee80211_reconfig+0xa00/0xca0 [mac80211]
          ieee80211_restart_work+0xc4/0xfc [mac80211]
          process_one_work+0x1cc/0x350
          worker_thread+0x13c/0x470
          kthread+0x154/0x160
          ret_from_fork+0x10/0x30
         ---[ end trace b1f722abf9af5919 ]---
         wlcore: WARNING could not set keys
         wlcore: ERROR Could not add or replace key
         wlan0: failed to set key (4, ff:ff:ff:ff:ff:ff) to hardware (-5)
         wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.79
         wlcore: pc: 0x0, hint_sts: 0x00000040 count: 39
         wlcore: down
         wlcore: down
         ieee80211 phy0: Hardware restart was requested
         mmc_host mmc0: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
         mmc_host mmc0: Bus speed (slot 0) = 25000000Hz (slot req 25000000Hz, actual 25000000HZ div = 0)
         wlcore: PHY firmware version: Rev 8.2.0.0.242
         wlcore: firmware booted (Rev 8.9.0.0.79)
         wlcore: ERROR command execute failure 14
         ------------[ cut here ]------------
      
      Tested on Hikey 970.
      
      This reverts commit 2b7aadd3.
      
      Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Link: https://lore.kernel.org/r/f0a2cb7ea606f1a284d4c23cbf983da2954ce9b6.1598420968.git.mchehab+huawei@kernel.org
      1264c1e0
  5. 06 Sep, 2020 1 commit