1. 13 Jul, 2022 5 commits
  2. 12 Jul, 2022 2 commits
  3. 11 Jul, 2022 4 commits
  4. 09 Jul, 2022 7 commits
    • netfilter: nf_tables: replace BUG_ON by element length check · c39ba4de
      Pablo Neira Ayuso authored
      BUG_ON can be triggered from userspace with an element with a large
      userdata area. Replace it with a length check and return EINVAL instead.
      Over time extensions have been growing in size.
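
      A minimal userspace sketch of this kind of check, not the kernel
      code itself. It assumes the extension area is described by 8-bit
      offsets, which the commit message does not state; struct ext_tmpl
      and ext_tmpl_add() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical template tracking the running length of an element's
 * extension area. Assumption: offsets into it must fit in 8 bits.
 */
struct ext_tmpl {
	unsigned int len;
};

/* Reserve `size` bytes for one extension (for example, userdata).
 * Returns 0 on success, or -EINVAL when the area would no longer be
 * describable with an 8-bit offset. An oversized request from
 * userspace is rejected instead of crashing.
 */
static int ext_tmpl_add(struct ext_tmpl *tmpl, unsigned int size)
{
	/* Checking `size` first keeps the addition from overflowing. */
	if (size > UINT8_MAX || tmpl->len + size > UINT8_MAX)
		return -EINVAL;

	tmpl->len += size;
	return 0;
}
```

      The caller propagates the error to userspace, so a template that
      grows too large only fails the request.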
      
      Pick a sufficiently old Fixes: tag to propagate this fix.
      
      Fixes: 7d740264 ("netfilter: nf_tables: variable sized set element keys / data")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • vlan: fix memory leak in vlan_newlink() · 72a0b329
      Eric Dumazet authored
      The blamed commit reintroduced a bug I fixed in commit 9bbd917e
      ("vlan: fix memory leak in vlan_dev_set_egress_priority")
      
      If a memory allocation fails in vlan_changelink() after other allocations
      succeeded, we need to call vlan_dev_free_egress_priority()
      to free all allocated memory because after a failed ->newlink()
      we do not call any methods like ndo_uninit() or dev->priv_destructor().
      
      In the following example, if the allocation for the last element
      2000:2001 fails, we need to free the eight prior allocations:
      
      ip link add link dummy0 dummy0.100 type vlan id 100 \
      	egress-qos-map 1:2 2:3 3:4 4:5 5:6 6:7 7:8 8:9 2000:2001
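
      The cleanup pattern described above can be sketched in userspace C.
      This is an illustration under stated assumptions, not the kernel
      code: struct dev_priv, set_egress_priority(), free_egress_priority(),
      dev_newlink() and the fail_after/live_allocs test hooks are all
      hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* One egress priority mapping, kept in a singly linked list. */
struct prio_map {
	unsigned int from, to;
	struct prio_map *next;
};

struct dev_priv {
	struct prio_map *egress;
};

/* Test hooks (hypothetical): fail the allocation once `fail_after`
 * reaches 0; a negative value never fails. `live_allocs` counts
 * mappings that are currently allocated.
 */
static int fail_after = -1;
static int live_allocs;

static void *alloc_hook(size_t size)
{
	void *p;

	if (fail_after == 0)
		return NULL;
	if (fail_after > 0)
		fail_after--;
	p = malloc(size);
	if (p)
		live_allocs++;
	return p;
}

static int set_egress_priority(struct dev_priv *d, unsigned int from,
			       unsigned int to)
{
	struct prio_map *m = alloc_hook(sizeof(*m));

	if (!m)
		return -ENOBUFS;
	m->from = from;
	m->to = to;
	m->next = d->egress;
	d->egress = m;
	return 0;
}

static void free_egress_priority(struct dev_priv *d)
{
	while (d->egress) {
		struct prio_map *m = d->egress;

		d->egress = m->next;
		free(m);
		live_allocs--;
	}
}

/* Creation path: no destructor runs after a failed create, so the
 * error path itself must release every mapping allocated so far.
 */
static int dev_newlink(struct dev_priv *d, const unsigned int (*maps)[2],
		       size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int err = set_egress_priority(d, maps[i][0], maps[i][1]);

		if (err) {
			free_egress_priority(d);
			return err;
		}
	}
	return 0;
}
```

      With nine mappings and the ninth allocation forced to fail, the
      eight earlier mappings are released before the error is returned.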
      
      syzbot report was:
      
      BUG: memory leak
      unreferenced object 0xffff888117bd1060 (size 32):
      comm "syz-executor408", pid 3759, jiffies 4294956555 (age 34.090s)
      hex dump (first 32 bytes):
      09 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 ................
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
      backtrace:
      [<ffffffff83fc60ad>] kmalloc include/linux/slab.h:600 [inline]
      [<ffffffff83fc60ad>] vlan_dev_set_egress_priority+0xed/0x170 net/8021q/vlan_dev.c:193
      [<ffffffff83fc6628>] vlan_changelink+0x178/0x1d0 net/8021q/vlan_netlink.c:128
      [<ffffffff83fc67c8>] vlan_newlink+0x148/0x260 net/8021q/vlan_netlink.c:185
      [<ffffffff838b1278>] rtnl_newlink_create net/core/rtnetlink.c:3363 [inline]
      [<ffffffff838b1278>] __rtnl_newlink+0xa58/0xdc0 net/core/rtnetlink.c:3580
      [<ffffffff838b1629>] rtnl_newlink+0x49/0x70 net/core/rtnetlink.c:3593
      [<ffffffff838ac66c>] rtnetlink_rcv_msg+0x21c/0x5c0 net/core/rtnetlink.c:6089
      [<ffffffff839f9c37>] netlink_rcv_skb+0x87/0x1d0 net/netlink/af_netlink.c:2501
      [<ffffffff839f8da7>] netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
      [<ffffffff839f8da7>] netlink_unicast+0x397/0x4c0 net/netlink/af_netlink.c:1345
      [<ffffffff839f9266>] netlink_sendmsg+0x396/0x710 net/netlink/af_netlink.c:1921
      [<ffffffff8384dbf6>] sock_sendmsg_nosec net/socket.c:714 [inline]
      [<ffffffff8384dbf6>] sock_sendmsg+0x56/0x80 net/socket.c:734
      [<ffffffff8384e15c>] ____sys_sendmsg+0x36c/0x390 net/socket.c:2488
      [<ffffffff838523cb>] ___sys_sendmsg+0x8b/0xd0 net/socket.c:2542
      [<ffffffff838525b8>] __sys_sendmsg net/socket.c:2571 [inline]
      [<ffffffff838525b8>] __do_sys_sendmsg net/socket.c:2580 [inline]
      [<ffffffff838525b8>] __se_sys_sendmsg net/socket.c:2578 [inline]
      [<ffffffff838525b8>] __x64_sys_sendmsg+0x78/0xf0 net/socket.c:2578
      [<ffffffff845ad8d5>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      [<ffffffff845ad8d5>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
      [<ffffffff8460006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      Fixes: 37aa50c5 ("vlan: introduce vlan_dev_free_egress_priority")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Xin Long <lucien.xin@gmail.com>
      Reviewed-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: fix issue of skb segments exceeds descriptor limitation · 9c840d5f
      Baowen Zheng authored
      TCP packets are dropped if the number of segments in the tx skb
      exceeds the descriptor limit when sending iperf3 traffic with the
      --zerocopy option.

      Make the following changes:

      Get nr_frags inside nfp_nfdk_tx_maybe_close_block() instead of passing
      it in from outside, because it changes after the skb_linearize()
      operation.

      Fill the maximum dma_len in the first tx descriptor to make sure the
      whole head is included in the first descriptor.
      
      Fixes: c10d12e3 ("nfp: add support for NFDK data path")
      Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
      Reviewed-by: Louis Peens <louis.peens@corigine.com>
      Signed-off-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: nf_log: incorrect offset to network header · 7a847c00
      Pablo Neira Ayuso authored
      NFPROTO_ARP is expecting to find the ARP header at the network offset.
      
      In the particular case of ARP, the HTYPE= field shows the initial bytes
      of the ethernet header destination MAC address.
      
       netdev out: IN= OUT=bridge0 MACSRC=c2:76:e5:71:e1:de MACDST=36:b0:4a:e2:72:ea MACPROTO=0806 ARP HTYPE=14000 PTYPE=0x4ae2 OPCODE=49782
      
      The NFPROTO_NETDEV egress hook also expects to find the IP headers at
      the network offset.
      
      Fixes: 35b93951 ("netfilter: add generic ARP packet logger")
      Reported-by: Tom Yan <tom.ty89@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Merge branch 'selftests-forwarding-install-two-missing-tests' · 6676d727
      Jakub Kicinski authored
      Martin Blumenstingl says:
      
      ====================
      selftests: forwarding: Install two missing tests
      
      For some distributions (e.g. OpenWrt) we don't want to rely on rsync
      to copy the tests to the target as some extra dependencies need to be
      installed. The Makefile in tools/testing/selftests/net/forwarding
      already installs most of the tests.
      
      This series adds the two missing tests to the list of installed tests.
      That way a downstream distribution can build a package using this
      Makefile (and add dependencies there as needed).
      ====================
      
      Link: https://lore.kernel.org/r/20220707135532.1783925-1-martin.blumenstingl@googlemail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: forwarding: Install no_forwarding.sh · cfbba7b4
      Martin Blumenstingl authored
      When using the Makefile from tools/testing/selftests/net/forwarding/
      all tests should be installed. Add no_forwarding.sh to the list of
      "to be installed tests" where it has been missing so far.
      
      Fixes: 476a4f05 ("selftests: forwarding: add a no_forwarding.sh test")
      Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: forwarding: Install local_termination.sh · 437ac259
      Martin Blumenstingl authored
      When using the Makefile from tools/testing/selftests/net/forwarding/
      all tests should be installed. Add local_termination.sh to the list of
      "to be installed tests" where it has been missing so far.
      
      Fixes: 90b9566a ("selftests: forwarding: add a test for local_termination.sh")
      Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  5. 08 Jul, 2022 20 commits
  6. 07 Jul, 2022 2 commits
    • netfilter: conntrack: fix crash due to confirmed bit load reordering · 0ed8f619
      Florian Westphal authored
      Kajetan Puchalski reports a crash on ARM, with this backtrace:
      
      __nf_ct_delete_from_lists
      nf_ct_delete
      early_drop
      __nf_conntrack_alloc
      
      Unlike atomic_inc_not_zero, refcount_inc_not_zero is not a full barrier.
      conntrack uses SLAB_TYPESAFE_BY_RCU, i.e. it is possible that a 'newly'
      allocated object is still in use on another CPU:
      
      CPU1						CPU2
      						encounter 'ct' during hlist walk
       delete_from_lists
       refcount drops to 0
       kmem_cache_free(ct);
       __nf_conntrack_alloc() // returns same object
      						refcount_inc_not_zero(ct); /* might fail */
      
      						/* If set, ct is public/in the hash table */
      						test_bit(IPS_CONFIRMED_BIT, &ct->status);
      
      In case CPU1 already set refcount back to 1, refcount_inc_not_zero()
      will succeed.
      
      The expected possibilities for a CPU that obtained the object 'ct'
      (but no reference so far) are:
      
      1. refcount_inc_not_zero() fails.  CPU2 ignores the object and moves to
         the next entry in the list.  This happens for objects that are about
         to be free'd, that have been free'd, or that have been reallocated
         by __nf_conntrack_alloc(), but where the refcount has not been
         increased back to 1 yet.
      
      2. refcount_inc_not_zero() succeeds. CPU2 checks the CONFIRMED bit
         in ct->status.  If set, the object is public/in the table.
      
         If not, the object must be skipped; CPU2 calls nf_ct_put() to
         un-do the refcount increment and moves to the next object.
      
      Parallel deletion from the hlists is prevented by a
      'test_and_set_bit(IPS_DYING_BIT, &ct->status);' check, i.e. only one
      cpu will do the unlink, the other one will only drop its reference count.
      
      Because refcount_inc_not_zero is not a full barrier, CPU2 may try to
      delete an object that is not on any list:
      
      1. refcount_inc_not_zero() successful (refcount inited to 1 on other CPU)
      2. CONFIRMED test also successful (load was reordered or zeroing
         of ct->status not yet visible)
      3. delete_from_lists unlinks entry not on the hlist, because
         IPS_DYING_BIT is 0 (already cleared).
      
      2) is already wrong: CPU2 will handle a partially initialised object
      that is supposed to be private to CPU1.
      
      Add needed barriers when refcount_inc_not_zero() is successful.
      
      This patch also inserts an smp_wmb() before the refcount is set to 1
      during allocation.

      Because another CPU might still see the object, refcount_set(1)
      "resurrects" it, so we need to make sure that other CPUs will also observe
      the right content.  In particular, the CONFIRMED bit test must only pass
      once the object is fully initialised and either in the hash or about to be
      inserted (with locks held to delay possible unlink from early_drop or
      gc worker).
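
      The ordering requirement can be sketched with C11 atomics in
      userspace. This is an analogy, not the kernel code: struct obj and
      every function name below are hypothetical. The release store
      stands in for smp_wmb() followed by refcount_set(1), and the acquire
      fence stands in for smp_acquire__after_ctrl_dep().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define OBJ_CONFIRMED 0x1u

struct obj {
	atomic_uint refcnt;	/* 0: free, or reallocated but not yet published */
	atomic_uint status;
};

/* Allocation side: initialise the object, then publish refcount 1 with
 * release ordering so the zeroed status is visible to any CPU that
 * observes the nonzero refcount.
 */
static void obj_init_and_publish(struct obj *o)
{
	atomic_store_explicit(&o->status, 0, memory_order_relaxed);
	atomic_store_explicit(&o->refcnt, 1, memory_order_release);
}

/* Mark the object as public, for example once it is in the hash table. */
static void obj_confirm(struct obj *o)
{
	atomic_fetch_or_explicit(&o->status, OBJ_CONFIRMED,
				 memory_order_release);
}

/* Increment unless zero. On its own this imposes no ordering on the
 * loads that follow it.
 */
static bool obj_inc_not_zero(struct obj *o)
{
	unsigned int old = atomic_load_explicit(&o->refcnt,
						memory_order_relaxed);

	do {
		if (old == 0)
			return false;
	} while (!atomic_compare_exchange_weak_explicit(&o->refcnt, &old,
							old + 1,
							memory_order_relaxed,
							memory_order_relaxed));
	return true;
}

static void obj_put(struct obj *o)
{
	atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_release);
}

/* Lookup side: take a reference, then order the status load after the
 * successful increment before testing the CONFIRMED bit.
 */
static bool obj_tryget_confirmed(struct obj *o)
{
	if (!obj_inc_not_zero(o))
		return false;	/* case 1: skip, move to the next entry */

	atomic_thread_fence(memory_order_acquire);

	if (atomic_load_explicit(&o->status, memory_order_relaxed) &
	    OBJ_CONFIRMED)
		return true;	/* case 2: object is public */

	obj_put(o);		/* not confirmed: undo the increment */
	return false;
}
```

      Without the fence, the status load could be satisfied before the
      increment is observed, which is the reordering described above.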
      
      I did not change flow_offload_alloc(), as far as I can see it should call
      refcount_inc(), not refcount_inc_not_zero(): the ct object is attached to
      the skb so its refcount should be >= 1 in all cases.
      
      v2: prefer smp_acquire__after_ctrl_dep to smp_rmb (Will Deacon).
      v3: keep smp_acquire__after_ctrl_dep close to refcount_inc_not_zero call
          add comment in nf_conntrack_netlink, no control dependency there
          due to locks.
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/all/Yr7WTfd6AVTQkLjI@e126311.manchester.arm.com/
      Reported-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
      Diagnosed-by: Will Deacon <will@kernel.org>
      Fixes: 71977437 ("netfilter: conntrack: convert to refcount_t api")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Will Deacon <will@kernel.org>
    • bpf: Make sure mac_header was set before using it · 0326195f
      Eric Dumazet authored
      Classic BPF has a way to load bytes starting from the mac header.
      
      Some skbs do not have a mac header, and skb_mac_header()
      in this case returns a pointer that is 65535 bytes past
      skb->head.

      The existing range check in bpf_internal_load_pointer_neg_helper()
      was properly kicking in, so no illegal access was happening.
      
      New sanity check in skb_mac_header() is firing, so we need
      to avoid it.
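
      A minimal userspace sketch of the idea of testing whether the mac
      header was set before deriving a pointer from it. It assumes the
      offset is a 16-bit field with an all-ones "unset" sentinel, which
      matches the 65535 bytes mentioned above; struct pkt,
      pkt_mac_header_was_set() and pkt_mac_ptr() are hypothetical names,
      not the kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel: an all-ones 16-bit offset means "never set". */
#define MAC_HEADER_UNSET ((uint16_t)~0U)

struct pkt {
	uint8_t *head;		/* start of the buffer */
	uint16_t mac_header;	/* offset of the mac header from head */
};

static int pkt_mac_header_was_set(const struct pkt *p)
{
	return p->mac_header != MAC_HEADER_UNSET;
}

/* Return a pointer `off` bytes into the mac header, or NULL when the
 * packet has no mac header. Checking first avoids ever computing
 * head + 65535 for such packets.
 */
static uint8_t *pkt_mac_ptr(const struct pkt *p, unsigned int off)
{
	if (!pkt_mac_header_was_set(p))
		return NULL;
	return p->head + p->mac_header + off;
}
```

      A caller treats NULL like any other failed load and skips the
      access.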
      
      WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 skb_mac_header include/linux/skbuff.h:2785 [inline]
      WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74
      Modules linked in:
      CPU: 1 PID: 28990 Comm: syz-executor.0 Not tainted 5.19.0-rc4-syzkaller-00865-g4874fb94 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022
      RIP: 0010:skb_mac_header include/linux/skbuff.h:2785 [inline]
      RIP: 0010:bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74
      Code: ff ff 45 31 f6 e9 5a ff ff ff e8 aa 27 40 00 e9 3b ff ff ff e8 90 27 40 00 e9 df fe ff ff e8 86 27 40 00 eb 9e e8 2f 2c f3 ff <0f> 0b eb b1 e8 96 27 40 00 e9 79 fe ff ff 90 41 57 41 56 41 55 41
      RSP: 0018:ffffc9000309f668 EFLAGS: 00010216
      RAX: 0000000000000118 RBX: ffffffffffeff00c RCX: ffffc9000e417000
      RDX: 0000000000040000 RSI: ffffffff81873f21 RDI: 0000000000000003
      RBP: ffff8880842878c0 R08: 0000000000000003 R09: 000000000000ffff
      R10: 000000000000ffff R11: 0000000000000001 R12: 0000000000000004
      R13: ffff88803ac56c00 R14: 000000000000ffff R15: dffffc0000000000
      FS: 00007f5c88a16700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fdaa9f6c058 CR3: 000000003a82c000 CR4: 00000000003506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <TASK>
      ____bpf_skb_load_helper_32 net/core/filter.c:276 [inline]
      bpf_skb_load_helper_32+0x191/0x220 net/core/filter.c:264
      
      Fixes: f9aefd6b ("net: warn if mac header was not set")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220707123900.945305-1-edumazet@google.com