1. 15 Apr, 2021 1 commit
    • Jason Xing's avatar
      i40e: fix the panic when running bpf in xdpdrv mode · 4e39a072
      Jason Xing authored
      Fix this panic by adding more rules to calculate the value of @rss_size_max,
      which may be used to allocate the queues when a bpf program is loaded.
      Without these rules, the allocation can fail and leave vsi->rx_rings as a
      NULL pointer. Prior to this fix, the driver did not take the number of
      online CPUs into account and allocated 256 queues on a machine that
      actually has only 32 CPUs online.
      
      Once the bpf program starts loading, the log shows "failed to get
      tracking for 256 queues for VSI 0 err -12" followed by "setup of MAIN VSI
      failed".
      
      The key part of the crash log follows.
      
      BUG: unable to handle kernel NULL pointer dereference at
      0000000000000000
      RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
      Call Trace:
      [2160294.717292]  ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
      [2160294.717666]  dev_xdp_install+0x4f/0x70
      [2160294.718036]  dev_change_xdp_fd+0x11f/0x230
      [2160294.718380]  ? dev_disable_lro+0xe0/0xe0
      [2160294.718705]  do_setlink+0xac7/0xe70
      [2160294.719035]  ? __nla_parse+0xed/0x120
      [2160294.719365]  rtnl_newlink+0x73b/0x860
      
      Fixes: 41c445ff ("i40e: main driver core")
      Co-developed-by: Shujin Li <lishujin@kuaishou.com>
      Signed-off-by: Shujin Li <lishujin@kuaishou.com>
      Signed-off-by: Jason Xing <xingwanli@kuaishou.com>
      Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e39a072
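The sizing rule described in the i40e commit above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the driver code: `clamp_rss_size_max()`, its parameters, and the helper `roundup_pow2()` are hypothetical names, and capping at the online-CPU count rounded up to a power of two is an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical helper: round v up to the next power of two.
 * Values above 2^31 are clamped to 2^31 so the loop always ends. */
static uint32_t roundup_pow2(uint32_t v)
{
    uint32_t p = 1;

    while (p < v && p < 0x80000000u)
        p <<= 1;
    return p;
}

/*
 * Hypothetical sizing rule: never request more RSS queues than the
 * hardware offers (hw_max, hw_tx_qp), and never more than the online
 * CPUs can make use of.
 */
uint32_t clamp_rss_size_max(uint32_t hw_max, uint32_t hw_tx_qp,
                            uint32_t online_cpus)
{
    uint32_t size = hw_max;
    uint32_t cpu_cap = roundup_pow2(online_cpus);

    if (hw_tx_qp < size)
        size = hw_tx_qp;
    if (cpu_cap < size)
        size = cpu_cap;
    return size;
}
```

With a hardware maximum of 256 and 32 CPUs online, this sketch yields 32 queues rather than 256, which is the situation the commit message describes.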
  2. 14 Apr, 2021 12 commits
  3. 13 Apr, 2021 11 commits
    • Michael Brown's avatar
      xen-netback: Check for hotplug-status existence before watching · 2afeec08
      Michael Brown authored
      The logic in connect() is currently written with the assumption that
      xenbus_watch_pathfmt() will return an error for a node that does not
      exist.  This assumption is incorrect: xenstore does allow a watch to
      be registered for a nonexistent node (and will send notifications
      should the node be subsequently created).
      
      As of commit 1f256578 ("xen-netback: remove 'hotplug-status' once it
      has served its purpose"), this leads to a failure when a domU
      transitions into XenbusStateConnected more than once.  On the first
      domU transition into Connected state, the "hotplug-status" node will
      be deleted by the hotplug_status_changed() callback in dom0.  On the
      second or subsequent domU transition into Connected state, the
      hotplug_status_changed() callback will therefore never be invoked, and
      so the backend will remain stuck in InitWait.
      
      This failure prevents scenarios such as reloading the xen-netfront
      module within a domU, or booting a domU via iPXE.  There is
      unfortunately no way for the domU to work around this dom0 bug.
      
      Fix by explicitly checking for existence of the "hotplug-status" node,
      thereby creating the behaviour that was previously assumed to exist.
      Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
      Reviewed-by: Paul Durrant <paul@xen.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2afeec08
    • Eric Dumazet's avatar
      gro: ensure frag0 meets IP header alignment · 38ec4944
      Eric Dumazet authored
      After commit 0f6925b3 ("virtio_net: Do not pull payload in skb->head")
      Guenter Roeck reported one failure in his tests using sh architecture.
      
      After much debugging, we have been able to spot silent unaligned accesses
      in inet_gro_receive().
      
      The issue at hand is that upper networking stacks assume their header
      is word-aligned. Low level drivers are supposed to reserve NET_IP_ALIGN
      bytes before the Ethernet header to make that happen.
      
      This patch hardens skb_gro_reset_offset() to not allow frag0 fast-path
      if the fragment is not properly aligned.
      
      Some arches like x86, arm64 and powerpc do not care and define NET_IP_ALIGN
      as 0; this extra check will be a NOP for them.
      
      Note that if frag0 is not used, GRO will call pskb_may_pull()
      as many times as needed to pull network and transport headers.
      
      Fixes: 0f6925b3 ("virtio_net: Do not pull payload in skb->head")
      Fixes: 78a478d0 ("gro: Inline skb_gro_header and cache frag0 virtual address")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      38ec4944
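The alignment gate described in the gro commit above might look roughly like the following userspace sketch. It is not the kernel code: `frag0_fast_path_allowed()` and its parameters are hypothetical, and treating "word-aligned" as 4-byte alignment is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide whether a frag0-style fast path may be used.
 *
 * net_ip_align: the architecture's NET_IP_ALIGN value (0 on arches that
 *               tolerate unaligned header access).
 * frag_off:     offset of the first fragment within its page (hypothetical).
 * hdr_off:      offset of the network header within that fragment
 *               (hypothetical).
 */
bool frag0_fast_path_allowed(unsigned int net_ip_align,
                             size_t frag_off, size_t hdr_off)
{
    if (net_ip_align == 0)
        return true;    /* the extra check is a no-op on these arches */

    /* Assumed requirement: the header must start on a 4-byte boundary. */
    return ((frag_off + hdr_off) & 3) == 0;
}
```

When the function returns false, the caller would fall back to the slower path that pulls headers as needed, as the commit message notes.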
    • Or Cohen's avatar
      net/sctp: fix race condition in sctp_destroy_sock · b166a20b
      Or Cohen authored
      If sctp_destroy_sock is called without sock_net(sk)->sctp.addr_wq_lock
      held and sp->do_auto_asconf is true, then an element is removed
      from the auto_asconf_splist without any proper locking.
      
      This can happen in the following functions:
      1. In sctp_accept, if sctp_sock_migrate fails.
      2. In inet_create or inet6_create, if there is a bpf program
         attached to BPF_CGROUP_INET_SOCK_CREATE which denies
         creation of the sctp socket.
      
      The bug is fixed by acquiring addr_wq_lock in sctp_destroy_sock
      instead of sctp_close.
      
      This addresses CVE-2021-23133.
      Reported-by: Or Cohen <orcohen@paloaltonetworks.com>
      Reviewed-by: Xin Long <lucien.xin@gmail.com>
      Fixes: 61023658 ("bpf: Add new cgroup attach type to enable sock modifications")
      Signed-off-by: Or Cohen <orcohen@paloaltonetworks.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b166a20b
    • Lijun Pan's avatar
      ibmvnic: correctly use dev_consume/free_skb_irq · ca09bf7b
      Lijun Pan authored
      It is more correct to use dev_kfree_skb_irq when packets are dropped,
      and to use dev_consume_skb_irq when packets are consumed.
      
      Fixes: 0d973388 ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls")
      Suggested-by: Thomas Falcon <tlfalcon@linux.ibm.com>
      Signed-off-by: Lijun Pan <lijunp213@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ca09bf7b
    • Jonathon Reinhart's avatar
      net: Make tcp_allowed_congestion_control readonly in non-init netns · 97684f09
      Jonathon Reinhart authored
      Currently, tcp_allowed_congestion_control is global and writable;
      writing to it in any net namespace will leak into all other net
      namespaces.
      
      tcp_available_congestion_control and tcp_allowed_congestion_control are
      the only sysctls in ipv4_net_table (the per-netns sysctl table) with a
      NULL data pointer; their handlers (proc_tcp_available_congestion_control
      and proc_allowed_congestion_control) have no other way of referencing a
      struct net. Thus, they operate globally.
      
      Because ipv4_net_table does not use designated initializers, there is no
      easy way to fix up this one "bad" table entry. However, the data pointer
      updating logic shouldn't be applied to NULL pointers anyway, so we
      instead force these entries to be read-only.
      
      These sysctls used to exist in ipv4_table (init-net only), but they were
      moved to the per-net ipv4_net_table, presumably without realizing that
      tcp_allowed_congestion_control was writable and thus introduced a leak.
      
      Because the intent of that commit was only to know (i.e. read) "which
      congestion algorithms are available or allowed", this read-only solution
      should be sufficient.
      
      The logic added in recent commit
      31c4d2f1 ("net: Ensure net namespace isolation of sysctls")
      does not and cannot check for NULL data pointers, because
      other table entries (e.g. /proc/sys/net/netfilter/nf_log/) have
      .data=NULL but use other methods (.extra2) to access the struct net.
      
      Fixes: 9cb8e048 ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns")
      Signed-off-by: Jonathon Reinhart <jonathon.reinhart@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      97684f09
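The read-only approach from the tcp_allowed_congestion_control commit above can be illustrated with a toy table in userspace C. Everything here is a stand-in: `struct toy_ctl_entry` and `toy_fixup_table()` are hypothetical and only model the idea of rebasing entries that carry a data pointer while stripping write permission from entries whose data pointer is NULL.

```c
#include <stddef.h>

/* Toy stand-in for a sysctl table entry; not the kernel's struct ctl_table. */
struct toy_ctl_entry {
    void *data;            /* per-namespace data pointer, or NULL */
    unsigned short mode;   /* permission bits, e.g. 0644 */
};

/*
 * Prepare a copied table for a non-initial namespace: rebase entries that
 * carry a data pointer, and clear the write bits on entries that do not,
 * since those would otherwise operate on global state.
 */
void toy_fixup_table(struct toy_ctl_entry *table, size_t n, ptrdiff_t rebase)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].data)
            table[i].data = (char *)table[i].data + rebase;
        else
            table[i].mode &= (unsigned short)~0222;   /* force read-only */
    }
}
```

An entry with mode 0644 and a NULL data pointer ends up as 0444, while entries with a data pointer keep their mode.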
    • David S. Miller's avatar
      Merge branch 'catch-all-devices' · 61aaa1aa
      David S. Miller authored
      Hristo Venev says:
      
      ====================
      net: Fix two use-after-free bugs
      
      The two patches fix two use-after-free bugs related to cleaning up
      network namespaces, one in sit and one in ip6_tunnel. They are easy to
      trigger if the user has the ability to create network namespaces.
      
      The bugs can be used to trigger null pointer dereferences. I am not
      sure if they can be exploited further, but I would guess that they
      can. I am not sending them to the mailing list without confirmation
      that doing so would be OK.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      61aaa1aa
    • Hristo Venev's avatar
      net: ip6_tunnel: Unregister catch-all devices · 941ea91e
      Hristo Venev authored
      Similarly to the sit case, we need to remove the tunnels with no
      addresses that have been moved to another network namespace.
      
      Fixes: 0bd87628 ("ip6tnl: add x-netns support")
      Signed-off-by: Hristo Venev <hristo@venev.name>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      941ea91e
    • Hristo Venev's avatar
      net: sit: Unregister catch-all devices · 610f8c0f
      Hristo Venev authored
      A sit interface created without a local or a remote address is linked
      into the `sit_net::tunnels_wc` list of its original namespace. When
      deleting a network namespace, delete the devices that have been moved.
      
      The following script triggers a null pointer dereference if devices
      linked in a deleted `sit_net` remain:
      
          for i in `seq 1 30`; do
              ip netns add ns-test
              ip netns exec ns-test ip link add dev veth0 type veth peer veth1
              ip netns exec ns-test ip link add dev sit$i type sit dev veth0
              ip netns exec ns-test ip link set dev sit$i netns $$
              ip netns del ns-test
          done
          for i in `seq 1 30`; do
              ip link del dev sit$i
          done
      
      Fixes: 5e6700b3 ("sit: add support of x-netns")
      Signed-off-by: Hristo Venev <hristo@venev.name>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      610f8c0f
    • Colin Ian King's avatar
      ice: Fix potential infinite loop when using u8 loop counter · ef963ae4
      Colin Ian King authored
      A for-loop is using a u8 loop counter that is being compared to
      a u32 cmp_dcbcfg->numapp to check for the end of the loop. If
      cmp_dcbcfg->numapp is larger than 255 then the counter j will wrap
      around to zero and hence an infinite loop occurs. Fix this by making
      counter j the same type as cmp_dcbcfg->numapp.
      
      Addresses-Coverity: ("Infinite loop")
      Fixes: aeac8ce8 ("ice: Recognize 860 as iSCSI port in CEE mode")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      ef963ae4
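The wrap-around described in the ice commit above is easy to reproduce in userspace C. In this sketch the function names are hypothetical and a `cap` parameter is added purely so the demonstration terminates; the real loop has no such guard.

```c
#include <stdint.h>

/*
 * Count loop iterations with an 8-bit counter compared against a 32-bit
 * bound. The cap exists only so this demonstration cannot spin forever.
 */
uint64_t iterations_with_u8_counter(uint32_t numapp, uint64_t cap)
{
    uint64_t steps = 0;

    /* j wraps from 255 back to 0, so j < numapp stays true if numapp > 255 */
    for (uint8_t j = 0; j < numapp && steps < cap; j++)
        steps++;
    return steps;
}

/* Same loop with a counter of the same width as the bound. */
uint64_t iterations_with_u32_counter(uint32_t numapp)
{
    uint64_t steps = 0;

    for (uint32_t j = 0; j < numapp; j++)
        steps++;
    return steps;
}
```

With a bound of 300, the 8-bit version only stops because of the cap, whereas the 32-bit version runs exactly 300 iterations.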
    • Yongxin Liu's avatar
      ixgbe: fix unbalanced device enable/disable in suspend/resume · debb9df3
      Yongxin Liu authored
      pci_disable_device() called in __ixgbe_shutdown() decreases
      dev->enable_cnt by 1. pci_enable_device_mem() which increases
      dev->enable_cnt by 1, was removed from ixgbe_resume() in commit
      6f82b255 ("ixgbe: use generic power management"). This caused
      unbalanced increase/decrease. So add pci_enable_device_mem() back.
      
      Fix the following call trace.
      
        ixgbe 0000:17:00.1: disabling already-disabled device
        Call Trace:
         __ixgbe_shutdown+0x10a/0x1e0 [ixgbe]
         ixgbe_suspend+0x32/0x70 [ixgbe]
         pci_pm_suspend+0x87/0x160
         ? pci_pm_freeze+0xd0/0xd0
         dpm_run_callback+0x42/0x170
         __device_suspend+0x114/0x460
         async_suspend+0x1f/0xa0
         async_run_entry_fn+0x3c/0xf0
         process_one_work+0x1dd/0x410
         worker_thread+0x34/0x3f0
         ? cancel_delayed_work+0x90/0x90
         kthread+0x14c/0x170
         ? kthread_park+0x90/0x90
         ret_from_fork+0x1f/0x30
      
      Fixes: 6f82b255 ("ixgbe: use generic power management")
      Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
      Tested-by: Dave Switzer <david.switzer@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      debb9df3
    • Alexander Duyck's avatar
      ixgbe: Fix NULL pointer dereference in ethtool loopback test · 31166efb
      Alexander Duyck authored
      The ixgbe driver currently generates a NULL pointer dereference when
      performing the ethtool loopback test. This is because there is no
      q_vector associated with the test ring when it is set up, as
      interrupts are not normally added to the test rings.
      
      To address this I have added code that will check for a q_vector before
      returning a napi_id value. If a q_vector is not present it will return a
      value of 0.
      
      Fixes: b02e5a0e ("xsk: Propagate napi_id to XDP socket Rx path")
      Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
      Acked-by: Björn Töpel <bjorn.topel@intel.com>
      Tested-by: Dave Switzer <david.switzer@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      31166efb
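The guard described in the ixgbe loopback commit above amounts to a NULL check before following the pointer. A minimal sketch with toy structures follows; `struct toy_q_vector`, `struct toy_ring`, and `toy_rx_napi_id()` are hypothetical stand-ins, not the driver's definitions.

```c
#include <stddef.h>

/* Toy stand-ins for the driver structures; field names are illustrative. */
struct toy_q_vector {
    unsigned int napi_id;
};

struct toy_ring {
    struct toy_q_vector *q_vector;   /* NULL for rings without an interrupt */
};

/* Return the ring's NAPI id, or 0 when no q_vector is attached. */
unsigned int toy_rx_napi_id(const struct toy_ring *ring)
{
    return ring->q_vector ? ring->q_vector->napi_id : 0;
}
```

A test ring with no q_vector therefore reports 0 instead of dereferencing a NULL pointer.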
  4. 12 Apr, 2021 5 commits
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · ccb39c62
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Fix NAT IPv6 offload in the flowtable.
      
      2) icmpv6 is printed as unknown in /proc/net/nf_conntrack.
      
      3) Use div64_u64() in nft_limit, from Eric Dumazet.
      
      4) Use pre_exit to unregister ebtables and arptables hooks,
         from Florian Westphal.
      
      5) Fix out-of-bound memset in x_tables compat match/target,
         also from Florian.
      
      6) Clone set elements expression to ensure proper initialization.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ccb39c62
    • Pablo Neira Ayuso's avatar
      netfilter: nftables: clone set element expression template · 4d8f9065
      Pablo Neira Ayuso authored
      memcpy() breaks when using connlimit in set elements. Use
      nft_expr_clone() to initialize the connlimit expression list, otherwise
      connlimit garbage collector crashes when walking on the list head copy.
      
      [  493.064656] Workqueue: events_power_efficient nft_rhash_gc [nf_tables]
      [  493.064685] RIP: 0010:find_or_evict+0x5a/0x90 [nf_conncount]
      [  493.064694] Code: 2b 43 40 83 f8 01 77 0d 48 c7 c0 f5 ff ff ff 44 39 63 3c 75 df 83 6d 18 01 48 8b 43 08 48 89 de 48 8b 13 48 8b 3d ee 2f 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 03 48 83
      [  493.064699] RSP: 0018:ffffc90000417dc0 EFLAGS: 00010297
      [  493.064704] RAX: 0000000000000000 RBX: ffff888134f38410 RCX: 0000000000000000
      [  493.064708] RDX: 0000000000000000 RSI: ffff888134f38410 RDI: ffff888100060cc0
      [  493.064711] RBP: ffff88812ce594a8 R08: ffff888134f38438 R09: 00000000ebb9025c
      [  493.064714] R10: ffffffff8219f838 R11: 0000000000000017 R12: 0000000000000001
      [  493.064718] R13: ffffffff82146740 R14: ffff888134f38410 R15: 0000000000000000
      [  493.064721] FS:  0000000000000000(0000) GS:ffff88840e440000(0000) knlGS:0000000000000000
      [  493.064725] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  493.064729] CR2: 0000000000000008 CR3: 00000001330aa002 CR4: 00000000001706e0
      [  493.064733] Call Trace:
      [  493.064737]  nf_conncount_gc_list+0x8f/0x150 [nf_conncount]
      [  493.064746]  nft_rhash_gc+0x106/0x390 [nf_tables]
      Reported-by: Laura Garcia Liebana <nevola@gmail.com>
      Fixes: 40944452 ("netfilter: nf_tables: add elements with stateful expressions")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      4d8f9065
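Why a byte-wise copy of a list head is dangerous, as in the nftables commit above, can be shown with a toy circular list in userspace C. The types and functions below are hypothetical and only cover the empty-list case; they are not the nf_tables or nf_conncount code.

```c
#include <stdbool.h>
#include <string.h>

/* Minimal circular doubly linked list head, in the style of a kernel list. */
struct toy_list {
    struct toy_list *next, *prev;
};

void toy_list_init(struct toy_list *h)
{
    h->next = h;
    h->prev = h;
}

/* An empty, correctly initialized head points at itself. */
bool toy_list_is_self_consistent(const struct toy_list *h)
{
    return h->next == h && h->prev == h;
}

/* Byte-wise copy: the copy's pointers still refer to the source head. */
void toy_copy_by_memcpy(struct toy_list *dst, const struct toy_list *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/* Clone of an empty list: the destination becomes its own empty list. */
void toy_copy_by_clone(struct toy_list *dst, const struct toy_list *src)
{
    (void)src;   /* an empty source has no entries to duplicate */
    toy_list_init(dst);
}
```

After `toy_copy_by_memcpy()`, walking the copy leads straight back into the original head, which is the kind of inconsistency a later list walk can trip over.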
    • Florian Westphal's avatar
      netfilter: x_tables: fix compat match/target pad out-of-bound write · b29c457a
      Florian Westphal authored
      xt_compat_match/target_from_user doesn't check that zeroing the area
      up to the start of the next rule won't write past the end of the
      allocated ruleset blob.
      
      Remove this code and zero the entire blob beforehand.
      
      Reported-by: syzbot+cfc0247ac173f597aaaa@syzkaller.appspotmail.com
      Reported-by: Andy Nguyen <theflow@google.com>
      Fixes: 9fa492cd ("[NETFILTER]: x_tables: simplify compat API")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      b29c457a
    • Jakub Kicinski's avatar
      ethtool: fix kdoc attr name · f33b0e19
      Jakub Kicinski authored
      Add missing 't' in attrtype.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f33b0e19
    • Pali Rohár's avatar
      net: phy: marvell: fix detection of PHY on Topaz switches · 1fe976d3
      Pali Rohár authored
      Since commit fee2d546 ("net: phy: marvell: mv88e6390 temperature
      sensor reading"), Linux reports the temperature of Topaz hwmon as
      constant -75°C.
      
      This is because switches from the Topaz family (88E6141 / 88E6341) have
      the address of the temperature sensor register different from Peridot.
      
      This address is instead compatible with 88E1510 PHYs, as was used for
      Topaz before the above mentioned commit.
      
      Create a new mapping table between switch family and PHY ID for families
      which don't have a model number. And define PHY IDs for Topaz and Peridot
      families.
      
      Create a new PHY ID and a new PHY driver for Topaz's internal PHY.
      The only difference from Peridot's PHY driver is the HWMON probing
      method.
      
      Prior to this change, Topaz's internal PHY is detected by the kernel as:
      
        PHY [...] driver [Marvell 88E6390] (irq=63)
      
      And afterwards as:
      
        PHY [...] driver [Marvell 88E6341 Family] (irq=63)
      Signed-off-by: Pali Rohár <pali@kernel.org>
      BugLink: https://github.com/globalscaletechnologies/linux/issues/1
      Fixes: fee2d546 ("net: phy: marvell: mv88e6390 temperature sensor reading")
      Reviewed-by: Marek Behún <kabel@kernel.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1fe976d3
  5. 11 Apr, 2021 3 commits
  6. 10 Apr, 2021 3 commits
    • Florian Westphal's avatar
      netfilter: arp_tables: add pre_exit hook for table unregister · d163a925
      Florian Westphal authored
      This is the same problem that existed in iptables/ip(6)tables: when
      arptable_filter is removed, there is no longer a wait period before the
      table/ruleset is freed.
      
      Unregister the hook in pre_exit, then remove the table in the exit
      function.
      This used to work correctly because the old nf_hook_unregister API
      did unconditional synchronize_net.
      
      The per-net hook unregister function uses call_rcu instead.
      
      Fixes: b9e69e12 ("netfilter: xtables: don't hook tables by default")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      d163a925
    • Florian Westphal's avatar
      netfilter: bridge: add pre_exit hooks for ebtable unregistration · 7ee3c61d
      Florian Westphal authored
      Just like ip/ip6/arptables, the hooks have to be removed, then
      synchronize_rcu() has to be called to make sure no more packets are being
      processed before the ruleset data is released.
      
      Place the hook unregistration in the pre_exit hook, then call the new
      ebtables pre_exit function from there.
      
      Years ago, when netns support was first added for netfilter+ebtables,
      this used an older (now removed) netfilter hook unregister API that did
      an unconditional synchronize_rcu().
      
      Now that all is done with call_rcu, ebtable_{filter,nat,broute} pernet exit
      handlers may free the ebtable ruleset while packets are still in flight.
      
      This can only happen on module removal, not during netns exit.
      
      The new function expects the table name, not the table struct.
      
      This is because an upcoming patch set (targeting -next) will remove all
      net->xt.{nat,filter,broute}_table instances, which makes it necessary
      to avoid external references to those member variables.
      
      The existing APIs will be converted, so follow the upcoming scheme of
      passing name + hook type instead.
      
      Fixes: aee12a0a ("ebtables: remove nf_hook_register usage")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      7ee3c61d
    • Eric Dumazet's avatar
      netfilter: nft_limit: avoid possible divide error in nft_limit_init · b895bdf5
      Eric Dumazet authored
      div_u64() divides u64 by u32.
      
      nft_limit_init() wants to divide u64 by u64, so use the appropriate
      math function (div64_u64).
      
      divide error: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 8390 Comm: syz-executor188 Not tainted 5.12.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:div_u64_rem include/linux/math64.h:28 [inline]
      RIP: 0010:div_u64 include/linux/math64.h:127 [inline]
      RIP: 0010:nft_limit_init+0x2a2/0x5e0 net/netfilter/nft_limit.c:85
      Code: ef 4c 01 eb 41 0f 92 c7 48 89 de e8 38 a5 22 fa 4d 85 ff 0f 85 97 02 00 00 e8 ea 9e 22 fa 4c 0f af f3 45 89 ed 31 d2 4c 89 f0 <49> f7 f5 49 89 c6 e8 d3 9e 22 fa 48 8d 7d 48 48 b8 00 00 00 00 00
      RSP: 0018:ffffc90009447198 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 0000200000000000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffffffff875152e6 RDI: 0000000000000003
      RBP: ffff888020f80908 R08: 0000200000000000 R09: 0000000000000000
      R10: ffffffff875152d8 R11: 0000000000000000 R12: ffffc90009447270
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      FS:  000000000097a300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000200001c4 CR3: 0000000026a52000 CR4: 00000000001506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       nf_tables_newexpr net/netfilter/nf_tables_api.c:2675 [inline]
       nft_expr_init+0x145/0x2d0 net/netfilter/nf_tables_api.c:2713
       nft_set_elem_expr_alloc+0x27/0x280 net/netfilter/nf_tables_api.c:5160
       nf_tables_newset+0x1997/0x3150 net/netfilter/nf_tables_api.c:4321
       nfnetlink_rcv_batch+0x85a/0x21b0 net/netfilter/nfnetlink.c:456
       nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:580 [inline]
       nfnetlink_rcv+0x3af/0x420 net/netfilter/nfnetlink.c:598
       netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
       netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
       netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
       sock_sendmsg_nosec net/socket.c:654 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:674
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2404
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: c26844ed ("netfilter: nf_tables: Fix nft limit burst handling")
      Fixes: 3e0f64b7 ("netfilter: nft_limit: fix packet ratelimiting")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Diagnosed-by: Luigi Rizzo <lrizzo@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      b895bdf5
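The divide error in the nft_limit commit above follows from passing a 64-bit divisor to a helper that only takes 32 bits. The userspace sketch below models that truncation; `toy_truncate_divisor()` and `toy_div64_checked()` are hypothetical names and are not the kernel's math64 helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of a helper that accepts only a 32-bit divisor: high bits are lost. */
uint32_t toy_truncate_divisor(uint64_t divisor)
{
    return (uint32_t)divisor;
}

/*
 * 64-by-64 division that reports failure instead of trapping when the
 * divisor is zero. On success the result is stored in *quotient.
 */
bool toy_div64_checked(uint64_t dividend, uint64_t divisor, uint64_t *quotient)
{
    if (divisor == 0)
        return false;
    *quotient = dividend / divisor;
    return true;
}
```

A divisor such as 2^32 is nonzero as a 64-bit value but truncates to 0 in 32 bits, which is how a seemingly valid division can end in a divide error.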
  7. 09 Apr, 2021 5 commits
    • Linus Torvalds's avatar
      Merge tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 4e04e751
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Networking fixes for 5.12-rc7, including fixes from can, ipsec,
        mac80211, wireless, and bpf trees.
      
        No scary regressions here or in the works, but small fixes for 5.12
        changes keep coming.
      
        Current release - regressions:
      
         - virtio: do not pull payload in skb->head
      
         - virtio: ensure mac header is set in virtio_net_hdr_to_skb()
      
         - Revert "net: correct sk_acceptq_is_full()"
      
         - mptcp: revert "mptcp: provide subflow aware release function"
      
         - ethernet: lan743x: fix ethernet frame cutoff issue
      
         - dsa: fix type was not set for devlink port
      
         - ethtool: remove link_mode param and derive link params from driver
      
         - sched: htb: fix null pointer dereference on a null new_q
      
         - wireless: iwlwifi: Fix softirq/hardirq disabling in
           iwl_pcie_enqueue_hcmd()
      
         - wireless: iwlwifi: fw: fix notification wait locking
      
         - wireless: brcmfmac: p2p: Fix deadlock introduced by avoiding the
           rtnl dependency
      
        Current release - new code bugs:
      
         - napi: fix hangup on napi_disable for threaded napi
      
         - bpf: take module reference for trampoline in module
      
         - wireless: mt76: mt7921: fix airtime reporting and related tx hangs
      
         - wireless: iwlwifi: mvm: rfi: don't lock mvm->mutex when sending
           config command
      
        Previous releases - regressions:
      
         - rfkill: revert back to old userspace API by default
      
         - nfc: fix infinite loop, refcount & memory leaks in LLCP sockets
      
         - let skb_orphan_partial wake-up waiters
      
         - xfrm/compat: Cleanup WARN()s that can be user-triggered
      
         - vxlan, geneve: do not modify the shared tunnel info when PMTU
           triggers an ICMP reply
      
         - can: fix msg_namelen values depending on CAN_REQUIRED_SIZE
      
         - can: uapi: mark union inside struct can_frame packed
      
         - sched: cls: fix action overwrite reference counting
      
         - sched: cls: fix err handler in tcf_action_init()
      
         - ethernet: mlxsw: fix ECN marking in tunnel decapsulation
      
         - ethernet: nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx
      
         - ethernet: i40e: fix receiving of single packets in xsk zero-copy
           mode
      
         - ethernet: cxgb4: avoid collecting SGE_QBASE regs during traffic
      
        Previous releases - always broken:
      
         - bpf: Refuse non-O_RDWR flags in BPF_OBJ_GET
      
         - bpf: Refcount task stack in bpf_get_task_stack
      
         - bpf, x86: Validate computation of branch displacements
      
         - ieee802154: fix many similar syzbot-found bugs
             - fix NULL dereferences in netlink attribute handling
             - reject unsupported operations on monitor interfaces
             - fix error handling in llsec_key_alloc()
      
         - xfrm: make ipv4 pmtu check honor ip header df
      
         - xfrm: make hash generation lock per network namespace
      
         - xfrm: esp: delete NETIF_F_SCTP_CRC bit from features for esp
           offload
      
         - ethtool: fix incorrect datatype in set_eee ops
      
         - xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory
           model
      
         - openvswitch: fix send of uninitialized stack memory in ct limit
           reply
      
        Misc:
      
         - udp: add get handling for UDP_GRO sockopt"
      
      * tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (182 commits)
        net: fix hangup on napi_disable for threaded napi
        net: hns3: Trivial spell fix in hns3 driver
        lan743x: fix ethernet frame cutoff issue
        net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh
        net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits
        net: dsa: lantiq_gswip: Don't use PHY auto polling
        net: sched: sch_teql: fix null-pointer dereference
        ipv6: report errors for iftoken via netlink extack
        net: sched: fix err handler in tcf_action_init()
        net: sched: fix action overwrite reference counting
        Revert "net: sched: bump refcount for new action in ACT replace mode"
        ice: fix memory leak of aRFS after resuming from suspend
        i40e: Fix sparse warning: missing error code 'err'
        i40e: Fix sparse error: 'vsi->netdev' could be null
        i40e: Fix sparse error: uninitialized symbol 'ring'
        i40e: Fix sparse errors in i40e_txrx.c
        i40e: Fix parameters in aq_get_phy_register()
        nl80211: fix beacon head validation
        bpf, x86: Validate computation of branch displacements for x86-32
        bpf, x86: Validate computation of branch displacements for x86-64
        ...
      4e04e751
    • Linus Torvalds's avatar
      Merge tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block · 3b978435
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Two minor fixups for the reissue logic, and one for making sure that
        unbounded work is canceled on io-wq exit"
      
      * tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block:
        io-wq: cancel unbounded works on io-wq destroy
        io_uring: fix rw req completion
        io_uring: clear F_REISSUE right after getting it
      3b978435
    • Linus Torvalds's avatar
      Merge tag 'devicetree-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · a2521822
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Fix fw_devlink failure with ".*,nr-gpios" properties
      
       - Doc link reference fixes from Mauro
      
       - Fixes for unaligned FDT handling found on OpenRISC. First, avoid
         crash with better error handling when unflattening an unaligned FDT.
         Second, fix memory allocations for FDTs to ensure alignment.
      
      * tag 'devicetree-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        of: property: fw_devlink: do not link ".*,nr-gpios"
        dt-bindings:iio:adc: update motorola,cpcap-adc.yaml reference
        dt-bindings: fix references for iio-bindings.txt
        dt-bindings: don't use ../dir for doc references
        of: unittest: overlay: ensure proper alignment of copied FDT
        of: properly check for error returned by fdt_get_name()
      a2521822
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2021-04-10' of git://anongit.freedesktop.org/drm/drm · a85f165e
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Was relatively quiet this week, but still a few pulls came in, pretty
        much small fixes across the board, a couple of regression fixes in the
        amdgpu/radeon code, msm has a few minor fixes across the board, a
        panel regression fix also.
      
        amdgpu:
         - DCN3 fix
         - Fix CAC setting regression for TOPAZ
         - Fix ttm regression
      
        radeon:
         - Fix ttm regression
      
        msm:
         - a5xx/a6xx timestamp fix
         - microcode version check
         - fail path fix
         - block programming fix
         - error removal fix
      
        i915:
         - Fix invalid access to ACPI _DSM objects
      
        xen:
         - Fix use-after-free in xen
         - minor duplicate definition cleanup
      
        vc4:
         - Reduce fifo threshold on hvs4 to fix a fifo full error
         - minor redundant assignment cleanup
      
        panel:
         - Disable TE support for Droid4 and N950"
      
      * tag 'drm-fixes-2021-04-10' of git://anongit.freedesktop.org/drm/drm:
        drm/vc4: crtc: Reduce PV fifo threshold on hvs4
        drm/vc4: plane: Remove redundant assignment
        drm/amdgpu/smu7: fix CAC setting on TOPAZ
        drm/radeon: Fix size overflow
        drm/amdgpu: Fix size overflow
        drm/i915: Fix invalid access to ACPI _DSM objects
        drm/amd/display: Add missing mask for DCN3
        drm/panel: panel-dsi-cm: disable TE for now
        drm/msm/disp/dpu1: program 3d_merge only if block is attached
        drm/msm: a6xx: fix version check for the A650 SQE microcode
        drm/msm: Fix a5xx/a6xx timestamps
        drm/msm: Fix removal of valid error case when checking speed_bin
        drm/msm: Set drvdata to NULL when msm_drm_init() fails
        drivers: gpu: drm: xen_drm_front_drm_info is declared twice
        gpu/xen: Fix a use after free in xen_drm_drv_init
      a85f165e
    • Paolo Abeni's avatar
      net: fix hangup on napi_disable for threaded napi · 27f0ad71
      Paolo Abeni authored
      napi_disable() is subject to a hangup when threaded
      mode is enabled and the napi is under heavy traffic.
      
      If the relevant napi has been scheduled and napi_disable()
      kicks in before the next napi_thread_wait() completes, so
      that the latter quits due to the napi_disable_pending() condition,
      the existing code leaves the NAPI_STATE_SCHED bit set and the
      napi_disable() loop waiting for that bit will hang.
      
      This patch addresses the issue by dropping the NAPI_STATE_DISABLE
      bit test in napi_thread_wait(). The later napi_threaded_poll()
      iteration will take care of clearing the NAPI_STATE_SCHED bit.
      
      This also addresses a related problem reported by Jakub:
      before this patch a napi_disable()/napi_enable() pair killed
      the napi thread, effectively disabling the threaded mode.
      On the patched kernel napi_disable() simply stops scheduling
      the relevant thread.
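
The hang described above can be sketched with a small single-threaded model in user-space C. This is not the kernel code: `model_napi`, `MODEL_SCHED`, `MODEL_DISABLE` and the `model_*` functions are hypothetical stand-ins, and the real `napi_disable()` loop and kthread run concurrently rather than being interleaved in one loop as they are here. The `check_disable` flag selects between the pre-patch behaviour (the wait step quits while disable is pending) and the patched behaviour (the test is dropped, so the poll step gets to clear the scheduled bit).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NAPI state bits; not the kernel definitions. */
enum { MODEL_SCHED = 1u << 0, MODEL_DISABLE = 1u << 1 };

struct model_napi {
    unsigned state;
};

/*
 * Models the wait step of the napi thread.
 * Returns 0 when there is work to poll, -1 when the thread gives up.
 * With check_disable == true (pre-patch model) the thread quits as soon
 * as a disable is pending and leaves MODEL_SCHED set.
 */
static int model_thread_wait(struct model_napi *n, bool check_disable)
{
    if (check_disable && (n->state & MODEL_DISABLE))
        return -1;              /* quits with MODEL_SCHED still set */
    if (n->state & MODEL_SCHED)
        return 0;               /* scheduled: go and poll */
    return -1;                  /* nothing to do */
}

/* Models one poll iteration: the poll completes and clears MODEL_SCHED. */
static void model_threaded_poll(struct model_napi *n)
{
    n->state &= ~MODEL_SCHED;
}

/*
 * Models the disable path: mark the disable as pending, then wait for
 * MODEL_SCHED to clear. The real loop waits indefinitely; the model
 * gives up after max_spins and returns false to signal "would hang".
 */
static bool model_disable(struct model_napi *n, bool check_disable,
                          int max_spins)
{
    n->state |= MODEL_DISABLE;
    for (int i = 0; i < max_spins; i++) {
        /* Give the modelled napi thread one chance to run. */
        if (model_thread_wait(n, check_disable) == 0)
            model_threaded_poll(n);
        if (!(n->state & MODEL_SCHED))
            return true;        /* scheduled bit cleared: disable completes */
    }
    return false;               /* scheduled bit never cleared */
}
```

Starting from a state with only `MODEL_SCHED` set, `model_disable(&n, true, 1000)` never sees the bit clear and returns false, while `model_disable(&n, false, 1000)` returns true after the first poll iteration.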
      
      v1 -> v2:
        - let the main napi_threaded_poll() loop clear the SCHED bit
      Reported-by: Jakub Kicinski <kuba@kernel.org>
      Fixes: 29863d41 ("net: implement threaded-able napi poll loop support")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/883923fa22745a9589e8610962b7dc59df09fb1f.1617981844.git.pabeni@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      27f0ad71