1. 29 Mar, 2024 36 commits
  2. 28 Mar, 2024 4 commits
    • Merge tag 'net-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 50108c35
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bpf, WiFi and netfilter.
      
        Current release - regressions:
      
         - ipv6: fix address dump when IPv6 is disabled on an interface
      
        Current release - new code bugs:
      
         - bpf: temporarily disable atomic operations in BPF arena
      
         - nexthop: fix uninitialized variable in nla_put_nh_group_stats()
      
        Previous releases - regressions:
      
         - bpf: protect against int overflow for stack access size
      
         - hsr: fix the promiscuous mode in offload mode
      
         - wifi: don't always use FW dump trig
      
         - tls: adjust recv return with async crypto and failed copy to
           userspace
      
         - tcp: properly terminate timers for kernel sockets
      
         - ice: fix memory corruption bug with suspend and rebuild
      
         - at803x: fix kernel panic with at8031_probe
      
         - qeth: handle deferred cc1
      
        Previous releases - always broken:
      
         - bpf: fix bug in BPF_LDX_MEMSX
      
         - netfilter: reject table flag and netdev basechain updates
      
         - inet_defrag: prevent sk release while still in use
      
         - wifi: pick the version of SESSION_PROTECTION_NOTIF
      
         - wwan: t7xx: split 64bit accesses to fix alignment issues
      
         - mlxbf_gige: call request_irq() after NAPI initialized
      
         - hns3: fix kernel crash when devlink reload during pf
           initialization"
      
      * tag 'net-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
        inet: inet_defrag: prevent sk release while still in use
        Octeontx2-af: fix pause frame configuration in GMP mode
        net: lan743x: Add set RFE read fifo threshold for PCI1x1x chips
        net: bcmasp: Remove phy_{suspend/resume}
        net: bcmasp: Bring up unimac after PHY link up
        net: phy: qcom: at803x: fix kernel panic with at8031_probe
        netfilter: arptables: Select NETFILTER_FAMILY_ARP when building arp_tables.c
        netfilter: nf_tables: skip netdev hook unregistration if table is dormant
        netfilter: nf_tables: reject table flag and netdev basechain updates
        netfilter: nf_tables: reject destroy command to remove basechain hooks
        bpf: update BPF LSM designated reviewer list
        bpf: Protect against int overflow for stack access size
        bpf: Check bloom filter map value size
        bpf: fix warning for crash_kexec
        selftests: netdevsim: set test timeout to 10 minutes
        net: wan: framer: Add missing static inline qualifiers
        mlxbf_gige: call request_irq() after NAPI initialized
        tls: get psock ref after taking rxlock to avoid leak
        selftests: tls: add test with a partially invalid iov
        tls: adjust recv return with async crypto and failed copy to userspace
        ...
    • inet: inet_defrag: prevent sk release while still in use · 18685451
      Florian Westphal authored
      ip_local_out() and other functions can pass skb->sk as function argument.
      
      If the skb is a fragment and reassembly happens before such function call
      returns, the sk must not be released.
      
      This affects skb fragments reassembled via netfilter or similar
      modules, e.g. openvswitch or ct_act.c, when run as part of tx pipeline.
      
      Eric Dumazet made an initial analysis of this bug.  Quoting Eric:
        Calling ip_defrag() in output path is also implying skb_orphan(),
        which is buggy because output path relies on sk not disappearing.
      
        A relevant old patch about the issue was :
        8282f274 ("inet: frag: Always orphan skbs inside ip_defrag()")
      
        [..]
      
        net/ipv4/ip_output.c depends on skb->sk being set, and probably to an
        inet socket, not an arbitrary one.
      
        If we orphan the packet in ipvlan, then downstream things like FQ
        packet scheduler will not work properly.
      
        We need to change ip_defrag() to only use skb_orphan() when really
        needed, ie whenever frag_list is going to be used.
      
      Eric suggested stashing sk in the fragment queue and made an initial patch.
      However, there is a problem with this:
      
      If skb is refragmented again right after, ip_do_fragment() will copy
      head->sk to the new fragments and set up the destructor as sock_wfree.
      IOW, we have no choice but to fix up sk_wmem accounting to reflect the
      fully reassembled skb, else wmem will underflow.
      
      This change moves the orphan down into the core, to the last possible
      moment. As ip_defrag_offset is aliased with the sk_buff->sk member, we
      must move the offset into the FRAG_CB, else skb->sk gets clobbered.
      
      This allows delaying the orphaning long enough to learn whether the skb
      has to be queued or whether the skb is completing the reasm queue.
      
      In the former case, things work as before, skb is orphaned.  This is
      safe because skb gets queued/stolen and won't continue past reasm engine.
      
      In the latter case, we will steal the skb->sk reference, reattach it to
      the head skb, and fix up wmem accounting when inet_frag inflates truesize.
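
The aliasing problem described above can be illustrated with a small user-space sketch. It is not kernel code: `toy_sock`, `toy_skb`, `toy_frag_cb` and the helper functions are hypothetical stand-ins, and the 48-byte `cb` array is only assumed to play the role of the per-layer control block. The sketch shows that a field sharing storage with `sk` cannot hold the fragment offset while `sk` must stay valid, whereas an offset kept in the control block leaves `sk` untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins; these are NOT the kernel's struct sock or struct sk_buff. */
struct toy_sock {
    int id;
};

struct toy_skb {
    union {
        struct toy_sock *sk;  /* owning socket */
        int frag_offset;      /* old layout: scratch field sharing sk's storage */
    };
    char cb[48];              /* private scratch area (assumed control block) */
};

/* Hypothetical reassembly-private data kept inside cb[]. */
struct toy_frag_cb {
    int frag_offset;
};

/* Returns 1 when the old offset field occupies the same bytes as sk. */
static int toy_offset_aliases_sk(void)
{
    return offsetof(struct toy_skb, sk) ==
           offsetof(struct toy_skb, frag_offset);
}

/* New layout: store the offset in the control block, leaving sk alone. */
static void toy_cb_set_offset(struct toy_skb *skb, int off)
{
    struct toy_frag_cb cb = { .frag_offset = off };

    memcpy(skb->cb, &cb, sizeof(cb));
}

/* Read the offset back from the control block. */
static int toy_cb_get_offset(const struct toy_skb *skb)
{
    struct toy_frag_cb cb;

    memcpy(&cb, skb->cb, sizeof(cb));
    return cb.frag_offset;
}
```

With the offset in the control block, code that runs later can still read `sk` from the same buffer, which is what makes it possible to postpone the orphaning decision.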
      
      Fixes: 7026b1dd ("netfilter: Pass socket pointer down through okfn().")
      Diagnosed-by: Eric Dumazet <edumazet@google.com>
      Reported-by: xingwei lee <xrivendell7@gmail.com>
      Reported-by: yue sun <samsun1006219@gmail.com>
      Reported-by: syzbot+e5167d7144a62715044c@syzkaller.appspotmail.com
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20240326101845.30836-1-fw@strlen.de
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
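
The write-memory accounting concern in this commit message can be modelled with a toy counter. This is a simplified sketch under stated assumptions, not the kernel implementation: `toy_sock`, `toy_skb`, `toy_set_owner_w()`, `toy_wfree()` and `toy_reasm_finish()` are hypothetical names, and a plain unsigned counter stands in for the socket's write-memory charge. It shows why the charge has to grow together with the buffer's truesize when the reassembled head keeps its socket reference.

```c
#include <assert.h>
#include <stddef.h>

/* Toy socket: only tracks bytes charged for in-flight transmit buffers. */
struct toy_sock {
    unsigned int wmem_alloc;
};

/* Toy buffer: an owner plus the memory footprint charged to that owner. */
struct toy_skb {
    struct toy_sock *sk;
    unsigned int truesize;
};

/* Attach the buffer to a socket and charge its current truesize. */
static void toy_set_owner_w(struct toy_skb *skb, struct toy_sock *sk)
{
    skb->sk = sk;
    sk->wmem_alloc += skb->truesize;
}

/* Destructor: give back exactly truesize bytes, whatever was charged. */
static void toy_wfree(struct toy_skb *skb)
{
    skb->sk->wmem_alloc -= skb->truesize;
    skb->sk = NULL;
}

/*
 * Reassembly completes: the head absorbs `delta` bytes of fragment memory.
 * When `fixup` is non-zero the owner's charge grows by the same amount.
 */
static void toy_reasm_finish(struct toy_skb *head, unsigned int delta,
                             int fixup)
{
    head->truesize += delta;
    if (fixup && head->sk)
        head->sk->wmem_alloc += delta;
}
```

In this model, skipping the fix-up makes the destructor subtract more than was ever charged, so the unsigned counter wraps around. With the fix-up, charge and release stay balanced.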
    • Octeontx2-af: fix pause frame configuration in GMP mode · 40d4b480
      Hariprasad Kelam authored
      The Octeontx2 MAC block (CGX) has separate data paths (SMU and GMP) for
      different speeds, allowing for efficient data transfer.
      
      The previous patch that added pause frame configuration has a bug that
      prevents the pause frame feature from working in GMP mode.
      
      This patch fixes the issue by configuring the appropriate registers.
      
      Fixes: f7e086e7 ("octeontx2-af: Pause frame configuration at cgx")
      Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240326052720.4441-1-hkelam@marvell.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: lan743x: Add set RFE read fifo threshold for PCI1x1x chips · e4a58989
      Raju Lakkaraju authored
      PCI11x1x Rev B0 devices might drop packets when receiving back to back frames
      at 2.5G link speed. Change the B0 Rev device's Receive filtering Engine FIFO
      threshold parameter from its hardware default of 4 to 3 dwords to prevent the
      problem. Rev C0 and later hardware already defaults to 3 dwords.
      
      Fixes: bb4f6bff ("net: lan743x: Add PCI11010 / PCI11414 device IDs")
      Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240326065805.686128-1-Raju.Lakkaraju@microchip.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
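
The kind of change described in this commit, overriding a hardware default only on one chip revision, usually comes down to a read-modify-write of a register field. The following is a minimal sketch under assumptions: the field position, the mask width, the revision codes and all `toy_`/`TOY_` names are hypothetical and are not taken from the lan743x driver or the chip documentation. Only the values 4 (B0 hardware default) and 3 (new threshold) come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout; the real register map is not given here. */
#define TOY_RFE_FIFO_TH_SHIFT 16
#define TOY_RFE_FIFO_TH_MASK  (0x7Fu << TOY_RFE_FIFO_TH_SHIFT)

/* Hypothetical revision codes used only by this sketch. */
#define TOY_REV_B0 0xB0u
#define TOY_REV_C0 0xC0u

/* Replace the threshold field and leave every other bit as it was. */
static uint32_t toy_set_fifo_threshold(uint32_t reg, uint32_t dwords)
{
    reg &= ~TOY_RFE_FIFO_TH_MASK;
    reg |= (dwords << TOY_RFE_FIFO_TH_SHIFT) & TOY_RFE_FIFO_TH_MASK;
    return reg;
}

/* Apply the 3-dword threshold on Rev B0 only; later revisions keep theirs. */
static uint32_t toy_fixup_rfe(uint32_t reg, uint32_t rev)
{
    if (rev == TOY_REV_B0)
        return toy_set_fifo_threshold(reg, 3);
    return reg;
}
```

Gating the write on the revision keeps the change limited to the affected hardware, since later revisions are stated to default to 3 dwords already.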