1. 31 Jan, 2024 2 commits
    • bridge: mcast: fix disabled snooping after long uptime · f5c3eb4b
      Linus Lüssing authored
      The original idea of the delay_time check was to not apply multicast
      snooping too early when an MLD querier appears, and instead to wait at
      least for MLD reports to arrive before switching from flooding to
      group-based, MLD-snooped forwarding, to avoid temporary packet loss.
      
      However in a batman-adv mesh network it was noticed that after 248 days of
      uptime 32bit MIPS based devices would start to signal that they had
      stopped applying multicast snooping due to missing queriers - even though
      they were the elected querier and still sending MLD queries themselves.
      
      While time_is_before_jiffies() is generally safe against jiffies
      wrap-arounds, as the code comments in jiffies.h explain, it cannot
      track a difference larger than ULONG_MAX/2. With a 32-bit jiffies
      counter and one jiffies tick every 10ms (CONFIG_HZ=100) on these MIPS
      devices running OpenWrt, the difference exceeds ULONG_MAX/2 after
      248 (= 2^32/100/60/60/24/2) days, and time_is_before_jiffies() then
      starts to return false instead of true. As a result, multicast
      snooping is no longer applied to multicast packets.
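The wrap-around limit described above can be reproduced outside the kernel. The following is a minimal Python sketch, assuming a 32-bit unsigned long and CONFIG_HZ=100. time_after() and time_is_before_jiffies() mirror the kernel macros of the same name, while to_signed32() and days_until_misfire() are hypothetical helpers added for illustration; none of this is the bridge code itself.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret a value as a signed 32-bit integer, like a (long) cast on 32-bit."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def time_after(a, b):
    """Model of the kernel's time_after(a, b): true if a is after b."""
    return to_signed32(b - a) < 0


def time_is_before_jiffies(timestamp, jiffies):
    """Model of time_is_before_jiffies(timestamp) for an explicit jiffies value."""
    return time_after(jiffies, timestamp)


def days_until_misfire(hz):
    """Whole days until the difference exceeds 2^31 ticks at `hz` ticks per second."""
    return (1 << 31) // hz // (60 * 60 * 24)


if __name__ == "__main__":
    delay_time = 1000
    print(time_is_before_jiffies(delay_time, delay_time + 1))
    print(time_is_before_jiffies(delay_time, delay_time + (1 << 31) + 1))
    print(days_until_misfire(100))
```

With a timestamp that is never refreshed, the check holds until jiffies has advanced more than 2^31 ticks past it, which at 100 ticks per second is a little over 248 days.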
      
      Fix this issue by using a proper timer_list object which won't have this
      ULONG_MAX/2 difference limitation.
      
      Fixes: b00589af ("bridge: disable snooping if there is no querier")
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20240127175033.9640-1-linus.luessing@c0d3.blue
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: Add missing matchall classifier · b40f873a
      Ido Schimmel authored
      One of the test cases in the test_bridge_backup_port.sh selftest relies
      on a matchall classifier to drop unrelated traffic so that the Tx drop
      counter on the VXLAN device will only be incremented as a result of
      traffic generated by the test.
      
      However, the configuration option for the matchall classifier is
      missing from the configuration file which might explain the failures we
      see in the netdev CI [1].
      
      Fix by adding CONFIG_NET_CLS_MATCHALL to the configuration file.
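A missing option of this kind can be caught before a test run by comparing the kernel config against the options a test relies on. This is a minimal Python sketch under stated assumptions: missing_options() is a hypothetical helper, and the sample config text is made up for illustration rather than taken from the selftests config file.

```python
def missing_options(config_text, required):
    """Return the required CONFIG_ options that are not set to y or m."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]


if __name__ == "__main__":
    # Hypothetical config fragment, for illustration only.
    sample = "CONFIG_VXLAN=m\n# CONFIG_NET_CLS_MATCHALL is not set\n"
    print(missing_options(sample, ["CONFIG_VXLAN", "CONFIG_NET_CLS_MATCHALL"]))
```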
      
      [1]
       # Backup nexthop ID - invalid IDs
       # -------------------------------
       [...]
       # TEST: Forwarding out of vx0                                         [ OK ]
       # TEST: No forwarding using backup nexthop ID                         [ OK ]
       # TEST: Tx drop increased                                             [FAIL]
       # TEST: IPv6 address family nexthop as backup nexthop                 [ OK ]
       # TEST: No forwarding out of swp1                                     [ OK ]
       # TEST: Forwarding out of vx0                                         [ OK ]
       # TEST: No forwarding using backup nexthop ID                         [ OK ]
       # TEST: Tx drop increased                                             [FAIL]
       [...]
      
      Fixes: b4084530 ("selftests: net: Add bridge backup port and backup nexthop ID test")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20240129123703.1857843-1-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 30 Jan, 2024 7 commits
    • llc: call sock_orphan() at release time · aa2b2eb3
      Eric Dumazet authored
      syzbot reported an interesting trace [1] caused by a stale sk->sk_wq
      pointer in a closed llc socket.
      
      In commit ff7b11aa ("net: socket: set sock->sk to NULL after
      calling proto_ops::release()") Eric Biggers hinted that some protocols
      are missing a sock_orphan() call; a full audit is still needed.
      
      In net-next, I plan to clear sock->sk from sock_orphan() and
      amend Eric's patch to add a warning.
      
      [1]
       BUG: KASAN: slab-use-after-free in list_empty include/linux/list.h:373 [inline]
       BUG: KASAN: slab-use-after-free in waitqueue_active include/linux/wait.h:127 [inline]
       BUG: KASAN: slab-use-after-free in sock_def_write_space_wfree net/core/sock.c:3384 [inline]
       BUG: KASAN: slab-use-after-free in sock_wfree+0x9a8/0x9d0 net/core/sock.c:2468
      Read of size 8 at addr ffff88802f4fc880 by task ksoftirqd/1/27
      
      CPU: 1 PID: 27 Comm: ksoftirqd/1 Not tainted 6.8.0-rc1-syzkaller-00049-g6098d87e #0
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
      Call Trace:
       <TASK>
        __dump_stack lib/dump_stack.c:88 [inline]
        dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
        print_address_description mm/kasan/report.c:377 [inline]
        print_report+0xc4/0x620 mm/kasan/report.c:488
        kasan_report+0xda/0x110 mm/kasan/report.c:601
        list_empty include/linux/list.h:373 [inline]
        waitqueue_active include/linux/wait.h:127 [inline]
        sock_def_write_space_wfree net/core/sock.c:3384 [inline]
        sock_wfree+0x9a8/0x9d0 net/core/sock.c:2468
        skb_release_head_state+0xa3/0x2b0 net/core/skbuff.c:1080
        skb_release_all net/core/skbuff.c:1092 [inline]
        napi_consume_skb+0x119/0x2b0 net/core/skbuff.c:1404
        e1000_unmap_and_free_tx_resource+0x144/0x200 drivers/net/ethernet/intel/e1000/e1000_main.c:1970
        e1000_clean_tx_irq drivers/net/ethernet/intel/e1000/e1000_main.c:3860 [inline]
        e1000_clean+0x4a1/0x26e0 drivers/net/ethernet/intel/e1000/e1000_main.c:3801
        __napi_poll.constprop.0+0xb4/0x540 net/core/dev.c:6576
        napi_poll net/core/dev.c:6645 [inline]
        net_rx_action+0x956/0xe90 net/core/dev.c:6778
        __do_softirq+0x21a/0x8de kernel/softirq.c:553
        run_ksoftirqd kernel/softirq.c:921 [inline]
        run_ksoftirqd+0x31/0x60 kernel/softirq.c:913
        smpboot_thread_fn+0x660/0xa10 kernel/smpboot.c:164
        kthread+0x2c6/0x3a0 kernel/kthread.c:388
        ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
        ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
       </TASK>
      
      Allocated by task 5167:
        kasan_save_stack+0x33/0x50 mm/kasan/common.c:47
        kasan_save_track+0x14/0x30 mm/kasan/common.c:68
        unpoison_slab_object mm/kasan/common.c:314 [inline]
        __kasan_slab_alloc+0x81/0x90 mm/kasan/common.c:340
        kasan_slab_alloc include/linux/kasan.h:201 [inline]
        slab_post_alloc_hook mm/slub.c:3813 [inline]
        slab_alloc_node mm/slub.c:3860 [inline]
        kmem_cache_alloc_lru+0x142/0x6f0 mm/slub.c:3879
        alloc_inode_sb include/linux/fs.h:3019 [inline]
        sock_alloc_inode+0x25/0x1c0 net/socket.c:308
        alloc_inode+0x5d/0x220 fs/inode.c:260
        new_inode_pseudo+0x16/0x80 fs/inode.c:1005
        sock_alloc+0x40/0x270 net/socket.c:634
        __sock_create+0xbc/0x800 net/socket.c:1535
        sock_create net/socket.c:1622 [inline]
        __sys_socket_create net/socket.c:1659 [inline]
        __sys_socket+0x14c/0x260 net/socket.c:1706
        __do_sys_socket net/socket.c:1720 [inline]
        __se_sys_socket net/socket.c:1718 [inline]
        __x64_sys_socket+0x72/0xb0 net/socket.c:1718
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xd3/0x250 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Freed by task 0:
        kasan_save_stack+0x33/0x50 mm/kasan/common.c:47
        kasan_save_track+0x14/0x30 mm/kasan/common.c:68
        kasan_save_free_info+0x3f/0x60 mm/kasan/generic.c:640
        poison_slab_object mm/kasan/common.c:241 [inline]
        __kasan_slab_free+0x121/0x1b0 mm/kasan/common.c:257
        kasan_slab_free include/linux/kasan.h:184 [inline]
        slab_free_hook mm/slub.c:2121 [inline]
        slab_free mm/slub.c:4299 [inline]
        kmem_cache_free+0x129/0x350 mm/slub.c:4363
        i_callback+0x43/0x70 fs/inode.c:249
        rcu_do_batch kernel/rcu/tree.c:2158 [inline]
        rcu_core+0x819/0x1680 kernel/rcu/tree.c:2433
        __do_softirq+0x21a/0x8de kernel/softirq.c:553
      
      Last potentially related work creation:
        kasan_save_stack+0x33/0x50 mm/kasan/common.c:47
        __kasan_record_aux_stack+0xba/0x100 mm/kasan/generic.c:586
        __call_rcu_common.constprop.0+0x9a/0x7b0 kernel/rcu/tree.c:2683
        destroy_inode+0x129/0x1b0 fs/inode.c:315
        iput_final fs/inode.c:1739 [inline]
        iput.part.0+0x560/0x7b0 fs/inode.c:1765
        iput+0x5c/0x80 fs/inode.c:1755
        dentry_unlink_inode+0x292/0x430 fs/dcache.c:400
        __dentry_kill+0x1ca/0x5f0 fs/dcache.c:603
        dput.part.0+0x4ac/0x9a0 fs/dcache.c:845
        dput+0x1f/0x30 fs/dcache.c:835
        __fput+0x3b9/0xb70 fs/file_table.c:384
        task_work_run+0x14d/0x240 kernel/task_work.c:180
        exit_task_work include/linux/task_work.h:38 [inline]
        do_exit+0xa8a/0x2ad0 kernel/exit.c:871
        do_group_exit+0xd4/0x2a0 kernel/exit.c:1020
        __do_sys_exit_group kernel/exit.c:1031 [inline]
        __se_sys_exit_group kernel/exit.c:1029 [inline]
        __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1029
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xd3/0x250 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      The buggy address belongs to the object at ffff88802f4fc800
       which belongs to the cache sock_inode_cache of size 1408
      The buggy address is located 128 bytes inside of
       freed 1408-byte region [ffff88802f4fc800, ffff88802f4fcd80)
      
      The buggy address belongs to the physical page:
      page:ffffea0000bd3e00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2f4f8
      head:ffffea0000bd3e00 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      anon flags: 0xfff00000000840(slab|head|node=0|zone=1|lastcpupid=0x7ff)
      page_type: 0xffffffff()
      raw: 00fff00000000840 ffff888013b06b40 0000000000000000 0000000000000001
      raw: 0000000000000000 0000000080150015 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 3, migratetype Reclaimable, gfp_mask 0xd20d0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_RECLAIMABLE), pid 4956, tgid 4956 (sshd), ts 31423924727, free_ts 0
        set_page_owner include/linux/page_owner.h:31 [inline]
        post_alloc_hook+0x2d0/0x350 mm/page_alloc.c:1533
        prep_new_page mm/page_alloc.c:1540 [inline]
        get_page_from_freelist+0xa28/0x3780 mm/page_alloc.c:3311
        __alloc_pages+0x22f/0x2440 mm/page_alloc.c:4567
        __alloc_pages_node include/linux/gfp.h:238 [inline]
        alloc_pages_node include/linux/gfp.h:261 [inline]
        alloc_slab_page mm/slub.c:2190 [inline]
        allocate_slab mm/slub.c:2354 [inline]
        new_slab+0xcc/0x3a0 mm/slub.c:2407
        ___slab_alloc+0x4af/0x19a0 mm/slub.c:3540
        __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3625
        __slab_alloc_node mm/slub.c:3678 [inline]
        slab_alloc_node mm/slub.c:3850 [inline]
        kmem_cache_alloc_lru+0x379/0x6f0 mm/slub.c:3879
        alloc_inode_sb include/linux/fs.h:3019 [inline]
        sock_alloc_inode+0x25/0x1c0 net/socket.c:308
        alloc_inode+0x5d/0x220 fs/inode.c:260
        new_inode_pseudo+0x16/0x80 fs/inode.c:1005
        sock_alloc+0x40/0x270 net/socket.c:634
        __sock_create+0xbc/0x800 net/socket.c:1535
        sock_create net/socket.c:1622 [inline]
        __sys_socket_create net/socket.c:1659 [inline]
        __sys_socket+0x14c/0x260 net/socket.c:1706
        __do_sys_socket net/socket.c:1720 [inline]
        __se_sys_socket net/socket.c:1718 [inline]
        __x64_sys_socket+0x72/0xb0 net/socket.c:1718
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xd3/0x250 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      page_owner free stack trace missing
      
      Memory state around the buggy address:
       ffff88802f4fc780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff88802f4fc800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff88802f4fc880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                         ^
       ffff88802f4fc900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff88802f4fc980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: 43815482 ("net: sock_def_readable() and friends RCU conversion")
      Reported-and-tested-by: syzbot+32b89eaa102b372ff76d@syzkaller.appspotmail.com
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Eric Biggers <ebiggers@google.com>
      Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20240126165532.3396702-1-edumazet@google.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'net-stmmac-dwmac-imx-time-based-scheduling-support' · f8affba7
      Paolo Abeni authored
      Esben Haabendal says:
      
      ====================
      net: stmmac: dwmac-imx: Time Based Scheduling support
      
      This small patch series allows using TBS support of the i.MX Ethernet QOS
      controller for etf qdisc offload.
      It does so in a manner similar to dwmac-intel.c,
      dwmac-mediatek.c and stmmac_pci.c.
      
      Changes since v1:
      
      - Simplified for loop by starting at index 1.
      - Fixed problem with indentation.
      ====================
      
      Link: https://lore.kernel.org/r/cover.1706256158.git.esben@geanix.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: stmmac: dwmac-imx: set TSO/TBS TX queues default settings · 3b12ec8f
      Esben Haabendal authored
      TSO and TBS cannot coexist. For now we set i.MX Ethernet QOS controller to
      use the first TX queue with TSO and the rest for TBS.
      
      TX queues with TBS can support etf qdisc hw offload.
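The default split described above (queue 0 for TSO, every other queue for TBS) can be modelled in a few lines. This is a minimal Python sketch rather than the driver code: TxQueueCfg and default_tx_queue_cfg() are hypothetical names, and the loop starts at index 1 so that the first queue keeps TSO.

```python
from dataclasses import dataclass


@dataclass
class TxQueueCfg:
    # False: queue is left for TSO; True: queue is set up for TBS.
    tbs_en: bool = False


def default_tx_queue_cfg(num_queues):
    """Leave queue 0 for TSO and enable TBS on all remaining TX queues."""
    queues = [TxQueueCfg() for _ in range(num_queues)]
    for i in range(1, num_queues):
        queues[i].tbs_en = True
    return queues


if __name__ == "__main__":
    print([q.tbs_en for q in default_tx_queue_cfg(4)])
```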

      Signed-off-by: Esben Haabendal <esben@geanix.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
      Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: stmmac: do not clear TBS enable bit on link up/down · 4896bb7c
      Esben Haabendal authored
      With the dma conf being reallocated on each call to stmmac_open(), any
      information in there is lost, unless we specifically handle it.
      
      The STMMAC_TBS_EN bit is set when adding an etf qdisc, and the etf qdisc
      therefore would stop working when link was set down and then back up.
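The general pattern, carrying a per-queue flag across a reallocated configuration, can be sketched as follows. This Python model is an assumption about the shape of the fix, not the driver's implementation: TBS_EN is an illustrative flag value, and new_dma_conf() and reopen() are hypothetical names.

```python
TBS_EN = 0x1  # illustrative flag value, not the driver's actual definition


def new_dma_conf(num_queues):
    """Allocate a fresh per-queue flag list with everything cleared."""
    return [0] * num_queues


def reopen(old_conf):
    """Build a fresh configuration, carrying over only the TBS enable bit."""
    new_conf = new_dma_conf(len(old_conf))
    for i, flags in enumerate(old_conf):
        if flags & TBS_EN:
            new_conf[i] |= TBS_EN
    return new_conf


if __name__ == "__main__":
    conf = new_dma_conf(3)
    conf[1] |= TBS_EN      # etf qdisc added on queue 1
    conf = reopen(conf)    # link set down and back up
    print(conf)
```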
      
      Fixes: ba39b344 ("net: ethernet: stmicro: stmmac: generate stmmac dma conf before open")
      Cc: stable@vger.kernel.org
      Signed-off-by: Esben Haabendal <esben@geanix.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • ipv6: Ensure natural alignment of const ipv6 loopback and router addresses · 60365049
      Helge Deller authored
      On a parisc64 kernel I sometimes notice this kernel warning:
      Kernel unaligned access to 0x40ff8814 at ndisc_send_skb+0xc0/0x4d8
      
      The address 0x40ff8814 points to the in6addr_linklocal_allrouters
      variable and the warning simply means that some ipv6 function tries to
      read a 64-bit word directly from the not-64-bit aligned
      in6addr_linklocal_allrouters variable.
      
      Unaligned accesses are non-critical as the architecture or exception
      handlers usually will fix it up at runtime. Nevertheless it may trigger
      a performance penalty for some architectures. For details read the
      "unaligned-memory-access" kernel documentation.
      
      The patch below ensures that the ipv6 loopback and router addresses will
      always be naturally aligned. This prevents the unaligned accesses for
      all architectures.
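The alignment of the reported address can be checked with simple arithmetic. This is a minimal Python sketch; misalignment() and align_up() are hypothetical helpers, and the only value taken from the report is the address 0x40ff8814.

```python
def misalignment(addr, alignment):
    """Bytes by which addr overshoots the previous alignment boundary."""
    return addr % alignment


def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


if __name__ == "__main__":
    addr = 0x40FF8814
    print(misalignment(addr, 4))   # 4-byte aligned
    print(misalignment(addr, 8))   # but 4 bytes off an 8-byte boundary
    print(hex(align_up(addr, 8)))
```

The address is 4-byte aligned but sits 4 bytes past an 8-byte boundary, which is why a 64-bit load from it is reported as unaligned.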

      Signed-off-by: Helge Deller <deller@gmx.de>
      Fixes: 034dfc5d ("ipv6: export in6addr_loopback to modules")
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Link: https://lore.kernel.org/r/ZbNuFM1bFqoH-UoY@p100
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • selftests: net: add missing config for nftables-backed iptables · 59c93583
      Jakub Kicinski authored
      Modern OSes use an iptables implementation with nf_tables as a backend,
      e.g.:
      
      $ iptables -V
      iptables v1.8.8 (nf_tables)
      
      Pablo points out that we need CONFIG_NFT_COMPAT to make that work,
      otherwise we see a lot of:
      
        Warning: Extension DNAT revision 0 not supported, missing kernel module?
      
      with DNAT being just an example here, other modules we need
      include udp, TTL, length etc.
      
      Link: https://lore.kernel.org/r/20240126201308.2903602-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: qca8k: fix illegal usage of GPIO · c44fc98f
      Michal Vokáč authored
      When working with a GPIO, its direction must be set either when the GPIO
      is requested by gpiod_get*() or later on by one of the
      gpiod_direction_*() functions. Neither is done here, which results in
      undefined behavior on some systems.
      
      As the reset GPIO is used right after it is requested here, it makes sense
      to configure it as GPIOD_OUT_HIGH right away. With that, the following
      gpiod_set_value_cansleep(1) becomes redundant and can be safely
      removed.
      
      Fixes: a653f2f5 ("net: dsa: qca8k: introduce reset via gpio feature")
      Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/1706266175-3408-1-git-send-email-michal.vokac@ysoft.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. 29 Jan, 2024 3 commits
    • tcp: add sanity checks to rx zerocopy · 577e4432
      Eric Dumazet authored
      The intent of TCP rx zerocopy is to map pages initially allocated
      from NIC drivers, not pages owned by a fs.
      
      This patch adds to can_map_frag() these additional checks:
      
      - Page must not be a compound one.
      - page->mapping must be NULL.
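The two added conditions amount to a simple predicate on the page. This is a minimal Python model, assuming a made-up Page type with only the two relevant attributes; it covers just the checks listed above, not the rest of can_map_frag().

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    # Hypothetical stand-in for the two properties the checks look at.
    compound: bool = False
    mapping: Optional[str] = None  # None models page->mapping == NULL


def passes_added_checks(page):
    """True only for a non-compound page that has no mapping."""
    return not page.compound and page.mapping is None


if __name__ == "__main__":
    print(passes_added_checks(Page()))                     # NIC-allocated page
    print(passes_added_checks(Page(mapping="ext4 file")))  # page owned by a fs
```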
      
      This fixes the panic reported by ZhangPeng.
      
      syzbot was able to loopback packets built with sendfile(),
      mapping pages owned by an ext4 file to TCP rx zerocopy.
      
      r3 = socket$inet_tcp(0x2, 0x1, 0x0)
      mmap(&(0x7f0000ff9000/0x4000)=nil, 0x4000, 0x0, 0x12, r3, 0x0)
      r4 = socket$inet_tcp(0x2, 0x1, 0x0)
      bind$inet(r4, &(0x7f0000000000)={0x2, 0x4e24, @multicast1}, 0x10)
      connect$inet(r4, &(0x7f00000006c0)={0x2, 0x4e24, @empty}, 0x10)
      r5 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
          0x181e42, 0x0)
      fallocate(r5, 0x0, 0x0, 0x85b8)
      sendfile(r4, r5, 0x0, 0x8ba0)
      getsockopt$inet_tcp_TCP_ZEROCOPY_RECEIVE(r4, 0x6, 0x23,
          &(0x7f00000001c0)={&(0x7f0000ffb000/0x3000)=nil, 0x3000, 0x0, 0x0, 0x0,
          0x0, 0x0, 0x0, 0x0}, &(0x7f0000000440)=0x40)
      r6 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
          0x181e42, 0x0)
      
      Fixes: 93ab6cc6 ("tcp: implement mmap() for zero copy receive")
      Link: https://lore.kernel.org/netdev/5106a58e-04da-372a-b836-9d3d0bd2507b@huawei.com/T/
      Reported-and-bisected-by: ZhangPeng <zhangpeng362@huawei.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Arjun Roy <arjunroy@google.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: linux-mm@vger.kernel.org
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfc: nci: free rx_data_reassembly skb on NCI device cleanup · bfb007ae
      Fedor Pchelkin authored
      rx_data_reassembly skb is stored during NCI data exchange for processing
      fragmented packets. It is dropped only when the last fragment is processed
      or when an NTF packet with NCI_OP_RF_DEACTIVATE_NTF opcode is received.
      However, the NCI device may be deallocated before that, which leads to an
      skb leak.
      
      As by design the rx_data_reassembly skb is bound to the NCI device and
      nothing prevents the device from being freed before the skb is processed
      in some way and cleaned, free it on the NCI device cleanup.
      
      Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
      
      Fixes: 6a2968aa ("NFC: basic NCI protocol implementation")
      Cc: stable@vger.kernel.org
      Reported-by: syzbot+6b7c68d9c21e4ee4251b@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/lkml/000000000000f43987060043da7b@google.com/
      Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hsr: remove WARN_ONCE() in send_hsr_supervision_frame() · 37e8c97e
      Nikita Zhandarovich authored
      Syzkaller reported [1] hitting a warning after failing to allocate
      resources for skb in hsr_init_skb(). Since a WARN_ONCE() call will
      not help much in this case, it might be prudent to switch to
      netdev_warn_once(). At the very least it will suppress syzkaller
      reports such as [1].
      
      Just in case, use netdev_warn_once() in send_prp_supervision_frame()
      for similar reasons.
      
      [1]
      HSR: Could not send supervision frame
      WARNING: CPU: 1 PID: 85 at net/hsr/hsr_device.c:294 send_hsr_supervision_frame+0x60a/0x810 net/hsr/hsr_device.c:294
      RIP: 0010:send_hsr_supervision_frame+0x60a/0x810 net/hsr/hsr_device.c:294
      ...
      Call Trace:
       <IRQ>
       hsr_announce+0x114/0x370 net/hsr/hsr_device.c:382
       call_timer_fn+0x193/0x590 kernel/time/timer.c:1700
       expire_timers kernel/time/timer.c:1751 [inline]
       __run_timers+0x764/0xb20 kernel/time/timer.c:2022
       run_timer_softirq+0x58/0xd0 kernel/time/timer.c:2035
       __do_softirq+0x21a/0x8de kernel/softirq.c:553
       invoke_softirq kernel/softirq.c:427 [inline]
       __irq_exit_rcu kernel/softirq.c:632 [inline]
       irq_exit_rcu+0xb7/0x120 kernel/softirq.c:644
       sysvec_apic_timer_interrupt+0x95/0xb0 arch/x86/kernel/apic/apic.c:1076
       </IRQ>
       <TASK>
       asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:649
      ...
      
      This issue is also found in older kernels (at least up to 5.10).
      
      Cc: stable@vger.kernel.org
      Reported-by: syzbot+3ae0a3f42c84074b7c8e@syzkaller.appspotmail.com
      Fixes: 121c33b0 ("net: hsr: introduce common code for skb initialization")
      Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 27 Jan, 2024 3 commits
    • net: lan966x: Fix port configuration when using SGMII interface · 62b42481
      Horatiu Vultur authored
      In case the interface between the MAC and the PHY is SGMII, then the bit
      GIGA_MODE on the MAC side needs to be set regardless of the speed at
      which it is running.
      
      Fixes: d28d6d2e ("net: lan966x: add port module support")
      Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: Add connector headers to NETWORKING DRIVERS · 586e40aa
      Simon Horman authored
      Commit 46cf789b ("connector: Move maintainence under networking
      drivers umbrella.") moved the connector maintenance but did not include
      the connector header files.
      
      It seems that it has always been implied that these headers were
      maintained along with the rest of the connector code, both before and
      after the cited commit. Make this explicit.

      Signed-off-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipmr: fix kernel panic when forwarding mcast packets · e622502c
      Nicolas Dichtel authored
      The stacktrace was:
      [   86.305548] BUG: kernel NULL pointer dereference, address: 0000000000000092
      [   86.306815] #PF: supervisor read access in kernel mode
      [   86.307717] #PF: error_code(0x0000) - not-present page
      [   86.308624] PGD 0 P4D 0
      [   86.309091] Oops: 0000 [#1] PREEMPT SMP NOPTI
      [   86.309883] CPU: 2 PID: 3139 Comm: pimd Tainted: G     U             6.8.0-6wind-knet #1
      [   86.311027] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014
      [   86.312728] RIP: 0010:ip_mr_forward (/build/work/knet/net/ipv4/ipmr.c:1985)
      [ 86.313399] Code: f9 1f 0f 87 85 03 00 00 48 8d 04 5b 48 8d 04 83 49 8d 44 c5 00 48 8b 40 70 48 39 c2 0f 84 d9 00 00 00 49 8b 46 58 48 83 e0 fe <80> b8 92 00 00 00 00 0f 84 55 ff ff ff 49 83 47 38 01 45 85 e4 0f
      [   86.316565] RSP: 0018:ffffad21c0583ae0 EFLAGS: 00010246
      [   86.317497] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [   86.318596] RDX: ffff9559cb46c000 RSI: 0000000000000000 RDI: 0000000000000000
      [   86.319627] RBP: ffffad21c0583b30 R08: 0000000000000000 R09: 0000000000000000
      [   86.320650] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
      [   86.321672] R13: ffff9559c093a000 R14: ffff9559cc00b800 R15: ffff9559c09c1d80
      [   86.322873] FS:  00007f85db661980(0000) GS:ffff955a79d00000(0000) knlGS:0000000000000000
      [   86.324291] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   86.325314] CR2: 0000000000000092 CR3: 000000002f13a000 CR4: 0000000000350ef0
      [   86.326589] Call Trace:
      [   86.327036]  <TASK>
      [   86.327434] ? show_regs (/build/work/knet/arch/x86/kernel/dumpstack.c:479)
      [   86.328049] ? __die (/build/work/knet/arch/x86/kernel/dumpstack.c:421 /build/work/knet/arch/x86/kernel/dumpstack.c:434)
      [   86.328508] ? page_fault_oops (/build/work/knet/arch/x86/mm/fault.c:707)
      [   86.329107] ? do_user_addr_fault (/build/work/knet/arch/x86/mm/fault.c:1264)
      [   86.329756] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
      [   86.330350] ? __irq_work_queue_local (/build/work/knet/kernel/irq_work.c:111 (discriminator 1))
      [   86.331013] ? exc_page_fault (/build/work/knet/./arch/x86/include/asm/paravirt.h:693 /build/work/knet/arch/x86/mm/fault.c:1515 /build/work/knet/arch/x86/mm/fault.c:1563)
      [   86.331702] ? asm_exc_page_fault (/build/work/knet/./arch/x86/include/asm/idtentry.h:570)
      [   86.332468] ? ip_mr_forward (/build/work/knet/net/ipv4/ipmr.c:1985)
      [   86.333183] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
      [   86.333920] ipmr_mfc_add (/build/work/knet/./include/linux/rcupdate.h:782 /build/work/knet/net/ipv4/ipmr.c:1009 /build/work/knet/net/ipv4/ipmr.c:1273)
      [   86.334583] ? __pfx_ipmr_hash_cmp (/build/work/knet/net/ipv4/ipmr.c:363)
      [   86.335357] ip_mroute_setsockopt (/build/work/knet/net/ipv4/ipmr.c:1470)
      [   86.336135] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
      [   86.336854] ? ip_mroute_setsockopt (/build/work/knet/net/ipv4/ipmr.c:1470)
      [   86.337679] do_ip_setsockopt (/build/work/knet/net/ipv4/ip_sockglue.c:944)
      [   86.338408] ? __pfx_unix_stream_read_actor (/build/work/knet/net/unix/af_unix.c:2862)
      [   86.339232] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
      [   86.339809] ? aa_sk_perm (/build/work/knet/security/apparmor/include/cred.h:153 /build/work/knet/security/apparmor/net.c:181)
      [   86.340342] ip_setsockopt (/build/work/knet/net/ipv4/ip_sockglue.c:1415)
      [   86.340859] raw_setsockopt (/build/work/knet/net/ipv4/raw.c:836)
      [   86.341408] ? security_socket_setsockopt (/build/work/knet/security/security.c:4561 (discriminator 13))
      [   86.342116] sock_common_setsockopt (/build/work/knet/net/core/sock.c:3716)
      [   86.342747] do_sock_setsockopt (/build/work/knet/net/socket.c:2313)
      [   86.343363] __sys_setsockopt (/build/work/knet/./include/linux/file.h:32 /build/work/knet/net/socket.c:2336)
      [   86.344020] __x64_sys_setsockopt (/build/work/knet/net/socket.c:2340)
      [   86.344766] do_syscall_64 (/build/work/knet/arch/x86/entry/common.c:52 /build/work/knet/arch/x86/entry/common.c:83)
      [   86.345433] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
      [   86.346161] ? syscall_exit_work (/build/work/knet/./include/linux/audit.h:357 /build/work/knet/kernel/entry/common.c:160)
      [   86.346938] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
      [   86.347657] ? syscall_exit_to_user_mode (/build/work/knet/kernel/entry/common.c:215)
      [   86.348538] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
      [   86.349262] ? do_syscall_64 (/build/work/knet/./arch/x86/include/asm/cpufeature.h:171 /build/work/knet/arch/x86/entry/common.c:98)
      [   86.349971] entry_SYSCALL_64_after_hwframe (/build/work/knet/arch/x86/entry/entry_64.S:129)
      
      The original packet in ipmr_cache_report() may be queued and then forwarded
      with ip_mr_forward(). This last function has the assumption that the skb
      dst is set.
      
      After the below commit, the skb dst is dropped by ipv4_pktinfo_prepare(),
      which causes the oops.
      
      Fixes: bb740365 ("ipmr: support IP_PKTINFO on cache report IGMP msg")
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20240125141847.1931933-1-nicolas.dichtel@6wind.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  5. 26 Jan, 2024 15 commits
  6. 25 Jan, 2024 10 commits
    • Merge branch 'selftests-net-a-few-fixes' · ce36ea75
      Jakub Kicinski authored
      Paolo Abeni says:
      
      ====================
      selftests: net: a few fixes
      
      This series addresses self-test failures for UDP GRO-related tests.
      
      The first patch addresses the main problem I observe locally - the XDP
      program required by such tests, xdp_dummy, is currently built in the
      ebpf self-tests directory, not available if/when the user targets net
      only. Arguably it is more a refactor than a fix, but still targeting net
      to hopefully
      
      The second patch fixes the integration of such tests with the build
      system.
      
      Patch 3/3 fixes sporadic failures due to races.
      
      Tested with:
      
      make -C tools/testing/selftests/ TARGETS=net install
      ./tools/testing/selftests/kselftest_install/run_kselftest.sh \
      	-t "net:udpgro_bench.sh net:udpgro.sh net:udpgro_fwd.sh \
      	    net:udpgro_frglist.sh net:veth.sh"
      
      no failures.
      ====================
      
      Link: https://lore.kernel.org/r/cover.1706131762.git.pabeni@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: explicitly wait for listener ready · 4acffb66
      Paolo Abeni authored
      The UDP GRO forwarding test still hard-codes an arbitrary pause
      to wait for the UDP listener to become ready in the background.
      
      That causes sporadic failures depending on the host load.
      
      Replace the sleep with the existing helper that waits for the desired
      port to be exposed.
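
The idea of polling for the listener instead of sleeping for a fixed time can be sketched as below. This is a minimal illustration, not the helper the commit actually uses: the function name `wait_port_listen`, the `PROC_NET_DIR` override, and the retry count are all assumptions made for the example. It relies on the `/proc/net/udp` and `/proc/net/tcp` tables listing the local address as hex `IP:PORT` in the second column.

```shell
#!/bin/sh
# Sketch only: poll a /proc/net-style socket table until a local port shows up.
# PROC_NET_DIR is a hypothetical override so the function can be pointed at a
# fixture directory; it defaults to the real /proc/net.

wait_port_listen() {
    port="$1"            # decimal port number, e.g. 8000
    proto="$2"           # table name prefix, e.g. udp or tcp
    tries="${3:-10}"     # polling attempts, roughly 0.1 s apart
    dir="${PROC_NET_DIR:-/proc/net}"

    # The tables print the local address as HEXIP:HEXPORT in column 2.
    port_hex=$(printf '%04X' "$port")

    i=0
    while [ "$i" -lt "$tries" ]; do
        if cat "$dir/$proto"* 2>/dev/null |
            awk -v p=":$port_hex" '$2 ~ (p "$") { found = 1 } END { exit !found }'
        then
            return 0     # a socket is bound to the port
        fi
        sleep 0.1        # fractional sleep: GNU coreutils and busybox accept it
        i=$((i + 1))
    done
    return 1             # gave up: the port never appeared
}
```

A test script would start its receiver in the background and then call, for example, `wait_port_listen 8000 udp || exit 1` before sending traffic, so the wait ends as soon as the socket is bound rather than after a fixed delay that may be too short on a loaded host.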
      
      Fixes: a062260a ("selftests: net: add UDP GRO forwarding self-tests")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/4d58900fb09cef42749cfcf2ad7f4b91a97d225c.1706131762.git.pabeni@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: included needed helper in the install targets · f5173fe3
      Paolo Abeni authored
      The blamed commit below introduces a dependency in some net self-tests
      on a newly introduced helper script.
      
      That script is currently not included in the TEST_PROGS_EXTENDED list
      and thus is not installed, causing the relevant tests to fail when
      executed from the install dir.
      
      Fix the issue by updating the install targets.
      
      Fixes: 3bdd9fd2 ("selftests/net: synchronize udpgro tests' tx and rx connection")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/076e8758e21ff2061cc9f81640e7858df775f0a9.1706131762.git.pabeni@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: remove dependency on ebpf tests · 98cb12eb
      Paolo Abeni authored
      Several net tests require an XDP program built under the ebpf
      directory, and error out if that program is not available.
      
      That makes running the net tests successfully hard. Let's duplicate the
      [very small] program into the net dir, re-using the existing rules to
      build it, and finally drop the bogus dependency.
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/28e7af7c031557f691dc8045ee41dd549dd5e74c.1706131762.git.pabeni@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: tcp_ao: add a config file · b6478784
      Jakub Kicinski authored
      It is still a bit unclear whether each directory should have its own
      config file, but assuming they should, let's add one for tcp_ao.
      
      The following tests still fail with this config in place:
       - rst_ipv4,
       - rst_ipv6,
       - bench-lookups_ipv6.
      The other 21 pass.
      
      Fixes: d11301f6 ("selftests/net: Add TCP-AO ICMPs accept test")
      Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
      Link: https://lore.kernel.org/r/20240124192550.1865743-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'net-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · ecb1b828
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bpf, netfilter and WiFi.
      
        Jakub is doing a lot of work to include the self-tests in our CI, as a
        result a significant amount of self-tests related fixes is flowing in
        (and will likely continue in the next few weeks).
      
        Current release - regressions:
      
         - bpf: fix a kernel crash for the riscv 64 JIT
      
         - bnxt_en: fix memory leak in bnxt_hwrm_get_rings()
      
         - revert "net: macsec: use skb_ensure_writable_head_tail to expand
           the skb"
      
        Previous releases - regressions:
      
         - core: fix removing a namespace with conflicting altnames
      
         - tc/flower: fix chain template offload memory leak
      
         - tcp:
            - make sure init the accept_queue's spinlocks once
            - fix autocork on CPUs with weak memory model
      
         - udp: fix busy polling
      
         - mlx5e:
            - fix out-of-bound read in port timestamping
            - fix peer flow lists corruption
      
         - iwlwifi: fix a memory corruption
      
        Previous releases - always broken:
      
         - netfilter:
            - nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress
              basechain
            - nft_limit: reject configurations that cause integer overflow
      
         - bpf: fix bpf_xdp_adjust_tail() with XSK zero-copy mbuf, avoiding a
           NULL pointer dereference upon shrinking
      
         - llc: make llc_ui_sendmsg() more robust against bonding changes
      
         - smc: fix illegal rmb_desc access in SMC-D connection dump
      
         - dpll: fix pin dump crash for rebound module
      
         - bnxt_en: fix possible crash after creating sw mqprio TCs
      
         - hv_netvsc: calculate correct ring size when PAGE_SIZE is not 4kB
      
        Misc:
      
         - several self-tests fixes for better integration with the netdev CI
      
         - added several missing modules descriptions"
      
      * tag 'net-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
        tsnep: Fix XDP_RING_NEED_WAKEUP for empty fill ring
        tsnep: Remove FCS for XDP data path
        net: fec: fix the unhandled context fault from smmu
        selftests: bonding: do not test arp/ns target with mode balance-alb/tlb
        fjes: fix memleaks in fjes_hw_setup
        i40e: update xdp_rxq_info::frag_size for ZC enabled Rx queue
        i40e: set xdp_rxq_info::frag_size
        xdp: reflect tail increase for MEM_TYPE_XSK_BUFF_POOL
        ice: update xdp_rxq_info::frag_size for ZC enabled Rx queue
        intel: xsk: initialize skb_frag_t::bv_offset in ZC drivers
        ice: remove redundant xdp_rxq_info registration
        i40e: handle multi-buffer packets that are shrunk by xdp prog
        ice: work on pre-XDP prog frag count
        xsk: fix usage of multi-buffer BPF helpers for ZC XDP
        xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags
        xsk: recycle buffer in case Rx queue was full
        net: fill in MODULE_DESCRIPTION()s for rvu_mbox
        net: fill in MODULE_DESCRIPTION()s for litex
        net: fill in MODULE_DESCRIPTION()s for fsl_pq_mdio
        net: fill in MODULE_DESCRIPTION()s for fec
        ...
    • Merge tag 'ovl-fixes-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs · bdc01020
      Linus Torvalds authored
      Pull overlayfs fix from Amir Goldstein:
       "Change the on-disk format for the new "xwhiteouts" feature introduced
        in v6.7
      
        The change reduces unneeded overhead of an extra getxattr per readdir.
        The only user of the "xwhiteout" feature is the external composefs
        tool, which has been updated to support the new on-disk format.
      
        This change is also designated for 6.7.y"
      
      * tag 'ovl-fixes-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
        ovl: mark xwhiteouts directory with overlay.opaque='x'
    • Merge tag 'vfs-6.8-rc2.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · a658e0e9
      Linus Torvalds authored
      Pull netfs fixes from Christian Brauner:
       "This contains various fixes for the netfs work merged earlier this
        cycle:
      
        afs:
         - Fix locking imbalance in afs_proc_addr_prefs_show()
         - Remove afs_dynroot_d_revalidate() which is redundant
         - Fix error handling during lookup
         - Hide sillyrenames from userspace. This fixes a race between
           silly-rename files being created/removed and userspace iterating
           over directory entries
         - Don't use unnecessary folio_*() functions
      
        cifs:
         - Don't use unnecessary folio_*() functions
      
        cachefiles:
         - erofs: Fix Null dereference when cachefiles are not doing
           ondemand-mode
         - Update mailing list
      
        netfs library:
         - Add Jeff Layton as reviewer
         - Update mailing list
         - Fix an error check in netfs_perform_write()
         - fscache: Check error before dereferencing
         - Don't use unnecessary folio_*() functions"
      
      * tag 'vfs-6.8-rc2.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        afs: Fix missing/incorrect unlocking of RCU read lock
        afs: Remove afs_dynroot_d_revalidate() as it is redundant
        afs: Fix error handling with lookup via FS.InlineBulkStatus
        afs: Hide silly-rename files from userspace
        cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-mode
        netfs: Fix a NULL vs IS_ERR() check in netfs_perform_write()
        netfs, fscache: Prevent Oops in fscache_put_cache()
        cifs: Don't use certain unnecessary folio_*() functions
        afs: Don't use certain unnecessary folio_*() functions
        netfs: Don't use certain unnecessary folio_*() functions
        netfs: Add Jeff Layton as reviewer
        netfs, cachefiles: Change mailing list
    • Merge tag 'nfsd-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · b9fa4cbd
      Linus Torvalds authored
      Pull nfsd fixes from Chuck Lever:
      
       - Fix in-kernel RPC UDP transport
      
       - Fix NFSv4.0 RELEASE_LOCKOWNER
      
      * tag 'nfsd-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
        nfsd: fix RELEASE_LOCKOWNER
        SUNRPC: use request size to initialize bio_vec in svc_udp_sendto()
    • Merge tag 'urgent-rcu.2024.01.24a' of https://github.com/neeraju/linux · 3cb9871f
      Linus Torvalds authored
      Pull RCU fix from Neeraj Upadhyay:
       "This fixes RCU grace period stalls, which are observed when an
        outgoing CPU's quiescent state reporting results in wakeup of one of
        the grace period kthreads, to complete the grace period.
      
        If those kthreads have SCHED_FIFO policy, the wake up can indirectly
        arm the RT bandwidth timer to the local offline CPU.
      
        Earlier migration of the hrtimers from the CPU introduced in commit
        5c0930cc ("hrtimers: Push pending hrtimers away from outgoing CPU
        earlier") results in this timer getting ignored.
      
        If the RCU grace period kthreads are waiting for RT bandwidth to be
        available, they may never be actually scheduled, resulting in RCU
        stall warnings"
      
      * tag 'urgent-rcu.2024.01.24a' of https://github.com/neeraju/linux:
        rcu: Defer RCU kthreads wakeup when CPU is dying