1. 19 Jan, 2021 9 commits
    • udp: mask TOS bits in udp_v4_early_demux() · 8d2b51b0
      Guillaume Nault authored
      udp_v4_early_demux() is the only function that calls
      ip_mc_validate_source() with a TOS that hasn't been masked with
      IPTOS_RT_MASK.
      
      This results in different behaviours for incoming multicast UDPv4
      packets, depending on if ip_mc_validate_source() is called from the
      early-demux path (udp_v4_early_demux) or from the regular input path
      (ip_route_input_noref).
      
      ECN would normally not be used with UDP multicast packets, so the
      practical consequences should be limited on that side. However,
      IPTOS_RT_MASK is also used to mask the TOS' high order bits, to align
      with the non-early-demux path behaviour.
      
      Reproducer:
      
        Setup two netns, connected with veth:
        $ ip netns add ns0
        $ ip netns add ns1
        $ ip -netns ns0 link set dev lo up
        $ ip -netns ns1 link set dev lo up
        $ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1
        $ ip -netns ns0 link set dev veth01 up
        $ ip -netns ns1 link set dev veth10 up
        $ ip -netns ns0 address add 192.0.2.10 peer 192.0.2.11/32 dev veth01
        $ ip -netns ns1 address add 192.0.2.11 peer 192.0.2.10/32 dev veth10
      
        In ns0, add route to multicast address 224.0.2.0/24 using source
        address 198.51.100.10:
        $ ip -netns ns0 address add 198.51.100.10/32 dev lo
        $ ip -netns ns0 route add 224.0.2.0/24 dev veth01 src 198.51.100.10
      
        In ns1, define route to 198.51.100.10, only for packets with TOS 4:
        $ ip -netns ns1 route add 198.51.100.10/32 tos 4 dev veth10
      
        Also activate rp_filter in ns1, so that incoming packets not matching
        the above route get dropped:
        $ ip netns exec ns1 sysctl -wq net.ipv4.conf.veth10.rp_filter=1
      
        Now try to receive packets on 224.0.2.11:
        $ ip netns exec ns1 socat UDP-RECVFROM:1111,ip-add-membership=224.0.2.11:veth10,ignoreeof -
      
        In ns0, send packet to 224.0.2.11 with TOS 4 and ECT(0) (that is,
        tos 6 for socat):
        $ echo test0 | ip netns exec ns0 socat - UDP-DATAGRAM:224.0.2.11:1111,bind=:1111,tos=6
      
        The "test0" message is properly received by socat in ns1, because
        early-demux has no cached dst to use, so source address validation
        is done by ip_route_input_mc(), which receives a TOS that has the
        ECN bits masked.
      
        Now send another packet to 224.0.2.11, still with TOS 4 and ECT(0):
        $ echo test1 | ip netns exec ns0 socat - UDP-DATAGRAM:224.0.2.11:1111,bind=:1111,tos=6
      
        The "test1" message isn't received by socat in ns1, because, now,
        early-demux has a cached dst to use and calls ip_mc_validate_source()
        immediately, without masking the ECN bits.
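
The difference between the two paths can be illustrated with a small user-space model. This is a sketch, not the kernel code: the constants mirror IPTOS_TOS_MASK and IPTOS_RT_MASK from the kernel headers, while `route_matches()` is a hypothetical stand-in for the route lookup performed during source validation.

```python
# Constants as defined in the kernel headers (uapi/linux/ip.h, net/route.h).
IPTOS_TOS_MASK = 0x1E
IPTOS_RT_MASK = IPTOS_TOS_MASK & ~3  # 0x1C: drops the two ECN bits


def route_matches(route_tos: int, lookup_tos: int) -> bool:
    """Hypothetical stand-in: a route with a TOS selector matches only that TOS."""
    return route_tos == lookup_tos


packet_tos = 6  # TOS 4 plus ECT(0), as sent by socat in the reproducer
route_tos = 4   # "ip route add ... tos 4" in ns1

# Regular input path: the TOS is masked before the lookup.
regular_path_ok = route_matches(route_tos, packet_tos & IPTOS_RT_MASK)

# Early-demux path before the fix: the raw TOS is used.
early_demux_ok = route_matches(route_tos, packet_tos)
```

In this model `regular_path_ok` is true and `early_demux_ok` is false, which corresponds to "test0" being delivered and "test1" being dropped by rp_filter.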
      
      Fixes: bc044e8d ("udp: perform source validation for mcast early demux")
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'sh_eth-fix-reboot-crash' · f7b9820d
      Jakub Kicinski authored
      Geert Uytterhoeven says:
      
      ====================
      sh_eth: Fix reboot crash
      
      This patch fixes a regression in v5.11-rc1, where rebooting while a sh_eth
      device is not opened will cause a crash.
      
      Changes compared to v1:
        - Export mdiobb_{read,write}(),
        - Call mdiobb_{read,write}() now they are exported,
        - Use mii_bus.parent to avoid bb_info.dev copy,
        - Drop RFC state.
      
      Alternatively, mdio-bitbang could provide Runtime PM-aware wrappers
      itself, and use them either manually (through a new parameter to
      alloc_mdio_bitbang(), or a new alloc_mdio_bitbang_*() function), or
      automatically (e.g. if pm_runtime_enabled() returns true).  Note that
      the latter requires a "struct device *" parameter to operate on.
      Currently there are only two drivers that call alloc_mdio_bitbang() and
      use Runtime PM: the Renesas sh_eth and ravb drivers.  This series fixes
      the former, while the latter is not affected (it keeps the device
      powered all the time between driver probe and driver unbind, and
      changing that seems to be non-trivial).
      ====================
      
      Link: https://lore.kernel.org/r/20210118150656.796584-1-geert+renesas@glider.be
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • sh_eth: Make PHY access aware of Runtime PM to fix reboot crash · 02cae02a
      Geert Uytterhoeven authored
      Wolfram reports that his R-Car H2-based Lager board can no longer be
      rebooted in v5.11-rc1, as it crashes with an imprecise external abort.
      The issue can be reproduced on other boards (e.g. Koelsch with R-Car
      M2-W) too, if CONFIG_IP_PNP is disabled, and the Ethernet interface is
      down at reboot time:
      
          Unhandled fault: imprecise external abort (0x1406) at 0x00000000
          pgd = (ptrval)
          [00000000] *pgd=422b6835, *pte=00000000, *ppte=00000000
          Internal error: : 1406 [#1] ARM
          Modules linked in:
          CPU: 0 PID: 1105 Comm: init Tainted: G        W         5.10.0-rc1-00402-ge2f016cf #1048
          Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
          PC is at sh_mdio_ctrl+0x44/0x60
          LR is at sh_mmd_ctrl+0x20/0x24
          ...
          Backtrace:
          [<c0451f30>] (sh_mdio_ctrl) from [<c0451fd4>] (sh_mmd_ctrl+0x20/0x24)
           r7:0000001f r6:00000020 r5:00000002 r4:c22a1dc4
          [<c0451fb4>] (sh_mmd_ctrl) from [<c044fc18>] (mdiobb_cmd+0x38/0xa8)
          [<c044fbe0>] (mdiobb_cmd) from [<c044feb8>] (mdiobb_read+0x58/0xdc)
           r9:c229f844 r8:c0c329dc r7:c221e000 r6:00000001 r5:c22a1dc4 r4:00000001
          [<c044fe60>] (mdiobb_read) from [<c044c854>] (__mdiobus_read+0x74/0xe0)
           r7:0000001f r6:00000001 r5:c221e000 r4:c221e000
          [<c044c7e0>] (__mdiobus_read) from [<c044c9d8>] (mdiobus_read+0x40/0x54)
           r7:0000001f r6:00000001 r5:c221e000 r4:c221e458
          [<c044c998>] (mdiobus_read) from [<c044d678>] (phy_read+0x1c/0x20)
           r7:ffffe000 r6:c221e470 r5:00000200 r4:c229f800
          [<c044d65c>] (phy_read) from [<c044d94c>] (kszphy_config_intr+0x44/0x80)
          [<c044d908>] (kszphy_config_intr) from [<c044694c>] (phy_disable_interrupts+0x44/0x50)
           r5:c229f800 r4:c229f800
          [<c0446908>] (phy_disable_interrupts) from [<c0449370>] (phy_shutdown+0x18/0x1c)
           r5:c229f800 r4:c229f804
          [<c0449358>] (phy_shutdown) from [<c040066c>] (device_shutdown+0x168/0x1f8)
          [<c0400504>] (device_shutdown) from [<c013de44>] (kernel_restart_prepare+0x3c/0x48)
           r9:c22d2000 r8:c0100264 r7:c0b0d034 r6:00000000 r5:4321fedc r4:00000000
          [<c013de08>] (kernel_restart_prepare) from [<c013dee0>] (kernel_restart+0x1c/0x60)
          [<c013dec4>] (kernel_restart) from [<c013e1d8>] (__do_sys_reboot+0x168/0x208)
           r5:4321fedc r4:01234567
          [<c013e070>] (__do_sys_reboot) from [<c013e2e8>] (sys_reboot+0x18/0x1c)
           r7:00000058 r6:00000000 r5:00000000 r4:00000000
          [<c013e2d0>] (sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x54)
      
      As of commit e2f016cf ("net: phy: add a shutdown procedure"),
      system reboot calls phy_disable_interrupts() during shutdown.  As this
      happens unconditionally, the PHY registers may be accessed while the
      device is suspended, causing undefined behavior, which may crash the
      system.
      
      Fix this by wrapping the sh_eth driver's PHY bitbang accessors in
      Runtime PM-aware wrappers that resume the device when needed.
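
The wrapper pattern can be sketched with a toy model. Everything here is hypothetical and for illustration only: `ToyDevice`, its `get()`/`put()` usage counter, and the register value stand in for the real device and Runtime PM calls, which are not reproduced.

```python
class ToyDevice:
    """Hypothetical device; a usage counter stands in for Runtime PM state."""

    def __init__(self):
        self.usage = 0
        self.registers = {1: 0x7949}  # made-up PHY register contents

    @property
    def powered(self):
        return self.usage > 0

    def get(self):
        """Stand-in for taking a Runtime PM reference (resumes the device)."""
        self.usage += 1

    def put(self):
        """Stand-in for dropping the reference (device may suspend again)."""
        self.usage -= 1


def raw_phy_read(dev, reg):
    """Unwrapped accessor: touching registers while suspended is a fault."""
    if not dev.powered:
        raise RuntimeError("register access while device is suspended")
    return dev.registers[reg]


def pm_aware_phy_read(dev, reg):
    """Wrapper: keep the device resumed for the duration of the raw access."""
    dev.get()
    try:
        return raw_phy_read(dev, reg)
    finally:
        dev.put()
```

Calling `raw_phy_read()` on a fresh `ToyDevice` raises, while `pm_aware_phy_read()` succeeds and leaves the usage counter balanced.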
      Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mdio-bitbang: Export mdiobb_{read,write}() · 8eed01b5
      Geert Uytterhoeven authored
      Export mdiobb_read() and mdiobb_write(), so Ethernet controller drivers
      can call them from their MDIO read/write wrappers.
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: core: devlink: use right genl user_ptr when handling port param get/set · 7e238de8
      Oleksandr Mazur authored
      Fix incorrect user_ptr dereferencing when handling port param get/set:
      
          idx [0] stores the 'struct devlink' pointer;
          idx [1] stores the 'struct devlink_port' pointer;
      
      Fixes: 637989b5 ("devlink: Always use user_ptr[0] for devlink and simplify post_doit")
      CC: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
      Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
      Link: https://lore.kernel.org/r/20210119085333.16833-1-vadym.kochan@plvision.eu
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: fix TCP_USER_TIMEOUT with zero window · 9d9b1ee0
      Enke Chen authored
      The TCP session does not terminate with TCP_USER_TIMEOUT when data
      remain untransmitted due to zero window.
      
      The number of unanswered zero-window probes (tcp_probes_out) is
      reset to zero with incoming acks irrespective of the window size,
      as described in tcp_probe_timer():
      
          RFC 1122 4.2.2.17 requires the sender to stay open indefinitely
          as long as the receiver continues to respond probes. We support
          this by default and reset icsk_probes_out with incoming ACKs.
      
      This counter, however, is the wrong one to be used in calculating the
      duration that the window remains closed and data remain untransmitted.
      Thanks to Jonathan Maxwell <jmaxwell37@gmail.com> for diagnosing the
      actual issue.
      
      In this patch a new timestamp is introduced for the socket in order to
      track the elapsed time for the zero-window probes that have not been
      answered with any non-zero window ack.
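
The distinction between the counter and the timestamp can be modelled as follows. This is a simplified sketch: `ProbeState`, `probe_start` and the helper functions are hypothetical names chosen for the example, and the timings are arbitrary units rather than kernel behaviour.

```python
class ProbeState:
    def __init__(self):
        self.probes_out = 0      # unanswered probes; reset by any ACK
        self.probe_start = None  # hypothetical: when the window was first seen closed


def on_probe_sent(state, now):
    state.probes_out += 1
    if state.probe_start is None:
        state.probe_start = now


def on_ack(state, window):
    state.probes_out = 0         # any ACK answers the probe
    if window > 0:
        state.probe_start = None  # only a non-zero window ends the stall


def should_abort(state, now, user_timeout):
    return (state.probe_start is not None
            and now - state.probe_start >= user_timeout)


# The peer answers every probe, but always with a zero window.
state = ProbeState()
user_timeout = 20
abort_time = None
for now in range(0, 40, 5):
    if should_abort(state, now, user_timeout):
        abort_time = now
        break
    on_probe_sent(state, now)
    on_ack(state, window=0)
```

In this run the counter never exceeds one, so a check based on it would never fire, whereas the timestamp-based check aborts once the window has stayed closed for `user_timeout`.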
      
      Fixes: 9721e709 ("tcp: simplify window probe aborting on USER_TIMEOUT")
      Reported-by: William McCall <william.mccall@gmail.com>
      Co-developed-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
      Reviewed-by: Yuchung Cheng <ycheng@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20210115223058.GA39267@localhost.localdomain
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'ipv6-fixes-for-the-multicast-routes' · b889c7c8
      Jakub Kicinski authored
      Matteo Croce says:
      
      ====================
      ipv6: fixes for the multicast routes
      
      Fix two wrong flags in the IPv6 multicast routes created
      by the autoconf code.
      ====================
      
      Link: https://lore.kernel.org/r/20210115184209.78611-1-mcroce@linux.microsoft.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ipv6: set multicast flag on the multicast route · ceed9038
      Matteo Croce authored
      The multicast route ff00::/8 is created with type RTN_UNICAST:
      
        $ ip -6 -d route
        unicast ::1 dev lo proto kernel scope global metric 256 pref medium
        unicast fe80::/64 dev eth0 proto kernel scope global metric 256 pref medium
        unicast ff00::/8 dev eth0 proto kernel scope global metric 256 pref medium
      
      Set the type to RTN_MULTICAST which is more appropriate.
      
      Fixes: e8478e80 ("net/ipv6: Save route type in rt6_info")
      Signed-off-by: Matteo Croce <mcroce@microsoft.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ipv6: create multicast route with RTPROT_KERNEL · a826b043
      Matteo Croce authored
      The ff00::/8 multicast route is created without specifying the fc_protocol
      field, so the default RTPROT_BOOT value is used:
      
        $ ip -6 -d route
        unicast ::1 dev lo proto kernel scope global metric 256 pref medium
        unicast fe80::/64 dev eth0 proto kernel scope global metric 256 pref medium
        unicast ff00::/8 dev eth0 proto boot scope global metric 256 pref medium
      
      As the documentation says, this value identifies routes installed during
      boot, but the route is created when the interface is set up.
      Change the value to RTPROT_KERNEL, which is a better fit.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Matteo Croce <mcroce@microsoft.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 18 Jan, 2021 3 commits
  3. 17 Jan, 2021 1 commit
  4. 16 Jan, 2021 4 commits
    • net_sched: avoid shift-out-of-bounds in tcindex_set_parms() · bcd0cf19
      Eric Dumazet authored
      tc_index being 16bit wide, we need to check that TCA_TCINDEX_SHIFT
      attribute is not silly.
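
A bounds check of this kind can be sketched as below. This is an illustrative model, not the kernel code: the limit of 16 is taken from the width of tc_index stated above, and `validate_shift()` and `tcindex_key()` are hypothetical helper names. Python integers do not overflow on shifts, so the explicit check stands in for the constraint that C imposes.

```python
TC_INDEX_BITS = 16  # tc_index is a 16-bit field


def validate_shift(shift: int) -> None:
    """Reject shift values that cannot be meaningful for a 16-bit key."""
    if not 0 <= shift <= TC_INDEX_BITS:
        raise ValueError(f"invalid shift {shift}")


def tcindex_key(tc_index: int, mask: int, shift: int) -> int:
    """Compute a lookup key from the masked tc_index, after validating shift."""
    validate_shift(shift)
    return ((tc_index & 0xFFFF) & mask) >> shift
```

With this check, a shift of 255 like the one in the report below is rejected when the parameters are set, rather than reaching the shift expression.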
      
      UBSAN: shift-out-of-bounds in net/sched/cls_tcindex.c:260:29
      shift exponent 255 is too large for 32-bit type 'int'
      CPU: 0 PID: 8516 Comm: syz-executor228 Not tainted 5.10.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x107/0x163 lib/dump_stack.c:120
       ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
       __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
       valid_perfect_hash net/sched/cls_tcindex.c:260 [inline]
       tcindex_set_parms.cold+0x1b/0x215 net/sched/cls_tcindex.c:425
       tcindex_change+0x232/0x340 net/sched/cls_tcindex.c:546
       tc_new_tfilter+0x13fb/0x21b0 net/sched/cls_api.c:2127
       rtnetlink_rcv_msg+0x8b6/0xb80 net/core/rtnetlink.c:5555
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
       netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
       netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
       netlink_sendmsg+0x907/0xe40 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:672
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2336
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2390
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2423
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20210114185229.1742255-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net_sched: gen_estimator: support large ewma log · dd5e0733
      Eric Dumazet authored
      A syzbot report reminded us that very big ewma_log values were supported
      in the past, even if they made little sense.
      
      tc qdisc replace dev xxx root est 1sec 131072sec ...
      
      While fixing the bug, also add boundary checks for ewma_log, in line
      with range supported by iproute2.
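
For readers unfamiliar with shift-based averaging, the role of ewma_log can be sketched as follows. This is a generic model, not the estimator in net/core/gen_estimator.c, and the bounds of 1 to 30 used in `check_ewma_log()` are an assumption made for the example.

```python
def check_ewma_log(ewma_log: int) -> None:
    """Assumed bounds for illustration: a usable shift for a 32-bit value."""
    if not 1 <= ewma_log <= 30:
        raise ValueError(f"invalid ewma_log {ewma_log}")


def ewma_update(avg: int, sample: int, ewma_log: int) -> int:
    """Move avg towards sample by a fraction of 1 / 2**ewma_log."""
    check_ewma_log(ewma_log)
    return avg + ((sample - avg) >> ewma_log)
```

A larger ewma_log gives a slower-moving average. For example, with ewma_log = 3, two updates towards a sample of 1024 starting from 0 give 128 and then 240.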
      
      UBSAN: shift-out-of-bounds in net/core/gen_estimator.c:83:38
      shift exponent -1 is negative
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x107/0x163 lib/dump_stack.c:120
       ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
       __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
       est_timer.cold+0xbb/0x12d net/core/gen_estimator.c:83
       call_timer_fn+0x1a5/0x710 kernel/time/timer.c:1417
       expire_timers kernel/time/timer.c:1462 [inline]
       __run_timers.part.0+0x692/0xa80 kernel/time/timer.c:1731
       __run_timers kernel/time/timer.c:1712 [inline]
       run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1744
       __do_softirq+0x2bc/0xa77 kernel/softirq.c:343
       asm_call_irq_on_stack+0xf/0x20
       </IRQ>
       __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
       run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
       do_softirq_own_stack+0xaa/0xd0 arch/x86/kernel/irq_64.c:77
       invoke_softirq kernel/softirq.c:226 [inline]
       __irq_exit_rcu+0x17f/0x200 kernel/softirq.c:420
       irq_exit_rcu+0x5/0x20 kernel/softirq.c:432
       sysvec_apic_timer_interrupt+0x4d/0x100 arch/x86/kernel/apic/apic.c:1096
       asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:628
      RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
      RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:79 [inline]
      RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:169 [inline]
      RIP: 0010:acpi_safe_halt drivers/acpi/processor_idle.c:111 [inline]
      RIP: 0010:acpi_idle_do_entry+0x1c9/0x250 drivers/acpi/processor_idle.c:516
      
      Fixes: 1c0d32fd ("net_sched: gen_estimator: complete rewrite of rate estimators")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20210114181929.1717985-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net_sched: reject silly cell_log in qdisc_get_rtab() · e4bedf48
      Eric Dumazet authored
      iproute2 probably never goes beyond 8 for the cell exponent,
      but stick to the max shift exponent for signed 32bit.
      
      UBSAN reported:
      UBSAN: shift-out-of-bounds in net/sched/sch_api.c:389:22
      shift exponent 130 is too large for 32-bit type 'int'
      CPU: 1 PID: 8450 Comm: syz-executor586 Not tainted 5.11.0-rc3-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x183/0x22e lib/dump_stack.c:120
       ubsan_epilogue lib/ubsan.c:148 [inline]
       __ubsan_handle_shift_out_of_bounds+0x432/0x4d0 lib/ubsan.c:395
       __detect_linklayer+0x2a9/0x330 net/sched/sch_api.c:389
       qdisc_get_rtab+0x2b5/0x410 net/sched/sch_api.c:435
       cbq_init+0x28f/0x12c0 net/sched/sch_cbq.c:1180
       qdisc_create+0x801/0x1470 net/sched/sch_api.c:1246
       tc_modify_qdisc+0x9e3/0x1fc0 net/sched/sch_api.c:1662
       rtnetlink_rcv_msg+0xb1d/0xe60 net/core/rtnetlink.c:5564
       netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2494
       netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
       netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1330
       netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg net/socket.c:672 [inline]
       ____sys_sendmsg+0x5a2/0x900 net/socket.c:2345
       ___sys_sendmsg net/socket.c:2399 [inline]
       __sys_sendmsg+0x319/0x400 net/socket.c:2432
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Cong Wang <cong.wang@bytedance.com>
      Link: https://lore.kernel.org/r/20210114160637.1660597-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · e23a8d00
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2021-01-16
      
      1) Fix a double bpf_prog_put() for BPF_PROG_{TYPE_EXT,TYPE_TRACING} types in
         link creation's error path causing a refcount underflow, from Jiri Olsa.
      
      2) Fix BTF validation errors for the case where kernel modules don't declare
         any new types and end up with an empty BTF, from Andrii Nakryiko.
      
      3) Fix BPF local storage helpers to first check their {task,inode} owners for
         being NULL before access, from KP Singh.
      
      4) Fix a memory leak in BPF setsockopt handling for the case where optlen is
         zero and thus temporary optval buffer should be freed, from Stanislav Fomichev.
      
      5) Fix a syzbot memory allocation splat in BPF_PROG_TEST_RUN infra for
         raw_tracepoint caused by too big ctx_size_in, from Song Liu.
      
      6) Fix LLVM code generation issues with verifier where PTR_TO_MEM{,_OR_NULL}
         registers were spilled to stack but not recognized, from Gilad Reti.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        MAINTAINERS: Update my email address
        selftests/bpf: Add verifier test for PTR_TO_MEM spill
        bpf: Support PTR_TO_MEM{,_OR_NULL} register spilling
        bpf: Reject too big ctx_size_in for raw_tp test run
        libbpf: Allow loading empty BTFs
        bpf: Allow empty module BTFs
        bpf: Don't leak memory in bpf getsockopt when optlen == 0
        bpf: Update local storage test to check handling of null ptrs
        bpf: Fix typo in bpf_inode_storage.c
        bpf: Local storage helpers should check nullness of owner ptr passed
        bpf: Prevent double bpf_prog_put call from bpf_tracing_prog_attach
      ====================
      
      Link: https://lore.kernel.org/r/20210116002025.15706-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  5. 15 Jan, 2021 3 commits
  6. 14 Jan, 2021 20 commits
    • Merge tag 'linux-kselftest-fixes-5.11-rc4' of... · 14662050
      Linus Torvalds authored
      Merge tag 'linux-kselftest-fixes-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kselftest fixes from Shuah Khan:
       "One single fix to skip BPF selftests by default.
      
        BPF selftests have a hard dependency on cutting edge versions of tools
        in the BPF ecosystem including LLVM.
      
        Skipping BPF by default will make it easier for users interested in
        running kselftest as a whole. Users can include BPF in the Kselftest
        build via the SKIP_TARGETS variable"
      
      * tag 'linux-kselftest-fixes-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests: Skip BPF seftests by default
    • Merge tag 'net-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · e8c13a6b
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "We have a few fixes for long standing issues, in particular Eric's fix
        to not underestimate the skb sizes, and my fix for brokenness of
        register_netdevice() error path. They may uncover other bugs so we
        will keep an eye on them. Also included are Willem's fixes for
        kmap(_atomic).
      
        Looking at the "current release" fixes, it seems we are about one rc
        behind a normal cycle. We've previously seen an uptick of "people had
        run their test suites" / "humans actually tried to use new features"
        fixes between rc2 and rc3.
      
        Summary:
      
        Current release - regressions:
      
         - fix feature enforcement to allow NETIF_F_HW_TLS_TX if IP_CSUM &&
           IPV6_CSUM
      
         - dcb: accept RTM_GETDCB messages carrying set-like DCB commands if
           user is admin for backward-compatibility
      
         - selftests/tls: fix selftests build after adding ChaCha20-Poly1305
      
        Current release - always broken:
      
         - ppp: fix refcount underflow on channel unbridge
      
         - bnxt_en: clear DEFRAG flag in firmware message when retry flashing
      
         - smc: fix out of bound access in the new netlink interface
      
        Previous releases - regressions:
      
         - fix use-after-free with UDP GRO by frags
      
         - mptcp: better msk-level shutdown
      
         - rndis_host: set proper input size for OID_GEN_PHYSICAL_MEDIUM
           request
      
         - i40e: xsk: fix potential NULL pointer dereferencing
      
        Previous releases - always broken:
      
         - skb frag: kmap_atomic fixes
      
         - avoid 32 x truesize under-estimation for tiny skbs
      
         - fix issues around register_netdevice() failures
      
         - udp: prevent reuseport_select_sock from reading uninitialized socks
      
         - dsa: unbind all switches from tree when DSA master unbinds
      
         - dsa: clear devlink port type before unregistering slave netdevs
      
         - can: isotp: isotp_getname(): fix kernel information leak
      
         - mlxsw: core: Thermal control fixes
      
         - ipv6: validate GSO SKB against MTU before finish IPv6 processing
      
         - stmmac: use __napi_schedule() for PREEMPT_RT
      
         - net: mvpp2: remove Pause and Asym_Pause support
      
        Misc:
      
         - remove from MAINTAINERS folks who had been inactive for >5yrs"
      
      * tag 'net-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (58 commits)
        mptcp: fix locking in mptcp_disconnect()
        net: Allow NETIF_F_HW_TLS_TX if IP_CSUM && IPV6_CSUM
        MAINTAINERS: dccp: move Gerrit Renker to CREDITS
        MAINTAINERS: ipvs: move Wensong Zhang to CREDITS
        MAINTAINERS: tls: move Aviad to CREDITS
        MAINTAINERS: ena: remove Zorik Machulsky from reviewers
        MAINTAINERS: vrf: move Shrijeet to CREDITS
        MAINTAINERS: net: move Alexey Kuznetsov to CREDITS
        MAINTAINERS: altx: move Jay Cliburn to CREDITS
        net: avoid 32 x truesize under-estimation for tiny skbs
        nt: usb: USB_RTL8153_ECM should not default to y
        net: stmmac: fix taprio configuration when base_time is in the past
        net: stmmac: fix taprio schedule configuration
        net: tip: fix a couple kernel-doc markups
        net: sit: unregister_netdevice on newlink's error path
        net: stmmac: Fixed mtu channged by cache aligned
        cxgb4/chtls: Fix tid stuck due to wrong update of qid
        i40e: fix potential NULL pointer dereferencing
        net: stmmac: use __napi_schedule() for PREEMPT_RT
        can: mcp251xfd: mcp251xfd_handle_rxif_one(): fix wrong NULL pointer check
        ...
    • mac80211: check if atf has been disabled in __ieee80211_schedule_txq · c13cf5c1
      Lorenzo Bianconi authored
      Check if atf has been disabled in __ieee80211_schedule_txq() so that a
      given sta is not always put at the beginning of the active_txqs list
      and never moved to the end, since the deficit is not decremented in
      ieee80211_sta_register_airtime().
      
      Fixes: b4809e94 ("mac80211: Add airtime accounting and scheduling to TXQs")
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
      Link: https://lore.kernel.org/r/93889406c50f1416214c079ca0b8c9faecc5143e.1608975195.git.lorenzo@kernel.org
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: do not drop tx nulldata packets on encrypted links · 2463ec86
      Felix Fietkau authored
      ieee80211_tx_h_select_key drops any non-mgmt packets without a key when
      encryption is used. This is wrong for nulldata packets that can't be
      encrypted and are sent out for probing clients and indicating 4-address
      mode.
      Reported-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
      Fixes: a0761a30 ("mac80211: drop data frames without key on encrypted links")
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Link: https://lore.kernel.org/r/20201218191525.1168-1-nbd@nbd.name
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: fix encryption key selection for 802.3 xmit · b101dd2d
      Felix Fietkau authored
      When using WEP, the default unicast key needs to be selected, instead of
      the STA PTK.
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Link: https://lore.kernel.org/r/20201218184718.93650-4-nbd@nbd.name
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: fix fast-rx encryption check · 622d3b4e
      Felix Fietkau authored
      When using WEP, the default unicast key needs to be selected, instead of
      the STA PTK.
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Link: https://lore.kernel.org/r/20201218184718.93650-5-nbd@nbd.name
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: fix incorrect strlen of .write in debugfs · 6020d534
      Shayne Chen authored
      This fixes strlen mismatch problems happening in some .write callbacks
      of debugfs.
      
      When trying to configure airtime_flags in debugfs, an error appeared:
      ash: write error: Invalid argument
      
      The error is returned from kstrtou16() since a wrong length makes it
      miss the real end of input string.  To fix this, use count as the string
      length, and set proper end of string for a char buffer.
      
      The debug print is shown - airtime_flags_write: count = 2, len = 8,
      where the actual length is 2, but "len = strlen(buf)" gets 8.
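
The length mismatch can be reproduced in a small model. This is a sketch under stated assumptions: the bytearray stands in for an unterminated stack buffer holding leftover bytes, `int()` stands in for kstrtou16(), and `parse_u16_write()` is a hypothetical helper, not the mac80211 code.

```python
def c_strlen(buf: bytearray) -> int:
    """Length up to the first NUL byte, like C strlen() (a NUL must exist)."""
    return buf.index(0)


def parse_u16_write(user_data: bytes, stale: bytes, use_count: bool) -> int:
    buf = bytearray(stale)        # buffer still holding leftover bytes
    count = len(user_data)
    if count >= len(buf):
        raise ValueError("input too long")
    buf[:count] = user_data       # models copying count bytes from user space
    if use_count:
        buf[count] = 0            # fixed: terminate the string at count
        length = count
    else:
        length = c_strlen(buf)    # buggy: trusts whatever follows the input
    text = bytes(buf[:length]).decode("ascii").strip()
    return int(text, 0)           # raises ValueError on trailing garbage


stale = b"ABCDEFGH\x00"           # 8 leftover bytes, then a NUL
fixed_value = parse_u16_write(b"3\n", stale, use_count=True)
```

With the input "3\n" (count = 2), the buggy variant sees a string of length 8 and fails to parse it, matching the debug print above, while the variant that uses count parses 3.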
      
      Also cleanup the other similar cases for the sake of consistency.
      Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
      Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
      Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
      Link: https://lore.kernel.org/r/20210112032028.7482-1-shayne.chen@mediatek.com
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • cfg80211: fix a kerneldoc markup · c2083e28
      Mauro Carvalho Chehab authored
      A struct has a different name in its prototype
      than in its kernel-doc markup:
      	../include/net/cfg80211.h:1766: warning: expecting prototype for struct cfg80211_sar_chan_ranges. Prototype was for struct cfg80211_sar_freq_ranges instead
      Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Link: https://lore.kernel.org/r/c7ed4bc4d9e992ead16d3d2df246f3b56dbfb1fb.1610610937.git.mchehab+huawei@kernel.org
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      c2083e28
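For reference, the kernel-doc warning quoted in the commit above goes away when the name on the first line of the comment matches the struct that follows it. The fragment below is only a sketch of that naming rule: the member names and descriptions are illustrative assumptions, and `uint32_t` is used so the fragment compiles outside the kernel tree.

```c
#include <stdint.h>

/**
 * struct cfg80211_sar_freq_ranges - SAR frequency range
 * @start_freq: start of the range (illustrative member)
 * @end_freq: end of the range (illustrative member)
 *
 * The identifier after "struct" on the first line must match the
 * definition below, otherwise kernel-doc reports a prototype mismatch.
 */
struct cfg80211_sar_freq_ranges {
	uint32_t start_freq;
	uint32_t end_freq;
};
```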
    • Paolo Abeni's avatar
      mptcp: fix locking in mptcp_disconnect() · 13a9499e
      Paolo Abeni authored
      tcp_disconnect() expects the caller to hold the sock lock,
      but mptcp_disconnect() does not acquire it. Add the missing
      lock.
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: 76e2a55d ("mptcp: better msk-level shutdown.")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Link: https://lore.kernel.org/r/f818e82b58a556feeb71dcccc8bf1c87aafc6175.1610638176.git.pabeni@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      13a9499e
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · 2bbe17ae
      Linus Torvalds authored
      Pull HID fixes from Jiri Kosina:
      
       - memory leak fix for Wacom driver (Ping Cheng)
      
       - various trivial small fixes, cleanups and device ID additions
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        HID: logitech-hidpp: Add product ID for MX Ergo in Bluetooth mode
        HID: Ignore battery for Elan touchscreen on ASUS UX550
        HID: logitech-dj: add the G602 receiver
        HID: wiimote: remove h from printk format specifier
        HID: uclogic: remove h from printk format specifier
        HID: sony: select CONFIG_CRC32
        HID: sfh: fix address space confusion
        HID: multitouch: Enable multi-input for Synaptics pointstick/touchpad device
        HID: wacom: Fix memory leakage caused by kfifo_alloc
      2bbe17ae
    • Tariq Toukan's avatar
      net: Allow NETIF_F_HW_TLS_TX if IP_CSUM && IPV6_CSUM · 25537d71
      Tariq Toukan authored
      The patch cited below blocked the TLS TX device offload unless HW_CSUM
      is set. This broke devices that use IP_CSUM && IPV6_CSUM.
      Here we fix it.
      
      Note that the single HW_TLS_TX feature flag indicates support for
      both IPv4/6, hence it should still be disabled in case only one of
      (IP_CSUM | IPV6_CSUM) is set.
      
      Fixes: ae0b04b2 ("net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled")
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reported-by: Rohit Maheshwari <rohitm@chelsio.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Link: https://lore.kernel.org/r/20210114151215.7061-1-tariqt@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      25537d71
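The rule stated in the commit above (keep HW_TLS_TX when either HW_CSUM is set or both IP_CSUM and IPV6_CSUM are set, and drop it when only one of the two is set) can be sketched as a small feature-fixup function. This is an illustration under stated assumptions rather than the kernel's code: the `F_*` bit values and the function name `fix_tls_tx_feature()` are hypothetical, and the real NETIF_F_* constants differ.

```c
#include <stdint.h>

/* Illustrative bit positions only; not the kernel's NETIF_F_* values. */
#define F_IP_CSUM	(1u << 0)
#define F_IPV6_CSUM	(1u << 1)
#define F_HW_CSUM	(1u << 2)
#define F_HW_TLS_TX	(1u << 3)

/* Clear the TLS TX offload bit unless checksum offload covers IPv4 and IPv6. */
uint32_t fix_tls_tx_feature(uint32_t features)
{
	const uint32_t both = F_IP_CSUM | F_IPV6_CSUM;
	int hw_csum = (features & F_HW_CSUM) != 0;
	int ip_csum = (features & both) == both;	/* both bits are required */

	if ((features & F_HW_TLS_TX) && !hw_csum && !ip_csum)
		features &= ~F_HW_TLS_TX;

	return features;
}
```

With these definitions, a device advertising `F_HW_TLS_TX | F_IP_CSUM | F_IPV6_CSUM` keeps the offload bit, while one advertising only `F_HW_TLS_TX | F_IP_CSUM` loses it.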
    • Jakub Kicinski's avatar
      Merge branch 'maintainers-remove-inactive-folks-from-networking' · 70db767f
      Jakub Kicinski authored
      To make maintainers' lives easier we're trying to nudge people
      towards CCing all the relevant folks on patches, in an attempt
      to improve review rate. We have a check in patchwork which validates
      the CC list against get_maintainer.pl. It's a little awkward, however,
      to force people to CC maintainers who we haven't seen on the mailing
      list for years. This series removes from maintainers folks who didn't
      provide any tag (incl. authoring a patch) in the last 5 years.
      To ensure reasonable signal to noise ratio we only considered
      MAINTAINERS entries which had more than 100 patches fall under
      them in that time period.
      
      All this is purely a process-greasing exercise, I hope nobody
      sees this series as an affront. Most folks are moved to CREDITS,
      a couple entries are simply removed.
      
      The following inactive maintainers are kept, because they indicated
      the intention to come back in the near future:
      
       - Veaceslav Falico (bonding)
       - Christian Benvenuti (Cisco drivers)
       - Felix Fietkau (mtk-eth)
       - Mirko Linder (skge/sky2)
      
      Patches in this series contain a report from a script which did
      the analysis. Big thanks to Jonathan Corbet for help and writing
      the script (although I feel like I used it differently than Jon
      may have intended ;)). The output format is thus:
      
       Subsystem $name
        Changes $reviewed / $total ($percent%)           // how many changes to the subsystem had at least one ack/review
        Last activity: $date_of_most_recent_patch
        $maintainer/reviewer1:
          Author $last_commit_authored_by_the_person $how_many_in_5yrs
          Committer $last_committed $how_many
          Tags $last_tag_like_review_signoff_etc $how_many
        $maintainer/reviewer2:
          Author $last_commit_authored_by_the_person $how_many_in_5yrs
          Committer $last_committed $how_many
          Tags $last_tag_like_review_signoff_etc $how_many
        Top reviewers: // Top 3 reviewers (who are not listed in MAINTAINERS)
          [$count_of_reviews_and_acks]: $email
        INACTIVE MAINTAINER $name   // maintainer / reviewer who has done nothing in last 5yrs
      
      v2:
       - keep Felix and Mirko
      
      Link: https://lore.kernel.org/r/20210114014912.2519931-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      70db767f
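The `Changes $reviewed / $total ($percent%)` line in the report format above can be reproduced with integer arithmetic. The sketch below is an assumption about how the percentage is derived, since the script itself is not shown here. Truncating division happens to match every figure quoted in this series, for example 38 / 166 shown as 22% and 123 / 308 shown as 39%. The function name `reviewed_percent()` is hypothetical.

```c
#include <assert.h>

/* Share of changes with at least one ack/review, truncated toward zero. */
unsigned int reviewed_percent(unsigned int reviewed, unsigned int total)
{
	assert(total > 0 && reviewed <= total);

	/* Widen before multiplying so large change counts cannot overflow. */
	return (unsigned int)(((unsigned long long)reviewed * 100) / total);
}
```

Rounding to nearest would give 23% and 40% for those two subsystems, which is why truncation is assumed here.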
    • Jakub Kicinski's avatar
      MAINTAINERS: dccp: move Gerrit Renker to CREDITS · 054c4610
      Jakub Kicinski authored
      As far as I can tell we haven't heard from Gerrit for roughly
      5 years now. DCCP patches would really benefit from some review.
      Gerrit was the last maintainer, so mark this entry as orphaned.
      
      Subsystem DCCP PROTOCOL
        Changes 38 / 166 (22%)
        (No activity)
        Top reviewers:
          [6]: kstewart@linuxfoundation.org
          [6]: allison@lohutok.net
          [5]: edumazet@google.com
        INACTIVE MAINTAINER Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      054c4610
    • Jakub Kicinski's avatar
      MAINTAINERS: ipvs: move Wensong Zhang to CREDITS · 4f3786e0
      Jakub Kicinski authored
      Move Wensong Zhang to CREDITS; we haven't heard from
      him in years.
      
      Subsystem IPVS
        Changes 83 / 226 (36%)
        Last activity: 2020-11-27
        Wensong Zhang <wensong@linux-vs.org>:
        Simon Horman <horms@verge.net.au>:
          Committer c24b75e0 2019-10-24 00:00:00 33
          Tags 7980d2ea 2020-10-12 00:00:00 76
        Julian Anastasov <ja@ssi.bg>:
          Author 7980d2ea 2020-10-12 00:00:00 26
          Tags 4bc3c8dc 2020-11-27 00:00:00 78
        Top reviewers:
          [6]: horms+renesas@verge.net.au
        INACTIVE MAINTAINER Wensong Zhang <wensong@linux-vs.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4f3786e0
    • Jakub Kicinski's avatar
      MAINTAINERS: tls: move Aviad to CREDITS · 0e4ed0b6
      Jakub Kicinski authored
      Aviad wrote parts of the initial TLS implementation
      but hasn't been contributing to TLS since.
      
      Subsystem NETWORKING [TLS]
        Changes 123 / 308 (39%)
        Last activity: 2020-12-01
        Boris Pismenny <borisp@nvidia.com>:
          Tags 138559b9 2020-11-17 00:00:00 1
        Aviad Yehezkel <aviadye@nvidia.com>:
        John Fastabend <john.fastabend@gmail.com>:
          Author e91de6af 2020-06-01 00:00:00 22
          Tags e91de6af 2020-06-01 00:00:00 29
        Daniel Borkmann <daniel@iogearbox.net>:
          Author c16ee04c 2018-10-20 00:00:00 7
          Committer b8e202d1 2020-02-21 00:00:00 19
          Tags b8e202d1 2020-02-21 00:00:00 28
        Jakub Kicinski <kuba@kernel.org>:
          Author 5c39f26e 2020-11-27 00:00:00 89
          Committer d31c0800 2020-12-01 00:00:00 15
          Tags d31c0800 2020-12-01 00:00:00 117
        Top reviewers:
          [50]: dirk.vandermerwe@netronome.com
          [26]: simon.horman@netronome.com
          [14]: john.hurley@netronome.com
        INACTIVE MAINTAINER Aviad Yehezkel <aviadye@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0e4ed0b6
    • Jakub Kicinski's avatar
      MAINTAINERS: ena: remove Zorik Machulsky from reviewers · c41efbf2
      Jakub Kicinski authored
      While ENA has 3 reviewers and 2 maintainers, we mostly see review
      tags and comments from the maintainers. While we very much appreciate
      Zorik's involvement in the community, let's trim the reviewer list
      down to folks we've seen tags from.
      
      Subsystem AMAZON ETHERNET DRIVERS
        Changes 13 / 269 (4%)
        Last activity: 2020-11-24
        Netanel Belgazal <netanel@amazon.com>:
          Author 24dee0c7 2019-12-10 00:00:00 43
          Tags 0e3a3f6d 2020-07-21 00:00:00 47
        Arthur Kiyanovski <akiyano@amazon.com>:
          Author 0e3a3f6d 2020-07-21 00:00:00 79
          Tags 09323b3b 2020-11-24 00:00:00 104
        Guy Tzalik <gtzalik@amazon.com>:
          Tags 713865da 2020-09-10 00:00:00 3
        Saeed Bishara <saeedb@amazon.com>:
          Tags 470793a7 2020-02-11 00:00:00 2
        Zorik Machulsky <zorik@amazon.com>:
        Top reviewers:
          [4]: sameehj@amazon.com
          [3]: snelson@pensando.io
          [3]: shayagr@amazon.com
        INACTIVE MAINTAINER Zorik Machulsky <zorik@amazon.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c41efbf2
    • Jakub Kicinski's avatar
      MAINTAINERS: vrf: move Shrijeet to CREDITS · 5e62d124
      Jakub Kicinski authored
      Shrijeet has moved on from VRF-related work.
      
      Subsystem VRF
        Changes 30 / 120 (25%)
        Last activity: 2020-12-09
        David Ahern <dsahern@kernel.org>:
          Author 1b6687e3 2020-07-23 00:00:00 1
          Tags 9125abe7 2020-12-09 00:00:00 4
        Shrijeet Mukherjee <shrijeet@gmail.com>:
        Top reviewers:
          [13]: dsahern@gmail.com
          [4]: dsa@cumulusnetworks.com
        INACTIVE MAINTAINER Shrijeet Mukherjee <shrijeet@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      5e62d124
    • Jakub Kicinski's avatar
      MAINTAINERS: net: move Alexey Kuznetsov to CREDITS · 09cd3f46
      Jakub Kicinski authored
      Move Alexey to CREDITS.
      
      I am probably not doing him justice with
      the description line.
      
      Subsystem NETWORKING [IPv4/IPv6]
        Changes 1535 / 5111 (30%)
        Last activity: 2020-12-10
        "David S. Miller" <davem@davemloft.net>:
          Author b7e4ba9a 2020-12-09 00:00:00 407
          Committer e0fecb28 2020-12-10 00:00:00 3992
          Tags e0fecb28 2020-12-10 00:00:00 3978
        Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>:
        Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>:
          Tags d5d8760b 2016-06-16 00:00:00 8
        Top reviewers:
          [225]: edumazet@google.com
          [222]: dsahern@gmail.com
          [176]: ncardwell@google.com
        INACTIVE MAINTAINER Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      09cd3f46
    • Jakub Kicinski's avatar
      MAINTAINERS: altx: move Jay Cliburn to CREDITS · 93089de9
      Jakub Kicinski authored
      Jay was not active in recent years and does not have plans
      to return to work on ATLX drivers.
      
      Subsystem ATLX ETHERNET DRIVERS
        Changes 20 / 116 (17%)
        Last activity: 2020-02-24
        Jay Cliburn <jcliburn@gmail.com>:
        Chris Snook <chris.snook@gmail.com>:
          Tags ea973742 2020-02-24 00:00:00 1
        Top reviewers:
          [4]: andrew@lunn.ch
          [2]: kuba@kernel.org
          [2]: o.rempel@pengutronix.de
        INACTIVE MAINTAINER Jay Cliburn <jcliburn@gmail.com>
      Acked-by: Chris Snook <chris.snook@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      93089de9
    • Eric Dumazet's avatar
      net: avoid 32 x truesize under-estimation for tiny skbs · 3226b158
      Eric Dumazet authored
      Both virtio net and napi_get_frags() allocate skbs
      with a very small skb->head.
      
      While using page fragments instead of a kmalloc backed skb->head might give
      a small performance improvement in some cases, there is a huge risk of
      underestimating memory usage.
      
      For both GOOD_COPY_LEN and GRO_MAX_HEAD, we can fit at least 32 allocations
      per page (order-3 page in x86), or even 64 on PowerPC.
      
      We have been tracking OOM issues on GKE hosts hitting tcp_mem limits
      but consuming far more memory for TCP buffers than instructed in tcp_mem[2].
      
      Even if we force napi_alloc_skb() to only use order-0 pages, the issue
      would still be there on arches with PAGE_SIZE >= 32768.
      
      This patch makes sure that small skb heads are kmalloc backed, so that
      other objects in the slab page can be reused instead of being held as long
      as skbs are sitting in socket queues.
      
      Note that we might in the future use the sk_buff napi cache,
      instead of going through a more expensive __alloc_skb().
      
      Another idea would be to use separate page sizes depending
      on the allocated length (to never have more than 4 frags per page).
      
      I would like to thank Greg Thelen for his precious help on this matter;
      analysing crash dumps is always a time-consuming task.
      
      Fixes: fd11a83d ("net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Link: https://lore.kernel.org/r/20210113161819.1155526-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3226b158
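The "32 x" in the title of the commit above follows from simple arithmetic, which can be sketched as below. The numbers are assumptions chosen to match the message: a 32768-byte backing page (order-3 on x86) and a hypothetical 1024-byte fragment per tiny skb head, which gives 32 fragments per page. The real per-skb truesize includes more than the head fragment, so treat this as a worst-case illustration only. The names `mem_estimate`, `tiny_skb_estimate()` and `underestimate_factor()` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

struct mem_estimate {
	size_t accounted;	/* bytes charged if each skb is billed one fragment */
	size_t pinned;		/* bytes held if each skb keeps a whole page alive */
};

/* Worst case: every queued skb is the last live fragment of its page. */
struct mem_estimate tiny_skb_estimate(size_t nr_skbs, size_t page_bytes,
				      size_t frag_bytes)
{
	struct mem_estimate e;

	assert(frag_bytes > 0 && frag_bytes <= page_bytes);
	e.accounted = nr_skbs * frag_bytes;
	e.pinned = nr_skbs * page_bytes;
	return e;
}

/* Ratio between memory really pinned and memory accounted for. */
size_t underestimate_factor(struct mem_estimate e)
{
	return e.accounted ? e.pinned / e.accounted : 0;
}
```

For 1000 queued skbs with these assumed sizes, about 1 MB is accounted while up to 32 MB stays pinned. With a 65536-byte page the same fragment size gives a factor of 64, in line with the PowerPC figure in the message.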