1. 02 Aug, 2022 2 commits
    • net: usb: make USB_RTL8153_ECM non user configurable · f56530dc
      Maciej Żenczykowski authored
      This refixes:
      
          commit 7da17624
          nt: usb: USB_RTL8153_ECM should not default to y
      
          In general, device drivers should not be enabled by default.
      
      which basically broke the commit it claimed to fix, i.e.:
      
          commit 657bc1d1
          r8153_ecm: avoid to be prior to r8152 driver
      
          Avoid r8153_ecm is compiled as built-in, if r8152 driver is compiled
          as modules. Otherwise, the r8153_ecm would be used, even though the
          device is supported by r8152 driver.
      
      That commit (657bc1d1) amounted to:
      
      drivers/net/usb/Kconfig:
      
      +config USB_RTL8153_ECM
      +       tristate "RTL8153 ECM support"
      +       depends on USB_NET_CDCETHER && (USB_RTL8152 || USB_RTL8152=n)
      +       default y
      +       help
      +         This option supports ECM mode for RTL8153 ethernet adapter, when
      +         CONFIG_USB_RTL8152 is not set, or the RTL8153 device is not
      +         supported by r8152 driver.
      
      drivers/net/usb/Makefile:
      
      -obj-$(CONFIG_USB_NET_CDCETHER) += cdc_ether.o r8153_ecm.o
      +obj-$(CONFIG_USB_NET_CDCETHER) += cdc_ether.o
      +obj-$(CONFIG_USB_RTL8153_ECM)  += r8153_ecm.o
      
      As can be seen, it pulls a piece of the cdc_ether driver out into
      a separate config option so that this piece can be modular when
      cdc_ether is built-in while r8152 is modular.
      
      While in general device drivers should indeed not be enabled by default,
      this isn't a device driver per se; rather, it is support code for
      the CDCETHER (ECM) driver, and should thus be enabled whenever that driver is enabled.
      
      See also email thread at:
        https://www.spinics.net/lists/netdev/msg767649.html
      
      In:
        https://www.spinics.net/lists/netdev/msg768284.html
      
      Jakub wrote:
        And when we say "removed" we can just hide it from what's prompted
        to the user (whatever such internal options are called)? I believe
        this way we don't bring back Marek's complaint.
      
      Side note: these incorrect defaults will result in Android 13
      on 5.15 GKI kernels lacking USB_RTL8153_ECM support while having
      USB_NET_CDCETHER (luckily we also have USB_RTL8150 and USB_RTL8152,
      so it's probably only an issue for very new RTL815x hardware with
      no native 5.15 driver).
      
      Fixes: 7da17624 ("nt: usb: USB_RTL8153_ECM should not default to y")
      Cc: Geert Uytterhoeven <geert+renesas@glider.be>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hayes Wang <hayeswang@realtek.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Maciej Żenczykowski <maze@google.com>
      Link: https://lore.kernel.org/r/20220730230113.4138858-1-zenczykowski@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      f56530dc
    • octeontx2-pf: Reduce minimum mtu size to 60 · 53e99496
      Subbaraya Sundeep authored
      PTP messages such as SYNC, FOLLOW_UP and DELAY_REQ are 58 bytes long.
      With a minimum packet length of 64, NIX pads them with 6 bytes of
      zeroes on transmission. This causes the latest ptp4l application to
      emit errors, since the length in the PTP header and the length of the
      received packet differ. Padding of up to 3 bytes is fine, but with more
      than that ptp4l treats the pad bytes as a TLV. Hence reduce the minimum
      size from 64 to 60.
      Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
      Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
      Link: https://lore.kernel.org/r/20220729092457.3850-1-naveenm@marvell.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      53e99496
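The padding arithmetic in the octeontx2-pf message above can be checked with a small sketch. The helper names (`pad_bytes`, `padding_tolerated`) and the constant `PTP4L_MAX_PAD` are hypothetical and are not taken from the driver or from ptp4l; the numbers 58, 60, 64 and 3 come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Largest amount of padding ptp4l accepts, per the commit message. */
#define PTP4L_MAX_PAD 3u

/* Zero bytes appended to bring a frame up to min_len (hypothetical helper). */
static size_t pad_bytes(size_t frame_len, size_t min_len)
{
    return frame_len < min_len ? min_len - frame_len : 0;
}

/* Nonzero when the padding stays within what ptp4l tolerates. */
static int padding_tolerated(size_t frame_len, size_t min_len)
{
    return pad_bytes(frame_len, min_len) <= PTP4L_MAX_PAD;
}

/* Worked numbers from the commit message: 58-byte PTP messages. */
static void example_ptp_padding(void)
{
    assert(pad_bytes(58, 64) == 6);      /* old minimum: 6 pad bytes */
    assert(!padding_tolerated(58, 64));  /* read as a TLV by ptp4l */
    assert(pad_bytes(58, 60) == 2);      /* new minimum: 2 pad bytes */
    assert(padding_tolerated(58, 60));
}
```

Calling `example_ptp_padding()` runs the four checks; it aborts only if one of the stated figures does not hold.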
  2. 01 Aug, 2022 6 commits
  3. 30 Jul, 2022 1 commit
  4. 29 Jul, 2022 8 commits
    • iavf: Fix 'tc qdisc show' listing too many queues · 93cb804e
      Przemyslaw Patynowski authored
      Fix tc qdisc show dev <ethX> root displaying too many fq_codel qdiscs.
      tc_modify_qdisc, which is the caller of ndo_setup_tc, expects the driver
      to call netif_set_real_num_tx_queues, which prepares the qdiscs.
      Without this patch, the fq_codel qdiscs would not be adjusted to the
      number of queues on the VF.
      E.g., before the fix:
      tc qdisc show dev <ethX>
      qdisc mq 0: root
      qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      tc qdisc add dev <ethX> root mqprio num_tc 2 map 1 0 0 0 0 0 0 0 queues 1@0 1@1 hw 1 mode channel shaper bw_rlimit max_rate 5000Mbit 150Mbit
      tc qdisc show dev <ethX>
      qdisc mqprio 8003: root tc 2 map 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
                   queues:(0:0) (1:1)
                   mode:channel
                   shaper:bw_rlimit   max_rate:5Gbit 150Mbit
      qdisc fq_codel 0: parent 8003:4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      qdisc fq_codel 0: parent 8003:3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      qdisc fq_codel 0: parent 8003:2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      qdisc fq_codel 0: parent 8003:1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      
      After the fix:
      tc qdisc add dev <ethX> root mqprio num_tc 2 map 1 0 0 0 0 0 0 0 queues 1@0 1@1 hw 1 mode channel shaper bw_rlimit max_rate 5000Mbit 150Mbit
      tc qdisc show dev <ethX> # shows the expected 2 fq_codel qdiscs
      qdisc mqprio 8004: root tc 2 map 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
                   queues:(0:0) (1:1)
                   mode:channel
                   shaper:bw_rlimit   max_rate:5Gbit 150Mbit
      qdisc fq_codel 0: parent 8004:2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      qdisc fq_codel 0: parent 8004:1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      
      Fixes: d5b33d02 ("i40evf: add ndo_setup_tc callback to i40evf")
      Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Co-developed-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
      Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
      Co-developed-by: Kiran Patil <kiran.patil@intel.com>
      Signed-off-by: Kiran Patil <kiran.patil@intel.com>
      Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      93cb804e
    • iavf: Fix max_rate limiting · ec60d54c
      Przemyslaw Patynowski authored
      Fix the max_rate option in TC by checking for proper quanta boundaries:
      check the minimum value provided and whether it fits the expected 50Mbps
      quanta.
      
      Without this patch, iavf could send max_rate limiting settings
      that would be accepted by the PF even if the max_rate option is less
      than the expected 50Mbps quanta. This results in no rate limiting
      on traffic, as the rate limit is floored to 0.
      
      Example:
      tc qdisc add dev $vf root mqprio num_tc 3 map 0 2 1 queues \
      2@0 2@2 2@4 hw 1 mode channel shaper bw_rlimit \
      max_rate 50Mbps 500Mbps 500Mbps
      
      Should limit TC0 to circa 50 Mbps
      
      tc qdisc add dev $vf root mqprio num_tc 3 map 0 2 1 queues \
      2@0 2@2 2@4 hw 1 mode channel shaper bw_rlimit \
      max_rate 0Mbps 100Kbit 500Mbps
      
      Should return error
      
      Fixes: d5b33d02 ("i40evf: add ndo_setup_tc callback to i40evf")
      Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Signed-off-by: Jun Zhang <xuejun.zhang@intel.com>
      Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      ec60d54c
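The quanta check described in the max_rate message above can be sketched as follows. This is an illustrative model, not the iavf code: the function names, the use of kilobits per second as the input unit, the treatment of 0 as "no limit requested", and the rejection of rates that are not a multiple of the quanta are all assumptions. Only the 50Mbps quanta and the "floored to 0" behaviour come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* 50 Mbps quanta named in the commit message. */
#define RATE_QUANTA_MBPS 50u

/* Number of whole quanta in a rate; integer division floors the result. */
static uint64_t rate_in_quanta(uint64_t rate_kbps)
{
    return rate_kbps / 1000 / RATE_QUANTA_MBPS;
}

/* Hypothetical validator: 0 on success, -EINVAL for an unusable rate. */
static int validate_max_rate(uint64_t rate_kbps)
{
    uint64_t rate_mbps = rate_kbps / 1000;

    if (rate_kbps == 0)
        return 0;               /* assumption: 0 means "no limit requested" */
    if (rate_mbps < RATE_QUANTA_MBPS)
        return -EINVAL;         /* would be floored to 0 quanta */
    if (rate_mbps % RATE_QUANTA_MBPS != 0)
        return -EINVAL;         /* assumption: must sit on a quanta boundary */
    return 0;
}

/* Rates loosely following the examples in the commit message. */
static void example_max_rate(void)
{
    assert(rate_in_quanta(100) == 0);            /* 100 Kbit floors to 0 */
    assert(validate_max_rate(100) == -EINVAL);   /* so it is rejected */
    assert(validate_max_rate(50000) == 0);       /* exactly one quanta */
    assert(validate_max_rate(500000) == 0);      /* ten quanta */
}
```

Rejecting the request up front, rather than silently flooring it, is what turns the second tc example in the message into an error instead of an unlimited TC.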
    • Merge branch 'netdevsim-fib-route-delete-leak' · b65a1534
      David S. Miller authored
      Ido Schimmel says:
      
      ====================
      netdevsim: fib: Fix reference count leak on route deletion failure
      
      Fix a recently reported netdevsim bug found using syzkaller.
      
      Patch #1 fixes the bug.
      
      Patch #2 adds a debugfs knob to allow us to test the fix.
      
      Patch #3 adds test cases.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b65a1534
    • selftests: netdevsim: Add test cases for route deletion failure · 40823f3e
      Ido Schimmel authored
      Add IPv4 and IPv6 test cases that ensure that we are not leaking a
      reference on the nexthop device when we are unable to delete its
      associated route.
      
      Without the fix in a previous patch ("netdevsim: fib: Fix reference
      count leak on route deletion failure") both test cases get stuck,
      waiting for the reference to be released from the dummy device [1][2].
      
      [1]
      unregister_netdevice: waiting for dummy1 to become free. Usage count = 5
      leaked reference.
       fib_check_nh+0x275/0x620
       fib_create_info+0x237c/0x4d30
       fib_table_insert+0x1dd/0x1d20
       inet_rtm_newroute+0x11b/0x200
       rtnetlink_rcv_msg+0x43b/0xd20
       netlink_rcv_skb+0x15e/0x430
       netlink_unicast+0x53b/0x800
       netlink_sendmsg+0x945/0xe40
       ____sys_sendmsg+0x747/0x960
       ___sys_sendmsg+0x11d/0x190
       __sys_sendmsg+0x118/0x1e0
       do_syscall_64+0x34/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      [2]
      unregister_netdevice: waiting for dummy1 to become free. Usage count = 5
      leaked reference.
       fib6_nh_init+0xc46/0x1ca0
       ip6_route_info_create+0x1167/0x19a0
       ip6_route_add+0x27/0x150
       inet6_rtm_newroute+0x161/0x170
       rtnetlink_rcv_msg+0x43b/0xd20
       netlink_rcv_skb+0x15e/0x430
       netlink_unicast+0x53b/0x800
       netlink_sendmsg+0x945/0xe40
       ____sys_sendmsg+0x747/0x960
       ___sys_sendmsg+0x11d/0x190
       __sys_sendmsg+0x118/0x1e0
       do_syscall_64+0x34/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      40823f3e
    • netdevsim: fib: Add debugfs knob to simulate route deletion failure · 974be75f
      Ido Schimmel authored
      The previous patch ("netdevsim: fib: Fix reference count leak on route
      deletion failure") fixed a reference count leak that happens on route
      deletion failure.
      
      Such failures can only be simulated by injecting slab allocation
      failures, which cannot be surgically injected.
      
      In order to be able to specifically test this scenario, add a debugfs
      knob that allows user space to fail route deletion requests when
      enabled.
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      974be75f
    • netdevsim: fib: Fix reference count leak on route deletion failure · 180a6a3e
      Ido Schimmel authored
      As part of FIB offload simulation, netdevsim stores IPv4 and IPv6 routes
      and holds a reference on FIB info structures that in turn hold a
      reference on the associated nexthop device(s).
      
      In the unlikely case where we are unable to allocate memory to process a
      route deletion request, netdevsim will not release the reference from
      the associated FIB info structure, thereby preventing the associated
      nexthop device(s) from ever being removed [1].
      
      Fix this by scheduling a work item that will flush netdevsim's FIB table
      upon route deletion failure. This will cause netdevsim to release its
      reference from all the FIB info structures in its table.
      
      Reported by Lucas Leong of Trend Micro Zero Day Initiative.
      
      Fixes: 0ae3eb7b ("netdevsim: fib: Perform the route programming in a non-atomic context")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      180a6a3e
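The leak and its fix, as described in the netdevsim message above, can be illustrated with a toy userspace model. Every type and function here is hypothetical and much simpler than netdevsim: a `flush_pending` flag stands in for the scheduled work item, and the `alloc_ok` parameter stands in for the memory allocation that can fail while processing a deletion request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ROUTES 8

/* Stand-in for a FIB info structure: only its reference count matters here. */
struct fib_info_model {
    int refcnt;
};

/* Stand-in for the simulated FIB table, which holds one reference per route. */
struct fib_table_model {
    struct fib_info_model *routes[MAX_ROUTES];
    bool flush_pending;     /* models the scheduled flush work item */
};

/* Store a route and take a reference on its FIB info. */
static bool table_add(struct fib_table_model *t, struct fib_info_model *fi)
{
    for (size_t i = 0; i < MAX_ROUTES; i++) {
        if (!t->routes[i]) {
            fi->refcnt++;
            t->routes[i] = fi;
            return true;
        }
    }
    return false;
}

/* Drop every stored route, releasing all references held by the table. */
static void table_flush(struct fib_table_model *t)
{
    for (size_t i = 0; i < MAX_ROUTES; i++) {
        if (t->routes[i]) {
            t->routes[i]->refcnt--;
            t->routes[i] = NULL;
        }
    }
    t->flush_pending = false;
}

/* Process a deletion request; on failure, request a flush instead of leaking. */
static void table_del(struct fib_table_model *t, struct fib_info_model *fi,
                      bool alloc_ok)
{
    if (!alloc_ok) {
        t->flush_pending = true;
        return;
    }
    for (size_t i = 0; i < MAX_ROUTES; i++) {
        if (t->routes[i] == fi) {
            fi->refcnt--;
            t->routes[i] = NULL;
            return;
        }
    }
}

/* A failed deletion followed by the flush leaves no reference behind. */
static void example_route_delete_failure(void)
{
    struct fib_table_model table = { .flush_pending = false };
    struct fib_info_model fi = { .refcnt = 1 };   /* one reference held elsewhere */

    assert(table_add(&table, &fi));
    assert(fi.refcnt == 2);
    table_del(&table, &fi, false);                /* deletion fails */
    assert(fi.refcnt == 2 && table.flush_pending);
    table_flush(&table);                          /* the "work item" runs */
    assert(fi.refcnt == 1 && !table.flush_pending);
}
```

Without the `flush_pending` branch, the failed deletion would leave the count at 2 forever, which corresponds to the "waiting for dummy1 to become free" symptom quoted in the selftest message.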
    • net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set · 944fd1ae
      Mike Manning authored
      Commit 3c82a21f ("net: allow binding socket in a VRF when
      there's an unbound socket") changed the inet socket lookup to prevent
      packets in a VRF from matching an unbound socket. This is to ensure the
      necessary isolation between the default and other VRFs for routing and
      forwarding. VRF-unaware processes running in the default VRF cannot
      access another VRF and have to be run with 'ip vrf exec <vrf>'. This is
      to be expected with tcp_l3mdev_accept disabled, but can be allowed again
      when this sysctl option is enabled. So instead of directly checking dif
      and sdif in inet[6]_match, call inet_sk_bound_dev_eq(). This
      allows a match on an unbound socket for non-zero sdif, i.e. for packets in
      a VRF, if tcp_l3mdev_accept is enabled.
      
      Fixes: 3c82a21f ("net: allow binding socket in a VRF when there's an unbound socket")
      Signed-off-by: Mike Manning <mvrmanning@gmail.com>
      Link: https://lore.kernel.org/netdev/a54c149aed38fded2d3b5fdb1a6c89e36a083b74.camel@lasnet.de/
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      944fd1ae
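The matching rule described in the VRF message above can be modelled in userspace roughly as below. This is a sketch based on the behaviour the message describes, not a copy of inet_sk_bound_dev_eq(); the function name `bound_dev_eq` and its flat parameter list are illustrative assumptions. Here `dif` is the ingress interface index, `sdif` is the index of the enslaving VRF device (0 when the packet is not in a VRF), and `bound_dev_if` is 0 for an unbound socket.

```c
#include <assert.h>
#include <stdbool.h>

/* Does a socket bound to bound_dev_if accept a packet seen on dif/sdif? */
static bool bound_dev_eq(bool l3mdev_accept, int bound_dev_if, int dif, int sdif)
{
    if (!bound_dev_if)
        /* Unbound socket: matches outside a VRF, or inside one if allowed. */
        return !sdif || l3mdev_accept;

    /* Bound socket: must be bound to the ingress device or to its VRF. */
    return bound_dev_if == dif || bound_dev_if == sdif;
}

/* Interface indexes 2, 5 and 7 are arbitrary example values. */
static void example_vrf_match(void)
{
    assert(bound_dev_eq(false, 0, 2, 0));   /* unbound, packet not in a VRF */
    assert(!bound_dev_eq(false, 0, 2, 5));  /* unbound, in VRF, sysctl off */
    assert(bound_dev_eq(true, 0, 2, 5));    /* unbound, in VRF, sysctl on */
    assert(bound_dev_eq(false, 5, 2, 5));   /* bound to the VRF device */
    assert(!bound_dev_eq(true, 7, 2, 5));   /* bound to an unrelated device */
}
```

The third case is the one the commit restores: with tcp_l3mdev_accept enabled, an unbound socket is again eligible for packets that arrived in a VRF.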
    • ax25: fix incorrect dev_tracker usage · d7c4c9e0
      Eric Dumazet authored
      While investigating a separate rose issue [1], and enabling
      CONFIG_NET_DEV_REFCNT_TRACKER=y, Bernard reported an orthogonal ax25 issue [2].
      
      An ax25_dev can be used by one (or many) struct ax25_cb.
      We thus need separate dev_trackers, one per struct ax25_cb.
      
      After this patch is applied, we are able to focus on rose.
      
      [1] https://lore.kernel.org/netdev/fb7544a1-f42e-9254-18cc-c9b071f4ca70@free.fr/
      
      [2]
      [  205.798723] reference already released.
      [  205.798732] allocated in:
      [  205.798734]  ax25_bind+0x1a2/0x230 [ax25]
      [  205.798747]  __sys_bind+0xea/0x110
      [  205.798753]  __x64_sys_bind+0x18/0x20
      [  205.798758]  do_syscall_64+0x5c/0x80
      [  205.798763]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  205.798768] freed in:
      [  205.798770]  ax25_release+0x115/0x370 [ax25]
      [  205.798778]  __sock_release+0x42/0xb0
      [  205.798782]  sock_close+0x15/0x20
      [  205.798785]  __fput+0x9f/0x260
      [  205.798789]  ____fput+0xe/0x10
      [  205.798792]  task_work_run+0x64/0xa0
      [  205.798798]  exit_to_user_mode_prepare+0x18b/0x190
      [  205.798804]  syscall_exit_to_user_mode+0x26/0x40
      [  205.798808]  do_syscall_64+0x69/0x80
      [  205.798812]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  205.798827] ------------[ cut here ]------------
      [  205.798829] WARNING: CPU: 2 PID: 2605 at lib/ref_tracker.c:136 ref_tracker_free.cold+0x60/0x81
      [  205.798837] Modules linked in: rose netrom mkiss ax25 rfcomm cmac algif_hash algif_skcipher af_alg bnep snd_hda_codec_hdmi nls_iso8859_1 i915 rtw88_8821ce rtw88_8821c x86_pkg_temp_thermal rtw88_pci intel_powerclamp rtw88_core snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio coretemp snd_hda_intel kvm_intel snd_intel_dspcfg mac80211 snd_hda_codec kvm i2c_algo_bit drm_buddy drm_dp_helper btusb drm_kms_helper snd_hwdep btrtl snd_hda_core btbcm joydev crct10dif_pclmul btintel crc32_pclmul ghash_clmulni_intel mei_hdcp btmtk intel_rapl_msr aesni_intel bluetooth input_leds snd_pcm crypto_simd syscopyarea processor_thermal_device_pci_legacy sysfillrect cryptd intel_soc_dts_iosf snd_seq sysimgblt ecdh_generic fb_sys_fops rapl libarc4 processor_thermal_device intel_cstate processor_thermal_rfim cec snd_timer ecc snd_seq_device cfg80211 processor_thermal_mbox mei_me processor_thermal_rapl mei rc_core at24 snd intel_pch_thermal intel_rapl_common ttm soundcore int340x_thermal_zone video
      [  205.798948]  mac_hid acpi_pad sch_fq_codel ipmi_devintf ipmi_msghandler drm msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic usbhid hid i2c_i801 i2c_smbus r8169 xhci_pci ahci libahci realtek lpc_ich xhci_pci_renesas [last unloaded: ax25]
      [  205.798992] CPU: 2 PID: 2605 Comm: ax25ipd Not tainted 5.18.11-F6BVP #3
      [  205.798996] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CK3, BIOS 5.011 09/16/2020
      [  205.798999] RIP: 0010:ref_tracker_free.cold+0x60/0x81
      [  205.799005] Code: e8 d2 01 9b ff 83 7b 18 00 74 14 48 c7 c7 2f d7 ff 98 e8 10 6e fc ff 8b 7b 18 e8 b8 01 9b ff 4c 89 ee 4c 89 e7 e8 5d fd 07 00 <0f> 0b b8 ea ff ff ff e9 30 05 9b ff 41 0f b6 f7 48 c7 c7 a0 fa 4e
      [  205.799008] RSP: 0018:ffffaf5281073958 EFLAGS: 00010286
      [  205.799011] RAX: 0000000080000000 RBX: ffff9a0bd687ebe0 RCX: 0000000000000000
      [  205.799014] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 00000000ffffffff
      [  205.799016] RBP: ffffaf5281073a10 R08: 0000000000000003 R09: fffffffffffd5618
      [  205.799019] R10: 0000000000ffff10 R11: 000000000000000f R12: ffff9a0bc53384d0
      [  205.799022] R13: 0000000000000282 R14: 00000000ae000001 R15: 0000000000000001
      [  205.799024] FS:  0000000000000000(0000) GS:ffff9a0d0f300000(0000) knlGS:0000000000000000
      [  205.799028] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  205.799031] CR2: 00007ff6b8311554 CR3: 000000001ac10004 CR4: 00000000001706e0
      [  205.799033] Call Trace:
      [  205.799035]  <TASK>
      [  205.799038]  ? ax25_dev_device_down+0xd9/0x1b0 [ax25]
      [  205.799047]  ? ax25_device_event+0x9f/0x270 [ax25]
      [  205.799055]  ? raw_notifier_call_chain+0x49/0x60
      [  205.799060]  ? call_netdevice_notifiers_info+0x52/0xa0
      [  205.799065]  ? dev_close_many+0xc8/0x120
      [  205.799070]  ? unregister_netdevice_many+0x13d/0x890
      [  205.799073]  ? unregister_netdevice_queue+0x90/0xe0
      [  205.799076]  ? unregister_netdev+0x1d/0x30
      [  205.799080]  ? mkiss_close+0x7c/0xc0 [mkiss]
      [  205.799084]  ? tty_ldisc_close+0x2e/0x40
      [  205.799089]  ? tty_ldisc_hangup+0x137/0x210
      [  205.799092]  ? __tty_hangup.part.0+0x208/0x350
      [  205.799098]  ? tty_vhangup+0x15/0x20
      [  205.799103]  ? pty_close+0x127/0x160
      [  205.799108]  ? tty_release+0x139/0x5e0
      [  205.799112]  ? __fput+0x9f/0x260
      [  205.799118]  ax25_dev_device_down+0xd9/0x1b0 [ax25]
      [  205.799126]  ax25_device_event+0x9f/0x270 [ax25]
      [  205.799135]  raw_notifier_call_chain+0x49/0x60
      [  205.799140]  call_netdevice_notifiers_info+0x52/0xa0
      [  205.799146]  dev_close_many+0xc8/0x120
      [  205.799152]  unregister_netdevice_many+0x13d/0x890
      [  205.799157]  unregister_netdevice_queue+0x90/0xe0
      [  205.799161]  unregister_netdev+0x1d/0x30
      [  205.799165]  mkiss_close+0x7c/0xc0 [mkiss]
      [  205.799170]  tty_ldisc_close+0x2e/0x40
      [  205.799173]  tty_ldisc_hangup+0x137/0x210
      [  205.799178]  __tty_hangup.part.0+0x208/0x350
      [  205.799184]  tty_vhangup+0x15/0x20
      [  205.799188]  pty_close+0x127/0x160
      [  205.799193]  tty_release+0x139/0x5e0
      [  205.799199]  __fput+0x9f/0x260
      [  205.799203]  ____fput+0xe/0x10
      [  205.799208]  task_work_run+0x64/0xa0
      [  205.799213]  do_exit+0x33b/0xab0
      [  205.799217]  ? __handle_mm_fault+0xc4f/0x15f0
      [  205.799224]  do_group_exit+0x35/0xa0
      [  205.799228]  __x64_sys_exit_group+0x18/0x20
      [  205.799232]  do_syscall_64+0x5c/0x80
      [  205.799238]  ? handle_mm_fault+0xba/0x290
      [  205.799242]  ? debug_smp_processor_id+0x17/0x20
      [  205.799246]  ? fpregs_assert_state_consistent+0x26/0x50
      [  205.799251]  ? exit_to_user_mode_prepare+0x49/0x190
      [  205.799256]  ? irqentry_exit_to_user_mode+0x9/0x20
      [  205.799260]  ? irqentry_exit+0x33/0x40
      [  205.799263]  ? exc_page_fault+0x87/0x170
      [  205.799268]  ? asm_exc_page_fault+0x8/0x30
      [  205.799273]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  205.799277] RIP: 0033:0x7ff6b80eaca1
      [  205.799281] Code: Unable to access opcode bytes at RIP 0x7ff6b80eac77.
      [  205.799283] RSP: 002b:00007fff6dfd4738 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
      [  205.799287] RAX: ffffffffffffffda RBX: 00007ff6b8215a00 RCX: 00007ff6b80eaca1
      [  205.799290] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001
      [  205.799293] RBP: 0000000000000001 R08: ffffffffffffff80 R09: 0000000000000028
      [  205.799295] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ff6b8215a00
      [  205.799298] R13: 0000000000000000 R14: 00007ff6b821aee8 R15: 00007ff6b821af00
      [  205.799304]  </TASK>
      
      Fixes: feef318c ("ax25: fix UAF bugs of net_device caused by rebinding operation")
      Reported-by: Bernard F6BVP <f6bvp@free.fr>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Duoming Zhou <duoming@zju.edu.cn>
      Link: https://lore.kernel.org/r/20220728051821.3160118-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d7c4c9e0
  5. 28 Jul, 2022 20 commits
    • net/mlx5: Fix driver use of uninitialized timeout · 42b4f7f6
      Shay Drory authored
      Currently, the driver sets default values for all timeouts during
      function setup. The offending commit uses a timeout before
      function setup, meaning the timeout is 0 (or garbage), since no
      value has been set.
      This may result in a failure to probe the driver:
      mlx5_function_setup:1034:(pid 69850): Firmware over 4294967296 MS in pre-initializing state, aborting
      probe_one:1591:(pid 69850): mlx5_init_one failed with error code -16
      
      Hence, set default values for the timeouts during tout_init().
      
      Fixes: 37ca95e6 ("net/mlx5: Increase FW pre-init timeout for health recovery")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      42b4f7f6
    • net/mlx5: DR, Fix SMFS steering info dump format · 62d26643
      Yevgeny Kliteynik authored
      Fix several issues in SMFS steering info dump:
       - Fix outdated macro value for matcher mask in the SMFS debug dump format.
         The existing value denotes the old format of the matcher mask, as it was
         used during the early stages of development, and it results in wrong
         parsing by the steering dump parser - wrong fields are shown in the
         parsed output.
       - Add the missing destination table to the dumped action.
         The missing dest table handle breaks the ability to associate between
         the "go to table" action and the actual table in the steering info.
      
      Fixes: 9222f0b2 ("net/mlx5: DR, Add support for dumping steering info")
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
      Reviewed-by: Alex Vesker <valex@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      62d26643
    • net/mlx5: Adjust log_max_qp to be 18 at most · a6e9085d
      Maher Sanalla authored
      The cited commit limited log_max_qp to be 17 due to FW capabilities.
      Recently, it turned out that there are old FW versions that supported
      more than 17, so the cited commit caused a degradation.
      
      Thus, set the maximum log_max_qp back to 18 as it was before the
      cited commit.
      
      Fixes: 7f839965 ("net/mlx5: Update log_max_qp value to be 17 at most")
      Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
      Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      a6e9085d
    • net/mlx5e: Modify slow path rules to go to slow fdb · c0063a43
      Vlad Buslov authored
      While extending the available range of supported chains/prios, the referenced
      commit also modified slow path rules to go to the FT chain instead of the
      actual slow FDB. However, neither of the existing users of the
      MLX5_ATTR_FLAG_SLOW_PATH flag (tunnel encap entries with invalid encap and
      flows with trap action) needs to match on the FT chain. After bridge offload
      was implemented, packets of such flows can also be matched by bridge priority
      tables, which is undesirable. Restore the slow path flows implementation to
      redirect packets to slow_fdb.
      
      Fixes: 278d51f2 ("net/mlx5: E-Switch, Increase number of chains and priorities")
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Paul Blakey <paulb@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      c0063a43
    • net/mlx5e: Fix calculations related to max MPWQE size · 677e78c8
      Maxim Mikityanskiy authored
      Before commit 76c31e5f ("net/mlx5e: Use FW limitation for max MPW
      WQEBBs"), the maximum size of MPWQE in WQEBBs was hardcoded as a driver
      constant. That commit started using the firmware capability that can
      further limit the size, however, it unintentionally changed a few
      things:
      
      1. The calculation of MLX5E_MAX_KLM_PER_WQE used the size in DS, which
      was replaced by the size in WQEBBs, making the resulting value 4 times
      smaller.
      
      2. MLX5E_TX_MPW_MAX_WQEBBS used to be aligned to the cache line size
      (either 64 or 128 bytes, i.e. 1 or 2 WQEBBs), but it's no longer the
      case if the firmware capability is smaller than the driver maximum.
      
      Fix both issues by using the correct units for MLX5E_MAX_KLM_PER_WQE and
      by aligning mlx5e_get_sw_max_sq_mpw_wqebbs after taking the minimum.
      
      Besides fixing the arithmetic in the calculation of MLX5E_MAX_KLM_PER_WQE,
      also use appropriate constants: `size of BSF * num of DS per WQEBB *
      number of WQEBBs` (the calculation before the blamed commit) doesn't
      make much sense to calculate the WQE size in bytes, so just use `size of
      WQEBB * number of WQEBBs`.
      
      While at it, replace the types that hold the number of WQEBBs by u8.
      These values don't exceed 16, and using u8 allows filling holes in two
      structs.
      
      Fixes: 76c31e5f ("net/mlx5e: Use FW limitation for max MPW WQEBBs")
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      677e78c8
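The two unit issues in the MPWQE message above can be shown with small numbers. This sketch is not the mlx5e code: the function and macro names are hypothetical, the 16-byte DS size is inferred from the "4 times smaller" remark together with the 64-byte WQEBB implied by "64 or 128 bytes, i.e. 1 or 2 WQEBBs", and rounding down (rather than up) is an assumption that follows from the value being a maximum.

```c
#include <assert.h>

#define WQEBB_BYTES  64u                        /* one WQE basic block */
#define DS_BYTES     16u                        /* one data segment (assumed) */
#define DS_PER_WQEBB (WQEBB_BYTES / DS_BYTES)   /* 4 */

/* Convert a size in WQEBBs to a size in DS; mixing the two is the 4x bug. */
static unsigned int wqebbs_to_ds(unsigned int wqebbs)
{
    return wqebbs * DS_PER_WQEBB;
}

/* Take the smaller limit first, then round down to whole cache lines. */
static unsigned int max_mpw_wqebbs(unsigned int drv_max, unsigned int fw_cap,
                                   unsigned int cache_line_bytes)
{
    unsigned int align = cache_line_bytes / WQEBB_BYTES;    /* 1 or 2 */
    unsigned int n = drv_max < fw_cap ? drv_max : fw_cap;

    assert(align >= 1);
    return n - n % align;
}

/* The firmware cap of 15 is an arbitrary example value. */
static void example_mpwqe_units(void)
{
    assert(wqebbs_to_ds(16) == 64);             /* 16 WQEBBs are 64 DS, not 16 */
    assert(max_mpw_wqebbs(16, 15, 64) == 15);   /* 64-byte lines: no change */
    assert(max_mpw_wqebbs(16, 15, 128) == 14);  /* 128-byte lines: even count */
    assert(max_mpw_wqebbs(16, 32, 128) == 16);  /* driver maximum wins */
}
```

Aligning only the driver constant, before taking the minimum, would leave an odd firmware cap such as 15 unaligned on 128-byte cache lines, which is the second issue the message lists.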
    • net/mlx5e: xsk: Account for XSK RQ UMRs when calculating ICOSQ size · 52586d2f
      Maxim Mikityanskiy authored
      ICOSQ is used to post UMR WQEs for both regular RQ and XSK RQ. However,
      space in ICOSQ is reserved only for the regular RQ, which may cause
      ICOSQ overflows when using XSK (the risk is highest when activating
      channels).
      
      This commit fixes the issue by reserving space for XSK UMR WQEs as well.
      As XSK may be enabled without restarting the channel and recreating the
      ICOSQ, this space is reserved unconditionally.
      
      Fixes: db05815b ("net/mlx5e: Add XSK zero-copy support")
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      52586d2f
    • net/mlx5e: Fix the value of MLX5E_MAX_RQ_NUM_MTTS · 562696c3
      Maxim Mikityanskiy authored
      MLX5E_MAX_RQ_NUM_MTTS should be the maximum value, so that
      MLX5_MTT_OCTW(MLX5E_MAX_RQ_NUM_MTTS) fits into u16. The current value of
      1 << 17 results in MLX5_MTT_OCTW(1 << 17) = 1 << 16, which doesn't fit
      into u16. This commit replaces it with the maximum value that still
      fits u16.
      
      Fixes: 73281b78 ("net/mlx5e: Derive Striding RQ size from MTU")
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      562696c3
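The overflow described in the MLX5E_MAX_RQ_NUM_MTTS message above is easy to reproduce in isolation. The helper `mtt_octw` below is a hypothetical stand-in that simply halves its argument, which is consistent with the statement that MLX5_MTT_OCTW(1 << 17) = 1 << 16; the real macro may round differently for other inputs, and the value 131070 is just the largest input this simplified model accepts, not necessarily the constant the commit chose.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified octword count: half the number of MTTs (assumption). */
static uint32_t mtt_octw(uint32_t num_mtts)
{
    return num_mtts / 2;
}

/* Nonzero when the value can be stored in a u16 without truncation. */
static int fits_u16(uint32_t value)
{
    return value <= UINT16_MAX;
}

static void example_mtt_overflow(void)
{
    uint32_t octw = mtt_octw(1u << 17);

    assert(octw == (1u << 16));              /* 65536 */
    assert(!fits_u16(octw));                 /* one past the u16 range */
    assert((uint16_t)octw == 0);             /* truncation silently yields 0 */
    assert(fits_u16(mtt_octw(131070)));      /* 65535 still fits */
}
```

The third assertion shows why the old value was dangerous: stored in a u16, the octword count wraps to 0 instead of producing an error.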
    • net/mlx5e: TC, Fix post_act to not match on in_port metadata · 903f2194
      Maor Dickman authored
      The cited commit changed CT to use the multi table actions post act infrastructure
      instead of its own post act infrastructure. This broke decap during VF tunnel offload
      (stack devices) with CT, due to a wrong match on in_port metadata in the post act table.
      The change only broke VF tunnel offload because that path modifies the packet's in_port
      metadata to be the VF metadata, and this isn't propagated to the post act creation.
      
      Fix this by modifying the post act rules to match only on fte_id and not on in_port
      metadata, which isn't needed.
      
      Fixes: a8128326 ("net/mlx5e: Use multi table support for CT and sample actions")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      903f2194
    • net/mlx5e: Remove WARN_ON when trying to offload an unsupported TLS cipher/version · 115d9f95
      Gal Pressman authored
      The driver reports whether TX/RX TLS device offloads are supported, but
      not which ciphers/versions; these should be handled by returning
      -EOPNOTSUPP when .tls_dev_add() is called.
      
      Remove the WARN_ON kernel trace when the driver gets a request to
      offload a cipher/version that is not supported, since such requests are expected.
      
      Fixes: d2ead1f3 ("net/mlx5e: Add kTLS TX HW offload support")
      Signed-off-by: Gal Pressman <gal@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • Linus Torvalds's avatar
      Merge tag 'net-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 33ea1340
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bluetooth and netfilter, no known blockers for
        the release.
      
        Current release - regressions:
      
         - wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop(), fix
           taking the lock before it's initialized
      
         - Bluetooth: mgmt: fix double free on error path
      
        Current release - new code bugs:
      
         - eth: ice: fix tunnel checksum offload with fragmented traffic
      
        Previous releases - regressions:
      
         - tcp: md5: fix IPv4-mapped support after refactoring, don't take the
           pure v6 path
      
         - Revert "tcp: change pingpong threshold to 3", improving detection
           of interactive sessions
      
         - mld: fix netdev refcount leak in mld_{query | report}_work() due to
           a race
      
         - Bluetooth:
            - always set event mask on suspend, avoid early wake ups
            - L2CAP: fix use-after-free caused by l2cap_chan_put
      
         - bridge: do not send empty IFLA_AF_SPEC attribute
      
        Previous releases - always broken:
      
         - ping6: fix memleak in ipv6_renew_options()
      
         - sctp: prevent null-deref caused by over-eager error paths
      
         - virtio-net: fix the race between refill work and close, resulting
           in NAPI scheduled after close and a BUG()
      
         - macsec:
            - fix three netlink parsing bugs
            - avoid breaking the device state on invalid change requests
            - fix a memleak in another error path
      
        Misc:
      
         - dt-bindings: net: ethernet-controller: rework 'fixed-link' schema
      
         - two more batches of sysctl data race adornment"
      
      * tag 'net-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
        stmmac: dwmac-mediatek: fix resource leak in probe
        ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr
        net: ping6: Fix memleak in ipv6_renew_options().
        net/funeth: Fix fun_xdp_tx() and XDP packet reclaim
        sctp: leave the err path free in sctp_stream_init to sctp_stream_free
        sfc: disable softirqs for ptp TX
        ptp: ocp: Select CRC16 in the Kconfig.
        tcp: md5: fix IPv4-mapped support
        virtio-net: fix the race between refill work and close
        mptcp: Do not return EINPROGRESS when subflow creation succeeds
        Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put
        Bluetooth: Always set event mask on suspend
        Bluetooth: mgmt: Fix double free on error path
        wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()
        ice: do not setup vlan for loopback VSI
        ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS)
        ice: Fix VSIs unable to share unicast MAC
        ice: Fix tunnel checksum offload with fragmented traffic
        ice: Fix max VLANs available for VF
        netfilter: nft_queue: only allow supported familes and hooks
        ...
    • Dan Carpenter's avatar
      stmmac: dwmac-mediatek: fix resource leak in probe · 4d3d3a1b
      Dan Carpenter authored
      If mediatek_dwmac_clks_config() fails, then call stmmac_remove_config_dt()
      before returning.  Otherwise it is a resource leak.
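
The shape of such a fix is the usual unwind-on-error pattern. The sketch below is a userspace model under stated assumptions: `probe_config_dt()`, `remove_config_dt()`, `clks_config()` and `probe()` are hypothetical stand-ins for the driver functions named above, and `live_configs` exists only to make a leak observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the platform data allocated during probe. */
struct plat_data {
	int dummy;
};

/* Number of plat_data objects currently allocated (for leak checking). */
static int live_configs;

static inline struct plat_data *probe_config_dt(void)
{
	struct plat_data *plat = calloc(1, sizeof(*plat));

	if (plat)
		live_configs++;
	return plat;
}

static inline void remove_config_dt(struct plat_data *plat)
{
	free(plat);
	live_configs--;
}

/* Models the clock setup step; fails on request. */
static inline int clks_config(int should_fail)
{
	return should_fail ? -EIO : 0;
}

static inline int probe(int clk_fail, struct plat_data **out)
{
	struct plat_data *plat = probe_config_dt();
	int ret;

	if (!plat)
		return -ENOMEM;

	ret = clks_config(clk_fail);
	if (ret)
		goto err_remove_config_dt; /* without this, plat would leak */

	*out = plat;
	return 0;

err_remove_config_dt:
	remove_config_dt(plat);
	return ret;
}
```

On the failing path `live_configs` returns to zero, which is the property the fix restores.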
      
      Fixes: fa4b3ca6 ("stmmac: dwmac-mediatek: fix clock issue")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Link: https://lore.kernel.org/r/YuJ4aZyMUlG6yGGa@kili
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Ziyang Xuan's avatar
      ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr · 85f0173d
      Ziyang Xuan authored
      Changing a net device's MTU to a value smaller than IPV6_MIN_MTU, or
      unregistering the device, while a route is being matched may trigger
      a null-ptr-deref on ip6_ptr, as follows.
      
      =========================================================
      BUG: KASAN: null-ptr-deref in find_match.part.0+0x70/0x134
      Read of size 4 at addr 0000000000000308 by task ping6/263
      
      CPU: 2 PID: 263 Comm: ping6 Not tainted 5.19.0-rc7+ #14
      Call trace:
       dump_backtrace+0x1a8/0x230
       show_stack+0x20/0x70
       dump_stack_lvl+0x68/0x84
       print_report+0xc4/0x120
       kasan_report+0x84/0x120
       __asan_load4+0x94/0xd0
       find_match.part.0+0x70/0x134
       __find_rr_leaf+0x408/0x470
       fib6_table_lookup+0x264/0x540
       ip6_pol_route+0xf4/0x260
       ip6_pol_route_output+0x58/0x70
       fib6_rule_lookup+0x1a8/0x330
       ip6_route_output_flags_noref+0xd8/0x1a0
       ip6_route_output_flags+0x58/0x160
       ip6_dst_lookup_tail+0x5b4/0x85c
       ip6_dst_lookup_flow+0x98/0x120
       rawv6_sendmsg+0x49c/0xc70
       inet_sendmsg+0x68/0x94
      
      Reproducer:
      First, prepare the environment:
      $ ip netns add ns1
      $ ip netns add ns2
      $ ip link add veth1 type veth peer name veth2
      $ ip link set veth1 netns ns1
      $ ip link set veth2 netns ns2
      $ ip netns exec ns1 ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1
      $ ip netns exec ns2 ip -6 addr add 2001:0db8:0:f101::2/64 dev veth2
      $ ip netns exec ns1 ifconfig veth1 up
      $ ip netns exec ns2 ifconfig veth2 up
      $ ip netns exec ns1 ip -6 route add 2000::/64 dev veth1 metric 1
      $ ip netns exec ns2 ip -6 route add 2001::/64 dev veth2 metric 1
      
      Second, run the following two loops in two separate ssh sessions:
      $ ip netns exec ns1 sh
      $ while true; do ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1; ip -6 route add 2000::/64 dev veth1 metric 1; ping6 2000::2; done
      
      $ ip netns exec ns1 sh
      $ while true; do ip link set veth1 mtu 1000; ip link set veth1 mtu 1500; sleep 5; done
      
      This happens because addrconf_ifdown() first sets ip6_ptr to NULL, and
      ip6_ignore_linkdown() then accesses ip6_ptr directly without a NULL check.
      
      	cpu0			cpu1
      fib6_table_lookup
      __find_rr_leaf
      			addrconf_notify [ NETDEV_CHANGEMTU ]
      			addrconf_ifdown
      			RCU_INIT_POINTER(dev->ip6_ptr, NULL)
      find_match
      ip6_ignore_linkdown
      
      Fix the null-ptr-deref by adding a NULL check for ip6_ptr in
      ip6_ignore_linkdown() before it is used.
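
A minimal userspace sketch of such a check follows. The structures are hypothetical, heavily reduced models of the kernel ones. The plain pointer load stands in for the kernel's RCU dereference, and returning true when the pointer is NULL is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced models of struct inet6_dev and struct net_device. */
struct inet6_dev_model {
	int ignore_routes_with_linkdown;
};

struct net_device_model {
	struct inet6_dev_model *ip6_ptr; /* may be set to NULL concurrently */
};

static inline bool ignore_linkdown(const struct net_device_model *dev)
{
	/* Load the pointer once, then test the local copy before using it. */
	const struct inet6_dev_model *idev = dev->ip6_ptr;

	/* Assumption: with IPv6 state torn down, treat the link as ignorable. */
	if (!idev)
		return true;

	return idev->ignore_routes_with_linkdown != 0;
}
```

Reading the pointer into a local before testing it matters: testing `dev->ip6_ptr` and then dereferencing it again would reopen the window shown in the cpu0/cpu1 diagram.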
      
      Fixes: dcd1f572 ("net/ipv6: Remove fib6_idev")
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20220728013307.656257-1-william.xuanziyang@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Kuniyuki Iwashima's avatar
      net: ping6: Fix memleak in ipv6_renew_options(). · e2732600
      Kuniyuki Iwashima authored
      When we close ping6 sockets, some resources are left unfreed because
      pingv6_prot is missing sk->sk_prot->destroy().  As reported by
      syzbot [0], just three syscalls leak 96 bytes and easily cause OOM.
      
          struct ipv6_sr_hdr *hdr;
          char data[24] = {0};
          int fd;
      
          hdr = (struct ipv6_sr_hdr *)data;
          hdr->hdrlen = 2;
          hdr->type = IPV6_SRCRT_TYPE_4;
      
          fd = socket(AF_INET6, SOCK_DGRAM, NEXTHDR_ICMP);
          setsockopt(fd, IPPROTO_IPV6, IPV6_RTHDR, data, 24);
          close(fd);
      
      To fix memory leaks, let's add a destroy function.
      
      Note the socket() syscall checks if the GID is within the range of
      net.ipv4.ping_group_range.  The default value is [1, 0] so that no
      GID meets the condition (1 <= GID <= 0).  Thus, the local DoS does
      not succeed until we change the default value.  However, at least
      Ubuntu/Fedora/RHEL loosen it.
      
          $ cat /usr/lib/sysctl.d/50-default.conf
          ...
          -net.ipv4.ping_group_range = 0 2147483647
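
The effect of the two ranges can be illustrated with a simplified membership test. This is a sketch only: `gid_in_ping_range()` is a hypothetical helper that checks a single GID, and it does not model how the kernel walks a task's groups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when low <= gid <= high; an inverted range matches nothing. */
static inline bool gid_in_ping_range(uint32_t gid, uint32_t low, uint32_t high)
{
	return low <= gid && gid <= high;
}
```

With the default range [1, 0] the condition is false for every GID, while the loosened range [0, 2147483647] admits ordinary users.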
      
      Also, there could be another path reported with these options, and
      some of them require CAP_NET_RAW.
      
        setsockopt
            IPV6_ADDRFORM (inet6_sk(sk)->pktoptions)
            IPV6_RECVPATHMTU (inet6_sk(sk)->rxpmtu)
            IPV6_HOPOPTS (inet6_sk(sk)->opt)
            IPV6_RTHDRDSTOPTS (inet6_sk(sk)->opt)
            IPV6_RTHDR (inet6_sk(sk)->opt)
            IPV6_DSTOPTS (inet6_sk(sk)->opt)
            IPV6_2292PKTOPTIONS (inet6_sk(sk)->opt)
      
        getsockopt
            IPV6_FLOWLABEL_MGR (inet6_sk(sk)->ipv6_fl_list)
      
      For the record, the splat below is different from syzbot's.
      
        unreferenced object 0xffff888006270c60 (size 96):
          comm "repro2", pid 231, jiffies 4294696626 (age 13.118s)
          hex dump (first 32 bytes):
            01 00 00 00 44 00 00 00 00 00 00 00 00 00 00 00  ....D...........
            00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          backtrace:
            [<00000000f6bc7ea9>] sock_kmalloc (net/core/sock.c:2564 net/core/sock.c:2554)
            [<000000006d699550>] do_ipv6_setsockopt.constprop.0 (net/ipv6/ipv6_sockglue.c:715)
            [<00000000c3c3b1f5>] ipv6_setsockopt (net/ipv6/ipv6_sockglue.c:1024)
            [<000000007096a025>] __sys_setsockopt (net/socket.c:2254)
            [<000000003a8ff47b>] __x64_sys_setsockopt (net/socket.c:2265 net/socket.c:2262 net/socket.c:2262)
            [<000000007c409dcb>] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
            [<00000000e939c4a9>] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
      
      [0]: https://syzkaller.appspot.com/bug?extid=a8430774139ec3ab7176
      
      Fixes: 6d0bfe22 ("net: ipv6: Add IPv6 support to the ping socket.")
      Reported-by: syzbot+a8430774139ec3ab7176@syzkaller.appspotmail.com
      Reported-by: Ayushman Dutta <ayudutta@amazon.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220728012220.46918-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Linus Torvalds's avatar
      watch_queue: Fix missing locking in add_watch_to_object() · e64ab2db
      Linus Torvalds authored
      If a watch is being added to a queue, it needs to guard against
      interference from addition of a new watch, manual removal of a watch and
      removal of a watch due to some other queue being destroyed.
      
      KEYCTL_WATCH_KEY guards against this for the same {key,queue} pair by
      holding the key->sem writelocked and by holding refs on both the key and
      the queue - but that doesn't prevent interaction from other {key,queue}
      pairs.
      
      While add_watch_to_object() does take the spinlock on the event queue,
      it doesn't take the lock on the source's watch list.  The assumption was
      that the caller would prevent that (say by taking key->sem) - but that
      doesn't prevent interference from the destruction of another queue.
      
      Fix this by locking the watcher list in add_watch_to_object().
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: syzbot+03d7b43290037d1f87ca@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: keyrings@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • David Howells's avatar
      watch_queue: Fix missing rcu annotation · e0339f03
      David Howells authored
      Since __post_watch_notification() walks wlist->watchers with only the
      RCU read lock held, we need to use RCU methods to add to the list (we
      already use RCU methods to remove from the list).
      
      Fix add_watch_to_object() to use hlist_add_head_rcu() instead of
      hlist_add_head() for that list.
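
The reason the plain insert is unsafe is that a lockless reader may see the new head pointer before the node's fields are visible. The sketch below models the publish step in userspace with C11 atomics. It is not the kernel's hlist implementation, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct watch_node {
	int id;
	struct watch_node *next;
};

struct watch_list_model {
	_Atomic(struct watch_node *) first;
};

/* Writer side; the caller is assumed to hold the list's lock. */
static inline void list_add_head_publish(struct watch_list_model *list,
					 struct watch_node *node)
{
	/* Fully initialise the node before it becomes reachable. */
	node->next = atomic_load_explicit(&list->first, memory_order_relaxed);

	/* Release store: readers that see 'node' also see its fields. */
	atomic_store_explicit(&list->first, node, memory_order_release);
}

/* Reader side; walks the list without taking the writer's lock. */
static inline int count_watchers(struct watch_list_model *list)
{
	struct watch_node *pos;
	int count = 0;

	for (pos = atomic_load_explicit(&list->first, memory_order_acquire);
	     pos; pos = pos->next)
		count++;

	return count;
}
```

This model covers only insertion at the head. Safe removal additionally needs a grace period before a node is freed, which is outside the scope of the sketch.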
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Dimitris Michailidis's avatar
      net/funeth: Fix fun_xdp_tx() and XDP packet reclaim · 51a83391
      Dimitris Michailidis authored
      The current implementation of fun_xdp_tx(), used for XDP_TX, is
      incorrect in that it takes an address/length pair and later releases it
      with page_frag_free(). It is OK for XDP_TX but the same code is used by
      ndo_xdp_xmit. In that case it loses the XDP memory type and releases the
      packet incorrectly for some of the types. Assorted breakage follows.
      
      Change fun_xdp_tx() to take xdp_frame and rely on xdp_return_frame() in
      reclaim.
      
      Fixes: db37bc17 ("net/funeth: add the data path")
      Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
      Link: https://lore.kernel.org/r/20220726215923.7887-1-dmichail@fungible.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Jakub Kicinski's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · bf84719d
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-07-26
      
      This series contains updates to ice driver only.
      
      Przemyslaw corrects accounting for VF VLANs to allow the correct number
      of VLANs for untrusted VFs. He also corrects an issue with checksum
      offload on VXLAN tunnels.
      
      Ani allows two VSIs to share the same MAC address.
      
      Maciej corrects checked bits for descriptor completion of loopback
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ice: do not setup vlan for loopback VSI
        ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS)
        ice: Fix VSIs unable to share unicast MAC
        ice: Fix tunnel checksum offload with fragmented traffic
        ice: Fix max VLANs available for VF
      ====================
      
      Link: https://lore.kernel.org/r/20220726204646.2171589-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Xin Long's avatar
      sctp: leave the err path free in sctp_stream_init to sctp_stream_free · 181d8d20
      Xin Long authored
      A NULL pointer dereference was reported by Wei Chen:
      
        BUG: kernel NULL pointer dereference, address: 0000000000000000
        RIP: 0010:__list_del_entry_valid+0x26/0x80
        Call Trace:
         <TASK>
         sctp_sched_dequeue_common+0x1c/0x90
         sctp_sched_prio_dequeue+0x67/0x80
         __sctp_outq_teardown+0x299/0x380
         sctp_outq_free+0x15/0x20
         sctp_association_free+0xc3/0x440
         sctp_do_sm+0x1ca7/0x2210
         sctp_assoc_bh_rcv+0x1f6/0x340
      
      This happens when sctp_sendmsg() is called without connecting to the
      server first. In this case, a data chunk is already queued in the client's
      send queue when the INIT_ACK from the server is processed in
      sctp_process_init(), which calls sctp_stream_init() to allocate stream_in.
      If that allocation fails, all stream_out is freed in sctp_stream_init()'s
      err path. Freeing the asoc then crashes when dequeuing this data chunk,
      because stream_out is missing.
      
      Since stream_out can't be freed before all data is dequeued from the send
      queue, fix this by moving the err path freeing of stream_out/in from
      sctp_stream_init() to sctp_stream_free(), which is eventually called when
      the asoc is freed in sctp_association_free(). This fix also makes the code
      in sctp_process_init() clearer.
      
      Note that in sctp_association_init() when it fails in sctp_stream_init(),
      sctp_association_free() will not be called, and in that case it should
      go to 'stream_free' err path to free stream instead of 'fail_init'.
      
      Fixes: 5bbbbe32 ("sctp: introduce stream scheduler foundations")
      Reported-by: Wei Chen <harperchen1110@gmail.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Link: https://lore.kernel.org/r/831a3dc100c4908ff76e5bcc363be97f2778bc0b.1658787066.git.lucien.xin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Alejandro Lucero's avatar
      sfc: disable softirqs for ptp TX · 67c3b611
      Alejandro Lucero authored
      Sending a PTP packet can use the normal TX driver datapath, but invoked
      from the driver's ptp worker. The kernel's generic TX code disables
      softirqs and preemption before calling driver-specific TX code, but the
      ptp worker does not. Although current ptp driver functionality does not
      require disabling them, there are several reasons for doing so:
      
         1) The invoked code is always executed with softirqs disabled for non
            PTP packets.
         2) Better if a ptp packet transmission is not interrupted by softirq
            handling which could lead to high latencies.
         3) netdev_xmit_more used by the TX code requires preemption to be
            disabled.
      
      Note that handling the kernel preemption state based on the static
      kernel configuration is no longer possible, since the preemption level
      can now be configured dynamically at boot time using the static calls
      functionality.
      
      Fixes: f79c957a ("drivers: net: sfc: use netdev_xmit_more helper")
      Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
      Link: https://lore.kernel.org/r/20220726064504.49613-1-alejandro.lucero-palau@amd.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Jonathan Lemon's avatar
      ptp: ocp: Select CRC16 in the Kconfig. · 0c104556
      Jonathan Lemon authored
      The crc16() function is used to check the firmware validity, but
      the library was not explicitly selected.
      
      Fixes: 3c3673bd ("ptp: ocp: Add firmware header checks")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Acked-by: Vadim Fedorenko <vadfed@fb.com>
      Link: https://lore.kernel.org/r/20220726220604.1339972-1-jonathan.lemon@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  6. 27 Jul, 2022 3 commits