1. 12 Mar, 2021 9 commits
    • David S. Miller's avatar
      Revert "net: bonding: fix error return code of bond_neigh_init()" · 080bfa1e
      David S. Miller authored
      This reverts commit 2055a99d.
      
      This change rejects legitimate configurations.
      
      A slave doesn't need to exist nor implement ndo_slave_setup.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      080bfa1e
    • David S. Miller's avatar
      Merge branch 'htb-fixes' · 451b2596
      David S. Miller authored
      Maxim Mikityanskiy says:
      
      ====================
      Bugfixes for HTB
      
      The HTB offload feature introduced a few bugs in HTB. One affects the
      non-offload mode, preventing attaching qdiscs to HTB classes, and the
      other affects the error flow, when the netdev doesn't support the
      offload, but it was requested. This short series fixes them.
      ====================
      Acked-by: default avatarCong Wang <cong.wang@bytedance.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      451b2596
    • Maxim Mikityanskiy's avatar
      sch_htb: Fix offload cleanup in htb_destroy on htb_init failure · fb3a3e37
      Maxim Mikityanskiy authored
      htb_init may fail to do the offload if it's not supported or if a
      runtime error happens when allocating direct qdiscs. In those cases
      TC_HTB_CREATE command is not sent to the driver, however, htb_destroy
      gets called anyway and attempts to send TC_HTB_DESTROY.
      
      It shouldn't happen, because the driver didn't receive TC_HTB_CREATE,
      and also because the driver may not support ndo_setup_tc at all, while
      q->offload is true, and htb_destroy mistakenly thinks the offload is
      supported. Trying to call ndo_setup_tc in the latter case will lead to a
      NULL pointer dereference.
      
      This commit fixes the issues with htb_destroy by deferring assignment of
      q->offload until after the TC_HTB_CREATE command. The necessary cleanup
      of the offload entities is already done in htb_init.
      
      Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com
      Fixes: d03b195b ("sch_htb: Hierarchical QoS hardware offload")
      Suggested-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarMaxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fb3a3e37
    • Maxim Mikityanskiy's avatar
      sch_htb: Fix select_queue for non-offload mode · 93bde210
      Maxim Mikityanskiy authored
      htb_select_queue assumes it's always the offload mode, and it ends up in
      calling ndo_setup_tc without any checks. It may lead to a NULL pointer
      dereference if ndo_setup_tc is not implemented, or to an error returned
      from the driver, which will prevent attaching qdiscs to HTB classes in
      the non-offload mode.
      
      This commit fixes the bug by adding the missing check to
      htb_select_queue. In the non-offload mode it will return sch->dev_queue,
      mimicking tc_modify_qdisc's behavior for the case where select_queue is
      not implemented.
      
      Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com
      Fixes: d03b195b ("sch_htb: Hierarchical QoS hardware offload")
      Signed-off-by: default avatarMaxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      93bde210
    • Florian Fainelli's avatar
      net: phy: broadcom: Add power down exit reset state delay · 7a1468ba
      Florian Fainelli authored
      Per the datasheet, when we clear the power down bit, the PHY remains in
      an internal reset state for 40us and then resume normal operation.
      Account for that delay to avoid any issues in the future if
      genphy_resume() changes.
      
      Fixes: fe26821f ("net: phy: broadcom: Wire suspend/resume for BCM54810")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7a1468ba
    • Tong Zhang's avatar
      mISDN: fix crash in fritzpci · a9f81244
      Tong Zhang authored
      setup_fritz() in avmfritz.c might fail with -EIO and in this case the
      isac.type and isac.write_reg is not initialized and remains 0(NULL).
      A subsequent call to isac_release() will dereference isac->write_reg and
      crash.
      
      [    1.737444] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [    1.737809] #PF: supervisor instruction fetch in kernel mode
      [    1.738106] #PF: error_code(0x0010) - not-present page
      [    1.738378] PGD 0 P4D 0
      [    1.738515] Oops: 0010 [#1] SMP NOPTI
      [    1.738711] CPU: 0 PID: 180 Comm: systemd-udevd Not tainted 5.12.0-rc2+ #78
      [    1.739077] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-p
      rebuilt.qemu.org 04/01/2014
      [    1.739664] RIP: 0010:0x0
      [    1.739807] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
      [    1.740200] RSP: 0018:ffffc9000027ba10 EFLAGS: 00010202
      [    1.740478] RAX: 0000000000000000 RBX: ffff888102f41840 RCX: 0000000000000027
      [    1.740853] RDX: 00000000000000ff RSI: 0000000000000020 RDI: ffff888102f41800
      [    1.741226] RBP: ffffc9000027ba20 R08: ffff88817bc18440 R09: ffffc9000027b808
      [    1.741600] R10: 0000000000000001 R11: 0000000000000001 R12: ffff888102f41840
      [    1.741976] R13: 00000000fffffffb R14: ffff888102f41800 R15: ffff8881008b0000
      [    1.742351] FS:  00007fda3a38a8c0(0000) GS:ffff88817bc00000(0000) knlGS:0000000000000000
      [    1.742774] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    1.743076] CR2: ffffffffffffffd6 CR3: 00000001021ec000 CR4: 00000000000006f0
      [    1.743452] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [    1.743828] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [    1.744206] Call Trace:
      [    1.744339]  isac_release+0xcc/0xe0 [mISDNipac]
      [    1.744582]  fritzpci_probe.cold+0x282/0x739 [avmfritz]
      [    1.744861]  local_pci_probe+0x48/0x80
      [    1.745063]  pci_device_probe+0x10f/0x1c0
      [    1.745278]  really_probe+0xfb/0x420
      [    1.745471]  driver_probe_device+0xe9/0x160
      [    1.745693]  device_driver_attach+0x5d/0x70
      [    1.745917]  __driver_attach+0x8f/0x150
      [    1.746123]  ? device_driver_attach+0x70/0x70
      [    1.746354]  bus_for_each_dev+0x7e/0xc0
      [    1.746560]  driver_attach+0x1e/0x20
      [    1.746751]  bus_add_driver+0x152/0x1f0
      [    1.746957]  driver_register+0x74/0xd0
      [    1.747157]  ? 0xffffffffc00d8000
      [    1.747334]  __pci_register_driver+0x54/0x60
      [    1.747562]  AVM_init+0x36/0x1000 [avmfritz]
      [    1.747791]  do_one_initcall+0x48/0x1d0
      [    1.747997]  ? __cond_resched+0x19/0x30
      [    1.748206]  ? kmem_cache_alloc_trace+0x390/0x440
      [    1.748458]  ? do_init_module+0x28/0x250
      [    1.748669]  do_init_module+0x62/0x250
      [    1.748870]  load_module+0x23ee/0x26a0
      [    1.749073]  __do_sys_finit_module+0xc2/0x120
      [    1.749307]  ? __do_sys_finit_module+0xc2/0x120
      [    1.749549]  __x64_sys_finit_module+0x1a/0x20
      [    1.749782]  do_syscall_64+0x38/0x90
      Signed-off-by: default avatarTong Zhang <ztong0001@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a9f81244
    • Lv Yunlong's avatar
      net/qlcnic: Fix a use after free in qlcnic_83xx_get_minidump_template · db74623a
      Lv Yunlong authored
      In qlcnic_83xx_get_minidump_template, fw_dump->tmpl_hdr was freed by
      vfree(). But unfortunately, it is used when extended is true.
      
      Fixes: 7061b2bd ("qlogic: Deletion of unnecessary checks before two function calls")
      Signed-off-by: default avatarLv Yunlong <lyl2019@mail.ustc.edu.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      db74623a
    • David S. Miller's avatar
      Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · ce6c13e4
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2021-03-11
      
      This series contains updates to igc and e1000e drivers.
      
      Sasha adds locking to reset task to prevent race condition for igc.
      
      Muhammad fixes reporting of supported pause frame as well as advertised
      pause frame for Tx/Rx off for igc.
      
      Andre fixes timestamp retrieval from the wrong timer for igc.
      
      Vitaly adds locking to reset task to prevent race condition for e1000e.
      
      Dinghao Liu adds a missed check to return on error in
      e1000_set_d0_lplu_state_82571.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ce6c13e4
    • Tonghao Zhang's avatar
      net: sock: simplify tw proto registration · b80350f3
      Tonghao Zhang authored
      Introduce the new function tw_prot_init (inspired by
      req_prot_init) to simplify "proto_register" function.
      
      tw_prot_cleanup will take care of a partially initialized
      timewait_sock_ops.
      Signed-off-by: default avatarTonghao Zhang <xiangxia.m.yue@gmail.com>
      Reviewed-by: default avatarAlexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b80350f3
  2. 11 Mar, 2021 7 commits
  3. 10 Mar, 2021 24 commits
    • Florian Fainelli's avatar
      net: dsa: b53: VLAN filtering is global to all users · d45c36ba
      Florian Fainelli authored
      The bcm_sf2 driver uses the b53 driver as a library but does not make
      usre of the b53_setup() function, this made it fail to inherit the
      vlan_filtering_is_global attribute. Fix this by moving the assignment to
      b53_switch_alloc() which is used by bcm_sf2.
      
      Fixes: 7228b23e ("net: dsa: b53: Let DSA handle mismatched VLAN filtering settings")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d45c36ba
    • Eric Dumazet's avatar
      net: sched: validate stab values · e323d865
      Eric Dumazet authored
      iproute2 package is well behaved, but malicious user space can
      provide illegal shift values and trigger UBSAN reports.
      
      Add stab parameter to red_check_params() to validate user input.
      
      syzbot reported:
      
      UBSAN: shift-out-of-bounds in ./include/net/red.h:312:18
      shift exponent 111 is too large for 64-bit type 'long unsigned int'
      CPU: 1 PID: 14662 Comm: syz-executor.3 Not tainted 5.12.0-rc2-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x141/0x1d7 lib/dump_stack.c:120
       ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
       __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327
       red_calc_qavg_from_idle_time include/net/red.h:312 [inline]
       red_calc_qavg include/net/red.h:353 [inline]
       choke_enqueue.cold+0x18/0x3dd net/sched/sch_choke.c:221
       __dev_xmit_skb net/core/dev.c:3837 [inline]
       __dev_queue_xmit+0x1943/0x2e00 net/core/dev.c:4150
       neigh_hh_output include/net/neighbour.h:499 [inline]
       neigh_output include/net/neighbour.h:508 [inline]
       ip6_finish_output2+0x911/0x1700 net/ipv6/ip6_output.c:117
       __ip6_finish_output net/ipv6/ip6_output.c:182 [inline]
       __ip6_finish_output+0x4c1/0xe10 net/ipv6/ip6_output.c:161
       ip6_finish_output+0x35/0x200 net/ipv6/ip6_output.c:192
       NF_HOOK_COND include/linux/netfilter.h:290 [inline]
       ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:215
       dst_output include/net/dst.h:448 [inline]
       NF_HOOK include/linux/netfilter.h:301 [inline]
       NF_HOOK include/linux/netfilter.h:295 [inline]
       ip6_xmit+0x127e/0x1eb0 net/ipv6/ip6_output.c:320
       inet6_csk_xmit+0x358/0x630 net/ipv6/inet6_connection_sock.c:135
       dccp_transmit_skb+0x973/0x12c0 net/dccp/output.c:138
       dccp_send_reset+0x21b/0x2b0 net/dccp/output.c:535
       dccp_finish_passive_close net/dccp/proto.c:123 [inline]
       dccp_finish_passive_close+0xed/0x140 net/dccp/proto.c:118
       dccp_terminate_connection net/dccp/proto.c:958 [inline]
       dccp_close+0xb3c/0xe60 net/dccp/proto.c:1028
       inet_release+0x12e/0x280 net/ipv4/af_inet.c:431
       inet6_release+0x4c/0x70 net/ipv6/af_inet6.c:478
       __sock_release+0xcd/0x280 net/socket.c:599
       sock_close+0x18/0x20 net/socket.c:1258
       __fput+0x288/0x920 fs/file_table.c:280
       task_work_run+0xdd/0x1a0 kernel/task_work.c:140
       tracehook_notify_resume include/linux/tracehook.h:189 [inline]
      
      Fixes: 8afa10cb ("net_sched: red: Avoid illegal values")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e323d865
    • David S. Miller's avatar
    • Rafał Miłecki's avatar
      net: dsa: bcm_sf2: use 2 Gbps IMP port link on BCM4908 · 8373a0fe
      Rafał Miłecki authored
      BCM4908 uses 2 Gbps link between switch and the Ethernet interface.
      Without this BCM4908 devices were able to achieve only 2 x ~895 Mb/s.
      This allows handling e.g. NAT traffic with 940 Mb/s.
      Signed-off-by: default avatarRafał Miłecki <rafal@milecki.pl>
      Acked-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8373a0fe
    • Pavel Andrianov's avatar
      net: pxa168_eth: Fix a potential data race in pxa168_eth_remove · 0571a753
      Pavel Andrianov authored
      pxa168_eth_remove() firstly calls unregister_netdev(),
      then cancels a timeout work. unregister_netdev() shuts down a device
      interface and removes it from the kernel tables. If the timeout occurs
      in parallel, the timeout work (pxa168_eth_tx_timeout_task) performs stop
      and open of the device. It may lead to an inconsistent state and memory
      leaks.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: default avatarPavel Andrianov <andrianov@ispras.ru>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0571a753
    • Eric Dumazet's avatar
      macvlan: macvlan_count_rx() needs to be aware of preemption · dd4fa1da
      Eric Dumazet authored
      macvlan_count_rx() can be called from process context, it is thus
      necessary to disable preemption before calling u64_stats_update_begin()
      
      syzbot was able to spot this on 32bit arch:
      
      WARNING: CPU: 1 PID: 4632 at include/linux/seqlock.h:271 __seqprop_assert include/linux/seqlock.h:271 [inline]
      WARNING: CPU: 1 PID: 4632 at include/linux/seqlock.h:271 __seqprop_assert.constprop.0+0xf0/0x11c include/linux/seqlock.h:269
      Modules linked in:
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 1 PID: 4632 Comm: kworker/1:3 Not tainted 5.12.0-rc2-syzkaller #0
      Hardware name: ARM-Versatile Express
      Workqueue: events macvlan_process_broadcast
      Backtrace:
      [<82740468>] (dump_backtrace) from [<827406dc>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252)
       r7:00000080 r6:60000093 r5:00000000 r4:8422a3c4
      [<827406c4>] (show_stack) from [<82751b58>] (__dump_stack lib/dump_stack.c:79 [inline])
      [<827406c4>] (show_stack) from [<82751b58>] (dump_stack+0xb8/0xe8 lib/dump_stack.c:120)
      [<82751aa0>] (dump_stack) from [<82741270>] (panic+0x130/0x378 kernel/panic.c:231)
       r7:830209b4 r6:84069ea4 r5:00000000 r4:844350d0
      [<82741140>] (panic) from [<80244924>] (__warn+0xb0/0x164 kernel/panic.c:605)
       r3:8404ec8c r2:00000000 r1:00000000 r0:830209b4
       r7:0000010f
      [<80244874>] (__warn) from [<82741520>] (warn_slowpath_fmt+0x68/0xd4 kernel/panic.c:628)
       r7:81363f70 r6:0000010f r5:83018e50 r4:00000000
      [<827414bc>] (warn_slowpath_fmt) from [<81363f70>] (__seqprop_assert include/linux/seqlock.h:271 [inline])
      [<827414bc>] (warn_slowpath_fmt) from [<81363f70>] (__seqprop_assert.constprop.0+0xf0/0x11c include/linux/seqlock.h:269)
       r8:5a109000 r7:0000000f r6:a568dac0 r5:89802300 r4:00000001
      [<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (u64_stats_update_begin include/linux/u64_stats_sync.h:128 [inline])
      [<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (macvlan_count_rx include/linux/if_macvlan.h:47 [inline])
      [<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (macvlan_broadcast+0x154/0x26c drivers/net/macvlan.c:291)
       r5:89802300 r4:8a927740
      [<8136499c>] (macvlan_broadcast) from [<81365020>] (macvlan_process_broadcast+0x258/0x2d0 drivers/net/macvlan.c:317)
       r10:81364f78 r9:8a86d000 r8:8a9c7e7c r7:8413aa5c r6:00000000 r5:00000000
       r4:89802840
      [<81364dc8>] (macvlan_process_broadcast) from [<802696a4>] (process_one_work+0x2d4/0x998 kernel/workqueue.c:2275)
       r10:00000008 r9:8404ec98 r8:84367a02 r7:ddfe6400 r6:ddfe2d40 r5:898dac80
       r4:8a86d43c
      [<802693d0>] (process_one_work) from [<80269dcc>] (worker_thread+0x64/0x54c kernel/workqueue.c:2421)
       r10:00000008 r9:8a9c6000 r8:84006d00 r7:ddfe2d78 r6:898dac94 r5:ddfe2d40
       r4:898dac80
      [<80269d68>] (worker_thread) from [<80271f40>] (kthread+0x184/0x1a4 kernel/kthread.c:292)
       r10:85247e64 r9:898dac80 r8:80269d68 r7:00000000 r6:8a9c6000 r5:89a2ee40
       r4:8a97bd00
      [<80271dbc>] (kthread) from [<80200114>] (ret_from_fork+0x14/0x20 arch/arm/kernel/entry-common.S:158)
      Exception stack(0x8a9c7fb0 to 0x8a9c7ff8)
      
      Fixes: 412ca155 ("macvlan: Move broadcasts into a work queue")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dd4fa1da
    • Ido Schimmel's avatar
      drop_monitor: Perform cleanup upon probe registration failure · 9398e9c0
      Ido Schimmel authored
      In the rare case that drop_monitor fails to register its probe on the
      'napi_poll' tracepoint, it will not deactivate its hysteresis timer as
      part of the error path. If the hysteresis timer was armed by the shortly
      lived 'kfree_skb' probe and user space retries to initiate tracing, a
      warning will be emitted for trying to initialize an active object [1].
      
      Fix this by properly undoing all the operations that were done prior to
      probe registration, in both software and hardware code paths.
      
      Note that syzkaller managed to fail probe registration by injecting a
      slab allocation failure [2].
      
      [1]
      ODEBUG: init active (active state 0) object type: timer_list hint: sched_send_work+0x0/0x60 include/linux/list.h:135
      WARNING: CPU: 1 PID: 8649 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
      Modules linked in:
      CPU: 1 PID: 8649 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
      [...]
      Call Trace:
       __debug_object_init+0x524/0xd10 lib/debugobjects.c:588
       debug_timer_init kernel/time/timer.c:722 [inline]
       debug_init kernel/time/timer.c:770 [inline]
       init_timer_key+0x2d/0x340 kernel/time/timer.c:814
       net_dm_trace_on_set net/core/drop_monitor.c:1111 [inline]
       set_all_monitor_traces net/core/drop_monitor.c:1188 [inline]
       net_dm_monitor_start net/core/drop_monitor.c:1295 [inline]
       net_dm_cmd_trace+0x720/0x1220 net/core/drop_monitor.c:1339
       genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739
       genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
       genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
       genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
       netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
       netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
       netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:672
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2348
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2402
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      [2]
       FAULT_INJECTION: forcing a failure.
       name failslab, interval 1, probability 0, space 0, times 1
       CPU: 1 PID: 8645 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
       Call Trace:
        dump_stack+0xfa/0x151
        should_fail.cold+0x5/0xa
        should_failslab+0x5/0x10
        __kmalloc+0x72/0x3f0
        tracepoint_add_func+0x378/0x990
        tracepoint_probe_register+0x9c/0xe0
        net_dm_cmd_trace+0x7fc/0x1220
        genl_family_rcv_msg_doit+0x228/0x320
        genl_rcv_msg+0x328/0x580
        netlink_rcv_skb+0x153/0x420
        genl_rcv+0x24/0x40
        netlink_unicast+0x533/0x7d0
        netlink_sendmsg+0x856/0xd90
        sock_sendmsg+0xcf/0x120
        ____sys_sendmsg+0x6e8/0x810
        ___sys_sendmsg+0xf3/0x170
        __sys_sendmsg+0xe5/0x1b0
        do_syscall_64+0x2d/0x70
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: 70c69274 ("drop_monitor: Initialize timer and work item upon tracing enable")
      Fixes: 8ee2267a ("drop_monitor: Convert to using devlink tracepoint")
      Reported-by: syzbot+779559d6503f3a56213d@syzkaller.appspotmail.com
      Signed-off-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: default avatarJiri Pirko <jiri@nvidia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9398e9c0
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 547fd083
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2021-03-10
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 8 non-merge commits during the last 5 day(s) which contain
      a total of 11 files changed, 136 insertions(+), 17 deletions(-).
      
      The main changes are:
      
      1) Reject bogus use of vmlinux BTF as map/prog creation BTF, from Alexei Starovoitov.
      
      2) Fix allocation failure splat in x86 JIT for large progs. Also fix overwriting
         percpu cgroup storage from tracing programs when nested, from Yonghong Song.
      
      3) Fix rx queue retrieval in XDP for multi-queue veth, from Maciej Fijalkowski.
      
      4) Fix bpf_check_mtu() helper API before freeze to have mtu_len as custom skb/xdp
         L3 input length, from Jesper Dangaard Brouer.
      
      5) Fix inode_storage's lookup_elem return value upon having bad fd, from Tal Lossos.
      
      6) Fix bpftool and libbpf cross-build on MacOS, from Georgi Valkov.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      547fd083
    • Wei Wang's avatar
      ipv6: fix suspecious RCU usage warning · 28259bac
      Wei Wang authored
      Syzbot reported the suspecious RCU usage in nexthop_fib6_nh() when
      called from ipv6_route_seq_show(). The reason is ipv6_route_seq_start()
      calls rcu_read_lock_bh(), while nexthop_fib6_nh() calls
      rcu_dereference_rtnl().
      The fix proposed is to add a variant of nexthop_fib6_nh() to use
      rcu_dereference_bh_rtnl() for ipv6_route_seq_show().
      
      The reported trace is as follows:
      ./include/net/nexthop.h:416 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      2 locks held by syz-executor.0/17895:
           at: seq_read+0x71/0x12a0 fs/seq_file.c:169
           at: seq_file_net include/linux/seq_file_net.h:19 [inline]
           at: ipv6_route_seq_start+0xaf/0x300 net/ipv6/ip6_fib.c:2616
      
      stack backtrace:
      CPU: 1 PID: 17895 Comm: syz-executor.0 Not tainted 4.15.0-syzkaller #0
      Call Trace:
       [<ffffffff849edf9e>] __dump_stack lib/dump_stack.c:17 [inline]
       [<ffffffff849edf9e>] dump_stack+0xd8/0x147 lib/dump_stack.c:53
       [<ffffffff8480b7fa>] lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5745
       [<ffffffff8459ada6>] nexthop_fib6_nh include/net/nexthop.h:416 [inline]
       [<ffffffff8459ada6>] ipv6_route_native_seq_show net/ipv6/ip6_fib.c:2488 [inline]
       [<ffffffff8459ada6>] ipv6_route_seq_show+0x436/0x7a0 net/ipv6/ip6_fib.c:2673
       [<ffffffff81c556df>] seq_read+0xccf/0x12a0 fs/seq_file.c:276
       [<ffffffff81dbc62c>] proc_reg_read+0x10c/0x1d0 fs/proc/inode.c:231
       [<ffffffff81bc28ae>] do_loop_readv_writev fs/read_write.c:714 [inline]
       [<ffffffff81bc28ae>] do_loop_readv_writev fs/read_write.c:701 [inline]
       [<ffffffff81bc28ae>] do_iter_read+0x49e/0x660 fs/read_write.c:935
       [<ffffffff81bc81ab>] vfs_readv+0xfb/0x170 fs/read_write.c:997
       [<ffffffff81c88847>] kernel_readv fs/splice.c:361 [inline]
       [<ffffffff81c88847>] default_file_splice_read+0x487/0x9c0 fs/splice.c:416
       [<ffffffff81c86189>] do_splice_to+0x129/0x190 fs/splice.c:879
       [<ffffffff81c86f66>] splice_direct_to_actor+0x256/0x890 fs/splice.c:951
       [<ffffffff81c8777d>] do_splice_direct+0x1dd/0x2b0 fs/splice.c:1060
       [<ffffffff81bc4747>] do_sendfile+0x597/0xce0 fs/read_write.c:1459
       [<ffffffff81bca205>] SYSC_sendfile64 fs/read_write.c:1520 [inline]
       [<ffffffff81bca205>] SyS_sendfile64+0x155/0x170 fs/read_write.c:1506
       [<ffffffff81015fcf>] do_syscall_64+0x1ff/0x310 arch/x86/entry/common.c:305
       [<ffffffff84a00076>] entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      Fixes: f88d8ea6 ("ipv6: Plumb support for nexthop object in a fib6_info")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarWei Wang <weiwan@google.com>
      Cc: David Ahern <dsahern@kernel.org>
      Cc: Ido Schimmel <idosch@idosch.org>
      Cc: Petr Machata <petrm@nvidia.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      28259bac
    • David S. Miller's avatar
      Merge branch 'ip6ip6-crash' · c89489b4
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      Fix ip6ip6 crash for collect_md skbs
      
      Fix a NULL pointer deref panic I ran into for regular ip6ip6 tunnel devices
      when collect_md populated skbs were redirected to them for xmit. See patches
      for further details, thanks!
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c89489b4
    • Daniel Borkmann's avatar
      net, bpf: Fix ip6ip6 crash with collect_md populated skbs · a188bb56
      Daniel Borkmann authored
      I ran into a crash where setting up a ip6ip6 tunnel device which was /not/
      set to collect_md mode was receiving collect_md populated skbs for xmit.
      
      The BPF prog was populating the skb via bpf_skb_set_tunnel_key() which is
      assigning special metadata dst entry and then redirecting the skb to the
      device, taking ip6_tnl_start_xmit() -> ipxip6_tnl_xmit() -> ip6_tnl_xmit()
      and in the latter it performs a neigh lookup based on skb_dst(skb) where
      we trigger a NULL pointer dereference on dst->ops->neigh_lookup() since
      the md_dst_ops do not populate neigh_lookup callback with a fake handler.
      
      Transform the md_dst_ops into generic dst_blackhole_ops that can also be
      reused elsewhere when needed, and use them for the metadata dst entries as
      callback ops.
      
      Also, remove the dst_md_discard{,_out}() ops and rely on dst_discard{,_out}()
      from dst_init() which free the skb the same way modulo the splat. Given we
      will be able to recover just fine from there, avoid any potential splats
      iff this gets ever triggered in future (or worse, panic on warns when set).
      
      Fixes: f38a9eb1 ("dst: Metadata destinations")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a188bb56
    • Daniel Borkmann's avatar
      net: Consolidate common blackhole dst ops · c4c877b2
      Daniel Borkmann authored
      Move generic blackhole dst ops to the core and use them from both
      ipv4_dst_blackhole_ops and ip6_dst_blackhole_ops where possible. No
      functional change otherwise. We need these also in other locations
      and having to define them over and over again is not great.
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c4c877b2
    • Yevgeny Kliteynik's avatar
      net/mlx5: DR, Fix potential shift wrapping of 32-bit value in STEv1 getter · 84076c4c
      Yevgeny Kliteynik authored
      Fix 32-bit variable shift wrapping in dr_ste_v1_get_miss_addr.
      
      Fixes: a6098129 ("net/mlx5: DR, Add STEv1 setters and getters")
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarYevgeny Kliteynik <kliteyn@nvidia.com>
      Reviewed-by: default avatarAlex Vesker <valex@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      84076c4c
    • Shay Drory's avatar
      net/mlx5: SF: Fix error flow of SFs allocation flow · dc694f11
      Shay Drory authored
      When SF id is unavailable, code jumps to wrong label that accesses
      sw id array outside of its range.
      Hence, when SF id is not allocated, avoid accessing such array.
      
      Fixes: 8f010541 ("net/mlx5: SF, Add port add delete functionality")
      Signed-off-by: default avatarShay Drory <shayd@nvidia.com>
      Reviewed-by: default avatarParav Pandit <parav@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      dc694f11
    • Shay Drory's avatar
      net/mlx5: SF: Fix memory leak of work item · 6fa37d66
      Shay Drory authored
      Cited patch in the fixes tag missed to free the allocated work.
      Fix it by freeing the work after work execution.
      
      Fixes: f3196bb0 ("net/mlx5: Introduce vhca state event notifier")
      Signed-off-by: default avatarShay Drory <shayd@nvidia.com>
      Reviewed-by: default avatarParav Pandit <parav@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      6fa37d66
    • Parav Pandit's avatar
      net/mlx5: SF, Correct vhca context size · 6a371754
      Parav Pandit authored
      Fix vhca context size as defined by device interface specification.
      
      Fixes: f3196bb0 ("net/mlx5: Introduce vhca state event notifier")
      Signed-off-by: default avatarParav Pandit <parav@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      6a371754
    • Parav Pandit's avatar
      net/mlx5e: E-switch, Fix rate calculation division · 8b90d897
      Parav Pandit authored
      do_div() returns reminder, while cited patch wanted to use
      quotient.
      Fix it by using quotient.
      
      Fixes: 0e22bfb7 ("net/mlx5e: E-switch, Fix rate calculation for overflow")
      Signed-off-by: default avatarParav Pandit <parav@nvidia.com>
      Signed-off-by: default avatarMaor Dickman <maord@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      8b90d897
    • Maor Gottlieb's avatar
      RDMA/mlx5: Fix timestamp default mode · 8256c69b
      Maor Gottlieb authored
      1. Don't set the ts_format bit to default when it reserved - device is
         running in the old mode (free running).
      2. XRC doesn't have a CQ therefore the ts format in the QP
         context should be default / free running.
      3. Set ts_format to WQ.
      
      Fixes: 2fe8d4b8 ("RDMA/mlx5: Fail QP creation if the device can not support the CQE TS")
      Signed-off-by: default avatarMaor Gottlieb <maorg@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      8256c69b
    • Maor Gottlieb's avatar
      net/mlx5: Set QP timestamp mode to default · 4806f1e2
      Maor Gottlieb authored
      QPs which don't care from timestamp mode, should set the ts_format
      to default, otherwise the QP creation could be failed if the timestamp
      mode is not supported.
      
      Fixes: 2fe8d4b8 ("RDMA/mlx5: Fail QP creation if the device can not support the CQE TS")
      Signed-off-by: default avatarMaor Gottlieb <maorg@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      4806f1e2
    • Roi Dayan's avatar
      net/mlx5e: Fix error flow in change profile · 469549e4
      Roi Dayan authored
      Move priv memset from init to cleanup to avoid double priv cleanup
      that can happen on profile change if also roolback fails.
      Add missing cleanup flow in mlx5e_netdev_attach_profile().
      
      Fixes: c4d7eb57 ("net/mxl5e: Add change profile method")
      Signed-off-by: default avatarRoi Dayan <roid@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      469549e4
    • Maor Dickman's avatar
      net/mlx5: Disable VF tunnel TX offload if ignore_flow_level isn't supported · f574531a
      Maor Dickman authored
      VF tunnel TX traffic offload is adding flow which forward to flow
      tables with lower level, which isn't support on all FW versions
      and may cause firmware to fail with syndrome.
      
      Fixed by enabling VF tunnel TX offload only if flow table capability
      ignore_flow_level is enabled.
      
      Fixes: 10742efc ("net/mlx5e: VF tunnel TX traffic offloading")
      Signed-off-by: default avatarMaor Dickman <maord@nvidia.com>
      Reviewed-by: default avatarVlad Buslov <vladbu@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      f574531a
    • Roi Dayan's avatar
      net/mlx5e: Check correct ip_version in decapsulation route resolution · 1e74152e
      Roi Dayan authored
      flow_attr->ip_version has the matching that should be done inner/outer.
      When working with chains, decapsulation is done on chain0 and next chain
      match on outer header which is the original inner which could be ipv4.
      So in tunnel route resolution we cannot use that to know which ip version
      we are at so save tun_ip_version when parsing the tunnel match and use
      that.
      
      Fixes: a508728a ("net/mlx5e: VF tunnel RX traffic offloading")
      Signed-off-by: default avatarRoi Dayan <roid@nvidia.com>
      Reviewed-by: default avatarDmytro Linkin <dlinkin@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      1e74152e
    • Aya Levin's avatar
      net/mlx5: Fix turn-off PPS command · 55affa97
      Aya Levin authored
      Fix a bug of uninitialized pin index when trying to turn off PPS out.
      
      Fixes: de19cd6c ("net/mlx5: Move some PPS logic into helper functions")
      Signed-off-by: default avatarAya Levin <ayal@nvidia.com>
      Reviewed-by: default avatarEran Ben Elisha <eranbe@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      55affa97
    • Maor Dickman's avatar
      net/mlx5e: Don't match on Geneve options in case option masks are all zero · 385d40b0
      Maor Dickman authored
      The cited change added offload support for Geneve options without verifying
      the validity of the options masks, this caused offload of rules with match
      on Geneve options with class,type and data masks which are zero to fail.
      
      Fix by ignoring the match on Geneve options in case option masks are
      all zero.
      
      Fixes: 9272e3df ("net/mlx5e: Geneve, Add support for encap/decap flows offload")
      Signed-off-by: default avatarMaor Dickman <maord@nvidia.com>
      Reviewed-by: default avatarRoi Dayan <roid@nvidia.com>
      Reviewed-by: default avatarOz Shlomo <ozsh@nvidia.com>
      Reviewed-by: default avatarYevgeny Kliteynik <kliteyn@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      385d40b0