1. 27 Feb, 2020 33 commits
    • mlxsw: pci: Wait longer before accessing the device after reset · ac004e84
      Amit Cohen authored
      During initialization the driver issues a reset to the device and waits
      for 100ms before checking if the firmware is ready. The wait is
      necessary because until then the device is unresponsive and the first
      read can result in a completion timeout.
      
      While 100ms is sufficient for Spectrum-1 and Spectrum-2, it is
      insufficient for Spectrum-3.
      
      Fix this by increasing the timeout to 200ms.
      
      Fixes: da382875 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
      Signed-off-by: Amit Cohen <amitc@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: fix timestamp reconstruction at 16-bit rollover points · 23797b98
      Alex Maftei (amaftei) authored
      We can't just use the top bits of the last sync event as they could be
      off-by-one every 65,536 seconds, giving an error in reconstruction of
      65,536 seconds.
      
      This patch uses the difference in the bottom 16 bits (mod 2^16) to
      calculate an offset that needs to be applied to the last sync event to
      get to the current time.
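
      The reconstruction described above can be sketched in a few lines. This
      is a minimal illustration, not the driver's code: the function and
      parameter names are hypothetical, timestamps are whole seconds, and the
      16-bit difference is assumed to be interpreted as signed so that an event
      slightly older than the last sync event is also handled.

```python
def naive_reconstruct(last_sync_secs: int, low16: int) -> int:
    """Old approach: splice the sync event's top bits onto the event's low 16 bits."""
    return (last_sync_secs & ~0xFFFF) | (low16 & 0xFFFF)


def reconstruct(last_sync_secs: int, low16: int) -> int:
    """Apply the difference of the bottom 16 bits (mod 2^16) to the last sync time."""
    delta = (low16 - last_sync_secs) & 0xFFFF
    # Assumption: interpret the difference as a signed 16-bit value, so the
    # event may lie up to 2^15 seconds before or after the sync event.
    if delta >= 0x8000:
        delta -= 0x10000
    return last_sync_secs + delta
```

      With a sync event at 0x1FFFF seconds and an event whose low bits read
      0x0001, the splice gives 0x10001 (65,536 seconds too early), while the
      offset approach gives 0x20001.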
      Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
      Acked-by: Martin Habets <mhabets@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vsock: fix potential deadlock in transport->release() · 3f74957f
      Stefano Garzarella authored
      Some transports (hyperv, virtio) acquire the sock lock during the
      .release() callback.
      
      In the vsock_stream_connect() we call vsock_assign_transport(); if
      the socket was previously assigned to another transport, the
      vsk->transport->release() is called, but the sock lock is already
      held in the vsock_stream_connect(), causing a deadlock reported by
      syzbot:
      
          INFO: task syz-executor280:9768 blocked for more than 143 seconds.
            Not tainted 5.6.0-rc1-syzkaller #0
          "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
          syz-executor280 D27912  9768   9766 0x00000000
          Call Trace:
           context_switch kernel/sched/core.c:3386 [inline]
           __schedule+0x934/0x1f90 kernel/sched/core.c:4082
           schedule+0xdc/0x2b0 kernel/sched/core.c:4156
           __lock_sock+0x165/0x290 net/core/sock.c:2413
           lock_sock_nested+0xfe/0x120 net/core/sock.c:2938
           virtio_transport_release+0xc4/0xd60 net/vmw_vsock/virtio_transport_common.c:832
           vsock_assign_transport+0xf3/0x3b0 net/vmw_vsock/af_vsock.c:454
           vsock_stream_connect+0x2b3/0xc70 net/vmw_vsock/af_vsock.c:1288
           __sys_connect_file+0x161/0x1c0 net/socket.c:1857
           __sys_connect+0x174/0x1b0 net/socket.c:1874
           __do_sys_connect net/socket.c:1885 [inline]
           __se_sys_connect net/socket.c:1882 [inline]
           __x64_sys_connect+0x73/0xb0 net/socket.c:1882
           do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
           entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      To avoid this issue, this patch removes the lock acquisition from the
      .release() callback of the hyperv and virtio transports, and instead
      holds the lock when vsk->transport->release() is called in the vsock core.
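
      The locking change can be modelled in user space. The sketch below is an
      illustration only: the class and function names are hypothetical, and a
      non-reentrant threading.Lock stands in for the sock lock. A non-blocking
      acquire is used so that the "deadlock" shows up as a failed acquire
      instead of a hang.

```python
import threading


class Sock:
    def __init__(self):
        self.lock = threading.Lock()  # non-reentrant, stands in for the sock lock
        self.released = False


def release_taking_lock(sk):
    """Old-style callback: acquires the lock itself."""
    if not sk.lock.acquire(blocking=False):
        return False  # a blocking acquire would wait forever here
    try:
        sk.released = True
    finally:
        sk.lock.release()
    return True


def release_caller_locked(sk):
    """New-style callback: relies on the caller already holding the lock."""
    assert sk.lock.locked()
    sk.released = True
    return True


def connect(sk, release):
    """Holds the lock across transport reassignment, as the connect path does."""
    with sk.lock:
        return release(sk)
```

      Calling connect() with release_taking_lock fails to take the lock, while
      release_caller_locked completes because locking is left to the caller.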
      
      Reported-by: syzbot+731710996d79d0d58fbc@syzkaller.appspotmail.com
      Fixes: 408624af ("vsock: use local transport when it is loaded")
      Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • unix: It's CONFIG_PROC_FS not CONFIG_PROCFS · 5c05a164
      David S. Miller authored
      Fixes: 3a12500e ("unix: define and set show_fdinfo only if procfs is enabled")
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-rmnet-fix-several-bugs' · 795c03a5
      David S. Miller authored
      Taehee Yoo says:
      
      ====================
      net: rmnet: fix several bugs
      
      This patchset fixes several bugs in the RMNET module.
      
      1. The first patch fixes NULL-ptr-deref in rmnet_newlink().
      When an rmnet interface is being created, it uses IFLA_LINK
      without checking for NULL.
      So, if userspace doesn't set IFLA_LINK, a panic will occur.
      This patch adds the missing NULL pointer check.
      
      2. The second patch fixes NULL-ptr-deref in rmnet_changelink().
      To get real device in rmnet_changelink(), it uses IFLA_LINK.
      But, IFLA_LINK should not be used in rmnet_changelink().
      
      3. The third patch fixes suspicious RCU usage in rmnet_get_port().
      rmnet_get_port() uses rcu_dereference_rtnl().
      But, rmnet_get_port() is used by datapath.
      So, rcu_dereference_bh() should be used instead of rcu_dereference_rtnl().
      
      4. The fourth patch fixes suspicious RCU usage in
      rmnet_force_unassociate_device().
      Code inside an RCU critical section must not sleep.
      But, unregister_netdevice_queue() in rmnet_force_unassociate_device()
      can sleep.
      So, the RCU warning occurs.
      In this patch, the rcu_read_lock() in rmnet_force_unassociate_device()
      is removed because it's unnecessary.
      
      5. The fifth patch fixes the duplicate MUX ID case.
      An RMNET MUX ID must be unique.
      So, an rmnet interface that has a duplicate MUX ID isn't allowed
      to be created.
      But, only rmnet_newlink() checks this condition; rmnet_changelink()
      doesn't.
      So, a duplicate MUX ID could still be configured.
      
      6. The sixth patch fixes upper/lower interface relationship problems.
      When IFLA_LINK is used, the upper/lower infrastructure should be used,
      because it checks the maximum depth of upper/lower interfaces and also
      detects circular interface relationships, etc.
      In this patch, netdev_upper_dev_link() is used.
      
      7. The seventh patch fixes bridge related problems.
      a) ->ndo_del_slave() doesn't work.
      b) It couldn't detect circular upper/lower interface relationship.
      c) It couldn't prevent stack overflow caused by excessively deep
      nesting of upper/lower interfaces.
      d) It doesn't check the number of lower interfaces.
      e) Panics because of several reasons.
      These problems all share the same root cause.
      So, this patch fixes them together.
      
      8. The eighth patch fixes a packet forwarding issue in bridge mode.
      Packet forwarding is not working in rmnet bridge mode,
      because forwarding a packet requires skb_push() for the ethernet
      header, but skb_push() is not called.
      So, the ethernet header is lost.
      
      Change log:
       - update commit logs.
       - drop two patches in this patchset because of wrong target branch.
         - ("net: rmnet: add missing module alias")
         - ("net: rmnet: print error message when command fails")
       - remove unnecessary rcu_read_lock() in the third patch.
       - use rcu_dereference_bh() instead of rcu_dereference in third patch.
       - do not allow to add a bridge device if rmnet interface is already
         bridge mode in the seventh patch.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rmnet: fix packet forwarding in rmnet bridge mode · ad3cc31b
      Taehee Yoo authored
      Packet forwarding is not working in rmnet bridge mode,
      because forwarding a packet requires skb_push() for the ethernet
      header, but skb_push() is not called.
      So, the ethernet header is lost.
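
      The effect of the missing skb_push() can be shown with a toy buffer. This
      is a sketch under stated assumptions, not kernel code: the Skb class and
      the receive()/forward() helpers are hypothetical, and it assumes the
      receive path has already advanced the data offset past the 14-byte
      ethernet header before the packet reaches the forwarding step.

```python
ETH_HLEN = 14  # length of an ethernet header in bytes


class Skb:
    """Toy socket buffer: `buf` is the whole frame, `data` the current offset."""

    def __init__(self, frame: bytes):
        self.buf = frame
        self.data = 0

    def pull(self, n: int) -> None:
        """Advance past n bytes of header."""
        self.data += n

    def push(self, n: int) -> None:
        """Expose n bytes in front of the current data offset again."""
        if n > self.data:
            raise ValueError("no headroom")
        self.data -= n

    def payload(self) -> bytes:
        return self.buf[self.data:]


def receive(frame: bytes) -> Skb:
    skb = Skb(frame)
    skb.pull(ETH_HLEN)  # assumption: the header was consumed on receive
    return skb


def forward(skb: Skb, restore_header: bool) -> bytes:
    """Return the bytes that would be transmitted on the other bridge port."""
    if restore_header:
        skb.push(ETH_HLEN)
    return skb.payload()
```

      Without the push, only the bytes after the header are sent; with it, the
      forwarded bytes equal the original frame.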
      
      Test commands:
          modprobe rmnet
          ip netns add nst
          ip netns add nst2
          ip link add veth0 type veth peer name veth1
          ip link add veth2 type veth peer name veth3
          ip link set veth1 netns nst
          ip link set veth3 netns nst2
      
          ip link add rmnet0 link veth0 type rmnet mux_id 1
          ip link set veth2 master rmnet0
          ip link set veth0 up
          ip link set veth2 up
          ip link set rmnet0 up
          ip a a 192.168.100.1/24 dev rmnet0
      
          ip netns exec nst ip link set veth1 up
          ip netns exec nst ip a a 192.168.100.2/24 dev veth1
          ip netns exec nst2 ip link set veth3 up
          ip netns exec nst2 ip a a 192.168.100.3/24 dev veth3
          ip netns exec nst2 ping 192.168.100.2
      
      Fixes: 60d58f97 ("net: qualcomm: rmnet: Implement bridge mode")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rmnet: fix bridge mode bugs · d939b6d3
      Taehee Yoo authored
      In order to attach a bridge interface to the rmnet interface,
      the "master" operation is used.
      (e.g. ip link set dummy1 master rmnet0)
      But rmnet_add_bridge(), which is the ->ndo_add_slave() callback,
      doesn't register the lower interface.
      So, ->ndo_del_slave() doesn't work.
      There are other problems too.
      1. It couldn't detect circular upper/lower interface relationship.
      2. It couldn't prevent stack overflow caused by excessively deep
      nesting of upper/lower interfaces.
      3. It doesn't check the number of lower interfaces.
      4. Panics because of several reasons.
      
      The root cause of these issues is actually the same.
      So, this patch fixes all of these problems.
      
      Test commands:
          modprobe rmnet
          ip link add dummy0 type dummy
          ip link add rmnet0 link dummy0 type rmnet mux_id 1
          ip link add dummy1 master rmnet0 type dummy
          ip link add dummy2 master rmnet0 type dummy
          ip link del rmnet0
          ip link del dummy2
          ip link del dummy1
      
      Splat looks like:
      [   41.867595][ T1164] general protection fault, probably for non-canonical address 0xdffffc0000000101I
      [   41.869993][ T1164] KASAN: null-ptr-deref in range [0x0000000000000808-0x000000000000080f]
      [   41.872950][ T1164] CPU: 0 PID: 1164 Comm: ip Not tainted 5.6.0-rc1+ #447
      [   41.873915][ T1164] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   41.875161][ T1164] RIP: 0010:rmnet_unregister_bridge.isra.6+0x71/0xf0 [rmnet]
      [   41.876178][ T1164] Code: 48 89 ef 48 89 c6 5b 5d e9 fc fe ff ff e8 f7 f3 ff ff 48 8d b8 08 08 00 00 48 ba 00 7
      [   41.878925][ T1164] RSP: 0018:ffff8880c4d0f188 EFLAGS: 00010202
      [   41.879774][ T1164] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000101
      [   41.887689][ T1164] RDX: dffffc0000000000 RSI: ffffffffb8cf64f0 RDI: 0000000000000808
      [   41.888727][ T1164] RBP: ffff8880c40e4000 R08: ffffed101b3c0e3c R09: 0000000000000001
      [   41.889749][ T1164] R10: 0000000000000001 R11: ffffed101b3c0e3b R12: 1ffff110189a1e3c
      [   41.890783][ T1164] R13: ffff8880c4d0f200 R14: ffffffffb8d56160 R15: ffff8880ccc2c000
      [   41.891794][ T1164] FS:  00007f4300edc0c0(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000
      [   41.892953][ T1164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   41.893800][ T1164] CR2: 00007f43003bc8c0 CR3: 00000000ca53e001 CR4: 00000000000606f0
      [   41.894824][ T1164] Call Trace:
      [   41.895274][ T1164]  ? rcu_is_watching+0x2c/0x80
      [   41.895895][ T1164]  rmnet_config_notify_cb+0x1f7/0x590 [rmnet]
      [   41.896687][ T1164]  ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
      [   41.897611][ T1164]  ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
      [   41.898508][ T1164]  ? __module_text_address+0x13/0x140
      [   41.899162][ T1164]  notifier_call_chain+0x90/0x160
      [   41.899814][ T1164]  rollback_registered_many+0x660/0xcf0
      [   41.900544][ T1164]  ? netif_set_real_num_tx_queues+0x780/0x780
      [   41.901316][ T1164]  ? __lock_acquire+0xdfe/0x3de0
      [   41.901958][ T1164]  ? memset+0x1f/0x40
      [   41.902468][ T1164]  ? __nla_validate_parse+0x98/0x1ab0
      [   41.903166][ T1164]  unregister_netdevice_many.part.133+0x13/0x1b0
      [   41.903988][ T1164]  rtnl_delete_link+0xbc/0x100
      [ ... ]
      
      Fixes: 60d58f97 ("net: qualcomm: rmnet: Implement bridge mode")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rmnet: use upper/lower device infrastructure · 037f9cdf
      Taehee Yoo authored
      netdev_upper_dev_link() is useful for managing lower/upper interfaces.
      It also internally checks for loops and for the maximum nesting depth.
      All or most virtual interfaces that could have a real interface
      (e.g. macsec, macvlan, ipvlan etc.) use lower/upper infrastructure.
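
      The two checks mentioned above (loops and maximum depth) can be sketched
      as a small graph model. This is an illustration, not the kernel
      implementation: Dev, upper_dev_link() and stack_rmnet() are hypothetical
      names, and the nesting limit of 8 is an assumption made for the sketch.

```python
MAX_NEST_DEV = 8  # assumed nesting limit for this sketch


class Dev:
    def __init__(self, name):
        self.name = name
        self.uppers = []
        self.lowers = []


def _levels_below(dev):
    """Devices in the longest chain from dev downwards, dev included."""
    return 1 + max((_levels_below(d) for d in dev.lowers), default=0)


def _levels_above(dev):
    """Devices in the longest chain from dev upwards, dev included."""
    return 1 + max((_levels_above(d) for d in dev.uppers), default=0)


def _is_below(dev, target):
    """True if target is reachable from dev by following lower links."""
    return any(d is target or _is_below(d, target) for d in dev.lowers)


def upper_dev_link(lower, upper):
    """Link upper on top of lower, rejecting loops and over-deep stacks."""
    if lower is upper or _is_below(lower, upper):
        raise ValueError("circular upper/lower relationship")
    if _levels_below(lower) + _levels_above(upper) > MAX_NEST_DEV:
        raise ValueError("nesting too deep")
    lower.uppers.append(upper)
    upper.lowers.append(lower)


def stack_rmnet(count):
    """Stack up to `count` devices on dummy0; return how many links succeeded."""
    below = Dev("dummy0")
    accepted = 0
    for i in range(1, count + 1):
        dev = Dev(f"rmnet{i}")
        try:
            upper_dev_link(below, dev)
        except ValueError:
            break
        below = dev
        accepted += 1
    return accepted
```

      In this model, stacking 100 devices as in the test commands stops once
      the chain reaches the assumed limit, so the unbounded recursion that
      overflows the stack cannot be set up in the first place.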
      
      Test commands:
          modprobe rmnet
          ip link add dummy0 type dummy
          ip link add rmnet1 link dummy0 type rmnet mux_id 1
          for i in {2..100}
          do
              let A=$i-1
              ip link add rmnet$i link rmnet$A type rmnet mux_id $i
          done
          ip link del dummy0
      
      The purpose of the test commands is to trigger a stack overflow.
      
      Splat looks like:
      [   52.411438][ T1395] BUG: KASAN: slab-out-of-bounds in find_busiest_group+0x27e/0x2c00
      [   52.413218][ T1395] Write of size 64 at addr ffff8880c774bde0 by task ip/1395
      [   52.414841][ T1395]
      [   52.430720][ T1395] CPU: 1 PID: 1395 Comm: ip Not tainted 5.6.0-rc1+ #447
      [   52.496511][ T1395] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   52.513597][ T1395] Call Trace:
      [   52.546516][ T1395]
      [   52.558773][ T1395] Allocated by task 3171537984:
      [   52.588290][ T1395] BUG: unable to handle page fault for address: ffffffffb999e260
      [   52.589311][ T1395] #PF: supervisor read access in kernel mode
      [   52.590529][ T1395] #PF: error_code(0x0000) - not-present page
      [   52.591374][ T1395] PGD d6818067 P4D d6818067 PUD d6819063 PMD 0
      [   52.592288][ T1395] Thread overran stack, or stack corrupted
      [   52.604980][ T1395] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [   52.605856][ T1395] CPU: 1 PID: 1395 Comm: ip Not tainted 5.6.0-rc1+ #447
      [   52.611764][ T1395] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   52.621520][ T1395] RIP: 0010:stack_depot_fetch+0x10/0x30
      [   52.622296][ T1395] Code: ff e9 f9 fe ff ff 48 89 df e8 9c 1d 91 ff e9 ca fe ff ff cc cc cc cc cc cc cc 89 f8 0
      [   52.627887][ T1395] RSP: 0018:ffff8880c774bb60 EFLAGS: 00010006
      [   52.628735][ T1395] RAX: 00000000001f8880 RBX: ffff8880c774d140 RCX: 0000000000000000
      [   52.631773][ T1395] RDX: 000000000000001d RSI: ffff8880c774bb68 RDI: 0000000000003ff0
      [   52.649584][ T1395] RBP: ffffea00031dd200 R08: ffffed101b43e403 R09: ffffed101b43e403
      [   52.674857][ T1395] R10: 0000000000000001 R11: ffffed101b43e402 R12: ffff8880d900e5c0
      [   52.678257][ T1395] R13: ffff8880c774c000 R14: 0000000000000000 R15: dffffc0000000000
      [   52.694541][ T1395] FS:  00007fe867f6e0c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000
      [   52.764039][ T1395] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   52.815008][ T1395] CR2: ffffffffb999e260 CR3: 00000000c26aa005 CR4: 00000000000606e0
      [   52.862312][ T1395] Call Trace:
      [   52.887133][ T1395] Modules linked in: dummy rmnet veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_dex
      [   52.936749][ T1395] CR2: ffffffffb999e260
      [   52.965695][ T1395] ---[ end trace 7e32ca99482dbb31 ]---
      [   52.966556][ T1395] RIP: 0010:stack_depot_fetch+0x10/0x30
      [   52.971083][ T1395] Code: ff e9 f9 fe ff ff 48 89 df e8 9c 1d 91 ff e9 ca fe ff ff cc cc cc cc cc cc cc 89 f8 0
      [   53.003650][ T1395] RSP: 0018:ffff8880c774bb60 EFLAGS: 00010006
      [   53.043183][ T1395] RAX: 00000000001f8880 RBX: ffff8880c774d140 RCX: 0000000000000000
      [   53.076480][ T1395] RDX: 000000000000001d RSI: ffff8880c774bb68 RDI: 0000000000003ff0
      [   53.093858][ T1395] RBP: ffffea00031dd200 R08: ffffed101b43e403 R09: ffffed101b43e403
      [   53.112795][ T1395] R10: 0000000000000001 R11: ffffed101b43e402 R12: ffff8880d900e5c0
      [   53.139837][ T1395] R13: ffff8880c774c000 R14: 0000000000000000 R15: dffffc0000000000
      [   53.141500][ T1395] FS:  00007fe867f6e0c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000
      [   53.143343][ T1395] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   53.152007][ T1395] CR2: ffffffffb999e260 CR3: 00000000c26aa005 CR4: 00000000000606e0
      [   53.156459][ T1395] Kernel panic - not syncing: Fatal exception
      [   54.213570][ T1395] Shutting down cpus with NMI
      [   54.354112][ T1395] Kernel Offset: 0x33000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0x)
      [   54.355687][ T1395] Rebooting in 5 seconds..
      
      Fixes: b37f78f2 ("net: qualcomm: rmnet: Fix crash on real dev unregistration")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rmnet: do not allow to change mux id if mux id is duplicated · 1dc49e9d
      Taehee Yoo authored
      Basically, a duplicate mux id isn't allowed.
      So, the creation of an rmnet interface fails if a duplicate mux id
      already exists.
      But, the changelink routine doesn't check for a duplicate mux id.
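
      A minimal model of the missing check, mirroring the test commands below.
      The Port class and the newlink()/changelink() helpers are hypothetical
      stand-ins for the driver's per-device state, and returning -EINVAL for a
      duplicate is an assumption of this sketch.

```python
import errno


class Port:
    """Toy per-real-device state: maps mux_id to interface name."""

    def __init__(self):
        self.endpoints = {}


def newlink(port, name, mux_id):
    if mux_id in port.endpoints:
        return -errno.EINVAL  # duplicate mux id
    port.endpoints[mux_id] = name
    return 0


def changelink(port, name, new_mux_id):
    # The same duplicate check that newlink() performs.
    if new_mux_id in port.endpoints:
        return -errno.EINVAL
    old = next(m for m, n in port.endpoints.items() if n == name)
    del port.endpoints[old]
    port.endpoints[new_mux_id] = name
    return 0
```

      Changing rmnet1 to mux_id 1 while rmnet0 already uses it is rejected and
      leaves the existing mapping untouched.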
      
      Test commands:
          modprobe rmnet
          ip link add dummy0 type dummy
          ip link add rmnet0 link dummy0 type rmnet mux_id 1
          ip link add rmnet1 link dummy0 type rmnet mux_id 2
          ip link set rmnet1 type rmnet mux_id 1
      
      Fixes: 23790ef1 ("net: qualcomm: rmnet: Allow to configure flags for existing devices")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rmnet: remove rcu_read_lock in rmnet_force_unassociate_device() · c026d970
      Taehee Yoo authored
      The notifier_call() of the slave interface removes the rmnet interface with
      unregister_netdevice_queue().
      But, before calling unregister_netdevice_queue(), it acquires
      the RCU read lock.
      In an RCU critical section, sleeping isn't allowed.
      But, unregister_netdevice_queue() internally calls synchronize_net(),
      which would sleep.
      So, a suspicious RCU usage warning occurs.
      
      Test commands:
          modprobe rmnet
          ip link add dummy0 type dummy
          ip link add dummy1 type dummy
          ip link add rmnet0 link dummy0 type rmnet mux_id 1
          ip link set dummy1 master rmnet0
          ip link del dummy0
      
      Splat looks like:
      [   79.639245][ T1195] =============================
      [   79.640134][ T1195] WARNING: suspicious RCU usage
      [   79.640852][ T1195] 5.6.0-rc1+ #447 Not tainted
      [   79.641657][ T1195] -----------------------------
      [   79.642472][ T1195] ./include/linux/rcupdate.h:273 Illegal context switch in RCU read-side critical section!
      [   79.644043][ T1195]
      [   79.644043][ T1195] other info that might help us debug this:
      [   79.644043][ T1195]
      [   79.645682][ T1195]
      [   79.645682][ T1195] rcu_scheduler_active = 2, debug_locks = 1
      [   79.646980][ T1195] 2 locks held by ip/1195:
      [   79.647629][ T1195]  #0: ffffffffa3cf64f0 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x457/0x890
      [   79.649312][ T1195]  #1: ffffffffa39256c0 (rcu_read_lock){....}, at: rmnet_config_notify_cb+0xf0/0x590 [rmnet]
      [   79.651717][ T1195]
      [   79.651717][ T1195] stack backtrace:
      [   79.652650][ T1195] CPU: 3 PID: 1195 Comm: ip Not tainted 5.6.0-rc1+ #447
      [   79.653702][ T1195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   79.655037][ T1195] Call Trace:
      [   79.655560][ T1195]  dump_stack+0x96/0xdb
      [   79.656252][ T1195]  ___might_sleep+0x345/0x440
      [   79.656994][ T1195]  synchronize_net+0x18/0x30
      [   79.661132][ T1195]  netdev_rx_handler_unregister+0x40/0xb0
      [   79.666266][ T1195]  rmnet_unregister_real_device+0x42/0xb0 [rmnet]
      [   79.667211][ T1195]  rmnet_config_notify_cb+0x1f7/0x590 [rmnet]
      [   79.668121][ T1195]  ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
      [   79.669166][ T1195]  ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
      [   79.670286][ T1195]  ? __module_text_address+0x13/0x140
      [   79.671139][ T1195]  notifier_call_chain+0x90/0x160
      [   79.671973][ T1195]  rollback_registered_many+0x660/0xcf0
      [   79.672893][ T1195]  ? netif_set_real_num_tx_queues+0x780/0x780
      [   79.675091][ T1195]  ? __lock_acquire+0xdfe/0x3de0
      [   79.675825][ T1195]  ? memset+0x1f/0x40
      [   79.676367][ T1195]  ? __nla_validate_parse+0x98/0x1ab0
      [   79.677290][ T1195]  unregister_netdevice_many.part.133+0x13/0x1b0
      [   79.678163][ T1195]  rtnl_delete_link+0xbc/0x100
      [ ... ]
      
      Fixes: ceed73a2 ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rmnet: fix suspicious RCU usage · 102210f7
      Taehee Yoo authored
      rmnet_get_port() internally calls rcu_dereference_rtnl(),
      which checks RTNL.
      But rmnet_get_port() could be called from the packet path,
      which is not protected by RTNL.
      So, the suspicious RCU usage problem occurs.
      
      Test commands:
          modprobe rmnet
          ip netns add nst
          ip link add veth0 type veth peer name veth1
          ip link set veth1 netns nst
          ip link add rmnet0 link veth0 type rmnet mux_id 1
          ip netns exec nst ip link add rmnet1 link veth1 type rmnet mux_id 1
          ip netns exec nst ip link set veth1 up
          ip netns exec nst ip link set rmnet1 up
          ip netns exec nst ip a a 192.168.100.2/24 dev rmnet1
          ip link set veth0 up
          ip link set rmnet0 up
          ip a a 192.168.100.1/24 dev rmnet0
          ping 192.168.100.2
      
      Splat looks like:
      [  146.630958][ T1174] WARNING: suspicious RCU usage
      [  146.631735][ T1174] 5.6.0-rc1+ #447 Not tainted
      [  146.632387][ T1174] -----------------------------
      [  146.633151][ T1174] drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c:386 suspicious rcu_dereference_check() !
      [  146.634742][ T1174]
      [  146.634742][ T1174] other info that might help us debug this:
      [  146.634742][ T1174]
      [  146.645992][ T1174]
      [  146.645992][ T1174] rcu_scheduler_active = 2, debug_locks = 1
      [  146.646937][ T1174] 5 locks held by ping/1174:
      [  146.647609][ T1174]  #0: ffff8880c31dea70 (sk_lock-AF_INET){+.+.}, at: raw_sendmsg+0xab8/0x2980
      [  146.662463][ T1174]  #1: ffffffff93925660 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x243/0x2150
      [  146.671696][ T1174]  #2: ffffffff93925660 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x213/0x2940
      [  146.673064][ T1174]  #3: ffff8880c19ecd58 (&dev->qdisc_running_key#7){+...}, at: ip_finish_output2+0x714/0x2150
      [  146.690358][ T1174]  #4: ffff8880c5796898 (&dev->qdisc_xmit_lock_key#3){+.-.}, at: sch_direct_xmit+0x1e2/0x1020
      [  146.699875][ T1174]
      [  146.699875][ T1174] stack backtrace:
      [  146.701091][ T1174] CPU: 0 PID: 1174 Comm: ping Not tainted 5.6.0-rc1+ #447
      [  146.705215][ T1174] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  146.706565][ T1174] Call Trace:
      [  146.707102][ T1174]  dump_stack+0x96/0xdb
      [  146.708007][ T1174]  rmnet_get_port.part.9+0x76/0x80 [rmnet]
      [  146.709233][ T1174]  rmnet_egress_handler+0x107/0x420 [rmnet]
      [  146.710492][ T1174]  ? sch_direct_xmit+0x1e2/0x1020
      [  146.716193][ T1174]  rmnet_vnd_start_xmit+0x3d/0xa0 [rmnet]
      [  146.717012][ T1174]  dev_hard_start_xmit+0x160/0x740
      [  146.717854][ T1174]  sch_direct_xmit+0x265/0x1020
      [  146.718577][ T1174]  ? register_lock_class+0x14d0/0x14d0
      [  146.719429][ T1174]  ? dev_watchdog+0xac0/0xac0
      [  146.723738][ T1174]  ? __dev_queue_xmit+0x15fd/0x2940
      [  146.724469][ T1174]  ? lock_acquire+0x164/0x3b0
      [  146.725172][ T1174]  __dev_queue_xmit+0x20c7/0x2940
      [ ... ]
      
      Fixes: ceed73a2 ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rmnet: fix NULL pointer dereference in rmnet_changelink() · 1eb1f43a
      Taehee Yoo authored
      rmnet_changelink() uses IFLA_LINK without checking for a
      NULL pointer.
      tb[IFLA_LINK] could be a NULL pointer.
      So, a NULL-ptr-deref could occur.
      
      rmnet already has a lower interface (real_dev).
      So, after this patch, rmnet_changelink() does not use IFLA_LINK anymore.
      
      Test commands:
          modprobe rmnet
          ip link add dummy0 type dummy
          ip link add rmnet0 link dummy0 type rmnet mux_id 1
          ip link set rmnet0 type rmnet mux_id 2
      
      Splat looks like:
      [   90.578726][ T1131] general protection fault, probably for non-canonical address 0xdffffc0000000000I
      [   90.581121][ T1131] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      [   90.582380][ T1131] CPU: 2 PID: 1131 Comm: ip Not tainted 5.6.0-rc1+ #447
      [   90.584285][ T1131] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   90.587506][ T1131] RIP: 0010:rmnet_changelink+0x5a/0x8a0 [rmnet]
      [   90.588546][ T1131] Code: 83 ec 20 48 c1 ea 03 80 3c 02 00 0f 85 6f 07 00 00 48 8b 5e 28 48 b8 00 00 00 00 00 0
      [   90.591447][ T1131] RSP: 0018:ffff8880ce78f1b8 EFLAGS: 00010247
      [   90.592329][ T1131] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffff8880ce78f8b0
      [   90.593253][ T1131] RDX: 0000000000000000 RSI: ffff8880ce78f4a0 RDI: 0000000000000004
      [   90.594058][ T1131] RBP: ffff8880cf543e00 R08: 0000000000000002 R09: 0000000000000002
      [   90.594859][ T1131] R10: ffffffffc0586a40 R11: 0000000000000000 R12: ffff8880ca47c000
      [   90.595690][ T1131] R13: ffff8880ca47c000 R14: ffff8880cf545000 R15: 0000000000000000
      [   90.596553][ T1131] FS:  00007f21f6c7e0c0(0000) GS:ffff8880da400000(0000) knlGS:0000000000000000
      [   90.597504][ T1131] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   90.599418][ T1131] CR2: 0000556e413db458 CR3: 00000000c917a002 CR4: 00000000000606e0
      [   90.600289][ T1131] Call Trace:
      [   90.600631][ T1131]  __rtnl_newlink+0x922/0x1270
      [   90.601194][ T1131]  ? lock_downgrade+0x6e0/0x6e0
      [   90.601724][ T1131]  ? rtnl_link_unregister+0x220/0x220
      [   90.602309][ T1131]  ? lock_acquire+0x164/0x3b0
      [   90.602784][ T1131]  ? is_bpf_image_address+0xff/0x1d0
      [   90.603331][ T1131]  ? rtnl_newlink+0x4c/0x90
      [   90.603810][ T1131]  ? kernel_text_address+0x111/0x140
      [   90.604419][ T1131]  ? __kernel_text_address+0xe/0x30
      [   90.604981][ T1131]  ? unwind_get_return_address+0x5f/0xa0
      [   90.605616][ T1131]  ? create_prof_cpu_mask+0x20/0x20
      [   90.606304][ T1131]  ? arch_stack_walk+0x83/0xb0
      [   90.606985][ T1131]  ? stack_trace_save+0x82/0xb0
      [   90.607656][ T1131]  ? stack_trace_consume_entry+0x160/0x160
      [   90.608503][ T1131]  ? deactivate_slab.isra.78+0x2c5/0x800
      [   90.609336][ T1131]  ? kasan_unpoison_shadow+0x30/0x40
      [   90.610096][ T1131]  ? kmem_cache_alloc_trace+0x135/0x350
      [   90.610889][ T1131]  ? rtnl_newlink+0x4c/0x90
      [   90.611512][ T1131]  rtnl_newlink+0x65/0x90
      [ ... ]
      
      Fixes: 23790ef1 ("net: qualcomm: rmnet: Allow to configure flags for existing devices")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rmnet: fix NULL pointer dereference in rmnet_newlink() · 93b5cbfa
      Taehee Yoo authored
      rmnet registers IFLA_LINK interface as a lower interface.
      But, IFLA_LINK could be NULL.
      In the current code, rmnet doesn't check IFLA_LINK.
      So, panic would occur.
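
      The shape of the fix can be sketched as an early validation step. This
      is a hedged model, not the driver's code: the attribute table is
      represented as a dict, the function name and the `devices` lookup table
      are hypothetical, and the specific error codes are assumptions of the
      sketch.

```python
import errno


def rmnet_newlink_check(tb, devices):
    """Validate the link attribute before it is used to look up the real device."""
    link = tb.get("IFLA_LINK")
    if link is None:
        # Userspace did not supply a lower device: fail cleanly, no dereference.
        return -errno.EINVAL
    if link not in devices:
        return -errno.ENODEV
    return 0
```

      The test command "ip link add rmnet0 type rmnet mux_id 1" corresponds to
      an empty attribute table here, which is rejected with an error instead of
      being dereferenced.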
      
      Test commands:
          modprobe rmnet
          ip link add rmnet0 type rmnet mux_id 1
      
      Splat looks like:
      [   36.826109][ T1115] general protection fault, probably for non-canonical address 0xdffffc0000000000I
      [   36.838817][ T1115] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      [   36.839908][ T1115] CPU: 1 PID: 1115 Comm: ip Not tainted 5.6.0-rc1+ #447
      [   36.840569][ T1115] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   36.841408][ T1115] RIP: 0010:rmnet_newlink+0x54/0x510 [rmnet]
      [   36.841986][ T1115] Code: 83 ec 18 48 c1 e9 03 80 3c 01 00 0f 85 d4 03 00 00 48 8b 6a 28 48 b8 00 00 00 00 00 c
      [   36.843923][ T1115] RSP: 0018:ffff8880b7e0f1c0 EFLAGS: 00010247
      [   36.844756][ T1115] RAX: dffffc0000000000 RBX: ffff8880d14cca00 RCX: 1ffff11016fc1e99
      [   36.845859][ T1115] RDX: 0000000000000000 RSI: ffff8880c3d04000 RDI: 0000000000000004
      [   36.846961][ T1115] RBP: 0000000000000000 R08: ffff8880b7e0f8b0 R09: ffff8880b6ac2d90
      [   36.848020][ T1115] R10: ffffffffc0589a40 R11: ffffed1016d585b7 R12: ffffffff88ceaf80
      [   36.848788][ T1115] R13: ffff8880c3d04000 R14: ffff8880b7e0f8b0 R15: ffff8880c3d04000
      [   36.849546][ T1115] FS:  00007f50ab3360c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000
      [   36.851784][ T1115] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   36.852422][ T1115] CR2: 000055871afe5ab0 CR3: 00000000ae246001 CR4: 00000000000606e0
      [   36.853181][ T1115] Call Trace:
      [   36.853514][ T1115]  __rtnl_newlink+0xbdb/0x1270
      [   36.853967][ T1115]  ? lock_downgrade+0x6e0/0x6e0
      [   36.854420][ T1115]  ? rtnl_link_unregister+0x220/0x220
      [   36.854936][ T1115]  ? lock_acquire+0x164/0x3b0
      [   36.855376][ T1115]  ? is_bpf_image_address+0xff/0x1d0
      [   36.855884][ T1115]  ? rtnl_newlink+0x4c/0x90
      [   36.856304][ T1115]  ? kernel_text_address+0x111/0x140
      [   36.856857][ T1115]  ? __kernel_text_address+0xe/0x30
      [   36.857440][ T1115]  ? unwind_get_return_address+0x5f/0xa0
      [   36.858063][ T1115]  ? create_prof_cpu_mask+0x20/0x20
      [   36.858644][ T1115]  ? arch_stack_walk+0x83/0xb0
      [   36.859171][ T1115]  ? stack_trace_save+0x82/0xb0
      [   36.859710][ T1115]  ? stack_trace_consume_entry+0x160/0x160
      [   36.860357][ T1115]  ? deactivate_slab.isra.78+0x2c5/0x800
      [   36.860928][ T1115]  ? kasan_unpoison_shadow+0x30/0x40
      [   36.861520][ T1115]  ? kmem_cache_alloc_trace+0x135/0x350
      [   36.862125][ T1115]  ? rtnl_newlink+0x4c/0x90
      [   36.864073][ T1115]  rtnl_newlink+0x65/0x90
      [ ... ]
      
      Fixes: ceed73a2 ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      93b5cbfa
    • Russell King's avatar
      net: phy: marvell: don't interpret PHY status unless resolved · b82cf17f
      Russell King authored
      Don't attempt to interpret the PHY specific status register unless
      the PHY is indicating that the resolution is valid.
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b82cf17f
    • Jiri Pirko's avatar
      mlx5: register lag notifier for init network namespace only · e387f7d5
      Jiri Pirko authored
      The current code causes problems when the unregistering netdevice could
      be different from the registering one.
      
      Since the check in mlx5_lag_netdev_event() does not allow any other
      network namespace anyway, fix this by registering the lag notifier
      for the init network namespace only.
      
      Fixes: d48834f9 ("mlx5: Use dev_net netdevice notifier registrations")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Tested-by: Aya Levin <ayal@mellanox.com>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e387f7d5
    • Tobias Klauser's avatar
      unix: define and set show_fdinfo only if procfs is enabled · 3a12500e
      Tobias Klauser authored
      Follow the pattern used with other *_show_fdinfo functions and only
      define unix_show_fdinfo and set it in proto_ops if CONFIG_PROC_FS
      is set.
      
      Fixes: 3c32da19 ("unix: Show number of pending scm files of receive queue in fdinfo")
      Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
      Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3a12500e
    • David S. Miller's avatar
      Merge branch 'hinic-BugFixes' · f4979b41
      David S. Miller authored
      Luo bin says:
      
      ====================
      hinic: BugFixes
      
      The bug fixed in patch #2 has been present since the first commit.
      The bugs fixed in patch #1 and patch #3 have been present since the
      following commits:
      patch #1: 352f58b0 ("net-next/hinic: Set Rxq irq to specific cpu for NUMA")
      patch #3: 421e9526 ("hinic: add rss support")
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f4979b41
    • Luo bin's avatar
      hinic: fix a bug of rss configuration · 386d4716
      Luo bin authored
      Use the real receive queue number, rather than the maximal queue
      number, to configure the hw rss indirect table.
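      The point of the fix can be illustrated with a minimal Python model.
      This is a sketch, not the driver code: the helper name
      build_indir_table and the table size of 256 are assumptions made for
      illustration only. It fills the indirection table round-robin over the
      queues that are actually in use.

```python
def build_indir_table(num_rx_queues, table_size=256):
    """Spread table slots round-robin over the receive queues in use.

    table_size=256 is an assumed value for this sketch, not taken
    from the commit message.
    """
    if num_rx_queues <= 0:
        raise ValueError("at least one receive queue is required")
    return [slot % num_rx_queues for slot in range(table_size)]


# With 4 queues in use, every slot points at queue 0..3.
table = build_indir_table(4)
```

      If the table were built from a larger maximal queue count instead,
      some slots would refer to queues that are not active.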
      Signed-off-by: Luo bin <luobin9@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      386d4716
    • Luo bin's avatar
      hinic: fix a bug of setting hw_ioctxt · d2ed69ce
      Luo bin authored
      A reserved field is used to signify the prime physical function index
      in the latest firmware version, so we must assign a value to it
      correctly.
      Signed-off-by: Luo bin <luobin9@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d2ed69ce
    • Luo bin's avatar
      hinic: fix a irq affinity bug · 0bff777b
      Luo bin authored
      A local variable cannot be used as an input parameter of
      irq_set_affinity_hint(), because the pointer is kept and read after
      the calling function returns.
      Signed-off-by: Luo bin <luobin9@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0bff777b
    • Karsten Graul's avatar
      net/smc: check for valid ib_client_data · a2f2ef4a
      Karsten Graul authored
      In smc_ib_remove_dev() check if the provided ib device was actually
      initialized for SMC before.
      
      Reported-by: syzbot+84484ccebdd4e5451d91@syzkaller.appspotmail.com
      Fixes: a4cf0443 ("smc: introduce SMC as an IB-client")
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a2f2ef4a
    • Aaro Koskinen's avatar
      net: stmmac: fix notifier registration · 474a31e1
      Aaro Koskinen authored
      We cannot register the same netdev notifier multiple times when probing
      stmmac devices. Register the notifier only once in module init, and also
      make debugfs creation/deletion safe against simultaneous notifier call.
      
      Fixes: 481a7d15 ("stmmac: debugfs entry name is not be changed when udev rename device name.")
      Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      474a31e1
    • Antoine Tenart's avatar
      net: phy: mscc: fix firmware paths · c87a9d6f
      Antoine Tenart authored
      The firmware paths for the VSC8584 PHYs do not contain the leading
      'microchip/' directory, as used in linux-firmware, resulting in an
      error when probing the driver. This patch fixes it.
      
      Fixes: a5afc167 ("net: phy: mscc: add support for VSC8584 PHY")
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c87a9d6f
    • Paolo Abeni's avatar
      mptcp: add dummy icsk_sync_mss() · dc24f8b4
      Paolo Abeni authored
      syzbot noted that the master MPTCP socket lacks the icsk_sync_mss
      callback, and was able to trigger a null pointer dereference:
      
      BUG: kernel NULL pointer dereference, address: 0000000000000000
      PGD 8e171067 P4D 8e171067 PUD 93fa2067 PMD 0
      Oops: 0010 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 8984 Comm: syz-executor066 Not tainted 5.6.0-rc2-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:0x0
      Code: Bad RIP value.
      RSP: 0018:ffffc900020b7b80 EFLAGS: 00010246
      RAX: 1ffff110124ba600 RBX: 0000000000000000 RCX: ffff88809fefa600
      RDX: ffff8880994cdb18 RSI: 0000000000000000 RDI: ffff8880925d3140
      RBP: ffffc900020b7bd8 R08: ffffffff870225be R09: fffffbfff140652a
      R10: fffffbfff140652a R11: 0000000000000000 R12: ffff8880925d35d0
      R13: ffff8880925d3140 R14: dffffc0000000000 R15: 1ffff110124ba6ba
      FS:  0000000001a0b880(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffffffffd6 CR3: 00000000a6d6f000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       cipso_v4_sock_setattr+0x34b/0x470 net/ipv4/cipso_ipv4.c:1888
       netlbl_sock_setattr+0x2a7/0x310 net/netlabel/netlabel_kapi.c:989
       smack_netlabel security/smack/smack_lsm.c:2425 [inline]
       smack_inode_setsecurity+0x3da/0x4a0 security/smack/smack_lsm.c:2716
       security_inode_setsecurity+0xb2/0x140 security/security.c:1364
       __vfs_setxattr_noperm+0x16f/0x3e0 fs/xattr.c:197
       vfs_setxattr fs/xattr.c:224 [inline]
       setxattr+0x335/0x430 fs/xattr.c:451
       __do_sys_fsetxattr fs/xattr.c:506 [inline]
       __se_sys_fsetxattr+0x130/0x1b0 fs/xattr.c:495
       __x64_sys_fsetxattr+0xbf/0xd0 fs/xattr.c:495
       do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x440199
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffcadc19e48 EFLAGS: 00000246 ORIG_RAX: 00000000000000be
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440199
      RDX: 0000000020000200 RSI: 00000000200001c0 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 0000000000000003 R09: 00000000004002c8
      R10: 0000000000000009 R11: 0000000000000246 R12: 0000000000401a20
      R13: 0000000000401ab0 R14: 0000000000000000 R15: 0000000000000000
      Modules linked in:
      CR2: 0000000000000000
      
      Address the issue by adding a dummy icsk_sync_mss callback.
      To properly sync the subflows' mss and options list we need some
      additional infrastructure, which will land in net-next.
      
      Reported-by: syzbot+f4dfece964792d80b139@syzkaller.appspotmail.com
      Fixes: 2303f994 ("mptcp: Associate MPTCP context with TCP socket")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dc24f8b4
    • Sudheesh Mavila's avatar
      net: phy: corrected the return value for genphy_check_and_restart_aneg and... · 4f31c532
      Sudheesh Mavila authored
      net: phy: corrected the return value for genphy_check_and_restart_aneg and genphy_c45_check_and_restart_aneg
      
      When auto-negotiation is not required, return value should be zero.
      
      Changes v1->v2:
      - improved comments and code as suggested by Andrew Lunn and
        Heiner Kallweit
      - fixed issue in genphy_c45_check_and_restart_aneg as suggested by
        Russell King.
      
      Fixes: 2a10ab04 ("net: phy: add genphy_check_and_restart_aneg()")
      Fixes: 1af9f168 ("net: phy: add genphy_c45_check_and_restart_aneg()")
      Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
      Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f31c532
    • yangerkun's avatar
      slip: not call free_netdev before rtnl_unlock in slip_open · f596c870
      yangerkun authored
      As described in the comment before netdev_run_todo, we cannot call
      free_netdev before rtnl_unlock. Fix it by reordering the code.
      Signed-off-by: yangerkun <yangerkun@huawei.com>
      Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f596c870
    • Eric Dumazet's avatar
      ipv6: restrict IPV6_ADDRFORM operation · b6f61189
      Eric Dumazet authored
      IPV6_ADDRFORM is able to transform IPv6 socket to IPv4 one.
      While this operation sounds illogical, we have to support it.
      
      One of the things it does for TCP socket is to switch sk->sk_prot
      to tcp_prot.
      
      We now have other layers playing with sk->sk_prot, so we should make
      sure to not interfere with them.
      
      This patch makes sure sk_prot is the default pointer for TCP IPv6 socket.
      
      syzbot reported :
      BUG: kernel NULL pointer dereference, address: 0000000000000000
      PGD a0113067 P4D a0113067 PUD a8771067 PMD 0
      Oops: 0010 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 10686 Comm: syz-executor.0 Not tainted 5.6.0-rc2-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:0x0
      Code: Bad RIP value.
      RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246
      RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40
      RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5
      R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098
      R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000
      FS:  00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       inet_release+0x165/0x1c0 net/ipv4/af_inet.c:427
       __sock_release net/socket.c:605 [inline]
       sock_close+0xe1/0x260 net/socket.c:1283
       __fput+0x2e4/0x740 fs/file_table.c:280
       ____fput+0x15/0x20 fs/file_table.c:313
       task_work_run+0x176/0x1b0 kernel/task_work.c:113
       tracehook_notify_resume include/linux/tracehook.h:188 [inline]
       exit_to_usermode_loop arch/x86/entry/common.c:164 [inline]
       prepare_exit_to_usermode+0x480/0x5b0 arch/x86/entry/common.c:195
       syscall_return_slowpath+0x113/0x4a0 arch/x86/entry/common.c:278
       do_syscall_64+0x11f/0x1c0 arch/x86/entry/common.c:304
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x45c429
      Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f2ae75dac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
      RAX: 0000000000000000 RBX: 00007f2ae75db6d4 RCX: 000000000045c429
      RDX: 0000000000000001 RSI: 000000000000011a RDI: 0000000000000004
      RBP: 000000000076bf20 R08: 0000000000000038 R09: 0000000000000000
      R10: 0000000020000180 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 0000000000000a9d R14: 00000000004ccfb4 R15: 000000000076bf2c
      Modules linked in:
      CR2: 0000000000000000
      ---[ end trace 82567b5207e87bae ]---
      RIP: 0010:0x0
      Code: Bad RIP value.
      RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246
      RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40
      RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5
      R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098
      R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000
      FS:  00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot+1938db17e275e85dc328@syzkaller.appspotmail.com
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b6f61189
    • Ursula Braun's avatar
      net/smc: fix cleanup for linkgroup setup failures · 51e3dfa8
      Ursula Braun authored
      If an SMC connection to a certain peer is set up for the first time,
      a new linkgroup is created. In case of setup failures, such a
      linkgroup is unusable and should disappear. As a first step the
      linkgroup is removed from the linkgroup list in smc_lgr_forget().
      
      There are 2 problems:
      * smc_listen_decline() might be called before linkgroup creation,
        resulting in a crash due to calling smc_lgr_forget() with
        parameter NULL.
      * If a setup failure occurs after linkgroup creation, the connection
        is never unregistered from the linkgroup, preventing linkgroup
        freeing.
      
      This patch introduces an enhanced smc_lgr_cleanup_early() function
      which
      * contains a linkgroup check for early smc_listen_decline()
        invocations
      * invokes smc_conn_free() to guarantee unregistering of the
        connection.
      * schedules fast linkgroup removal of the unusable linkgroup
      
      And the unused function smcd_conn_free() is removed from smc_core.h.
      
      Fixes: 3b2dec26 ("net/smc: restructure client and server code in af_smc")
      Fixes: 2a0674ff ("net/smc: improve abnormal termination of link groups")
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      51e3dfa8
    • Nicolas Saenz Julienne's avatar
      net: bcmgenet: Clear ID_MODE_DIS in EXT_RGMII_OOB_CTRL when not needed · 402482a6
      Nicolas Saenz Julienne authored
      Outdated Raspberry Pi 4 firmware might configure the external PHY as
      rgmii although the kernel currently sets it as rgmii-rxid. This makes
      connections unreliable as ID_MODE_DIS is left enabled. To avoid this,
      explicitly clear that bit whenever we don't need it.
      
      Fixes: da388022 ("net: bcmgenet: Add RGMII_RXID support")
      Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      402482a6
    • Jiri Pirko's avatar
      sched: act: count in the size of action flags bitfield · 1521a67e
      Jiri Pirko authored
      The put of the flags was added by the commit referenced in the Fixes
      tag, however the size of the message was not extended accordingly.
      
      Fix this by adding size of the flags bitfield to the message size.
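      The size accounting can be modelled in a few lines of Python. This is
      a sketch of the netlink attribute size arithmetic only, not the patch
      itself: action_msg_size and base_size are hypothetical names, while
      the 4-byte attribute header, 4-byte alignment and the 8-byte bitfield32
      payload (a u32 value plus a u32 selector) follow the netlink attribute
      layout.

```python
NLA_ALIGNTO = 4      # netlink attributes are padded to 4 bytes
NLA_HDRLEN = 4       # struct nlattr: u16 length + u16 type
BITFIELD32_LEN = 8   # struct nla_bitfield32: u32 value + u32 selector


def nla_align(length):
    """Round length up to the next multiple of NLA_ALIGNTO."""
    return (length + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)


def nla_total_size(payload):
    """Bytes one attribute occupies: header plus payload, padded."""
    return nla_align(NLA_HDRLEN + payload)


def action_msg_size(base_size, has_flags_attr=True):
    """Hypothetical helper: message size with the flags attribute counted."""
    extra = nla_total_size(BITFIELD32_LEN) if has_flags_attr else 0
    return base_size + extra
```

      Under these assumptions the flags attribute accounts for 12 bytes that
      have to be included in the message size estimate.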
      
      Fixes: e3822678 ("net: sched: update action implementations to support flags")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1521a67e
    • Madhuparna Bhowmik's avatar
      net: core: devlink.c: Use built-in RCU list checking · 2eb51c75
      Madhuparna Bhowmik authored
      list_for_each_entry_rcu() has built-in RCU and lock checking.
      
      Pass cond argument to list_for_each_entry_rcu() to silence
      false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled.
      
      The devlink->lock is held when devlink_dpipe_table_find()
      is called in non RCU read side section. Therefore, pass struct devlink
      to devlink_dpipe_table_find() for lockdep checking.
      Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2eb51c75
    • Florian Fainelli's avatar
      net: dsa: bcm_sf2: Forcibly configure IMP port for 1Gb/sec · 98c5f7d4
      Florian Fainelli authored
      We are still experiencing some packet loss with the existing advanced
      congestion buffering (ACB) settings with the IMP port configured for
      2Gb/sec, so revert to conservative link speeds that do not produce
      packet loss until this is resolved.
      
      Fixes: 8f1880cb ("net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec")
      Fixes: de34d708 ("net: dsa: bcm_sf2: Only 7278 supports 2Gb/sec IMP port")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      98c5f7d4
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 574b238f
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes:
      
      1) Perform garbage collection from workqueue to fix rcu detected
         stall in ipset hash set types, from Jozsef Kadlecsik.
      
      2) Fix the forceadd evaluation path, also from Jozsef.
      
      3) Fix nft_set_pipapo selftest, from Stefano Brivio.
      
      4) Crash when add-flush-add element in pipapo set, also from Stefano.
         Add test to cover this crash.
      
      5) Remove sysctl entry under mutex in hashlimit, from Cong Wang.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      574b238f
  2. 26 Feb, 2020 7 commits
    • Jonathan Lemon's avatar
      bnxt_en: add newline to netdev_*() format strings · 9a005c38
      Jonathan Lemon authored
      Add missing newlines to netdev_* format strings so the lines
      aren't buffered by the printk subsystem.
      Nitpicked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Acked-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a005c38
    • Cong Wang's avatar
      netfilter: xt_hashlimit: unregister proc file before releasing mutex · 99b79c39
      Cong Wang authored
      Before releasing the global mutex, we only unlink the hashtable
      from the hash list; its proc file is still not unregistered at
      this point. So syzbot could trigger a race condition where a
      parallel htable_create() could register the same file immediately
      after the mutex is released.
      
      Move htable_remove_proc_entry() back to mutex protection to
      fix this. And, fold htable_destroy() into htable_put() to make
      the code slightly easier to understand.
      
      Reported-and-tested-by: syzbot+d195fd3b9a364ddd6731@syzkaller.appspotmail.com
      Fixes: c4a3922d ("netfilter: xt_hashlimit: reduce hashlimit_mutex scope for htable_put()")
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      99b79c39
    • Michal Kubecek's avatar
      ethtool: limit bitset size · e34f1753
      Michal Kubecek authored
      Syzbot reported that ethnl_compact_sanity_checks() can be tricked into
      reading past the end of ETHTOOL_A_BITSET_VALUE and ETHTOOL_A_BITSET_MASK
      attributes and even the message by passing a value between (u32)(-31)
      and (u32)(-1) as ETHTOOL_A_BITSET_SIZE.
      
      The problem is that DIV_ROUND_UP(attr_nbits, 32) is 0 for such values so
      that zero length ETHTOOL_A_BITSET_VALUE will pass the length check but
      ethnl_bitmap32_not_zero() check would try to access up to 512 MB of
      attribute "payload".
      
      Prevent this overflow by limiting the bitset size. Technically, compact
      bitset format would allow bitset sizes up to almost 2^18 (so that the
      nest size does not exceed U16_MAX) but bitsets used by ethtool are much
      shorter. S16_MAX, the largest value which can be directly used as an
      upper limit in policy, should be a reasonable compromise.
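      The wraparound described above can be reproduced with a small Python
      model of 32-bit unsigned arithmetic. This is a sketch for illustration,
      not the kernel code: words_needed and size_allowed are hypothetical
      helper names, and DIV_ROUND_UP is modelled as (n + d - 1) / d truncated
      to 32 bits.

```python
U32_MASK = 0xFFFFFFFF
S16_MAX = 0x7FFF


def div_round_up_u32(n, d):
    """Model DIV_ROUND_UP(n, d) with the sum wrapping at 32 bits."""
    return ((n + d - 1) & U32_MASK) // d


def words_needed(nbits):
    """Number of 32-bit words computed for a bitset of nbits bits."""
    return div_round_up_u32(nbits & U32_MASK, 32)


def size_allowed(nbits, limit=S16_MAX):
    """Hypothetical policy check: reject sizes above the chosen limit."""
    return 0 <= nbits <= limit


# Sizes from (u32)(-31) to (u32)(-1) wrap to a word count of zero.
wrapped = [words_needed(-k) for k in range(1, 32)]
```

      In this model every entry of wrapped is 0, so a zero-length value
      attribute passes the length check even though the claimed size is close
      to 2^32 bits, which is 512 MB when counted in bytes. Capping the size at
      S16_MAX rejects all such values before the division is evaluated.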
      
      Fixes: 10b518d4 ("ethtool: netlink bitset handling")
      Reported-by: syzbot+7fd4ed5b4234ab1fdccd@syzkaller.appspotmail.com
      Reported-by: syzbot+709b7a64d57978247e44@syzkaller.appspotmail.com
      Reported-by: syzbot+983cb8fb2d17a7af549d@syzkaller.appspotmail.com
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e34f1753
    • Amritha Nambiar's avatar
      net: Fix Tx hash bound checking · 6e11d157
      Amritha Nambiar authored
      Fixes the lower and upper bounds when there are multiple TCs and
      traffic is on the same TC on the same device.
      
      The lower bound is represented by 'qoffset' and the upper limit for
      hash value is 'qcount + qoffset'. This gives a clean Rx to Tx queue
      mapping when there are multiple TCs, as the queue indices for upper TCs
      will be offset by 'qoffset'.
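      The bounds described above can be sketched in Python. This is one
      possible way to keep a recorded Rx queue index inside the range
      [qoffset, qoffset + qcount), written for illustration; pick_tx_queue is
      a hypothetical name and the code is not claimed to match the patch.

```python
def pick_tx_queue(rx_queue, qoffset, qcount):
    """Map a recorded Rx queue index into [qoffset, qoffset + qcount)."""
    if qcount <= 0:
        raise ValueError("qcount must be positive")
    idx = rx_queue
    if idx >= qoffset:
        idx -= qoffset      # make the index relative to this TC
    idx %= qcount           # fold into the TC's queue count
    return idx + qoffset    # shift back into the TC's queue range


# A TC that owns queues 4..7 (qoffset=4, qcount=4).
queues = [pick_tx_queue(rx, qoffset=4, qcount=4) for rx in range(12)]
```

      Every result stays within queues 4 to 7 in this example, and an Rx
      queue that already belongs to the TC maps to the Tx queue with the same
      index.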
      
      v2: Fixed commit description based on comments.
      
      Fixes: 1b837d48 ("net: Revoke export for __skb_tx_hash, update it to just be static skb_tx_hash")
      Fixes: eadec877 ("net: Add support for subordinate traffic classes to netdev_pick_tx")
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6e11d157
    • Stefano Brivio's avatar
      selftests: nft_concat_range: Add test for reported add/flush/add issue · 0954df70
      Stefano Brivio authored
      Add a specific test for the crash reported by Phil Sutter and addressed
      in the previous patch. The test cases that, in my intention, should
      have covered these cases, that is, the ones from the 'concurrency'
      section, don't run these sequences tightly enough and spectacularly
      failed to catch this.
      
      While at it, define a convenient way to add this kind of test, by
      adding a "reported issues" test section.
      
      It's more convenient, for this particular test, to execute the set
      setup in its own function. However, future test cases like this one
      might need to call setup functions, and will typically need no tools
      other than nft, so allow for this in check_tools().
      
      The original form of the reproducer used here was provided by Phil.
      Reported-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      0954df70
    • Stefano Brivio's avatar
      nft_set_pipapo: Actually fetch key data in nft_pipapo_remove() · 212d58c1
      Stefano Brivio authored
      Phil reports that adding elements, flushing and re-adding them
      right away:
      
        nft add table t '{ set s { type ipv4_addr . inet_service; flags interval; }; }'
        nft add element t s '{ 10.0.0.1 . 22-25, 10.0.0.1 . 10-20 }'
        nft flush set t s
        nft add element t s '{ 10.0.0.1 . 10-20, 10.0.0.1 . 22-25 }'
      
      triggers, almost reliably, a crash like this one:
      
        [   71.319848] general protection fault, probably for non-canonical address 0x6f6b6e696c2e756e: 0000 [#1] PREEMPT SMP PTI
        [   71.321540] CPU: 3 PID: 1201 Comm: kworker/3:2 Not tainted 5.6.0-rc1-00377-g2bb07f4e #192
        [   71.322746] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190711_202441-buildvm-armv7-10.arm.fedoraproject.org-2.fc31 04/01/2014
        [   71.324430] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
        [   71.325387] RIP: 0010:nft_set_elem_destroy+0xa5/0x110 [nf_tables]
        [   71.326164] Code: 89 d4 84 c0 74 0e 8b 77 44 0f b6 f8 48 01 df e8 41 ff ff ff 45 84 e4 74 36 44 0f b6 63 08 45 84 e4 74 2c 49 01 dc 49 8b 04 24 <48> 8b 40 38 48 85 c0 74 4f 48 89 e7 4c 8b
        [   71.328423] RSP: 0018:ffffc9000226fd90 EFLAGS: 00010282
        [   71.329225] RAX: 6f6b6e696c2e756e RBX: ffff88813ab79f60 RCX: ffff88813931b5a0
        [   71.330365] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88813ab79f9a
        [   71.331473] RBP: ffff88813ab79f60 R08: 0000000000000008 R09: 0000000000000000
        [   71.332627] R10: 000000000000021c R11: 0000000000000000 R12: ffff88813ab79fc2
        [   71.333615] R13: ffff88813b3adf50 R14: dead000000000100 R15: ffff88813931b8a0
        [   71.334596] FS:  0000000000000000(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
        [   71.335780] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [   71.336577] CR2: 000055ac683710f0 CR3: 000000013a222003 CR4: 0000000000360ee0
        [   71.337533] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [   71.338557] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [   71.339718] Call Trace:
        [   71.340093]  nft_pipapo_destroy+0x7a/0x170 [nf_tables_set]
        [   71.340973]  nft_set_destroy+0x20/0x50 [nf_tables]
        [   71.341879]  nf_tables_trans_destroy_work+0x246/0x260 [nf_tables]
        [   71.342916]  process_one_work+0x1d5/0x3c0
        [   71.343601]  worker_thread+0x4a/0x3c0
        [   71.344229]  kthread+0xfb/0x130
        [   71.344780]  ? process_one_work+0x3c0/0x3c0
        [   71.345477]  ? kthread_park+0x90/0x90
        [   71.346129]  ret_from_fork+0x35/0x40
        [   71.346748] Modules linked in: nf_tables_set nf_tables nfnetlink 8021q [last unloaded: nfnetlink]
        [   71.348153] ---[ end trace 2eaa8149ca759bcc ]---
        [   71.349066] RIP: 0010:nft_set_elem_destroy+0xa5/0x110 [nf_tables]
        [   71.350016] Code: 89 d4 84 c0 74 0e 8b 77 44 0f b6 f8 48 01 df e8 41 ff ff ff 45 84 e4 74 36 44 0f b6 63 08 45 84 e4 74 2c 49 01 dc 49 8b 04 24 <48> 8b 40 38 48 85 c0 74 4f 48 89 e7 4c 8b
        [   71.350017] RSP: 0018:ffffc9000226fd90 EFLAGS: 00010282
        [   71.350019] RAX: 6f6b6e696c2e756e RBX: ffff88813ab79f60 RCX: ffff88813931b5a0
        [   71.350019] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88813ab79f9a
        [   71.350020] RBP: ffff88813ab79f60 R08: 0000000000000008 R09: 0000000000000000
        [   71.350021] R10: 000000000000021c R11: 0000000000000000 R12: ffff88813ab79fc2
        [   71.350022] R13: ffff88813b3adf50 R14: dead000000000100 R15: ffff88813931b8a0
        [   71.350025] FS:  0000000000000000(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
        [   71.350026] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [   71.350027] CR2: 000055ac683710f0 CR3: 000000013a222003 CR4: 0000000000360ee0
        [   71.350028] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [   71.350028] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [   71.350030] Kernel panic - not syncing: Fatal exception
        [   71.350412] Kernel Offset: disabled
        [   71.365922] ---[ end Kernel panic - not syncing: Fatal exception ]---
      
      which is caused by dangling elements that have been deactivated, but
      never removed.
      
      On a flush operation, nft_pipapo_walk() walks through all the elements
      in the mapping table, which are then deactivated by nft_flush_set(),
      one by one, and added to the commit list for removal. Element data is
      then freed.
      
      On transaction commit, nft_pipapo_remove() is called, and fails to
      remove these elements, leading to stale references in the mapping.
      The first symptom of this, revealed by KASan, is a one-byte
      use-after-free in subsequent calls to nft_pipapo_walk(), which is
      usually not enough to trigger a panic. When stale elements are used
      more heavily, though, such as double-free via nft_pipapo_destroy()
      as in Phil's case, the problem becomes more noticeable.
      
      The issue comes from the fact that, on a flush operation,
      nft_pipapo_remove() won't get the actual key data via elem->key, so
      elements to be deleted upon commit won't be found by the lookup via
      pipapo_get(), and removal will be skipped. Key data should be fetched
      via nft_set_ext_key(), instead.
      Reported-by: Phil Sutter <phil@nwl.cc>
      Fixes: 3c4287f6 ("nf_tables: Add set type for arbitrary concatenation of ranges")
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      212d58c1
    • Pablo Neira Ayuso's avatar
      Merge branch 'master' of git://blackhole.kfki.hu/nf · 9ea4894b
      Pablo Neira Ayuso authored
      Jozsef Kadlecsik says:
      
      ====================
      ipset patches for nf
      
      The first one is larger than usual, but the issue could not be solved in
      a simpler way. Also, it's a resend of the patch I submitted a few days
      ago, with a one line fix on top of that: the size of the comment
      extensions was not taken into account when reporting the full size of
      the set.
      
      - Fix "INFO: rcu detected stall in hash_xxx" reports of syzbot
        by introducing region locking and using workqueue instead of timer based
        gc of timed out entries in hash types of sets in ipset.
      - Fix the forceadd evaluation path - the bug was also uncovered by the syzbot.
      ====================
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      9ea4894b