1. 08 Feb, 2020 1 commit
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 2696e114
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2020-02-07
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 15 non-merge commits during the last 10 day(s) which contain
      a total of 12 files changed, 114 insertions(+), 31 deletions(-).
      
      The main changes are:
      
      1) Various BPF sockmap fixes related to RCU handling in the map's tear-
         down code, from Jakub Sitnicki.
      
      2) Fix macro state explosion in BPF sk_storage map when calculating its
         bucket_log on allocation, from Martin KaFai Lau.
      
      3) Fix potential BPF sockmap update race by rechecking socket's established
         state under lock, from Lorenz Bauer.
      
      4) Fix crash in bpftool on missing xlated instructions when kptr_restrict
         sysctl is set, from Toke Høiland-Jørgensen.
      
      5) Fix i40e's XSK wakeup code to return proper error in busy state and
         various misc fixes in xdpsock BPF sample code, from Maciej Fijalkowski.
      
      6) Fix the way modifiers are skipped in BTF in the verifier while walking
         pointers to avoid program rejection, from Alexei Starovoitov.
      
      7) Fix Makefile for runqslower BPF tool to i) rebuild on libbpf changes and
         ii) to fix undefined reference linker errors for older gcc version due to
         order of passed gcc parameters, from Yulia Kartseva and Song Liu.
      
      8) Fix a trampoline_count BPF kselftest warning about missing braces around
         initializer, from Andrii Nakryiko.
      
      9) Fix up redundant "HAVE" prefix from large INSN limit kernel probe in
         bpftool, from Michal Rostecki.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 07 Feb, 2020 34 commits
    • bpf: Improve bucket_log calculation logic · 88d6f130
      Martin KaFai Lau authored
      It was reported that the max_t, ilog2, and roundup_pow_of_two macros have
      exponential effects on the number of states in the sparse checker.
      
      This patch breaks them up by calculating the "nbuckets" first so that the
      "bucket_log" only needs to take ilog2().
      
      In addition, Linus mentioned:
      
        Patch looks good, but I'd like to point out that it's not just sparse.
      
        You can see it with a simple
      
          make net/core/bpf_sk_storage.i
          grep 'smap->bucket_log = ' net/core/bpf_sk_storage.i | wc
      
        and see the end result:
      
            1  365071 2686974
      
        That's one line (the assignment line) that is 2,686,974 characters in
        length.
      
        Now, sparse does happen to react particularly badly to that (I didn't
        look to why, but I suspect it's just that evaluating all the types
        that don't actually ever end up getting used ends up being much more
        expensive than it should be), but I bet it's not good for gcc either.
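
The split described above can be sketched in plain userspace C. This is only an illustration, not the kernel code: the helpers below are ordinary functions standing in for the kernel's roundup_pow_of_two() and ilog2() macros, and both the `ncpus` input and the floor of two buckets are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two. Stand-in for the kernel's
 * roundup_pow_of_two(); a real function, so nothing is macro-expanded. */
static uint32_t roundup_pow2_u32(uint32_t v)
{
    uint32_t p = 1;

    assert(v >= 1 && v <= (UINT32_C(1) << 31));
    while (p < v)
        p <<= 1;
    return p;
}

/* floor(log2(v)) for v >= 1. Stand-in for the kernel's ilog2(). */
static uint32_t ilog2_u32(uint32_t v)
{
    uint32_t log = 0;

    assert(v >= 1);
    while (v >>= 1)
        log++;
    return log;
}

/* Hypothetical helper: compute nbuckets first, then take one ilog2(). */
static uint32_t bucket_log_for(uint32_t ncpus)
{
    uint32_t nbuckets = roundup_pow2_u32(ncpus);

    if (nbuckets < 2)           /* assumed minimum of two buckets */
        nbuckets = 2;
    return ilog2_u32(nbuckets);
}
```

Because each step is stored in a variable before the next one is applied, no macro argument is itself a nested macro expansion, which is the property the patch is after.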
      
      Fixes: 6ac99e8f ("bpf: Introduce bpf sk local storage")
      Reported-by: Randy Dunlap <rdunlap@infradead.org>
      Reported-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Link: https://lore.kernel.org/bpf/20200207081810.3918919-1-kafai@fb.com
    • selftests/bpf: Test freeing sockmap/sockhash with a socket in it · 5d3919a9
      Jakub Sitnicki authored
      Commit 7e81a353 ("bpf: Sockmap, ensure sock lock held during tear
      down") introduced sleeping issues inside RCU critical sections and while
      holding a spinlock on sockmap/sockhash tear-down. There has to be at least
      one socket in the map for the problem to surface.
      
      This adds a test that triggers the warnings for broken locking rules. Not a
      fix per se, but rather tooling to verify the accompanying fixes. Run on a
      VM with 1 vCPU to reproduce the warnings.
      
      Fixes: 7e81a353 ("bpf: Sockmap, ensure sock lock held during tear down")
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200206111652.694507-4-jakub@cloudflare.com
    • bpf, sockhash: Synchronize_rcu before free'ing map · 0b2dc839
      Jakub Sitnicki authored
      We need to have a synchronize_rcu before free'ing the sockhash because any
      outstanding psock references will have a pointer to the map and when they
      use it, this could trigger a use after free.
      
      This is a sister fix for sockhash, following commit 2bb90e5c ("bpf:
      sockmap, synchronize_rcu before free'ing map"), which addressed sockmap.
      It comes from a manual audit.
      
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200206111652.694507-3-jakub@cloudflare.com
    • bpf, sockmap: Don't sleep while holding RCU lock on tear-down · db6a5018
      Jakub Sitnicki authored
      rcu_read_lock is needed to protect access to psock inside sock_map_unref
      when tearing down the map. However, we can't afford to sleep in lock_sock
      while in RCU read-side critical section. Grab the RCU lock only after we
      have locked the socket.
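
The ordering change can be modelled with a small userspace mock. This is a sketch under stated assumptions: every `mock_*` name is hypothetical, RCU is reduced to a depth counter, and "sleeping" is reduced to a flag that is set when the socket lock is taken inside the read-side section.

```c
#include <assert.h>
#include <stdbool.h>

static int rcu_depth;       /* > 0 while inside a mock RCU read-side section */
static bool slept_in_rcu;   /* set if a sleeping call ran inside that section */

static void mock_rcu_read_lock(void)   { rcu_depth++; }
static void mock_rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

/* Models lock_sock(): it may sleep, so running it with rcu_depth > 0 is a bug. */
static void mock_lock_sock(void)
{
    if (rcu_depth > 0)
        slept_in_rcu = true;
}

static void mock_release_sock(void) { }

/* Models sock_map_unref(): psock access must be protected by RCU. */
static void mock_sock_map_unref(void) { assert(rcu_depth > 0); }

/* Ordering before the fix: RCU first, then the sleeping socket lock. */
static bool teardown_rcu_first(void)
{
    slept_in_rcu = false;
    mock_rcu_read_lock();
    mock_lock_sock();
    mock_sock_map_unref();
    mock_release_sock();
    mock_rcu_read_unlock();
    return slept_in_rcu;
}

/* Ordering after the fix: lock the socket, then enter the RCU section. */
static bool teardown_sock_first(void)
{
    slept_in_rcu = false;
    mock_lock_sock();
    mock_rcu_read_lock();
    mock_sock_map_unref();
    mock_rcu_read_unlock();
    mock_release_sock();
    return slept_in_rcu;
}
```

In this model `teardown_rcu_first()` reports the violation and `teardown_sock_first()` does not, while the unref step is still covered by the read-side section in both.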
      
      This fixes RCU warnings triggerable on a VM with 1 vCPU when free'ing a
      sockmap/sockhash that contains at least one socket:
      
      | =============================
      | WARNING: suspicious RCU usage
      | 5.5.0-04005-g8fc91b97 #450 Not tainted
      | -----------------------------
      | include/linux/rcupdate.h:272 Illegal context switch in RCU read-side critical section!
      |
      | other info that might help us debug this:
      |
      |
      | rcu_scheduler_active = 2, debug_locks = 1
      | 4 locks held by kworker/0:1/62:
      |  #0: ffff88813b019748 ((wq_completion)events){+.+.}, at: process_one_work+0x1d7/0x5e0
      |  #1: ffffc900000abe50 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x1d7/0x5e0
      |  #2: ffffffff82065d20 (rcu_read_lock){....}, at: sock_map_free+0x5/0x170
      |  #3: ffff8881368c5df8 (&stab->lock){+...}, at: sock_map_free+0x64/0x170
      |
      | stack backtrace:
      | CPU: 0 PID: 62 Comm: kworker/0:1 Not tainted 5.5.0-04005-g8fc91b97 #450
      | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
      | Workqueue: events bpf_map_free_deferred
      | Call Trace:
      |  dump_stack+0x71/0xa0
      |  ___might_sleep+0x105/0x190
      |  lock_sock_nested+0x28/0x90
      |  sock_map_free+0x95/0x170
      |  bpf_map_free_deferred+0x58/0x80
      |  process_one_work+0x260/0x5e0
      |  worker_thread+0x4d/0x3e0
      |  kthread+0x108/0x140
      |  ? process_one_work+0x5e0/0x5e0
      |  ? kthread_park+0x90/0x90
      |  ret_from_fork+0x3a/0x50
      
      | =============================
      | WARNING: suspicious RCU usage
      | 5.5.0-04005-g8fc91b97-dirty #452 Not tainted
      | -----------------------------
      | include/linux/rcupdate.h:272 Illegal context switch in RCU read-side critical section!
      |
      | other info that might help us debug this:
      |
      |
      | rcu_scheduler_active = 2, debug_locks = 1
      | 4 locks held by kworker/0:1/62:
      |  #0: ffff88813b019748 ((wq_completion)events){+.+.}, at: process_one_work+0x1d7/0x5e0
      |  #1: ffffc900000abe50 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x1d7/0x5e0
      |  #2: ffffffff82065d20 (rcu_read_lock){....}, at: sock_hash_free+0x5/0x1d0
      |  #3: ffff888139966e00 (&htab->buckets[i].lock){+...}, at: sock_hash_free+0x92/0x1d0
      |
      | stack backtrace:
      | CPU: 0 PID: 62 Comm: kworker/0:1 Not tainted 5.5.0-04005-g8fc91b97-dirty #452
      | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
      | Workqueue: events bpf_map_free_deferred
      | Call Trace:
      |  dump_stack+0x71/0xa0
      |  ___might_sleep+0x105/0x190
      |  lock_sock_nested+0x28/0x90
      |  sock_hash_free+0xec/0x1d0
      |  bpf_map_free_deferred+0x58/0x80
      |  process_one_work+0x260/0x5e0
      |  worker_thread+0x4d/0x3e0
      |  kthread+0x108/0x140
      |  ? process_one_work+0x5e0/0x5e0
      |  ? kthread_park+0x90/0x90
      |  ret_from_fork+0x3a/0x50
      
      Fixes: 7e81a353 ("bpf: Sockmap, ensure sock lock held during tear down")
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200206111652.694507-2-jakub@cloudflare.com
    • bpftool: Don't crash on missing xlated program instructions · d95f1e8b
      Toke Høiland-Jørgensen authored
      Turns out the xlated program instructions can also be missing if
      kptr_restrict sysctl is set. This means that the previous fix to check the
      jited_prog_insns pointer was insufficient; add another check of the
      xlated_prog_insns pointer as well.
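
A minimal sketch of the double check, assuming a simplified info structure. `struct mock_prog_info` and `mock_select_dump()` are hypothetical and are not bpftool's types or functions. The example also assumes that a restricted kernel can report a non-zero length together with a missing instruction pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the program info returned by the kernel. */
struct mock_prog_info {
    const uint8_t *xlated_insns;
    uint32_t xlated_len;
    const uint8_t *jited_insns;
    uint32_t jited_len;
};

enum mock_dump_mode { MOCK_DUMP_XLATED, MOCK_DUMP_JITED };

/* Pick the buffer to dump; return -1 instead of dereferencing a missing one. */
static int mock_select_dump(const struct mock_prog_info *info,
                            enum mock_dump_mode mode,
                            const uint8_t **buf, uint32_t *len)
{
    assert(info && buf && len);

    if (mode == MOCK_DUMP_JITED) {
        if (info->jited_len == 0 || !info->jited_insns)
            return -1;
        *buf = info->jited_insns;
        *len = info->jited_len;
    } else {
        /* The same guard is needed on the xlated side. */
        if (info->xlated_len == 0 || !info->xlated_insns)
            return -1;
        *buf = info->xlated_insns;
        *len = info->xlated_len;
    }
    return 0;
}
```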
      
      Fixes: 5b79bcdf ("bpftool: Don't crash on missing jited insns or ksyms")
      Fixes: cae73f23 ("bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump()")
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Quentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/bpf/20200206102906.112551-1-toke@redhat.com
    • bpf, sockmap: Check update requirements after locking · 85b8ac01
      Lorenz Bauer authored
      It's currently possible to insert sockets in unexpected states into
      a sockmap, due to a TOCTTOU when updating the map from a syscall.
      sock_map_update_elem checks that sk->sk_state == TCP_ESTABLISHED,
      locks the socket and then calls sock_map_update_common. At this
      point, the socket may have transitioned into another state, and
      the earlier assumptions don't hold anymore. Crucially, it's
      conceivable (though very unlikely) that a socket has become unhashed.
      This breaks the sockmap's assumption that it will get a callback
      via sk->sk_prot->unhash.
      
      Fix this by checking the (fixed) sk_type and sk_protocol without the
      lock, followed by a locked check of sk_state.
      
      Unfortunately it's not possible to push the check down into
      sock_(map|hash)_update_common, since BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
      is run before the socket has transitioned from TCP_SYN_RECV into
      TCP_ESTABLISHED.
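
The check-then-lock pattern described above can be sketched as follows. All `mock_*` names are hypothetical, the lock is a plain flag rather than a real socket lock, and the `race_hook` parameter exists only to simulate a state change between the unlocked check and taking the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mock_state { MOCK_ESTABLISHED, MOCK_CLOSE };

struct mock_sock {
    bool is_tcp_stream;     /* stands in for the fixed sk_type/sk_protocol */
    enum mock_state state;  /* stands in for sk_state; may change until locked */
    bool locked;
};

static void mock_lock(struct mock_sock *sk)   { assert(!sk->locked); sk->locked = true; }
static void mock_unlock(struct mock_sock *sk) { assert(sk->locked); sk->locked = false; }

static void mock_close_socket(struct mock_sock *sk) { sk->state = MOCK_CLOSE; }

/* Returns 0 if the socket may be inserted, -1 otherwise (error value assumed). */
static int mock_update_elem(struct mock_sock *sk,
                            void (*race_hook)(struct mock_sock *))
{
    int ret = 0;

    if (!sk->is_tcp_stream)     /* immutable fields: safe to test unlocked */
        return -1;

    if (race_hook)              /* simulated concurrent state transition */
        race_hook(sk);

    mock_lock(sk);
    if (sk->state != MOCK_ESTABLISHED)  /* mutable field: test under the lock */
        ret = -1;
    /* on success the real code would go on to update the map here */
    mock_unlock(sk);
    return ret;
}
```

Passing `mock_close_socket` as the hook shows that a socket which leaves the established state after the unlocked check is still rejected.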
      
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/20200207103713.28175-1-lmb@cloudflare.com
    • drop_monitor: Do not cancel uninitialized work item · dfa7f709
      Ido Schimmel authored
      Drop monitor uses a work item that takes care of constructing and
      sending netlink notifications to user space. In case drop monitor never
      started to monitor, then the work item is uninitialized and not
      associated with a function.
      
      Therefore, a stop command from user space results in canceling an
      uninitialized work item which leads to the following warning [1].
      
      Fix this by not processing a stop command if drop monitor is not
      currently monitoring.
      
      [1]
      [   31.735402] ------------[ cut here ]------------
      [   31.736470] WARNING: CPU: 0 PID: 143 at kernel/workqueue.c:3032 __flush_work+0x89f/0x9f0
      ...
      [   31.738120] CPU: 0 PID: 143 Comm: dwdump Not tainted 5.5.0-custom-09491-g16d4077796b8 #727
      [   31.741968] RIP: 0010:__flush_work+0x89f/0x9f0
      ...
      [   31.760526] Call Trace:
      [   31.771689]  __cancel_work_timer+0x2a6/0x3b0
      [   31.776809]  net_dm_cmd_trace+0x300/0xef0
      [   31.777549]  genl_rcv_msg+0x5c6/0xd50
      [   31.781005]  netlink_rcv_skb+0x13b/0x3a0
      [   31.784114]  genl_rcv+0x29/0x40
      [   31.784720]  netlink_unicast+0x49f/0x6a0
      [   31.787148]  netlink_sendmsg+0x7cf/0xc80
      [   31.790426]  ____sys_sendmsg+0x620/0x770
      [   31.793458]  ___sys_sendmsg+0xfd/0x170
      [   31.802216]  __sys_sendmsg+0xdf/0x1a0
      [   31.806195]  do_syscall_64+0xa0/0x540
      [   31.806885]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
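
The guard described in this commit can be illustrated with a small state model. The types and functions below are hypothetical; the `assert()` in `mock_cancel_work()` plays the role of the kernel warning shown in [1], and the -1 return value is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

struct mock_work    { bool initialized; };
struct mock_monitor { bool monitoring; struct mock_work work; };

/* Starting to monitor is the only place where the work item is set up. */
static void mock_start(struct mock_monitor *m)
{
    m->work.initialized = true;
    m->monitoring = true;
}

/* Cancelling a work item that was never initialized is the bug being avoided. */
static void mock_cancel_work(struct mock_work *w)
{
    assert(w->initialized);
}

/* Stop command: refuse early unless monitoring is actually in progress. */
static int mock_stop(struct mock_monitor *m)
{
    if (!m->monitoring)
        return -1;
    m->monitoring = false;
    mock_cancel_work(&m->work);
    return 0;
}
```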
      
      Fixes: 8e94c3bc ("drop_monitor: Allow user to start monitoring hardware drops")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlxsw-Various-fixes' · e036c587
      David S. Miller authored
      Ido Schimmel says:
      
      ====================
      mlxsw: Various fixes
      
      This patch set contains various fixes for the mlxsw driver.
      
      Patch #1 fixes an issue introduced in 5.6 in which a route in the main
      table can replace an identical route in the local table despite the
      local table having a higher precedence.
      
      Patch #2 contains a test case for the bug fixed in patch #1.
      
      Patch #3 also fixes an issue introduced in 5.6 in which the driver
      failed to clear the offload indication from IPv6 nexthops upon abort.
      
      Patch #4 fixes an issue that prevents the driver from loading on
      Spectrum-3 systems. The problem and solution are explained in detail in
      the commit message.
      
      Patch #5 adds a missing error path. Discovered using smatch.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_dpipe: Add missing error path · 3a99cbb6
      Ido Schimmel authored
      In case devlink_dpipe_entry_ctx_prepare() failed, release RTNL that was
      previously taken and free the memory allocated by
      mlxsw_sp_erif_entry_prepare().
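
A userspace sketch of such an error path, using the common goto-unwind idiom. Every `mock_*` name is hypothetical, and the order in which the lock is taken and the entry is allocated is an assumption. Counters replace the real lock and allocator so that leaks are observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int mock_rtnl_depth;     /* > 0 while the mock RTNL lock is held */
static int mock_live_allocs;    /* number of entries not yet freed */

static void mock_rtnl_lock(void)   { mock_rtnl_depth++; }
static void mock_rtnl_unlock(void) { assert(mock_rtnl_depth > 0); mock_rtnl_depth--; }

static int mock_entry_prepare(void **entry)
{
    *entry = malloc(16);
    if (!*entry)
        return -1;
    mock_live_allocs++;
    return 0;
}

static void mock_entry_clear(void *entry)
{
    free(entry);
    mock_live_allocs--;
}

/* Stand-in for the step that can fail after the lock has been taken. */
static int mock_ctx_prepare(bool fail) { return fail ? -1 : 0; }

static int mock_table_dump(bool ctx_fails)
{
    void *entry;
    int err;

    err = mock_entry_prepare(&entry);
    if (err)
        return err;

    mock_rtnl_lock();
    err = mock_ctx_prepare(ctx_fails);
    if (err)
        goto err_ctx_prepare;   /* previously missing: unlock and free */

    /* ... the dump itself would happen here ... */

err_ctx_prepare:
    mock_rtnl_unlock();
    mock_entry_clear(entry);
    return err;
}
```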
      
      Fixes: 2ba5999f ("mlxsw: spectrum: Add Support for erif table entries access")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: core: Add validation of hardware device types for MGPIR register · 36844c85
      Vadim Pasternak authored
      When reading the number of gearboxes from the hardware, the driver does
      not validate the returned 'device type' field. The driver can therefore
      wrongly assume that the queried devices are gearboxes.
      
      On Spectrum-3 systems that support different types of devices, this can
      prevent the driver from loading, as it will try to query the
      temperature sensors from devices which it assumes are gearboxes and in
      fact are not.
      
      For example:
      [  218.129230] mlxsw_minimal 2-0048: Reg cmd access status failed (status=7(bad parameter))
      [  218.138282] mlxsw_minimal 2-0048: Reg cmd access failed (reg_id=900a(mtmp),type=write)
      [  218.147131] mlxsw_minimal 2-0048: Failed to setup temp sensor number 256
      [  218.534480] mlxsw_minimal 2-0048: Fail to register core bus
      [  218.540714] mlxsw_minimal: probe of 2-0048 failed with error -5
      
      Fix this by validating the 'device type' field.
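
The validation can be sketched as below. The register layout, the enum values and the helper name are all hypothetical and do not reflect the real MGPIR encoding. The point is only that the device count is trusted when the reported type says the devices are gearboxes, and treated as zero otherwise.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device types; the real register encoding may differ. */
enum mock_device_type {
    MOCK_DEVICE_NONE = 0,
    MOCK_DEVICE_GEARBOX = 1,
    MOCK_DEVICE_OTHER = 2,
};

/* Hypothetical, already-unpacked view of the register contents. */
struct mock_mgpir {
    enum mock_device_type type;
    uint8_t num_devices;
};

/* Number of gearboxes whose temperature sensors may be queried. */
static uint8_t mock_gearbox_count(const struct mock_mgpir *reg)
{
    assert(reg);

    if (reg->type != MOCK_DEVICE_GEARBOX)
        return 0;               /* not gearboxes: do not probe their sensors */
    return reg->num_devices;
}
```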
      
      Fixes: 2e265a8b ("mlxsw: core: Extend hwmon interface with inter-connect temperature attributes")
      Fixes: f14f4e62 ("mlxsw: core: Extend thermal core with per inter-connect device thermal zones")
      Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Clear offload indication from IPv6 nexthops on abort · 490f0542
      Ido Schimmel authored
      Unlike IPv4, in IPv6 there is no unique structure to represent the
      nexthop and both the route and nexthop information are squashed to the
      same structure ('struct fib6_info'). In order to improve resource
      utilization the driver consolidates identical nexthop groups to the same
      internal representation of a nexthop group.
      
      Therefore, when the offload indication of a nexthop changes, the driver
      needs to iterate over all the linked fib6_info and toggle their offload
      flag accordingly.
      
      During abort, all the routes are removed from the device and unlinked
      from their nexthop group. The offload indication is cleared just before
      the group is destroyed, but by that time no fib6_info is linked to the
      group and the offload indication remains set.
      
      Fix this by clearing the offload indication just before dropping the
      reference from the nexthop.
      
      Fixes: ee5a0448 ("mlxsw: spectrum_router: Set hardware flags for routes")
      Reported-by: Alex Kushnarov <alexanderk@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Tested-by: Alex Kushnarov <alexanderk@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: mlxsw: Add test cases for local table route replacement · 6c05ca26
      Ido Schimmel authored
      Test that routes in the main table do not replace identical routes in
      the local table and that routes in the local table do replace identical
      routes in the main table.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Prevent incorrect replacement of local table routes · 0508ff89
      Ido Schimmel authored
      The driver uses the same table to represent both the main and local
      routing tables. Prevent routes in the main table from replacing routes
      in the local table to reflect the fact that the local table is consulted
      first during lookup.
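
The replacement rule can be written as a small predicate. This is a sketch, not the driver's code: the helper name is hypothetical, and only the two table IDs (254 for main, 255 for local, as in the kernel's rtnetlink UAPI) are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { MOCK_RT_TABLE_MAIN = 254, MOCK_RT_TABLE_LOCAL = 255 };

static bool mock_is_known_table(uint32_t tb_id)
{
    return tb_id == MOCK_RT_TABLE_MAIN || tb_id == MOCK_RT_TABLE_LOCAL;
}

/* May a new route replace an identical existing route in the shared table? */
static bool mock_allow_replace(uint32_t new_tb_id, uint32_t old_tb_id)
{
    assert(mock_is_known_table(new_tb_id) && mock_is_known_table(old_tb_id));

    /* The local table is consulted first, so main must not displace local. */
    if (new_tb_id == MOCK_RT_TABLE_MAIN && old_tb_id == MOCK_RT_TABLE_LOCAL)
        return false;
    return true;
}
```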
      
      Fixes: b6a1d871 ("mlxsw: spectrum_router: Start using new IPv4 route notifications")
      Fixes: dacad7b3 ("mlxsw: spectrum_router: Start using new IPv6 route notifications")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: microchip: enable module autoprobe · f8c2afa6
      Razvan Stefanescu authored
      This matches /sys/devices/.../spi1.0/modalias content.
      
      Fixes: 9b2d9f05 ("net: dsa: microchip: add ksz9567 to ksz9477 driver")
      Fixes: d9033ae9 ("net: dsa: microchip: add KSZ8563 compatibility string")
      Fixes: 8c29bebb ("net: dsa: microchip: add KSZ9893 switch support")
      Fixes: 45316818 ("net: dsa: add support for ksz9897 ethernet switch")
      Fixes: b987e98e ("dsa: add DSA switch driver for Microchip KSZ9477")
      Signed-off-by: Razvan Stefanescu <razvan.stefanescu@microchip.com>
      Signed-off-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6/addrconf: fix potential NULL deref in inet6_set_link_af() · db3fa271
      Eric Dumazet authored
      __in6_dev_get(dev) called from inet6_set_link_af() can return NULL.
      
      The needed check was removed recently; add it back.
      
      While do_setlink() does call validate_linkmsg() :
      ...
      err = validate_linkmsg(dev, tb); /* OK at this point */
      ...
      
      It is possible that the following call happening before the
      ->set_link_af() removes IPv6 if MTU is less than 1280 :
      
      if (tb[IFLA_MTU]) {
          err = dev_set_mtu_ext(dev, nla_get_u32(tb[IFLA_MTU]), extack);
          if (err < 0)
              goto errout;
          status |= DO_SETLINK_MODIFIED;
      }
      ...
      
      if (tb[IFLA_AF_SPEC]) {
         ...
         err = af_ops->set_link_af(dev, af);
            ->inet6_set_link_af() // CRASH because idev is NULL
      
      Please note that IPv4 is immune to the bug since inet_set_link_af() does :
      
      struct in_device *in_dev = __in_dev_get_rcu(dev);
      if (!in_dev)
          return -EAFNOSUPPORT;
      
      This problem has been mentioned in commit cf7afbfe ("rtnl: make
      link af-specific updates atomic") changelog :
      
          This method is not fail proof, while it is currently sufficient
          to make set_link_af() inerrable and thus 100% atomic, the
          validation function method will not be able to detect all error
          scenarios in the future, there will likely always be errors
          depending on states which are f.e. not protected by rtnl_mutex
          and thus may change between validation and setting.
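
The restored check can be sketched in userspace C, mirroring the IPv4 snippet quoted above. The structures and function names are hypothetical stand-ins, and returning -EAFNOSUPPORT on the IPv6 side is an assumption modelled on the IPv4 code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct inet6_dev and struct net_device. */
struct mock_idev   { int token_set; };
struct mock_netdev { struct mock_idev *ip6_ptr; };

/* Models __in6_dev_get(): returns NULL once IPv6 was removed from the device. */
static struct mock_idev *mock_in6_dev_get(const struct mock_netdev *dev)
{
    return dev->ip6_ptr;
}

static int mock_set_link_af(struct mock_netdev *dev)
{
    struct mock_idev *idev;

    assert(dev);
    idev = mock_in6_dev_get(dev);
    if (!idev)
        return -EAFNOSUPPORT;   /* validation passed earlier, state changed since */

    idev->token_set = 1;        /* placeholder for applying the attributes */
    return 0;
}
```

The check has to live in the setter itself because, as the quoted changelog notes, state can change between validation and setting.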
      
      IPv6: ADDRCONF(NETDEV_CHANGE): lo: link becomes ready
      general protection fault, probably for non-canonical address 0xdffffc0000000056: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x00000000000002b0-0x00000000000002b7]
      CPU: 0 PID: 9698 Comm: syz-executor712 Not tainted 5.5.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:inet6_set_link_af+0x66e/0xae0 net/ipv6/addrconf.c:5733
      Code: 38 d0 7f 08 84 c0 0f 85 20 03 00 00 48 8d bb b0 02 00 00 45 0f b6 64 24 04 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 1a 03 00 00 44 89 a3 b0 02 00
      RSP: 0018:ffffc90005b06d40 EFLAGS: 00010206
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff86df39a6
      RDX: 0000000000000056 RSI: ffffffff86df3e74 RDI: 00000000000002b0
      RBP: ffffc90005b06e70 R08: ffff8880a2ac0380 R09: ffffc90005b06db0
      R10: fffff52000b60dbe R11: ffffc90005b06df7 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff8880a1fcc424 R15: dffffc0000000000
      FS:  0000000000c46880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055f0494ca0d0 CR3: 000000009e4ac000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       do_setlink+0x2a9f/0x3720 net/core/rtnetlink.c:2754
       rtnl_group_changelink net/core/rtnetlink.c:3103 [inline]
       __rtnl_newlink+0xdd1/0x1790 net/core/rtnetlink.c:3257
       rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3377
       rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5438
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5456
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:672
       ____sys_sendmsg+0x753/0x880 net/socket.c:2343
       ___sys_sendmsg+0x100/0x170 net/socket.c:2397
       __sys_sendmsg+0x105/0x1d0 net/socket.c:2430
       __do_sys_sendmsg net/socket.c:2439 [inline]
       __se_sys_sendmsg net/socket.c:2437 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4402e9
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fffd62fbcf8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004402e9
      RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 0000000000000008 R09: 00000000004002c8
      R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000401b70
      R13: 0000000000401c00 R14: 0000000000000000 R15: 0000000000000000
      Modules linked in:
      ---[ end trace cfa7664b8fdcdff3 ]---
      RIP: 0010:inet6_set_link_af+0x66e/0xae0 net/ipv6/addrconf.c:5733
      Code: 38 d0 7f 08 84 c0 0f 85 20 03 00 00 48 8d bb b0 02 00 00 45 0f b6 64 24 04 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 1a 03 00 00 44 89 a3 b0 02 00
      RSP: 0018:ffffc90005b06d40 EFLAGS: 00010206
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff86df39a6
      RDX: 0000000000000056 RSI: ffffffff86df3e74 RDI: 00000000000002b0
      RBP: ffffc90005b06e70 R08: ffff8880a2ac0380 R09: ffffc90005b06db0
      R10: fffff52000b60dbe R11: ffffc90005b06df7 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff8880a1fcc424 R15: dffffc0000000000
      FS:  0000000000c46880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000004 CR3: 000000009e4ac000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Fixes: 7dc2bcca ("Validate required parameters in inet6_validate_link_af")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Bisected-and-reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Maxim Mikityanskiy <maximmi@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa_eth: support all modes with rate adapting PHYs · 73a21fa8
      Madalin Bucur authored
      Stop removing modes that are not supported on the system interface
      when the connected PHY is capable of rate adaptation. This addresses
      an issue with the LS1046ARDB board 10G interface no longer working
      with a 1G link partner after autonegotiation support was added
      for the on-board Aquantia PHY in
      
      commit 09c4c57f ("net: phy: aquantia: add support for auto-negotiation configuration")
      
      Before this commit the values advertised by the PHY were not
      influenced by the dpaa_eth driver's removal of system-side unsupported
      modes, as aqr_config_aneg() was basically a no-op. After this
      commit, the modes removed by the dpaa_eth driver were no longer
      advertised, so autonegotiation with 1G link partners failed.
      Reported-by: Mian Yousaf Kaukab <ykaukab@suse.de>
      Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'stmmac-fixes' · 259039fa
      David S. Miller authored
      Ong Boon Leong says:
      
      ====================
      net: stmmac: general fixes for Ethernet functionality
      
      1/5: It ensures that the previous value of the GMAC_VLAN_TAG register
           is read before the register is updated.
      
      2/5: Similar to patch 1/5, but a fix for the XGMAC_VLAN_TAG register,
           as requested by Jose Abreu.
      
      3/5: It ensures that GMAC IP v4.xx and above behaves correctly with:
             ip link set <devname> multicast off|on
      
      4/5: Adds a similar IFF_MULTICAST flag check for xgmac2, as in patch 3/5.
      
      5/5: It ensures PCI platform data is using plat->phy_interface.
      
      Changes from v4:-
         patch 1/6 - this patch is dropped now and will take the input on
                     handling return value from netif_set_real_num_rx|
                     tx_queues() in future patch series.
      
      v3:-
         patch 1/6 - add rtnl_lock() and rtnl_unlock() for stmmac_hw_setup()
                     called inside stmmac_resume()
         patch 3/6 - Added new patch to fix XGMAC_VLAN_TAG register writing
      
      v2:-
         patch 1/5 - added control for rtnl_lock() & rtnl_unlock() to ensure
                     they are used for stmmac_resume()
         patch 4/5 - added IFF_MULTICAST flag check for xgmac to ensure
                     multicast works correctly.
      
      v1:-
       - Drop v1 patches (1/7, 3/7 & 4/7) that are not valid.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: update pci platform data to use phy_interface · 909c1dde
      Voon Weifeng authored
      The recent patch to support passive mode converters did not take care of
      the PHY interface configuration in the PCI platform data. Hence, convert
      all the PCI platform data from plat->interface to plat->phy_interface, as
      the default mode is meant for PHY.
      
      Fixes: 0060c878 ("net: stmmac: implement support for passive mode converters via dt")
      Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
      Tested-by: Tan, Tee Min <tee.min.tan@intel.com>
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: xgmac: fix missing IFF_MULTICAST checki in dwxgmac2_set_filter · 2f633d58
      Tan, Tee Min authored
      Without checking the IFF_MULTICAST flag, it is wrong to assume that
      multicast filtering is always enabled. With the IFF_MULTICAST check, the
      driver behaves correctly when multicast support is toggled by the
      command below:
        ip link set <devname> multicast off|on
      
      Fixes: 0efedbf1 ("net: stmmac: xgmac: Fix XGMAC selftests")
      Signed-off-by: Tan, Tee Min <tee.min.tan@intel.com>
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: fix missing IFF_MULTICAST check in dwmac4_set_filter · 2ba31cd9
      Verma, Aashish authored
      Without checking the IFF_MULTICAST flag, it is wrong to assume that
      multicast filtering is always enabled. With the IFF_MULTICAST check in
      place, the driver behaves correctly when multicast support is toggled
      with the following command:
        ip link set <devname> multicast off|on
      
      Fixes: 477286b5 ("stmmac: add GMAC4 core support")
      Signed-off-by: Verma, Aashish <aashishx.verma@intel.com>
      Tested-by: Tan, Tee Min <tee.min.tan@intel.com>
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2ba31cd9
    • Ong Boon Leong's avatar
      net: stmmac: xgmac: fix incorrect XGMAC_VLAN_TAG register writing · 907a0768
      Ong Boon Leong authored
      We should always read the current value of XGMAC_VLAN_TAG and modify it,
      instead of directly overwriting the register value.
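
      The read-modify-write pattern referred to above can be sketched as
      follows. This is a user-space model with the register represented by a
      plain variable; vlan_hash_set() and the VLAN_HASH_ENABLE bit position are
      assumptions for illustration and do not reflect the real XGMAC_VLAN_TAG
      layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, not the real XGMAC_VLAN_TAG layout. */
#define VLAN_HASH_ENABLE (UINT32_C(1) << 25)

/* Read the register, change only the bit we own, write the result back. */
static void vlan_hash_set(volatile uint32_t *reg, bool enable)
{
	uint32_t value = *reg; /* keep whatever other fields are already set */

	if (enable)
		value |= VLAN_HASH_ENABLE;
	else
		value &= ~VLAN_HASH_ENABLE;

	*reg = value;
}
```

      Writing a freshly built value instead would clear every other field that
      had been programmed into the same register.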
      
      Fixes: 3cd1cfcb ("net: stmmac: Implement VLAN Hash Filtering in XGMAC")
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      907a0768
    • Tan, Tee Min's avatar
      net: stmmac: fix incorrect GMAC_VLAN_TAG register writing in GMAC4+ · 9eeeb3c9
      Tan, Tee Min authored
      The driver should always read the current value of GMAC_VLAN_TAG and
      modify it, instead of directly overwriting the register value.
      
      Fixes: c1be0022 ("net: stmmac: Add VLAN HASH filtering support in GMAC4+")
      Signed-off-by: Tan, Tee Min <tee.min.tan@intel.com>
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9eeeb3c9
    • Haiyang Zhang's avatar
      hv_netvsc: Fix XDP refcnt for synthetic and VF NICs · 184367dc
      Haiyang Zhang authored
      The caller of XDP_SETUP_PROG has already incremented refcnt in
      __bpf_prog_get(), so drivers should only increment refcnt by
      num_queues - 1.
      
      To fix the issue, update netvsc_xdp_set() to add the correct number
      to refcnt.
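
      The reference accounting described above can be modelled as below. This
      is a minimal sketch, assuming a program shared by all queues; struct prog
      and xdp_attach_all() are hypothetical stand-ins, not the hv_netvsc or BPF
      core API.

```c
/* Hypothetical stand-in for a refcounted BPF program. */
struct prog {
	int refcnt;
};

/*
 * Attach one program to every queue. The caller already holds one
 * reference, so only num_queues - 1 additional references are taken.
 */
static void xdp_attach_all(struct prog *prog, struct prog **queues,
			   int num_queues)
{
	int i;

	if (prog && num_queues > 1)
		prog->refcnt += num_queues - 1;

	for (i = 0; i < num_queues; i++)
		queues[i] = prog;
}
```

      With four queues and an incoming refcnt of one, the program ends up with
      exactly four references, one per queue, so dropping one reference per
      queue on teardown brings the count back to zero.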
      
      Hold a refcnt in netvsc_xdp_set()’s other caller, netvsc_attach().
      
      Do the same in netvsc_vf_setxdp(). Otherwise, every time the VF is
      removed and re-added from the host side, the refcnt is decreased by one,
      which may cause a page fault when unloading the XDP program.
      
      Fixes: 351e1581 ("hv_netvsc: Add XDP support")
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      184367dc
    • David S. Miller's avatar
      Merge branch 'taprio-Some-fixes' · 6910fe95
      David S. Miller authored
      Vinicius Costa Gomes says:
      
      ====================
      taprio: Some fixes
      
      Changes from v3:
        - Replaced ENOTSUPP error code with EOPNOTSUPP (Jakub Kicinski);
        - Added the missing policy validation for the flags netlink argument
          (Jakub Kicinski);
        - Fixed the destroy() flow to also destroy the priority to traffic
          class mapping (David Miller);
        - Fixed dropping packets when taprio offloading is used together
          with ETF offloading (more on this below);
      
      Changes from v2:
        - Squashed commits 2/3 and 3/3 into a single one (I think a single
          commit is going to be easier to review);
        - Removed an "improvement" that was causing changes in user visible
          behavior;
      
      Changes from v1:
        - Fixed ignoring the 'flags' argument when adding a new
          instance (Vladimir Oltean);
        - Changed the order of commits;
      
      Updated cover letter:
      
      One bit that might need some attention is the fix for not dropping all
      packets when taprio and ETF offloading are used, patch 5/5. The
      behavior when the fix is applied is that packets that have a 'txtime'
      that would fall outside of their transmission window are now dropped
      by taprio. The question that might be raised is: should taprio be
      responsible for dropping these packets, or should it be handled lower
      in the stack?
      
      My opinion is: taprio has all the information, and it's able to give
      feedback to the user. Lower in the stack, those packets might go into
      the void, and the only feedback could be a hard to find counter
      increasing.
      
      Patch 1/5: Reported by Po Liu, is more of an improvement of usability for
      drivers implementing offloading features, now they can rely on the
      value of dev->num_tc, instead of going through some hops to get this
      value.
      
      Patch 2/5: Use 'q->flags' as the source of truth for the offloading
      flags. Tries to solidify the current behavior, while avoiding going
      into invalid states, one of which was causing a "rcu stall" (more
      information in the commit message).
      
      Patch 3/5: Adds the missing netlink attribute validation for
      TCA_TAPRIO_ATTR_FLAGS.
      
      Patch 4/5: Replaces the usage of netdev_set_num_tc() with
      netdev_reset_tc() in taprio_destroy(). taprio_destroy() is called when
      applying a configuration fails, so this makes sure that the device
      traffic class configuration goes back to the default state.
      
      @Vladimir: If possible, I would appreciate your Ack on patch 2/5. I
      have been looking at this code for so long that I might have missed
      something obvious (and my growing dislike for the word 'flags' may be
      affecting my judgement :-).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6910fe95
    • Vinicius Costa Gomes's avatar
      taprio: Fix dropping packets when using taprio + ETF offloading · bfabd41d
      Vinicius Costa Gomes authored
      When using taprio offloading together with ETF offloading, configured
      like this, for example:
      
      $ tc qdisc replace dev $IFACE parent root handle 100 taprio \
              num_tc 4 \
              map 2 2 1 0 3 2 2 2 2 2 2 2 2 2 2 2 \
              queues 1@0 1@1 1@2 1@3 \
              base-time $BASE_TIME \
              sched-entry S 01 1000000 \
              sched-entry S 0e 1000000 \
              flags 0x2

      $ tc qdisc replace dev $IFACE parent 100:1 etf \
              offload delta 300000 clockid CLOCK_TAI
      
      During enqueue, the verification added for the "txtime" assisted mode
      also runs when taprio + ETF offloading is used. The only thing missing
      was initializing the 'next_txtime' of all the cycle entries: if
      'next_txtime' is not set, all packets from SO_TXTIME sockets are
      dropped.
      
      Fixes: 4cfd5779 ("taprio: Add support for txtime-assist mode")
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bfabd41d
    • Vinicius Costa Gomes's avatar
      taprio: Use taprio_reset_tc() to reset Traffic Classes configuration · 7c16680a
      Vinicius Costa Gomes authored
      When destroying the current taprio instance, which can happen when the
      creation of one fails, we should reset the traffic class configuration
      back to the default state.
      
      netdev_reset_tc() is a better way to do this because, in addition to
      setting the number of traffic classes to zero, it also resets the
      priority to traffic class mapping to the default value.
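
      The difference between only zeroing the class count and a full reset can
      be modelled as below. This is a simplified sketch: struct tc_dev,
      clear_num_tc_only() and reset_tc() are hypothetical names, and the
      16-entry map size is an assumption chosen to match the 16-entry 'map'
      argument used with tc.

```c
#include <string.h>

#define PRIO_SLOTS 16 /* assumed number of priority slots */

/* Hypothetical stand-in for the device's traffic class state. */
struct tc_dev {
	int num_tc;
	unsigned char prio_tc_map[PRIO_SLOTS];
};

/* Only clears the class count; the old mapping is left behind. */
static void clear_num_tc_only(struct tc_dev *dev)
{
	dev->num_tc = 0;
}

/* Clears the class count and returns the mapping to its default (all zero). */
static void reset_tc(struct tc_dev *dev)
{
	dev->num_tc = 0;
	memset(dev->prio_tc_map, 0, sizeof(dev->prio_tc_map));
}
```

      After clear_num_tc_only() the stale priority mapping is still present,
      which is the leftover state the full reset is meant to avoid.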
      
      Fixes: 5a781ccb ("tc: Add support for configuring the taprio scheduler")
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7c16680a
    • Vinicius Costa Gomes's avatar
      taprio: Add missing policy validation for flags · 49c684d7
      Vinicius Costa Gomes authored
      netlink policy validation for the 'flags' argument was missing.
      
      Fixes: 4cfd5779 ("taprio: Add support for txtime-assist mode")
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      49c684d7
    • Vinicius Costa Gomes's avatar
      taprio: Fix still allowing changing the flags during runtime · a9d62274
      Vinicius Costa Gomes authored
      Because 'q->flags' starts as zero, and zero is a valid value, we
      aren't able to detect the transition from zero to something else
      during "runtime".
      
      The solution is to initialize 'q->flags' with an invalid value, so we
      can detect if 'q->flags' was set by the user or not.
      
      To better solidify the behavior, 'flags' handling is moved to a
      separate function. The behavior is:
       - 'flags', if unspecified by the user, is assumed to be zero;
       - 'flags' cannot change during "runtime" (i.e. a change() request
         cannot modify it).
      
      With this new function we can remove taprio_flags, which should reduce
      the risk of future accidents.
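
      The two rules above can be sketched as a small helper. This is a
      user-space model and not the code from the patch: new_flags(),
      FLAGS_INVALID and the FLAGS_KNOWN mask are assumed names and values
      chosen for illustration.

```c
#include <errno.h>
#include <stdint.h>

#define FLAGS_INVALID UINT32_MAX    /* sentinel: flags were never set */
#define FLAGS_KNOWN   UINT32_C(0x3) /* assumed set of valid flag bits */

/*
 * old:       flags currently stored (FLAGS_INVALID before the first setup)
 * requested: user-supplied flags, or a null pointer if none were given
 * Returns the flags to store, or a negative errno value.
 */
static int new_flags(uint32_t old, const uint32_t *requested)
{
	/* Unspecified flags are treated as zero. */
	uint32_t flags = requested ? *requested : 0;

	/* Once set, the flags may not change. */
	if (old != FLAGS_INVALID && old != flags)
		return -EOPNOTSUPP;
	if (flags & ~FLAGS_KNOWN)
		return -EINVAL;

	return (int)flags;
}
```

      Because the stored value starts out as the sentinel rather than zero, a
      later request that goes from zero to a non-zero value is rejected like
      any other change.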
      
      Allowing flags to be changed was causing the following RCU stall:
      
      [ 1730.558249] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
      [ 1730.558258] rcu: 	  6-...0: (190 ticks this GP) idle=922/0/0x1 softirq=25580/25582 fqs=16250
      [ 1730.558264] 		  (detected by 2, t=65002 jiffies, g=33017, q=81)
      [ 1730.558269] Sending NMI from CPU 2 to CPUs 6:
      [ 1730.559277] NMI backtrace for cpu 6
      [ 1730.559277] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G            E     5.5.0-rc6+ #35
      [ 1730.559278] Hardware name: Gigabyte Technology Co., Ltd. Z390 AORUS ULTRA/Z390 AORUS ULTRA-CF, BIOS F7 03/14/2019
      [ 1730.559278] RIP: 0010:__hrtimer_run_queues+0xe2/0x440
      [ 1730.559278] Code: 48 8b 43 28 4c 89 ff 48 8b 75 c0 48 89 45 c8 e8 f4 bb 7c 00 0f 1f 44 00 00 65 8b 05 40 31 f0 68 89 c0 48 0f a3 05 3e 5c 25 01 <0f> 82 fc 01 00 00 48 8b 45 c8 48 89 df ff d0 89 45 c8 0f 1f 44 00
      [ 1730.559279] RSP: 0018:ffff9970802d8f10 EFLAGS: 00000083
      [ 1730.559279] RAX: 0000000000000006 RBX: ffff8b31645bff38 RCX: 0000000000000000
      [ 1730.559280] RDX: 0000000000000000 RSI: ffffffff9710f2ec RDI: ffffffff978daf0e
      [ 1730.559280] RBP: ffff9970802d8f68 R08: 0000000000000000 R09: 0000000000000000
      [ 1730.559280] R10: 0000018336d7944e R11: 0000000000000001 R12: ffff8b316e39f9c0
      [ 1730.559281] R13: ffff8b316e39f940 R14: ffff8b316e39f998 R15: ffff8b316e39f7c0
      [ 1730.559281] FS:  0000000000000000(0000) GS:ffff8b316e380000(0000) knlGS:0000000000000000
      [ 1730.559281] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1730.559281] CR2: 00007f1105303760 CR3: 0000000227210005 CR4: 00000000003606e0
      [ 1730.559282] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1730.559282] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1730.559282] Call Trace:
      [ 1730.559282]  <IRQ>
      [ 1730.559283]  ? taprio_dequeue_soft+0x2d0/0x2d0 [sch_taprio]
      [ 1730.559283]  hrtimer_interrupt+0x104/0x220
      [ 1730.559283]  ? irqtime_account_irq+0x34/0xa0
      [ 1730.559283]  smp_apic_timer_interrupt+0x6d/0x230
      [ 1730.559284]  apic_timer_interrupt+0xf/0x20
      [ 1730.559284]  </IRQ>
      [ 1730.559284] RIP: 0010:cpu_idle_poll+0x35/0x1a0
      [ 1730.559285] Code: 88 82 ff 65 44 8b 25 12 7d 73 68 0f 1f 44 00 00 e8 90 c3 89 ff fb 65 48 8b 1c 25 c0 7e 01 00 48 8b 03 a8 08 74 0b eb 1c f3 90 <48> 8b 03 a8 08 75 13 8b 05 be a8 a8 00 85 c0 75 ed e8 75 48 84 ff
      [ 1730.559285] RSP: 0018:ffff997080137ea8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
      [ 1730.559285] RAX: 0000000000000001 RBX: ffff8b316bc3c580 RCX: 0000000000000000
      [ 1730.559286] RDX: 0000000000000001 RSI: 000000002819aad9 RDI: ffffffff978da730
      [ 1730.559286] RBP: ffff997080137ec0 R08: 0000018324a6d387 R09: 0000000000000000
      [ 1730.559286] R10: 0000000000000400 R11: 0000000000000001 R12: 0000000000000006
      [ 1730.559286] R13: ffff8b316bc3c580 R14: 0000000000000000 R15: 0000000000000000
      [ 1730.559287]  ? cpu_idle_poll+0x20/0x1a0
      [ 1730.559287]  ? cpu_idle_poll+0x20/0x1a0
      [ 1730.559287]  do_idle+0x4d/0x1f0
      [ 1730.559287]  ? complete+0x44/0x50
      [ 1730.559288]  cpu_startup_entry+0x1b/0x20
      [ 1730.559288]  start_secondary+0x142/0x180
      [ 1730.559288]  secondary_startup_64+0xb6/0xc0
      [ 1776.686313] nvme nvme0: I/O 96 QID 1 timeout, completion polled
      
      Fixes: 4cfd5779 ("taprio: Add support for txtime-assist mode")
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a9d62274
    • Vinicius Costa Gomes's avatar
      taprio: Fix enabling offload with wrong number of traffic classes · 5652e63d
      Vinicius Costa Gomes authored
      If the driver implementing taprio offloading depends on the network
      device's number of traffic classes (dev->num_tc) for whatever reason, it
      would read the value zero, because the value was only set after the
      offloading function had been called.

      Setting the number of traffic classes before the offloading function is
      called fixes this issue. This is safe because this only happens when
      taprio is instantiated (we don't allow this configuration to be changed
      without first removing taprio).
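
      The ordering issue can be illustrated with a toy model in which a fake
      driver hook records the class count it sees. All names here (struct
      toy_dev, fake_enable_offload(), qdisc_init()) are hypothetical and only
      model the call order, not the taprio code.

```c
/* Hypothetical, minimal stand-in for the device state. */
struct toy_dev {
	int num_tc;    /* number of traffic classes */
	int hw_num_tc; /* value the fake driver saw when offload was enabled */
};

/* Fake driver hook: relies on dev->num_tc being valid when called. */
static int fake_enable_offload(struct toy_dev *dev)
{
	dev->hw_num_tc = dev->num_tc;
	return 0;
}

static int qdisc_init(struct toy_dev *dev, int num_tc)
{
	dev->num_tc = num_tc;            /* set the class count first ... */
	return fake_enable_offload(dev); /* ... so the driver can read it */
}
```

      Swapping the two statements in qdisc_init() would make the fake driver
      record zero, which corresponds to the behavior before the fix.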
      
      Fixes: 9c66d156 ("taprio: Add support for hardware offloading")
      Reported-by: Po Liu <po.liu@nxp.com>
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5652e63d
    • Florian Fainelli's avatar
      net: dsa: bcm_sf2: Only 7278 supports 2Gb/sec IMP port · de34d708
      Florian Fainelli authored
      The 7445 switch clocking profiles do not allow us to run the IMP port at
      2Gb/sec in a way that is reliable and consistent. Make sure that the
      setting is only applied to the 7278 family.
      
      Fixes: 8f1880cb ("net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      de34d708
    • Florian Fainelli's avatar
      net: dsa: b53: Always use dev->vlan_enabled in b53_configure_vlan() · df373702
      Florian Fainelli authored
      b53_configure_vlan() is called by the bcm_sf2 driver upon setup and
      indirectly through resume as well. During the initial setup, we are
      guaranteed that dev->vlan_enabled is false, so there is no change in
      behavior. During resume, however, VLANs may have been enabled before the
      suspend, so we do want to restore that setting.
      
      Fixes: dad8d7c6 ("net: dsa: b53: Properly account for VLAN filtering")
      Fixes: 967dd82f ("net: dsa: b53: Add support for Broadcom RoboSwitch")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      df373702
    • Dejin Zheng's avatar
      net: stmmac: fix a possible endless loop · 7d10f077
      Dejin Zheng authored
      The retry variable was never decremented in a while loop in the
      ethqos_configure() function, so the loop could spin forever without
      ever timing out.
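
      A bounded polling loop of the kind described above might look like the
      sketch below. It is a user-space model: struct fake_hw, fake_hw_ready()
      and wait_ready() are hypothetical, and the real driver polls a hardware
      register rather than a counter.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical hardware model: becomes ready after ready_after polls. */
struct fake_hw {
	int ready_after; /* 0 means "never becomes ready" */
	int polls;       /* number of polls performed so far */
};

static bool fake_hw_ready(struct fake_hw *hw)
{
	hw->polls++;
	return hw->ready_after > 0 && hw->polls >= hw->ready_after;
}

/* Poll at most 'retry' times; return 0 on success or -ETIMEDOUT. */
static int wait_ready(struct fake_hw *hw, int retry)
{
	do {
		if (fake_hw_ready(hw))
			return 0;
		retry--; /* without this decrement the loop never ends */
	} while (retry > 0);

	return -ETIMEDOUT;
}
```

      With the decrement in place, hardware that never reports ready costs at
      most 'retry' polls before the caller gets an error back.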
      
      Fixes: a7c30e62 ("net: stmmac: Add driver for Qualcomm ethqos")
      Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com>
      Acked-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d10f077
    • David Howells's avatar
      rxrpc: Fix call RCU cleanup using non-bh-safe locks · 963485d4
      David Howells authored
      rxrpc_rcu_destroy_call(), which is called as an RCU callback to clean up a
      put call, calls rxrpc_put_connection() which, deep in its bowels, takes a
      number of spinlocks in a non-BH-safe way, including rxrpc_conn_id_lock and
      local->client_conns_lock.  RCU callbacks, however, are normally called from
      softirq context, which can cause lockdep to notice the locking
      inconsistency.
      
      To get lockdep to detect this, it's necessary to have the connection
      cleaned up on the put at the end of the last of its calls, though normally
      the clean up is deferred.  This can be induced, however, by starting a call
      on an AF_RXRPC socket and then closing the socket without reading the
      reply.
      
      Fix this by having rxrpc_rcu_destroy_call() punt the destruction to a
      workqueue if in softirq-mode and defer the destruction to process context.
      
      Note that another way to fix this could be to add a bunch of bh-disable
      annotations to the spinlocks concerned - and there might be more than just
      those two - but that means spending more time with BHs disabled.
      
      Note also that some of these places were covered by bh-disable spinlocks
      belonging to the rxrpc_transport object, but these got removed without the
      _bh annotation being retained on the next lock in.
      
      Fixes: 999b69f8 ("rxrpc: Kill the client connection bundle concept")
      Reported-by: syzbot+d82f3ac8d87e7ccbb2c9@syzkaller.appspotmail.com
      Reported-by: syzbot+3f1fd6b8cbf8702d134e@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Hillf Danton <hdanton@sina.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      963485d4
    • David Howells's avatar
      rxrpc: Fix service call disconnection · b39a934e
      David Howells authored
      The recent patch that substituted a flag on an rxrpc_call for the
      connection pointer being NULL as an indication that a call was disconnected
      puts the set_bit in the wrong place for service calls.  This is only a
      problem if a call is implicitly terminated by a new call coming in on the
      same connection channel instead of a terminating ACK packet.
      
      In such a case, rxrpc_input_implicit_end_call() calls
      __rxrpc_disconnect_call(), which is now (incorrectly) setting the
      disconnection bit, meaning that when rxrpc_release_call() is later called,
      it doesn't call rxrpc_disconnect_call() and so the call isn't removed from
      the peer's error distribution list and the list gets corrupted.
      
      KASAN finds the issue as an access after release on a call, but the
      position at which it occurs is confusing as it appears to be related to a
      different call (the call site is where the latter call is being removed
      from the error distribution list and either the next or pprev pointer
      points to a previously released call).
      
      Fix this by moving the setting of the flag from __rxrpc_disconnect_call()
      to rxrpc_disconnect_call() in the same place that the connection pointer
      was being cleared.
      
      Fixes: 5273a191 ("rxrpc: Fix NULL pointer deref due to call->conn being cleared on disconnect")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b39a934e
  3. 06 Feb, 2020 5 commits