1. 10 Nov, 2019 40 commits
    • Andrew Lunn's avatar
      net: usb: lan78xx: Connect PHY before registering MAC · 6b5bf3f3
      Andrew Lunn authored
      [ Upstream commit 38b4fe32 ]
      
      As soon as the netdev is registers, the kernel can start using the
      interface. If the driver connects the MAC to the PHY after the netdev
      is registered, there is a race condition where the interface can be
      opened without having the PHY connected.
      
      Change the order to close this race condition.
      
      Fixes: 92571a1a ("lan78xx: Connect phy early")
      Reported-by: default avatarDaniel Wagner <dwagner@suse.de>
      Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Tested-by: default avatarDaniel Wagner <dwagner@suse.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b5bf3f3
    • Doug Berger's avatar
      net: bcmgenet: reset 40nm EPHY on energy detect · 07c62fc7
      Doug Berger authored
      [ Upstream commit 25382b99 ]
      
      The EPHY integrated into the 40nm Set-Top Box devices can falsely
      detect energy when connected to a disabled peer interface. When the
      peer interface is enabled the EPHY will detect and report the link
      as active, but on occasion may get into a state where it is not
      able to exchange data with the connected GENET MAC. This issue has
      not been observed when the link parameters are auto-negotiated;
      however, it has been observed with a manually configured link.
      
      It has been empirically determined that issuing a soft reset to the
      EPHY when energy is detected prevents it from getting into this bad
      state.
      
      Fixes: 1c1008c7 ("net: bcmgenet: add main driver file")
      Signed-off-by: default avatarDoug Berger <opendmb@gmail.com>
      Acked-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      07c62fc7
    • Doug Berger's avatar
      net: phy: bcm7xxx: define soft_reset for 40nm EPHY · 6d3ccc2a
      Doug Berger authored
      [ Upstream commit fe586b82 ]
      
      The internal 40nm EPHYs use a "Workaround for putting the PHY in
      IDDQ mode." These PHYs require a soft reset to restore functionality
      after they are powered back up.
      
      This commit defines the soft_reset function to use genphy_soft_reset
      during phy_init_hw to accommodate this.
      
      Fixes: 6e2d85ec ("net: phy: Stop with excessive soft reset")
      Signed-off-by: default avatarDoug Berger <opendmb@gmail.com>
      Acked-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6d3ccc2a
    • Doug Berger's avatar
      net: bcmgenet: don't set phydev->link from MAC · 97cc6827
      Doug Berger authored
      [ Upstream commit 7de48402 ]
      
      When commit 28b2e0d2 ("net: phy: remove parameter new_link from
      phy_mac_interrupt()") removed the new_link parameter it set the
      phydev->link state from the MAC before invoking phy_mac_interrupt().
      
      However, once commit 88d6272a ("net: phy: avoid unneeded MDIO
      reads in genphy_read_status") was added this initialization prevents
      the proper determination of the connection parameters by the function
      genphy_read_status().
      
      This commit removes that initialization to restore the proper
      functionality.
      
      Fixes: 88d6272a ("net: phy: avoid unneeded MDIO reads in genphy_read_status")
      Signed-off-by: default avatarDoug Berger <opendmb@gmail.com>
      Acked-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      97cc6827
    • Florian Fainelli's avatar
      net: dsa: b53: Do not clear existing mirrored port mask · 57e286f6
      Florian Fainelli authored
      [ Upstream commit c763ac43 ]
      
      Clearing the existing bitmask of mirrored ports essentially prevents us
      from capturing more than one port at any given time. This is clearly
      wrong, do not clear the bitmask prior to setting up the new port.
      Reported-by: default avatarHubert Feurstein <h.feurstein@gmail.com>
      Fixes: ed3af5fd ("net: dsa: b53: Add support for port mirroring")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: default avatarVivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      57e286f6
    • Aya Levin's avatar
      net/mlx5e: Fix ethtool self test: link speed · db91be8e
      Aya Levin authored
      [ Upstream commit 534e7366 ]
      
      Ethtool self test contains a test for link speed. This test reads the
      PTYS register and determines whether the current speed is valid or not.
      Change current implementation to use the function mlx5e_port_linkspeed()
      that does the same check and fails when speed is invalid. This code
      redundancy lead to a bug when mlx5e_port_linkspeed() was updated with
      expended speeds and the self test was not.
      
      Fixes: 2c81bfd5 ("net/mlx5e: Move port speed code from en_ethtool.c to en/port.c")
      Signed-off-by: default avatarAya Levin <ayal@mellanox.com>
      Reviewed-by: default avatarMoshe Shemesh <moshe@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db91be8e
    • Heiner Kallweit's avatar
      r8169: fix wrong PHY ID issue with RTL8168dp · 5eb1967b
      Heiner Kallweit authored
      [ Upstream commit 62bdc8fd ]
      
      As reported in [0] at least one RTL8168dp version has problems
      establishing a link. This chip version has an integrated RTL8211b PHY,
      however the chip seems to report a wrong PHY ID, resulting in a wrong
      PHY driver (for Generic Realtek PHY) being loaded.
      Work around this issue by adding a hook to r8168dp_2_mdio_read()
      for returning the correct PHY ID.
      
      [0] https://bbs.archlinux.org/viewtopic.php?id=246508
      
      Fixes: 242cd9b5 ("r8169: use phy_resume/phy_suspend")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5eb1967b
    • Maxim Mikityanskiy's avatar
      net/mlx5e: Fix handling of compressed CQEs in case of low NAPI budget · 9e7c4fa2
      Maxim Mikityanskiy authored
      [ Upstream commit 9df86bdb ]
      
      When CQE compression is enabled, compressed CQEs use the following
      structure: a title is followed by one or many blocks, each containing 8
      mini CQEs (except the last, which may contain fewer mini CQEs).
      
      Due to NAPI budget restriction, a complete structure is not always
      parsed in one NAPI run, and some blocks with mini CQEs may be deferred
      to the next NAPI poll call - we have the mlx5e_decompress_cqes_cont call
      in the beginning of mlx5e_poll_rx_cq. However, if the budget is
      extremely low, some blocks may be left even after that, but the code
      that follows the mlx5e_decompress_cqes_cont call doesn't check it and
      assumes that a new CQE begins, which may not be the case. In such cases,
      random memory corruptions occur.
      
      An extremely low NAPI budget of 8 is used when busy_poll or busy_read is
      active.
      
      This commit adds a check to make sure that the previous compressed CQE
      has been completely parsed after mlx5e_decompress_cqes_cont, otherwise
      it prevents a new CQE from being fetched in the middle of a compressed
      CQE.
      
      This commit fixes random crashes in __build_skb, __page_pool_put_page
      and other not-related-directly places, that used to happen when both CQE
      compression and busy_poll/busy_read were enabled.
      
      Fixes: 7219ab34 ("net/mlx5e: CQE compression")
      Signed-off-by: default avatarMaxim Mikityanskiy <maximmi@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9e7c4fa2
    • Paolo Abeni's avatar
      selftests: fib_tests: add more tests for metric update · 0c3355cc
      Paolo Abeni authored
      [ Upstream commit 37de3b35 ]
      
      This patch adds two more tests to ipv4_addr_metric_test() to
      explicitly cover the scenarios fixed by the previous patch.
      Suggested-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c3355cc
    • Paolo Abeni's avatar
      ipv4: fix route update on metric change. · b166e883
      Paolo Abeni authored
      [ Upstream commit 0b834ba0 ]
      
      Since commit af4d768a ("net/ipv4: Add support for specifying metric
      of connected routes"), when updating an IP address with a different metric,
      the associated connected route is updated, too.
      
      Still, the mentioned commit doesn't handle properly some corner cases:
      
      $ ip addr add dev eth0 192.168.1.0/24
      $ ip addr add dev eth0 192.168.2.1/32 peer 192.168.2.2
      $ ip addr add dev eth0 192.168.3.1/24
      $ ip addr change dev eth0 192.168.1.0/24 metric 10
      $ ip addr change dev eth0 192.168.2.1/32 peer 192.168.2.2 metric 10
      $ ip addr change dev eth0 192.168.3.1/24 metric 10
      $ ip -4 route
      192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.0
      192.168.2.2 dev eth0 proto kernel scope link src 192.168.2.1
      192.168.3.0/24 dev eth0 proto kernel scope link src 192.168.2.1 metric 10
      
      Only the last route is correctly updated.
      
      The problem is the current test in fib_modify_prefix_metric():
      
      	if (!(dev->flags & IFF_UP) ||
      	    ifa->ifa_flags & (IFA_F_SECONDARY | IFA_F_NOPREFIXROUTE) ||
      	    ipv4_is_zeronet(prefix) ||
      	    prefix == ifa->ifa_local || ifa->ifa_prefixlen == 32)
      
      Which should be the logical 'not' of the pre-existing test in
      fib_add_ifaddr():
      
      	if (!ipv4_is_zeronet(prefix) && !(ifa->ifa_flags & IFA_F_SECONDARY) &&
      	    (prefix != addr || ifa->ifa_prefixlen < 32))
      
      To properly negate the original expression, we need to change the last
      logical 'or' to a logical 'and'.
      
      Fixes: af4d768a ("net/ipv4: Add support for specifying metric of connected routes")
      Reported-and-suggested-by: default avatarBeniamino Galvani <bgalvani@redhat.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b166e883
    • Eric Dumazet's avatar
      net: add READ_ONCE() annotation in __skb_wait_for_more_packets() · cd3bcb44
      Eric Dumazet authored
      [ Upstream commit 7c422d0c ]
      
      __skb_wait_for_more_packets() can be called while other cpus
      can feed packets to the socket receive queue.
      
      KCSAN reported :
      
      BUG: KCSAN: data-race in __skb_wait_for_more_packets / __udp_enqueue_schedule_skb
      
      write to 0xffff888102e40b58 of 8 bytes by interrupt on cpu 0:
       __skb_insert include/linux/skbuff.h:1852 [inline]
       __skb_queue_before include/linux/skbuff.h:1958 [inline]
       __skb_queue_tail include/linux/skbuff.h:1991 [inline]
       __udp_enqueue_schedule_skb+0x2d7/0x410 net/ipv4/udp.c:1470
       __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline]
       udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057
       udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074
       udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233
       __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300
       udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
      
      read to 0xffff888102e40b58 of 8 bytes by task 13035 on cpu 1:
       __skb_wait_for_more_packets+0xfa/0x320 net/core/datagram.c:100
       __skb_recv_udp+0x374/0x500 net/ipv4/udp.c:1683
       udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712
       inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
       sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871
       ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
       do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
       __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
       __do_sys_recvmmsg net/socket.c:2703 [inline]
       __se_sys_recvmmsg net/socket.c:2696 [inline]
       __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 13035 Comm: syz-executor.3 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cd3bcb44
    • Eric Dumazet's avatar
      net: use skb_queue_empty_lockless() in busy poll contexts · 4f3df7f1
      Eric Dumazet authored
      [ Upstream commit 3f926af3 ]
      
      Busy polling usually runs without locks.
      Let's use skb_queue_empty_lockless() instead of skb_queue_empty()
      
      Also uses READ_ONCE() in __skb_try_recv_datagram() to address
      a similar potential problem.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4f3df7f1
    • Eric Dumazet's avatar
      net: use skb_queue_empty_lockless() in poll() handlers · eaf548fe
      Eric Dumazet authored
      [ Upstream commit 3ef7cf57 ]
      
      Many poll() handlers are lockless. Using skb_queue_empty_lockless()
      instead of skb_queue_empty() is more appropriate.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      eaf548fe
    • Eric Dumazet's avatar
      udp: use skb_queue_empty_lockless() · afa1f5e9
      Eric Dumazet authored
      [ Upstream commit 137a0dbe ]
      
      syzbot reported a data-race [1].
      
      We should use skb_queue_empty_lockless() to document that we are
      not ensuring a mutual exclusion and silence KCSAN.
      
      [1]
      BUG: KCSAN: data-race in __skb_recv_udp / __udp_enqueue_schedule_skb
      
      write to 0xffff888122474b50 of 8 bytes by interrupt on cpu 0:
       __skb_insert include/linux/skbuff.h:1852 [inline]
       __skb_queue_before include/linux/skbuff.h:1958 [inline]
       __skb_queue_tail include/linux/skbuff.h:1991 [inline]
       __udp_enqueue_schedule_skb+0x2c1/0x410 net/ipv4/udp.c:1470
       __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline]
       udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057
       udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074
       udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233
       __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300
       udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
      
      read to 0xffff888122474b50 of 8 bytes by task 8921 on cpu 1:
       skb_queue_empty include/linux/skbuff.h:1494 [inline]
       __skb_recv_udp+0x18d/0x500 net/ipv4/udp.c:1653
       udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712
       inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
       sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871
       ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
       do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
       __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
       __do_sys_recvmmsg net/socket.c:2703 [inline]
       __se_sys_recvmmsg net/socket.c:2696 [inline]
       __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 8921 Comm: syz-executor.4 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      afa1f5e9
    • Eric Dumazet's avatar
      net: add skb_queue_empty_lockless() · d5ac4232
      Eric Dumazet authored
      [ Upstream commit d7d16a89 ]
      
      Some paths call skb_queue_empty() without holding
      the queue lock. We must use a barrier in order
      to not let the compiler do strange things, and avoid
      KCSAN splats.
      
      Adding a barrier in skb_queue_empty() might be overkill,
      I prefer adding a new helper to clearly identify
      points where the callers might be lockless. This might
      help us finding real bugs.
      
      The corresponding WRITE_ONCE() should add zero cost
      for current compilers.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5ac4232
    • Xin Long's avatar
      vxlan: check tun_info options_len properly · 83532eb4
      Xin Long authored
      [ Upstream commit eadf52cf ]
      
      This patch is to improve the tun_info options_len by dropping
      the skb when TUNNEL_VXLAN_OPT is set but options_len is less
      than vxlan_metadata. This can void a potential out-of-bounds
      access on ip_tun_info.
      
      Fixes: ee122c79 ("vxlan: Flow based tunneling")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      83532eb4
    • Eric Dumazet's avatar
      udp: fix data-race in udp_set_dev_scratch() · a8a5adbb
      Eric Dumazet authored
      [ Upstream commit a793183c ]
      
      KCSAN reported a data-race in udp_set_dev_scratch() [1]
      
      The issue here is that we must not write over skb fields
      if skb is shared. A similar issue has been fixed in commit
      89c22d8c ("net: Fix skb csum races when peeking")
      
      While we are at it, use a helper only dealing with
      udp_skb_scratch(skb)->csum_unnecessary, as this allows
      udp_set_dev_scratch() to be called once and thus inlined.
      
      [1]
      BUG: KCSAN: data-race in udp_set_dev_scratch / udpv6_recvmsg
      
      write to 0xffff888120278317 of 1 bytes by task 10411 on cpu 1:
       udp_set_dev_scratch+0xea/0x200 net/ipv4/udp.c:1308
       __first_packet_length+0x147/0x420 net/ipv4/udp.c:1556
       first_packet_length+0x68/0x2a0 net/ipv4/udp.c:1579
       udp_poll+0xea/0x110 net/ipv4/udp.c:2720
       sock_poll+0xed/0x250 net/socket.c:1256
       vfs_poll include/linux/poll.h:90 [inline]
       do_select+0x7d0/0x1020 fs/select.c:534
       core_sys_select+0x381/0x550 fs/select.c:677
       do_pselect.constprop.0+0x11d/0x160 fs/select.c:759
       __do_sys_pselect6 fs/select.c:784 [inline]
       __se_sys_pselect6 fs/select.c:769 [inline]
       __x64_sys_pselect6+0x12e/0x170 fs/select.c:769
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      read to 0xffff888120278317 of 1 bytes by task 10413 on cpu 0:
       udp_skb_csum_unnecessary include/net/udp.h:358 [inline]
       udpv6_recvmsg+0x43e/0xe90 net/ipv6/udp.c:310
       inet6_recvmsg+0xbb/0x240 net/ipv6/af_inet6.c:592
       sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871
       ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
       do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
       __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
       __do_sys_recvmmsg net/socket.c:2703 [inline]
       __se_sys_recvmmsg net/socket.c:2696 [inline]
       __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 10413 Comm: syz-executor.0 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: 2276f58a ("udp: use a separate rx queue for packet reception")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a8a5adbb
    • Wei Wang's avatar
      selftests: net: reuseport_dualstack: fix uninitalized parameter · 12fab163
      Wei Wang authored
      [ Upstream commit d64479a3 ]
      
      This test reports EINVAL for getsockopt(SOL_SOCKET, SO_DOMAIN)
      occasionally due to the uninitialized length parameter.
      Initialize it to fix this, and also use int for "test_family" to comply
      with the API standard.
      
      Fixes: d6a61f80 ("soreuseport: test mixed v4/v6 sockets")
      Reported-by: default avatarMaciej Żenczykowski <maze@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarWei Wang <weiwan@google.com>
      Cc: Craig Gallek <cgallek@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      12fab163
    • zhanglin's avatar
      net: Zeroing the structure ethtool_wolinfo in ethtool_get_wol() · 321c9915
      zhanglin authored
      [ Upstream commit 5ff223e8 ]
      
      memset() the structure ethtool_wolinfo that has padded bytes
      but the padded bytes have not been zeroed out.
      Signed-off-by: default avatarzhanglin <zhang.lin16@zte.com.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      321c9915
    • Daniel Wagner's avatar
      net: usb: lan78xx: Disable interrupts before calling generic_handle_irq() · 9da271c1
      Daniel Wagner authored
      [ Upstream commit 0a29ac5b ]
      
      lan78xx_status() will run with interrupts enabled due to the change in
      ed194d13 ("usb: core: remove local_irq_save() around ->complete()
      handler"). generic_handle_irq() expects to be run with IRQs disabled.
      
      [    4.886203] 000: irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts
      [    4.886243] 000: WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x154/0x168
      [    4.896294] 000: Modules linked in:
      [    4.896301] 000: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.6 #39
      [    4.896310] 000: Hardware name: Raspberry Pi 3 Model B+ (DT)
      [    4.896315] 000: pstate: 60000005 (nZCv daif -PAN -UAO)
      [    4.896321] 000: pc : __handle_irq_event_percpu+0x154/0x168
      [    4.896331] 000: lr : __handle_irq_event_percpu+0x154/0x168
      [    4.896339] 000: sp : ffff000010003cc0
      [    4.896346] 000: x29: ffff000010003cc0 x28: 0000000000000060
      [    4.896355] 000: x27: ffff000011021980 x26: ffff00001189c72b
      [    4.896364] 000: x25: ffff000011702bc0 x24: ffff800036d6e400
      [    4.896373] 000: x23: 000000000000004f x22: ffff000010003d64
      [    4.896381] 000: x21: 0000000000000000 x20: 0000000000000002
      [    4.896390] 000: x19: ffff8000371c8480 x18: 0000000000000060
      [    4.896398] 000: x17: 0000000000000000 x16: 00000000000000eb
      [    4.896406] 000: x15: ffff000011712d18 x14: 7265746e69206465
      [    4.896414] 000: x13: ffff000010003ba0 x12: ffff000011712df0
      [    4.896422] 000: x11: 0000000000000001 x10: ffff000011712e08
      [    4.896430] 000: x9 : 0000000000000001 x8 : 000000000003c920
      [    4.896437] 000: x7 : ffff0000118cc410 x6 : ffff0000118c7f00
      [    4.896445] 000: x5 : 000000000003c920 x4 : 0000000000004510
      [    4.896453] 000: x3 : ffff000011712dc8 x2 : 0000000000000000
      [    4.896461] 000: x1 : 73a3f67df94c1500 x0 : 0000000000000000
      [    4.896466] 000: Call trace:
      [    4.896471] 000:  __handle_irq_event_percpu+0x154/0x168
      [    4.896481] 000:  handle_irq_event_percpu+0x50/0xb0
      [    4.896489] 000:  handle_irq_event+0x40/0x98
      [    4.896497] 000:  handle_simple_irq+0xa4/0xf0
      [    4.896505] 000:  generic_handle_irq+0x24/0x38
      [    4.896513] 000:  intr_complete+0xb0/0xe0
      [    4.896525] 000:  __usb_hcd_giveback_urb+0x58/0xd8
      [    4.896533] 000:  usb_giveback_urb_bh+0xd0/0x170
      [    4.896539] 000:  tasklet_action_common.isra.0+0x9c/0x128
      [    4.896549] 000:  tasklet_hi_action+0x24/0x30
      [    4.896556] 000:  __do_softirq+0x120/0x23c
      [    4.896564] 000:  irq_exit+0xb8/0xd8
      [    4.896571] 000:  __handle_domain_irq+0x64/0xb8
      [    4.896579] 000:  bcm2836_arm_irqchip_handle_irq+0x60/0xc0
      [    4.896586] 000:  el1_irq+0xb8/0x140
      [    4.896592] 000:  arch_cpu_idle+0x10/0x18
      [    4.896601] 000:  do_idle+0x200/0x280
      [    4.896608] 000:  cpu_startup_entry+0x20/0x28
      [    4.896615] 000:  rest_init+0xb4/0xc0
      [    4.896623] 000:  arch_call_rest_init+0xc/0x14
      [    4.896632] 000:  start_kernel+0x454/0x480
      
      Fixes: ed194d13 ("usb: core: remove local_irq_save() around ->complete() handler")
      Cc: Woojung Huh <woojung.huh@microchip.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Stefan Wahren <wahrenst@gmx.net>
      Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: default avatarDaniel Wagner <dwagner@suse.de>
      Tested-by: default avatarStefan Wahren <wahrenst@gmx.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9da271c1
    • Guillaume Nault's avatar
      netns: fix GFP flags in rtnl_net_notifyid() · 40400fdd
      Guillaume Nault authored
      [ Upstream commit d4e4fdf9 ]
      
      In rtnl_net_notifyid(), we certainly can't pass a null GFP flag to
      rtnl_notify(). A GFP_KERNEL flag would be fine in most circumstances,
      but there are a few paths calling rtnl_net_notifyid() from atomic
      context or from RCU critical sections. The later also precludes the use
      of gfp_any() as it wouldn't detect the RCU case. Also, the nlmsg_new()
      call is wrong too, as it uses GFP_KERNEL unconditionally.
      
      Therefore, we need to pass the GFP flags as parameter and propagate it
      through function calls until the proper flags can be determined.
      
      In most cases, GFP_KERNEL is fine. The exceptions are:
        * openvswitch: ovs_vport_cmd_get() and ovs_vport_cmd_dump()
          indirectly call rtnl_net_notifyid() from RCU critical section,
      
        * rtnetlink: rtmsg_ifinfo_build_skb() already receives GFP flags as
          parameter.
      
      Also, in ovs_vport_cmd_build_info(), let's change the GFP flags used
      by nlmsg_new(). The function is allowed to sleep, so better make the
      flags consistent with the ones used in the following
      ovs_vport_cmd_fill_info() call.
      
      Found by code inspection.
      
      Fixes: 9a963454 ("netns: notify netns id events")
      Signed-off-by: default avatarGuillaume Nault <gnault@redhat.com>
      Acked-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: default avatarPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      40400fdd
    • Eran Ben Elisha's avatar
      net/mlx4_core: Dynamically set guaranteed amount of counters per VF · 1d72dbb4
      Eran Ben Elisha authored
      [ Upstream commit e19868ef ]
      
      Prior to this patch, the amount of counters guaranteed per VF in the
      resource tracker was MLX4_VF_COUNTERS_PER_PORT * MLX4_MAX_PORTS. It was
      set regardless if the VF was single or dual port.
      This caused several VFs to have no guaranteed counters although the
      system could satisfy their request.
      
      The fix is to dynamically guarantee counters, based on each VF
      specification.
      
      Fixes: 9de92c60 ("net/mlx4_core: Adjust counter grant policy in the resource tracker")
      Signed-off-by: default avatarEran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1d72dbb4
    • Jiangfeng Xiao's avatar
      net: hisilicon: Fix ping latency when deal with high throughput · f05975d9
      Jiangfeng Xiao authored
      [ Upstream commit e56bd641 ]
      
      This is due to error in over budget processing.
      When dealing with high throughput, the used buffers
      that exceeds the budget is not cleaned up. In addition,
      it takes a lot of cycles to clean up the used buffer,
      and then the buffer where the valid data is located can take effect.
      Signed-off-by: default avatarJiangfeng Xiao <xiaojiangfeng@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f05975d9
    • Tejun Heo's avatar
      net: fix sk_page_frag() recursion from memory reclaim · 1d5cb12a
      Tejun Heo authored
      [ Upstream commit 20eb4f29 ]
      
      sk_page_frag() optimizes skb_frag allocations by using per-task
      skb_frag cache when it knows it's the only user.  The condition is
      determined by seeing whether the socket allocation mask allows
      blocking - if the allocation may block, it obviously owns the task's
      context and ergo exclusively owns current->task_frag.
      
      Unfortunately, this misses recursion through memory reclaim path.
      Please take a look at the following backtrace.
      
       [2] RIP: 0010:tcp_sendmsg_locked+0xccf/0xe10
           ...
           tcp_sendmsg+0x27/0x40
           sock_sendmsg+0x30/0x40
           sock_xmit.isra.24+0xa1/0x170 [nbd]
           nbd_send_cmd+0x1d2/0x690 [nbd]
           nbd_queue_rq+0x1b5/0x3b0 [nbd]
           __blk_mq_try_issue_directly+0x108/0x1b0
           blk_mq_request_issue_directly+0xbd/0xe0
           blk_mq_try_issue_list_directly+0x41/0xb0
           blk_mq_sched_insert_requests+0xa2/0xe0
           blk_mq_flush_plug_list+0x205/0x2a0
           blk_flush_plug_list+0xc3/0xf0
       [1] blk_finish_plug+0x21/0x2e
           _xfs_buf_ioapply+0x313/0x460
           __xfs_buf_submit+0x67/0x220
           xfs_buf_read_map+0x113/0x1a0
           xfs_trans_read_buf_map+0xbf/0x330
           xfs_btree_read_buf_block.constprop.42+0x95/0xd0
           xfs_btree_lookup_get_block+0x95/0x170
           xfs_btree_lookup+0xcc/0x470
           xfs_bmap_del_extent_real+0x254/0x9a0
           __xfs_bunmapi+0x45c/0xab0
           xfs_bunmapi+0x15/0x30
           xfs_itruncate_extents_flags+0xca/0x250
           xfs_free_eofblocks+0x181/0x1e0
           xfs_fs_destroy_inode+0xa8/0x1b0
           destroy_inode+0x38/0x70
           dispose_list+0x35/0x50
           prune_icache_sb+0x52/0x70
           super_cache_scan+0x120/0x1a0
           do_shrink_slab+0x120/0x290
           shrink_slab+0x216/0x2b0
           shrink_node+0x1b6/0x4a0
           do_try_to_free_pages+0xc6/0x370
           try_to_free_mem_cgroup_pages+0xe3/0x1e0
           try_charge+0x29e/0x790
           mem_cgroup_charge_skmem+0x6a/0x100
           __sk_mem_raise_allocated+0x18e/0x390
           __sk_mem_schedule+0x2a/0x40
       [0] tcp_sendmsg_locked+0x8eb/0xe10
           tcp_sendmsg+0x27/0x40
           sock_sendmsg+0x30/0x40
           ___sys_sendmsg+0x26d/0x2b0
           __sys_sendmsg+0x57/0xa0
           do_syscall_64+0x42/0x100
           entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      In [0], tcp_send_msg_locked() was using current->page_frag when it
      called sk_wmem_schedule().  It already calculated how many bytes can
      be fit into current->page_frag.  Due to memory pressure,
      sk_wmem_schedule() called into memory reclaim path which called into
      xfs and then IO issue path.  Because the filesystem in question is
      backed by nbd, the control goes back into the tcp layer - back into
      tcp_sendmsg_locked().
      
      nbd sets sk_allocation to (GFP_NOIO | __GFP_MEMALLOC) which makes
      sense - it's in the process of freeing memory and wants to be able to,
      e.g., drop clean pages to make forward progress.  However, this
      confused sk_page_frag() called from [2].  Because it only tests
      whether the allocation allows blocking which it does, it now thinks
      current->page_frag can be used again although it already was being
      used in [0].
      
      After [2] used current->page_frag, the offset would be increased by
      the used amount.  When the control returns to [0],
      current->page_frag's offset is increased and the previously calculated
      number of bytes now may overrun the end of allocated memory leading to
      silent memory corruptions.
      
      Fix it by adding gfpflags_normal_context() which tests sleepable &&
      !reclaim and use it to determine whether to use current->task_frag.
      
      v2: Eric didn't like gfp flags being tested twice.  Introduce a new
          helper gfpflags_normal_context() and combine the two tests.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1d5cb12a
    • Benjamin Herrenschmidt's avatar
      net: ethernet: ftgmac100: Fix DMA coherency issue with SW checksum · 189982d1
      Benjamin Herrenschmidt authored
      [ Upstream commit 88824e3b ]
      
      We are calling the checksum helper after the dma_map_single()
      call to map the packet. This is incorrect as the checksumming
      code will touch the packet from the CPU. This means the cache
      won't be properly flushes (or the bounce buffering will leave
      us with the unmodified packet to DMA).
      
      This moves the calculation of the checksum & vlan tags to
      before the DMA mapping.
      
      This also has the side effect of fixing another bug: If the
      checksum helper fails, we goto "drop" to drop the packet, which
      will not unmap the DMA mapping.
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Fixes: 05690d63 ("ftgmac100: Upgrade to NETIF_F_HW_CSUM")
      Reviewed-by: default avatarVijay Khemka <vijaykhemka@fb.com>
      Tested-by: default avatarVijay Khemka <vijaykhemka@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      189982d1
    • Florian Fainelli's avatar
      net: dsa: bcm_sf2: Fix IMP setup for port different than 8 · 5536fc89
      Florian Fainelli authored
      [ Upstream commit 5fc0f212 ]
      
      Since it became possible for the DSA core to use a CPU port different
      than 8, our bcm_sf2_imp_setup() function was broken because it assumes
      that registers are applicable to port 8. In particular, the port's MAC
      is going to stay disabled, so make sure we clear the RX_DIS and TX_DIS
      bits if we are not configured for port 8.
      
      Fixes: 9f91484f ("net: dsa: make "label" property optional for dsa2")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5536fc89
    • Eric Dumazet's avatar
      net: annotate lockless accesses to sk->sk_napi_id · 2c50a36d
      Eric Dumazet authored
      [ Upstream commit ee8d153d ]
      
      We already annotated most accesses to sk->sk_napi_id
      
      We missed sk_mark_napi_id() and sk_mark_napi_id_once()
      which might be called without socket lock held in UDP stack.
      
      KCSAN reported :
      BUG: KCSAN: data-race in udpv6_queue_rcv_one_skb / udpv6_queue_rcv_one_skb
      
      write to 0xffff888121c6d108 of 4 bytes by interrupt on cpu 0:
       sk_mark_napi_id include/net/busy_poll.h:125 [inline]
       __udpv6_queue_rcv_skb net/ipv6/udp.c:571 [inline]
       udpv6_queue_rcv_one_skb+0x70c/0xb40 net/ipv6/udp.c:672
       udpv6_queue_rcv_skb+0xb5/0x400 net/ipv6/udp.c:689
       udp6_unicast_rcv_skb.isra.0+0xd7/0x180 net/ipv6/udp.c:832
       __udp6_lib_rcv+0x69c/0x1770 net/ipv6/udp.c:913
       udpv6_rcv+0x2b/0x40 net/ipv6/udp.c:1015
       ip6_protocol_deliver_rcu+0x22a/0xbe0 net/ipv6/ip6_input.c:409
       ip6_input_finish+0x30/0x50 net/ipv6/ip6_input.c:450
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip6_input+0x177/0x190 net/ipv6/ip6_input.c:459
       dst_input include/net/dst.h:442 [inline]
       ip6_rcv_finish+0x110/0x140 net/ipv6/ip6_input.c:76
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ipv6_rcv+0x1a1/0x1b0 net/ipv6/ip6_input.c:284
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
       napi_poll net/core/dev.c:6392 [inline]
       net_rx_action+0x3ae/0xa90 net/core/dev.c:6460
      
      write to 0xffff888121c6d108 of 4 bytes by interrupt on cpu 1:
       sk_mark_napi_id include/net/busy_poll.h:125 [inline]
       __udpv6_queue_rcv_skb net/ipv6/udp.c:571 [inline]
       udpv6_queue_rcv_one_skb+0x70c/0xb40 net/ipv6/udp.c:672
       udpv6_queue_rcv_skb+0xb5/0x400 net/ipv6/udp.c:689
       udp6_unicast_rcv_skb.isra.0+0xd7/0x180 net/ipv6/udp.c:832
       __udp6_lib_rcv+0x69c/0x1770 net/ipv6/udp.c:913
       udpv6_rcv+0x2b/0x40 net/ipv6/udp.c:1015
       ip6_protocol_deliver_rcu+0x22a/0xbe0 net/ipv6/ip6_input.c:409
       ip6_input_finish+0x30/0x50 net/ipv6/ip6_input.c:450
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip6_input+0x177/0x190 net/ipv6/ip6_input.c:459
       dst_input include/net/dst.h:442 [inline]
       ip6_rcv_finish+0x110/0x140 net/ipv6/ip6_input.c:76
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ipv6_rcv+0x1a1/0x1b0 net/ipv6/ip6_input.c:284
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 10890 Comm: syz-executor.0 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: e68b6e50 ("udp: enable busy polling for all sockets")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2c50a36d
    • Eric Dumazet's avatar
      net: annotate accesses to sk->sk_incoming_cpu · 0cfaf03c
      Eric Dumazet authored
      [ Upstream commit 7170a977 ]
      
      This socket field can be read and written by concurrent cpus.
      
      Use READ_ONCE() and WRITE_ONCE() annotations to document this,
      and avoid some compiler 'optimizations'.
      
      KCSAN reported :
      
      BUG: KCSAN: data-race in tcp_v4_rcv / tcp_v4_rcv
      
      write to 0xffff88812220763c of 4 bytes by interrupt on cpu 0:
       sk_incoming_cpu_update include/net/sock.h:953 [inline]
       tcp_v4_rcv+0x1b3c/0x1bb0 net/ipv4/tcp_ipv4.c:1934
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
       napi_poll net/core/dev.c:6392 [inline]
       net_rx_action+0x3ae/0xa90 net/core/dev.c:6460
       __do_softirq+0x115/0x33f kernel/softirq.c:292
       do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
       do_softirq.part.0+0x6b/0x80 kernel/softirq.c:337
       do_softirq kernel/softirq.c:329 [inline]
       __local_bh_enable_ip+0x76/0x80 kernel/softirq.c:189
      
      read to 0xffff88812220763c of 4 bytes by interrupt on cpu 1:
       sk_incoming_cpu_update include/net/sock.h:952 [inline]
       tcp_v4_rcv+0x181a/0x1bb0 net/ipv4/tcp_ipv4.c:1934
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
       napi_poll net/core/dev.c:6392 [inline]
       net_rx_action+0x3ae/0xa90 net/core/dev.c:6460
       __do_softirq+0x115/0x33f kernel/softirq.c:292
       run_ksoftirqd+0x46/0x60 kernel/softirq.c:603
       smpboot_thread_fn+0x37d/0x4a0 kernel/smpboot.c:165
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0cfaf03c
    • Eric Dumazet's avatar
      inet: stop leaking jiffies on the wire · 07de7389
      Eric Dumazet authored
      [ Upstream commit a904a069 ]
      
      Historically linux tried to stick to RFC 791, 1122, 2003
      for IPv4 ID field generation.
      
      RFC 6864 made clear that no matter how hard we try,
      we can not ensure unicity of IP ID within maximum
      lifetime for all datagrams with a given source
      address/destination address/protocol tuple.
      
      Linux uses a per socket inet generator (inet_id), initialized
      at connection startup with a XOR of 'jiffies' and other
      fields that appear clear on the wire.
      
      Thiemo Nagel pointed that this strategy is a privacy
      concern as this provides 16 bits of entropy to fingerprint
      devices.
      
      Let's switch to a random starting point, this is just as
      good as far as RFC 6864 is concerned and does not leak
      anything critical.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarThiemo Nagel <tnagel@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      07de7389
    • Xin Long's avatar
      erspan: fix the tun_info options_len check for erspan · 163901dc
      Xin Long authored
      [ Upstream commit 2eb8d6d2 ]
      
      The check for !md doens't really work for ip_tunnel_info_opts(info) which
      only does info + 1. Also to avoid out-of-bounds access on info, it should
      ensure options_len is not less than erspan_metadata in both erspan_xmit()
      and ip6erspan_tunnel_xmit().
      
      Fixes: 1a66a836 ("gre: add collect_md mode to ERSPAN tunnel")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      163901dc
    • Eric Dumazet's avatar
      dccp: do not leak jiffies on the wire · 96df1ec2
      Eric Dumazet authored
      [ Upstream commit 3d1e5039 ]
      
      For some reason I missed the case of DCCP passive
      flows in my previous patch.
      
      Fixes: a904a069 ("inet: stop leaking jiffies on the wire")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarThiemo Nagel <tnagel@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      96df1ec2
    • Vishal Kulkarni's avatar
      cxgb4: fix panic when attaching to ULD fail · f291613f
      Vishal Kulkarni authored
      [ Upstream commit fc89cc35 ]
      
      Release resources when attaching to ULD fail. Otherwise, data
      mismatch is seen between LLD and ULD later on, which lead to
      kernel panic when accessing resources that should not even
      exist in the first place.
      
      Fixes: 94cdb8bb ("cxgb4: Add support for dynamic allocation of resources for ULD")
      Signed-off-by: default avatarShahjada Abul Husain <shahjada@chelsio.com>
      Signed-off-by: default avatarVishal Kulkarni <vishal@chelsio.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f291613f
    • Josef Bacik's avatar
      nbd: handle racing with error'ed out commands · 1f032ca2
      Josef Bacik authored
      [ Upstream commit 7ce23e8e ]
      
      We hit the following warning in production
      
      print_req_error: I/O error, dev nbd0, sector 7213934408 flags 80700
      ------------[ cut here ]------------
      refcount_t: underflow; use-after-free.
      WARNING: CPU: 25 PID: 32407 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
      Workqueue: knbd-recv recv_work [nbd]
      RIP: 0010:refcount_sub_and_test_checked+0x53/0x60
      Call Trace:
       blk_mq_free_request+0xb7/0xf0
       blk_mq_complete_request+0x62/0xf0
       recv_work+0x29/0xa1 [nbd]
       process_one_work+0x1f5/0x3f0
       worker_thread+0x2d/0x3d0
       ? rescuer_thread+0x340/0x340
       kthread+0x111/0x130
       ? kthread_create_on_node+0x60/0x60
       ret_from_fork+0x1f/0x30
      ---[ end trace b079c3c67f98bb7c ]---
      
      This was preceded by us timing out everything and shutting down the
      sockets for the device.  The problem is we had a request in the queue at
      the same time, so we completed the request twice.  This can actually
      happen in a lot of cases, we fail to get a ref on our config, we only
      have one connection and just error out the command, etc.
      
      Fix this by checking cmd->status in nbd_read_stat.  We only change this
      under the cmd->lock, so we are safe to check this here and see if we've
      already error'ed this command out, which would indicate that we've
      completed it as well.
      Reviewed-by: default avatarMike Christie <mchristi@redhat.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1f032ca2
    • Josef Bacik's avatar
      nbd: protect cmd->status with cmd->lock · 82b7c99e
      Josef Bacik authored
      [ Upstream commit de6346ec ]
      
      We already do this for the most part, except in timeout and clear_req.
      For the timeout case we take the lock after we grab a ref on the config,
      but that isn't really necessary because we're safe to touch the cmd at
      this point, so just move the order around.
      
      For the clear_req cause this is initiated by the user, so again is safe.
      Reviewed-by: default avatarMike Christie <mchristi@redhat.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      82b7c99e
    • Dave Wysochanski's avatar
      cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs · 80b42f43
      Dave Wysochanski authored
      [ Upstream commit d46b0da7 ]
      
      There's a deadlock that is possible and can easily be seen with
      a test where multiple readers open/read/close of the same file
      and a disruption occurs causing reconnect.  The deadlock is due
      a reader thread inside cifs_strict_readv calling down_read and
      obtaining lock_sem, and then after reconnect inside
      cifs_reopen_file calling down_read a second time.  If in
      between the two down_read calls, a down_write comes from
      another process, deadlock occurs.
      
              CPU0                    CPU1
              ----                    ----
      cifs_strict_readv()
       down_read(&cifsi->lock_sem);
                                     _cifsFileInfo_put
                                        OR
                                     cifs_new_fileinfo
                                      down_write(&cifsi->lock_sem);
      cifs_reopen_file()
       down_read(&cifsi->lock_sem);
      
      Fix the above by changing all down_write(lock_sem) calls to
      down_write_trylock(lock_sem)/msleep() loop, which in turn
      makes the second down_read call benign since it will never
      block behind the writer while holding lock_sem.
      Signed-off-by: default avatarDave Wysochanski <dwysocha@redhat.com>
      Suggested-by: default avatarRonnie Sahlberg <lsahlber@redhat.com>
      Reviewed--by: default avatarRonnie Sahlberg <lsahlber@redhat.com>
      Reviewed-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      80b42f43
    • Alain Volmat's avatar
      i2c: stm32f7: remove warning when compiling with W=1 · a7448991
      Alain Volmat authored
      [ Upstream commit 348e46fb ]
      
      Remove the following warning:
      
      drivers/i2c/busses/i2c-stm32f7.c:315:
      warning: cannot understand function prototype:
      'struct stm32f7_i2c_spec i2c_specs[] =
      
      Replace a comment starting with /** by simply /* to avoid having
      it interpreted as a kernel-doc comment.
      
      Fixes: aeb068c5 ("i2c: i2c-stm32f7: add driver")
      Signed-off-by: default avatarAlain Volmat <alain.volmat@st.com>
      Reviewed-by: default avatarPierre-Yves MORDRET <pierre-yves.mordret@st.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a7448991
    • Fabrice Gasnier's avatar
      i2c: stm32f7: fix a race in slave mode with arbitration loss irq · 86fd9e33
      Fabrice Gasnier authored
      [ Upstream commit 6d6b0d0d ]
      
      When in slave mode, an arbitration loss (ARLO) may be detected before the
      slave had a chance to detect the stop condition (STOPF in ISR).
      This is seen when two master + slave adapters switch their roles. It
      provokes the i2c bus to be stuck, busy as SCL line is stretched.
      - the I2C_SLAVE_STOP event is never generated due to STOPF flag is set but
        don't generate an irq (race with ARLO irq, STOPIE is masked). STOPF flag
        remains set until next master xfer (e.g. when STOPIE irq get unmasked).
        In this case, completion is generated too early: immediately upon new
        transfer request (then it doesn't send all data).
      - Some data get stuck in TXDR register. As a consequence, the controller
        stretches the SCL line: the bus gets busy until a future master transfer
        triggers the bus busy / recovery mechanism (this can take time... and
        may never happen at all)
      
      So choice is to let the STOPF being detected by the slave isr handler,
      to properly handle this stop condition. E.g. don't mask IRQs in error
      handler, when the slave is running.
      
      Fixes: 60d609f3 ("i2c: i2c-stm32f7: Add slave support")
      Signed-off-by: default avatarFabrice Gasnier <fabrice.gasnier@st.com>
      Reviewed-by: default avatarPierre-Yves MORDRET <pierre-yves.mordret@st.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      86fd9e33
    • Fabrice Gasnier's avatar
      i2c: stm32f7: fix first byte to send in slave mode · d746ce64
      Fabrice Gasnier authored
      [ Upstream commit 02e64276 ]
      
      The slave-interface documentation [1] states "the bus driver should
      transmit the first byte" upon I2C_SLAVE_READ_REQUESTED slave event:
      - 'val': backend returns first byte to be sent
      The driver currently ignores the 1st byte to send on this event.
      
      [1] https://www.kernel.org/doc/Documentation/i2c/slave-interface
      
      Fixes: 60d609f3 ("i2c: i2c-stm32f7: Add slave support")
      Signed-off-by: default avatarFabrice Gasnier <fabrice.gasnier@st.com>
      Reviewed-by: default avatarPierre-Yves MORDRET <pierre-yves.mordret@st.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d746ce64
    • Zenghui Yu's avatar
      irqchip/gic-v3-its: Use the exact ITSList for VMOVP · 18e7fae3
      Zenghui Yu authored
      [ Upstream commit 84243125 ]
      
      On a system without Single VMOVP support (say GITS_TYPER.VMOVP == 0),
      we will map vPEs only on ITSs that will actually control interrupts
      for the given VM.  And when moving a vPE, the VMOVP command will be
      issued only for those ITSs.
      
      But when issuing VMOVPs we seemed fail to present the exact ITSList
      to ITSs who are actually included in the synchronization operation.
      The its_list_map we're currently using includes all ITSs in the system,
      even though some of them don't have the corresponding vPE mapping at all.
      
      Introduce get_its_list() to get the per-VM its_list_map, to indicate
      which ITSs have vPE mappings for the given VM, and use this map as
      the expected ITSList when building VMOVP. This is hopefully a performance
      gain not to do some synchronization with those unsuspecting ITSs.
      And initialize the whole command descriptor to zero at beginning, since
      the seq_num and its_list should be RES0 when GITS_TYPER.VMOVP == 1.
      Signed-off-by: default avatarZenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/1571802386-2680-1-git-send-email-yuzenghui@huawei.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      18e7fae3
    • Jonas Gorski's avatar
      MIPS: bmips: mark exception vectors as char arrays · 39637aaf
      Jonas Gorski authored
      [ Upstream commit e4f5cb1a ]
      
      The vectors span more than one byte, so mark them as arrays.
      
      Fixes the following build error when building when using GCC 8.3:
      
      In file included from ./include/linux/string.h:19,
                       from ./include/linux/bitmap.h:9,
                       from ./include/linux/cpumask.h:12,
                       from ./arch/mips/include/asm/processor.h:15,
                       from ./arch/mips/include/asm/thread_info.h:16,
                       from ./include/linux/thread_info.h:38,
                       from ./include/asm-generic/preempt.h:5,
                       from ./arch/mips/include/generated/asm/preempt.h:1,
                       from ./include/linux/preempt.h:81,
                       from ./include/linux/spinlock.h:51,
                       from ./include/linux/mmzone.h:8,
                       from ./include/linux/bootmem.h:8,
                       from arch/mips/bcm63xx/prom.c:10:
      arch/mips/bcm63xx/prom.c: In function 'prom_init':
      ./arch/mips/include/asm/string.h:162:11: error: '__builtin_memcpy' forming offset [2, 32] is out of the bounds [0, 1] of object 'bmips_smp_movevec' with type 'char' [-Werror=array-bounds]
         __ret = __builtin_memcpy((dst), (src), __len); \
                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      arch/mips/bcm63xx/prom.c:97:3: note: in expansion of macro 'memcpy'
         memcpy((void *)0xa0000200, &bmips_smp_movevec, 0x20);
         ^~~~~~
      In file included from arch/mips/bcm63xx/prom.c:14:
      ./arch/mips/include/asm/bmips.h:80:13: note: 'bmips_smp_movevec' declared here
       extern char bmips_smp_movevec;
      
      Fixes: 18a1eef9 ("MIPS: BMIPS: Introduce bmips.h")
      Signed-off-by: default avatarJonas Gorski <jonas.gorski@gmail.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarPaul Burton <paulburton@kernel.org>
      Cc: linux-mips@vger.kernel.org
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      39637aaf