1. 20 Mar, 2020 11 commits
    • Jakub Kicinski's avatar
      fib: add missing attribute validation for tun_id · 0b21c9cb
      Jakub Kicinski authored
      [ Upstream commit 4c16d64e ]
      
      Add missing netlink policy entry for FRA_TUN_ID.
      
      Fixes: e7030878 ("fib: Add fib rule match on tunnel id")
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0b21c9cb
    • Vasundhara Volam's avatar
      bnxt_en: reinitialize IRQs when MTU is modified · 285112d5
      Vasundhara Volam authored
      [ Upstream commit a9b952d2 ]
      
      MTU changes may affect the number of IRQs so we must call
      bnxt_close_nic()/bnxt_open_nic() with the irq_re_init parameter
      set to true.  The reason is that a larger MTU may require
      aggregation rings not needed with smaller MTU.  We may not be
      able to allocate the required number of aggregation rings and
      so we reduce the number of channels which will change the number
      of IRQs.  Without this patch, it may crash eventually in
      pci_disable_msix() when the IRQs are not properly unwound.
      
      Fixes: c0c050c5 ("bnxt_en: New Broadcom ethernet driver.")
      Signed-off-by: default avatarVasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      285112d5
    • You-Sheng Yang's avatar
      r8152: check disconnect status after long sleep · f120a400
      You-Sheng Yang authored
      [ Upstream commit d64c7a08 ]
      
      Dell USB Type C docking WD19/WD19DC attaches additional peripherals as:
      
        /: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
            |__ Port 1: Dev 11, If 0, Class=Hub, Driver=hub/4p, 5000M
                |__ Port 3: Dev 12, If 0, Class=Hub, Driver=hub/4p, 5000M
                |__ Port 4: Dev 13, If 0, Class=Vendor Specific Class,
                    Driver=r8152, 5000M
      
      where usb 2-1-3 is a hub connecting all USB Type-A/C ports on the dock.
      
      When hotplugging such dock with additional usb devices already attached on
      it, the probing process may reset usb 2.1 port, therefore r8152 ethernet
      device is also reset. However, during r8152 device init there are several
      for-loops that, when it's unable to retrieve hardware registers due to
      being disconnected from USB, may take up to 14 seconds each in practice,
      and that has to be completed before USB may re-enumerate devices on the
      bus. As a result, devices attached to the dock will only be available
      after nearly 1 minute after the dock was plugged in:
      
        [ 216.388290] [250] r8152 2-1.4:1.0: usb_probe_interface
        [ 216.388292] [250] r8152 2-1.4:1.0: usb_probe_interface - got id
        [ 258.830410] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): PHY not ready
        [ 258.830460] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): Invalid header when reading pass-thru MAC addr
        [ 258.830464] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): Get ether addr fail
      
      This happens in, for example, r8153_init:
      
        static int generic_ocp_read(struct r8152 *tp, u16 index, u16 size,
      			    void *data, u16 type)
        {
          if (test_bit(RTL8152_UNPLUG, &tp->flags))
            return -ENODEV;
          ...
        }
      
        static u16 ocp_read_word(struct r8152 *tp, u16 type, u16 index)
        {
          u32 data;
          ...
          generic_ocp_read(tp, index, sizeof(tmp), &tmp, type | byen);
      
          data = __le32_to_cpu(tmp);
          ...
          return (u16)data;
        }
      
        static void r8153_init(struct r8152 *tp)
        {
          ...
          if (test_bit(RTL8152_UNPLUG, &tp->flags))
            return;
      
          for (i = 0; i < 500; i++) {
            if (ocp_read_word(tp, MCU_TYPE_PLA, PLA_BOOT_CTRL) &
                AUTOLOAD_DONE)
              break;
            msleep(20);
          }
          ...
        }
      
      Since ocp_read_word() doesn't check the return status of
      generic_ocp_read(), and the only exit condition for the loop is to have
      a match in the returned value, such loops will only ends after exceeding
      its maximum runs when the device has been marked as disconnected, which
      takes 500 * 20ms = 10 seconds in theory, 14 in practice.
      
      To solve this long latency another test to RTL8152_UNPLUG flag should be
      added after those 20ms sleep to skip unnecessary loops, so that the device
      probe can complete early and proceed to parent port reset/reprobe process.
      
      This can be reproduced on all kernel versions up to latest v5.6-rc2, but
      after v5.5-rc7 the reproduce rate is dramatically lowered to 1/30 or less
      while it was around 1/2.
      Signed-off-by: default avatarYou-Sheng Yang <vicamo.yang@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f120a400
    • Dan Carpenter's avatar
      net: nfc: fix bounds checking bugs on "pipe" · e5660ee1
      Dan Carpenter authored
      [ Upstream commit a3aefbfe ]
      
      This is similar to commit 674d9de0 ("NFC: Fix possible memory
      corruption when handling SHDLC I-Frame commands") and commit d7ee81ad
      ("NFC: nci: Add some bounds checking in nci_hci_cmd_received()") which
      added range checks on "pipe".
      
      The "pipe" variable comes skb->data[0] in nfc_hci_msg_rx_work().
      It's in the 0-255 range.  We're using it as the array index into the
      hdev->pipes[] array which has NFC_HCI_MAX_PIPES (128) members.
      
      Fixes: 118278f2 ("NFC: hci: Add pipes table to reference them with a tuple {gate, host}")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e5660ee1
    • Dmitry Bogdanov's avatar
      net: macsec: update SCI upon MAC address change. · fc094dab
      Dmitry Bogdanov authored
      [ Upstream commit 6fc498bc ]
      
      SCI should be updated, because it contains MAC in its first 6 octets.
      
      Fixes: c09440f7 ("macsec: introduce IEEE 802.1AE driver")
      Signed-off-by: default avatarDmitry Bogdanov <dbogdanov@marvell.com>
      Signed-off-by: default avatarMark Starovoytov <mstarovoitov@marvell.com>
      Signed-off-by: default avatarIgor Russkikh <irusskikh@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc094dab
    • Hangbin Liu's avatar
      ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface · 6d80c781
      Hangbin Liu authored
      [ Upstream commit 60380488 ]
      
      Rafał found an issue that for non-Ethernet interface, if we down and up
      frequently, the memory will be consumed slowly.
      
      The reason is we add allnodes/allrouters addressed in multicast list in
      ipv6_add_dev(). When link down, we call ipv6_mc_down(), store all multicast
      addresses via mld_add_delrec(). But when link up, we don't call ipv6_mc_up()
      for non-Ethernet interface to remove the addresses. This makes idev->mc_tomb
      getting bigger and bigger. The call stack looks like:
      
      addrconf_notify(NETDEV_REGISTER)
      	ipv6_add_dev
      		ipv6_dev_mc_inc(ff01::1)
      		ipv6_dev_mc_inc(ff02::1)
      		ipv6_dev_mc_inc(ff02::2)
      
      addrconf_notify(NETDEV_UP)
      	addrconf_dev_config
      		/* Alas, we support only Ethernet autoconfiguration. */
      		return;
      
      addrconf_notify(NETDEV_DOWN)
      	addrconf_ifdown
      		ipv6_mc_down
      			igmp6_group_dropped(ff02::2)
      				mld_add_delrec(ff02::2)
      			igmp6_group_dropped(ff02::1)
      			igmp6_group_dropped(ff01::1)
      
      After investigating, I can't found a rule to disable multicast on
      non-Ethernet interface. In RFC2460, the link could be Ethernet, PPP, ATM,
      tunnels, etc. In IPv4, it doesn't check the dev type when calls ip_mc_up()
      in inetdev_event(). Even for IPv6, we don't check the dev type and call
      ipv6_add_dev(), ipv6_dev_mc_inc() after register device.
      
      So I think it's OK to fix this memory consumer by calling ipv6_mc_up() for
      non-Ethernet interface.
      
      v2: Also check IFF_MULTICAST flag to make sure the interface supports
          multicast
      Reported-by: default avatarRafał Miłecki <zajec5@gmail.com>
      Tested-by: default avatarRafał Miłecki <zajec5@gmail.com>
      Fixes: 74235a25 ("[IPV6] addrconf: Fix IPv6 on tuntap tunnels")
      Fixes: 1666d49e ("mld: do not remove mld souce list info when set link down")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6d80c781
    • Eric Dumazet's avatar
      gre: fix uninit-value in __iptunnel_pull_header · 6f1aea70
      Eric Dumazet authored
      [ Upstream commit 17c25caf ]
      
      syzbot found an interesting case of the kernel reading
      an uninit-value [1]
      
      Problem is in the handling of ETH_P_WCCP in gre_parse_header()
      
      We look at the byte following GRE options to eventually decide
      if the options are four bytes longer.
      
      Use skb_header_pointer() to not pull bytes if we found
      that no more bytes were needed.
      
      All callers of gre_parse_header() are properly using pskb_may_pull()
      anyway before proceeding to next header.
      
      [1]
      BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2303 [inline]
      BUG: KMSAN: uninit-value in __iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94
      CPU: 1 PID: 11784 Comm: syz-executor940 Not tainted 5.6.0-rc2-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x220 lib/dump_stack.c:118
       kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
       __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
       pskb_may_pull include/linux/skbuff.h:2303 [inline]
       __iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94
       iptunnel_pull_header include/net/ip_tunnels.h:411 [inline]
       gre_rcv+0x15e/0x19c0 net/ipv6/ip6_gre.c:606
       ip6_protocol_deliver_rcu+0x181b/0x22c0 net/ipv6/ip6_input.c:432
       ip6_input_finish net/ipv6/ip6_input.c:473 [inline]
       NF_HOOK include/linux/netfilter.h:307 [inline]
       ip6_input net/ipv6/ip6_input.c:482 [inline]
       ip6_mc_input+0xdf2/0x1460 net/ipv6/ip6_input.c:576
       dst_input include/net/dst.h:442 [inline]
       ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
       NF_HOOK include/linux/netfilter.h:307 [inline]
       ipv6_rcv+0x683/0x710 net/ipv6/ip6_input.c:306
       __netif_receive_skb_one_core net/core/dev.c:5198 [inline]
       __netif_receive_skb net/core/dev.c:5312 [inline]
       netif_receive_skb_internal net/core/dev.c:5402 [inline]
       netif_receive_skb+0x66b/0xf20 net/core/dev.c:5461
       tun_rx_batched include/linux/skbuff.h:4321 [inline]
       tun_get_user+0x6aef/0x6f60 drivers/net/tun.c:1997
       tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026
       call_write_iter include/linux/fs.h:1901 [inline]
       new_sync_write fs/read_write.c:483 [inline]
       __vfs_write+0xa5a/0xca0 fs/read_write.c:496
       vfs_write+0x44a/0x8f0 fs/read_write.c:558
       ksys_write+0x267/0x450 fs/read_write.c:611
       __do_sys_write fs/read_write.c:623 [inline]
       __se_sys_write fs/read_write.c:620 [inline]
       __ia32_sys_write+0xdb/0x120 fs/read_write.c:620
       do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline]
       do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410
       entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139
      RIP: 0023:0xf7f62d99
      Code: 90 e8 0b 00 00 00 f3 90 0f ae e8 eb f9 8d 74 26 00 89 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
      RSP: 002b:00000000fffedb2c EFLAGS: 00000217 ORIG_RAX: 0000000000000004
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020002580
      RDX: 0000000000000fca RSI: 0000000000000036 RDI: 0000000000000004
      RBP: 0000000000008914 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
       kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127
       kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82
       slab_alloc_node mm/slub.c:2793 [inline]
       __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401
       __kmalloc_reserve net/core/skbuff.c:142 [inline]
       __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210
       alloc_skb include/linux/skbuff.h:1051 [inline]
       alloc_skb_with_frags+0x18c/0xa70 net/core/skbuff.c:5766
       sock_alloc_send_pskb+0xada/0xc60 net/core/sock.c:2242
       tun_alloc_skb drivers/net/tun.c:1529 [inline]
       tun_get_user+0x10ae/0x6f60 drivers/net/tun.c:1843
       tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026
       call_write_iter include/linux/fs.h:1901 [inline]
       new_sync_write fs/read_write.c:483 [inline]
       __vfs_write+0xa5a/0xca0 fs/read_write.c:496
       vfs_write+0x44a/0x8f0 fs/read_write.c:558
       ksys_write+0x267/0x450 fs/read_write.c:611
       __do_sys_write fs/read_write.c:623 [inline]
       __se_sys_write fs/read_write.c:620 [inline]
       __ia32_sys_write+0xdb/0x120 fs/read_write.c:620
       do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline]
       do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410
       entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139
      
      Fixes: 95f5c64c ("gre: Move utility functions to common headers")
      Fixes: c5441932 ("GRE: Refactor GRE tunneling code.")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6f1aea70
    • Dmitry Yakunin's avatar
      cgroup, netclassid: periodically release file_lock on classid updating · 78604444
      Dmitry Yakunin authored
      [ Upstream commit 018d26fc ]
      
      In our production environment we have faced with problem that updating
      classid in cgroup with heavy tasks cause long freeze of the file tables
      in this tasks. By heavy tasks we understand tasks with many threads and
      opened sockets (e.g. balancers). This freeze leads to an increase number
      of client timeouts.
      
      This patch implements following logic to fix this issue:
      аfter iterating 1000 file descriptors file table lock will be released
      thus providing a time gap for socket creation/deletion.
      
      Now update is non atomic and socket may be skipped using calls:
      
      dup2(oldfd, newfd);
      close(oldfd);
      
      But this case is not typical. Moreover before this patch skip is possible
      too by hiding socket fd in unix socket buffer.
      
      New sockets will be allocated with updated classid because cgroup state
      is updated before start of the file descriptors iteration.
      
      So in common cases this patch has no side effects.
      Signed-off-by: default avatarDmitry Yakunin <zeil@yandex-team.ru>
      Reviewed-by: default avatarKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      78604444
    • Florian Fainelli's avatar
      net: phy: Avoid multiple suspends · da933d98
      Florian Fainelli authored
      commit 503ba7c6 upstream.
      
      It is currently possible for a PHY device to be suspended as part of a
      network device driver's suspend call while it is still being attached to
      that net_device, either via phy_suspend() or implicitly via phy_stop().
      
      Later on, when the MDIO bus controller get suspended, we would attempt
      to suspend again the PHY because it is still attached to a network
      device.
      
      This is both a waste of time and creates an opportunity for improper
      clock/power management bugs to creep in.
      
      Fixes: 803dd9c7 ("net: phy: avoid suspending twice a PHY")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      da933d98
    • David S. Miller's avatar
      phy: Revert toggling reset changes. · 4ffa65aa
      David S. Miller authored
      commit 7b566f70 upstream.
      
      This reverts:
      
      ef1b5bf5 ("net: phy: Fix not to call phy_resume() if PHY is not attached")
      8c85f4b8 ("net: phy: micrel: add toggling phy reset if PHY is not  attached")
      
      Andrew Lunn informs me that there are alternative efforts
      underway to fix this more properly.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [just take the ef1b5bf5 revert - gregkh]
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4ffa65aa
    • Petr Malat's avatar
      NFS: Remove superfluous kmap in nfs_readdir_xdr_to_array · dbfc9e98
      Petr Malat authored
      Array is mapped by nfs_readdir_get_array(), the further kmap is a result
      of a bad merge and should be removed.
      
      This resource leakage can be exploited for DoS by receptively reading
      a content of a directory on NFS (e.g. by running ls).
      
      Fixes: 67a56e97 ("NFS: Fix memory leaks and corruption in readdir")
      Signed-off-by: default avatarPetr Malat <oss@malat.biz>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dbfc9e98
  2. 11 Mar, 2020 29 commits