1. 26 Sep, 2020 23 commits
    • ipv4: Update exception handling for multipath routes via same device · 98776a36
      David Ahern authored
      [ Upstream commit 2fbc6e89 ]
      
      Kfir reported that pmtu exceptions are not created properly for
      deployments where multipath routes use the same device.
      
      After some digging I see 2 compounding problems:
      1. ip_route_output_key_hash_rcu is updating the flowi4_oif *after*
         the route lookup. This is the second use case where this has
         been a problem (the first is related to use of vti devices with
         VRF). I cannot find any reason for the oif to be changed after the
         lookup; the code goes back to the start of git. It does not seem
         logical, so remove it.
      
      2. fib_lookups for exceptions do not call fib_select_path to handle
         multipath route selection based on the hash.
      
      The end result is that the fib_lookup used to add the exception
      always creates it using the first leg of the route.
      
      An example topology showing the problem:
      
                       |  host1
                   +------+
                   | eth0 |  .209
                   +------+
                       |
                   +------+
           switch  | br0  |
                   +------+
                       |
             +---------+---------+
             | host2             |  host3
         +------+             +------+
         | eth0 | .250        | eth0 | 192.168.252.252
         +------+             +------+
      
         +-----+             +-----+
         | vti | .2          | vti | 192.168.247.3
         +-----+             +-----+
             \                  /
       =================================
       tunnels
               192.168.247.1/24
      
      for h in host1 host2 host3; do
              ip netns add ${h}
              ip -netns ${h} link set lo up
              ip netns exec ${h} sysctl -wq net.ipv4.ip_forward=1
      done
      
      ip netns add switch
      ip -netns switch li set lo up
      ip -netns switch link add br0 type bridge stp 0
      ip -netns switch link set br0 up
      
      for n in 1 2 3; do
              ip -netns switch link add eth-sw type veth peer name eth-h${n}
              ip -netns switch li set eth-h${n} master br0 up
              ip -netns switch li set eth-sw netns host${n} name eth0
      done
      
      ip -netns host1 addr add 192.168.252.209/24 dev eth0
      ip -netns host1 link set dev eth0 up
      ip -netns host1 route add 192.168.247.0/24 \
              nexthop via 192.168.252.250 dev eth0 nexthop via 192.168.252.252 dev eth0
      
      ip -netns host2 addr add 192.168.252.250/24 dev eth0
      ip -netns host2 link set dev eth0 up
      
      ip -netns host3 addr add 192.168.252.252/24 dev eth0
      ip -netns host3 link set dev eth0 up
      
      ip netns add tunnel
      ip -netns tunnel li set lo up
      ip -netns tunnel li add br0 type bridge
      ip -netns tunnel li set br0 up
      for n in $(seq 11 20); do
              ip -netns tunnel addr add dev br0 192.168.247.${n}/24
      done
      
      for n in 2 3
      do
              ip -netns tunnel link add vti${n} type veth peer name eth${n}
              ip -netns tunnel link set eth${n} mtu 1360 master br0 up
              ip -netns tunnel link set vti${n} netns host${n} mtu 1360 up
              ip -netns host${n} addr add dev vti${n} 192.168.247.${n}/24
      done
      ip -netns tunnel ro add default nexthop via 192.168.247.2 nexthop via 192.168.247.3
      
      ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.11
      ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.15
      ip -netns host1 ro ls cache
      
      Before this patch the cache always shows exceptions against the first
      leg in the multipath route; 192.168.252.250 per this example. Since the
      hash has an initial random seed, you may need to vary the final octet
      more than what is listed. In my tests, using addresses between 11 and 19
      usually found 1 that used both legs.
      
      With this patch, the cache will have exceptions for both legs.
      
      Fixes: 4895c771 ("ipv4: Add FIB nexthop exceptions")
      Reported-by: Kfir Itzhak <mastertheknife@gmail.com>
      Signed-off-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: add __must_check to skb_put_padto() · f424617e
      Eric Dumazet authored
      [ Upstream commit 4a009cb0 ]
      
      skb_put_padto() and __skb_put_padto() callers
      must check return values or risk use-after-free.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
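The contract described above (on failure the buffer has already been freed) can be sketched in userspace. This is only an illustration, not the kernel code: `struct buf`, `buf_alloc()`, `buf_put_padto()` and `send_padded()` are hypothetical stand-ins, and the GCC/Clang attribute `warn_unused_result` is used here in the role the commit gives to `__must_check`.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for an skb: a heap buffer plus its length. */
struct buf {
	unsigned char *data;
	size_t len;
};

/* Allocate a zero-filled buffer of len bytes; returns NULL on failure. */
struct buf *buf_alloc(size_t len)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->data = calloc(len ? len : 1, 1);
	if (!b->data) {
		free(b);
		return NULL;
	}
	b->len = len;
	return b;
}

void buf_free(struct buf *b)
{
	free(b->data);
	free(b);
}

/*
 * Zero-pad b up to min_len bytes, refusing to grow past max_len.
 * On failure the buffer is freed, so the caller must not touch it again.
 * The attribute makes the compiler warn when the result is ignored.
 */
__attribute__((warn_unused_result))
int buf_put_padto(struct buf *b, size_t min_len, size_t max_len)
{
	unsigned char *p;

	if (b->len >= min_len)
		return 0;
	if (min_len > max_len) {
		buf_free(b);
		return -EINVAL;
	}
	p = realloc(b->data, min_len);
	if (!p) {
		buf_free(b);
		return -ENOMEM;
	}
	memset(p + b->len, 0, min_len - b->len);
	b->data = p;
	b->len = min_len;
	return 0;
}

/* A caller that honours the contract: on error, b is gone; just bail out. */
int send_padded(struct buf *b, size_t min_len, size_t max_len, size_t *sent)
{
	int err = buf_put_padto(b, min_len, max_len);

	if (err)
		return err;	/* b was already freed by buf_put_padto() */
	*sent = b->len;
	buf_free(b);
	return 0;
}
```

A caller that ignored the return value and went on to queue `b` would be touching freed memory, which is the class of bug the annotation is meant to surface at compile time.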
    • net: qrtr: check skb_put_padto() return value · 771443a2
      Eric Dumazet authored
      [ Upstream commit 3ca1a42a ]
      
      If skb_put_padto() returns an error, skb has been freed.
      Better not touch it anymore, as reported by syzbot [1]
      
      Note to qrtr maintainers : this suggests qrtr_sendmsg()
      should adjust sock_alloc_send_skb() second parameter
      to account for the potential added alignment to avoid
      reallocation.
      
      [1]
      
      BUG: KASAN: use-after-free in __skb_insert include/linux/skbuff.h:1907 [inline]
      BUG: KASAN: use-after-free in __skb_queue_before include/linux/skbuff.h:2016 [inline]
      BUG: KASAN: use-after-free in __skb_queue_tail include/linux/skbuff.h:2049 [inline]
      BUG: KASAN: use-after-free in skb_queue_tail+0x6b/0x120 net/core/skbuff.c:3146
      Write of size 8 at addr ffff88804d8ab3c0 by task syz-executor.4/4316
      
      CPU: 1 PID: 4316 Comm: syz-executor.4 Not tainted 5.9.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1d6/0x29e lib/dump_stack.c:118
       print_address_description+0x66/0x620 mm/kasan/report.c:383
       __kasan_report mm/kasan/report.c:513 [inline]
       kasan_report+0x132/0x1d0 mm/kasan/report.c:530
       __skb_insert include/linux/skbuff.h:1907 [inline]
       __skb_queue_before include/linux/skbuff.h:2016 [inline]
       __skb_queue_tail include/linux/skbuff.h:2049 [inline]
       skb_queue_tail+0x6b/0x120 net/core/skbuff.c:3146
       qrtr_tun_send+0x1a/0x40 net/qrtr/tun.c:23
       qrtr_node_enqueue+0x44f/0xc00 net/qrtr/qrtr.c:364
       qrtr_bcast_enqueue+0xbe/0x140 net/qrtr/qrtr.c:861
       qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg net/socket.c:671 [inline]
       sock_write_iter+0x317/0x470 net/socket.c:998
       call_write_iter include/linux/fs.h:1882 [inline]
       new_sync_write fs/read_write.c:503 [inline]
       vfs_write+0xa96/0xd10 fs/read_write.c:578
       ksys_write+0x11b/0x220 fs/read_write.c:631
       do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45d5b9
      Code: 5d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f84b5b81c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000038b40 RCX: 000000000045d5b9
      RDX: 0000000000000055 RSI: 0000000020001240 RDI: 0000000000000003
      RBP: 00007f84b5b81ca0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000f
      R13: 00007ffcbbf86daf R14: 00007f84b5b829c0 R15: 000000000118cf4c
      
      Allocated by task 4316:
       kasan_save_stack mm/kasan/common.c:48 [inline]
       kasan_set_track mm/kasan/common.c:56 [inline]
       __kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461
       slab_post_alloc_hook+0x3e/0x290 mm/slab.h:518
       slab_alloc mm/slab.c:3312 [inline]
       kmem_cache_alloc+0x1c1/0x2d0 mm/slab.c:3482
       skb_clone+0x1b2/0x370 net/core/skbuff.c:1449
       qrtr_bcast_enqueue+0x6d/0x140 net/qrtr/qrtr.c:857
       qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg net/socket.c:671 [inline]
       sock_write_iter+0x317/0x470 net/socket.c:998
       call_write_iter include/linux/fs.h:1882 [inline]
       new_sync_write fs/read_write.c:503 [inline]
       vfs_write+0xa96/0xd10 fs/read_write.c:578
       ksys_write+0x11b/0x220 fs/read_write.c:631
       do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Freed by task 4316:
       kasan_save_stack mm/kasan/common.c:48 [inline]
       kasan_set_track+0x3d/0x70 mm/kasan/common.c:56
       kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355
       __kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422
       __cache_free mm/slab.c:3418 [inline]
       kmem_cache_free+0x82/0xf0 mm/slab.c:3693
       __skb_pad+0x3f5/0x5a0 net/core/skbuff.c:1823
       __skb_put_padto include/linux/skbuff.h:3233 [inline]
       skb_put_padto include/linux/skbuff.h:3252 [inline]
       qrtr_node_enqueue+0x62f/0xc00 net/qrtr/qrtr.c:360
       qrtr_bcast_enqueue+0xbe/0x140 net/qrtr/qrtr.c:861
       qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg net/socket.c:671 [inline]
       sock_write_iter+0x317/0x470 net/socket.c:998
       call_write_iter include/linux/fs.h:1882 [inline]
       new_sync_write fs/read_write.c:503 [inline]
       vfs_write+0xa96/0xd10 fs/read_write.c:578
       ksys_write+0x11b/0x220 fs/read_write.c:631
       do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The buggy address belongs to the object at ffff88804d8ab3c0
       which belongs to the cache skbuff_head_cache of size 224
      The buggy address is located 0 bytes inside of
       224-byte region [ffff88804d8ab3c0, ffff88804d8ab4a0)
      The buggy address belongs to the page:
      page:00000000ea8cccfb refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88804d8abb40 pfn:0x4d8ab
      flags: 0xfffe0000000200(slab)
      raw: 00fffe0000000200 ffffea0002237ec8 ffffea00029b3388 ffff88821bb66800
      raw: ffff88804d8abb40 ffff88804d8ab000 000000010000000b 0000000000000000
      page dumped because: kasan: bad access detected
      
      Fixes: ce57785b ("net: qrtr: fix len of skb_put_padto in qrtr_node_enqueue")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Carl Huang <cjhuang@codeaurora.org>
      Cc: Wen Gong <wgong@codeaurora.org>
      Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
      Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: phy: Avoid NPD upon phy_detach() when driver is unbound · e9ee8b69
      Florian Fainelli authored
      [ Upstream commit c2b727df ]
      
      If we have unbound the PHY driver prior to calling phy_detach() (often
      via phy_disconnect()) then we can cause a NULL pointer de-reference
      accessing the driver owner member. The steps to reproduce are:
      
      echo unimac-mdio-0:01 > /sys/class/net/eth0/phydev/driver/unbind
      ip link set eth0 down
      
      Fixes: cafe8df8 ("net: phy: Fix lack of reference count on PHY driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bnxt_en: Protect bnxt_set_eee() and bnxt_set_pauseparam() with mutex. · ee0491c2
      Michael Chan authored
      [ Upstream commit a5390690 ]
      
      All changes related to bp->link_info require the protection of the
      link_lock mutex.  It's not sufficient to rely just on RTNL.
      
      Fixes: 163e9ef6 ("bnxt_en: Fix race when modifying pause settings.")
      Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bnxt_en: return proper error codes in bnxt_show_temp · 1627f932
      Edwin Peer authored
      [ Upstream commit d69753fa ]
      
      Returning "unknown" as a temperature value violates the hwmon interface
      rules. Appropriate error codes should be returned via device_attribute
      show instead. These will ultimately be propagated to the user via the
      file system interface.
      
      In addition to the corrected error handling, it is an even better idea to
      not present the sensor in sysfs at all if it is known that the read will
      definitely fail. Given that temp1_input is currently the only sensor
      reported, ensure no hwmon registration if TEMP_MONITOR_QUERY is not
      supported or if it will fail due to access permissions. Something smarter
      may be needed if and when other sensors are added.
      
      Fixes: 12cce90b ("bnxt_en: fix HWRM error when querying VF temperature")
      Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
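The error-handling rule described above (return an error code from the show callback rather than printing a placeholder string) might look roughly like this in a userspace sketch. `demo_show_temp()` and its parameters are hypothetical, not the driver's function; the only hwmon detail relied on is that temperature inputs are reported in millidegrees Celsius.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical show-style helper. query_rc is the result of the (simulated)
 * firmware query: 0 on success, a negative errno on failure. temp_c is the
 * temperature in whole degrees Celsius.
 *
 * Returns the number of bytes written to buf on success, or a negative errno
 * that the caller propagates instead of emitting text such as "unknown".
 */
int demo_show_temp(int query_rc, unsigned int temp_c, char *buf, size_t size)
{
	if (!buf || size == 0)
		return -EINVAL;
	if (query_rc)
		return query_rc;	/* propagate the failure, write nothing */

	/* hwmon temperature inputs are expressed in millidegrees Celsius */
	return snprintf(buf, size, "%u\n", temp_c * 1000u);
}
```

With this shape, a reader of the attribute sees either a valid number or a failed read, never a string that has to be parsed as a special case.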
    • tipc: use skb_unshare() instead in tipc_buf_append() · b15fcca8
      Xin Long authored
      [ Upstream commit ff48b622 ]
      
      tipc_buf_append() may change an skb's frag_list, which causes
      problems when the skb is cloned. skb_unclone() does not really
      make the skb's frag_list safe to change.
      
      Shuang Li has reported a use-after-free issue because of this
      when creating quite a few macvlan devices over the same device, where
      the broadcast packets will be cloned and go up to the stack:
      
       [ ] BUG: KASAN: use-after-free in pskb_expand_head+0x86d/0xea0
       [ ] Call Trace:
       [ ]  dump_stack+0x7c/0xb0
       [ ]  print_address_description.constprop.7+0x1a/0x220
       [ ]  kasan_report.cold.10+0x37/0x7c
       [ ]  check_memory_region+0x183/0x1e0
       [ ]  pskb_expand_head+0x86d/0xea0
       [ ]  process_backlog+0x1df/0x660
       [ ]  net_rx_action+0x3b4/0xc90
       [ ]
       [ ] Allocated by task 1786:
       [ ]  kmem_cache_alloc+0xbf/0x220
       [ ]  skb_clone+0x10a/0x300
       [ ]  macvlan_broadcast+0x2f6/0x590 [macvlan]
       [ ]  macvlan_process_broadcast+0x37c/0x516 [macvlan]
       [ ]  process_one_work+0x66a/0x1060
       [ ]  worker_thread+0x87/0xb10
       [ ]
       [ ] Freed by task 3253:
       [ ]  kmem_cache_free+0x82/0x2a0
       [ ]  skb_release_data+0x2c3/0x6e0
       [ ]  kfree_skb+0x78/0x1d0
       [ ]  tipc_recvmsg+0x3be/0xa40 [tipc]
      
      So fix it by using skb_unshare() instead, which creates a new
      skb for the cloned frag, so that it is safe to change its frag_list.
      A similar approach is taken in sctp_make_reassembled_event(),
      which uses skb_copy().
      
      Reported-by: Shuang Li <shuali@redhat.com>
      Fixes: 37e22164 ("tipc: rename and move message reassembly function")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tipc: fix shutdown() of connection oriented socket · 0183a74c
      Tetsuo Handa authored
      [ Upstream commit a4b5cc9e ]
      
      I confirmed that the problem fixed by commit 2a63866c ("tipc: fix
      shutdown() of connectionless socket") also applies to stream socket.
      
      ----------
      #include <sys/socket.h>
      #include <unistd.h>
      #include <sys/wait.h>
      
      int main(int argc, char *argv[])
      {
              int fds[2] = { -1, -1 };
              socketpair(PF_TIPC, SOCK_STREAM /* or SOCK_DGRAM */, 0, fds);
              if (fork() == 0)
                      _exit(read(fds[0], NULL, 1));
              shutdown(fds[0], SHUT_RDWR); /* This must make read() return. */
              wait(NULL); /* To be woken up by _exit(). */
              return 0;
      }
      ----------
      
      Since shutdown(SHUT_RDWR) should affect all processes sharing that socket,
      unconditionally setting sk->sk_shutdown to SHUTDOWN_MASK will be the right
      behavior.
      
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tipc: Fix memory leak in tipc_group_create_member() · d82e08de
      Peilin Ye authored
      [ Upstream commit bb3a420d ]
      
      tipc_group_add_to_tree() returns silently if `key` matches `nkey` of an
      existing node, causing tipc_group_create_member() to leak memory. Let
      tipc_group_add_to_tree() return an error in such a case, so that
      tipc_group_create_member() can handle it properly.
      
      Fixes: 75da2163 ("tipc: introduce communication groups")
      Reported-and-tested-by: syzbot+f95d90c454864b3b5bc9@syzkaller.appspotmail.com
      Cc: Hillf Danton <hdanton@sina.com>
      Link: https://syzkaller.appspot.com/bug?id=048390604fe1b60df34150265479202f10e13aff
      Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
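The leak pattern described above can be reproduced in miniature. In this sketch the tree insert reports a duplicate key, so the creating function can free the node it just allocated. All names (`demo_member`, `demo_add_to_tree()`, `demo_create_member()`) are hypothetical, a plain unbalanced binary tree stands in for the kernel's tree, and `-EEXIST` is an assumed choice of error code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical group member, keyed by a 64-bit value. */
struct demo_member {
	uint64_t key;
	struct demo_member *left, *right;
};

/*
 * Insert m into the tree rooted at *root.
 * Returns 0 on success, or -EEXIST if a node with the same key is already
 * present. Returning silently in that case would hide the failure from the
 * caller, which is how the allocation ends up leaked.
 */
int demo_add_to_tree(struct demo_member **root, struct demo_member *m)
{
	struct demo_member **link = root;

	while (*link) {
		if (m->key < (*link)->key)
			link = &(*link)->left;
		else if (m->key > (*link)->key)
			link = &(*link)->right;
		else
			return -EEXIST;
	}
	m->left = NULL;
	m->right = NULL;
	*link = m;
	return 0;
}

/* Allocate a member and insert it; free it again if the insert fails. */
struct demo_member *demo_create_member(struct demo_member **root, uint64_t key)
{
	struct demo_member *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->key = key;
	if (demo_add_to_tree(root, m)) {
		free(m);	/* duplicate key: nothing references m, so release it */
		return NULL;
	}
	return m;
}
```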
    • nfp: use correct define to return NONE fec · d4c5a31a
      Jakub Kicinski authored
      [ Upstream commit 5f6857e8 ]
      
      struct ethtool_fecparam carries bitmasks not bit numbers.
      We want to return 1 (NONE), not 0.
      
      Fixes: 0d087093 ("nfp: implement ethtool FEC mode settings")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
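The difference between a bit number and a bitmask, which is the whole of this fix, is easy to show in isolation. The `DEMO_` names below are a local, hypothetical mirror of the idea and not the ethtool definitions; the only facts taken from the commit message are that the NONE bit number is 0 and the NONE mask value is 1.

```c
#include <stdint.h>

/* Bit positions (hypothetical local mirror; NONE is bit number 0). */
enum demo_fec_bits {
	DEMO_FEC_NONE_BIT,
	DEMO_FEC_AUTO_BIT,
	DEMO_FEC_OFF_BIT,
};

/* Mask values derived from the bit positions. */
#define DEMO_FEC_NONE	(1u << DEMO_FEC_NONE_BIT)	/* == 1 */
#define DEMO_FEC_AUTO	(1u << DEMO_FEC_AUTO_BIT)	/* == 2 */
#define DEMO_FEC_OFF	(1u << DEMO_FEC_OFF_BIT)	/* == 4 */

/* Buggy variant: reports the bit *number*, i.e. an empty mask (0). */
uint32_t demo_fec_none_wrong(void)
{
	return DEMO_FEC_NONE_BIT;
}

/* Fixed variant: reports the bit*mask* with the NONE bit set (1). */
uint32_t demo_fec_none_right(void)
{
	return DEMO_FEC_NONE;
}

/* True if the given bit position is set in mask. */
int demo_fec_has_mode(uint32_t mask, unsigned int bit)
{
	return (mask >> bit) & 1u;
}
```

A consumer that tests the mask for the NONE bit finds it only in the fixed variant; the buggy one looks like "no mode reported at all".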
    • net: sch_generic: avoid concurrent reset and enqueue op for lockless qdisc · 749cc0b0
      Yunsheng Lin authored
      [ Upstream commit 2fb541c8 ]
      
      Currently a reset and an enqueue operation can run concurrently on
      the same lockless qdisc, because no lock synchronizes q->enqueue()
      in __dev_xmit_skb() with the qdisc reset in qdisc_deactivate(),
      which is called by dev_deactivate_queue(). This may cause an
      out-of-bounds access to priv->ring[] in the hns3 driver: if the user
      has requested a smaller queue number, __dev_xmit_skb() can still
      enqueue an skb with a larger queue_mapping after the corresponding
      qdisc is reset, and hns3_nic_net_xmit() is later called with that skb.
      
      Reuse the existing synchronize_net() in dev_deactivate_many() to
      make sure that any skb with a larger queue_mapping enqueued to the
      old qdisc (which is saved in dev_queue->qdisc_sleeping) is always
      reset when dev_reset_queue() is called.
      
      Fixes: 6b3ba914 ("net: sched: allow qdiscs to handle locking")
      Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMAC · fe916542
      Necip Fazil Yildiran authored
      [ Upstream commit db7cd91a ]
      
      When IPV6_SEG6_HMAC is enabled and CRYPTO is disabled, it results in the
      following Kbuild warning:
      
      WARNING: unmet direct dependencies detected for CRYPTO_HMAC
        Depends on [n]: CRYPTO [=n]
        Selected by [y]:
        - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y]
      
      WARNING: unmet direct dependencies detected for CRYPTO_SHA1
        Depends on [n]: CRYPTO [=n]
        Selected by [y]:
        - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y]
      
      WARNING: unmet direct dependencies detected for CRYPTO_SHA256
        Depends on [n]: CRYPTO [=n]
        Selected by [y]:
        - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y]
      
      The reason is that IPV6_SEG6_HMAC selects CRYPTO_HMAC, CRYPTO_SHA1, and
      CRYPTO_SHA256 without depending on or selecting CRYPTO while those configs
      are subordinate to CRYPTO.
      
      Honor the kconfig menu hierarchy to remove kconfig dependency warnings.
      
      Fixes: bf355b8d ("ipv6: sr: add core files for SR HMAC support")
      Signed-off-by: Necip Fazil Yildiran <fazilyildiran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: dsa: rtl8366: Properly clear member config · 76fde30c
      Linus Walleij authored
      [ Upstream commit 4ddcaf1e ]
      
      When removing a port from a VLAN we are just erasing the
      member config for the VLAN, which is wrong: other ports
      can be using it.
      
      Just mask off the port and only zero out the rest of the
      member config once all ports using the VLAN are removed
      from it.
      
      Reported-by: Florian Fainelli <f.fainelli@gmail.com>
      Fixes: d8652956 ("net: dsa: realtek-smi: Add Realtek SMI driver")
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
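The removal logic described above comes down to a small bitmask operation. The sketch below uses a hypothetical, simplified `struct demo_vlan_mc` (the field set is an assumption, not the driver's layout) and assumes port numbers below 32.

```c
#include <stdint.h>

/* Hypothetical, simplified VLAN member config. */
struct demo_vlan_mc {
	uint16_t vid;
	uint8_t priority;
	uint8_t fid;
	uint32_t member;	/* bitmask of member ports */
	uint32_t untag;		/* bitmask of ports that egress untagged */
};

/*
 * Remove one port from the VLAN. Only this port's bits are masked off;
 * the rest of the config is cleared once no member ports remain, so other
 * ports that still use the VLAN keep working.
 */
void demo_vlan_del_port(struct demo_vlan_mc *mc, unsigned int port)
{
	uint32_t bit = UINT32_C(1) << port;	/* assumes port < 32 */

	mc->member &= ~bit;
	mc->untag &= ~bit;

	if (!mc->member) {
		mc->vid = 0;
		mc->priority = 0;
		mc->fid = 0;
		mc->untag = 0;
	}
}
```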
    • net: DCB: Validate DCB_ATTR_DCB_BUFFER argument · d0c2f725
      Petr Machata authored
      [ Upstream commit 297e77e5 ]
      
      The parameter passed via DCB_ATTR_DCB_BUFFER is a struct dcbnl_buffer. The
      field prio2buffer is an array of IEEE_8021Q_MAX_PRIORITIES bytes, where
      each value is a number of a buffer to direct that priority's traffic to.
      That value is however never validated to lie within the bounds set by
      DCBX_MAX_BUFFERS. The only driver that currently implements the callback is
      mlx5 (maintainers CCd), and that does not do any validation either, in
      particular allowing incorrect configuration if the prio2buffer value does
      not fit into 4 bits.
      
      Instead of offloading the need to validate the buffer index to drivers, do
      it right there in core, and bounce the request if the value is too large.
      
      CC: Parav Pandit <parav@nvidia.com>
      CC: Saeed Mahameed <saeedm@nvidia.com>
      Fixes: e549f6f9 ("net/dcb: Add dcbnl buffer attribute")
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
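The core-side check described above amounts to a bounds test on every entry of the priority-to-buffer map. In this sketch, `struct demo_dcbnl_buffer` and the two `DEMO_` limits are hypothetical stand-ins; both limits are assumed to be 8 purely for illustration.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_PRIORITIES	8	/* assumed number of priorities */
#define DEMO_MAX_BUFFERS	8	/* assumed number of buffers */

/* Hypothetical, reduced version of the request: one buffer index per priority. */
struct demo_dcbnl_buffer {
	uint8_t prio2buffer[DEMO_MAX_PRIORITIES];
};

/*
 * Reject the request if any priority is mapped to a buffer index that does
 * not exist. Returns 0 if every entry is in range, -EINVAL otherwise.
 */
int demo_validate_buffer(const struct demo_dcbnl_buffer *buf)
{
	for (int prio = 0; prio < DEMO_MAX_PRIORITIES; prio++) {
		if (buf->prio2buffer[prio] >= DEMO_MAX_BUFFERS)
			return -EINVAL;
	}
	return 0;
}
```

Doing the check once, before any driver callback runs, means each driver can index its own tables with these values without repeating the test.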
    • ipv6: avoid lockdep issue in fib6_del() · f2e5359d
      Eric Dumazet authored
      [ Upstream commit 843d926b ]
      
      syzbot twice reported a lockdep issue in fib6_del() [1],
      which I think is caused by net->ipv6.fib6_null_entry
      having a NULL fib6_table pointer.
      
      fib6_del() already checks for fib6_null_entry special
      case, we only need to return earlier.
      
      Bug seems to occur very rarely, I have thus chosen
      a 'bug origin' that makes backports not too complex.
      
      [1]
      WARNING: suspicious RCU usage
      5.9.0-rc4-syzkaller #0 Not tainted
      -----------------------------
      net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      4 locks held by syz-executor.5/8095:
       #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401
       #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline]
       #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312
       #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613
       #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline]
       #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245
      
      stack backtrace:
      CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x198/0x1fd lib/dump_stack.c:118
       fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996
       fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180
       fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102
       fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150
       fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230
       __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246
       fib6_clean_all net/ipv6/ip6_fib.c:2257 [inline]
       fib6_run_gc+0x113/0x2d0 net/ipv6/ip6_fib.c:2320
       ndisc_netdev_event+0x217/0x350 net/ipv6/ndisc.c:1805
       notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
       call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2033
       call_netdevice_notifiers_extack net/core/dev.c:2045 [inline]
       call_netdevice_notifiers net/core/dev.c:2059 [inline]
       dev_close_many+0x30b/0x650 net/core/dev.c:1634
       rollback_registered_many+0x3a8/0x1210 net/core/dev.c:9261
       rollback_registered net/core/dev.c:9329 [inline]
       unregister_netdevice_queue+0x2dd/0x570 net/core/dev.c:10410
       unregister_netdevice include/linux/netdevice.h:2774 [inline]
       ppp_release+0x216/0x240 drivers/net/ppp/ppp_generic.c:403
       __fput+0x285/0x920 fs/file_table.c:281
       task_work_run+0xdd/0x190 kernel/task_work.c:141
       tracehook_notify_resume include/linux/tracehook.h:188 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:163 [inline]
       exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190
       syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 421842ed ("net/ipv6: Add fib6_null_entry")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ip: fix tos reflection in ack and reset packets · 2fc322bf
      Wei Wang authored
      [ Upstream commit ba9e04a7 ]
      
      Currently, in tcp_v4_reqsk_send_ack() and tcp_v4_send_reset(), we
      echo the TOS value of the received packets in the response.
      However, we do not want to echo the lower 2 ECN bits in accordance
      with RFC 3168 6.1.5 robustness principles.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Wei Wang <weiwan@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
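Reflecting the TOS byte without its ECN bits is a single mask operation. The helper below is a hypothetical illustration, not the kernel change itself; it relies only on the fact that ECN occupies the two low-order bits of the TOS byte.

```c
#include <stdint.h>

/* The ECN field is the two low-order bits of the IPv4 TOS byte. */
#define DEMO_ECN_MASK	0x03u

/*
 * Compute the TOS value to place in a reply: keep the upper six bits
 * of the received TOS and clear the two ECN bits.
 */
uint8_t demo_reply_tos(uint8_t rx_tos)
{
	return (uint8_t)(rx_tos & ~DEMO_ECN_MASK);
}
```

For example, a received TOS of 0xB9 (upper bits 0xB8 plus ECN bits 01) would be echoed as 0xB8.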
    • hdlc_ppp: add range checks in ppp_cp_parse_cr() · 45676c0b
      Dan Carpenter authored
      [ Upstream commit 66d42ed8 ]
      
      There are a couple of bugs here:
      1) If opt[1] is zero then this results in an infinite loop.  If the
         value is less than 2 then it is invalid.
      2) It assumes that "len" is more than sizeof(valid_accm) or 6 which can
         result in memory corruption.
      
      In the case of LCP_OPTION_ACCM, we should check "opt[1]" instead
      of "len" because, if "opt[1]" is less than sizeof(valid_accm) then
      "nak_len" gets out of sync and it can lead to memory corruption in the
      next iterations through the loop.  In case of LCP_OPTION_MAGIC, the
      only valid value for opt[1] is 6, but the code is trying to log invalid
      data so we should only discard the data when "len" is less than 6
      because that leads to a read overflow.
      
      Reported-by: ChenNan Of Chaitin Security Research Lab <whutchennan@gmail.com>
      Fixes: e022c2f0 ("WAN: new synchronous PPP implementation for generic HDLC.")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
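The first bug listed above (a zero length byte stalls the loop) and the general need to bound each option by the remaining data can be illustrated with a generic type/length/value walker. This is a simplified sketch under the assumption that opt[0] is the option type and opt[1] is the total option length including the two header bytes; `demo_count_options()` is hypothetical and does not reproduce the per-option handling in the driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a buffer of options laid out as: type (1 byte), length (1 byte,
 * counting the two header bytes), then length - 2 bytes of data.
 *
 * Returns the number of well-formed options, or -1 if the buffer is
 * malformed. The three checks guarantee forward progress and keep every
 * read inside the buffer:
 *   len < 2      : not enough bytes left for a header
 *   opt[1] < 2   : length smaller than the header (0 would loop forever)
 *   opt[1] > len : option claims more bytes than remain
 */
int demo_count_options(const uint8_t *opt, size_t len)
{
	int count = 0;

	while (len > 0) {
		if (len < 2 || opt[1] < 2 || opt[1] > len)
			return -1;
		count++;
		len -= opt[1];
		opt += opt[1];
	}
	return count;
}
```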
    • geneve: add transport ports in route lookup for geneve · c797110d
      Mark Gray authored
      [ Upstream commit 34beb215 ]
      
      This patch adds transport ports information for route lookup so that
      IPsec can select Geneve tunnel traffic to do encryption. This is
      needed for OVS/OVN IPsec with encrypted Geneve tunnels.
      
      This can be tested by configuring a host-host VPN using an IKE
      daemon and specifying port numbers. For example, for an
      Openswan-type configuration, the following parameters should be
      configured on both hosts and IPsec set up as per normal:
      
      $ cat /etc/ipsec.conf
      
      conn in
      ...
      left=$IP1
      right=$IP2
      ...
      leftprotoport=udp/6081
      rightprotoport=udp
      ...
      conn out
      ...
      left=$IP1
      right=$IP2
      ...
      leftprotoport=udp
      rightprotoport=udp/6081
      ...
      
      The tunnel can then be set up using "ip" on both hosts (but
      changing the relevant IP addresses):
      
      $ ip link add tun type geneve id 1000 remote $IP2
      $ ip addr add 192.168.0.1/24 dev tun
      $ ip link set tun up
      
      This can then be tested by pinging from $IP1:
      
      $ ping 192.168.0.2
      
      Without this patch the traffic is unencrypted on the wire.
      
      Fixes: 2d07dc79 ("geneve: add initial netdev driver for GENEVE tunnels")
      Signed-off-by: Qiuyu Xiao <qiuyu.xiao.qyx@gmail.com>
      Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
      Reviewed-by: Greg Rose <gvrose8192@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c797110d
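Why the ports matter for policy selection can be shown with a simplified model. The sketch below is not the kernel's xfrm code: `struct flow_key`, `struct selector`, and `selector_matches` are hypothetical names, and treating port 0 as a wildcard is an assumption of this model. It shows that a selector such as udp/6081 can only match a flow whose lookup key actually carries the ports.

```c
#include <stdbool.h>
#include <stdint.h>

#define PROTO_UDP 17

/* Simplified flow key; the ports stay 0 when the caller does not fill them in. */
struct flow_key {
    uint8_t  proto;
    uint16_t sport;
    uint16_t dport;
};

/* Simplified policy selector; in this model a port of 0 means "any port". */
struct selector {
    uint8_t  proto;
    uint16_t sport;
    uint16_t dport;
};

/* Return true when the flow falls under the selector. */
bool selector_matches(const struct selector *sel, const struct flow_key *fl)
{
    if (sel->proto != fl->proto)
        return false;
    if (sel->sport != 0 && sel->sport != fl->sport)
        return false;
    if (sel->dport != 0 && sel->dport != fl->dport)
        return false;
    return true;
}
```

With a selector of `{PROTO_UDP, 0, 6081}`, a key of `{PROTO_UDP, 0, 0}` does not match, while `{PROTO_UDP, 12345, 6081}` does. In this model, that corresponds to the unencrypted traffic the commit message describes when the ports are missing from the lookup.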
    • cxgb4: Fix offset when clearing filter byte counters · 35145dab
      Ganji Aravind authored
      [ Upstream commit 94cc242a ]
      
      Pass the correct offset to clear the stale filter hit
      bytes counter. Otherwise, the counter starts incrementing
      from the stale information, instead of 0.
      
      Fixes: 12b276fb ("cxgb4: add support to create hash filters")
      Signed-off-by: Ganji Aravind <ganji.aravind@chelsio.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      35145dab
    • mm/thp: fix __split_huge_pmd_locked() for migration PMD · ec56646e
      Ralph Campbell authored
      [ Upstream commit ec0abae6 ]
      
      A migrating transparent huge page has to already be unmapped.  Otherwise,
      the page could be modified while it is being copied to a new page and data
      could be lost.  The function __split_huge_pmd() checks for a PMD migration
      entry before calling __split_huge_pmd_locked() leading one to think that
      __split_huge_pmd_locked() can handle splitting a migrating PMD.
      
      However, the code always increments the page->_mapcount and adjusts the
      memory control group accounting assuming the page is mapped.
      
      Also, if the PMD entry is a migration PMD entry, the call to
      is_huge_zero_pmd(*pmd) is incorrect because it calls pmd_pfn(pmd) instead
      of migration_entry_to_pfn(pmd_to_swp_entry(pmd)).  Fix these problems by
      checking for a PMD migration entry.
      
      Fixes: 84c3fc4e ("mm: thp: check pmd migration entry in common path")
      Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Yang Shi <shy828301@gmail.com>
      Reviewed-by: Zi Yan <ziy@nvidia.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: Bharata B Rao <bharata@linux.ibm.com>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: <stable@vger.kernel.org>	[4.14+]
      Link: https://lkml.kernel.org/r/20200903183140.19055-1-rcampbell@nvidia.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ec56646e
    • kprobes: fix kill kprobe which has been marked as gone · d44a4378
      Muchun Song authored
      [ Upstream commit b0399092 ]
      
      If a kprobe is marked as gone, we should not kill it again.  Otherwise, we
      can disarm the kprobe more than once.  In that case, the
      kprobe_ftrace_enabled count can become unbalanced, which can leave
      kprobes unable to work.
      
      Fixes: e8386a0c ("kprobes: support probing module __exit function")
      Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com>
      Signed-off-by: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20200822030055.32383-1-songmuchun@bytedance.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d44a4378
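The idea of not killing an already-gone probe can be sketched as an idempotent teardown guarded by a flag. This is a simplified model, not the kprobes implementation: `struct probe`, `armed_count`, `arm_probe`, and `kill_probe` are hypothetical stand-ins, with `armed_count` playing the role of a global enable counter.

```c
#include <stdbool.h>

/* Hypothetical probe object for this sketch. */
struct probe {
    bool gone;   /* set once the probe has been killed */
    bool armed;  /* set while the probe contributes to armed_count */
};

/* Stand-in for a global counter of armed probes. */
int armed_count;

/* Arm the probe once and account for it in the counter. */
void arm_probe(struct probe *p)
{
    if (!p->armed) {
        p->armed = true;
        armed_count++;
    }
}

/* Kill the probe; a second call on the same probe is a no-op. */
void kill_probe(struct probe *p)
{
    if (p->gone)
        return;  /* already killed: do not disarm a second time */

    p->gone = true;
    p->armed = false;
    armed_count--;
}
```

Without the early return, calling `kill_probe()` twice on one probe would decrement the counter twice, leaving it out of balance with the number of probes that are really armed.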
    • KVM: fix memory leak in kvm_io_bus_unregister_dev() · 19184bd0
      Rustam Kovhaev authored
      [ Upstream commit f6588660 ]
      
      When kmalloc() fails in kvm_io_bus_unregister_dev(), before removing
      the bus, we should iterate over all other devices linked to it and call
      kvm_iodevice_destructor() for them.
      
      Fixes: 90db1043 ("KVM: kvm_io_bus_unregister_dev() should never fail")
      Cc: stable@vger.kernel.org
      Reported-and-tested-by: syzbot+f196caa45793d6374707@syzkaller.appspotmail.com
      Link: https://syzkaller.appspot.com/bug?extid=f196caa45793d6374707
      Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com>
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200907185535.233114-1-rkovhaev@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      19184bd0
    • af_key: pfkey_dump needs parameter validation · b59a23d5
      Mark Salyzyn authored
      commit 37bd2242 upstream.
      
      In pfkey_dump(), dplen and splen can both be specified to access the
      xfrm_address_t structure out of bounds in __xfrm_state_filter_match()
      when it calls addr_match() with the indexes.  Return EINVAL if either
      is out of range.

      Signed-off-by: Mark Salyzyn <salyzyn@android.com>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: kernel-team@android.com
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b59a23d5
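The kind of validation described above can be sketched as a bounds check on the two prefix lengths before they are used to index into an address. This is an illustration under assumptions, not the patch itself: `union addr` is a stand-in for a 16-byte address structure, `check_prefix_lens` is a hypothetical helper, and the exact boundary the real code uses is not taken from the source.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-in for an address union large enough to hold an IPv6 address. */
union addr {
    uint32_t a4;
    uint32_t a6[4];
};

/* Number of bits available in the address storage. */
#define ADDR_BITS (sizeof(union addr) * 8)

/*
 * Validate source and destination prefix lengths (in bits).
 * Returns 0 when both fit in the address storage, -EINVAL otherwise.
 */
int check_prefix_lens(unsigned int splen, unsigned int dplen)
{
    if (splen > ADDR_BITS || dplen > ADDR_BITS)
        return -EINVAL;
    return 0;
}
```

Rejecting the request up front means later code that compares the first `splen` or `dplen` bits of an address never reads past the end of the structure.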
  2. 23 Sep, 2020 17 commits