1. 19 May, 2018 27 commits
    • Jiri Slaby's avatar
      futex: Remove duplicated code and fix undefined behaviour · 81da9f87
      Jiri Slaby authored
      commit 30d6e0a4 upstream.
      
      There is code duplicated over all architecture's headers for
      futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
      and comparison of the result.
      
      Remove this duplication and leave up to the arches only the needed
      assembly which is now in arch_futex_atomic_op_inuser.
      
      This effectively distributes the Will Deacon's arm64 fix for undefined
      behaviour reported by UBSAN to all architectures. The fix was done in
      commit 5f16a046 (arm64: futex: Fix undefined behaviour with
      FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.
      
      And as suggested by Thomas, check for negative oparg too, because it was
      also reported to cause undefined behaviour report.
      
      Note that s390 removed access_ok check in d12a2970 ("s390/uaccess:
      remove pointless access_ok() checks") as access_ok there returns true.
      We introduce it back to the helper for the sake of simplicity (it gets
      optimized away anyway).
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390]
      Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
      Reviewed-by: default avatarDarren Hart (VMware) <dvhart@infradead.org>
      Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64]
      Cc: linux-mips@linux-mips.org
      Cc: Rich Felker <dalias@libc.org>
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: peterz@infradead.org
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: sparclinux@vger.kernel.org
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-hexagon@vger.kernel.org
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-xtensa@linux-xtensa.org
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: openrisc@lists.librecores.org
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-parisc@vger.kernel.org
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: linux-alpha@vger.kernel.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
      Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      81da9f87
    • Alexey Khoroshilov's avatar
      serial: sccnxp: Fix error handling in sccnxp_probe() · 8c5e7b07
      Alexey Khoroshilov authored
      commit c9126143 upstream.
      
      sccnxp_probe() returns result of regulator_disable() that may lead
      to returning zero, while device is not properly initialized.
      Also the driver enables clocks, but it does not disable it.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: default avatarAlexey Khoroshilov <khoroshilov@ispras.ru>
      Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8c5e7b07
    • Xin Long's avatar
      sctp: delay the authentication for the duplicated cookie-echo chunk · 0e67ad52
      Xin Long authored
      [ Upstream commit 59d8d443 ]
      
      Now sctp only delays the authentication for the normal cookie-echo
      chunk by setting chunk->auth_chunk in sctp_endpoint_bh_rcv(). But
      for the duplicated one with auth, in sctp_assoc_bh_rcv(), it does
      authentication first based on the old asoc, which will definitely
      fail due to the different auth info in the old asoc.
      
      The duplicated cookie-echo chunk will create a new asoc with the
      auth info from this chunk, and the authentication should also be
      done with the new asoc's auth info for all of the collision 'A',
      'B' and 'D'. Otherwise, the duplicated cookie-echo chunk with auth
      will never pass the authentication and create the new connection.
      
      This issue exists since very beginning, and this fix is to make
      sctp_assoc_bh_rcv() follow the way sctp_endpoint_bh_rcv() does
      for the normal cookie-echo chunk to delay the authentication.
      
      While at it, remove the unused params from sctp_sf_authenticate()
      and define sctp_auth_chunk_verify() used for all the places that
      do the delayed authentication.
      
      v1->v2:
        fix the typo in changelog as Marcelo noticed.
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0e67ad52
    • Xin Long's avatar
      sctp: fix the issue that the cookie-ack with auth can't get processed · db869e7d
      Xin Long authored
      [ Upstream commit ce402f04 ]
      
      When auth is enabled for cookie-ack chunk, in sctp_inq_pop, sctp
      processes auth chunk first, then continues to the next chunk in
      this packet if chunk_end + chunk_hdr size < skb_tail_pointer().
      Otherwise, it will go to the next packet or discard this chunk.
      
      However, it missed the fact that cookie-ack chunk's size is equal
      to chunk_hdr size, which couldn't match that check, and thus this
      chunk would not get processed.
      
      This patch fixes it by changing the check to chunk_end + chunk_hdr
      size <= skb_tail_pointer().
      
      Fixes: 26b87c78 ("net: sctp: fix remote memory pressure from excessive queueing")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db869e7d
    • Yuchung Cheng's avatar
      tcp: ignore Fast Open on repair mode · 832978fc
      Yuchung Cheng authored
      [ Upstream commit 16ae6aa1 ]
      
      The TCP repair sequence of operation is to first set the socket in
      repair mode, then inject the TCP stats into the socket with repair
      socket options, then call connect() to re-activate the socket. The
      connect syscall simply returns and set state to ESTABLISHED
      mode. As a result Fast Open is meaningless for TCP repair.
      
      However allowing sendto() system call with MSG_FASTOPEN flag half-way
      during the repair operation could unexpectedly cause data to be
      sent, before the operation finishes changing the internal TCP stats
      (e.g. MSS).  This in turn triggers TCP warnings on inconsistent
      packet accounting.
      
      The fix is to simply disallow Fast Open operation once the socket
      is in the repair mode.
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Reviewed-by: default avatarNeal Cardwell <ncardwell@google.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      832978fc
    • Debabrata Banerjee's avatar
      bonding: send learning packets for vlans on slave · d7bfa99f
      Debabrata Banerjee authored
      [ Upstream commit 21706ee8 ]
      
      There was a regression at some point from the intended functionality of
      commit f60c3704 ("bonding: Fix alb mode to only use first level
      vlans.")
      
      Given the return value vlan_get_encap_level() we need to store the nest
      level of the bond device, and then compare the vlan's encap level to
      this. Without this, this check always fails and learning packets are
      never sent.
      
      In addition, this same commit caused a regression in the behavior of
      balance_alb, which requires learning packets be sent for all interfaces
      using the slave's mac in order to load balance properly. For vlan's
      that have not set a user mac, we can send after checking one bit.
      Otherwise we need send the set mac, albeit defeating rx load balancing
      for that vlan.
      Signed-off-by: default avatarDebabrata Banerjee <dbanerje@akamai.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d7bfa99f
    • Talat Batheesh's avatar
      net/mlx5: Avoid cleaning flow steering table twice during error flow · 8274cb81
      Talat Batheesh authored
      [ Upstream commit 9c26f5f8 ]
      
      When we fail to initialize the RX root namespace, we need
      to clean only that and not the entire flow steering.
      
      Currently the code may try to clean the flow steering twice
      on error witch leads to null pointer deference.
      Make sure we clean correctly.
      
      Fixes: fba53f7b ("net/mlx5: Introduce mlx5_flow_steering structure")
      Signed-off-by: default avatarTalat Batheesh <talatb@mellanox.com>
      Reviewed-by: default avatarMark Bloch <markb@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8274cb81
    • Debabrata Banerjee's avatar
      bonding: do not allow rlb updates to invalid mac · 89f502a4
      Debabrata Banerjee authored
      [ Upstream commit 4fa8667c ]
      
      Make sure multicast, broadcast, and zero mac's cannot be the output of rlb
      updates, which should all be directed arps. Receive load balancing will be
      collapsed if any of these happen, as the switch will broadcast.
      Signed-off-by: default avatarDebabrata Banerjee <dbanerje@akamai.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      89f502a4
    • Michael Chan's avatar
      tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent(). · d9793200
      Michael Chan authored
      [ Upstream commit d89a2adb ]
      
      tg3_free_consistent() calls dma_free_coherent() to free tp->hw_stats
      under spinlock and can trigger BUG_ON() in vunmap() because vunmap()
      may sleep.  Fix it by removing the spinlock and relying on the
      TG3_FLAG_INIT_COMPLETE flag to prevent race conditions between
      tg3_get_stats64() and tg3_free_consistent().  TG3_FLAG_INIT_COMPLETE
      is always cleared under tp->lock before tg3_free_consistent()
      and therefore tg3_get_stats64() can safely access tp->hw_stats
      under tp->lock if TG3_FLAG_INIT_COMPLETE is set.
      
      Fixes: f5992b72 ("tg3: Fix race condition in tg3_get_stats64().")
      Reported-by: default avatarZumeng Chen <zumeng.chen@gmail.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9793200
    • Neal Cardwell's avatar
      tcp_bbr: fix to zero idle_restart only upon S/ACKed data · a9d361cf
      Neal Cardwell authored
      [ Upstream commit e6e6a278 ]
      
      Previously the bbr->idle_restart tracking was zeroing out the
      bbr->idle_restart bit upon ACKs that did not SACK or ACK anything,
      e.g. receiving incoming data or receiver window updates. In such
      situations BBR would forget that this was a restart-from-idle
      situation, and if the min_rtt had expired it would unnecessarily enter
      PROBE_RTT (even though we were actually restarting from idle but had
      merely forgotten that fact).
      
      The fix is simple: we need to remember we are restarting from idle
      until we receive a S/ACK for some data (a S/ACK for the first flight
      of data we send as we are restarting).
      
      This commit is a stable candidate for kernels back as far as 4.9.
      
      Fixes: 0f8782ea ("tcp_bbr: add BBR congestion control")
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarPriyaranjan Jha <priyarjha@google.com>
      Signed-off-by: default avatarYousuk Seung <ysseung@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a9d361cf
    • Xin Long's avatar
      sctp: use the old asoc when making the cookie-ack chunk in dupcook_d · c832ac45
      Xin Long authored
      [ Upstream commit 46e16d4b ]
      
      When processing a duplicate cookie-echo chunk, for case 'D', sctp will
      not process the param from this chunk. It means old asoc has nothing
      to be updated, and the new temp asoc doesn't have the complete info.
      
      So there's no reason to use the new asoc when creating the cookie-ack
      chunk. Otherwise, like when auth is enabled for cookie-ack, the chunk
      can not be set with auth, and it will definitely be dropped by peer.
      
      This issue is there since very beginning, and we fix it by using the
      old asoc instead.
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c832ac45
    • Xin Long's avatar
      sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg · 1f2b77e8
      Xin Long authored
      [ Upstream commit 6910e25d ]
      
      In Commit 1f45f78f ("sctp: allow GSO frags to access the chunk too"),
      it held the chunk in sctp_ulpevent_make_rcvmsg to access it safely later
      in recvmsg. However, it also added sctp_chunk_put in fail_mark err path,
      which is only triggered before holding the chunk.
      
      syzbot reported a use-after-free crash happened on this err path, where
      it shouldn't call sctp_chunk_put.
      
      This patch simply removes this call.
      
      Fixes: 1f45f78f ("sctp: allow GSO frags to access the chunk too")
      Reported-by: syzbot+141d898c5f24489db4aa@syzkaller.appspotmail.com
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f2b77e8
    • Xin Long's avatar
      sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr · f9a670e1
      Xin Long authored
      [ Upstream commit d625329b ]
      
      Since sctp ipv6 socket also supports v4 addrs, it's possible to
      compare two v4 addrs in pf v6 .cmp_addr, sctp_inet6_cmp_addr.
      
      However after Commit 1071ec9d ("sctp: do not check port in
      sctp_inet6_cmp_addr"), it no longer calls af1->cmp_addr, which
      in this case is sctp_v4_cmp_addr, but calls __sctp_v6_cmp_addr
      where it handles them as two v6 addrs. It would cause a out of
      bounds crash.
      
      syzbot found this crash when trying to bind two v4 addrs to a
      v6 socket.
      
      This patch fixes it by adding the process for two v4 addrs in
      sctp_inet6_cmp_addr.
      
      Fixes: 1071ec9d ("sctp: do not check port in sctp_inet6_cmp_addr")
      Reported-by: syzbot+cd494c1dd681d4d93ebb@syzkaller.appspotmail.com
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f9a670e1
    • Heiner Kallweit's avatar
      r8169: fix powering up RTL8168h · 4a5de2f9
      Heiner Kallweit authored
      [ Upstream commit 3148dedf ]
      
      Since commit a92a0849 "r8169: improve runtime pm in general and
      suspend unused ports" interfaces w/o link are runtime-suspended after
      10s. On systems where drivers take longer to load this can lead to the
      situation that the interface is runtime-suspended already when it's
      initially brought up.
      This shouldn't be a problem because rtl_open() resumes MAC/PHY.
      However with at least one chip version the interface doesn't properly
      come up, as reported here:
      https://bugzilla.kernel.org/show_bug.cgi?id=199549
      
      The vendor driver uses a delay to give certain chip versions some
      time to resume before starting the PHY configuration. So let's do
      the same. I don't know which chip versions may be affected,
      therefore apply this delay always.
      
      This patch was reported to fix the issue for RTL8168h.
      I was able to reproduce the issue on an Asus H310I-Plus which also
      uses a RTL8168h. Also in my case the patch fixed the issue.
      Reported-by: default avatarSlava Kardakov <ojab@ojab.ru>
      Tested-by: default avatarSlava Kardakov <ojab@ojab.ru>
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4a5de2f9
    • Bjørn Mork's avatar
      qmi_wwan: do not steal interfaces from class drivers · 7b863f6b
      Bjørn Mork authored
      [ Upstream commit 5697db4a ]
      
      The USB_DEVICE_INTERFACE_NUMBER matching macro assumes that
      the { vendorid, productid, interfacenumber } set uniquely
      identifies one specific function.  This has proven to fail
      for some configurable devices. One example is the Quectel
      EM06/EP06 where the same interface number can be either
      QMI or MBIM, without the device ID changing either.
      
      Fix by requiring the vendor-specific class for interface number
      based matching.  Functions of other classes can and should use
      class based matching instead.
      
      Fixes: 03304bcb ("net: qmi_wwan: use fixed interface number matching")
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b863f6b
    • Stefano Brivio's avatar
      openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found · 32a42d5f
      Stefano Brivio authored
      [ Upstream commit 72f17baf ]
      
      If an OVS_ATTR_NESTED attribute type is found while walking
      through netlink attributes, we call nlattr_set() recursively
      passing the length table for the following nested attributes, if
      different from the current one.
      
      However, once we're done with those sub-nested attributes, we
      should continue walking through attributes using the current
      table, instead of using the one related to the sub-nested
      attributes.
      
      For example, given this sequence:
      
      1  OVS_KEY_ATTR_PRIORITY
      2  OVS_KEY_ATTR_TUNNEL
      3	OVS_TUNNEL_KEY_ATTR_ID
      4	OVS_TUNNEL_KEY_ATTR_IPV4_SRC
      5	OVS_TUNNEL_KEY_ATTR_IPV4_DST
      6	OVS_TUNNEL_KEY_ATTR_TTL
      7	OVS_TUNNEL_KEY_ATTR_TP_SRC
      8	OVS_TUNNEL_KEY_ATTR_TP_DST
      9  OVS_KEY_ATTR_IN_PORT
      10 OVS_KEY_ATTR_SKB_MARK
      11 OVS_KEY_ATTR_MPLS
      
      we switch to the 'ovs_tunnel_key_lens' table on attribute #3,
      and we don't switch back to 'ovs_key_lens' while setting
      attributes #9 to #11 in the sequence. As OVS_KEY_ATTR_MPLS
      evaluates to 21, and the array size of 'ovs_tunnel_key_lens' is
      15, we also get this kind of KASan splat while accessing the
      wrong table:
      
      [ 7654.586496] ==================================================================
      [ 7654.594573] BUG: KASAN: global-out-of-bounds in nlattr_set+0x164/0xde9 [openvswitch]
      [ 7654.603214] Read of size 4 at addr ffffffffc169ecf0 by task handler29/87430
      [ 7654.610983]
      [ 7654.612644] CPU: 21 PID: 87430 Comm: handler29 Kdump: loaded Not tainted 3.10.0-866.el7.test.x86_64 #1
      [ 7654.623030] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
      [ 7654.631379] Call Trace:
      [ 7654.634108]  [<ffffffffb65a7c50>] dump_stack+0x19/0x1b
      [ 7654.639843]  [<ffffffffb53ff373>] print_address_description+0x33/0x290
      [ 7654.647129]  [<ffffffffc169b37b>] ? nlattr_set+0x164/0xde9 [openvswitch]
      [ 7654.654607]  [<ffffffffb53ff812>] kasan_report.part.3+0x242/0x330
      [ 7654.661406]  [<ffffffffb53ff9b4>] __asan_report_load4_noabort+0x34/0x40
      [ 7654.668789]  [<ffffffffc169b37b>] nlattr_set+0x164/0xde9 [openvswitch]
      [ 7654.676076]  [<ffffffffc167ef68>] ovs_nla_get_match+0x10c8/0x1900 [openvswitch]
      [ 7654.684234]  [<ffffffffb61e9cc8>] ? genl_rcv+0x28/0x40
      [ 7654.689968]  [<ffffffffb61e7733>] ? netlink_unicast+0x3f3/0x590
      [ 7654.696574]  [<ffffffffc167dea0>] ? ovs_nla_put_tunnel_info+0xb0/0xb0 [openvswitch]
      [ 7654.705122]  [<ffffffffb4f41b50>] ? unwind_get_return_address+0xb0/0xb0
      [ 7654.712503]  [<ffffffffb65d9355>] ? system_call_fastpath+0x1c/0x21
      [ 7654.719401]  [<ffffffffb4f41d79>] ? update_stack_state+0x229/0x370
      [ 7654.726298]  [<ffffffffb4f41d79>] ? update_stack_state+0x229/0x370
      [ 7654.733195]  [<ffffffffb53fe4b5>] ? kasan_unpoison_shadow+0x35/0x50
      [ 7654.740187]  [<ffffffffb53fe62a>] ? kasan_kmalloc+0xaa/0xe0
      [ 7654.746406]  [<ffffffffb53fec32>] ? kasan_slab_alloc+0x12/0x20
      [ 7654.752914]  [<ffffffffb53fe711>] ? memset+0x31/0x40
      [ 7654.758456]  [<ffffffffc165bf92>] ovs_flow_cmd_new+0x2b2/0xf00 [openvswitch]
      
      [snip]
      
      [ 7655.132484] The buggy address belongs to the variable:
      [ 7655.138226]  ovs_tunnel_key_lens+0xf0/0xffffffffffffd400 [openvswitch]
      [ 7655.145507]
      [ 7655.147166] Memory state around the buggy address:
      [ 7655.152514]  ffffffffc169eb80: 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa fa
      [ 7655.160585]  ffffffffc169ec00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [ 7655.168644] >ffffffffc169ec80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa fa
      [ 7655.176701]                                                              ^
      [ 7655.184372]  ffffffffc169ed00: fa fa fa fa 00 00 00 00 fa fa fa fa 00 00 00 05
      [ 7655.192431]  ffffffffc169ed80: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
      [ 7655.200490] ==================================================================
      Reported-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Fixes: 982b5270 ("openvswitch: Fix mask generation for nested attributes.")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Reviewed-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      32a42d5f
    • Lance Richardson's avatar
      net: support compat 64-bit time in {s,g}etsockopt · 51d2a5e7
      Lance Richardson authored
      [ Upstream commit 988bf724 ]
      
      For the x32 ABI, struct timeval has two 64-bit fields. However
      the kernel currently interprets the user-space values used for
      the SO_RCVTIMEO and SO_SNDTIMEO socket options as having a pair
      of 32-bit fields.
      
      When the seconds portion of the requested timeout is less than 2**32,
      the seconds portion of the effective timeout is correct but the
      microseconds portion is zero.  When the seconds portion of the
      requested timeout is zero and the microseconds portion is non-zero,
      the kernel interprets the timeout as zero (never timeout).
      
      Fix by using 64-bit time for SO_RCVTIMEO/SO_SNDTIMEO as required
      for the ABI.
      
      The code included below demonstrates the problem.
      
      Results before patch:
          $ gcc -m64 -Wall -O2 -o socktmo socktmo.c && ./socktmo
          recv time: 2.008181 seconds
          send time: 2.015985 seconds
      
          $ gcc -m32 -Wall -O2 -o socktmo socktmo.c && ./socktmo
          recv time: 2.016763 seconds
          send time: 2.016062 seconds
      
          $ gcc -mx32 -Wall -O2 -o socktmo socktmo.c && ./socktmo
          recv time: 1.007239 seconds
          send time: 1.023890 seconds
      
      Results after patch:
          $ gcc -m64 -O2 -Wall -o socktmo socktmo.c && ./socktmo
          recv time: 2.010062 seconds
          send time: 2.015836 seconds
      
          $ gcc -m32 -O2 -Wall -o socktmo socktmo.c && ./socktmo
          recv time: 2.013974 seconds
          send time: 2.015981 seconds
      
          $ gcc -mx32 -O2 -Wall -o socktmo socktmo.c && ./socktmo
          recv time: 2.030257 seconds
          send time: 2.013383 seconds
      
       #include <stdio.h>
       #include <stdlib.h>
       #include <sys/socket.h>
       #include <sys/types.h>
       #include <sys/time.h>
      
       void checkrc(char *str, int rc)
       {
               if (rc >= 0)
                       return;
      
               perror(str);
               exit(1);
       }
      
       static char buf[1024];
       int main(int argc, char **argv)
       {
               int rc;
               int socks[2];
               struct timeval tv;
               struct timeval start, end, delta;
      
               rc = socketpair(AF_UNIX, SOCK_STREAM, 0, socks);
               checkrc("socketpair", rc);
      
               /* set timeout to 1.999999 seconds */
               tv.tv_sec = 1;
               tv.tv_usec = 999999;
               rc = setsockopt(socks[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
               rc = setsockopt(socks[0], SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv);
               checkrc("setsockopt", rc);
      
               /* measure actual receive timeout */
               gettimeofday(&start, NULL);
               rc = recv(socks[0], buf, sizeof buf, 0);
               gettimeofday(&end, NULL);
               timersub(&end, &start, &delta);
      
               printf("recv time: %ld.%06ld seconds\n",
                      (long)delta.tv_sec, (long)delta.tv_usec);
      
               /* fill send buffer */
               do {
                       rc = send(socks[0], buf, sizeof buf, 0);
               } while (rc > 0);
      
               /* measure actual send timeout */
               gettimeofday(&start, NULL);
               rc = send(socks[0], buf, sizeof buf, 0);
               gettimeofday(&end, NULL);
               timersub(&end, &start, &delta);
      
               printf("send time: %ld.%06ld seconds\n",
                      (long)delta.tv_sec, (long)delta.tv_usec);
               exit(0);
       }
      
      Fixes: 515c7af8 ("x32: Use compat shims for {g,s}etsockopt")
      Reported-by: default avatarGopal RajagopalSai <gopalsr83@gmail.com>
      Signed-off-by: default avatarLance Richardson <lance.richardson.net@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      51d2a5e7
    • Eric Dumazet's avatar
      net_sched: fq: take care of throttled flows before reuse · c8b54621
      Eric Dumazet authored
      [ Upstream commit 7df40c26 ]
      
      Normally, a socket can not be freed/reused unless all its TX packets
      left qdisc and were TX-completed. However connect(AF_UNSPEC) allows
      this to happen.
      
      With commit fc59d5bd ("pkt_sched: fq: clear time_next_packet for
      reused flows") we cleared f->time_next_packet but took no special
      action if the flow was still in the throttled rb-tree.
      
      Since f->time_next_packet is the key used in the rb-tree searches,
      blindly clearing it might break rb-tree integrity. We need to make
      sure the flow is no longer in the rb-tree to avoid this problem.
      
      Fixes: fc59d5bd ("pkt_sched: fq: clear time_next_packet for reused flows")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c8b54621
    • Adi Nissim's avatar
      net/mlx5: E-Switch, Include VF RDMA stats in vport statistics · a541ccf5
      Adi Nissim authored
      [ Upstream commit 88d725bb ]
      
      The host side reporting of VF vport statistics didn't include the VF
      RDMA traffic.
      
      Fixes: 3b751a2a ("net/mlx5: E-Switch, Introduce get vf statistics")
      Signed-off-by: default avatarAdi Nissim <adin@mellanox.com>
      Reported-by: default avatarAriel Almog <ariela@mellanox.com>
      Reviewed-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a541ccf5
    • Moshe Shemesh's avatar
      net/mlx4_en: Verify coalescing parameters are in range · a73d97e2
      Moshe Shemesh authored
      [ Upstream commit 6ad4e91c ]
      
      Add check of coalescing parameters received through ethtool are within
      range of values supported by the HW.
      Driver gets the coalescing rx/tx-usecs and rx/tx-frames as set by the
      users through ethtool. The ethtool support up to 32 bit value for each.
      However, mlx4 modify cq limits the coalescing time parameter and
      coalescing frames parameters to 16 bits.
      Return out of range error if user tries to set these parameters to
      higher values.
      Change type of sample-interval and adaptive_rx_coal parameters in mlx4
      driver to u32 as the ethtool holds them as u32 and these parameters are
      not limited due to mlx4 HW.
      
      Fixes: c27a02cd ('mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC')
      Signed-off-by: default avatarMoshe Shemesh <moshe@mellanox.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a73d97e2
    • Grygorii Strashko's avatar
      net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode · b26c7bec
      Grygorii Strashko authored
      [ Upstream commit 5e5add17 ]
      
      In dual_mac mode packets arrived on one port should not be forwarded by
      switch hw to another port. Only Linux Host can forward packets between
      ports. The below test case (reported in [1]) shows that packet arrived on
      one port can be leaked to anoter (reproducible with dual port evms):
       - connect port 1 (eth0) to linux Host 0 and run tcpdump or Wireshark
       - connect port 2 (eth1) to linux Host 1 with vlan 1 configured
       - ping <IPx> from Host 1 through vlan 1 interface.
      ARP packets will be seen on Host 0.
      
      Issue happens because dual_mac mode is implemnted using two vlans: 1 (Port
      1+Port 0) and 2 (Port 2+Port 0), so there are vlan records created for for
      each vlan. By default, the ALE will find valid vlan record in its table
      when vlan 1 tagged packet arrived on Port 2 and so forwards packet to all
      ports which are vlan 1 members (like Port.
      
      To avoid such behaviorr the ALE VLAN ID Ingress Check need to be enabled
      for each external CPSW port (ALE_PORTCTLn.VID_INGRESS_CHECK) so ALE will
      drop ingress packets if Rx port is not VLAN member.
      Signed-off-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b26c7bec
    • Rob Taglang's avatar
      net: ethernet: sun: niu set correct packet size in skb · 1ed74a5b
      Rob Taglang authored
      [ Upstream commit 14224923 ]
      
      Currently, skb->len and skb->data_len are set to the page size, not
      the packet size. This causes the frame check sequence to not be
      located at the "end" of the packet resulting in ethernet frame check
      errors. The driver does work currently, but stricter kernel facing
      networking solutions like OpenVSwitch will drop these packets as
      invalid.
      
      These changes set the packet size correctly so that these errors no
      longer occur. The length does not include the frame check sequence, so
      that subtraction was removed.
      
      Tested on Oracle/SUN Multithreaded 10-Gigabit Ethernet Network
      Controller [108e:abcd] and validated in wireshark.
      Signed-off-by: default avatarRob Taglang <rob@taglang.io>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ed74a5b
    • Eric Dumazet's avatar
      llc: better deal with too small mtu · cf7ef0af
      Eric Dumazet authored
      [ Upstream commit 2c5d5b13 ]
      
      syzbot loves to set very small mtu on devices, since it brings joy.
      We must make llc_ui_sendmsg() fool proof.
      
      usercopy: Kernel memory overwrite attempt detected to wrapped address (offset 0, size 18446612139802320068)!
      
      kernel BUG at mm/usercopy.c:100!
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 0 PID: 17464 Comm: syz-executor1 Not tainted 4.17.0-rc3+ #36
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:usercopy_abort+0xbb/0xbd mm/usercopy.c:88
      RSP: 0018:ffff8801868bf800 EFLAGS: 00010282
      RAX: 000000000000006c RBX: ffffffff87d2fb00 RCX: 0000000000000000
      RDX: 000000000000006c RSI: ffffffff81610731 RDI: ffffed0030d17ef6
      RBP: ffff8801868bf858 R08: ffff88018daa4200 R09: ffffed003b5c4fb0
      R10: ffffed003b5c4fb0 R11: ffff8801dae27d87 R12: ffffffff87d2f8e0
      R13: ffffffff87d2f7a0 R14: ffffffff87d2f7a0 R15: ffffffff87d2f7a0
      FS:  00007f56a14ac700(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b2bc21000 CR3: 00000001abeb1000 CR4: 00000000001426f0
      DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000030602
      Call Trace:
       check_bogus_address mm/usercopy.c:153 [inline]
       __check_object_size+0x5d9/0x5d9 mm/usercopy.c:256
       check_object_size include/linux/thread_info.h:108 [inline]
       check_copy_size include/linux/thread_info.h:139 [inline]
       copy_from_iter_full include/linux/uio.h:121 [inline]
       memcpy_from_msg include/linux/skbuff.h:3305 [inline]
       llc_ui_sendmsg+0x4b1/0x1530 net/llc/af_llc.c:941
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:639
       __sys_sendto+0x3d7/0x670 net/socket.c:1789
       __do_sys_sendto net/socket.c:1801 [inline]
       __se_sys_sendto net/socket.c:1797 [inline]
       __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797
       do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x455979
      RSP: 002b:00007f56a14abc68 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00007f56a14ac6d4 RCX: 0000000000455979
      RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000018
      RBP: 000000000072bea0 R08: 00000000200012c0 R09: 0000000000000010
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 0000000000000548 R14: 00000000006fbf60 R15: 0000000000000000
      Code: 55 c0 e8 c0 55 bb ff ff 75 c8 48 8b 55 c0 4d 89 f9 ff 75 d0 4d 89 e8 48 89 d9 4c 89 e6 41 56 48 c7 c7 80 fa d2 87 e8 a0 0b a3 ff <0f> 0b e8 95 55 bb ff e8 c0 a8 f7 ff 8b 95 14 ff ff ff 4d 89 e8
      RIP: usercopy_abort+0xbb/0xbd mm/usercopy.c:88 RSP: ffff8801868bf800
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf7ef0af
    • Andrey Ignatov's avatar
      ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg · d664986f
      Andrey Ignatov authored
      [ Upstream commit 1b97013b ]
      
      Fix more memory leaks in ip_cmsg_send() callers. Part of them were fixed
      earlier in 91948309.
      
      * udp_sendmsg one was there since the beginning when linux sources were
        first added to git;
      * ping_v4_sendmsg one was copy/pasted in c319b4d7.
      
      Whenever return happens in udp_sendmsg() or ping_v4_sendmsg() IP options
      have to be freed if they were allocated previously.
      
      Add label so that future callers (if any) can use it instead of kfree()
      before return that is easy to forget.
      
      Fixes: c319b4d7 (net: ipv4: add IPPROTO_ICMP socket kind)
      Signed-off-by: default avatarAndrey Ignatov <rdna@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d664986f
    • Eric Dumazet's avatar
      dccp: fix tasklet usage · aef419ef
      Eric Dumazet authored
      [ Upstream commit a8d7aa17 ]
      
      syzbot reported a crash in tasklet_action_common() caused by dccp.
      
      dccp needs to make sure socket wont disappear before tasklet handler
      has completed.
      
      This patch takes a reference on the socket when arming the tasklet,
      and moves the sock_put() from dccp_write_xmit_timer() to dccp_write_xmitlet()
      
      kernel BUG at kernel/softirq.c:514!
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 1 PID: 17 Comm: ksoftirqd/1 Not tainted 4.17.0-rc3+ #30
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515
      RSP: 0018:ffff8801d9b3faf8 EFLAGS: 00010246
      dccp_close: ABORT with 65423 bytes unread
      RAX: 1ffff1003b367f6b RBX: ffff8801daf1f3f0 RCX: 0000000000000000
      RDX: ffff8801cf895498 RSI: 0000000000000004 RDI: 0000000000000000
      RBP: ffff8801d9b3fc40 R08: ffffed0039f12a95 R09: ffffed0039f12a94
      dccp_close: ABORT with 65423 bytes unread
      R10: ffffed0039f12a94 R11: ffff8801cf8954a3 R12: 0000000000000000
      R13: ffff8801d9b3fc18 R14: dffffc0000000000 R15: ffff8801cf895490
      FS:  0000000000000000(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b2bc28000 CR3: 00000001a08a9000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       tasklet_action+0x1d/0x20 kernel/softirq.c:533
       __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
      dccp_close: ABORT with 65423 bytes unread
       run_ksoftirqd+0x86/0x100 kernel/softirq.c:646
       smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164
       kthread+0x345/0x410 kernel/kthread.c:238
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
      Code: 48 8b 85 e8 fe ff ff 48 8b 95 f0 fe ff ff e9 94 fb ff ff 48 89 95 f0 fe ff ff e8 81 53 6e 00 48 8b 95 f0 fe ff ff e9 62 fb ff ff <0f> 0b 48 89 cf 48 89 8d e8 fe ff ff e8 64 53 6e 00 48 8b 8d e8
      RIP: tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515 RSP: ffff8801d9b3faf8
      
      Fixes: dc841e30 ("dccp: Extend CCID packet dequeueing interface")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Cc: dccp@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      aef419ef
    • Hangbin Liu's avatar
      bridge: check iface upper dev when setting master via ioctl · 0c2133c8
      Hangbin Liu authored
      [ Upstream commit e8238fc2 ]
      
      When we set a bond slave's master to bridge via ioctl, we only check
      the IFF_BRIDGE_PORT flag. Although we will find the slave's real master
      at netdev_master_upper_dev_link() later, it already does some settings
      and allocates some resources. It would be better to return as early
      as possible.
      
      v1 -> v2:
      use netdev_master_upper_dev_get() instead of netdev_has_any_upper_dev()
      to check if we have a master, because not all upper devs are masters,
      e.g. vlan device.
      
      Reported-by: syzbot+de73361ee4971b6e6f75@syzkaller.appspotmail.com
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Acked-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c2133c8
    • Ingo Molnar's avatar
      8139too: Use disable_irq_nosync() in rtl8139_poll_controller() · 205cd52b
      Ingo Molnar authored
      [ Upstream commit af3e0fcf ]
      
      Use disable_irq_nosync() instead of disable_irq() as this might be
      called in atomic context with netpoll.
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      205cd52b
  2. 16 May, 2018 13 commits