1. 28 Jul, 2018 12 commits
    • Lubomir Rintel's avatar
      usb: cdc_acm: Add quirk for Castles VEGA3000 · 92197cdb
      Lubomir Rintel authored
      commit 1445cbe4 upstream.
      
      The device (a POS terminal) implements CDC ACM, but has not union
      descriptor.
      Signed-off-by: default avatarLubomir Rintel <lkundrak@v3.sk>
      Acked-by: default avatarOliver Neukum <oneukum@suse.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      92197cdb
    • Willem de Bruijn's avatar
      ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull · a77bf88d
      Willem de Bruijn authored
      [ Upstream commit 2efd4fca ]
      
      Syzbot reported a read beyond the end of the skb head when returning
      IPV6_ORIGDSTADDR:
      
        BUG: KMSAN: kernel-infoleak in put_cmsg+0x5ef/0x860 net/core/scm.c:242
        CPU: 0 PID: 4501 Comm: syz-executor128 Not tainted 4.17.0+ #9
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
        Google 01/01/2011
        Call Trace:
          __dump_stack lib/dump_stack.c:77 [inline]
          dump_stack+0x185/0x1d0 lib/dump_stack.c:113
          kmsan_report+0x188/0x2a0 mm/kmsan/kmsan.c:1125
          kmsan_internal_check_memory+0x138/0x1f0 mm/kmsan/kmsan.c:1219
          kmsan_copy_to_user+0x7a/0x160 mm/kmsan/kmsan.c:1261
          copy_to_user include/linux/uaccess.h:184 [inline]
          put_cmsg+0x5ef/0x860 net/core/scm.c:242
          ip6_datagram_recv_specific_ctl+0x1cf3/0x1eb0 net/ipv6/datagram.c:719
          ip6_datagram_recv_ctl+0x41c/0x450 net/ipv6/datagram.c:733
          rawv6_recvmsg+0x10fb/0x1460 net/ipv6/raw.c:521
          [..]
      
      This logic and its ipv4 counterpart read the destination port from
      the packet at skb_transport_offset(skb) + 4.
      
      With MSG_MORE and a local SOCK_RAW sender, syzbot was able to cook a
      packet that stores headers exactly up to skb_transport_offset(skb) in
      the head and the remainder in a frag.
      
      Call pskb_may_pull before accessing the pointer to ensure that it lies
      in skb head.
      
      Link: http://lkml.kernel.org/r/CAF=yD-LEJwZj5a1-bAAj2Oy_hKmGygV6rsJ_WOrAYnv-fnayiQ@mail.gmail.com
      Reported-by: syzbot+9adb4b567003cac781f0@syzkaller.appspotmail.com
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a77bf88d
    • Eric Dumazet's avatar
      tcp: detect malicious patterns in tcp_collapse_ofo_queue() · dc6ae4df
      Eric Dumazet authored
      [ Upstream commit 3d4bf93a ]
      
      In case an attacker feeds tiny packets completely out of order,
      tcp_collapse_ofo_queue() might scan the whole rb-tree, performing
      expensive copies, but not changing socket memory usage at all.
      
      1) Do not attempt to collapse tiny skbs.
      2) Add logic to exit early when too many tiny skbs are detected.
      
      We prefer not doing aggressive collapsing (which copies packets)
      for pathological flows, and revert to tcp_prune_ofo_queue() which
      will be less expensive.
      
      In the future, we might add the possibility of terminating flows
      that are proven to be malicious.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc6ae4df
    • Eric Dumazet's avatar
      tcp: avoid collapses in tcp_prune_queue() if possible · 5fbec480
      Eric Dumazet authored
      [ Upstream commit f4a3313d ]
      
      Right after a TCP flow is created, receiving tiny out of order
      packets allways hit the condition :
      
      if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf)
      	tcp_clamp_window(sk);
      
      tcp_clamp_window() increases sk_rcvbuf to match sk_rmem_alloc
      (guarded by tcp_rmem[2])
      
      Calling tcp_collapse_ofo_queue() in this case is not useful,
      and offers a O(N^2) surface attack to malicious peers.
      
      Better not attempt anything before full queue capacity is reached,
      forcing attacker to spend lots of resource and allow us to more
      easily detect the abuse.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5fbec480
    • Yuchung Cheng's avatar
      tcp: do not delay ACK in DCTCP upon CE status change · 255924ea
      Yuchung Cheng authored
      [ Upstream commit a0496ef2 ]
      
      Per DCTCP RFC8257 (Section 3.2) the ACK reflecting the CE status change
      has to be sent immediately so the sender can respond quickly:
      
      """ When receiving packets, the CE codepoint MUST be processed as follows:
      
         1.  If the CE codepoint is set and DCTCP.CE is false, set DCTCP.CE to
             true and send an immediate ACK.
      
         2.  If the CE codepoint is not set and DCTCP.CE is true, set DCTCP.CE
             to false and send an immediate ACK.
      """
      
      Previously DCTCP implementation may continue to delay the ACK. This
      patch fixes that to implement the RFC by forcing an immediate ACK.
      
      Tested with this packetdrill script provided by Larry Brakmo
      
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
      0.000 bind(3, ..., ...) = 0
      0.000 listen(3, 1) = 0
      
      0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
      0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
      0.110 < [ect0] . 1:1(0) ack 1 win 257
      0.200 accept(3, ..., ...) = 4
         +0 setsockopt(4, SOL_SOCKET, SO_DEBUG, [1], 4) = 0
      
      0.200 < [ect0] . 1:1001(1000) ack 1 win 257
      0.200 > [ect01] . 1:1(0) ack 1001
      
      0.200 write(4, ..., 1) = 1
      0.200 > [ect01] P. 1:2(1) ack 1001
      
      0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
      +0.005 < [ce] . 2001:3001(1000) ack 2 win 257
      
      +0.000 > [ect01] . 2:2(0) ack 2001
      // Previously the ACK below would be delayed by 40ms
      +0.000 > [ect01] E. 2:2(0) ack 3001
      
      +0.500 < F. 9501:9501(0) ack 4 win 257
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      255924ea
    • Yuchung Cheng's avatar
      tcp: do not cancel delay-AcK on DCTCP special ACK · 0b1d40e9
      Yuchung Cheng authored
      [ Upstream commit 27cde44a ]
      
      Currently when a DCTCP receiver delays an ACK and receive a
      data packet with a different CE mark from the previous one's, it
      sends two immediate ACKs acking previous and latest sequences
      respectly (for ECN accounting).
      
      Previously sending the first ACK may mark off the delayed ACK timer
      (tcp_event_ack_sent). This may subsequently prevent sending the
      second ACK to acknowledge the latest sequence (tcp_ack_snd_check).
      The culprit is that tcp_send_ack() assumes it always acknowleges
      the latest sequence, which is not true for the first special ACK.
      
      The fix is to not make the assumption in tcp_send_ack and check the
      actual ack sequence before cancelling the delayed ACK. Further it's
      safer to pass the ack sequence number as a local variable into
      tcp_send_ack routine, instead of intercepting tp->rcv_nxt to avoid
      future bugs like this.
      Reported-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0b1d40e9
    • Yuchung Cheng's avatar
      tcp: helpers to send special DCTCP ack · 17fea38e
      Yuchung Cheng authored
      [ Upstream commit 2987babb ]
      
      Refactor and create helpers to send the special ACK in DCTCP.
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      17fea38e
    • Yuchung Cheng's avatar
      tcp: fix dctcp delayed ACK schedule · 500e03f4
      Yuchung Cheng authored
      [ Upstream commit b0c05d0e ]
      
      Previously, when a data segment was sent an ACK was piggybacked
      on the data segment without generating a CA_EVENT_NON_DELAYED_ACK
      event to notify congestion control modules. So the DCTCP
      ca->delayed_ack_reserved flag could incorrectly stay set when
      in fact there were no delayed ACKs being reserved. This could result
      in sending a special ECN notification ACK that carries an older
      ACK sequence, when in fact there was no need for such an ACK.
      DCTCP keeps track of the delayed ACK status with its own separate
      state ca->delayed_ack_reserved. Previously it may accidentally cancel
      the delayed ACK without updating this field upon sending a special
      ACK that carries a older ACK sequence. This inconsistency would
      lead to DCTCP receiver never acknowledging the latest data until the
      sender times out and retry in some cases.
      
      Packetdrill script (provided by Larry Brakmo)
      
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
      0.000 bind(3, ..., ...) = 0
      0.000 listen(3, 1) = 0
      
      0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
      0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
      0.110 < [ect0] . 1:1(0) ack 1 win 257
      0.200 accept(3, ..., ...) = 4
      
      0.200 < [ect0] . 1:1001(1000) ack 1 win 257
      0.200 > [ect01] . 1:1(0) ack 1001
      
      0.200 write(4, ..., 1) = 1
      0.200 > [ect01] P. 1:2(1) ack 1001
      
      0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
      0.200 write(4, ..., 1) = 1
      0.200 > [ect01] P. 2:3(1) ack 2001
      
      0.200 < [ect0] . 2001:3001(1000) ack 3 win 257
      0.200 < [ect0] . 3001:4001(1000) ack 3 win 257
      0.200 > [ect01] . 3:3(0) ack 4001
      
      0.210 < [ce] P. 4001:4501(500) ack 3 win 257
      
      +0.001 read(4, ..., 4500) = 4500
      +0 write(4, ..., 1) = 1
      +0 > [ect01] PE. 3:4(1) ack 4501
      
      +0.010 < [ect0] W. 4501:5501(1000) ack 4 win 257
      // Previously the ACK sequence below would be 4501, causing a long RTO
      +0.040~+0.045 > [ect01] . 4:4(0) ack 5501   // delayed ack
      
      +0.311 < [ect0] . 5501:6501(1000) ack 4 win 257  // More data
      +0 > [ect01] . 4:4(0) ack 6501     // now acks everything
      
      +0.500 < F. 9501:9501(0) ack 4 win 257
      Reported-by: default avatarLarry Brakmo <brakmo@fb.com>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarLawrence Brakmo <brakmo@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      500e03f4
    • Roopa Prabhu's avatar
      rtnetlink: add rtnl_link_state check in rtnl_configure_link · b04c9a08
      Roopa Prabhu authored
      [ Upstream commit 5025f7f7 ]
      
      rtnl_configure_link sets dev->rtnl_link_state to
      RTNL_LINK_INITIALIZED and unconditionally calls
      __dev_notify_flags to notify user-space of dev flags.
      
      current call sequence for rtnl_configure_link
      rtnetlink_newlink
          rtnl_link_ops->newlink
          rtnl_configure_link (unconditionally notifies userspace of
                               default and new dev flags)
      
      If a newlink handler wants to call rtnl_configure_link
      early, we will end up with duplicate notifications to
      user-space.
      
      This patch fixes rtnl_configure_link to check rtnl_link_state
      and call __dev_notify_flags with gchanges = 0 if already
      RTNL_LINK_INITIALIZED.
      
      Later in the series, this patch will help the following sequence
      where a driver implementing newlink can call rtnl_configure_link
      to initialize the link early.
      
      makes the following call sequence work:
      rtnetlink_newlink
          rtnl_link_ops->newlink (vxlan) -> rtnl_configure_link (initializes
                                                      link and notifies
                                                      user-space of default
                                                      dev flags)
          rtnl_configure_link (updates dev flags if requested by user ifm
                               and notifies user-space of new dev flags)
      Signed-off-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b04c9a08
    • Jack Morgenstein's avatar
      net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper · 73dad087
      Jack Morgenstein authored
      [ Upstream commit 958c696f ]
      
      Function mlx4_RST2INIT_QP_wrapper saved the qp number passed in the qp
      context, rather than the one passed in the input modifier.
      
      However, the qp number in the qp context is not defined as a
      required parameter by the FW. Therefore, drivers may choose to not
      specify the qp number in the qp context for the reset-to-init transition.
      
      Thus, we must save the qp number passed in the command input modifier --
      which is always present. (This saved qp number is used as the input
      modifier for command 2RST_QP when a slave's qp's are destroyed).
      
      Fixes: c82e9aa0 ("mlx4_core: resource tracking for HCA resources used by guests")
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      73dad087
    • Paolo Abeni's avatar
      ip: hash fragments consistently · 48f41c0c
      Paolo Abeni authored
      [ Upstream commit 3dd1c9a1 ]
      
      The skb hash for locally generated ip[v6] fragments belonging
      to the same datagram can vary in several circumstances:
      * for connected UDP[v6] sockets, the first fragment get its hash
        via set_owner_w()/skb_set_hash_from_sk()
      * for unconnected IPv6 UDPv6 sockets, the first fragment can get
        its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if
        auto_flowlabel is enabled
      
      For the following frags the hash is usually computed via
      skb_get_hash().
      The above can cause OoO for unconnected IPv6 UDPv6 socket: in that
      scenario the egress tx queue can be selected on a per packet basis
      via the skb hash.
      It may also fool flow-oriented schedulers to place fragments belonging
      to the same datagram in different flows.
      
      Fix the issue by copying the skb hash from the head frag into
      the others at fragmentation time.
      
      Before this commit:
      perf probe -a "dev_queue_xmit skb skb->hash skb->l4_hash:b1@0/8 skb->sw_hash:b1@1/8"
      netperf -H $IPV4 -t UDP_STREAM -l 5 -- -m 2000 -n &
      perf record -e probe:dev_queue_xmit -e probe:skb_set_owner_w -a sleep 0.1
      perf script
      probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=3713014309 l4_hash=1 sw_hash=0
      probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=0 l4_hash=0 sw_hash=0
      
      After this commit:
      probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
      probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
      
      Fixes: b73c3d0e ("net: Save TX flow hash in sock and set in skbuf on xmit")
      Fixes: 67800f9b ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel")
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      48f41c0c
    • Felix Fietkau's avatar
      MIPS: ath79: fix register address in ath79_ddr_wb_flush() · 54a634c4
      Felix Fietkau authored
      commit bc88ad2e upstream.
      
      ath79_ddr_wb_flush_base has the type void __iomem *, so register offsets
      need to be a multiple of 4 in order to access the intended register.
      Signed-off-by: default avatarFelix Fietkau <nbd@nbd.name>
      Signed-off-by: default avatarJohn Crispin <john@phrozen.org>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Fixes: 24b0e3e8 ("MIPS: ath79: Improve the DDR controller interface")
      Patchwork: https://patchwork.linux-mips.org/patch/19912/
      Cc: Alban Bedel <albeu@free.fr>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: stable@vger.kernel.org # 4.2+
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      54a634c4
  2. 25 Jul, 2018 28 commits