1. 08 Jun, 2016 7 commits
    • Pau Espin Pedrol's avatar
      tcp: accept RST if SEQ matches right edge of right-most SACK block · e00431bc
      Pau Espin Pedrol authored
      RFC 5961 advises to only accept RST packets containing a seq number
      matching the next expected seq number instead of the whole receive
      window in order to avoid spoofing attacks.
      
      However, this situation is not optimal in the case SACK is in use at the
      time the RST is sent. I recently run into a scenario in which packet
      losses were high while uploading data to a server, and userspace was
      willing to frequently terminate connections by sending a RST. In
      this case, the ACK sent on the receiver side (rcv_nxt) is frozen waiting
      for a lost packet retransmission and SACK blocks are used to let the
      client continue uploading data. At some point later on, the client sends
      the RST (snd_nxt), which matches the next expected seq number of the
      right-most SACK block on the receiver side which is going forward
      receiving data.
      
      In this scenario, as RFC 5961 defines, the RST SEQ doesn't match the
      frozen main ACK at receiver side and thus gets dropped and a challenge
      ACK is sent, which gets usually lost due to network conditions. The main
      consequence is that the connection stays alive for a while even if it
      made sense to accept the RST. This can get really bad if lots of
      connections like this one are created in few seconds, allocating all the
      resources of the server easily.
      
      For security reasons, not all SACK blocks are checked (there could be a
      big amount of SACK blocks => acceptable SEQ numbers). Furthermore, it
      wouldn't make sense to check for RST in blocks other than the right-most
      received one because the sender is not expected to be sending new data
      after the RST. For simplicity, only up to the 4 most recently updated
      SACK blocks (selective_acks[4] field) are compared to find the
      right-most block, as usually those are the ones with bigger probability
      to contain it.
      
      This patch was tested in a 3.18 kernel and probed to improve the
      situation in the scenario described above.
      Signed-off-by: default avatarPau Espin Pedrol <pau.espin@tessares.net>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Tested-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e00431bc
    • Dan Carpenter's avatar
      qed: potential overflow in qed_cxt_src_t2_alloc() · 01e517f1
      Dan Carpenter authored
      In the current code "ent_per_page" could be more than "conn_num" making
      "conn_num" negative after the subtraction.  In the next iteration
      through the loop then the negative is treated as a very high positive
      meaning we don't put a limit on "ent_num".  It could lead to memory
      corruption.
      
      Fixes: dbb799c3 ('qed: Initialize hardware for new protocols')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarYuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      01e517f1
    • David S. Miller's avatar
      Merge branch 'vrf-local' · f02ea215
      David S. Miller authored
      David Ahern says:
      
      ====================
      net: vrf: Add support for local traffic to local addresses
      
      Add support for locally originated traffic to VRF-local addresses,
      be it addresses on enslaved devices or addresses on the VRF device:
      
      $ ip addr show dev red
      33: red: <NOARP,MASTER,UP,LOWER_UP> mtu 65536 qdisc pfifo_fast state UP group default qlen 1000
          link/ether be:00:53:b5:e4:25 brd ff:ff:ff:ff:ff:ff
          inet 1.1.1.1/32 scope global red
             valid_lft forever preferred_lft forever
          inet6 1111:1::1/128 scope global
             valid_lft forever preferred_lft forever
      
      $ ip addr show dev eth1
      3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
          link/ether 02:e0:f9:79:34:bd brd ff:ff:ff:ff:ff:ff
          inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1
             valid_lft forever preferred_lft forever
          inet6 2100:1::1/120 scope global
             valid_lft forever preferred_lft forever
          inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
             valid_lft forever preferred_lft forever
      
      $ ping -c1 -I red 10.100.1.1
          ping: Warning: source address might be selected on device other than red.
          PING 10.100.1.1 (10.100.1.1) from 10.100.1.1 red: 56(84) bytes of data.
          64 bytes from 10.100.1.1: icmp_seq=1 ttl=64 time=0.057 ms
      
      $ ping -c1 -I red 1.1.1.1
      PING 1.1.1.1 (1.1.1.1) from 1.1.1.1 red: 56(84) bytes of data.
      64 bytes from 1.1.1.1: icmp_seq=1 ttl=64 time=0.136 ms
      
      --- 1.1.1.1 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.136/0.136/0.136/0.000 ms
      
      $ ping6 -c1 -I red  2100:1::1
      ping6: Warning: source address might be selected on device other than red.
      PING 2100:1::1(2100:1::1) from 2100:1::1 red: 56 data bytes
      64 bytes from 2100:1::1: icmp_seq=1 ttl=64 time=0.167 ms
      
      --- 2100:1::1 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.167/0.167/0.167/0.000 ms
      
      $ ping6 -c1 -I red 1111::1
      PING 1111::1(1111::1) from 1111:1::1 red: 56 data bytes
      64 bytes from 1111::1: icmp_seq=1 ttl=64 time=0.187 ms
      
      --- 1111::1 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.187/0.187/0.187/0.000 ms
      
      This change also enables use of loopback address on the VRF device:
      $ ip addr add dev red 127.0.0.1/8
      
      $ ping -c1 -I red 127.0.0.1
      PING 127.0.0.1 (127.0.0.1) from 127.0.0.1 red: 56(84) bytes of data.
      64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.058 ms
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f02ea215
    • David Ahern's avatar
      net: vrf: ipv6 support for local traffic to local addresses · b4869aa2
      David Ahern authored
      Add support for locally originated traffic to VRF-local IPv6 addresses.
      Similar to IPv4 a local dst is set on the skb and the packet is
      reinserted with a call to netif_rx. With this patch, ping, tcp and udp
      packets to a local IPv6 address are successfully routed:
      
          $ ip addr show dev eth1
          4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
              link/ether 02:e0:f9:1c:b9:74 brd ff:ff:ff:ff:ff:ff
              inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1
                 valid_lft forever preferred_lft forever
              inet6 2100:1::1/120 scope global
                 valid_lft forever preferred_lft forever
              inet6 fe80::e0:f9ff:fe1c:b974/64 scope link
                 valid_lft forever preferred_lft forever
      
          $ ping6 -c1 -I red 2100:1::1
          ping6: Warning: source address might be selected on device other than red.
          PING 2100:1::1(2100:1::1) from 2100:1::1 red: 56 data bytes
          64 bytes from 2100:1::1: icmp_seq=1 ttl=64 time=0.098 ms
      
      ip6_input is exported so the VRF driver can use it for the dst input
      function. The dst_alloc function for IPv4 defaults to setting the input and
      output functions; IPv6's does not. VRF does not need to duplicate the Rx path
      so just export the ipv6 input function.
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b4869aa2
    • David Ahern's avatar
      net: vrf: ipv4 support for local traffic to local addresses · afe80a49
      David Ahern authored
      Add support for locally originated traffic to VRF-local addresses. If
      destination device for an skb is the loopback or VRF device then set
      its dst to a local version of the VRF cached dst_entry and call netif_rx
      to insert the packet onto the rx queue - similar to what is done for
      loopback. This patch handles IPv4 support; follow on patch handles IPv6.
      
      With this patch, ping, tcp and udp packets to a local IPv4 address are
      successfully routed:
      
          $ ip addr show dev eth1
          4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
              link/ether 02:e0:f9:1c:b9:74 brd ff:ff:ff:ff:ff:ff
              inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1
                 valid_lft forever preferred_lft forever
              inet6 2100:1::1/120 scope global
                 valid_lft forever preferred_lft forever
              inet6 fe80::e0:f9ff:fe1c:b974/64 scope link
                 valid_lft forever preferred_lft forever
      
          $ ping -c1 -I red 10.100.1.1
          ping: Warning: source address might be selected on device other than red.
          PING 10.100.1.1 (10.100.1.1) from 10.100.1.1 red: 56(84) bytes of data.
          64 bytes from 10.100.1.1: icmp_seq=1 ttl=64 time=0.057 ms
      
      This patch also enables use of IPv4 loopback address on the VRF device:
          $ ip addr add dev red 127.0.0.1/8
      
          $ ping -c1 -I red 127.0.0.1
          PING 127.0.0.1 (127.0.0.1) from 127.0.0.1 red: 56(84) bytes of data.
          64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.058 ms
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      afe80a49
    • David Ahern's avatar
      net: vrf: Minor refactoring for local address patches · 911a66fb
      David Ahern authored
      Move the stripping of the ethernet header from is_ip_tx_frame into the
      ipv4 and ipv6 outbound functions and collapse vrf_send_v4_prep into
      vrf_process_v4_outbound.
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      911a66fb
    • Tom Herbert's avatar
      gue: Implement direction IP encapsulation · c1e48af7
      Tom Herbert authored
      This patch implements direct encapsulation of IPv4 and IPv6 packets
      in UDP. This is done a version "1" of GUE and as explained in I-D
      draft-ietf-nvo3-gue-03.
      
      Changes here are only in the receive path, fou with IPxIPx already
      supports the transmit side. Both the normal receive path and
      GRO path are modified to check for GUE version and check for
      IP version in the case that GUE version is "1".
      
      Tested:
      
      IPIP with direct GUE encap
        1 TCP_STREAM
          4530 Mbps
        200 TCP_RR
          1297625 tps
          135/232/444 90/95/99% latencies
      
      IP4IP6 with direct GUE encap
        1 TCP_STREAM
          4903 Mbps
        200 TCP_RR
          1184481 tps
          149/253/473 90/95/99% latencies
      
      IP6IP6 direct GUE encap
        1 TCP_STREAM
         5146 Mbps
        200 TCP_RR
          1202879 tps
          146/251/472 90/95/99% latencies
      
      SIT with direct GUE encap
        1 TCP_STREAM
          6111 Mbps
        200 TCP_RR
          1250337 tps
          139/241/467 90/95/99% latencies
      Signed-off-by: default avatarTom Herbert <tom@herbertland.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c1e48af7
  2. 07 Jun, 2016 30 commits
  3. 06 Jun, 2016 3 commits
    • David S. Miller's avatar
      net: Revert vrf-local changes. · 3d9dc408
      David S. Miller authored
      This reverts commit 2fb7ea45.
      
      It results in build errors because ip6_input is not a
      symbol exported to modules.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3d9dc408
    • David S. Miller's avatar
      Merge branch 'vrf-local' · 2fb7ea45
      David S. Miller authored
      David Ahern says:
      
      ====================
      net: vrf: Add support for local traffic to local addresses
      
      Add support for locally originated traffic to VRF-local addresses,
      be it addresses on enslaved devices or addresses on the VRF device:
      
      $ ip addr show dev red
      33: red: <NOARP,MASTER,UP,LOWER_UP> mtu 65536 qdisc pfifo_fast state UP group default qlen 1000
          link/ether be:00:53:b5:e4:25 brd ff:ff:ff:ff:ff:ff
          inet 1.1.1.1/32 scope global red
             valid_lft forever preferred_lft forever
          inet6 1111:1::1/128 scope global
             valid_lft forever preferred_lft forever
      
      $ ip addr show dev eth1
      3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
          link/ether 02:e0:f9:79:34:bd brd ff:ff:ff:ff:ff:ff
          inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1
             valid_lft forever preferred_lft forever
          inet6 2100:1::1/120 scope global
             valid_lft forever preferred_lft forever
          inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
             valid_lft forever preferred_lft forever
      
      $ ping -c1 -I red 10.100.1.1
          ping: Warning: source address might be selected on device other than red.
          PING 10.100.1.1 (10.100.1.1) from 10.100.1.1 red: 56(84) bytes of data.
          64 bytes from 10.100.1.1: icmp_seq=1 ttl=64 time=0.057 ms
      
      $ ping -c1 -I red 1.1.1.1
      PING 1.1.1.1 (1.1.1.1) from 1.1.1.1 red: 56(84) bytes of data.
      64 bytes from 1.1.1.1: icmp_seq=1 ttl=64 time=0.136 ms
      
      --- 1.1.1.1 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.136/0.136/0.136/0.000 ms
      
      $ ping6 -c1 -I red  2100:1::1
      ping6: Warning: source address might be selected on device other than red.
      PING 2100:1::1(2100:1::1) from 2100:1::1 red: 56 data bytes
      64 bytes from 2100:1::1: icmp_seq=1 ttl=64 time=0.167 ms
      
      --- 2100:1::1 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.167/0.167/0.167/0.000 ms
      
      $ ping6 -c1 -I red 1111::1
      PING 1111::1(1111::1) from 1111:1::1 red: 56 data bytes
      64 bytes from 1111::1: icmp_seq=1 ttl=64 time=0.187 ms
      
      --- 1111::1 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.187/0.187/0.187/0.000 ms
      
      This change also enables use of loopback address on the VRF device:
      $ ip addr add dev red 127.0.0.1/8
      
      $ ping -c1 -I red 127.0.0.1
      PING 127.0.0.1 (127.0.0.1) from 127.0.0.1 red: 56(84) bytes of data.
      64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.058 ms
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2fb7ea45
    • David Ahern's avatar
      net: vrf: ipv6 support for local traffic to local addresses · 625b47b5
      David Ahern authored
      Add support for locally originated traffic to VRF-local IPv6 addresses.
      Similar to IPv4 a local dst is set on the skb and the packet is
      reinserted with a call to netif_rx. With this patch, ping, tcp and udp
      packets to a local IPv6 address are successfully routed:
      
          $ ip addr show dev eth1
          4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
              link/ether 02:e0:f9:1c:b9:74 brd ff:ff:ff:ff:ff:ff
              inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1
                 valid_lft forever preferred_lft forever
              inet6 2100:1::1/120 scope global
                 valid_lft forever preferred_lft forever
              inet6 fe80::e0:f9ff:fe1c:b974/64 scope link
                 valid_lft forever preferred_lft forever
      
          $ ping6 -c1 -I red 2100:1::1
          ping6: Warning: source address might be selected on device other than red.
          PING 2100:1::1(2100:1::1) from 2100:1::1 red: 56 data bytes
          64 bytes from 2100:1::1: icmp_seq=1 ttl=64 time=0.098 ms
      
      ip6_input is exported so the VRF driver can use it for the dst input
      function. The dst_alloc function for IPv4 defaults to setting the input and
      output functions; IPv6's does not. VRF does not need to duplicate the Rx path
      so just export the ipv6 input function.
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      625b47b5