1. 02 Jul, 2020 5 commits
    • mptcp: add receive buffer auto-tuning · a6b118fe
      Florian Westphal authored
      When mptcp is used, userspace doesn't read from the tcp (subflow)
      socket but from the parent (mptcp) socket receive queue.
      
      skbs are moved from the subflow socket to the mptcp rx queue either from
      'data_ready' callback (if mptcp socket can be locked), a work queue, or
      the socket receive function.
      
      This means tcp_rcv_space_adjust() is never called and thus no receive
      buffer size auto-tuning is done.
      
      An earlier (not merged) patch added tcp_rcv_space_adjust() calls to the
      function that moves skbs from subflow to mptcp socket.
      While this enabled autotuning, it also meant tuning was done even if
      userspace was reading the mptcp socket very slowly.
      
      This adds mptcp_rcv_space_adjust() and calls it after userspace has
      read data from the mptcp socket rx queue.
      
      It's very similar to tcp_rcv_space_adjust(), with two differences:
      
      1. The rtt estimate is the largest one observed on a subflow
      2. The rcvbuf size and window clamp of all subflows are adjusted
         to the mptcp-level rcvbuf.
      
      Otherwise, we get spurious drops at tcp (subflow) socket level if
      the skbs are not moved to the mptcp socket fast enough.
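The two differences listed above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the struct and function names (`subflow_model`, `mptcp_model`, `model_rcv_space_adjust`) are hypothetical, and the "twice the bytes consumed" growth rule is a simplification, not the kernel's actual sizing formula.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; these are not kernel structures. */
struct subflow_model {
    uint32_t rtt_us;   /* smoothed rtt estimate of this subflow */
    uint32_t rcvbuf;   /* receive buffer size of this subflow */
};

struct mptcp_model {
    struct subflow_model *subflows;
    size_t nr_subflows;
    uint32_t rcvbuf;            /* mptcp-level receive buffer */
    uint32_t rcvbuf_max;        /* upper bound for auto-tuning */
    uint32_t space;             /* largest per-window consumption seen so far */
    uint32_t copied;            /* bytes read by userspace in this window */
    uint64_t window_start_us;   /* start of the current measurement window */
};

/* Difference 1: use the largest rtt observed on any subflow. */
static uint32_t largest_rtt_us(const struct mptcp_model *m)
{
    uint32_t rtt = 0;

    for (size_t i = 0; i < m->nr_subflows; i++)
        if (m->subflows[i].rtt_us > rtt)
            rtt = m->subflows[i].rtt_us;
    return rtt;
}

/*
 * Call after userspace has read 'copied' bytes from the mptcp socket.
 * Returns 1 if the receive buffer was grown, 0 otherwise.
 */
static int model_rcv_space_adjust(struct mptcp_model *m, uint32_t copied,
                                  uint64_t now_us)
{
    uint32_t rtt_us = largest_rtt_us(m);
    int grew = 0;

    m->copied += copied;
    if (rtt_us == 0 || now_us - m->window_start_us < rtt_us)
        return 0;   /* measurement window (one rtt) not complete yet */

    if (m->copied > m->space) {
        /* Simplified rule (assumption): room for twice what was read. */
        uint64_t want = 2ULL * m->copied;

        if (want > m->rcvbuf_max)
            want = m->rcvbuf_max;
        if (want > m->rcvbuf) {
            m->rcvbuf = (uint32_t)want;
            /* Difference 2: propagate the new size to all subflows. */
            for (size_t i = 0; i < m->nr_subflows; i++)
                m->subflows[i].rcvbuf = m->rcvbuf;
            grew = 1;
        }
        m->space = m->copied;
    }

    /* Start a new measurement window. */
    m->copied = 0;
    m->window_start_us = now_us;
    return grew;
}
```

Because the model is only driven from the read path, a slow reader never completes a window with a large byte count, so the buffer does not grow on its behalf.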
      
      Before:
      time mptcp_connect.sh -t -f $((4*1024*1024)) -d 300 -l 0.01% -r 0 -e "" -m mmap
      [..]
      ns4 MPTCP -> ns3 (10.0.3.2:10108      ) MPTCP   (duration 40823ms) [ OK ]
      ns4 MPTCP -> ns3 (10.0.3.2:10109      ) TCP     (duration 23119ms) [ OK ]
      ns4 TCP   -> ns3 (10.0.3.2:10110      ) MPTCP   (duration  5421ms) [ OK ]
      ns4 MPTCP -> ns3 (dead:beef:3::2:10111) MPTCP   (duration 41446ms) [ OK ]
      ns4 MPTCP -> ns3 (dead:beef:3::2:10112) TCP     (duration 23427ms) [ OK ]
      ns4 TCP   -> ns3 (dead:beef:3::2:10113) MPTCP   (duration  5426ms) [ OK ]
      Time: 1396 seconds
      
      After:
      ns4 MPTCP -> ns3 (10.0.3.2:10108      ) MPTCP   (duration  5417ms) [ OK ]
      ns4 MPTCP -> ns3 (10.0.3.2:10109      ) TCP     (duration  5427ms) [ OK ]
      ns4 TCP   -> ns3 (10.0.3.2:10110      ) MPTCP   (duration  5422ms) [ OK ]
      ns4 MPTCP -> ns3 (dead:beef:3::2:10111) MPTCP   (duration  5415ms) [ OK ]
      ns4 MPTCP -> ns3 (dead:beef:3::2:10112) TCP     (duration  5422ms) [ OK ]
      ns4 TCP   -> ns3 (dead:beef:3::2:10113) MPTCP   (duration  5423ms) [ OK ]
      Time: 296 seconds
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: mptcp: add option to specify size of file to transfer · 767659f6
      Florian Westphal authored
      The script generates two random files that are then sent via tcp and
      mptcp connections.
      
      In order to compare throughput over consecutive runs, add an option
      to provide the file size on the command line: "-f 128000".
      
      Also add an option, -t, to enable tcp tests. This is useful to
      compare throughput of mptcp connections and tcp connections.
      
      Example: run tests with a 4 MB file size, 300 ms delay, 0.01% loss,
      default gso/tso/gro settings and with large write/blocking io:
      
      mptcp_connect.sh -t -f $((4 * 1024 * 1024)) -d 300 -l 0.01% -r 0 -e "" -m mmap
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: Allow changing default qdisc to FQ-PIE · b97e9d9d
      Danny Lin authored
      Similar to fq_codel and the other qdiscs that can be set as default,
      fq_pie is also suitable for general use without explicit configuration,
      which makes it a valid choice for this.
      
      This is useful in situations where a painless out-of-the-box solution
      for reducing bufferbloat is desired but fq_codel is not necessarily the
      best choice. For example, fq_pie can be better for DASH streaming, but
      there could be more cases where it's the better choice of the two simple
      AQMs available in the kernel.
      Signed-off-by: Danny Lin <danny@kdrag0n.dev>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · d8c8a96c
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2020-07-01
      
      This series contains updates to all Intel drivers, but a majority of the
      changes are to the i40e driver.
      
      Jeff converts 'fall through' comments to the 'fallthrough;' keyword for
      all Intel drivers and removes an unnecessary delay in the ixgbe ethtool
      diagnostics test.
      
      Arkadiusz implements Total Port Shutdown for i40e. This is the revised
      patch based on Jakub's feedback from an earlier submission, where
      additional code comments and description were needed to describe the
      functionality.
      
      Wei Yongjun fixes return error code for iavf_init_get_resources().
      
      Magnus optimizes XDP code in i40e, starting with the AF_XDP zero-copy
      transmit completion path, then by only executing a division when
      necessary in the napi_poll data path. He also moves the check for a
      full transmit ring outside the send loop to increase performance.
      
      Ciara adds XDP ring statistics to i40e and the ability to dump these
      statistics and descriptors.
      
      Tony fixes reporting iavf statistics.
      
      Radoslaw adds support for 2.5 and 5 Gbps by implementing the newer ethtool
      ksettings API in ixgbe.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
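The 'fallthrough;' conversion mentioned in the series above can be illustrated outside the kernel. This is a minimal sketch: the macro is a user-space stand-in modelled on the kernel's attribute wrapper (reproduced here as an assumption, not copied from the tree), and `stages_run()` is a hypothetical function, not driver code.

```c
/*
 * User-space stand-in for the kernel's "fallthrough" pseudo-keyword.
 * Assumption: compilers that support the attribute get a real
 * annotation; others get an empty statement.
 */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Hypothetical example: count how many stages run from a starting stage. */
static int stages_run(int start)
{
    int n = 0;

    switch (start) {
    case 0:
        n++;
        fallthrough;    /* previously a "fall through" comment */
    case 1:
        n++;
        fallthrough;
    case 2:
        n++;
        break;
    default:
        break;
    }
    return n;
}
```

Unlike a comment, the statement form is visible to the compiler, so an intentional fall-through can be told apart from a missing `break`.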
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 11a20c71
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      100GbE Intel Wired LAN Driver Updates 2020-07-01
      
      This series contains updates to the ice driver only.
      
      Jacob implements a devlink region for device capabilities.
      
      Bruce removes structs containing only one-element arrays that are either
      unused or only used for indexing; pointer arithmetic or other indexing
      is used to access the elements instead. He also converts "C struct hack"
      variable-length types to the preferred C99 flexible array member.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
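The conversion from the "C struct hack" to a C99 flexible array member, as described in the series above, might look roughly like the following. The type and helper names (`elem_list`, `elem_list_alloc`) are hypothetical and do not come from the ice driver; the sketch uses plain `malloc()` in user space rather than kernel allocation helpers.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical variable-length type. With the old "struct hack" the
 * last member would have been declared as elems[1]; the C99 flexible
 * array member leaves the size out.
 */
struct elem_list {
    uint16_t count;
    uint32_t elems[];   /* C99 flexible array member, must be last */
};

/* Allocate a list with room for 'count' zero-initialised elements. */
static struct elem_list *elem_list_alloc(uint16_t count)
{
    struct elem_list *l;

    /* Header size plus the trailing array; no "count - 1" adjustment. */
    l = malloc(sizeof(*l) + (size_t)count * sizeof(l->elems[0]));
    if (!l)
        return NULL;

    l->count = count;
    for (uint16_t i = 0; i < count; i++)
        l->elems[i] = 0;
    return l;
}
```

One practical benefit is that the size computation no longer has to subtract the one placeholder element that the `[1]` form bakes into `sizeof`.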
  2. 01 Jul, 2020 35 commits