1. 05 Nov, 2013 9 commits
    • virtio-net: coalesce rx frags when possible during rx · ba275241
      Jason Wang authored
      Commit 2613af0e (virtio_net: migrate mergeable
      rx buffers to page frag allocators) tries to increase the payload/truesize
      ratio for MTU-sized traffic. But this introduces extra overhead for received
      GSO packets because of the frag list. This commit reduces that overhead by
      coalescing rx frags when possible during rx. Test results show about a 15%
      improvement in full-size GSO packet receiving (even better than before
      commit 2613af0e).
      
      Before this commit:
      ./netperf -H 192.168.100.4
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.100.4
      () port 0 AF_INET : demo
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    10.00    20303.87
      
      After this commit:
      ./netperf -H 192.168.100.4
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.100.4
      () port 0 AF_INET : demo
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    10.00    23841.26
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Michael Dalton <mwdalton@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: introduce skb_coalesce_rx_frag() · f8e617e1
      Jason Wang authored
      Sometimes we need to coalesce rx frags to avoid a frag list. One example is
      the virtio-net driver, which tries to use small frags for both MTU-sized
      packets and GSO packets. So this patch introduces skb_coalesce_rx_frag() to
      do this.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Michael Dalton <mwdalton@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
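The coalescing idea described above can be sketched in userspace C. This is only an illustrative model, not the kernel implementation: `struct toy_frag`, `struct toy_skb`, `TOY_MAX_FRAGS`, `toy_coalesce_rx_frag()` and `toy_add_rx_buf()` are hypothetical names, and the contiguity check is an assumption about when two receive buffers may be merged.

```c
#include <stddef.h>

#define TOY_MAX_FRAGS 17 /* hypothetical limit, chosen for the sketch */

/* Hypothetical, simplified stand-in for a page fragment descriptor. */
struct toy_frag {
    const char *page;    /* backing page of this fragment */
    unsigned int offset; /* start of the data within the page */
    unsigned int size;   /* bytes of data in this fragment */
};

/* Hypothetical, simplified stand-in for a socket buffer. */
struct toy_skb {
    unsigned int len;      /* total bytes of data */
    unsigned int data_len; /* bytes held in fragments */
    unsigned int truesize; /* memory charged for this buffer */
    unsigned int nr_frags;
    struct toy_frag frags[TOY_MAX_FRAGS];
};

/* Grow fragment i in place instead of appending a new fragment. */
static void toy_coalesce_rx_frag(struct toy_skb *skb, unsigned int i,
                                 unsigned int size, unsigned int truesize)
{
    skb->frags[i].size += size;
    skb->len += size;
    skb->data_len += size;
    skb->truesize += truesize;
}

/*
 * Attach a received buffer to skb.
 * Returns 1 if it was merged into the last fragment, 0 if a new
 * fragment was appended, and -1 if there is no room left.
 */
static int toy_add_rx_buf(struct toy_skb *skb, const char *page,
                          unsigned int offset, unsigned int size,
                          unsigned int truesize)
{
    if (skb->nr_frags > 0) {
        struct toy_frag *last = &skb->frags[skb->nr_frags - 1];

        /* Same page and directly after the previous data: merge. */
        if (last->page == page && last->offset + last->size == offset) {
            toy_coalesce_rx_frag(skb, skb->nr_frags - 1, size, truesize);
            return 1;
        }
    }
    if (skb->nr_frags == TOY_MAX_FRAGS)
        return -1;

    skb->frags[skb->nr_frags].page = page;
    skb->frags[skb->nr_frags].offset = offset;
    skb->frags[skb->nr_frags].size = size;
    skb->nr_frags++;
    skb->len += size;
    skb->data_len += size;
    skb->truesize += truesize;
    return 0;
}
```

In this model, two back-to-back buffers carved from the same page end up as one larger fragment, so a large GSO packet needs fewer fragment slots before it would have to spill into a frag list.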
    • vxlan: Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)) · e50fddc8
      Duan Jiong authored
      Trivial patch converting ERR_PTR(PTR_ERR()) into ERR_CAST().
      No functional changes.
      Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
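To illustrate why the conversion is purely cosmetic, here is a minimal userspace sketch of the error-pointer idiom. The `toy_*` helpers are hypothetical re-creations written for this example (the kernel's own helpers live in its headers and are not reproduced here), and `toy_find_route()` and the two `toy_open_*()` functions are invented callers.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095 /* assumed size of the error range */

/* Encode a negative errno value as a pointer. */
static inline void *toy_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Decode an error pointer back into a negative errno value. */
static inline long toy_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top TOY_MAX_ERRNO addresses. */
static inline int toy_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Re-type an error pointer without decoding and re-encoding it. */
static inline void *toy_err_cast(const void *ptr)
{
    return (void *)(uintptr_t)ptr;
}

struct toy_route { int id; };
struct toy_sock { int route_id; };

/* Hypothetical lookup that fails for negative ids. */
static struct toy_route *toy_find_route(int id)
{
    static struct toy_route route;

    if (id < 0)
        return toy_err_ptr(-ENOENT);
    route.id = id;
    return &route;
}

/* Old style: decode the error, then encode it again. */
static struct toy_sock *toy_open_old(int id)
{
    static struct toy_sock sock;
    struct toy_route *rt = toy_find_route(id);

    if (toy_is_err(rt))
        return toy_err_ptr(toy_ptr_err(rt));
    sock.route_id = rt->id;
    return &sock;
}

/* New style: pass the same error through with a single cast. */
static struct toy_sock *toy_open_new(int id)
{
    static struct toy_sock sock;
    struct toy_route *rt = toy_find_route(id);

    if (toy_is_err(rt))
        return toy_err_cast(rt);
    sock.route_id = rt->id;
    return &sock;
}
```

Both callers return the same pointer value on failure; the cast form simply states the intent (changing the pointer type of an error) more directly.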
    • net: codel: Avoid undefined behavior from signed overflow · 1ba3aab3
      Jesper Dangaard Brouer authored
      As described in commit 5a581b36 (jiffies: Avoid undefined
      behavior from signed overflow), according to the C standard
      3.4.3p3, overflow of a signed integer results in undefined
      behavior.
      
      To fix this, do as the above commit does: perform an unsigned
      subtraction and interpret the result as a signed
      two's-complement number.  This is based on the theory from
      RFC 1982 and is nicely described on Wikipedia here:
       https://en.wikipedia.org/wiki/Serial_number_arithmetic#General_Solution
      
      As a side note, I have seen practical issues with the previous logic
      when dealing with 16-bit values on a 64-bit machine (gcc version
      4.4.5). The values here are 32-bit, for which I have not observed issues.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Jesper Dangaard Brouer <netoptimizer@brouer.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
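The technique described in this commit message (subtract as unsigned, then interpret the difference as signed) can be sketched as follows. The names `toy_time_t`, `toy_time_after()` and `toy_time_before()` are hypothetical and chosen for this example; the 32-bit width follows the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t toy_time_t; /* hypothetical 32-bit wrapping timestamp */

/*
 * True if a is later than b in serial-number order.
 *
 * The subtraction is done on unsigned values, so it wraps modulo 2^32
 * and is well defined. Converting the difference to int32_t is
 * implementation-defined rather than undefined, and gives the
 * two's-complement reading on mainstream compilers.
 */
static inline bool toy_time_after(toy_time_t a, toy_time_t b)
{
    return (int32_t)(uint32_t)(a - b) > 0;
}

/* True if a is earlier than b in serial-number order. */
static inline bool toy_time_before(toy_time_t a, toy_time_t b)
{
    return toy_time_after(b, a);
}
```

The comparison stays correct across a counter wrap as long as the two timestamps are less than 2^31 ticks apart, which is the usual serial number arithmetic restriction.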
    • Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next · 13521a57
      David S. Miller authored
      Marc Kleine-Budde says:
      
      ====================
      here's a pull request for net-next.
      
      It includes a patch by Oliver Hartkopp et al. that adds documentation
      for the broadcast manager to Documentation/networking/can.txt. Three
      patches by me that clean up the netlink handling code in the CAN core.
      And another patch that removes an unneeded function from the ti_hecc
      driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: properly handle stretch acks in slow start · 9f9843a7
      Yuchung Cheng authored
      Slow start currently increases cwnd by 1 if an ACK acknowledges some
      packets, regardless of the number of packets. Consequently, slow start
      performance is highly dependent on the degree of the stretch ACKs caused by
      receiver or network ACK compression mechanisms (e.g., delayed ACK,
      GRO, etc.). But the slow start algorithm is meant to send twice the number
      of packets that have left the network, so it should process a stretch ACK of
      degree N as if it were N ACKs of degree 1, then exit when cwnd exceeds
      ssthresh. A follow-up patch will use the remainder of N (if greater than 1)
      to adjust cwnd in the congestion avoidance phase.
      
      In addition this patch retires the experimental limited slow start
      (LSS) feature. LSS has multiple drawbacks but questionable benefit. The
      fractional cwnd increase in LSS requires a loop in slow start even
      though it's rarely used. Configuring such an increase step via a global
      sysctl for different BDPs seems hard. Finally, and most importantly, the
      slow start overshoot concern is now better covered by the Hybrid slow
      start (hystart) enabled by default.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
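The stretch-ACK handling described above can be sketched roughly as below. This is a simplified model under stated assumptions, not the kernel code: `struct toy_tcp` and `toy_slow_start()` are hypothetical names, the connection is assumed to be in slow start on entry (`cwnd <= ssthresh`), and the clamp field is an added assumption.

```c
#include <stdint.h>

/* Hypothetical, minimal congestion state for the sketch. */
struct toy_tcp {
    uint32_t cwnd;       /* congestion window, in packets */
    uint32_t ssthresh;   /* slow start threshold, in packets */
    uint32_t cwnd_clamp; /* upper bound on cwnd */
};

/*
 * Process one ACK that acknowledges `acked` packets as if it were
 * `acked` separate ACKs of one packet each.
 *
 * Assumes tp->cwnd <= tp->ssthresh on entry. Returns the number of
 * acknowledged packets left over after slow start has been exited,
 * which a caller could feed into congestion avoidance.
 */
static uint32_t toy_slow_start(struct toy_tcp *tp, uint32_t acked)
{
    uint32_t cwnd = tp->cwnd + acked;

    /* Stop growing as soon as cwnd exceeds ssthresh. */
    if (cwnd > tp->ssthresh)
        cwnd = tp->ssthresh + 1;

    /* Subtract the part of the ACK consumed by slow start. */
    acked -= cwnd - tp->cwnd;

    tp->cwnd = cwnd < tp->cwnd_clamp ? cwnd : tp->cwnd_clamp;
    return acked;
}
```

For example, with `cwnd = 98` and `ssthresh = 100`, an ACK covering 8 packets raises `cwnd` to 101 and leaves 5 packets unconsumed, whereas counting the ACK as a single unit would only have raised `cwnd` to 99.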
    • tcp: enable sockets to use MSG_FASTOPEN by default · 0d41cca4
      Yuchung Cheng authored
      Applications have started to use Fast Open (e.g., Chrome browser has
      such an optional flag) and the feature has gone through several
      generations of kernels since 3.7 with many real network tests. It's
      time to enable this flag by default for applications to test more
      conveniently and extensively.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables · f8785c55
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      This batch contains five nf_tables patches for your net-next tree,
      they are:
      
      * Fix possible use after free in the module removal path of the
        x_tables compatibility layer, from Dan Carpenter.
      
      * Add filter chain type for the bridge family, from myself.
      
      * Fix Kconfig dependencies of the nf_tables bridge family with
        the core, from myself.
      
      * Fix sparse warnings in nft_nat, from Tomasz Bursztyka.
      
      * Remove duplicated include in the IPv4 family support for nf_tables,
        from Wei Yongjun.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 72c39a0a
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      This is another batch containing Netfilter/IPVS updates for your net-next
      tree, they are:
      
      * Six patches to make the ipt_CLUSTERIP target support netnamespace,
        from Gao feng.
      
      * Two cleanups for the nf_conntrack_acct infrastructure, introducing
        a new structure to encapsulate conntrack counters, from Holger
        Eitzenberger.
      
      * Fix missing verdict in SCTP support for IPVS, from Daniel Borkmann.
      
      * Skip checksum recalculation in SCTP support for IPVS, also from
        Daniel Borkmann.
      
      * Fix behavioural change in xt_socket after IP early demux, from
        Florian Westphal.
      
      * Fix bogus large memory allocation in the bitmap port set type in ipset,
        from Jozsef Kadlecsik.
      
      * Fix possible compilation issues in the hash netnet set type in ipset,
        also from Jozsef Kadlecsik.
      
      * Define constants to identify netlink callback data in ipset dumps,
        again from Jozsef Kadlecsik.
      
      * Use sock_gen_put() in xt_socket to replace xt_socket_put_sk,
        from Eric Dumazet.
      
      * Improvements for the SH scheduler in IPVS, from Alexander Frolkin.
      
      * Remove extra delay due to unneeded rcu barrier in IPVS net namespace
        cleanup path, from Julian Anastasov.
      
      * Save some cycles in ip6t_REJECT by skipping checksum validation in
        packets leaving from our stack, from Stanislav Fomichev.
      
      * Fix IPVS_CMD_ATTR_MAX definition in IPVS, larger than required, from
        Julian Anastasov.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 04 Nov, 2013 31 commits