1. 30 Nov, 2018 11 commits
    • Merge branch 'tcp-take-a-bit-more-care-of-backlog-stress' · 2f695553
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      tcp: take a bit more care of backlog stress
      
      While working on the SACK compression issue Jean-Louis Dupond
      reported, we found that his linux box was suffering very hard
      from tail drops on the socket backlog queue.
      
      First patch hints the compiler about sack flows being the norm.
      
      Second patch changes non-sack code in preparation of the ack
      compression.
      
      Third patch fixes tcp_space() to take backlog into account.
      
      Fourth patch is attempting coalescing when a new packet must
      be added to the backlog queue. Cooking bigger skbs helps
      to keep backlog list smaller and speeds its handling when
      user thread finally releases the socket lock.
      
      v3: Neal/Yuchung feedback addressed:
          Do not aggregate if any skb has URG bit set.
          Do not aggregate if the skbs have different ECE/CWR bits.

      v2: added feedback from Neal: tcp: take care of compressed acks in tcp_add_reno_sack()
          added: tcp: hint compiler about sack flows
          added: tcp: make tcp_space() aware of socket backlog
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: implement coalescing on backlog queue · 4f693b55
      Eric Dumazet authored
      In case GRO is disabled or not as efficient as it should be,
      we might have a user thread trapped in __release_sock() while
      softirq handlers flood packets up to the point where we have to drop.
      
      This patch balances work done from user thread and softirq,
      to give more chances to __release_sock() to complete its work
      before new packets are added to the backlog.
      
      This also helps if we receive many ACK packets, since GRO
      does not aggregate them.
      
      This patch brings a ~60% throughput increase on a receiver
      without GRO, but the spectacular gain is really the
      1000x reduction in release_sock() latency I have measured.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
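The coalescing rules mentioned in this series (merge only when no URG bit is set and the ECE/CWR bits match) can be illustrated with a small user-space sketch. This is not the kernel's tcp_add_backlog(); `struct seg`, `seg_try_coalesce()` and the contiguity check are hypothetical simplifications chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_URG 0x20u
#define FLAG_ECE 0x40u
#define FLAG_CWR 0x80u

/* Hypothetical, simplified stand-in for a segment queued on the backlog. */
struct seg {
    uint32_t seq;      /* first sequence number covered */
    uint32_t end_seq;  /* one past the last sequence number covered */
    uint8_t  flags;    /* TCP header flags */
    uint16_t segs;     /* number of original packets represented */
};

/* Try to merge 'next' into 'tail'. Returns true when the merge happened. */
static bool seg_try_coalesce(struct seg *tail, const struct seg *next)
{
    assert(tail != NULL && next != NULL);

    /* Only contiguous data can be merged (assumption of this sketch). */
    if (tail->end_seq != next->seq)
        return false;
    /* Do not aggregate if any segment has the URG bit set. */
    if ((tail->flags | next->flags) & FLAG_URG)
        return false;
    /* Do not aggregate if the ECE/CWR bits differ. */
    if ((tail->flags ^ next->flags) & (FLAG_ECE | FLAG_CWR))
        return false;

    tail->end_seq = next->end_seq;
    tail->flags |= next->flags;
    tail->segs = (uint16_t)(tail->segs + next->segs);
    return true;
}
```

A caller would attempt the merge against the tail of the queue first and append a new entry only when it returns false, which keeps the list shorter for whoever drains it later.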
    • tcp: make tcp_space() aware of socket backlog · 85bdf7db
      Eric Dumazet authored
      Jean-Louis Dupond reported poor iscsi TCP receive performance
      that we tracked to backlog drops.
      
      Apparently we fail to send window updates reflecting the
      fact that we are under stress.
      
      Note that we might lack a proper window increase when
      backlog is fully processed, since __release_sock() clears
      sk->sk_backlog.len _after_ all skbs have been processed.
      
      This should not matter in practice. If we had a significant
      load through socket backlog, we are in a dangerous
      situation.
      Reported-by: Jean-Louis Dupond <jean-louis@dupond.be>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Tested-by: Jean-Louis Dupond <jean-louis@dupond.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
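The idea of making the free-space computation backlog-aware can be sketched as follows. This is a minimal user-space illustration, not the kernel's tcp_space(); `struct rx_sock` and `rx_free_space()` are hypothetical names, and any scaling the kernel applies to the result is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receive-side accounting for one socket, in bytes. */
struct rx_sock {
    int rcvbuf;       /* receive buffer limit */
    int rmem_alloc;   /* memory already charged to the receive queue */
    int backlog_len;  /* memory held by packets waiting in the backlog */
};

/* Free space that may be advertised, counting backlog as used memory. */
static int rx_free_space(const struct rx_sock *sk)
{
    int space;

    assert(sk != NULL);
    space = sk->rcvbuf - sk->rmem_alloc - sk->backlog_len;
    return space > 0 ? space : 0;
}
```

With the backlog term included, a socket whose backlog is filling up reports less free space, so the advertised window can shrink before packets are dropped.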
    • tcp: take care of compressed acks in tcp_add_reno_sack() · 19119f29
      Eric Dumazet authored
      Neal pointed out that non-SACK flows might suffer from the ACK compression
      added in the following patch ("tcp: implement coalescing on backlog queue").
      
      Instead of tweaking tcp_add_backlog(), we can take into
      account how many ACKs were coalesced; this information
      will be available in skb_shinfo(skb)->gso_segs.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
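The accounting change described here, crediting several duplicate ACKs at once when one coalesced packet stands for many, might look roughly like the sketch below. `struct reno_state` and `reno_add_dupacks()` are hypothetical, and the clamp to `packets_out` is a simplification of whatever limits the kernel actually enforces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-flow counters for a non-SACK (Reno) sender. */
struct reno_state {
    uint32_t packets_out;  /* packets currently in flight */
    uint32_t sacked_out;   /* emulated count of packets known to be received */
};

/*
 * Account for 'num_dupack' duplicate ACKs in one step. For a coalesced
 * packet, the caller would pass the number of ACKs it represents.
 */
static void reno_add_dupacks(struct reno_state *st, uint32_t num_dupack)
{
    assert(st != NULL);
    if (num_dupack == 0)
        return;
    st->sacked_out += num_dupack;
    /* Simplified limit: never claim more than what is in flight. */
    if (st->sacked_out > st->packets_out)
        st->sacked_out = st->packets_out;
}
```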
    • tcp: hint compiler about sack flows · ebeef4bc
      Eric Dumazet authored
      Tell the compiler that most TCP flows are using SACK these days.
      
      There is no need to add the unlikely() clause in tcp_is_reno();
      the compiler is able to infer it.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
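A branch hint of this kind can be sketched in user space with the GCC/Clang builtin `__builtin_expect`. The struct and function names below are hypothetical stand-ins, not the kernel's definitions; the point is that hinting the SACK predicate is enough, since the Reno predicate is just its negation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same shape as the kernel's likely() macro; relies on a GCC/Clang builtin. */
#define likely(x) __builtin_expect(!!(x), 1)

/* Hypothetical per-flow state: was SACK negotiated? */
struct flow {
    bool sack_ok;
};

/* Hint that the common case is a SACK-enabled flow. */
static inline bool flow_is_sack(const struct flow *f)
{
    assert(f != NULL);
    return likely(f->sack_ok);
}

/* No explicit unlikely() here: the compiler can derive it from the negation. */
static inline bool flow_is_reno(const struct flow *f)
{
    return !flow_is_sack(f);
}
```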
    • net: Add trace events for all receive exit points · b0e3f1bd
      Geneviève Bastien authored
      Trace events are already present for the receive entry points, to indicate
      how the reception entered the stack.
      
      This patch adds the corresponding exit trace events that will bound the
      reception such that all events occurring between the entry and the exit
      can be considered as part of the reception context. This greatly helps
      for dependency and root cause analyses.
      
      Without this, it is not possible with tracepoint instrumentation to
      determine whether a sched_wakeup event following a netif_receive_skb
      event is the result of the packet reception or a simple coincidence after
      further processing by the thread. It is possible using other mechanisms
      like kretprobes, but considering the "entry" points are already present,
      it would be good to add the matching exit events.
      
      In addition to linking packets with wakeups, the entry/exit event pair
      can also be used to perform network stack latency analyses.
      Signed-off-by: Geneviève Bastien <gbastien@versatic.net>
      CC: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      CC: Steven Rostedt <rostedt@goodmis.org>
      CC: Ingo Molnar <mingo@redhat.com>
      CC: David S. Miller <davem@davemloft.net>
      Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> (tracing side)
      Signed-off-by: David S. Miller <davem@davemloft.net>
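How an analysis tool might use the entry/exit pair to attribute wakeups to packet reception can be sketched as below. The event kinds and the function `wakeups_in_rx_context()` are hypothetical, and the sketch assumes a single time-ordered event stream from one CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds, loosely modelled on the tracepoints named above. */
enum ev_type {
    EV_RX_ENTRY,  /* a receive entry event, e.g. netif_receive_skb */
    EV_RX_EXIT,   /* the matching receive exit event */
    EV_WAKEUP     /* a scheduler wakeup, e.g. sched_wakeup */
};

/*
 * Count wakeups that occur between a receive entry and its matching exit.
 * Nested entries are tracked with a depth counter.
 */
static size_t wakeups_in_rx_context(const enum ev_type *evs, size_t n)
{
    size_t depth = 0, hits = 0;

    assert(evs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (evs[i]) {
        case EV_RX_ENTRY:
            depth++;
            break;
        case EV_RX_EXIT:
            if (depth > 0)
                depth--;
            break;
        case EV_WAKEUP:
            if (depth > 0)
                hits++;
            break;
        }
    }
    return hits;
}
```

Without the exit event, the depth counter could never be decremented, so a wakeup long after the packet was handled would be indistinguishable from one caused by it.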
    • net/flow_dissector: correct comments on enum flow_dissector_key_id · 91c45956
      Edward Cree authored
      There are no such structs as flow_dissector_key_flow_vlan or
      flow_dissector_key_flow_tags; the actual structs used are struct
      flow_dissector_key_vlan and struct flow_dissector_key_tags. So correct the
      comments against FLOW_DISSECTOR_KEY_VLAN, FLOW_DISSECTOR_KEY_FLOW_LABEL and
      FLOW_DISSECTOR_KEY_CVLAN to refer to those.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: number of VFs supported is not always 16 · 1b974aa4
      Ganesh Goudar authored
      The total number of VFs supported by the PF is used to determine the
      last byte of each VF's MAC address. The number of VFs supported is not
      always 16, so use the variable nvfs to get the number of VFs supported
      rather than hard-coding it to 16.
      Signed-off-by: Casey Leedom <leedom@chelsio.com>
      Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
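Why a hard-coded 16 is a problem can be shown with a small sketch. The commit message does not spell out the formula, so the one below (`pf * nvfs + vf`) and the helper name `vf_mac_last_byte()` are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: last MAC byte for virtual function 'vf' of physical
 * function 'pf', where each PF supports 'nvfs' VFs. The formula is an
 * assumption of this sketch, not taken from the driver.
 */
static uint8_t vf_mac_last_byte(unsigned int pf, unsigned int nvfs,
                                unsigned int vf)
{
    assert(vf < nvfs);
    return (uint8_t)(pf * nvfs + vf);
}
```

Under this assumed formula, a stride fixed at 16 would give PF 0 / VF 16 the same byte as PF 1 / VF 0 on hardware with 32 VFs per PF (0 * 16 + 16 equals 1 * 16 + 0), whereas using the real VF count as the stride keeps the values distinct.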
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 93029d7d
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      bpf-next 2018-11-30
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      (Getting out a bit earlier this time to pull in a dependency from bpf.)
      
      The main changes are:
      
      1) Add libbpf ABI versioning and document API naming conventions
         as well as ABI versioning process, from Andrey.
      
      2) Add a new sk_msg_pop_data() helper for sk_msg based BPF
         programs that is used in conjunction with sk_msg_push_data()
         for adding / removing meta data to the msg data, from John.
      
      3) Optimize convert_bpf_ld_abs() for 0 offset and fix various
         lib and testsuite build failures on 32 bit, from David.
      
      4) Make BPF prog dump for !JIT identical to how we dump subprogs
         when JIT is in use, from Yonghong.
      
      5) Rename btf_get_from_id() to conform better to libbpf
         API naming conventions, from Martin.
      
      6) Add a missing BPF kselftest config item, from Naresh.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tools/bpf: make libbpf _GNU_SOURCE friendly · b4269954
      Yonghong Song authored
      While porting libbpf to bcc, I got some warnings like the ones below:
        ...
        [  2%] Building C object src/cc/CMakeFiles/bpf-shared.dir/libbpf/src/libbpf.c.o
        /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf.c:12:0:
        warning: "_GNU_SOURCE" redefined [enabled by default]
         #define _GNU_SOURCE
        ...
        [  3%] Building C object src/cc/CMakeFiles/bpf-shared.dir/libbpf/src/libbpf_errno.c.o
        /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf_errno.c: In function ‘libbpf_strerror’:
        /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf_errno.c:45:7:
        warning: assignment makes integer from pointer without a cast [enabled by default]
           ret = strerror_r(err, buf, size);
        ...
      
      bcc is built with _GNU_SOURCE defined, and this caused the above warnings.
      This patch intends to make libbpf _GNU_SOURCE friendly by
        . defining _GNU_SOURCE in libbpf.c only if it is not already defined
        . undefining _GNU_SOURCE in libbpf_errno.c, as the non-GNU version of
          strerror_r is expected.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
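The second bullet can be illustrated with a standalone sketch of a source file that wants the int-returning strerror_r() regardless of how the build defines _GNU_SOURCE. The helper `err_to_string()` is hypothetical, and the `_POSIX_C_SOURCE` definition is an addition of this sketch rather than something the commit message mentions. The first bullet corresponds to the usual guard `#ifndef _GNU_SOURCE` / `#define _GNU_SOURCE` / `#endif` placed before any include.

```c
/*
 * Drop _GNU_SOURCE before any libc header is included, so that glibc
 * exposes the XSI strerror_r() that returns int rather than char *.
 */
#undef _GNU_SOURCE
/*
 * Assumption of this sketch: request POSIX.1-2008 explicitly so that
 * strerror_r() is declared even in strict standard modes.
 */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical wrapper: fill 'buf' with the message for 'err'. */
static int err_to_string(int err, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    /* XSI variant: returns 0 on success, writes the message into buf. */
    return strerror_r(err, buf, size);
}
```

The macro handling only works if it precedes the first libc include in the translation unit, because the feature-test macros are evaluated when the first header is processed.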
    • David S. Miller's avatar
      3d58c9c9
  2. 29 Nov, 2018 11 commits
  3. 28 Nov, 2018 18 commits