    sctp: linearize early if it's not GSO · 4c2f2454
    Marcelo Ricardo Leitner authored
    Linearize the skb early when it is not a GSO packet. Otherwise, when
    the CRC computation is still needed, doing it over a non-linear buffer
    is far more expensive than over a linear one, to the point that it
    affects performance.
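
As a rough user-space illustration (not the kernel code), the sketch below implements CRC32c, the checksum SCTP uses, in plain Python. The helper names and the three-way fragment split are hypothetical. It only shows that checksumming a packet fragment by fragment yields the same value as checksumming one contiguous copy; it does not reproduce the kernel's per-fragment mapping and CRC-combining work, which is most likely what shows up as `crc32_generic_shift` in the profile below and what linearizing up front avoids.

```python
# Minimal sketch, not kernel code: table-driven CRC32c (Castagnoli).
CRC32C_POLY_REFLECTED = 0x82F63B78

def _build_table():
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED if crc & 1 else crc >> 1
        table.append(crc)
    return table

_TABLE = _build_table()

def crc32c_update(state, data):
    """Feed `data` into a running (pre-inverted) CRC32c state."""
    for b in data:
        state = _TABLE[(state ^ b) & 0xFF] ^ (state >> 8)
    return state

def crc32c(data):
    """CRC32c of one contiguous (linear) buffer."""
    return crc32c_update(0xFFFFFFFF, data) ^ 0xFFFFFFFF

def crc32c_fragments(fragments):
    """CRC32c of a buffer held as a list of separate fragments."""
    state = 0xFFFFFFFF
    for frag in fragments:
        state = crc32c_update(state, frag)
    return state ^ 0xFFFFFFFF

# Hypothetical 12000-byte message split into three fragments.
payload = bytes(i % 251 for i in range(12000))
fragments = [payload[:1400], payload[1400:5000], payload[5000:]]
linear = b"".join(fragments)  # the "linearize" step: one contiguous copy

assert crc32c(b"123456789") == 0xE3069283  # standard CRC32c check value
assert crc32c_fragments(fragments) == crc32c(linear)
```

Both paths give the same checksum, so the choice between them is purely about cost.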
    
    It's so expensive that crc32_generic_shift tops the perf profile of a
    netperf test. Before patch:
    
    Overhead  Command         Shared Object       Symbol
      18,62%  netserver       [kernel.vmlinux]    [k] crc32_generic_shift
       2,57%  netserver       [kernel.vmlinux]    [k] __pskb_pull_tail
       1,94%  netserver       [kernel.vmlinux]    [k] fib_table_lookup
       1,90%  netserver       [kernel.vmlinux]    [k] copy_user_enhanced_fast_string
       1,66%  swapper         [kernel.vmlinux]    [k] intel_idle
       1,63%  netserver       [kernel.vmlinux]    [k] _raw_spin_lock
       1,59%  netserver       [sctp]              [k] sctp_packet_transmit
       1,55%  netserver       [kernel.vmlinux]    [k] memcpy_erms
       1,42%  netserver       [sctp]              [k] sctp_rcv
    
    # netperf -H 192.168.10.1 -l 10 -t SCTP_STREAM -cC -- -m 12000
    SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1 () port 0 AF_INET
    Recv   Send    Send                          Utilization       Service Demand
    Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
    Size   Size    Size     Time     Throughput  local    remote   local   remote
    bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
    
    212992 212992  12000    10.00      3016.42   2.88     3.78     1.874   2.462
    
    After patch:
    Overhead  Command         Shared Object      Symbol
       2,75%  netserver       [kernel.vmlinux]   [k] memcpy_erms
       2,63%  netserver       [kernel.vmlinux]   [k] copy_user_enhanced_fast_string
       2,39%  netserver       [kernel.vmlinux]   [k] fib_table_lookup
       2,04%  netserver       [kernel.vmlinux]   [k] __pskb_pull_tail
       1,91%  netserver       [kernel.vmlinux]   [k] _raw_spin_lock
       1,91%  netserver       [sctp]             [k] sctp_packet_transmit
       1,72%  netserver       [mlx4_en]          [k] mlx4_en_process_rx_cq
       1,68%  netserver       [sctp]             [k] sctp_rcv
    
    # netperf -H 192.168.10.1 -l 10 -t SCTP_STREAM -cC -- -m 12000
    SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1 () port 0 AF_INET
    Recv   Send    Send                          Utilization       Service Demand
    Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
    Size   Size    Size     Time     Throughput  local    remote   local   remote
    bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
    
    212992 212992  12000    10.00      3681.77   3.83     3.46     2.045   1.849
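
To put the two netperf runs side by side, the short sketch below computes the relative change in throughput and in receive-side service demand. The numbers are copied from the output above; the dictionary keys and the `pct_change` helper are illustrative names only.

```python
# Figures copied from the two netperf runs in this commit message.
before = {"throughput_mbit_s": 3016.42, "recv_us_per_kb": 2.462}
after = {"throughput_mbit_s": 3681.77, "recv_us_per_kb": 1.849}

def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

for key in before:
    print(f"{key}: {pct_change(before[key], after[key]):+.1f}%")
```

This works out to roughly 22% more throughput and about 25% lower service demand on the receiving side, which is where the CRC was being computed.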
    
    Fixes: 3acb50c1 ("sctp: delay as much as possible skb_linearize")
    Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>