    net: sched: make skip_sw actually skip software · 047f340b
    Asbjørn Sloth Tønnesen authored
    TC filters come in 3 variants:
    - no flag (try to process in hardware, but fall back to software)
    - skip_hw (do not process filter by hardware)
    - skip_sw (do not process filter by software)
    
    However, skip_sw is implemented so that the skip_sw
    flag can only be checked after the filter has been matched.
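
The ordering problem described above can be illustrated with a small userspace sketch. It is not kernel code: `struct sketch_filter` and `sketch_classify()` are hypothetical names, and the single-key comparison stands in for the much more expensive flow dissection and hash lookup. Only the two flag values are taken from the UAPI header linux/pkt_cls.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined in the kernel UAPI header linux/pkt_cls.h. */
#define TCA_CLS_FLAGS_SKIP_HW (1U << 0)
#define TCA_CLS_FLAGS_SKIP_SW (1U << 1)

/* Hypothetical, heavily simplified filter: matches on one key value. */
struct sketch_filter {
    uint32_t key;
    uint32_t flags;
};

/*
 * Walk the filters the way a software classifier would: the match is
 * attempted first, and the skip_sw flag is only looked at once a filter
 * has matched. *lookups counts the match attempts that were paid for.
 * Returns the index of the filter that handles the packet, or -1.
 */
static int sketch_classify(const struct sketch_filter *f, size_t n,
                           uint32_t pkt_key, size_t *lookups)
{
    *lookups = 0;
    for (size_t i = 0; i < n; i++) {
        (*lookups)++;                       /* stands in for dissect + lookup */
        if (f[i].key != pkt_key)
            continue;                       /* no match, try the next filter */
        if (f[i].flags & TCA_CLS_FLAGS_SKIP_SW)
            continue;                       /* matched, but must not run in software */
        return (int)i;
    }
    return -1;
}
```

With only skip_sw filters installed, every packet pays one match attempt per filter and still ends up unclassified, which is the cost that grows with the rule count in the measurements.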
    
    IMHO it's common, when using skip_sw, to use it on all rules.
    
    So if all filters in a block are skip_sw filters, then
    we can bail out early, and thus avoid having to match
    the filters just to check for the skip_sw flag.
    
    This patch adds a bypass, for when only TC skip_sw rules
    are used. The bypass is guarded by a static key, to avoid
    harming other workloads.
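
One way to picture the bypass is as per-block bookkeeping, shown here as a minimal userspace sketch. All names (`struct sketch_block`, `sketch_block_add()`, `sketch_block_del()`, `sketch_block_bypass_sw()`) are hypothetical and are not claimed to match the patch. The static key is not modelled either: in the kernel it would additionally keep this check off the fast path while no block qualifies.

```c
#include <stdbool.h>

/* Hypothetical per-block counters; field names are illustrative only. */
struct sketch_block {
    unsigned int filtercnt;   /* all filters attached to the block */
    unsigned int skipswcnt;   /* the subset that carries skip_sw */
};

/* Account for a newly added filter. */
static void sketch_block_add(struct sketch_block *b, bool skip_sw)
{
    b->filtercnt++;
    if (skip_sw)
        b->skipswcnt++;
}

/* Account for a removed filter. */
static void sketch_block_del(struct sketch_block *b, bool skip_sw)
{
    b->filtercnt--;
    if (skip_sw)
        b->skipswcnt--;
}

/*
 * Software classification can be skipped entirely only when the block
 * has at least one filter and every filter is a skip_sw filter.
 */
static bool sketch_block_bypass_sw(const struct sketch_block *b)
{
    return b->filtercnt > 0 && b->filtercnt == b->skipswcnt;
}
```

Adding a single filter without skip_sw makes the two counters differ, so the block falls back to normal software classification until that filter is removed again.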
    
    There are 3 ways that a packet from a skip_sw ruleset can
    end up in the kernel path. Although sending packets to a
    non-existent chain is only improved by a few percent,
    I believe it's worth optimizing the trap and fall-through
    use cases.
    
     +----------------------------+--------+--------+--------+
     | Test description           | Pre-   | Post-  | Rel.   |
     |                            | kpps   | kpps   | chg.   |
     +----------------------------+--------+--------+--------+
     | basic forwarding + notrack | 3589.3 | 3587.9 |  1.00x |
     | switch to eswitch mode     | 3081.8 | 3094.7 |  1.00x |
     | add ingress qdisc          | 3042.9 | 3063.6 |  1.01x |
     | tc forward in hw / skip_sw |37024.7 |37028.4 |  1.00x |
     | tc forward in sw / skip_hw | 3245.0 | 3245.3 |  1.00x |
     +----------------------------+--------+--------+--------+
     | tests with only skip_sw rules below:                  |
     +----------------------------+--------+--------+--------+
     | 1 non-matching rule        | 2694.7 | 3058.7 |  1.14x |
     | 1 n-m rule, match trap     | 2611.2 | 3323.1 |  1.27x |
     | 1 n-m rule, goto non-chain | 2886.8 | 2945.9 |  1.02x |
     | 5 non-matching rules       | 1958.2 | 3061.3 |  1.56x |
     | 5 n-m rules, match trap    | 1911.9 | 3327.0 |  1.74x |
     | 5 n-m rules, goto non-chain| 2883.1 | 2947.5 |  1.02x |
     | 10 non-matching rules      | 1466.3 | 3062.8 |  2.09x |
     | 10 n-m rules, match trap   | 1444.3 | 3317.9 |  2.30x |
     | 10 n-m rules,goto non-chain| 2883.1 | 2939.5 |  1.02x |
     | 25 non-matching rules      |  838.5 | 3058.9 |  3.65x |
     | 25 n-m rules, match trap   |  824.5 | 3323.0 |  4.03x |
     | 25 n-m rules,goto non-chain| 2875.8 | 2944.7 |  1.02x |
     | 50 non-matching rules      |  488.1 | 3054.7 |  6.26x |
     | 50 n-m rules, match trap   |  484.9 | 3318.5 |  6.84x |
     | 50 n-m rules,goto non-chain| 2884.1 | 2939.7 |  1.02x |
     +----------------------------+--------+--------+--------+
    
    perf top (25 n-m skip_sw rules - pre patch):
      20.39%  [kernel]  [k] __skb_flow_dissect
      16.43%  [kernel]  [k] rhashtable_jhash2
      10.58%  [kernel]  [k] fl_classify
      10.23%  [kernel]  [k] fl_mask_lookup
       4.79%  [kernel]  [k] memset_orig
       2.58%  [kernel]  [k] tcf_classify
       1.47%  [kernel]  [k] __x86_indirect_thunk_rax
       1.42%  [kernel]  [k] __dev_queue_xmit
       1.36%  [kernel]  [k] nft_do_chain
       1.21%  [kernel]  [k] __rcu_read_lock
    
    perf top (25 n-m skip_sw rules - post patch):
       5.12%  [kernel]  [k] __dev_queue_xmit
       4.77%  [kernel]  [k] nft_do_chain
       3.65%  [kernel]  [k] dev_gro_receive
       3.41%  [kernel]  [k] check_preemption_disabled
       3.14%  [kernel]  [k] mlx5e_skb_from_cqe_mpwrq_nonlinear
       2.88%  [kernel]  [k] __netif_receive_skb_core.constprop.0
       2.49%  [kernel]  [k] mlx5e_xmit
       2.15%  [kernel]  [k] ip_forward
       1.95%  [kernel]  [k] mlx5e_tc_restore_tunnel
       1.92%  [kernel]  [k] vlan_gro_receive
    
    Test setup:
     DUT: Intel Xeon D-1518 (2.20GHz) w/ Nvidia/Mellanox ConnectX-6 Dx 2x100G
     Data rate measured on switch (Extreme X690), and DUT connected as
     a router on a stick, with pktgen and pktsink as VLANs.
     Pktgen-dpdk was in range 36.6-37.7 Mpps 64B packets across all tests.
     Full test data at https://files.fiberby.net/ast/2024/tc_skip_sw/v2_tests/
    
    Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
cls_api.c 99.4 KB