1. 16 Jun, 2021 26 commits
  2. 15 Jun, 2021 14 commits
    • net: ti: add pp skb recycling support · a078d981
      Lorenzo Bianconi authored
      As already done for mvneta and mvpp2, enable skb recycling for ti
      ethernet drivers
      
      ti driver on net-next:
      ----------------------
      [perf top]
       47.15%  [kernel]     [k] _raw_spin_unlock_irqrestore
       11.77%  [kernel]     [k] __cpdma_chan_free
        3.16%  [kernel]     [k] ___bpf_prog_run
        2.52%  [kernel]     [k] cpsw_rx_vlan_encap
        2.34%  [kernel]     [k] __netif_receive_skb_core
        2.27%  [kernel]     [k] free_unref_page
        2.26%  [kernel]     [k] kmem_cache_free
        2.24%  [kernel]     [k] kmem_cache_alloc
        1.69%  [kernel]     [k] __softirqentry_text_start
        1.61%  [kernel]     [k] cpsw_rx_handler
        1.19%  [kernel]     [k] page_pool_release_page
        1.19%  [kernel]     [k] clear_bits_ll
        1.15%  [kernel]     [k] page_frag_free
        1.06%  [kernel]     [k] __dma_page_dev_to_cpu
        0.99%  [kernel]     [k] memset
        0.94%  [kernel]     [k] __alloc_pages_bulk
        0.92%  [kernel]     [k] kfree_skb
        0.85%  [kernel]     [k] packet_rcv
        0.78%  [kernel]     [k] page_address
        0.75%  [kernel]     [k] v7_dma_inv_range
        0.71%  [kernel]     [k] __lock_text_start
      
      [iperf3 tcp]
      [  5]   0.00-10.00  sec   873 MBytes   732 Mbits/sec    0   sender
      [  5]   0.00-10.01  sec   866 MBytes   726 Mbits/sec        receiver
      
      ti + skb recycling:
      -------------------
      [perf top]
       40.58%  [kernel]    [k] _raw_spin_unlock_irqrestore
       16.18%  [kernel]    [k] __softirqentry_text_start
       10.33%  [kernel]    [k] __cpdma_chan_free
        2.62%  [kernel]    [k] ___bpf_prog_run
        2.05%  [kernel]    [k] cpsw_rx_vlan_encap
        2.00%  [kernel]    [k] kmem_cache_alloc
        1.86%  [kernel]    [k] __netif_receive_skb_core
        1.80%  [kernel]    [k] kmem_cache_free
        1.63%  [kernel]    [k] cpsw_rx_handler
        1.12%  [kernel]    [k] cpsw_rx_mq_poll
        1.11%  [kernel]    [k] page_pool_put_page
        1.04%  [kernel]    [k] _raw_spin_unlock
        0.97%  [kernel]    [k] clear_bits_ll
        0.90%  [kernel]    [k] packet_rcv
        0.88%  [kernel]    [k] __dma_page_dev_to_cpu
        0.85%  [kernel]    [k] kfree_skb
        0.80%  [kernel]    [k] memset
        0.71%  [kernel]    [k] __lock_text_start
        0.66%  [kernel]    [k] v7_dma_inv_range
        0.64%  [kernel]    [k] gen_pool_free_owner
      
      [iperf3 tcp]
      [  5]   0.00-10.00  sec   884 MBytes   742 Mbits/sec    0   sender
      [  5]   0.00-10.01  sec   878 MBytes   735 Mbits/sec        receiver
      Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: wwan: iosm: Fix htmldocs warnings · 925a56b2
      M Chetan Kumar authored
      Fixes .rst file warnings seen on linux-next build.
      
      Fixes: f7af616c ("net: iosm: infrastructure")
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: M Chetan Kumar <m.chetan.kumar@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-pf: Fix spelling mistake "morethan" -> "more than" · f25dcde9
      Colin Ian King authored
      There is a spelling mistake in a dev_err message. Fix it.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: remove redundant null check on dev · 11b57faf
      Colin Ian King authored
      The pointer dev can never be null, so the null check is redundant
      and can be removed. This cleans up a static analysis warning that
      pointer priv dereferences dev before dev is null
      checked.
      
      Addresses-Coverity: ("Dereference before null check")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bonding: Use per-cpu rr_tx_counter · 848ca918
      Jussi Maki authored
      The round-robin rr_tx_counter was shared across CPUs leading to
      significant cache thrashing at high packet rates. This patch switches
      the round-robin packet counter to use a per-cpu variable to decide
      the destination slave.
      
      On a test with 2x100Gbit ICE nic with pktgen_sample_04_many_flows.sh
      (-s 64 -t 32) the tx rate was 19.6Mpps before and 22.3Mpps after
      this patch.
      
      "perf top -e cache_misses" before:
          12.31%  [bonding]       [k] bond_xmit_roundrobin_slave_get
          10.59%  [sch_fq_codel]  [k] fq_codel_dequeue
           9.34%  [kernel]        [k] skb_release_data
      after:
          15.42%  [sch_fq_codel]  [k] fq_codel_dequeue
          10.06%  [kernel]        [k] __memset
           9.12%  [kernel]        [k] skb_release_data
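The change from a shared counter to per-CPU counters can be modelled in a few lines. This is only a sketch of the selection logic, not the bonding driver code: the class and function names are hypothetical, and Python cannot reproduce the cache-line contention that motivated the patch. It only shows that each CPU keeping its own counter still spreads packets evenly over the slaves.

```python
class SharedCounter:
    """Single counter used by every CPU (models the pre-patch behaviour)."""

    def __init__(self):
        self.value = 0

    def next_slave(self, cpu, n_slaves):
        # cpu is unused: all CPUs read and write the same variable
        idx = self.value % n_slaves
        self.value += 1
        return idx


class PerCpuCounter:
    """One independent counter per CPU (models the post-patch behaviour)."""

    def __init__(self, n_cpus):
        self.values = [0] * n_cpus

    def next_slave(self, cpu, n_slaves):
        # each CPU only touches its own slot
        idx = self.values[cpu] % n_slaves
        self.values[cpu] += 1
        return idx


def spread(counter, n_cpus, n_slaves, pkts_per_cpu):
    """Count how many packets each slave receives."""
    per_slave = [0] * n_slaves
    for _ in range(pkts_per_cpu):
        for cpu in range(n_cpus):
            per_slave[counter.next_slave(cpu, n_slaves)] += 1
    return per_slave


shared = spread(SharedCounter(), n_cpus=4, n_slaves=2, pkts_per_cpu=1000)
per_cpu = spread(PerCpuCounter(4), n_cpus=4, n_slaves=2, pkts_per_cpu=1000)
print(shared, per_cpu)
```

In the model both variants balance the load; the difference in the real driver is that the per-CPU variant avoids every CPU writing to the same memory location.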
      Signed-off-by: Jussi Maki <joamaki@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlabel: Fix memory leak in netlbl_mgmt_add_common · b8f6b052
      Liu Shixin authored
      Hulk Robot reported a memory leak in netlbl_mgmt_add_common.
      The problem is that the map is not freed when netlbl_domhsh_add() fails.
      
      BUG: memory leak
      unreferenced object 0xffff888100ab7080 (size 96):
        comm "syz-executor537", pid 360, jiffies 4294862456 (age 22.678s)
        hex dump (first 32 bytes):
          05 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
        backtrace:
          [<0000000008b40026>] netlbl_mgmt_add_common.isra.0+0xb2a/0x1b40
          [<000000003be10950>] netlbl_mgmt_add+0x271/0x3c0
          [<00000000c70487ed>] genl_family_rcv_msg_doit.isra.0+0x20e/0x320
          [<000000001f2ff614>] genl_rcv_msg+0x2bf/0x4f0
          [<0000000089045792>] netlink_rcv_skb+0x134/0x3d0
          [<0000000020e96fdd>] genl_rcv+0x24/0x40
          [<0000000042810c66>] netlink_unicast+0x4a0/0x6a0
          [<000000002e1659f0>] netlink_sendmsg+0x789/0xc70
          [<000000006e43415f>] sock_sendmsg+0x139/0x170
          [<00000000680a73d7>] ____sys_sendmsg+0x658/0x7d0
          [<0000000065cbb8af>] ___sys_sendmsg+0xf8/0x170
          [<0000000019932b6c>] __sys_sendmsg+0xd3/0x190
          [<00000000643ac172>] do_syscall_64+0x37/0x90
          [<000000009b79d6dc>] entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: 63c41688 ("netlabel: Add network address selectors to the NetLabel/LSM domain mapping")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Liu Shixin <liushixin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-updates-2021-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · f0c227c7
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2021-06-14
      
      1) Trivial Lag refactoring in preparation for upcoming Single FDB lag feature
       - First 3 patches
      
      2) Scalable IRQ distribution for Sub-functions
      
      A subfunction (SF) is a lightweight function that has a parent PCI
      function (PF) on which it is deployed.
      
      Currently, mlx5 subfunctions share the IRQs (MSI-X) of their
      parent PCI function.
      
      Before this series, the PF allocated enough IRQs to cover
      all the cores in a system, and newly created SFs re-used all the IRQs
      that the PF had allocated for itself.
      Hence, the more SFs are created, the more EQs there are per IRQ. Therefore,
      whenever we handle an interrupt, we need to poll all SF EQs and PF EQs
      instead of only the PF EQs, as on a system without SFs. This has a heavy impact
      on the performance of SFs and PF.
      
      For example, on a machine with:
      Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz with 56 cores.
      PCI Express 3 with BW of 126 Gb/s.
      ConnectX-5 Ex; EDR IB (100Gb/s) and 100GbE; dual-port QSFP28; PCIe4.0 x16.
      
      test case: iperf TX BW single CPU, affinity of app and IRQ are the same.
      PF only: no SFs on the system, 56 IRQs.
      SF (before), 250 SFs sharing the same 56 IRQs.
      SF (now),    250 SFs + 255 available IRQs for the NIC. (please see IRQ spread scheme below).
      
                  application  SF-IRQ   channel   BW (Gb/sec)   interrupts/sec
                  iperf TX              affinity
      PF only     cpu={0}      cpu={0}  cpu={0}   79            8200
      SF (before) cpu={0}      cpu={0}  cpu={0}   51.3 (-35%)   9500
      SF (now)    cpu={0}      cpu={0}  cpu={0}   78 (-2%)      8200
      
      command:
      $ taskset -c 0 iperf -c 11.1.1.1 -P 3 -i 6 -t 30 | grep SUM
      
      The difference between the SF examples is that before this series we
      allocated num_cpus (56) IRQs, and all of them were shared among the PF
      and the SFs. After this series, we allocate 255 IRQs and spread
      the SFs among them. This has significantly decreased the load
      on each IRQ, and the number of EQs per IRQ is down by 95% (251->11).
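The relative changes quoted in this message can be checked with simple arithmetic. The sketch below uses only the figures stated in the text (251 and 11 EQs per IRQ, 79 and 51.3 Gb/sec); the helper name `pct_change` is made up for illustration, and the percentages in the message look rounded, so small differences are to be expected.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# EQs per IRQ: 251 before the series, 11 after it
eq_change = pct_change(251, 11)

# Single-CPU iperf TX bandwidth: PF only vs. SF before the series
bw_change_before = pct_change(79, 51.3)

print(f"EQs per IRQ: {eq_change:.1f}%")
print(f"SF (before) bandwidth vs PF only: {bw_change_before:.1f}%")
```

Both results are close to the "95%" and "-35%" figures given in the message.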
      
      In this patchset the proposed solution is to have a dedicated IRQ pool
      for SFs to use. The pool will allocate a large number of IRQs
      for SFs to grab from in order to minimize IRQ sharing between the
      different SFs.
      IRQs will not be requested from the OS until they are first requested by
      an SF consumer, and will eventually be released when the last SF consumer
      releases them.
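The lazy request and last-user release described above amounts to a reference-counted pool. The following is a minimal sketch under stated assumptions, not the mlx5 implementation: the class `IrqPool` and its methods are hypothetical, and the "pick an unused slot first, otherwise the least-shared IRQ" policy is an assumption made for illustration. The actual spread scheme is the one described in the patch referenced by this message.

```python
class IrqPool:
    """Toy model of a dedicated IRQ pool with lazy request and refcounts."""

    def __init__(self, size):
        self.size = size
        # irq index -> number of consumers; an entry exists only while the
        # IRQ is "requested from the OS" in this model
        self.refs = {}

    def get(self):
        """Hand out an IRQ, requesting a new one only if a slot is unused."""
        if len(self.refs) < self.size:
            irq = next(i for i in range(self.size) if i not in self.refs)
            self.refs[irq] = 1  # first consumer: request happens here
            return irq
        # pool exhausted: share the IRQ with the fewest consumers
        irq = min(self.refs, key=self.refs.get)
        self.refs[irq] += 1
        return irq

    def put(self, irq):
        """Drop one reference; the last consumer releases the IRQ."""
        self.refs[irq] -= 1
        if self.refs[irq] == 0:
            del self.refs[irq]


pool = IrqPool(size=3)
granted = [pool.get() for _ in range(5)]  # five consumers, three IRQs
print(granted, pool.refs)
for irq in granted:
    pool.put(irq)
print(pool.refs)
```

With three IRQs and five consumers, two IRQs end up shared by two consumers each, and the pool is empty again once every consumer has released its IRQ.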
      
      For the detailed IRQ spread and allocation scheme, please see the last patch:
      ("net/mlx5: Round-Robin EQs over IRQs")
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'occteontx2-rate-limit-offload' · 08ab4d74
      David S. Miller authored
      Subbaraya Sundeep says:
      
      ====================
      octeontx2: Add ingress ratelimit offload
      
      This patchset adds ingress rate limiting hardware
      offload support for CN10K silicon. Police actions
      are added for TC matchall and flower filters.
      CN10K has an ingress rate limiting feature where
      a receive queue is mapped to a bandwidth profile
      and the profile is configured with rate and burst
      parameters by software. CN10K hardware supports
      three levels of ingress policing or ratelimiting.
      Multiple leaf profiles can point to a single mid
      level profile and multiple mid level profiles can
      point to a single top level one. Only leaf level
      profiles are used for configuring rate limiting.
      
      Patch 1 adds the new bandwidth profile contexts
      in AF driver similar to other hardware contexts
      Patch 2 adds the debugfs changes to dump bandwidth
      profile contexts
      Patch 3 adds support for police action with TC matchall filter
      Patch 4 uses NL_SET_ERR_MSG_MOD for tc code
      Patch 5 adds support for police action with TC flower filter
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-pf: Add police action for TC flower · 68fbff68
      Subbaraya Sundeep authored
      Added police action for ingress TC flower
      hardware offload. With this, rate limiting can be
      done per flow. Since rate limiting is tied to
      RQs in hardware, the number of TC flower filters
      with a police action is limited to the number
      of receive queues of the interface. Both bps
      and pps modes are supported.
      
      Examples to rate limit a flow:
      $ ethtool -K eth0 hw-tc-offload on
      $ tc qdisc add dev eth0 ingress
      $ tc filter add dev eth0 parent ffff: protocol ip \
        flower ip_proto udp dst_port 80 action \
        police rate 100Mbit burst 32Kbit
      
      $ tc filter add dev eth0 parent ffff: \
        protocol ip flower dst_mac 5e:b2:34:ee:29:49 \
        action police pkts_rate 5000 pkts_burst 2048
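The rate and burst parameters in the commands above can be pictured with a token bucket, a common way to describe policers. The sketch below is a software model only and makes no claim about how the CN10K hardware implements its bandwidth profiles; the class `TokenBucket` and its `allow` method are hypothetical names. It uses the packet-mode figures from the second example (5000 packets/sec, burst of 2048 packets).

```python
class TokenBucket:
    """Toy policer: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate      # tokens per second (packets/s in this example)
        self.burst = burst    # bucket depth
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # time of the previous update, in seconds

    def allow(self, now, cost=1):
        """Return True if a packet costing `cost` tokens conforms at `now`."""
        elapsed = now - self.last
        self.last = now
        # refill for the elapsed time, never beyond the bucket depth
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False  # out of profile: a policer would drop this packet


policer = TokenBucket(rate=5000, burst=2048)
# 3000 packets arrive back to back at t=0: only the burst allowance passes
first = sum(policer.allow(now=0.0) for _ in range(3000))
# a quarter second later the bucket has refilled by 0.25 * 5000 tokens
second = sum(policer.allow(now=0.25) for _ in range(2000))
print(first, second)
```

The same model applies to the bps mode if tokens are counted in bits or bytes and `cost` is set to the packet length.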
      Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-pf: Use NL_SET_ERR_MSG_MOD for TC · 5d2fdd86
      Subbaraya Sundeep authored
      This patch converts netdev_err messages in
      tc code to NL_SET_ERR_MSG_MOD. NL_SET_ERR_MSG_MOD
      does not support format specifiers yet, hence only
      netdev_err messages consisting of plain strings are converted.
      Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-pf: TC_MATCHALL ingress ratelimiting offload · 2ca89a2c
      Sunil Goutham authored
      Add TC_MATCHALL ingress ratelimiting offload support with POLICE
      action for entire traffic coming into the interface.
      
      Eg: To ratelimit ingress traffic to 100Mbps
      
      $ ethtool -K eth0 hw-tc-offload on
      $ tc qdisc add dev eth0 clsact
      $ tc filter add dev eth0 ingress matchall skip_sw \
                      action police rate 100Mbit burst 32Kbit
      
      To support this, a leaf level bandwidth profile is allocated and all
      RQs' contexts used by this interface are updated to point to it.
      And the leaf level bandwidth profile is configured with user specified
      rate and burst sizes.
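The bookkeeping described in the last paragraph (allocate one leaf profile, point every RQ context at it, then configure rate and burst) can be sketched as follows. This is an illustrative model only: `LeafProfile`, `ProfileTable`, `Interface` and `install_matchall_police` are hypothetical names, not driver functions, and the units simply mirror the tc command above.

```python
from dataclasses import dataclass


@dataclass
class LeafProfile:
    rate_mbit: int
    burst_kbit: int


class ProfileTable:
    """Fixed-size table of leaf bandwidth profiles."""

    def __init__(self, size):
        self.slots = [None] * size

    def alloc(self, profile):
        # take the first free slot and return its index
        for idx, slot in enumerate(self.slots):
            if slot is None:
                self.slots[idx] = profile
                return idx
        raise RuntimeError("no free leaf bandwidth profile")

    def free(self, idx):
        self.slots[idx] = None


class Interface:
    """An interface whose RQ contexts each reference one profile index."""

    def __init__(self, n_rqs):
        self.rq_profile = [None] * n_rqs


def install_matchall_police(table, iface, rate_mbit, burst_kbit):
    """Allocate one leaf profile and point all RQs of the interface at it."""
    idx = table.alloc(LeafProfile(rate_mbit, burst_kbit))
    iface.rq_profile = [idx] * len(iface.rq_profile)
    return idx


table = ProfileTable(size=4)
eth0 = Interface(n_rqs=8)
profile_idx = install_matchall_police(table, eth0, rate_mbit=100, burst_kbit=32)
print(profile_idx, eth0.rq_profile, table.slots[profile_idx])
```

Because every RQ shares the same profile, the configured rate applies to the interface's ingress traffic as a whole rather than per queue.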
      Co-developed-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: cn10k: Debugfs support for bandwidth profiles · e7d89717
      Sunil Goutham authored
      Added support for dumping current resource status of bandwidth
      profiles and contexts of allocated profiles via debugfs.
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: cn10k: Bandwidth profiles config support · e8e095b3
      Sunil Goutham authored
      CN10K silicon supports hierarchical ingress packet ratelimiting.
      Three levels of profilers are supported: leaf, mid and top.
      Ratelimiting is done after the packet forwarding decision is taken
      and a NIXLF's RQ is identified to DMA the packet. The RQ's context
      points to a leaf bandwidth profile which can be configured
      to achieve the desired ratelimit.
      
      This patch adds logic for management of these bandwidth profiles,
      i.e. profile alloc, free, context update etc.
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'pci200syn-cleanups' · ad5645d7
      David S. Miller authored
      Peng Li says:
      
      ====================
      net: pci200syn: clean up some code style issues
      
      This patchset cleans up some code style issues.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>