Commits · 2fba753cc9b57d82800db0997b9271898323c57e · Kirill Smelkov / linux

An error occurred fetching the project authors.

09 Mar, 2023 5 commits

bnx2x: Drop redundant pci_enable_pcie_error_reporting() · 2fba753c

Bjorn Helgaas authored 2 years ago

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ariel Elior <aelior@marvell.com>
Cc: Sudarsana Kalluru <skalluru@marvell.com>
Cc: Manish Chopra <manishc@marvell.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2fba753c

bnx2: Drop redundant pci_enable_pcie_error_reporting() · 5f00358b

Bjorn Helgaas authored 2 years ago

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

cd709aa9 ("bnx2: Add PCI Advanced Error Reporting support.") added
pci_enable_pcie_error_reporting() for all devices, and c239f279 ("bnx2:
Enable AER on PCIE devices only") restricted it to BNX2_CHIP_5709 devices
to avoid an error message when it failed on non-PCIe devices.  The PCI core
only enables PCIe error reporting on PCIe devices, which I assume means
BNX2_CHIP_5709.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rasesh Mody <rmody@marvell.com>
Cc: GR-Linux-NIC-Dev@marvell.com
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5f00358b

be2net: Drop redundant pci_enable_pcie_error_reporting() · b4e24578

Bjorn Helgaas authored 2 years ago

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration,  so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

b4e24578

alx: Drop redundant pci_enable_pcie_error_reporting() · 1de2a84d

Bjorn Helgaas authored 2 years ago

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Chris Snook <chris.snook@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

1de2a84d

ravb: remove R-Car H3 ES1.* handling · 6bf0ad7f

Wolfram Sang authored 2 years ago

R-Car H3 ES1.* was only available to an internal development group and
needed a lot of quirks and workarounds. These become a maintenance
burden now, so our development group decided to remove upstream support
and disable booting for this SoC. Public users only have ES2 onwards.
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/all/20230307163041.3815-8-wsa+renesas@sang-engineering.com/Signed-off-by: Jakub Kicinski <kuba@kernel.org>

6bf0ad7f

08 Mar, 2023 24 commits

Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · ed69e066

Jakub Kicinski authored 2 years ago

Andrii Nakryiko says:

====================
pull-request: bpf-next 2023-03-08

We've added 23 non-merge commits during the last 2 day(s) which contain
a total of 28 files changed, 414 insertions(+), 104 deletions(-).

The main changes are:

1) Add more precise memory usage reporting for all BPF map types,
   from Yafang Shao.

2) Add ARM32 USDT support to libbpf, from Puranjay Mohan.

3) Fix BTF_ID_LIST size causing problems in !CONFIG_DEBUG_INFO_BTF,
   from Nathan Chancellor.

4) IMA selftests fix, from Roberto Sassu.

5) libbpf fix in APK support code, from Daniel Müller.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits)
  selftests/bpf: Fix IMA test
  libbpf: USDT arm arg parsing support
  libbpf: Refactor parse_usdt_arg() to re-use code
  libbpf: Fix theoretical u32 underflow in find_cd() function
  bpf: enforce all maps having memory usage callback
  bpf: offload map memory usage
  bpf, net: xskmap memory usage
  bpf, net: sock_map memory usage
  bpf, net: bpf_local_storage memory usage
  bpf: local_storage memory usage
  bpf: bpf_struct_ops memory usage
  bpf: queue_stack_maps memory usage
  bpf: devmap memory usage
  bpf: cpumap memory usage
  bpf: bloom_filter memory usage
  bpf: ringbuf memory usage
  bpf: reuseport_array memory usage
  bpf: stackmap memory usage
  bpf: arraymap memory usage
  bpf: hashtab memory usage
  ...
====================

Link: https://lore.kernel.org/r/20230308193533.1671597-1-andrii@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

ed69e066

selftests/bpf: Fix IMA test · 12fabae0

Roberto Sassu authored 2 years ago

Commit 62622dab ("ima: return IMA digest value only when IMA_COLLECTED
flag is set") caused bpf_ima_inode_hash() to refuse to give non-fresh
digests. IMA test #3 assumed the old behavior, that bpf_ima_inode_hash()
still returned also non-fresh digests.

Correct the test by accepting both cases. If the samples returned are 1,
assume that the commit above is applied and that the returned digest is
fresh. If the samples returned are 2, assume that the commit above is not
applied, and check both the non-fresh and fresh digest.

Fixes: 62622dab ("ima: return IMA digest value only when IMA_COLLECTED flag is set")
Reported-by: David Vernet <void@manifault.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://lore.kernel.org/bpf/20230308103713.1681200-1-roberto.sassu@huaweicloud.com

12fabae0

net: reclaim skb->scm_io_uring bit · 10369080

Eric Dumazet authored 2 years ago

Commit 0091bfc8 ("io_uring/af_unix: defer registered
files gc to io_uring release") added one bit to struct sk_buff.

This structure is critical for networking, and we try very hard
to not add bloat on it, unless absolutely required.

For instance, we can use a specific destructor as a wrapper
around unix_destruct_scm(), to identify skbs that unix_gc()
has to special case.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

10369080

Merge branch 'sparx5-tc-flower-templates' · b3f4cd07

David S. Miller authored 2 years ago

Steen Hegelund says:

====================
Add support for TC flower templates in Sparx5

This adds support for the TC template mechanism in the Sparx5 flower filter
implementation.

Templates are as such handled by the TC framework, but when a template is
created (using a chain id) there are by definition no filters on this
chain (an error will be returned if there are any).

If the templates chain id is one that is represented by a VCAP lookup, then
when the template is created, we know that it is safe to use the keys
provided in the template to change the keyset configuration for the (port,
lookup) combination, if this is needed to improve the match on the
template.

The original port keyset configuration is captured in the template state
information which is kept per port, so that when the template is deleted
the port keyset configuration can be restored to its previous setting.

The template also provides the protocol parameter which is the basic
information that is used to find out which port keyset configuration needs
to be changed.

The VCAPs and lookups are slightly different when it comes to which keys,
keysets and protocol are supported and used for selection, so in some
cases a bit of tweaking is needed to find a useful match.  This is done by
e.g. removing a key that prevents the best matching keyset from being
selected.

The debugfs output that is provided for a port allows inspection of the
currently used keyset in each of the VCAPs lookups.  So when a template has
been created the debugfs output allows you to verify if the keyset
configuration has been changed successfully.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b3f4cd07

net: microchip: sparx5: Add TC template support · e1d597ec

Steen Hegelund authored 2 years ago

This adds support for using the "template add" and "template destroy"
functionality to change the port keyset configuration.

If the VCAP lookup already contains rules, the port keyset is left
unchanged, as a change would make these rules unusable.

When the template is destroyed the port keyset configuration is restored.
The filters using the template chain will automatically be deleted by the
TC framework.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1d597ec

net: microchip: sparx5: Add port keyset changing functionality · d9f175b0

Steen Hegelund authored 2 years ago

With this its is now possible for clients (like TC) to change the port
keyset configuration in the Sparx5 VCAPs.

This is typically done per traffic class which is guided with the L3
protocol information.
Before the change the current keyset configuration is collected in a list
that is handed back to the client.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d9f175b0

net: microchip: sparx5: Add TC template list to a port · 1c14432d

Steen Hegelund authored 2 years ago

This adds a list that is used to collect the templates that are active on a
port.

This allows the template creation to change the port configuration
and the template destruction to change it back.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1c14432d

net: microchip: sparx5: Provide rule count, key removal and keyset select · bfcb94aa

Steen Hegelund authored 2 years ago

This provides these 3 functions in the VCAP API:

- Count the number of rules in a VCAP lookup (chain)
- Remove a key from a VCAP rule
- Find the keyset that gives the smallest rule list from a list of keysets
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bfcb94aa

net: microchip: sparx5: Correct the spelling of the keysets in debugfs · fbd3dce9

Steen Hegelund authored 2 years ago

Correct the name used in the debugfs output.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fbd3dce9

dt-bindings: net: dsa: mediatek,mt7530: change some descriptions to literal · 7d8c4891

Arınç ÜNAL authored 2 years ago

The line endings must be preserved on gpio-controller, io-supply, and
reset-gpios properties to look proper when the YAML file is parsed.

Currently it's interpreted as a single line when parsed. Change the style
of the description of these properties to literal style to preserve the
line endings.
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

7d8c4891

emulex/benet: clean up some inconsistent indenting · ecf729f9

Jiapeng Chong authored 2 years ago

No functional modification involved.

drivers/net/ethernet/emulex/benet/be_cmds.c:1120 be_cmd_pmac_add() warn: inconsistent indenting.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4396Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ecf729f9

net/mlx4_en: Replace fake flex-array with flexible-array member · 966b6b80

Gustavo A. R. Silva authored 2 years ago

Zero-length arrays as fake flexible arrays are deprecated and we are
moving towards adopting C99 flexible-array members instead.

Transform zero-length array into flexible-array member in struct
mlx4_en_rx_desc.

Address the following warnings found with GCC-13 and
-fstrict-flex-arrays=3 enabled:
drivers/net/ethernet/mellanox/mlx4/en_rx.c:88:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:149:30: warning: array subscript 0 is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:127:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:128:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:129:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:117:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:119:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

Link: https://github.com/KSPP/linux/issues/21
Link: https://github.com/KSPP/linux/issues/264
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

966b6b80

Merge branch 'r8169-disable-ASPM-during-NAPI-poll' · db067ef3

David S. Miller authored 2 years ago

Heiner Kallweit says:

====================
r8169: disable ASPM during NAPI poll

This is a rework of ideas from Kai-Heng on how to avoid the known
ASPM issues whilst still allowing for a maximum of ASPM-related power
savings. As a prerequisite some locking is added first.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

db067ef3

r8169: remove ASPM restrictions now that ASPM is disabled during NAPI poll · 2ab19de6

Heiner Kallweit authored 2 years ago

Now that  ASPM is disabled during NAPI poll, we can remove all ASPM
restrictions. This allows for higher power savings if the network
isn't fully loaded.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2ab19de6

r8169: disable ASPM during NAPI poll · e1ed3e4d

Heiner Kallweit authored 2 years ago

Several chip versions have problems with ASPM, what may result in
rx_missed errors or tx timeouts. The root cause isn't known but
experience shows that disabling ASPM during NAPI poll can avoid
these problems.
Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1ed3e4d

r8169: prepare rtl_hw_aspm_clkreq_enable for usage in atomic context · 49ef7d84

Heiner Kallweit authored 2 years ago

Bail out if the function is used with chip versions that don't support
ASPM configuration. In addition remove the delay, it tuned out that
it's not needed, also vendor driver r8125 doesn't have it.
Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

49ef7d84

r8169: enable cfg9346 config register access in atomic context · 59ee97c0

Heiner Kallweit authored 2 years ago

For disabling ASPM during NAPI poll we'll have to unlock access
to the config registers in atomic context. Other code parts
running with config register access unlocked are partially
longer and can sleep. Add a usage counter to enable parallel
execution of code parts requiring unlocked config registers.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

59ee97c0

r8169: use spinlock to protect access to registers Config2 and Config5 · 6bc6c4e6

Heiner Kallweit authored 2 years ago

For disabling ASPM during NAPI poll we'll have to access both registers
in atomic context. Use a spinlock to protect access.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6bc6c4e6

r8169: use spinlock to protect mac ocp register access · 91c86435

Heiner Kallweit authored 2 years ago

For disabling ASPM during NAPI poll we'll have to access mac ocp
registers in atomic context. This could result in races because
a mac ocp read consists of a write to register OCPDR, followed
by a read from the same register. Therefore add a spinlock to
protect access to mac ocp registers.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

91c86435

net-timestamp: extend SOF_TIMESTAMPING_OPT_ID to HW timestamps · 8ca5a579

Vadim Fedorenko authored 2 years ago

When the feature was added it was enabled for SW timestamps only but
with current hardware the same out-of-order timestamps can be seen.
Let's expand the area for the feature to all types of timestamps.
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8ca5a579

netxen_nic: Replace fake flex-array with flexible-array member · 25493479

Gustavo A. R. Silva authored 2 years ago

Zero-length arrays as fake flexible arrays are deprecated and we are
moving towards adopting C99 flexible-array members instead.

Transform zero-length array into flexible-array member in struct
nx_cardrsp_rx_ctx_t.

Address the following warnings found with GCC-13 and
-fstrict-flex-arrays=3 enabled:
drivers/net/ethernet/qlogic/netxen/netxen_nic_ctx.c:361:26: warning: array subscript <unknown> is outside array bounds of ‘char[0]’ [-Warray-bounds=]
drivers/net/ethernet/qlogic/netxen/netxen_nic_ctx.c:372:25: warning: array subscript <unknown> is outside array bounds of ‘char[0]’ [-Warray-bounds=]

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

Link: https://github.com/KSPP/linux/issues/21
Link: https://github.com/KSPP/linux/issues/265
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/ZAZ57I6WdQEwWh7v@workSigned-off-by: Jakub Kicinski <kuba@kernel.org>

25493479

net: phy: smsc: simplify lan95xx_config_aneg_ext · 4310e2f4

Heiner Kallweit authored 2 years ago

lan95xx_config_aneg_ext() can be simplified by using phy_set_bits().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/3da785c7-3ef8-b5d3-89a0-340f550be3c2@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

4310e2f4

net: remove enum skb_free_reason · 40bbae58

Eric Dumazet authored 2 years ago

enum skb_drop_reason is more generic, we can adopt it instead.

Provide dev_kfree_skb_irq_reason() and dev_kfree_skb_any_reason().

This means drivers can use more precise drop reasons if they want to.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Link: https://lore.kernel.org/r/20230306204313.10492-1-edumazet@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

40bbae58

net: phy: improve phy_read_poll_timeout · 0194b645

Heiner Kallweit authored 2 years ago

cond sometimes is (val & MASK) what may result in a false positive
if val is a negative errno. We shouldn't evaluate cond if val < 0.
This has no functional impact here, but it's not nice.
Therefore switch order of the checks.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/6d8274ac-4344-23b4-d9a3-cad4c39517d4@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

0194b645

07 Mar, 2023 11 commits

Merge branch 'libbpf: usdt arm arg parsing support' · d1d51a62

Andrii Nakryiko authored 2 years ago

Puranjay Mohan says:

====================

This series add the support of the ARM architecture to libbpf USDT. This
involves implementing the parse_usdt_arg() function for ARM.

It was seen that the last part of parse_usdt_arg() is repeated for all architectures,
so, the first patch in this series refactors these functions and moved the post
processing to parse_usdt_spec()

Changes in V2[1] to V3:

- Use a tabular approach to find register offsets.
- Add the patch for refactoring parse_usdt_arg()
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

d1d51a62

libbpf: USDT arm arg parsing support · 720d93b6

Puranjay Mohan authored 2 years ago

Parsing of USDT arguments is architecture-specific; on arm it is
relatively easy since registers used are r[0-10], fp, ip, sp, lr,
pc. Format is slightly different compared to aarch64; forms are

- "size @ [ reg, #offset ]" for dereferences, for example
  "-8 @ [ sp, #76 ]" ; " -4 @ [ sp ]"
- "size @ reg" for register values; for example
  "-4@r0"
- "size @ #value" for raw values; for example
  "-8@#1"

Add support for parsing USDT arguments for ARM architecture.

To test the above changes QEMU's virt[1] board with cortex-a15
CPU was used. libbpf-bootstrap's usdt example[2] was modified to attach
to a test program with DTRACE_PROBE1/2/3/4... probes to test different
combinations.

[1] https://www.qemu.org/docs/master/system/arm/virt.html
[2] https://github.com/libbpf/libbpf-bootstrap/blob/master/examples/c/usdt.bpf.cSigned-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307120440.25941-3-puranjay12@gmail.com

720d93b6

libbpf: Refactor parse_usdt_arg() to re-use code · 98e678e9

Puranjay Mohan authored 2 years ago

The parse_usdt_arg() function is defined differently for each
architecture but the last part of the function is repeated
verbatim for each architecture.

Refactor parse_usdt_arg() to fill the arg_sz and then do the repeated
post-processing in parse_usdt_spec().
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307120440.25941-2-puranjay12@gmail.com

98e678e9

libbpf: Fix theoretical u32 underflow in find_cd() function · 3ecde218

Daniel Müller authored 2 years ago

Coverity reported a potential underflow of the offset variable used in
the find_cd() function. Switch to using a signed 64 bit integer for the
representation of offset to make sure we can never underflow.

Fixes: 1eebcb60 ("libbpf: Implement basic zip archive parsing support")
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307215504.837321-1-deso@posteo.net

3ecde218

Merge branch 'bpf: bpf memory usage' · a73dc912

Alexei Starovoitov authored 2 years ago

Yafang Shao says:

====================

Currently we can't get bpf memory usage reliably either from memcg or
from bpftool.

In memcg, there's not a 'bpf' item in memory.stat, but only 'kernel',
'sock', 'vmalloc' and 'percpu' which may related to bpf memory. With
these items we still can't get the bpf memory usage, because bpf memory
usage may far less than the kmem in a memcg, for example, the dentry may
consume lots of kmem.

bpftool now shows the bpf memory footprint, which is difference with bpf
memory usage. The difference can be quite great in some cases, for example,

- non-preallocated bpf map
  The non-preallocated bpf map memory usage is dynamically changed. The
  allocated elements count can be from 0 to the max entries. But the
  memory footprint in bpftool only shows a fixed number.

- bpf metadata consumes more memory than bpf element
  In some corner cases, the bpf metadata can consumes a lot more memory
  than bpf element consumes. For example, it can happen when the element
  size is quite small.

- some maps don't have key, value or max_entries
  For example the key_size and value_size of ringbuf is 0, so its
  memlock is always 0.

We need a way to show the bpf memory usage especially there will be more
and more bpf programs running on the production environment and thus the
bpf memory usage is not trivial.

This patchset introduces a new map ops ->map_mem_usage to calculate the
memory usage. Note that we don't intend to make the memory usage 100%
accurate, while our goal is to make sure there is only a small difference
between what bpftool reports and the real memory. That small difference
can be ignored compared to the total usage.  That is enough to monitor
the bpf memory usage. For example, the user can rely on this value to
monitor the trend of bpf memory usage, compare the difference in bpf
memory usage between different bpf program versions, figure out which
maps consume large memory, and etc.

This patchset implements the bpf memory usage for all maps, and yet there's
still work to do. We don't want to introduce runtime overhead in the
element update and delete path, but we have to do it for some
non-preallocated maps,
- devmap, xskmap
  When we update or delete an element, it will allocate or free memory.
  In order to track this dynamic memory, we have to track the count in
  element update and delete path.

- cpumap
  The element size of each cpumap element is not determinated. If we
  want to track the usage, we have to count the size of all elements in
  the element update and delete path. So I just put it aside currently.

- local_storage, bpf_local_storage
  When we attach or detach a cgroup, it will allocate or free memory. If
  we want to track the dynamic memory, we also need to do something in
  the update and delete path. So I just put it aside currently.

- offload map
  The element update and delete of offload map is via the netdev dev_ops,
  in which it may dynamically allocate or free memory, but this dynamic
  memory isn't counted in offload map memory usage currently.

The result of each map can be found in the individual patch.

We may also need to track per-container bpf memory usage, that will be
addressed by a different patchset.

Changes:
v3->v4: code improvement on ringbuf (Andrii)
        use READ_ONCE() to read lpm_trie (Tao)
        explain why we can't get bpf memory usage from memcg.
v2->v3: check callback at map creation time and avoid warning (Alexei)
        fix build error under CONFIG_BPF=n (lkp@intel.com)
v1->v2: calculate the memory usage within bpf (Alexei)
- [v1] bpf, mm: bpf memory usage
  https://lwn.net/Articles/921991/
- [RFC PATCH v2] mm, bpf: Add BPF into /proc/meminfo
  https://lwn.net/Articles/919848/
- [RFC PATCH v1] mm, bpf: Add BPF into /proc/meminfo
  https://lwn.net/Articles/917647/
- [RFC PATCH] bpf, mm: Add a new item bpf into memory.stat
  https://lore.kernel.org/bpf/20220921170002.29557-1-laoar.shao@gmail].com/
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

a73dc912

bpf: enforce all maps having memory usage callback · 6b4a6ea2

Yafang Shao authored 2 years ago

We have implemented memory usage callback for all maps, and we enforce
any newly added map having a callback as well. We check this callback at
map creation time. If it doesn't have the callback, we will return
EINVAL.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-19-laoar.shao@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

6b4a6ea2

bpf: offload map memory usage · 9629363c

Yafang Shao authored 2 years ago

A new helper is introduced to calculate offload map memory usage. But
currently the memory dynamically allocated in netdev dev_ops, like
nsim_map_update_elem, is not counted. Let's just put it aside now.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-18-laoar.shao@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

9629363c

bpf, net: xskmap memory usage · b4fd0d67

Yafang Shao authored 2 years ago

A new helper is introduced to calculate xskmap memory usage.

The xfsmap memory usage can be dynamically changed when we add or remove
a xsk_map_node. Hence we need to track the count of xsk_map_node to get
its memory usage.

The result as follows,
- before
10: xskmap  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524288B

- after
10: xskmap  name count_map  flags 0x0 <<< no elements case
        key 4B  value 4B  max_entries 65536  memlock 524608B
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-17-laoar.shao@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

b4fd0d67

bpf, net: sock_map memory usage · 73d2c619

Yafang Shao authored 2 years ago

sockmap and sockhash don't have something in common in allocation, so let's
introduce different helpers to calculate their memory usage.

The reuslt as follows,

- before
28: sockmap  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524288B
29: sockhash  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524288B

- after
28: sockmap  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524608B
29: sockhash  name count_map  flags 0x0  <<<< no updated elements
        key 4B  value 4B  max_entries 65536  memlock 1048896B
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-16-laoar.shao@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

73d2c619

bpf, net: bpf_local_storage memory usage · 7490b7f1

Yafang Shao authored 2 years ago

A new helper is introduced into bpf_local_storage map to calculate the
memory usage. This helper is also used by other maps like
bpf_cgrp_storage, bpf_inode_storage, bpf_task_storage and etc.

Note that currently the dynamically allocated storage elements are not
counted in the usage, since it will take extra runtime overhead in the
elements update or delete path. So let's put it aside now, and implement
it in the future when someone really need it.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-15-laoar.shao@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

7490b7f1

bpf: local_storage memory usage · 2f536977

Yafang Shao authored 2 years ago

A new helper is introduced to calculate local_storage map memory usage.
Currently the dynamically allocated elements are not counted, since it
will take runtime overhead in the element update or delete path. So
let's put it aside currently, and implement it in the future if the user
really needs it.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-14-laoar.shao@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

2f536977