Commits · 6c3c267e5fbc637c8fc4b1d075e3c43e328550a6 · Kirill Smelkov / linux

12 Jul, 2022 1 commit

Documentation/process: Add embargoed HW contact for LLVM · 6c3c267e

Nick Desaulniers authored Jul 11, 2022

Should the need for toolchain mitigations ever be necessary, add a group
for toolchain ambassadors.

Add Nick Desaulniers as LLVM's ambassador for the embargoed hardware
issues process.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20220711181101.1559558-1-ndesaulniers@google.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6c3c267e

08 Jul, 2022 1 commit

Merge tag 'arch-cache-topo-5.20' of... · 2c8f7ef4

Greg Kroah-Hartman authored Jul 08, 2022

Merge tag 'arch-cache-topo-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into driver-core-next

Sudeep writes:

cacheinfo and arch_topology updates for v5.20

These are updates to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node and the divergence from the
behaviour on a ACPI enabled platform. The expectation is that both DT
and ACPI enabled systems must present consistent view of the CPU topology.

The current assignment of generated cluster count as the physical package
identifier for each CPU is wrong. The device tree bindings for CPU
topology supports sockets to infer the socket or physical package
identifier for a given CPU. It is now being made use of you address the
issue. These updates also assigns the cluster identifier as parsed from
the device tree cluster nodes within /cpu-map without support for
nesting of the clusters as there are no such reported/known platforms.

In order to be on par with ACPI PPTT physical package/socket support,
these updates also include support for socket nodes in /cpu-map.

The only exception is that the last level cache id information can be
inferred from the same ACPI PPTT while we need to parse CPU cache nodes
in the device tree. The cacheinfo changes here is to enable the re-use
of the cacheinfo to detect the cache attributes for all the CPU quite
early even before the scondardaries are booted so that the information
can be used to build the schedular domains especially the last level
cache(LLC).

* tag 'arch-cache-topo-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: (21 commits)
ACPI: Remove the unused find_acpi_cpu_cache_topology()
arch_topology: Warn that topology for nested clusters is not supported
arch_topology: Add support for parsing sockets in /cpu-map
arch_topology: Set cluster identifier in each core/thread from /cpu-map
arch_topology: Limit span of cpu_clustergroup_mask()
arch_topology: Don't set cluster identifier as physical package identifier
arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
arch_topology: Check for non-negative value rather than -1 for IDs validity
arch_topology: Set thread sibling cpumask only within the cluster
arch_topology: Drop LLC identifier stash from the CPU topology
arm64: topology: Remove redundant setting of llc_id in CPU topology
arch_topology: Use the last level cache information from the cacheinfo
arch_topology: Add support to parse and detect cache attributes
cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability
cacheinfo: Use cache identifiers to check if the caches are shared if available
cacheinfo: Allow early detection and population of cache attributes
cacheinfo: Add support to check if last level cache(LLC) is valid or shared
cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
cacheinfo: Add helper to access any cache index for a given CPU
cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
...

2c8f7ef4

06 Jul, 2022 1 commit

Revert "kernfs: Change kernfs_notify_list to llist." · 2fd26970

Imran Khan authored Jul 06, 2022

This reverts commit b8f35fa1.

This is causing regression due to same kernfs_node getting
added multiple times in kernfs_notify_list so revert it until
safe way of using llist in this context is found.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Michael Walle <michael@walle.cc>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220705201026.2487665-1-imran.f.khan@oracle.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2fd26970

04 Jul, 2022 21 commits

ACPI: Remove the unused find_acpi_cpu_cache_topology() · 7128af87

Sudeep Holla authored Jul 04, 2022

The sole user of this find_acpi_cpu_cache_topology() was arm64 topology
which is now consolidated into the generic arch_topology without the need
of this function.

Drop the unused function find_acpi_cpu_cache_topology().

Link: https://lore.kernel.org/r/20220704101605.1318280-22-sudeep.holla@arm.com
Cc: Rafael J. Wysocki <rafael@kernel.org>
Reported-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

7128af87

arch_topology: Warn that topology for nested clusters is not supported · 00e66e37

Sudeep Holla authored Jul 04, 2022

We don't support the topology for clusters of CPU clusters while the
DT and ACPI bindings theoritcally support the same. Just warn about the
same so that it is clear to the users of arch_topology that the nested
clusters are not yet supported.

Link: https://lore.kernel.org/r/20220704101605.1318280-21-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

00e66e37

arch_topology: Add support for parsing sockets in /cpu-map · dea8c0b4

Sudeep Holla authored Jul 04, 2022

Finally let us add support for socket nodes in /cpu-map in the device
tree. Since this may not be present in all the old platforms and even
most of the existing platforms, we need to assume absence of the socket
node indicates that it is a single socket system and handle appropriately.

Also it is likely that most single socket systems skip to as the node
since it is optional.

Link: https://lore.kernel.org/r/20220704101605.1318280-20-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

dea8c0b4

arch_topology: Set cluster identifier in each core/thread from /cpu-map · 556c9678

Sudeep Holla authored Jul 04, 2022

Let us set the cluster identifier as parsed from the device tree
cluster nodes within /cpu-map.

We don't support nesting of clusters yet as there are no real hardware
to support clusters of clusters.

Link: https://lore.kernel.org/r/20220704101605.1318280-19-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

556c9678

arch_topology: Limit span of cpu_clustergroup_mask() · bfcc4397

Ionela Voinescu authored Jul 04, 2022

Currently the cluster identifier is not set on DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly, the cluster_sibling mask will be
populated and returned by cpu_clustergroup_mask() to contribute in the
creation of the CLS scheduling domain level, if SCHED_CLUSTER is
enabled.

To avoid topologies that will result in questionable or incorrect
scheduling domains, impose restrictions regarding the span of clusters,
as presented to scheduling domains building code: cluster_sibling should
not span more or the same CPUs as cpu_coregroup_mask().

This is needed in order to obtain a strict separation between the MC and
CLS levels, and maintain the same domains for existing platforms in
the presence of CONFIG_SCHED_CLUSTER, where the new cluster information
is redundant and irrelevant for the scheduler.

While previously the scheduling domain builder code would have removed MC
as redundant and kept CLS if SCHED_CLUSTER was enabled and the
cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
now CLS will be removed and MC kept.

Link: https://lore.kernel.org/r/20220704101605.1318280-18-sudeep.holla@arm.com
Cc: Darren Hart <darren@os.amperecomputing.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

bfcc4397

arch_topology: Don't set cluster identifier as physical package identifier · 26a2b73a

Sudeep Holla authored Jul 04, 2022

Currently as we parse the CPU topology from /cpu-map node from the
device tree, we assign generated cluster count as the physical package
identifier for each CPU which is wrong.

The device tree bindings for CPU topology supports sockets to infer
the socket or physical package identifier for a given CPU. Since it is
fairly new and not supported on most of the old and existing systems, we
can assume all such systems have single socket/physical package.

Fix the physical package identifier to 0 by removing the assignment of
cluster identifier to the same.

Link: https://lore.kernel.org/r/20220704101605.1318280-17-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

26a2b73a

arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found · 5a01bb8e

Sudeep Holla authored Jul 04, 2022

There is no point in looping through all the CPU's physical package
identifier to check if it is valid or not once a CPU which is outside
the topology(i.e. outlier CPU) is found.

Let us just break out of the loop early in such case.

Link: https://lore.kernel.org/r/20220704101605.1318280-16-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

5a01bb8e

arch_topology: Check for non-negative value rather than -1 for IDs validity · 9eb5e54f

Sudeep Holla authored Jul 04, 2022

Instead of just comparing the cpu topology IDs with -1 to check their
validity, improve that by checking for a valid non-negative value.

Link: https://lore.kernel.org/r/20220704101605.1318280-15-sudeep.holla@arm.comSuggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

9eb5e54f

arch_topology: Set thread sibling cpumask only within the cluster · 3f828329

Sudeep Holla authored Jul 04, 2022

Currently the cluster identifier is not set on the DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly that may result in getting the thread
siblings wrong as the core identifiers can be same for 2 different CPUs
belonging to 2 different cluster.

So, in order to get the thread sibling cpumasks correct, we need to
update them only if the cores they belong are in the same cluster within
the socket. Let us skip updation of the thread sibling cpumaks if the
cluster identifier doesn't match.

This change won't affect even if the cluster identifiers are not set
currently but will avoid any breakage once we set the same correctly.

Link: https://lore.kernel.org/r/20220704101605.1318280-14-sudeep.holla@arm.comTested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

3f828329

arch_topology: Drop LLC identifier stash from the CPU topology · 5b8dc787

Sudeep Holla authored Jul 04, 2022

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and store the LLC ID information only for
ACPI systems in the CPU topology.

Remove the redundant LLC ID from the generic CPU arch_topology
information.

Link: https://lore.kernel.org/r/20220704101605.1318280-13-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

5b8dc787

arm64: topology: Remove redundant setting of llc_id in CPU topology · 798eb5b4

Sudeep Holla authored Jul 04, 2022

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and fetch the LLC ID information only for
ACPI systems.

Just drop the redundant parsing and setting of llc_id in CPU topology
from ACPI PPTT.

Link: https://lore.kernel.org/r/20220704101605.1318280-12-sudeep.holla@arm.com
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

798eb5b4

arch_topology: Use the last level cache information from the cacheinfo · f027db2f

Sudeep Holla authored Jul 04, 2022

The cacheinfo is now initialised early along with the CPU topology
initialisation. Instead of relying on the LLC ID information parsed
separately only with ACPI PPTT elsewhere, migrate to use the similar
information from the cacheinfo.

This is generic for both DT and ACPI systems. The ACPI LLC ID information
parsed separately can now be removed from arch specific code.

Link: https://lore.kernel.org/r/20220704101605.1318280-11-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

f027db2f

arch_topology: Add support to parse and detect cache attributes · 38db9b95

Sudeep Holla authored Jul 04, 2022

Currently ACPI populates just the minimum information about the last
level cache from PPTT in order to feed the same to build sched_domains.
Similar support for DT platforms is not present.

In order to enable the same, the entire cache hierarchy information can
be built as part of CPU topoplogy parsing both on ACPI and DT platforms.

Note that this change builds the cacheinfo early even on ACPI systems,
but the current mechanism of building llc_sibling mask remains unchanged.

Link: https://lore.kernel.org/r/20220704101605.1318280-10-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

38db9b95

cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability · 52110313

Sudeep Holla authored Jul 04, 2022

The checks to skip the CPU itself or no cacheinfo case are implemented
bit differently though the effect is exactly same. Just align the
implementation in both cache_shared_cpu_map_{setup,remove} just for
improved readability. No functional change.

Link: https://lore.kernel.org/r/20220704101605.1318280-9-sudeep.holla@arm.comTested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

52110313

cacheinfo: Use cache identifiers to check if the caches are shared if available · f16d1bec

Sudeep Holla authored Jul 04, 2022

The cache identifiers is an optional property on most of the platforms.
The presence of one must be indicated by the CACHE_ID valid bit in the
attributes.

We can use the cache identifiers provided by the firmware to check if
any two cpus share the same cache instead of relying on the fw_token
generated and set in the OS.

Link: https://lore.kernel.org/r/20220704101605.1318280-8-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

f16d1bec

cacheinfo: Allow early detection and population of cache attributes · 36bbc5b4

Sudeep Holla authored Jul 04, 2022

Some architecture/platforms may need to setup cache properties very
early in the boot along with other cpu topologies so that all these
information can be used to build sched_domains which is used by the
scheduler.

Allow detect_cache_attributes to be called quite early during the boot.

Link: https://lore.kernel.org/r/20220704101605.1318280-7-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

36bbc5b4

cacheinfo: Add support to check if last level cache(LLC) is valid or shared · cc1cfc47

Sudeep Holla authored Jul 04, 2022

It is useful to have helper to check if the given two CPUs share last
level cache. We can do that check by comparing fw_token or by comparing
the cache ID. Currently we check just for fw_token as the cache ID is
optional.

This helper can be used to build the llc_sibling during arch specific
topology parsing and feeding information to the sched_domains. This also
helps to get rid of llc_id in the CPU topology as it is sort of duplicate
information.

Also add helper to check if the llc information in cacheinfo is valid
or not.

Link: https://lore.kernel.org/r/20220704101605.1318280-6-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

cc1cfc47

cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF · 9447eb0f

Sudeep Holla authored Jul 04, 2022

cache_leaves_are_shared is already used even with ACPI and PPTT. It
checks if the cache leaves are the shared based on fw_token pointer.
However it is defined conditionally only if CONFIG_OF is enabled which
is wrong.

Move the function cache_leaves_are_shared out of CONFIG_OF and keep it
generic. It also handles the case where both OF and ACPI is not defined.

Link: https://lore.kernel.org/r/20220704101605.1318280-5-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

9447eb0f

cacheinfo: Add helper to access any cache index for a given CPU · b14e8d21

Sudeep Holla authored Jul 04, 2022

The cacheinfo for a given CPU at a given index is used at quite a few
places by fetching the base point for index 0 using the helper
per_cpu_cacheinfo(cpu) and offsetting it by the required index.

Instead, add another helper to fetch the required pointer directly and
use it to simplify and improve readability.

Link: https://lore.kernel.org/r/20220704101605.1318280-4-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

b14e8d21

cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node · d4ec840b

Sudeep Holla authored Jul 04, 2022

The of_cpu_device_node_get takes care of fetching the CPU'd device node
either from cached cpu_dev->of_node if cpu_dev is initialised or uses
of_get_cpu_node to parse and fetch node if cpu_dev isn't available yet.

Just use of_cpu_device_node_get instead of getting the cpu device first
and then using cpu_dev->of_node for two reasons:
1. There is no other use of cpu_dev and can be simplified
2. It enabled the use detect_cache_attributes and hence cache_setup_of_node
much earlier before the CPUs are registered as devices.

Link: https://lore.kernel.org/r/20220704101605.1318280-3-sudeep.holla@arm.comTested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

d4ec840b

ACPI: PPTT: Use table offset as fw_token instead of virtual address · 0d4c331a

Sudeep Holla authored Jul 04, 2022

There is need to use the cache sharing information quite early during
the boot before the secondary cores are up and running. The permanent
memory map for all the ACPI tables(via acpi_permanent_mmap) is turned
on in acpi_early_init() which is quite late for the above requirement.

As a result there is possibility that the ACPI PPTT gets mapped to
different virtual addresses. In such scenarios, using virtual address as
fw_token before the acpi_permanent_mmap is enabled results in different
fw_token for the same cache entity and hence wrong cache sharing
information will be deduced based on the same.

Instead of using virtual address, just use the table offset as the
unique firmware token for the caches. The same offset is used as
ACPI identifiers if the firmware has not set a valid one for other
entries in the ACPI PPTT.

Link: https://lore.kernel.org/r/20220704101605.1318280-2-sudeep.holla@arm.com
Cc: linux-acpi@vger.kernel.org
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

0d4c331a

01 Jul, 2022 2 commits

kernfs: fix potential NULL dereference in __kernfs_remove · 72b5d5ae

Yushan Zhou authored Jun 30, 2022

When lockdep is enabled, lockdep_assert_held_write would
cause potential NULL pointer dereference.

Fix the following smatch warnings:

fs/kernfs/dir.c:1353 __kernfs_remove() warn: variable dereferenced before check 'kn' (see line 1346)

Fixes: 393c3714 ("kernfs: switch global kernfs_rwsem lock to per-fs lock")
Signed-off-by: Yushan Zhou <katrinzhou@tencent.com>
Link: https://lore.kernel.org/r/20220630082512.3482581-1-zys.zljxml@gmail.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

72b5d5ae

firmware: Hold a reference for of_find_compatible_node() · c882716b

Liang He authored Jun 28, 2022

In of_register_trusted_foundations(), we need to hold the reference
returned by of_find_compatible_node() and then use it to call
of_node_put() for refcount balance.
Signed-off-by: Liang He <windhl@126.com>
Link: https://lore.kernel.org/r/20220628021640.4015-1-windhl@126.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c882716b

27 Jun, 2022 12 commits

of: base: Avoid console probe delay when fw_devlink.strict=1 · a244ec36

Saravana Kannan authored Jun 23, 2022

Commit 71066545 ("driver core: Set fw_devlink.strict=1 by default")
enabled iommus and dmas dependency enforcement by default. On some
systems, this caused the console device's probe to get delayed until the
deferred_probe_timeout expires.

We need consoles to work as soon as possible, so mark the console device
node with FWNODE_FLAG_BEST_EFFORT so that fw_delink knows not to delay
the probe of the console device for suppliers without drivers. The
driver can then make the decision on where it can probe without those
suppliers or defer its probe.

Fixes: 71066545 ("driver core: Set fw_devlink.strict=1 by default")
Reported-by: Sascha Hauer <sha@pengutronix.de>
Reported-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220623080344.783549-3-saravanak@google.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a244ec36

driver core: fw_devlink: Allow firmware to mark devices as best effort · 8f486cab

Saravana Kannan authored Jun 23, 2022

When firmware sets the FWNODE_FLAG_BEST_EFFORT flag for a fwnode,
fw_devlink will do a best effort ordering for that device where it'll
only enforce the probe/suspend/resume ordering of that device with
suppliers that have drivers. The driver of that device can then decide
if it wants to defer probe or probe without the suppliers.

This will be useful for avoid probe delays of the console device that
were caused by commit 71066545 ("driver core: Set
fw_devlink.strict=1 by default").

Fixes: 71066545 ("driver core: Set fw_devlink.strict=1 by default")
Reported-by: Sascha Hauer <sha@pengutronix.de>
Reported-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220623080344.783549-2-saravanak@google.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

8f486cab

kernfs: Replace global kernfs_open_file_mutex with hashed mutexes. · 1d25b84e

Imran Khan authored Jun 15, 2022

In current kernfs design a single mutex, kernfs_open_file_mutex, protects
the list of kernfs_open_file instances corresponding to a sysfs attribute.
So even if different tasks are opening or closing different sysfs files
they can contend on osq_lock of this mutex. The contention is more apparent
in large scale systems with few hundred CPUs where most of the CPUs have
running tasks that are opening, accessing or closing sysfs files at any
point of time.

Using hashed mutexes in place of a single global mutex, can significantly
reduce contention around global mutex and hence can provide better
scalability. Moreover as these hashed mutexes are not part of kernfs_node
objects we will not see any singnificant change in memory utilization of
kernfs based file systems like sysfs, cgroupfs etc.

Modify interface introduced in previous patch to make use of hashed
mutexes. Use kernfs_node address as hashing key.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Link: https://lore.kernel.org/r/20220615021059.862643-5-imran.f.khan@oracle.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

1d25b84e

kernfs: Introduce interface to access global kernfs_open_file_mutex. · 41448c61

Imran Khan authored Jun 15, 2022

This allows to change underlying mutex locking, without needing to change
the users of the lock. For example next patch modifies this interface to
use hashed mutexes in place of a single global kernfs_open_file_mutex.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Link: https://lore.kernel.org/r/20220615021059.862643-4-imran.f.khan@oracle.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

41448c61

kernfs: Change kernfs_notify_list to llist. · b8f35fa1

Imran Khan authored Jun 15, 2022

At present kernfs_notify_list is implemented as a singly linked
list of kernfs_node(s), where last element points to itself and
value of ->attr.next tells if node is present on the list or not.
Both addition and deletion to list happen under kernfs_notify_lock.

Change kernfs_notify_list to llist so that addition to list can heppen
locklessly.

Suggested by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Link: https://lore.kernel.org/r/20220615021059.862643-3-imran.f.khan@oracle.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b8f35fa1

kernfs: make ->attr.open RCU protected. · 086c00c7

Imran Khan authored Jun 15, 2022

After removal of kernfs_open_node->refcnt in the previous patch,
kernfs_open_node_lock can be removed as well by making ->attr.open
RCU protected. kernfs_put_open_node can delegate freeing to ->attr.open
to RCU and other readers of ->attr.open can do so under rcu_read_(un)lock.

Suggested by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Link: https://lore.kernel.org/r/20220615021059.862643-2-imran.f.khan@oracle.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

086c00c7

kernfs/file.c: remove redundant error return counter assignment · dcab8da1

Lin Feng authored Jun 17, 2022

Since previous 'rc = -EINVAL;', rc value doesn't change, so not
necessary to re-assign it again.
Signed-off-by: Lin Feng <linf@wangsu.com>
Link: https://lore.kernel.org/r/20220617091746.206515-1-linf@wangsu.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dcab8da1

driver core: fix potential deadlock in __driver_attach · 70fe7583

Zhang Wensheng authored Jun 22, 2022

In __driver_attach function, There are also AA deadlock problem,
like the commit b232b02b ("driver core: fix deadlock in
__device_attach").

stack like commit b232b02b ("driver core: fix deadlock in
__device_attach").
list below:
    In __driver_attach function, The lock holding logic is as follows:
    ...
    __driver_attach
    if (driver_allows_async_probing(drv))
      device_lock(dev)      // get lock dev
        async_schedule_dev(__driver_attach_async_helper, dev); // func
          async_schedule_node
            async_schedule_node_domain(func)
              entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
              /* when fail or work limit, sync to execute func, but
                 __driver_attach_async_helper will get lock dev as
                 will, which will lead to A-A deadlock.  */
              if (!entry || atomic_read(&entry_count) > MAX_WORK) {
                func;
              else
                queue_work_node(node, system_unbound_wq, &entry->work)
      device_unlock(dev)

    As above show, when it is allowed to do async probes, because of
    out of memory or work limit, async work is not be allowed, to do
    sync execute instead. it will lead to A-A deadlock because of
    __driver_attach_async_helper getting lock dev.

Reproduce:
and it can be reproduce by make the condition
(if (!entry || atomic_read(&entry_count) > MAX_WORK)) untenable, like
below:

[  370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[  370.787154] task:swapper/0       state:D stack:    0 pid:    1 ppid:
0 flags:0x00004000
[  370.788865] Call Trace:
[  370.789374]  <TASK>
[  370.789841]  __schedule+0x482/0x1050
[  370.790613]  schedule+0x92/0x1a0
[  370.791290]  schedule_preempt_disabled+0x2c/0x50
[  370.792256]  __mutex_lock.isra.0+0x757/0xec0
[  370.793158]  __mutex_lock_slowpath+0x1f/0x30
[  370.794079]  mutex_lock+0x50/0x60
[  370.794795]  __device_driver_lock+0x2f/0x70
[  370.795677]  ? driver_probe_device+0xd0/0xd0
[  370.796576]  __driver_attach_async_helper+0x1d/0xd0
[  370.797318]  ? driver_probe_device+0xd0/0xd0
[  370.797957]  async_schedule_node_domain+0xa5/0xc0
[  370.798652]  async_schedule_node+0x19/0x30
[  370.799243]  __driver_attach+0x246/0x290
[  370.799828]  ? driver_allows_async_probing+0xa0/0xa0
[  370.800548]  bus_for_each_dev+0x9d/0x130
[  370.801132]  driver_attach+0x22/0x30
[  370.801666]  bus_add_driver+0x290/0x340
[  370.802246]  driver_register+0x88/0x140
[  370.802817]  ? virtio_scsi_init+0x116/0x116
[  370.803425]  scsi_register_driver+0x1a/0x30
[  370.804057]  init_sd+0x184/0x226
[  370.804533]  do_one_initcall+0x71/0x3a0
[  370.805107]  kernel_init_freeable+0x39a/0x43a
[  370.805759]  ? rest_init+0x150/0x150
[  370.806283]  kernel_init+0x26/0x230
[  370.806799]  ret_from_fork+0x1f/0x30

To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.

Fixes: ef0ff683 ("driver core: Probe devices asynchronously instead of the driver")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

70fe7583

ABI: testing/sysfs-devices-system-cpu: remove duplicated core_id · 1d248d23

Mauro Carvalho Chehab authored Jun 26, 2022

This was already defined at stable/sysfs-devices-system-cpu with
the same description, as pointed by get_abi.pl:

Warning: /sys/devices/system/cpu/cpuX/topology/core_id is defined 2 times: Documentation/ABI/stable/sysfs-devices-system-cpu:38 Documentation/ABI/testing/sysfs-devices-system-cpu:69

Remove the duplicated one.
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Link: https://lore.kernel.org/r/1e92337c1ef74f5eb9e1c1871e20b858b490d269.1656235926.git.mchehab@kernel.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

1d248d23

devtmpfs: fix the dangling pointer of global devtmpfsd thread · 31c779f2

Yangxi Xiang authored Jun 27, 2022

When the devtmpfs fails to mount, a dangling pointer still remains in
global. Specifically, the err variable is passed by a pointer to the
devtmpfsd. When the devtmpfsd exits, it sets the error and completes the
setup_done. In this situation, the thread pointer is not set to null.
After the devtmpfsd exited, the devtmpfs can wakes up the destroyed
devtmpfsd thread by wake_up_process if a device change event comes.
Signed-off-by: Yangxi Xiang <xyangxi5@gmail.com>
Link: https://lore.kernel.org/r/20220627120409.11174-1-xyangxi5@gmail.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

31c779f2

Revert "devcoredump: remove the useless gfp_t parameter in dev_coredumpv and dev_coredumpm" · 38a523a2

Greg Kroah-Hartman authored Jun 27, 2022

This reverts commit 77515eba as it
causes build problems in linux-next.  It needs to be reintroduced in a
way that can allow the api to evolve and not require a "flag day" to
catch all users.

Link: https://lore.kernel.org/r/20220623160723.7a44b573@canb.auug.org.au
Cc: Duoming Zhou <duoming@zju.edu.cn>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

38a523a2

Revert "mwifiex: fix sleep in atomic context bugs caused by dev_coredumpv" · 5f8954e0

Greg Kroah-Hartman authored Jun 27, 2022

This reverts commit a52ed486 as it
causes build problems in linux-next.  It needs to be reintroduced in a
way that can allow the api to evolve and not require a "flag day" to
catch all users.

Link: https://lore.kernel.org/r/20220623160723.7a44b573@canb.auug.org.au
Cc: Duoming Zhou <duoming@zju.edu.cn>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

5f8954e0

21 Jun, 2022 2 commits

mwifiex: fix sleep in atomic context bugs caused by dev_coredumpv · a52ed486

Duoming Zhou authored Jun 07, 2022

There are sleep in atomic context bugs when uploading device dump
data in mwifiex. The root cause is that dev_coredumpv could not
be used in atomic contexts, because it calls dev_set_name which
include operations that may sleep. The call tree shows execution
paths that could lead to bugs:

   (Interrupt context)
fw_dump_timer_fn
  mwifiex_upload_device_dump
    dev_coredumpv(..., GFP_KERNEL)
      dev_coredumpm()
        kzalloc(sizeof(*devcd), gfp); //may sleep
        dev_set_name
          kobject_set_name_vargs
            kvasprintf_const(GFP_KERNEL, ...); //may sleep
            kstrdup(s, GFP_KERNEL); //may sleep

The corresponding fail log is shown below:

[  135.275938] usb 1-1: == mwifiex dump information to /sys/class/devcoredump start
[  135.281029] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:265
...
[  135.293613] Call Trace:
[  135.293613]  <IRQ>
[  135.293613]  dump_stack_lvl+0x57/0x7d
[  135.293613]  __might_resched.cold+0x138/0x173
[  135.293613]  ? dev_coredumpm+0xca/0x2e0
[  135.293613]  kmem_cache_alloc_trace+0x189/0x1f0
[  135.293613]  ? devcd_match_failing+0x30/0x30
[  135.293613]  dev_coredumpm+0xca/0x2e0
[  135.293613]  ? devcd_freev+0x10/0x10
[  135.293613]  dev_coredumpv+0x1c/0x20
[  135.293613]  ? devcd_match_failing+0x30/0x30
[  135.293613]  mwifiex_upload_device_dump+0x65/0xb0
[  135.293613]  ? mwifiex_dnld_fw+0x1b0/0x1b0
[  135.293613]  call_timer_fn+0x122/0x3d0
[  135.293613]  ? msleep_interruptible+0xb0/0xb0
[  135.293613]  ? lock_downgrade+0x3c0/0x3c0
[  135.293613]  ? __next_timer_interrupt+0x13c/0x160
[  135.293613]  ? lockdep_hardirqs_on_prepare+0xe/0x220
[  135.293613]  ? mwifiex_dnld_fw+0x1b0/0x1b0
[  135.293613]  __run_timers.part.0+0x3f8/0x540
[  135.293613]  ? call_timer_fn+0x3d0/0x3d0
[  135.293613]  ? arch_restore_msi_irqs+0x10/0x10
[  135.293613]  ? lapic_next_event+0x31/0x40
[  135.293613]  run_timer_softirq+0x4f/0xb0
[  135.293613]  __do_softirq+0x1c2/0x651
...
[  135.293613] RIP: 0010:default_idle+0xb/0x10
[  135.293613] RSP: 0018:ffff888006317e68 EFLAGS: 00000246
[  135.293613] RAX: ffffffff82ad8d10 RBX: ffff888006301cc0 RCX: ffffffff82ac90e1
[  135.293613] RDX: ffffed100d9ff1b4 RSI: ffffffff831ad140 RDI: ffffffff82ad8f20
[  135.293613] RBP: 0000000000000003 R08: 0000000000000000 R09: ffff88806cff8d9b
[  135.293613] R10: ffffed100d9ff1b3 R11: 0000000000000001 R12: ffffffff84593410
[  135.293613] R13: 0000000000000000 R14: 0000000000000000 R15: 1ffff11000c62fd2
...
[  135.389205] usb 1-1: == mwifiex dump information to /sys/class/devcoredump end

This patch uses delayed work to replace timer and moves the operations
that may sleep into a delayed work in order to mitigate bugs, it was
tested on Marvell 88W8801 chip whose port is usb and the firmware is
usb8801_uapsta.bin. The following is the result after using delayed
work to replace timer.

[  134.936453] usb 1-1: == mwifiex dump information to /sys/class/devcoredump start
[  135.043344] usb 1-1: == mwifiex dump information to /sys/class/devcoredump end

As we can see, there is no bug now.

Fixes: f5ecd02a ("mwifiex: device dump support for usb interface")
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/b63b77fc84ed3e8a6bef02378e17c7c71a0bc3be.1654569290.git.duoming@zju.edu.cnSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a52ed486

devcoredump: remove the useless gfp_t parameter in dev_coredumpv and dev_coredumpm · 77515eba

Duoming Zhou authored Jun 07, 2022

The dev_coredumpv() and dev_coredumpm() could not be used in atomic
context, because they call kvasprintf_const() and kstrdup() with
GFP_KERNEL parameter. The process is shown below:

dev_coredumpv(.., gfp_t gfp)
  dev_coredumpm(.., gfp_t gfp)
    dev_set_name
      kobject_set_name_vargs
        kvasprintf_const(GFP_KERNEL, ...); //may sleep
          kstrdup(s, GFP_KERNEL); //may sleep

This patch removes gfp_t parameter of dev_coredumpv() and dev_coredumpm()
and changes the gfp_t parameter of kzalloc() in dev_coredumpm() to
GFP_KERNEL in order to show they could not be used in atomic context.

Fixes: 833c9545 ("device coredump: add new device coredump class")
Reviewed-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/df72af3b1862bac7d8e793d1f3931857d3779dfd.1654569290.git.duoming@zju.edu.cnSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

77515eba