- 22 Jul, 2023 4 commits
-
-
Yuanjun Gong authored
in gsbi_probe(), the return value of function clk_prepare_enable() should be checked, since it may fail. using devm_clk_get_enabled() instead of devm_clk_get() and clk_prepare_enable() can avoid this problem. Signed-off-by: Yuanjun Gong <ruc_gongyuanjun@163.com> Link: https://lore.kernel.org/r/20230720140834.33557-1-ruc_gongyuanjun@163.com [bjorn: Dropped unnecessary "ret" variable] Signed-off-by: Bjorn Andersson <andersson@kernel.org>
-
Rohit Agarwal authored
Update the SoC SM8[2345]50 entries to use the new generic RPMHPD bindings. Signed-off-by: Rohit Agarwal <quic_rohiagar@quicinc.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/1689744162-9421-3-git-send-email-quic_rohiagar@quicinc.comSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Bjorn Andersson authored
Merge the new generic RPMHPD defines from a topic branch, to alow them being used in DeviceTree source, and the driver.
-
Rohit Agarwal authored
Add Generic RPMh Power Domain indexes that can be used for all the Qualcomm SoC henceforth. The power domain indexes of these bindings are based on compatibility with current targets like SM8[2345]50 targets. Signed-off-by: Rohit Agarwal <quic_rohiagar@quicinc.com> Suggested-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/1689744162-9421-2-git-send-email-quic_rohiagar@quicinc.comSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
- 14 Jul, 2023 11 commits
-
-
Rob Herring authored
The DT of_device.h and of_platform.h date back to the separate of_platform_bus_type before it as merged into the regular platform bus. As part of that merge prepping Arm DT support 13 years ago, they "temporarily" include each other. They also include platform_device.h and of.h. As a result, there's a pretty much random mix of those include files used throughout the tree. In order to detangle these headers and replace the implicit includes with struct declarations, users need to explicitly include the correct includes. Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230714175142.4067795-1-robh@kernel.orgSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Stephan Gerhold authored
Add a simple driver for the qcom,rpm-proc compatible that registers the "smd-edge" and populates other children defined in the device tree. Note that the DT schema belongs to the remoteproc subsystem while this driver is added inside soc/qcom. I argue that the RPM *is* a remoteproc, but as an implementation detail in Linux it can currently not benefit from anything provided by the remoteproc subsystem. The RPM firmware is usually already loaded and started by earlier components in the boot chain and is not meant to be ever restarted. To avoid breaking existing kernel configurations the driver is always built when smd-rpm.c is also built. They belong closely together anyway. To avoid build errors CONFIG_RPMSG_QCOM_SMD must be also built-in if rpm-proc is. Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20230531-rpm-rproc-v3-9-a07dcdefd918@gerhold.netSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Stephan Gerhold authored
Rather than looking up a dummy item from SMEM, use the new qcom_smem_is_available() function to make the code more clear (and reduce the overhead slightly). Add the same check to qcom_smd_register_edge() as well to ensure that it only succeeds if SMEM is already available - if a driver calls the function and SMEM is not available yet then the initial state will be read incorrectly and the RPMSG devices might never become available. Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20230531-rpm-rproc-v3-8-a07dcdefd918@gerhold.netSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Stephan Gerhold authored
Avoid having to look up a dummy item from SMEM to detect if it is already available or if we need to defer probing. Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20230531-rpm-rproc-v3-7-a07dcdefd918@gerhold.netSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Stephan Gerhold authored
On Qualcomm platforms, most subsystems (e.g. audio/modem DSP) are described as remote processors in the device tree, with a dedicated node where properties and services related to them can be described. The Resource Power Manager (RPM) is also such a subsystem, with a remote processor that is running a special firmware. Unfortunately, the RPM never got a dedicated node representing it properly in the device tree. Most of the RPM services are described below a top-level /smd or /rpm-glink node. However, SMD/GLINK is just one of the communication channels to the RPM firmware. For example, the MPM interrupt functionality provided by the RPM does not use SMD/GLINK but writes directly to a special memory region allocated by the RPM firmware in combination with a mailbox. Currently there is no good place in the device tree to describe this functionality. It doesn't belong below SMD/GLINK but it's not an independent top-level device either. Introduce a new "qcom,rpm-proc" compatible that allows describing the RPM as a remote processor/subsystem like all others. The SMD/GLINK node is moved to a "smd-edge"/"glink-edge" subnode consistent with other existing bindings. Additional subnodes (e.g. interrupt-controller for MPM, rpm-master-stats) can be also added there. Deprecate using the old top-level /smd node since all SMD edges are now specified as subnodes of the remote processor. Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230531-rpm-rproc-v3-6-a07dcdefd918@gerhold.netSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Stephan Gerhold authored
Semantically glink-edge and glink-rpm-edge are similar: Both describe the communication channels to a remote processor. The RPM glink-edge is a special case that needs slightly different properties but otherwise it is used exactly the same. To improve consistency use the same "glink-edge" node name also for glink-rpm-edge. Drop the $nodename from qcom,glink-edge.yaml to avoid matching the wrong schema. qcom,glink-edge.yaml is always referenced explicitly from other schemas. This will already ensure that the nodes are being checked, so it's not necessary to bind to all nodes named "glink-edge". Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230531-rpm-rproc-v3-5-a07dcdefd918@gerhold.netSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Stephan Gerhold authored
There is an ever growing list of compatibles in the smd-rpm.c driver. A fallback compatible would help here but would still require keeping the current list around for backwards compatibility. As an alternative, let's switch the driver to match the rpmsg_device_id instead, which is always "rpm_requests" on all platforms. Add a check to ensure that there is a device tree node defined for the device since otherwise the of_platform_populate() call will operate on the root node (/). Similar approaches with matching rpmsg_device_id are already used in qcom_sysmon, qcom_glink_ssr, qrtr, and rpmsg_wwan_ctrl. Tested-by: Konrad Dybcio <konrad.dybcio@linaro.org> # SM6375 (G-Link) Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20230531-rpm-rproc-v3-4-a07dcdefd918@gerhold.netSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Stephan Gerhold authored
To avoid several more small patches adding new RPM compatibles in the future, add MDM9607, MSM8610, MSM8917, MSM8937 and MSM8952 at once. All of these have been worked on over the time by some people and are definitely compatible as-is with the smd-rpm driver. Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20230531-rpm-rproc-v3-3-a07dcdefd918@gerhold.netSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Stephan Gerhold authored
MSM8909 is using qcom,smd-channels but is missing in the list, add it there as well. Fixes: 709d473d ("dt-bindings: soc: qcom: smd-rpm: Add MSM8909") Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20230531-rpm-rproc-v3-2-a07dcdefd918@gerhold.netSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Stephan Gerhold authored
Some of the enum entries are not properly ordered, fix that. Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20230531-rpm-rproc-v3-1-a07dcdefd918@gerhold.netSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Christophe JAILLET authored
Use struct_size() instead of hand-writing it, when allocating a structure with a flex array. This is less verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/f74328551cfab0262ba353f37d047ac74bf616e1.1689194490.git.christophe.jaillet@wanadoo.frSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
- 10 Jul, 2023 11 commits
-
-
Yangtao Li authored
Use devm_platform_ioremap_resource() to simplify code. Signed-off-by: Yangtao Li <frank.li@vivo.com> Link: https://lore.kernel.org/r/20230705122644.32236-3-frank.li@vivo.comSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Konrad Dybcio authored
Just like all other Qualcomm SoCs, SC8280XP SCM should be fed an interconnect path. Do so. Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20230622-topic-8280scmicc-v1-1-6ef318919ea5@linaro.orgSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Bjorn Andersson authored
When tracing messages written to the RSC it's very useful to know the type of TCS being targeted, in particular if/when the code borrows a WAKE TCS for ACTIVE votes. Add the "state" of the message to the traced information. While at it, drop the "send-msg:" substring, as this is already captured by the trace event itself. Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Acked-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20230620230058.428833-1-quic_bjorande@quicinc.comSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Bjorn Andersson authored
The debugfs dump of Command DB relies uses %*pEp to print the resource identifiers, with escaping of non-printable characters. But p (ESCAPE_NP) does not escape NUL characters, so for identifiers less than 8 bytes in length the output will retain these. This does not cause an issue while looking at the dump in the terminal (no known complaints at least), but when programmatically consuming the debugfs output the extra characters are unwanted. Change the fixed 8-byte sizeof() to a dynamic strnlen() to avoid printing these NUL characters. Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20230620213703.283583-1-quic_bjorande@quicinc.comSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Konrad Dybcio authored
Add a sync_state implementation, very similar to the one already present in the RPMhPD driver. Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20230619-topic-rpmpd_syncstate-v1-1-54f986cf9444@linaro.orgSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Luca Weiss authored
The msm8226 SoC also contains OCMEM but with one region only. Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Link: https://lore.kernel.org/r/20230506-msm8226-ocmem-v3-5-79da95a2581f@z3ntu.xyzSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Luca Weiss authored
Add the compatible for the OCMEM found on msm8226 which compared to msm8974 only has a core clock and no iface clock. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Link: https://lore.kernel.org/r/20230506-msm8226-ocmem-v3-4-79da95a2581f@z3ntu.xyzSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Luca Weiss authored
Some platforms such as msm8226 do not have an iface clk. Since clk_bulk APIs don't offer to a way to treat some clocks as optional simply add core_clk and iface_clk members to our drvdata. Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Link: https://lore.kernel.org/r/20230506-msm8226-ocmem-v3-3-79da95a2581f@z3ntu.xyzSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Luca Weiss authored
Use dev_err_probe in the driver probe function where useful, to simplify getting PTR_ERR and to ensure the underlying errors are included in the error message. Reviewed-by: Caleb Connolly <caleb.connolly@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Link: https://lore.kernel.org/r/20230506-msm8226-ocmem-v3-2-79da95a2581f@z3ntu.xyzSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Luca Weiss authored
Since we're using these two macros to read a value from a register, we need to use the FIELD_GET instead of the FIELD_PREP macro, otherwise we're getting wrong values. So instead of: [ 3.111779] ocmem fdd00000.sram: 2 ports, 1 regions, 512 macros, not interleaved we now get the correct value of: [ 3.129672] ocmem fdd00000.sram: 2 ports, 1 regions, 2 macros, not interleaved Fixes: 88c1e940 ("soc: qcom: add OCMEM driver") Reviewed-by: Caleb Connolly <caleb.connolly@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Link: https://lore.kernel.org/r/20230506-msm8226-ocmem-v3-1-79da95a2581f@z3ntu.xyzSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
Konrad Dybcio authored
Currently we use predefined initial threshold values. This works, but does not really scale well with more and more SoCs gaining bwmon support, as the necessary kickoff values may differ between platforms due to memory type and/or controller setup. All of the data we need for that is already provided in the device tree, anyway. Change the thresholds to: * low = 0 (as we've been doing up until now) * med = high = BW_MIN Throughput going below the med threshold nudges bwmon into signaling that we should slow down (e.g. if we inherited too high bandwidth from the bootloader). Throughput going above the high threshold nudges bwmon into signaling that we should speed up so as not to choke the bus traffic due to insufficient transfer rates. F_MIN is a perfect initial value for both of these cases - if we go above it (and there's a 99.99% chance it'll happen at boot time), we should definitely make the memory go faster, whereas if we go below it, we should slow down, no matter what performance state we were at before (it's only possible for them to be >= FMIN). This only changes the values programmed at probe time, as high and med thresholds are updated at interrupt, also based on the OPP table from DT. Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230610-topic-bwmon_opp-v2-1-0d25c1ce7dca@linaro.orgSigned-off-by: Bjorn Andersson <andersson@kernel.org>
-
- 09 Jul, 2023 10 commits
-
-
Linus Torvalds authored
-
Linus Torvalds authored
We just sorted the entries and fields last release, so just out of a perverse sense of curiosity, I decided to see if we can keep things ordered for even just one release. The answer is "No. No we cannot". I suggest that all kernel developers will need weekly training sessions, involving a lot of Big Bird and Sesame Street. And at the yearly maintainer summit, we will all sing the alphabet song together. I doubt I will keep doing this. At some point "perverse sense of curiosity" turns into just a cold dark place filled with sadness and despair. Repeats: 80e62bc8 ("MAINTAINERS: re-sort all entries and fields") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.infradead.org/users/hch/dma-mappingLinus Torvalds authored
Pull dma-mapping fixes from Christoph Hellwig: - swiotlb area sizing fixes (Petr Tesarik) * tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: reduce the number of areas to match actual memory pool size swiotlb: always set the number of areas before allocating the pool
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq update from Borislav Petkov: - Optimize IRQ domain's name assignment * tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdomain: Use return value of strreplace()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fpu fix from Borislav Petkov: - Do FPU AP initialization on Xen PV too which got missed by the recent boot reordering work * tag 'x86_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/xen: Fix secondary processors' FPU initialization
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fix from Thomas Gleixner: "A single fix for the mechanism to park CPUs with an INIT IPI. On shutdown or kexec, the kernel tries to park the non-boot CPUs with an INIT IPI. But the same code path is also used by the crash utility. If the CPU which panics is not the boot CPU then it sends an INIT IPI to the boot CPU which resets the machine. Prevent this by validating that the CPU which runs the stop mechanism is the boot CPU. If not, leave the other CPUs in HLT" * tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/smp: Don't send INIT to boot CPU
-
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds authored
Pull MIPS fixes from Thomas Bogendoerfer: - fixes for KVM - fix for loongson build and cpu probing - DT fixes * tag 'mips_6.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: kvm: Fix build error with KVM_MIPS_DEBUG_COP0_COUNTERS enabled MIPS: dts: add missing space before { MIPS: Loongson: Fix build error when make modules_install MIPS: KVM: Fix NULL pointer dereference MIPS: Loongson: Fix cpu_probe_loongson() again
-
git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds authored
Pull xfs fix from Darrick Wong: "Nothing exciting here, just getting rid of a gcc warning that I got tired of seeing when I turn on gcov" * tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix uninit warning in xfs_growfs_data
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull more smb client updates from Steve French: - fix potential use after free in unmount - minor cleanup - add worker to cleanup stale directory leases * tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: Add a laundromat thread for cached directories smb: client: remove redundant pointer 'server' cifs: fix session state transition to avoid use-after-free issue
-
https://github.com/jonmason/ntbLinus Torvalds authored
Pull NTB updates from Jon Mason: "Fixes for pci_clean_master, error handling in driver inits, and various other issues/bugs" * tag 'ntb-6.5' of https://github.com/jonmason/ntb: ntb: hw: amd: Fix debugfs_create_dir error checking ntb.rst: Fix copy and paste error ntb_netdev: Fix module_init problem ntb: intel: Remove redundant pci_clear_master ntb: epf: Remove redundant pci_clear_master ntb_hw_amd: Remove redundant pci_clear_master ntb: idt: drop redundant pci_enable_pcie_error_reporting() MAINTAINERS: git://github -> https://github.com for jonmason NTB: EPF: fix possible memory leak in pci_vntb_probe() NTB: ntb_tool: Add check for devm_kcalloc NTB: ntb_transport: fix possible memory leak while device_register() fails ntb: intel: Fix error handling in intel_ntb_pci_driver_init() NTB: amd: Fix error handling in amd_ntb_pci_driver_init() ntb: idt: Fix error handling in idt_pci_driver_init()
-
- 08 Jul, 2023 4 commits
-
-
Hugh Dickins authored
Lockdep is certainly right to complain about (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f but task is already holding lock: (&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db Invert those to the usual ordering. Fixes: 33313a74 ("mm: lock newly mapped VMA which can be modified after it becomes visible") Cc: stable@vger.kernel.org Signed-off-by: Hugh Dickins <hughd@google.com> Tested-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues" The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since it was all hopefully fixed in mainline. * tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: lib: dhry: fix sleeping allocations inside non-preemptable section kasan, slub: fix HW_TAGS zeroing with slub_debug kasan: fix type cast in memory_is_poisoned_n mailmap: add entries for Heiko Stuebner mailmap: update manpage link bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page MAINTAINERS: add linux-next info mailmap: add Markus Schneider-Pargmann writeback: account the number of pages written back mm: call arch_swap_restore() from do_swap_page() squashfs: fix cache race with migration mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison docs: update ocfs2-devel mailing list address MAINTAINERS: update ocfs2-devel mailing list address mm: disable CONFIG_PER_VMA_LOCK until its fixed fork: lock VMAs of the parent process when forking
-
Suren Baghdasaryan authored
When forking a child process, the parent write-protects anonymous pages and COW-shares them with the child being forked using copy_present_pte(). We must not take any concurrent page faults on the source vma's as they are being processed, as we expect both the vma and the pte's behind it to be stable. For example, the anon_vma_fork() expects the parents vma->anon_vma to not change during the vma copy. A concurrent page fault on a page newly marked read-only by the page copy might trigger wp_page_copy() and a anon_vma_prepare(vma) on the source vma, defeating the anon_vma_clone() that wasn't done because the parent vma originally didn't have an anon_vma, but we now might end up copying a pte entry for a page that has one. Before the per-vma lock based changes, the mmap_lock guaranteed exclusion with concurrent page faults. But now we need to do a vma_start_write() to make sure no concurrent faults happen on this vma while it is being processed. This fix can potentially regress some fork-heavy workloads. Kernel build time did not show noticeable regression on a 56-core machine while a stress test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~5% regression. If such fork time regression is unacceptable, disabling CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations are possible if this regression proves to be problematic. Suggested-by: David Hildenbrand <david@redhat.com> Reported-by: Jiri Slaby <jirislaby@kernel.org> Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com> Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/Reported-by: Jacob Young <jacobly.alt@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624 Fixes: 0bff0aae ("x86/mm: try VMA lock-based page fault handling first") Cc: stable@vger.kernel.org Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Suren Baghdasaryan authored
mmap_region adds a newly created VMA into VMA tree and might modify it afterwards before dropping the mmap_lock. This poses a problem for page faults handled under per-VMA locks because they don't take the mmap_lock and can stumble on this VMA while it's still being modified. Currently this does not pose a problem since post-addition modifications are done only for file-backed VMAs, which are not handled under per-VMA lock. However, once support for handling file-backed page faults with per-VMA locks is added, this will become a race. Fix this by write-locking the VMA before inserting it into the VMA tree. Other places where a new VMA is added into VMA tree do not modify it after the insertion, so do not need the same locking. Cc: stable@vger.kernel.org Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-