Commits · 472e5b056f000a778abb41f1e443de58eb259783 · nexedi / linux

02 Oct, 2020 1 commit

pipe: remove pipe_wait() and fix wakeup race with splice · 472e5b05

Linus Torvalds authored Oct 01, 2020

The pipe splice code still used the old model of waiting for pipe IO by
using a non-specific "pipe_wait()" that waited for any pipe event to
happen, which depended on all pipe IO being entirely serialized by the
pipe lock. So by checking the state you were waiting for, and then
adding yourself to the wait queue before dropping the lock, you were
guaranteed to see all the wakeups.

Strictly speaking, the actual wakeups were not done under the lock, but
the pipe_wait() model still worked, because since the waiter held the
lock when checking whether it should sleep, it would always see the
current state, and the wakeup was always done after updating the state.

However, commit 0ddad21d ("pipe: use exclusive waits when reading or
writing") split the single wait-queue into two, and in the process also
made the "wait for event" code wait for _two_ wait queues, and that then
showed a race with the wakers that were not serialized by the pipe lock.

It's only splice that used that "pipe_wait()" model, so the problem
wasn't obvious, but Josef Bacik reports:

"I hit a hang with fstest btrfs/187, which does a btrfs send into
/dev/null. This works by creating a pipe, the write side is given to
the kernel to write into, and the read side is handed to a thread that
splices into a file, in this case /dev/null.

The box that was hung had the write side stuck here [pipe_write] and
the read side stuck here [splice_from_pipe_next -> pipe_wait].

[ more details about pipe_wait() scenario ]

The problem is we're doing the prepare_to_wait, which sets our state
each time, however we can be woken up either with reads or writes. In
the case above we race with the WRITER waking us up, and re-set our
state to INTERRUPTIBLE, and thus never break out of schedule"

Josef had a patch that avoided the issue in pipe_wait() by just making
it set the state only once, but the deeper problem is that pipe_wait()
depends on a level of synchonization by the pipe mutex that it really
shouldn't. And the whole "wait for any pipe state change" model really
isn't very good to begin with.

So rather than trying to work around things in pipe_wait(), remove that
legacy model of "wait for arbitrary pipe event" entirely, and actually
create functions that wait for the pipe actually being readable or
writable, and can do so without depending on the pipe lock serializing
everything.

Fixes: 0ddad21d ("pipe: use exclusive waits when reading or writing")
Link: https://lore.kernel.org/linux-fsdevel/bfa88b5ad6f069b2b679316b9e495a970130416c.1601567868.git.josef@toxicpanda.com/Reported-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-and-tested-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

472e5b05

01 Oct, 2020 7 commits

Merge tag 'iommu-fixes-v5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 44b6e23b

Linus Torvalds authored Oct 01, 2020

Pull iommu fixes from Joerg Roedel:

 - Fix a device reference counting bug in the Exynos IOMMU driver.

 - Lockdep fix for the Intel VT-d driver.

 - Fix a bug in the AMD IOMMU driver which caused corruption of the IVRS
   ACPI table and caused IOMMU driver initialization failures in kdump
   kernels.

* tag 'iommu-fixes-v5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Fix lockdep splat in iommu_flush_dev_iotlb()
  iommu/amd: Fix the overwritten field in IVMD header
  iommu/exynos: add missing put_device() call in exynos_iommu_of_xlate()

44b6e23b

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · eed2ef44

Linus Torvalds authored Oct 01, 2020

Pull arm64 fix from Catalin Marinas:
 "A previous commit to prevent AML memory opregions from accessing the
  kernel memory turned out to be too restrictive. Relax the permission
  check to permit the ACPI core to map kernel memory used for table
  overrides"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: permit ACPI core to map kernel memory used for table overrides

eed2ef44

Merge tag 'drm-fixes-2020-10-01-1' of git://anongit.freedesktop.org/drm/drm · fcadab74

Linus Torvalds authored Oct 01, 2020

Pull drm fixes from Dave Airlie:
 "AMD and vmwgfx fixes.

  Just dequeuing these a bit early as the AMD ones are bit larger than
  I'd prefer, but Alex missed last week so it's a double set of fixes.
  The larger ones are just register header fixes for the new chips that
  were just introduced in rc1 along with some new PCI IDs for new hw.
  Otherwise it is usual fixes.

  The vmwgfx fix was due to some testing I was doing and found we
  weren't booting properly, vmware had the fix internally so hurried it

  vmwgfx:
   - fix a regression due to TTM refactor

  amdgpu:
   - Fix potential double free in userptr handling
   - Sienna Cichlid and Navy Flounder udpates
   - Add Sienna Cichlid PCI IDs
   - Drop experimental flag for navi12
   - Raven fixes
   - Renoir fixes
   - HDCP fix
   - DCN3 fix for clang and older versions of gcc
   - Fix a runtime pm refcount issue"

* tag 'drm-fixes-2020-10-01-1' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu: disable gfxoff temporarily for navy_flounder
  drm/amd/pm: setup APU dpm clock table in SMU HW initialization
  drm/vmwgfx: Fix error handling in get_node
  drm/amd/display: remove duplicate call to rn_vbios_smu_get_smu_version()
  drm/amdgpu/swsmu/smu12: fix force clock handling for mclk
  drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config
  drm/amdgpu/display: fix CFLAGS setup for DCN30
  drm/amd/display: fix return value check for hdcp_work
  drm/amdgpu: remove gpu_info fw support for sienna_cichlid etc.
  drm/amd/pm: Removed fixed clock in auto mode DPM
  drm/amdgpu: remove experimental flag from navi12
  drm/amdgpu: add device ID for sienna_cichlid (v2)
  drm/amdgpu: use the AV1 defines for VCN 3.0
  drm/amdgpu: add VCN 3.0 AV1 registers
  drm/amdgpu: add the GC 10.3 VRS registers
  drm/amdgpu: prevent double kfree ttm->sg

fcadab74

Merge tag 'trace-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · aa5ff935

Linus Torvalds authored Oct 01, 2020

Pull tracing fixes from Steven Rostedt:
 "Two tracing fixes:

   - Fix temp buffer accounting that caused a WARNING for
     ftrace_dump_on_opps()

   - Move the recursion check in one of the function callback helpers to
     the beginning of the function, as if the rcu_is_watching() gets
     traced, it will cause a recursive loop that will crash the kernel"

* tag 'trace-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Move RCU is watching check after recursion check
  tracing: Fix trace_find_next_entry() accounting of temp buffer size

aa5ff935

iommu/vt-d: Fix lockdep splat in iommu_flush_dev_iotlb() · 1a3f2fd7

Lu Baolu authored Sep 27, 2020

Lock(&iommu->lock) without disabling irq causes lockdep warnings.

[   12.703950] ========================================================
[   12.703962] WARNING: possible irq lock inversion dependency detected
[   12.703975] 5.9.0-rc6+ #659 Not tainted
[   12.703983] --------------------------------------------------------
[   12.703995] systemd-udevd/284 just changed the state of lock:
[   12.704007] ffffffffbd6ff4d8 (device_domain_lock){..-.}-{2:2}, at:
               iommu_flush_dev_iotlb.part.57+0x2e/0x90
[   12.704031] but this lock took another, SOFTIRQ-unsafe lock in the past:
[   12.704043]  (&iommu->lock){+.+.}-{2:2}
[   12.704045]

               and interrupts could create inverse lock ordering between
               them.

[   12.704073]
               other info that might help us debug this:
[   12.704085]  Possible interrupt unsafe locking scenario:

[   12.704097]        CPU0                    CPU1
[   12.704106]        ----                    ----
[   12.704115]   lock(&iommu->lock);
[   12.704123]                                local_irq_disable();
[   12.704134]                                lock(device_domain_lock);
[   12.704146]                                lock(&iommu->lock);
[   12.704158]   <Interrupt>
[   12.704164]     lock(device_domain_lock);
[   12.704174]
                *** DEADLOCK ***
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200927062428.13713-1-baolu.lu@linux.intel.comSigned-off-by: Joerg Roedel <jroedel@suse.de>

1a3f2fd7

iommu/amd: Fix the overwritten field in IVMD header · 0bbe4ced

Adrian Huang authored Sep 26, 2020

Commit 387caf0b ("iommu/amd: Treat per-device exclusion
ranges as r/w unity-mapped regions") accidentally overwrites
the 'flags' field in IVMD (struct ivmd_header) when the I/O
virtualization memory definition is associated with the
exclusion range entry. This leads to the corrupted IVMD table
(incorrect checksum). The kdump kernel reports the invalid checksum:

ACPI BIOS Warning (bug): Incorrect checksum in table [IVRS] - 0x5C, should be 0x60 (20200717/tbprint-177)
AMD-Vi: [Firmware Bug]: IVRS invalid checksum

Fix the above-mentioned issue by modifying the 'struct unity_map_entry'
member instead of the IVMD header.

Cleanup: The *exclusion_range* functions are not used anymore, so
get rid of them.

Fixes: 387caf0b ("iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions")
Reported-and-tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
Cc: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20200926102602.19177-1-adrianhuang0701@gmail.comSigned-off-by: Joerg Roedel <jroedel@suse.de>

0bbe4ced

Merge tag 'amd-drm-fixes-5.9-2020-09-30' of... · 132d7c8a

Dave Airlie authored Oct 01, 2020

Merge tag 'amd-drm-fixes-5.9-2020-09-30' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

amd-drm-fixes-5.9-2020-09-30:

amdgpu:
- Fix potential double free in userptr handling
- Sienna Cichlid and Navy Flounder udpates
- Add Sienna Cichlid PCI IDs
- Drop experimental flag for navi12
- Raven fixes
- Renoir fixes
- HDCP fix
- DCN3 fix for clang and older versions of gcc
- Fix a runtime pm refcount issue
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200930161326.4243-1-alexander.deucher@amd.com

132d7c8a

30 Sep, 2020 8 commits

arm64: permit ACPI core to map kernel memory used for table overrides · a509a66a

Ard Biesheuvel authored Sep 29, 2020

Jonathan reports that the strict policy for memory mapped by the
ACPI core breaks the use case of passing ACPI table overrides via
initramfs. This is due to the fact that the memory type used for
loading the initramfs in memory is not recognized as a memory type
that is typically used by firmware to pass firmware tables.

Since the purpose of the strict policy is to ensure that no AML or
other ACPI code can manipulate any memory that is used by the kernel
to keep its internal state or the state of user tasks, we can relax
the permission check, and allow mappings of memory that is reserved
and marked as NOMAP via memblock, and therefore not covered by the
linear mapping to begin with.

Fixes: 1583052d ("arm64/acpi: disallow AML memory opregions to access kernel memory")
Fixes: 325f5585 ("arm64/acpi: disallow writeable AML opregion mapping for EFI code regions")
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Link: https://lore.kernel.org/r/20200929132522.18067-1-ardb@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

a509a66a

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 60e72093

Linus Torvalds authored Sep 30, 2020

Pull clk fixes from Stephen Boyd:
 "Another batch of clk driver fixes:

   - Make sure DRAM and ChipID region doesn't get disabled on Exynos

   - Fix a SATA failure on Tegra

   - Fix the emac_ptp clk divider on stratix10"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: socfpga: stratix10: fix the divider for the emac_ptp_free_clk
  clk: samsung: exynos4: mark 'chipid' clock as CLK_IGNORE_UNUSED
  clk: tegra: Fix missing prototype for tegra210_clk_register_emc()
  clk: tegra: Always program PLL_E when enabled
  clk: tegra: Capitalization fixes
  clk: samsung: Keep top BPLL mux on Exynos542x enabled

60e72093

drm/amdgpu: disable gfxoff temporarily for navy_flounder · 95433a13

Jiansong Chen authored Sep 30, 2020

gfxoff is temporarily disabled for navy_flounder, since
at present the feature caused some tdr when performing
display operations.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

95433a13

drm/amd/pm: setup APU dpm clock table in SMU HW initialization · b1951525

Evan Quan authored Sep 30, 2020

As the dpm clock table is needed during DC HW initialization.
And that (DC HW initialization) comes before smu_late_init()
where current APU dpm clock table setup is performed. So, NULL
pointer dereference will be triggered. By moving APU dpm clock
table setup to smu_hw_init(), this can be avoided.

Fixes: 02cf91c1 ("drm/amd/powerplay: postpone operations not required for hw setup to late_init")
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b1951525

Merge branch 'vmwgfx-fixes-5.9' of git://people.freedesktop.org/~sroland/linux into drm-fixes · 6f4fc18f

Dave Airlie authored Sep 30, 2020

One vmwgfx regression fix.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: "Roland Scheidegger (VMware)" <rscheidegger.oss@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200930041000.2423-1-rscheidegger.oss@gmail.com

6f4fc18f

drm/vmwgfx: Fix error handling in get_node · f54c4442

Zack Rusin authored Sep 25, 2020

ttm_mem_type_manager_func.get_node was changed to return -ENOSPC
instead of setting the node pointer to NULL. Unfortunately
vmwgfx still had two places where it was explicitly converting
-ENOSPC to 0 causing regressions. This fixes those spots by
allowing -ENOSPC to be returned. That seems to fix recent
regressions with vmwgfx.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Sigend-off-by: Roland Scheidegger <sroland@vmware.com>

f54c4442

Merge tag 'devicetree-fixes-for-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 02de58b2

Linus Torvalds authored Sep 29, 2020

Pull devicetree fixes from Rob Herring:

 - Fix handling of HOST_EXTRACFLAGS for dtc

 - Several warning fixes for DT bindings

* tag 'devicetree-fixes-for-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  scripts/dtc: only append to HOST_EXTRACFLAGS instead of overwriting
  dt-bindings: Fix 'reg' size issues in zynqmp examples
  ARM: dts: bcm2835: Change firmware compatible from simple-bus to simple-mfd
  dt-bindings: leds: cznic,turris-omnia-leds: fix error in binding
  dt-bindings: crypto: sa2ul: fix a DT binding check warning

02de58b2

autofs: use __kernel_write() for the autofs pipe writing · 90fb7027

Linus Torvalds authored Sep 29, 2020

autofs got broken in some configurations by commit 13c164b1
("autofs: switch to kernel_write") because there is now an extra LSM
permission check done by security_file_permission() in rw_verify_area().

autofs is one if the few places that really does want the much more
limited __kernel_write(), because the write is an internal kernel one
that shouldn't do any user permission checks (it also doesn't need the
file_start_write/file_end_write logic, since it's just a pipe).

There are a couple of other cases like that - accounting, core dumping,
and splice - but autofs stands out because it can be built as a module.

As a result, we need to export this internal __kernel_write() function
again.

We really don't want any other module to use this, but we don't have a
"EXPORT_SYMBOL_FOR_AUTOFS_ONLY()".  But we can mark it GPL-only to at
least approximate that "internal use only" for licensing.

While in this area, make autofs pass in NULL for the file position
pointer, since it's always a pipe, and we now use a NULL file pointer
for streaming file descriptors (see file_ppos() and commit 438ab720:
"vfs: pass ppos=NULL to .read()/.write() of FMODE_STREAM files")

This effectively reverts commits 9db97752 ("fs: unexport
__kernel_write") and 13c164b1 ("autofs: switch to kernel_write").

Fixes: 13c164b1 ("autofs: switch to kernel_write")
Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

90fb7027

29 Sep, 2020 13 commits

drm/amd/display: remove duplicate call to rn_vbios_smu_get_smu_version() · 548c7ba7

Dirk Gouders authored Sep 27, 2020

Commit 78fe9f63 ("drm/amd/display: Remove DISPCLK Limit Floor for Certain SMU Versions")
added a call to rn_vbios_smu_get_smu_version() to set clk_mgr->smu_ver.
That field is initialized prior to the if-statement, already.

Fixes: 78fe9f63 (drm/amd/display: Remove DISPCLK Limit Floor for Certain SMU Versions)
Signed-off-by: Dirk Gouders <dirk@gouders.net>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Sung Lee <sung.lee@amd.com>
Cc: Yongqiang Sun <yongqiang.sun@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

548c7ba7

drm/amdgpu/swsmu/smu12: fix force clock handling for mclk · 3c26d031

Alex Deucher authored Sep 28, 2020

The state array is in the reverse order compared to other asics
(high to low rather than low to high).

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1313Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3c26d031

drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config · a39d0d7b

Jean Delvare authored Sep 28, 2020

A recent attempt to fix a ref count leak in
amdgpu_display_crtc_set_config() turned out to be doing too much and
"fixed" an intended decrease as if it were a leak. Undo that part to
restore the proper balance. This is the very nature of this function
to increase or decrease the power reference count depending on the
situation.

Consequences of this bug is that the power reference would
eventually get down to 0 while the display was still in use,
resulting in that display switching off unexpectedly.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: e008fa6f ("drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config")
Cc: stable@vger.kernel.org
Cc: Navid Emamdoost <navid.emamdoost@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a39d0d7b

drm/amdgpu/display: fix CFLAGS setup for DCN30 · c73d05ea

Alex Deucher authored Sep 22, 2020

Properly handle clang and older versions of gcc.

Fixes: e77165bf ("drm/amd/display: Add DCN3 blocks to Makefile")
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c73d05ea

drm/amd/display: fix return value check for hdcp_work · 898c7302

Flora Cui authored Sep 23, 2020

max_caps might be 0, thus hdcp_work might be ZERO_SIZE_PTR
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

898c7302

drm/amdgpu: remove gpu_info fw support for sienna_cichlid etc. · 0c701415

Jiansong Chen authored Sep 23, 2020

Remove gpu_info fw support for sienna_cichlid etc., since the
information can be retrieved from discovery binary.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0c701415

drm/amd/pm: Removed fixed clock in auto mode DPM · 97cf3299

Sudheesh Mavila authored Sep 15, 2020

SMU10_UMD_PSTATE_PEAK_FCLK value should not be used to set the DPM.
Suggested-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

97cf3299

scripts/dtc: only append to HOST_EXTRACFLAGS instead of overwriting · efe84d40

Uwe Kleine-König authored Sep 19, 2020

When building with

	$ HOST_EXTRACFLAGS=-g make

the expectation is that host tools are built with debug informations.
This however doesn't happen if the Makefile assigns a new value to the
HOST_EXTRACFLAGS instead of appending to it. So use += instead of := for
the first assignment.

Fixes: e3fd9b53 ("scripts/dtc: consolidate include path options in Makefile")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Rob Herring <robh@kernel.org>

efe84d40

dt-bindings: Fix 'reg' size issues in zynqmp examples · 64ff609b

Rob Herring authored Sep 28, 2020

The default sizes in examples for 'reg' are 1 cell each. Fix the
incorrect sizes in zynqmp examples:

Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.example.dt.yaml: example-0: dma-controller@fd4c0000:reg:0: [0, 4249616384, 0, 4096] is too long
From schema: /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml: example-0: display@fd4a0000:reg:0: [0, 4249485312, 0, 4096] is too long
From schema: /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml: example-0: display@fd4a0000:reg:1: [0, 4249526272, 0, 4096] is too long
From schema: /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml: example-0: display@fd4a0000:reg:2: [0, 4249530368, 0, 4096] is too long
From schema: /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml: example-0: display@fd4a0000:reg:3: [0, 4249534464, 0, 4096] is too long
From schema: /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml

Cc: Hyun Kwon <hyun.kwon@xilinx.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Cc: dmaengine@vger.kernel.org
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Rob Herring <robh@kernel.org>

64ff609b

Merge tag 'dmaengine-fix-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · ccc1d052

Linus Torvalds authored Sep 29, 2020

Pull dmaengine fix from Vinod Koul:
 "Fix dmatest for misconfigured channel"

* tag 'dmaengine-fix-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine: dmatest: Prevent to run on misconfigured channel

ccc1d052

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 1ccfa66d

Linus Torvalds authored Sep 29, 2020

Pull virtio fixes from Michael Tsirkin:
 "A couple of last minute fixes"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost-vdpa: fix backend feature ioctls
  vhost: Fix documentation

1ccfa66d

ftrace: Move RCU is watching check after recursion check · b40341fa

Steven Rostedt (VMware) authored Sep 29, 2020

The first thing that the ftrace function callback helper functions should do
is to check for recursion. Peter Zijlstra found that when
"rcu_is_watching()" had its notrace removed, it caused perf function tracing
to crash. This is because the call of rcu_is_watching() is tested before
function recursion is checked and and if it is traced, it will cause an
infinite recursion loop.

rcu_is_watching() should still stay notrace, but to prevent this should
never had crashed in the first place. The recursion prevention must be the
first thing done in callback functions.

Link: https://lore.kernel.org/r/20200929112541.GM2628@hirez.programming.kicks-ass.net

Cc: stable@vger.kernel.org
Cc: Paul McKenney <paulmck@kernel.org>
Fixes: c68c0fa2 ("ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

b40341fa

tracing: Fix trace_find_next_entry() accounting of temp buffer size · 851e6f61

Steven Rostedt (VMware) authored Sep 29, 2020

The temp buffer size variable for trace_find_next_entry() was incorrectly
being updated when the size did not change. The temp buffer size should only
be updated when it is reallocated.

This is mostly an issue when used with ftrace_dump(). That's because
ftrace_dump() can not allocate a new buffer, and instead uses a temporary
buffer with a fix size. But the variable that keeps track of that size is
incorrectly updated with each call, and it could fall into the path that
would try to reallocate the buffer and produce a warning.

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 1601 at kernel/trace/trace.c:3548
trace_find_next_entry+0xd0/0xe0
 Modules linked in [..]
 CPU: 1 PID: 1601 Comm: bash Not tainted 5.9.0-rc5-test+ #521
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03
07/14/2016
 RIP: 0010:trace_find_next_entry+0xd0/0xe0
 Code: 40 21 00 00 4c 89 e1 31 d2 4c 89 ee 48 89 df e8 c6 9e ff ff 89 ab 54
21 00 00 5b 5d 41 5c 41 5d c3 48 63 d5 eb bf 31 c0 eb f0 <0f> 0b 48 63 d5 eb
b4 66 0f 1f 84 00 00 00 00 00 53 48 8d 8f 60 21
 RSP: 0018:ffff95a4f2e8bd70 EFLAGS: 00010046
 RAX: ffffffff96679fc0 RBX: ffffffff97910de0 RCX: ffffffff96679fc0
 RDX: ffff95a4f2e8bd98 RSI: ffff95a4ee321098 RDI: ffffffff97913000
 RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000046 R12: ffff95a4f2e8bd98
 R13: 0000000000000000 R14: ffff95a4ee321098 R15: 00000000009aa301
 FS:  00007f8565484740(0000) GS:ffff95a55aa40000(0000)
knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055876bd43d90 CR3: 00000000b76e6003 CR4: 00000000001706e0
 Call Trace:
  trace_print_lat_context+0x58/0x2d0
  ? cpumask_next+0x16/0x20
  print_trace_line+0x1a4/0x4f0
  ftrace_dump.cold+0xad/0x12c
  __handle_sysrq.cold+0x51/0x126
  write_sysrq_trigger+0x3f/0x4a
  proc_reg_write+0x53/0x80
  vfs_write+0xca/0x210
  ksys_write+0x70/0xf0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f8565579487
 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa
64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff
77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
 RSP: 002b:00007ffd40707948 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
 RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f8565579487
 RDX: 0000000000000002 RSI: 000055876bd74de0 RDI: 0000000000000001
 RBP: 000055876bd74de0 R08: 000000000000000a R09: 0000000000000001
 R10: 000055876bdec280 R11: 0000000000000246 R12: 0000000000000002
 R13: 00007f856564a500 R14: 0000000000000002 R15: 00007f856564a700
 irq event stamp: 109958
 ---[ end trace 7aab5b7e51484b00 ]---

Not only fix the updating of the temp buffer, but also do not free the temp
buffer before a new buffer is allocated (there's no reason to not continue
to use the current temp buffer if an allocation fails).

Cc: stable@vger.kernel.org
Fixes: 8e99cf91 ("tracing: Do not allocate buffer in trace_find_next_entry() in atomic")
Reported-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

851e6f61

28 Sep, 2020 3 commits

Merge tag 'nfs-for-5.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · fb0155a0

Linus Torvalds authored Sep 28, 2020

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

   - NFSv4.2: copy_file_range needs to invalidate caches on success

   - NFSv4.2: Fix security label length not being reset

   - pNFS/flexfiles: Ensure we initialise the mirror bsizes correctly
     on read

   - pNFS/flexfiles: Fix signed/unsigned type issues with mirror
     indices"

* tag 'nfs-for-5.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  pNFS/flexfiles: Be consistent about mirror index types
  pNFS/flexfiles: Ensure we initialise the mirror bsizes correctly on read
  NFSv4.2: fix client's attribute cache management for copy_file_range
  nfs: Fix security label length not being reset

fb0155a0

mm: do not rely on mm == current->mm in __get_user_pages_locked · a4d63c37

Jason A. Donenfeld authored Sep 28, 2020

It seems likely this block was pasted from internal_get_user_pages_fast,
which is not passed an mm struct and therefore uses current's.  But
__get_user_pages_locked is passed an explicit mm, and current->mm is not
always valid. This was hit when being called from i915, which uses:

  pin_user_pages_remote->
    __get_user_pages_remote->
      __gup_longterm_locked->
        __get_user_pages_locked

Before, this would lead to an OOPS:

  BUG: kernel NULL pointer dereference, address: 0000000000000064
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  CPU: 10 PID: 1431 Comm: kworker/u33:1 Tainted: P S   U     O      5.9.0-rc7+ #140
  Hardware name: LENOVO 20QTCTO1WW/20QTCTO1WW, BIOS N2OET47W (1.34 ) 08/06/2020
  Workqueue: i915-userptr-acquire __i915_gem_userptr_get_pages_worker [i915]
  RIP: 0010:__get_user_pages_remote+0xd7/0x310
  Call Trace:
   __i915_gem_userptr_get_pages_worker+0xc8/0x260 [i915]
   process_one_work+0x1ca/0x390
   worker_thread+0x48/0x3c0
   kthread+0x114/0x130
   ret_from_fork+0x1f/0x30
  CR2: 0000000000000064

This commit fixes the problem by using the mm pointer passed to the
function rather than the bogus one in current.

Fixes: 008cfe44 ("mm: Introduce mm_struct.has_pinned")
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Harald Arnesen <harald@skogtun.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a4d63c37

ARM: dts: bcm2835: Change firmware compatible from simple-bus to simple-mfd · 6a754830

Maxime Ripard authored Sep 24, 2020

The current binding for the RPi firmware uses the simple-bus compatible as
a fallback to benefit from its automatic probing of child nodes.

However, simple-bus also comes with some constraints, like having the ranges,
our case.

Let's switch to simple-mfd that provides the same probing logic without
those constraints.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20200924082642.18144-1-maxime@cerno.techSigned-off-by: Rob Herring <robh@kernel.org>

6a754830

27 Sep, 2020 8 commits

Linux 5.9-rc7 · a1b8638b
Linus Torvalds authored Sep 27, 2020

a1b8638b

Merge tag 'kbuild-fixes-v5.9-4' of... · 16bc1d54

Linus Torvalds authored Sep 27, 2020

Merge tag 'kbuild-fixes-v5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - ignore compiler stubs for PPC to fix builds

 - fix the usage of --target mentioned in the LLVM document

* tag 'kbuild-fixes-v5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  Documentation/llvm: Fix clang target examples
  scripts/kallsyms: skip ppc compiler stub *.long_branch.* / *.plt_branch.*

16bc1d54

Merge tag 'x86-urgent-2020-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f8818559

Linus Torvalds authored Sep 27, 2020

Pull x86 fixes from Thomas Gleixner:
 "Two fixes for the x86 interrupt code:

   - Unbreak the magic 'search the timer interrupt' logic in IO/APIC
     code which got wreckaged when the core interrupt code made the
     state tracking logic stricter.

     That caused the interrupt line to stay masked after switching from
     IO/APIC to PIC delivery mode, which obviously prevents interrupts
     from being delivered.

   - Make run_on_irqstack_code() typesafe. The function argument is a
     void pointer which is then cast to 'void (*fun)(void *).

     This breaks Control Flow Integrity checking in clang. Use proper
     helper functions for the three variants reuqired"

* tag 'x86-urgent-2020-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ioapic: Unbreak check_timer()
  x86/irq: Make run_on_irqstack_cond() typesafe

f8818559

Merge tag 'timers-urgent-2020-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ba25f057

Linus Torvalds authored Sep 27, 2020

Pull timer updates from Thomas Gleixner:
 "A set of clocksource/clockevents updates:

   - Reset the TI/DM timer before enabling it instead of doing it the
     other way round.

   - Initialize the reload value for the GX6605s timer correctly so the
     hardware counter starts at 0 again after overrun.

   - Make error return value negative in the h8300 timer init function"

* tag 'timers-urgent-2020-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/timer-gx6605s: Fixup counter reload
  clocksource/drivers/timer-ti-dm: Do reset before enable
  clocksource/drivers/h8300_timer8: Fix wrong return value in h8300_8timer_init()

ba25f057

mm/thp: Split huge pmds/puds if they're pinned when fork() · d042035e

Peter Xu authored Sep 25, 2020

Pinned pages shouldn't be write-protected when fork() happens, because
follow up copy-on-write on these pages could cause the pinned pages to
be replaced by random newly allocated pages.

For huge PMDs, we split the huge pmd if pinning is detected.  So that
future handling will be done by the PTE level (with our latest changes,
each of the small pages will be copied).  We can achieve this by let
copy_huge_pmd() return -EAGAIN for pinned pages, so that we'll
fallthrough in copy_pmd_range() and finally land the next
copy_pte_range() call.

Huge PUDs will be even more special - so far it does not support
anonymous pages.  But it can actually be done the same as the huge PMDs
even if the split huge PUDs means to erase the PUD entries.  It'll
guarantee the follow up fault ins will remap the same pages in either
parent/child later.

This might not be the most efficient way, but it should be easy and
clean enough.  It should be fine, since we're tackling with a very rare
case just to make sure userspaces that pinned some thps will still work
even without MADV_DONTFORK and after they fork()ed.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d042035e

mm: Do early cow for pinned pages during fork() for ptes · 70e806e4

Peter Xu authored Sep 25, 2020

This allows copy_pte_range() to do early cow if the pages were pinned on
the source mm.

Currently we don't have an accurate way to know whether a page is pinned
or not.  The only thing we have is page_maybe_dma_pinned().  However
that's good enough for now.  Especially, with the newly added
mm->has_pinned flag to make sure we won't affect processes that never
pinned any pages.

It would be easier if we can do GFP_KERNEL allocation within
copy_one_pte().  Unluckily, we can't because we're with the page table
locks held for both the parent and child processes.  So the page
allocation needs to be done outside copy_one_pte().

Some trick is there in copy_present_pte(), majorly the wrprotect trick
to block concurrent fast-gup.  Comments in the function should explain
better in place.

Oleg Nesterov reported a (probably harmless) bug during review that we
didn't reset entry.val properly in copy_pte_range() so that potentially
there's chance to call add_swap_count_continuation() multiple times on
the same swp entry.  However that should be harmless since even if it
happens, the same function (add_swap_count_continuation()) will return
directly noticing that there're enough space for the swp counter.  So
instead of a standalone stable patch, it is touched up in this patch
directly.

Link: https://lore.kernel.org/lkml/20200914143829.GA1424636@nvidia.com/Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

70e806e4

mm/fork: Pass new vma pointer into copy_page_range() · 7a4830c3

Peter Xu authored Sep 25, 2020

This prepares for the future work to trigger early cow on pinned pages
during fork().

No functional change intended.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7a4830c3

mm: Introduce mm_struct.has_pinned · 008cfe44

Peter Xu authored Sep 25, 2020

(Commit message majorly collected from Jason Gunthorpe)

Reduce the chance of false positive from page_maybe_dma_pinned() by
keeping track if the mm_struct has ever been used with pin_user_pages().
This allows cases that might drive up the page ref_count to avoid any
penalty from handling dma_pinned pages.

Future work is planned, to provide a more sophisticated solution, likely
to turn it into a real counter.  For now, make it atomic_t but use it as
a boolean for simplicity.
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

008cfe44