Commits · e6d05acd57013114977ec77a131fe79d2f542774 · Kirill Smelkov / linux

03 Apr, 2020 3 commits

remoteproc/omap: Fix set_load call in omap_rproc_request_timer · e6d05acd

Nathan Chancellor authored Apr 02, 2020

When building arm allyesconfig:

drivers/remoteproc/omap_remoteproc.c:174:44: error: too many arguments
to function call, expected 2, have 3
        timer->timer_ops->set_load(timer->odt, 0, 0);
        ~~~~~~~~~~~~~~~~~~~~~~~~~~                ^
1 error generated.

This is due to commit 02e6d546 ("clocksource/drivers/timer-ti-dm:
Enable autoreload in set_pwm") in the clockevents tree interacting with
commit e28edc57 ("remoteproc/omap: Request a timer(s) for remoteproc
usage") from the rpmsg tree.

This should have been fixed during the merge of the remoteproc tree
since it happened after the clockevents tree merge; however, it does not
look like my email was noticed by either maintainer and I did not pay
attention when the pull was sent since I was on CC.

Fixes: c6570114 ("Merge tag 'rproc-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc")
Link: https://lore.kernel.org/lkml/20200327185055.GA22438@ubuntu-m2-xlarge-x86/Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e6d05acd

Merge tag 'devicetree-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · bef7b2a7

Linus Torvalds authored Apr 02, 2020

Pull devicetree updates from Rob Herring:

 - Unit test for overlays with GPIO hogs

 - Improve dma-ranges parsing to handle dma-ranges with multiple entries

 - Update dtc to upstream version v1.6.0-2-g87a656ae5ff9

 - Improve overlay error reporting

 - Device link support for power-domains and hwlocks bindings

 - Add vendor prefixes for Beacon, Topwise, ENE, Dell, SG Micro, Elida,
   PocketBook, Xiaomi, Linutronix, OzzMaker, Waveshare Electronics, and
   ITE Tech

 - Add deprecated Marvell vendor prefix 'mrvl'

 - A bunch of binding conversions to DT schema continues. Of note, the
   common serial and USB connector bindings are converted.

 - Add more Arm CPU compatibles

 - Drop Mark Rutland as DT maintainer :(

* tag 'devicetree-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (106 commits)
  MAINTAINERS: drop an old reference to stm32 pwm timers doc
  MAINTAINERS: dt: update etnaviv file reference
  dt-bindings: usb: dwc2: fix bindings for amlogic, meson-gxbb-usb
  dt-bindings: uniphier-system-bus: fix warning in the example
  dt-bindings: display: meson-vpu: fix indentation of reg-names' "items"
  dt-bindings: iio: Fix adi, ltc2983 uint64-matrix schema constraints
  dt-bindings: power: Fix example for power-domain
  dt-bindings: arm: Add some constraints for PSCI nodes
  of: some unittest overlays not untracked
  of: gpio unittest kfree() wrong object
  dt-bindings: phy: convert phy-rockchip-inno-usb2 bindings to yaml
  dt-bindings: serial: sh-sci: Convert to json-schema
  dt-bindings: serial: Document serialN aliases
  dt-bindings: thermal: tsens: Set 'additionalProperties: false'
  dt-bindings: thermal: tsens: Fix nvmem-cell-names schema
  dt-bindings: vendor-prefixes: Add Beacon vendor prefix
  dt-bindings: vendor-prefixes: Add Topwise
  of: of_private.h: Replace zero-length array with flexible-array member
  docs: dt: fix a broken reference to input.yaml
  docs: dt: fix references to ap806-system-controller.txt
  ...

bef7b2a7

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 79f51b7b

Linus Torvalds authored Apr 02, 2020

Pull SCSI updates from James Bottomley:
 "This series has a huge amount of churn because it pulls in Mauro's doc
  update changing all our txt files to rst ones.

  Excluding that, we have the usual driver updates (qla2xxx, ufs, lpfc,
  zfcp, ibmvfc, pm80xx, aacraid), a treewide update for scnprintf and
  some other minor updates.

  The major core change is Hannes moving functions out of the aacraid
  driver and into the core"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (223 commits)
  scsi: aic7xxx: aic97xx: Remove FreeBSD-specific code
  scsi: ufs: Do not rely on prefetched data
  scsi: dc395x: remove dc395x_bios_param
  scsi: libiscsi: Fix error count for active session
  scsi: hpsa: correct race condition in offload enabled
  scsi: message: fusion: Replace zero-length array with flexible-array member
  scsi: qedi: Add PCI shutdown handler support
  scsi: qedi: Add MFW error recovery process
  scsi: ufs: Enable block layer runtime PM for well-known logical units
  scsi: ufs-qcom: Override devfreq parameters
  scsi: ufshcd: Let vendor override devfreq parameters
  scsi: ufshcd: Update the set frequency to devfreq
  scsi: ufs: Resume ufs host before accessing ufs device
  scsi: ufs-mediatek: customize the delay for enabling host
  scsi: ufs: make HCE polling more compact to improve initialization latency
  scsi: ufs: allow custom delay prior to host enabling
  scsi: ufs-mediatek: use common delay function
  scsi: ufs: introduce common and flexible delay function
  scsi: ufs: use an enum for host capabilities
  scsi: ufs: fix uninitialized tx_lanes in ufshcd_disable_tx_lcc()
  ...

79f51b7b

02 Apr, 2020 37 commits

Merge tag 'mtd/for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux · e109f506

Linus Torvalds authored Apr 02, 2020

Pull MTD updates from Miquel Raynal:
 "MTD core changes:
   - Fix issue where write_cached_data() fails but write() still returns
     success

   - maps: sa1100-flash: Replace zero-length array with flexible-array
     member

   - phram: Fix a double free issue in error path

   - Convert fallthrough comments into statements

   - MAINTAINERS: Add the IRC channel to the MTD related subsystems

  Raw NAND core changes:
   - Add support for manufacturer specific suspend/resume operation

   - Add support for manufacturer specific lock/unlock operation

   - Replace zero-length array with flexible-array member

   - Fix a typo ("manufecturer")

   - Ensure nand_soft_waitrdy wait period is enough

  Raw NAND controller driver changes:
   - Brcmnand:
       * Add support for flash-edu for dma transfers (+ bindings)

   - Cadence:
       * Reinit completion before executing a new command
       * Change bad block marker size
       * Fix the calculation of the avaialble OOB size
       * Get meta data size from registers

   - Qualcom:
       * Use dma_request_chan() instead dma_request_slave_channel()
       * Release resources on failure within qcom_nandc_alloc()

   - Allwinner:
       * Use dma_request_chan() instead dma_request_slave_channel()

   - Marvell:
       * Use dma_request_chan() instead dma_request_slave_channel()
       * Release DMA channel on error

   - Freescale:
       * Use dma_request_chan() instead dma_request_slave_channel()

   - Macronix:
       * Add support for Macronix NAND randomizer (+ bindings)

   - Ams-delta:
       * Rename structures and functions to gpio_nand*
       * Make the driver custom I/O ready
       * Drop useless local variable
       * Support custom driver initialisation
       * Add module device tables
       * Handle more GPIO pins as optional
       * Make read pulses optional
       * Don't hardcode read/write pulse widths
       * Push inversion handling to gpiolib
       * Enable OF partition info support
       * Drop board specific partition info
       * Use struct gpio_nand_platdata
       * Write protect device during probe

   - Ingenic:
       * Use devm_platform_ioremap_resource()
       * Add dependency on MIPS || COMPILE_TEST

   - Denali:
       * Deassert write protect pin

   - ST:
       * Use dma_request_chan() instead dma_request_slave_channel()

  Raw NAND chip driver changes:
   - Toshiba:
       * Support reading the number of bitflips for BENAND (Built-in ECC NAND)

   - Macronix:
       * Add support for deep power down mode
       * Add support for block protection

  SPI-NAND core changes:
   - Do not erase the block before writing a bad block marker

   - Explicitly use MTD_OPS_RAW to write the bad block marker to OOB

   - Stop using spinand->oobbuf for buffering bad block markers

   - Rework detect procedure for different READ_ID operation

  SPI-NAND driver changes:
   - Toshiba:
       * Support for new Kioxia Serial NAND
       * Rename function name to change suffix and prefix (8Gbit)
       * Add comment about Kioxia ID

   - Micron:
       * Add new Micron SPI NAND devices with multiple dies
       * Add M70A series Micron SPI NAND devices
       * identify SPI NAND device with Continuous Read mode
       * Add new Micron SPI NAND devices
       * Describe the SPI NAND device MT29F2G01ABAGD
       * Generalize the OOB layout structure and function names

  SPI NOR core changes:
   - Move all the manufacturer specific quirks/code out of the core, to
     make the core logic more readable and thus ease maintenance.

   - Move the SFDP logic out of the core, it provides a better
     separation between the SFDP parsing and core logic.

   - Trim what is exposed in spi-nor.h. The SPI NOR controllers drivers
     must not be able to use structures that are meant just for the SPI
     NOR core.

   - Use the spi-mem direct mapping API to let advanced controllers
     optimize the read/write operations when they support direct
     mapping.

   - Add generic formula for the Status Register block protection
     handling. It fixes some long standing locking limitations and eases
     the addition of the 4bit block protection support.

   - Add block protection support for flashes with 4 block protection
     bits in the Status Register.

  SPI NOR controller drivers changes:
   - The mtk-quadspi driver is replaced by the new spi-mem spi-mtk-nor
     driver.

   - Merge tag 'mtk-mtd-spi-move' into spi-nor/next to avoid conflicts.

  HyperBus changes:
   - Print error msg when compatible is wrong or missing

   - Move mapping of direct access window from core to individual
     drivers"

* tag 'mtd/for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (103 commits)
  mtd: Convert fallthrough comments into statements
  mtd: rawnand: toshiba: Support reading the number of bitflips for BENAND (Built-in ECC NAND)
  MAINTAINERS: Add the IRC channel to the MTD related subsystems
  mtd: Fix issue where write_cached_data() fails but write() still returns success
  mtd: maps: sa1100-flash: Replace zero-length array with flexible-array member
  mtd: phram: fix a double free issue in error path
  mtd: spinand: toshiba: Support for new Kioxia Serial NAND
  mtd: spinand: toshiba: Rename function name to change suffix and prefix (8Gbit)
  mtd: rawnand: macronix: Add support for deep power down mode
  mtd: rawnand: Add support for manufacturer specific suspend/resume operation
  mtd: spi-nor: Enable locking for n25q512ax3/n25q512a
  mtd: spi-nor: Add SR 4bit block protection support
  mtd: spi-nor: Add generic formula for SR block protection handling
  mtd: spi-nor: Set all BP bits to one when lock_len == mtd->size
  mtd: spi-nor: controllers: aspeed-smc: Replace zero-length array with flexible-array member
  mtd: spi-nor: Clear WEL bit when erase or program errors occur
  MAINTAINERS: update entry after SPI NOR controller move
  mtd: spi-nor: Trim what is exposed in spi-nor.h
  mtd: spi-nor: Drop the MFR definitions
  mtd: spi-nor: Get rid of the now empty spi_nor_ids[] table
  ...

e109f506

Merge tag 'dmaengine-5.7-rc1' of git://git.infradead.org/users/vkoul/slave-dma · e964f1e0

Linus Torvalds authored Apr 02, 2020

Pull dmaengine updates from Vinod Koul:
 "Core:
   - Some code cleanup and optimization in core by Andy

   - Debugfs support for displaying dmaengine channels by Peter

  Drivers:
   - New driver for uniphier-xdmac controller

   - Updates to stm32 dma, mdma and dmamux drivers and PM support

   - More updates to idxd drivers

   - Bunch of changes in tegra-apb driver and cleaning up of pm
     functions

   - Bunch of spelling fixes and Replace zero-length array patches

   - Shutdown hook for fsl-dpaa2-qdma driver

   - Support for interleaved transfers for ti-edma and virtualization
     support for k3-dma driver

   - Support for reset and updates in xilinx_dma driver

   - Improvements and locking updates in at_hdma driver"

* tag 'dmaengine-5.7-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (89 commits)
  dt-bindings: dma: renesas,usb-dmac: add r8a77961 support
  dmaengine: uniphier-xdmac: Remove redandant error log for platform_get_irq
  dmaengine: tegra-apb: Improve DMA synchronization
  dmaengine: tegra-apb: Don't save/restore IRQ flags in interrupt handler
  dmaengine: tegra-apb: mark PM functions as __maybe_unused
  dmaengine: fix spelling mistake "exceds" -> "exceeds"
  dmaengine: sprd: Set request pending flag when DMA controller is active
  dmaengine: ppc4xx: Use scnprintf() for avoiding potential buffer overflow
  dmaengine: idxd: remove global token limit check
  dmaengine: idxd: reflect shadow copy of traffic class programming
  dmaengine: idxd: Merge definition of dsa_batch_desc into dsa_hw_desc
  dmaengine: Create debug directories for DMA devices
  dmaengine: ti: k3-udma: Implement custom dbg_summary_show for debugfs
  dmaengine: Add basic debugfs support
  dmaengine: fsl-dpaa2-qdma: remove set but not used variable 'dpaa2_qdma'
  dmaengine: ti: edma: fix null dereference because of a typo in pointer name
  dmaengine: fsl-dpaa2-qdma: Adding shutdown hook
  dmaengine: uniphier-xdmac: Add UniPhier external DMA controller driver
  dt-bindings: dmaengine: Add UniPhier external DMA controller bindings
  dmaengine: ti: k3-udma: Implement support for atype (for virtualization)
  ...

e964f1e0

Merge branch 'i2c/for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 5c8db3eb

Linus Torvalds authored Apr 02, 2020

Pull i2c updates from Wolfram Sang:
 "I2C has:

   - using defines for bus speeds to avoid mistakes in hardcoded values;
     lots of small driver updates because of that. Thanks, Andy!

   - API change: i2c_setup_smbus_alert() was renamed to
     i2c_new_smbus_alert_device() and returns ERRPTR now. All in-tree
     users have been converted

   - in the core, a rare race condition when deleting the cdev has been
     fixed. Thanks, Kevin!

   - lots of driver updates. Thanks, everyone!

  I also want to mention: The amount of review and testing tags given
  was quite high this time. Thank you to these people, too. I hope we
  can keep it like this!"

* 'i2c/for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (34 commits)
  i2c: rcar: clean up after refactoring i2c_timings
  macintosh: convert to i2c_new_scanned_device
  i2c: drivers: Use generic definitions for bus frequencies
  i2c: algo: Use generic definitions for bus frequencies
  i2c: stm32f7: switch to I²C generic property parsing
  i2c: rcar: Consolidate timings calls in rcar_i2c_clock_calculate()
  i2c: core: Allow override timing properties with 0
  i2c: core: Provide generic definitions for bus frequencies
  i2c: mxs: Use dma_request_chan() instead dma_request_slave_channel()
  i2c: imx: remove duplicate print after platform_get_irq()
  i2c: designware: Fix spelling typos in the comments
  i2c: designware: Discard i2c_dw_read_comp_param() function
  i2c: designware: Detect the FIFO size in the common code
  i2c: dev: Fix the race between the release of i2c_dev and cdev
  i2c: qcom-geni: Drop of_platform.h include
  i2c: qcom-geni: Grow a dev pointer to simplify code
  i2c: qcom-geni: Let firmware specify irq trigger flags
  i2c: stm32f7: do not backup read-only PECR register
  i2c: smbus: remove outdated references to irq level triggers
  i2c: convert SMBus alert setup function to return an ERRPTR
  ...

5c8db3eb

Merge tag 'sound-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 848960e5

Linus Torvalds authored Apr 02, 2020

Pull sound updates from Takashi Iwai:
 "This became again a busy development cycle.  There are few ALSA core
  updates (merely API cleanups and sparse fixes), with the majority of
  other changes are found in ASoC scene.

  Here are some highlights:

  ALSA core:
   - More helper macros for sparse warning fixes (e.g. bitwise types)
   - Slight optimization of PCM OSS locks
   - Make common handling for PCM / compress buffers (for SOF)

  ASoC:
   - Lots of code refactoring and modernization for (still ongoing)
     componentization works
   - Conversion of SND_SOC_ALL_CODECS to use imply
   - Continued refactoring and fixing of the Intel SOF/SST support,
     including the initial (but still incomplete) SoundWire support
   - SoundWire and more advanced clocking support for Realtek RT5682
   - Support for amlogic GX, Meson 8, Meson 8B and T9015 DAC, Broadcom
     DSL/PON, Ingenic JZ4760 and JZ4770, Realtek RL6231, and TI TAS2563
     and TLV320ADCX140

  HD-audio:
   - Optimizations in HDMI jack handling
   - A few new quirks and fixups for Realtek codecs

  USB-audio:
   - Delayed registration support
   - New quirks for Motu, Kingston, Presonus"

* tag 'sound-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (415 commits)
  ALSA: usb-audio: Fix case when USB MIDI interface has more than one extra endpoint descriptor
  Revert "ALSA: uapi: Drop asound.h inclusion from asoc.h"
  ALSA: hda/realtek - Remove now-unnecessary XPS 13 headphone noise fixups
  ALSA: hda/realtek - Set principled PC Beep configuration for ALC256
  ALSA: doc: Document PC Beep Hidden Register on Realtek ALC256
  ALSA: hda/realtek - a fake key event is triggered by running shutup
  ALSA: hda: default enable CA0132 DSP support
  ASoC: amd: acp3x-pcm-dma: clean up two indentation issues
  ASoC: tlv320adcx140: Remove undocumented property
  ASoC: Intel: sof_sdw: Add Volteer support with RT5682 SNDW helper function
  ASoC: Intel: common: add match table for TGL RT5682 SoundWire driver
  ASoC: Intel: boards: add sof_sdw machine driver
  ASoC: Intel: soc-acpi: update topology and driver name for SoundWire platforms
  ASoC: rt5682: move DAI clock registry to I2S mode
  ASoC: pxa: magician: convert to use i2c_new_client_device()
  ASoC: SOF: Intel: hda-ctrl: add reset cycle before parsing capabilities
  Asoc: SOF: Intel: hda: check SoundWire wakeen interrupt in irq thread
  ASoC: SOF: Intel: hda: add WAKEEN interrupt support for SoundWire
  ASoC: SOF: Intel: hda: add parameter to control SoundWire clock stop quirks
  ASoC: SOF: Intel: hda: merge IPC, stream and SoundWire interrupt handlers
  ...

848960e5

Merge tag 'pinctrl-v5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · bc3b3f4b

Linus Torvalds authored Apr 02, 2020

Pull pin control updates from Linus Walleij:
 "This is the bulk of pin control changes for the v5.7 kernel cycle.

  There are no core changes this time, only driver developments:

   - New driver for the Dialog Semiconductor DA9062 Power Management
     Integrated Circuit (PMIC).

   - Renesas SH-PFC has improved consistency, with group and register
     checks in the configuration checker.

   - New subdriver for the Qualcomm IPQ6018.

   - Add the RGMII pin control functionality to Qualcomm IPQ8064.

   - Performance and code quality cleanups in the Mediatek driver.

   - Improve the Broadcom BCM2835 support to cover all the GPIOs that
     exist in it.

   - The Allwinner/Sunxi driver properly masks non-wakeup IRQs on
     suspend.

   - Add some missing groups and functions to the Ingenic driver.

   - Convert some of the Freescale device tree bindings to use the new
     and all improved JSON YAML markup.

   - Refactorings and support for the SFIO/GPIO in the Tegra194 SoC
     driver.

   - Support high impedance mode in the Spreadtrum/Unisoc driver"

* tag 'pinctrl-v5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (64 commits)
  pinctrl: qcom: fix compilation error
  pinctrl: qcom: use scm_call to route GPIO irq to Apps
  pinctrl: sprd: Add pin high impedance mode support
  pinctrl: sprd: Use the correct pin output configuration
  pinctrl: tegra: Add SFIO/GPIO programming on Tegra194
  pinctrl: tegra: Renumber the GG.0 and GG.1 pins
  pinctrl: tegra: Do not add default pin range on Tegra194
  pinctrl: tegra: Pass struct tegra_pmx for pin range check
  pinctrl: tegra: Fix "Scmitt" -> "Schmitt" typo
  pinctrl: tegra: Fix whitespace issues for improved readability
  pinctrl: mediatek: Use scnprintf() for avoiding potential buffer overflow
  pinctrl: freescale: drop the dependency on ARM64 for i.MX8M
  Revert "pinctrl: mvebu: armada-37xx: use use platform api"
  dt-bindings: pinctrl: at91: Fix a typo ("descibe")
  pinctrl: meson: add tsin pinctrl for meson gxbb/gxl/gxm
  pinctrl: sprd: Fix the kconfig warning
  pinctrl: ingenic: add hdmi-ddc pin control group
  pinctrl: sirf/atlas7: Replace zero-length array with flexible-array member
  pinctrl: sprd: Allow the SPRD pinctrl driver building into a module
  pinctrl: Export some needed symbols at module load time
  ...

bc3b3f4b

Merge tag 'hwlock-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc · 11786191

Linus Torvalds authored Apr 02, 2020

Pull hwspinlock updates from Bjorn Andersson:
 "This marks all hwspinlock driver COMPILE_TESTable and replaces the
  zero-length array in hwspinlock_device with a flexible-array member"

* tag 'hwlock-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc:
  hwspinlock: hwspinlock_internal.h: Replace zero-length array with flexible-array member
  hwspinlock: Allow drivers to be built with COMPILE_TEST

11786191

Merge tag 'rproc-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc · c6570114

Linus Torvalds authored Apr 02, 2020

Pull remoteproc updates from Bjorn Andersson:

 - a range of improvements to the OMAP remoeteproc driver; among other
   things adding devicetree, suspend/resume and watchdog support, and
   adds support the remoteprocs in the DRA7xx SoC

 - support for 64-bit firmware, extends the ELF loader to support this
   and fixes for a number of race conditions in the recovery handling

 - a generic mechanism to allow remoteproc drivers to sync state with
   remote processors during a panic, and uses this to prepare Qualcomm
   remote processors for post mortem analysis

 - fixes to cleanly recover from crashes in the modem firmware on
   production Qualcomm devices

* tag 'rproc-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc: (37 commits)
  remoteproc/omap: Switch to SPDX license identifiers
  remoteproc/omap: Add watchdog functionality for remote processors
  remoteproc/omap: Report device exceptions and trigger recovery
  remoteproc/omap: Add support for runtime auto-suspend/resume
  remoteproc/omap: Add support for system suspend/resume
  remoteproc/omap: Request a timer(s) for remoteproc usage
  remoteproc/omap: Check for undefined mailbox messages
  remoteproc/omap: Remove the platform_data header
  remoteproc/omap: Add support for DRA7xx remote processors
  remoteproc/omap: Initialize and assign reserved memory node
  remoteproc/omap: Add the rproc ops .da_to_va() implementation
  remoteproc/omap: Add support to parse internal memories from DT
  remoteproc/omap: Add a sanity check for DSP boot address alignment
  remoteproc/omap: Add device tree support
  dt-bindings: remoteproc: Add OMAP remoteproc bindings
  remoteproc: qcom: Introduce panic handler for PAS and ADSP
  remoteproc: qcom: q6v5: Add common panic handler
  remoteproc: Introduce "panic" callback in ops
  remoteproc: Traverse rproc_list under RCU read lock
  remoteproc: Fix NULL pointer dereference in rproc_virtio_notify
  ...

c6570114

Merge branch 'for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu · ac438771

Linus Torvalds authored Apr 02, 2020

Pull percpu updates from Dennis Zhou:
 "This is just a few documentation fixes for percpu refcount and bitmap
  helpers that went in v5.6, and moving my emails to all be at korg"

* 'for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
  percpu: update copyright emails to dennis@kernel.org
  include/bitmap.h: add new functions to documentation
  include/bitmap.h: add missing parameter in docs
  percpu_ref: Fix comment regarding percpu_ref_init flags

ac438771

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 8c1b724d

Linus Torvalds authored Apr 02, 2020

Pull kvm updates from Paolo Bonzini:
 "ARM:
   - GICv4.1 support

   - 32bit host removal

  PPC:
   - secure (encrypted) using under the Protected Execution Framework
     ultravisor

  s390:
   - allow disabling GISA (hardware interrupt injection) and protected
     VMs/ultravisor support.

  x86:
   - New dirty bitmap flag that sets all bits in the bitmap when dirty
     page logging is enabled; this is faster because it doesn't require
     bulk modification of the page tables.

   - Initial work on making nested SVM event injection more similar to
     VMX, and less buggy.

   - Various cleanups to MMU code (though the big ones and related
     optimizations were delayed to 5.8). Instead of using cr3 in
     function names which occasionally means eptp, KVM too has
     standardized on "pgd".

   - A large refactoring of CPUID features, which now use an array that
     parallels the core x86_features.

   - Some removal of pointer chasing from kvm_x86_ops, which will also
     be switched to static calls as soon as they are available.

   - New Tigerlake CPUID features.

   - More bugfixes, optimizations and cleanups.

  Generic:
   - selftests: cleanups, new MMU notifier stress test, steal-time test

   - CSV output for kvm_stat"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (277 commits)
  x86/kvm: fix a missing-prototypes "vmread_error"
  KVM: x86: Fix BUILD_BUG() in __cpuid_entry_get_reg() w/ CONFIG_UBSAN=y
  KVM: VMX: Add a trampoline to fix VMREAD error handling
  KVM: SVM: Annotate svm_x86_ops as __initdata
  KVM: VMX: Annotate vmx_x86_ops as __initdata
  KVM: x86: Drop __exit from kvm_x86_ops' hardware_unsetup()
  KVM: x86: Copy kvm_x86_ops by value to eliminate layer of indirection
  KVM: x86: Set kvm_x86_ops only after ->hardware_setup() completes
  KVM: VMX: Configure runtime hooks using vmx_x86_ops
  KVM: VMX: Move hardware_setup() definition below vmx_x86_ops
  KVM: x86: Move init-only kvm_x86_ops to separate struct
  KVM: Pass kvm_init()'s opaque param to additional arch funcs
  s390/gmap: return proper error code on ksm unsharing
  KVM: selftests: Fix cosmetic copy-paste error in vm_mem_region_move()
  KVM: Fix out of range accesses to memslots
  KVM: X86: Micro-optimize IPI fastpath delay
  KVM: X86: Delay read msr data iff writes ICR MSR
  KVM: PPC: Book3S HV: Add a capability for enabling secure guests
  KVM: arm64: GICv4.1: Expose HW-based SGIs in debugfs
  KVM: arm64: GICv4.1: Allow non-trapping WFI when using HW SGIs
  ...

8c1b724d

Merge tag 'x86-urgent-2020-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f14a9532

Linus Torvalds authored Apr 02, 2020

Pull x86 fix from Ingo Molnar:
 "A single fix addressing Sparse warnings. <asm/bitops.h> is changed
  non-trivially to avoid the warnings, but generated code is not
  supposed to be affected"

* tag 'x86-urgent-2020-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Fix bitops.h warning with a moved cast

f14a9532

Merge branch 'next-integrity' of... · 7f218319

Linus Torvalds authored Apr 02, 2020

Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity updates from Mimi Zohar:
 "Just a couple of updates for linux-5.7:

   - A new Kconfig option to enable IMA architecture specific runtime
     policy rules needed for secure and/or trusted boot, as requested.

   - Some message cleanup (eg. pr_fmt, additional error messages)"

* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: add a new CONFIG for loading arch-specific policies
  integrity: Remove duplicate pr_fmt definitions
  IMA: Add log statements for failure conditions
  IMA: Update KBUILD_MODNAME for IMA files to ima

7f218319

Merge branch 'akpm' (patches from Andrew) · 6cad420c

Linus Torvalds authored Apr 02, 2020

Merge updates from Andrew Morton:
 "A large amount of MM, plenty more to come.

  Subsystems affected by this patch series:
   - tools
   - kthread
   - kbuild
   - scripts
   - ocfs2
   - vfs
   - mm: slub, kmemleak, pagecache, gup, swap, memcg, pagemap, mremap,
         sparsemem, kasan, pagealloc, vmscan, compaction, mempolicy,
         hugetlbfs, hugetlb"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (155 commits)
  include/linux/huge_mm.h: check PageTail in hpage_nr_pages even when !THP
  mm/hugetlb: fix build failure with HUGETLB_PAGE but not HUGEBTLBFS
  selftests/vm: fix map_hugetlb length used for testing read and write
  mm/hugetlb: remove unnecessary memory fetch in PageHeadHuge()
  mm/hugetlb.c: clean code by removing unnecessary initialization
  hugetlb_cgroup: add hugetlb_cgroup reservation docs
  hugetlb_cgroup: add hugetlb_cgroup reservation tests
  hugetlb: support file_region coalescing again
  hugetlb_cgroup: support noreserve mappings
  hugetlb_cgroup: add accounting for shared mappings
  hugetlb: disable region_add file_region coalescing
  hugetlb_cgroup: add reservation accounting for private mappings
  mm/hugetlb_cgroup: fix hugetlb_cgroup migration
  hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations
  hugetlb_cgroup: add hugetlb_cgroup reservation counter
  hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race
  hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
  mm/memblock.c: remove redundant assignment to variable max_addr
  mm: mempolicy: require at least one nodeid for MPOL_PREFERRED
  mm: mempolicy: use VM_BUG_ON_VMA in queue_pages_test_walk()
  ...

6cad420c

Merge tag 'xfs-5.7-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 7be97138

Linus Torvalds authored Apr 02, 2020

Pull xfs updates from Darrick Wong:
 "There's a lot going on this cycle with cleanups in the log code, the
  btree code, and the xattr code.

  We're tightening of metadata validation and online fsck checking, and
  introducing a common btree rebuilding library so that we can refactor
  xfs_repair and introduce online repair in a future cycle.

  We also fixed a few visible bugs -- most notably there's one in
  getdents that we introduced in 5.6; and a fix for hangs when disabling
  quotas.

  This series has been running fstests & other QA in the background for
  over a week and looks good so far.

  I anticipate sending a second pull request next week. That batch will
  change how xfs interacts with memory reclaim; how the log batches and
  throttles log items; how hard writes near ENOSPC will try to squeeze
  more space out of the filesystem; and hopefully fix the last of the
  umount hangs after a catastrophic failure. That should ease a lot of
  problems when running at the limits, but for now I'm leaving that in
  for-next for another week to make sure we got all the subtleties
  right.

  Summary:

   - Fix a hard to trigger race between iclog error checking and log
     shutdown.

   - Strengthen the AGF verifier.

   - Ratelimit some of the more spammy error messages.

   - Remove the icdinode uid/gid members and just use the ones in the
     vfs inode.

   - Hold ILOCK across insert/collapse range.

   - Clean up the extended attribute interfaces.

   - Clean up the attr flags mess.

   - Restore PF_MEMALLOC after exiting xfsaild thread to avoid
     triggering warnings in the process accounting code.

   - Remove the flexibly-sized array from struct xfs_agfl to eliminate
     compiler warnings about unaligned pointers and packed structures.

   - Various macro and typedef removals.

   - Stale metadata buffers if we decide they're corrupt outside of a
     verifier.

   - Check directory data/block/free block owners.

   - Fix a UAF when aborting inactivation of a corrupt xattr fork.

   - Teach online scrub to report failed directory and attr name lookups
     as a metadata corruption instead of a runtime error.

   - Avoid potential buffer overflows in sysfs files by using scnprintf.

   - Fix a regression in getdents lookups due to a mistake in pointer
     arithmetic.

   - Refactor btree cursor private data structures to use anonymous
     unions.

   - Cleanups in the log unmounting code.

   - Fix a potential mishandling of ENOMEM errors on multi-block
     directory buffer lookups.

   - Fix an incorrect test in the block allocation code.

   - Cleanups and name prefix shortening in the scrub code.

   - Introduce btree bulk loading code for online repair and scrub.

   - Fix a quotaoff log item leak (and hang) when the fs goes down
     midway through a quotaoff operation.

   - Remove di_version from the incore inode.

   - Refactor some of the log shutdown checking code.

   - Record the forcing of the log unmount records in the log force
     counters.

   - Fix a longstanding bug where quotacheck would purge the
     administrator's default quota grace interval and warning limits.

   - Reduce memory usage when scrubbing directory and xattr trees.

   - Don't let fsfreeze race with GETFSMAP or online scrub.

   - Handle bio_add_page failures more gracefully in xlog_write_iclog"

* tag 'xfs-5.7-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (108 commits)
  xfs: prohibit fs freezing when using empty transactions
  xfs: shutdown on failure to add page to log bio
  xfs: directory bestfree check should release buffers
  xfs: drop all altpath buffers at the end of the sibling check
  xfs: preserve default grace interval during quotacheck
  xfs: remove xlog_state_want_sync
  xfs: move the ioerror check out of xlog_state_clean_iclog
  xfs: refactor xlog_state_clean_iclog
  xfs: remove the aborted parameter to xlog_state_done_syncing
  xfs: simplify log shutdown checking in xfs_log_release_iclog
  xfs: simplify the xfs_log_release_iclog calling convention
  xfs: factor out a xlog_wait_on_iclog helper
  xfs: merge xlog_cil_push into xlog_cil_push_work
  xfs: remove the di_version field from struct icdinode
  xfs: simplify a check in xfs_ioctl_setattr_check_cowextsize
  xfs: simplify di_flags2 inheritance in xfs_ialloc
  xfs: only check the superblock version for dinode size calculation
  xfs: add a new xfs_sb_version_has_v3inode helper
  xfs: fix unmount hang and memory leak on shutdown during quotaoff
  xfs: factor out quotaoff intent AIL removal and memory free
  ...

7be97138

Merge tag 'vfs-5.7-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 7db83c07

Linus Torvalds authored Apr 02, 2020

Pull hibernation fix from Darrick Wong:
 "Fix a regression where we broke the userspace hibernation driver by
  disallowing writes to the swap device"

* tag 'vfs-5.7-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  hibernate: Allow uswsusp to write to swap

7db83c07

Merge tag 'iomap-5.7-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 35a9fafe

Linus Torvalds authored Apr 02, 2020

Pull iomap updates from Darrick Wong:
 "We're fixing tracepoints and comments in this cycle, so there
  shouldn't be any surprises here.

  I anticipate sending a second pull request next week with a single bug
  fix for readahead, but it's still undergoing QA.

  Summary:

   - Fix a broken tracepoint

   - Fix a broken comment"

* tag 'iomap-5.7-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: fix comments in iomap_dio_rw
  iomap: Remove pgoff from tracepoints

35a9fafe

Merge branch 'work.dotdot1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 9c577491

Linus Torvalds authored Apr 02, 2020

Pull vfs pathwalk sanitizing from Al Viro:
 "Massive pathwalk rewrite and cleanups.

  Several iterations have been posted; hopefully this thing is getting
  readable and understandable now. Pretty much all parts of pathname
  resolutions are affected...

  The branch is identical to what has sat in -next, except for commit
  message in "lift all calls of step_into() out of follow_dotdot/
  follow_dotdot_rcu", crediting Qian Cai for reporting the bug; only
  commit message changed there."

* 'work.dotdot1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (69 commits)
  lookup_open(): don't bother with fallbacks to lookup+create
  atomic_open(): no need to pass struct open_flags anymore
  open_last_lookups(): move complete_walk() into do_open()
  open_last_lookups(): lift O_EXCL|O_CREAT handling into do_open()
  open_last_lookups(): don't abuse complete_walk() when all we want is unlazy
  open_last_lookups(): consolidate fsnotify_create() calls
  take post-lookup part of do_last() out of loop
  link_path_walk(): sample parent's i_uid and i_mode for the last component
  __nd_alloc_stack(): make it return bool
  reserve_stack(): switch to __nd_alloc_stack()
  pick_link(): take reserving space on stack into a new helper
  pick_link(): more straightforward handling of allocation failures
  fold path_to_nameidata() into its only remaining caller
  pick_link(): pass it struct path already with normal refcounting rules
  fs/namei.c: kill follow_mount()
  non-RCU analogue of the previous commit
  helper for mount rootwards traversal
  follow_dotdot(): be lazy about changing nd->path
  follow_dotdot_rcu(): be lazy about changing nd->path
  follow_dotdot{,_rcu}(): massage loops
  ...

9c577491

x86/kvm: fix a missing-prototypes "vmread_error" · 514ccc19

Qian Cai authored Apr 02, 2020

The commit 842f4be9 ("KVM: VMX: Add a trampoline to fix VMREAD error
handling") removed the declaration of vmread_error() causes a W=1 build
failure with KVM_WERROR=y. Fix it by adding it back.

arch/x86/kvm/vmx/vmx.c:359:17: error: no previous prototype for 'vmread_error' [-Werror=missing-prototypes]
 asmlinkage void vmread_error(unsigned long field, bool fault)
                 ^~~~~~~~~~~~
Signed-off-by: Qian Cai <cai@lca.pw>
Message-Id: <20200402153955.1695-1-cai@lca.pw>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

514ccc19

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · d987ca1c

Linus Torvalds authored Apr 02, 2020

Pull exec/proc updates from Eric Biederman:
 "This contains two significant pieces of work: the work to sort out
  proc_flush_task, and the work to solve a deadlock between strace and
  exec.

  Fixing proc_flush_task so that it no longer requires a persistent
  mount makes improvements to proc possible. The removal of the
  persistent mount solves an old regression that that caused the hidepid
  mount option to only work on remount not on mount. The regression was
  found and reported by the Android folks. This further allows Alexey
  Gladkov's work making proc mount options specific to an individual
  mount of proc to move forward.

  The work on exec starts solving a long standing issue with exec that
  it takes mutexes of blocking userspace applications, which makes exec
  extremely deadlock prone. For the moment this adds a second mutex with
  a narrower scope that handles all of the easy cases. Which makes the
  tricky cases easy to spot. With a little luck the code to solve those
  deadlocks will be ready by next merge window"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (25 commits)
  signal: Extend exec_id to 64bits
  pidfd: Use new infrastructure to fix deadlocks in execve
  perf: Use new infrastructure to fix deadlocks in execve
  proc: io_accounting: Use new infrastructure to fix deadlocks in execve
  proc: Use new infrastructure to fix deadlocks in execve
  kernel/kcmp.c: Use new infrastructure to fix deadlocks in execve
  kernel: doc: remove outdated comment cred.c
  mm: docs: Fix a comment in process_vm_rw_core
  selftests/ptrace: add test cases for dead-locks
  exec: Fix a deadlock in strace
  exec: Add exec_update_mutex to replace cred_guard_mutex
  exec: Move exec_mmap right after de_thread in flush_old_exec
  exec: Move cleanup of posix timers on exec out of de_thread
  exec: Factor unshare_sighand out of de_thread and call it separately
  exec: Only compute current once in flush_old_exec
  pid: Improve the comment about waiting in zap_pid_ns_processes
  proc: Remove the now unnecessary internal mount of proc
  uml: Create a private mount of proc for mconsole
  uml: Don't consult current to find the proc_mnt in mconsole_proc
  proc: Use a list of inodes to flush from proc
  ...

d987ca1c

include/linux/huge_mm.h: check PageTail in hpage_nr_pages even when !THP · 77d6b909

Matthew Wilcox (Oracle) authored Apr 01, 2020

It's even more important to check that we don't have a tail page when
calling hpage_nr_pages() when THP are disabled.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Link: http://lkml.kernel.org/r/20200318140253.6141-4-willy@infradead.orgSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

77d6b909

mm/hugetlb: fix build failure with HUGETLB_PAGE but not HUGEBTLBFS · bb297bb2

Christophe Leroy authored Apr 01, 2020

When CONFIG_HUGETLB_PAGE is set but not CONFIG_HUGETLBFS, the following
build failure is encoutered:

  In file included from arch/powerpc/mm/fault.c:33:0:
  include/linux/hugetlb.h: In function 'hstate_inode':
  include/linux/hugetlb.h:477:9: error: implicit declaration of function 'HUGETLBFS_SB' [-Werror=implicit-function-declaration]
    return HUGETLBFS_SB(i->i_sb)->hstate;
           ^
  include/linux/hugetlb.h:477:30: error: invalid type argument of '->' (have 'int')
    return HUGETLBFS_SB(i->i_sb)->hstate;
                                ^

Gate hstate_inode() with CONFIG_HUGETLBFS instead of CONFIG_HUGETLB_PAGE.

Fixes: a137e1cc ("hugetlbfs: per mount huge page sizes")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Link: http://lkml.kernel.org/r/7e8c3a3c9a587b9cd8a2f146df32a421b961f3a2.1584432148.git.christophe.leroy@c-s.fr
Link: https://patchwork.ozlabs.org/patch/1255548/#2386036Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bb297bb2

selftests/vm: fix map_hugetlb length used for testing read and write · cabc30da

Christophe Leroy authored Apr 01, 2020

Commit fa7b9a80 ("tools/selftest/vm: allow choosing mem size and page
size in map_hugetlb") added the possibility to change the size of memory
mapped for the test, but left the read and write test using the default
value. This is unnoticed when mapping a length greater than the default
one, but segfaults otherwise.

Fix read_bytes() and write_bytes() by giving them the real length.

Also fix the call to munmap().

Fixes: fa7b9a80 ("tools/selftest/vm: allow choosing mem size and page size in map_hugetlb")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/9a404a13c871c4bd0ba9ede68f69a1225180dd7e.1580978385.git.christophe.leroy@c-s.frSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cabc30da

mm/hugetlb: remove unnecessary memory fetch in PageHeadHuge() · d4af73e3

Vlastimil Babka authored Apr 01, 2020

Commit f1e61557 ("mm: pack compound_dtor and compound_order into one
word in struct page") changed compound_dtor from a pointer to an array
index in order to pack it. To check if page has the hugeltbfs
compound_dtor, we can just compare the index directly without fetching the
function pointer. Said commit did that with PageHuge() and we can do the
same with PageHeadHuge() to make the code a bit smaller and faster.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Neha Agarwal <nehaagarwal@google.com>
Link: http://lkml.kernel.org/r/20200311172440.6988-1-vbabka@suse.czSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d4af73e3

mm/hugetlb.c: clean code by removing unnecessary initialization · 353b2de4

Mateusz Nosek authored Apr 01, 2020

Previously variable 'check_addr' was initialized, but was not read later
before reassigning. So the initialization can be removed.
Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Link: http://lkml.kernel.org/r/20200303212354.25226-1-mateusznosek0@gmail.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

353b2de4

hugetlb_cgroup: add hugetlb_cgroup reservation docs · 6566704d

Mina Almasry authored Apr 01, 2020

Add docs for how to use hugetlb_cgroup reservations, and their behavior.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Link: http://lkml.kernel.org/r/20200211213128.73302-9-almasrymina@google.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6566704d

hugetlb_cgroup: add hugetlb_cgroup reservation tests · 29750f71

Mina Almasry authored Apr 01, 2020

The tests use both shared and private mapped hugetlb memory, and monitors
the hugetlb usage counter as well as the hugetlb reservation counter.
They test different configurations such as hugetlb memory usage via
hugetlbfs, or MAP_HUGETLB, or shmget/shmat, and with and without
MAP_POPULATE.

Also add test for hugetlb reservation reparenting, since this is a subtle
issue.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Sandipan Das <sandipan@linux.ibm.com>	[powerpc64]
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Link: http://lkml.kernel.org/r/20200211213128.73302-8-almasrymina@google.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

29750f71

hugetlb: support file_region coalescing again · a9b3f867

Mina Almasry authored Apr 01, 2020

An earlier patch in this series disabled file_region coalescing in order
to hang the hugetlb_cgroup uncharge info on the file_region entries.

This patch re-adds support for coalescing of file_region entries.
Essentially everytime we add an entry, we call a recursive function that
tries to coalesce the added region with the regions next to it.  The worst
case call depth for this function is 3: one to coalesce with the region
next to it, one to coalesce to the region prev, and one to reach the base
case.

This is an important performance optimization as private mappings add
their entries page by page, and we could incur big performance costs for
large mappings with lots of file_region entries in their resv_map.

[almasrymina@google.com: fix CONFIG_CGROUP_HUGETLB ifdefs]
  Link: http://lkml.kernel.org/r/20200214204544.231482-1-almasrymina@google.com
[almasrymina@google.com: remove check_coalesce_bug debug code]
  Link: http://lkml.kernel.org/r/20200219233610.13808-1-almasrymina@google.comSigned-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: http://lkml.kernel.org/r/20200211213128.73302-7-almasrymina@google.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a9b3f867

hugetlb_cgroup: support noreserve mappings · 08cf9faf

Mina Almasry authored Apr 01, 2020

Support MAP_NORESERVE accounting as part of the new counter.

For each hugepage allocation, at allocation time we check if there is a
reservation for this allocation or not.  If there is a reservation for
this allocation, then this allocation was charged at reservation time, and
we don't re-account it.  If there is no reserevation for this allocation,
we charge the appropriate hugetlb_cgroup.

The hugetlb_cgroup to uncharge for this allocation is stored in
page[3].private.  We use new APIs added in an earlier patch to set this
pointer.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Link: http://lkml.kernel.org/r/20200211213128.73302-6-almasrymina@google.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

08cf9faf

hugetlb_cgroup: add accounting for shared mappings · 075a61d0

Mina Almasry authored Apr 01, 2020

For shared mappings, the pointer to the hugetlb_cgroup to uncharge lives
in the resv_map entries, in file_region->reservation_counter.

After a call to region_chg, we charge the approprate hugetlb_cgroup, and
if successful, we pass on the hugetlb_cgroup info to a follow up
region_add call.  When a file_region entry is added to the resv_map via
region_add, we put the pointer to that cgroup in
file_region->reservation_counter.  If charging doesn't succeed, we report
the error to the caller, so that the kernel fails the reservation.

On region_del, which is when the hugetlb memory is unreserved, we also
uncharge the file_region->reservation_counter.

[akpm@linux-foundation.org: forward declare struct file_region]
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Link: http://lkml.kernel.org/r/20200211213128.73302-5-almasrymina@google.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

075a61d0

hugetlb: disable region_add file_region coalescing · 0db9d74e

Mina Almasry authored Apr 01, 2020

A follow up patch in this series adds hugetlb cgroup uncharge info the
file_region entries in resv->regions.  The cgroup uncharge info may differ
for different regions, so they can no longer be coalesced at region_add
time.  So, disable region coalescing in region_add in this patch.

Behavior change:

Say a resv_map exists like this [0->1], [2->3], and [5->6].

Then a region_chg/add call comes in region_chg/add(f=0, t=5).

Old code would generate resv->regions: [0->5], [5->6].
New code would generate resv->regions: [0->1], [1->2], [2->3], [3->5],
[5->6].

Special care needs to be taken to handle the resv->adds_in_progress
variable correctly.  In the past, only 1 region would be added for every
region_chg and region_add call.  But now, each call may add multiple
regions, so we can no longer increment adds_in_progress by 1 in
region_chg, or decrement adds_in_progress by 1 after region_add or
region_abort.  Instead, region_chg calls add_reservation_in_range() to
count the number of regions needed and allocates those, and that info is
passed to region_add and region_abort to decrement adds_in_progress
correctly.

We've also modified the assumption that region_add after region_chg never
fails.  region_chg now pre-allocates at least 1 region for region_add.  If
region_add needs more regions than region_chg has allocated for it, then
it may fail.

[almasrymina@google.com: fix file_region entry allocations]
  Link: http://lkml.kernel.org/r/20200219012736.20363-1-almasrymina@google.comSigned-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Link: http://lkml.kernel.org/r/20200211213128.73302-4-almasrymina@google.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0db9d74e

hugetlb_cgroup: add reservation accounting for private mappings · e9fe92ae

Mina Almasry authored Apr 01, 2020

Normally the pointer to the cgroup to uncharge hangs off the struct page,
and gets queried when it's time to free the page.  With hugetlb_cgroup
reservations, this is not possible.  Because it's possible for a page to
be reserved by one task and actually faulted in by another task.

The best place to put the hugetlb_cgroup pointer to uncharge for
reservations is in the resv_map.  But, because the resv_map has different
semantics for private and shared mappings, the code patch to
charge/uncharge shared and private mappings is different.  This patch
implements charging and uncharging for private mappings.

For private mappings, the counter to uncharge is in
resv_map->reservation_counter.  On initializing the resv_map this is set
to NULL.  On reservation of a region in private mapping, the tasks
hugetlb_cgroup is charged and the hugetlb_cgroup is placed is
resv_map->reservation_counter.

On hugetlb_vm_op_close, we uncharge resv_map->reservation_counter.

[akpm@linux-foundation.org: forward declare struct resv_map]
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Link: http://lkml.kernel.org/r/20200211213128.73302-3-almasrymina@google.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e9fe92ae

mm/hugetlb_cgroup: fix hugetlb_cgroup migration · 9808895e

Mina Almasry authored Apr 01, 2020

Commit c32300516047 ("hugetlb_cgroup: add interface for charge/uncharge
hugetlb reservations") mistakingly doesn't handle the migration of *both*
the reservation hugetlb_cgroup and the fault hugetlb_cgroup correctly.

What should happen is that both cgroups shuold be queried from the old
page, then both set to NULL on the old page, then both inserted into the
new page.

The mistake also creates the following warning:

mm/hugetlb_cgroup.c: In function 'hugetlb_cgroup_migrate':
mm/hugetlb_cgroup.c:777:25: warning: variable 'h_cg' set but not used
[-Wunused-but-set-variable]
  struct hugetlb_cgroup *h_cg;
                         ^~~~

Solution is to add the missing steps, namly setting the reservation
hugetlb_cgroup to NULL on the old page, and setting the fault
hugetlb_cgroup on the new page.

Fixes: c32300516047 ("hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations")
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Link: http://lkml.kernel.org/r/20200218194727.46995-1-almasrymina@google.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9808895e

hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations · 1adc4d41

Mina Almasry authored Apr 01, 2020

Augments hugetlb_cgroup_charge_cgroup to be able to charge hugetlb usage
or hugetlb reservation counter.

Adds a new interface to uncharge a hugetlb_cgroup counter via
hugetlb_cgroup_uncharge_counter.

Integrates the counter with hugetlb_cgroup, via hugetlb_cgroup_init,
hugetlb_cgroup_have_usage, and hugetlb_cgroup_css_offline.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Link: http://lkml.kernel.org/r/20200211213128.73302-2-almasrymina@google.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1adc4d41

hugetlb_cgroup: add hugetlb_cgroup reservation counter · cdc2fcfe

Mina Almasry authored Apr 01, 2020

These counters will track hugetlb reservations rather than hugetlb memory
faulted in.  This patch only adds the counter, following patches add the
charging and uncharging of the counter.

This is patch 1 of an 9 patch series.

Problem:

Currently tasks attempting to reserve more hugetlb memory than is
available get a failure at mmap/shmget time.  This is thanks to Hugetlbfs
Reservations [1].  However, if a task attempts to reserve more hugetlb
memory than its hugetlb_cgroup limit allows, the kernel will allow the
mmap/shmget call, but will SIGBUS the task when it attempts to fault in
the excess memory.

We have users hitting their hugetlb_cgroup limits and thus we've been
looking at this failure mode.  We'd like to improve this behavior such
that users violating the hugetlb_cgroup limits get an error on mmap/shmget
time, rather than getting SIGBUS'd when they try to fault the excess
memory in.  This gives the user an opportunity to fallback more gracefully
to non-hugetlbfs memory for example.

The underlying problem is that today's hugetlb_cgroup accounting happens
at hugetlb memory *fault* time, rather than at *reservation* time.  Thus,
enforcing the hugetlb_cgroup limit only happens at fault time, and the
offending task gets SIGBUS'd.

Proposed Solution:

A new page counter named
'hugetlb.xMB.rsvd.[limit|usage|max_usage]_in_bytes'. This counter has
slightly different semantics than
'hugetlb.xMB.[limit|usage|max_usage]_in_bytes':

- While usage_in_bytes tracks all *faulted* hugetlb memory,
  rsvd.usage_in_bytes tracks all *reserved* hugetlb memory and hugetlb
  memory faulted in without a prior reservation.

- If a task attempts to reserve more memory than limit_in_bytes allows,
  the kernel will allow it to do so.  But if a task attempts to reserve
  more memory than rsvd.limit_in_bytes, the kernel will fail this
  reservation.

This proposal is implemented in this patch series, with tests to verify
functionality and show the usage.

Alternatives considered:

1. A new cgroup, instead of only a new page_counter attached to the
   existing hugetlb_cgroup.  Adding a new cgroup seemed like a lot of code
   duplication with hugetlb_cgroup.  Keeping hugetlb related page counters
   under hugetlb_cgroup seemed cleaner as well.

2. Instead of adding a new counter, we considered adding a sysctl that
   modifies the behavior of hugetlb.xMB.[limit|usage]_in_bytes, to do
   accounting at reservation time rather than fault time.  Adding a new
   page_counter seems better as userspace could, if it wants, choose to
   enforce different cgroups differently: one via limit_in_bytes, and
   another via rsvd.limit_in_bytes.  This could be very useful if you're
   transitioning how hugetlb memory is partitioned on your system one
   cgroup at a time, for example.  Also, someone may find usage for both
   limit_in_bytes and rsvd.limit_in_bytes concurrently, and this approach
   gives them the option to do so.

Testing:
- Added tests passing.
- Used libhugetlbfs for regression testing.

[1]: https://www.kernel.org/doc/html/latest/vm/hugetlbfs_reserv.htmlSigned-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Link: http://lkml.kernel.org/r/20200211213128.73302-1-almasrymina@google.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cdc2fcfe

hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race · 87bf91d3

Mike Kravetz authored Apr 01, 2020

hugetlbfs page faults can race with truncate and hole punch operations.
Current code in the page fault path attempts to handle this by 'backing
out' operations if we encounter the race. One obvious omission in the
current code is removing a page newly added to the page cache. This is
pretty straight forward to address, but there is a more subtle and
difficult issue of backing out hugetlb reservations. To handle this
correctly, the 'reservation state' before page allocation needs to be
noted so that it can be properly backed out. There are four distinct
possibilities for reservation state: shared/reserved, shared/no-resv,
private/reserved and private/no-resv. Backing out a reservation may
require memory allocation which could fail so that needs to be taken
into account as well.

Instead of writing the required complicated code for this rare
occurrence, just eliminate the race. i_mmap_rwsem is now held in read
mode for the duration of page fault processing. Hold i_mmap_rwsem in
write mode when modifying i_size. In this way, truncation can not
proceed when page faults are being processed. In addition, i_size
will not change during fault processing so a single check can be made
to ensure faults are not beyond (proposed) end of file. Faults can
still race with hole punch, but that race is handled by existing code
and the use of hugetlb_fault_mutex.

With this modification, checks for races with truncation in the page
fault path can be simplified and removed. remove_inode_hugepages no
longer needs to take hugetlb_fault_mutex in the case of truncation.
Comments are expanded to explain reasoning behind locking.
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Prakash Sangappa <prakash.sangappa@oracle.com>
Link: http://lkml.kernel.org/r/20200316205756.146666-3-mike.kravetz@oracle.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

87bf91d3

hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization · c0d0381a

Mike Kravetz authored Apr 01, 2020

Patch series "hugetlbfs: use i_mmap_rwsem for more synchronization", v2.

While discussing the issue with huge_pte_offset [1], I remembered that
there were more outstanding hugetlb races.  These issues are:

1) For shared pmds, huge PTE pointers returned by huge_pte_alloc can become
   invalid via a call to huge_pmd_unshare by another thread.
2) hugetlbfs page faults can race with truncation causing invalid global
   reserve counts and state.

A previous attempt was made to use i_mmap_rwsem in this manner as
described at [2].  However, those patches were reverted starting with [3]
due to locking issues.

To effectively use i_mmap_rwsem to address the above issues it needs to be
held (in read mode) during page fault processing.  However, during fault
processing we need to lock the page we will be adding.  Lock ordering
requires we take page lock before i_mmap_rwsem.  Waiting until after
taking the page lock is too late in the fault process for the
synchronization we want to do.

To address this lock ordering issue, the following patches change the lock
ordering for hugetlb pages.  This is not too invasive as hugetlbfs
processing is done separate from core mm in many places.  However, I don't
really like this idea.  Much ugliness is contained in the new routine
hugetlb_page_mapping_lock_write() of patch 1.

The only other way I can think of to address these issues is by catching
all the races.  After catching a race, cleanup, backout, retry ...  etc,
as needed.  This can get really ugly, especially for huge page
reservations.  At one time, I started writing some of the reservation
backout code for page faults and it got so ugly and complicated I went
down the path of adding synchronization to avoid the races.  Any other
suggestions would be welcome.

[1] https://lore.kernel.org/linux-mm/1582342427-230392-1-git-send-email-longpeng2@huawei.com/
[2] https://lore.kernel.org/linux-mm/20181222223013.22193-1-mike.kravetz@oracle.com/
[3] https://lore.kernel.org/linux-mm/20190103235452.29335-1-mike.kravetz@oracle.com
[4] https://lore.kernel.org/linux-mm/1584028670.7365.182.camel@lca.pw/
[5] https://lore.kernel.org/lkml/20200312183142.108df9ac@canb.auug.org.au/

This patch (of 2):

While looking at BUGs associated with invalid huge page map counts, it was
discovered and observed that a huge pte pointer could become 'invalid' and
point to another task's page table.  Consider the following:

A task takes a page fault on a shared hugetlbfs file and calls
huge_pte_alloc to get a ptep.  Suppose the returned ptep points to a
shared pmd.

Now, another task truncates the hugetlbfs file.  As part of truncation, it
unmaps everyone who has the file mapped.  If the range being truncated is
covered by a shared pmd, huge_pmd_unshare will be called.  For all but the
last user of the shared pmd, huge_pmd_unshare will clear the pud pointing
to the pmd.  If the task in the middle of the page fault is not the last
user, the ptep returned by huge_pte_alloc now points to another task's
page table or worse.  This leads to bad things such as incorrect page
map/reference counts or invalid memory references.

To fix, expand the use of i_mmap_rwsem as follows:
- i_mmap_rwsem is held in read mode whenever huge_pmd_share is called.
  huge_pmd_share is only called via huge_pte_alloc, so callers of
  huge_pte_alloc take i_mmap_rwsem before calling.  In addition, callers
  of huge_pte_alloc continue to hold the semaphore until finished with
  the ptep.
- i_mmap_rwsem is held in write mode whenever huge_pmd_unshare is called.

One problem with this scheme is that it requires taking i_mmap_rwsem
before taking the page lock during page faults.  This is not the order
specified in the rest of mm code.  Handling of hugetlbfs pages is mostly
isolated today.  Therefore, we use this alternative locking order for
PageHuge() pages.

         mapping->i_mmap_rwsem
           hugetlb_fault_mutex (hugetlbfs specific page fault mutex)
             page->flags PG_locked (lock_page)

To help with lock ordering issues, hugetlb_page_mapping_lock_write() is
introduced to write lock the i_mmap_rwsem associated with a page.

In most cases it is easy to get address_space via vma->vm_file->f_mapping.
However, in the case of migration or memory errors for anon pages we do
not have an associated vma.  A new routine _get_hugetlb_page_mapping()
will use anon_vma to get address_space in these cases.
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Prakash Sangappa <prakash.sangappa@oracle.com>
Link: http://lkml.kernel.org/r/20200316205756.146666-2-mike.kravetz@oracle.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c0d0381a

mm/memblock.c: remove redundant assignment to variable max_addr · 49aef717

Colin Ian King authored Apr 01, 2020

The variable max_addr is being initialized with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Link: http://lkml.kernel.org/r/20200228235003.112718-1-colin.king@canonical.com
Addresses-Coverity: ("Unused value")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

49aef717

mm: mempolicy: require at least one nodeid for MPOL_PREFERRED · aa9f7d51

Randy Dunlap authored Apr 01, 2020

Using an empty (malformed) nodelist that is not caught during mount option
parsing leads to a stack-out-of-bounds access.

The option string that was used was: "mpol=prefer:,".  However,
MPOL_PREFERRED requires a single node number, which is not being provided
here.

Add a check that 'nodes' is not empty after parsing for MPOL_PREFERRED's
nodeid.

Fixes: 095f1fc4 ("mempolicy: rework shmem mpol parsing and display")
Reported-by: Entropy Moe <3ntr0py1337@gmail.com>
Reported-by: syzbot+b055b1a6b2b958707a21@syzkaller.appspotmail.com
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: syzbot+b055b1a6b2b958707a21@syzkaller.appspotmail.com
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Link: http://lkml.kernel.org/r/89526377-7eb6-b662-e1d8-4430928abde9@infradead.orgSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

aa9f7d51