Commits · f0254b51cbbfe040142cb824a1ff623de698ae8a · Kirill Smelkov / linux

02 Apr, 2020 1 commit

gpio: Unconditionally assign .request()/.free() · f0254b51

Thierry Reding authored Apr 01, 2020

The gpiochip_generic_request() and gpiochip_generic_free() functions can
now deal properly with chips that don't have any pin-ranges defined, so
they can be assigned unconditionally.
Suggested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200401200527.2982450-1-thierry.reding@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

f0254b51

01 Apr, 2020 1 commit

gpio: export of_pinctrl_get to modules · 33dd8882

Stephen Rothwell authored Apr 01, 2020

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/r/20200401151904.6948af20@canb.auug.org.au
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

33dd8882

31 Mar, 2020 3 commits

pinctrl: Define of_pinctrl_get() dummy for !PINCTRL · e45ee71a

Thierry Reding authored Mar 30, 2020

Currently, the of_pinctrl_get() dummy is only defined for !OF, which can
still cause build failures on configurations with OF enabled but PINCTRL
disabled. Make sure to define the dummy if either OF or PINCTRL is not
enabled.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200330095801.2421589-1-thierry.reding@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

e45ee71a
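
The guard described in this commit can be sketched as a small preprocessor pattern. This is only an illustration: the two struct declarations are opaque stand-ins rather than the kernel headers, and CONFIG_OF and CONFIG_PINCTRL are treated as plain macros that a build may or may not define.

```c
#include <stddef.h>

/* Opaque stand-ins; the real definitions live in kernel headers. */
struct device_node;
struct pinctrl_dev;

#if defined(CONFIG_OF) && defined(CONFIG_PINCTRL)
/* Both options enabled: the real implementation is provided elsewhere. */
struct pinctrl_dev *of_pinctrl_get(struct device_node *np);
#else
/* Dummy used when either OF or PINCTRL is disabled. */
static inline struct pinctrl_dev *of_pinctrl_get(struct device_node *np)
{
	(void)np;
	return NULL;
}
#endif
```

With neither macro defined, callers still compile and simply see a NULL result.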

gpio: Rename variable in core APIs · a0b66a73

Linus Walleij authored Mar 29, 2020

There is struct gpio_chip *gc, *chip and *gpiochip, and yes
I am responsible for some of the inconsistencies. I want
this to be just gc everywhere for minimizing cognitive
resistance when reading the code: more compact function
signatures and less clutter.

Purely syntactic changes intended. No semantic effects.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20200329140405.52276-1-linus.walleij@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

a0b66a73

gpio: Avoid using pin ranges with !PINCTRL · 89ad556b

Thierry Reding authored Mar 30, 2020

Do not use the struct gpio_device's .pin_ranges field if the PINCTRL
Kconfig symbol is not selected to avoid build failures.

Fixes: 2ab73c6d ("gpio: Support GPIO controllers without pin-ranges")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200330090257.2332864-1-thierry.reding@gmail.com
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

89ad556b

27 Mar, 2020 9 commits

gpiolib: Remove unused gpio_chip parameter from gpio_set_bias() · 5f4bf171

Geert Uytterhoeven authored Mar 25, 2020

gpio_set_bias() no longer uses the passed gpio_chip pointer parameter.
Remove it.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20200325100439.14000-3-geert+renesas@glider.be
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

5f4bf171

gpiolib: Pass gpio_desc to gpio_set_config() · 83522358

Geert Uytterhoeven authored Mar 25, 2020

All callers of gpio_set_config() have to convert a gpio_desc to a
gpio_chip and offset. Avoid these duplicated conversion steps by
letting gpio_set_config() take a gpio_desc pointer directly.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20200325100439.14000-2-geert+renesas@glider.be
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

83522358

gpiolib: Introduce gpiod_set_config() · 8ced32ff

Geert Uytterhoeven authored Mar 24, 2020

The GPIO Aggregator will need a method to forward a .set_config() call
to its parent gpiochip.  This requires obtaining the gpio_chip and
offset for a given gpio_desc.  While gpiod_to_chip() is public,
gpio_chip_hwgpio() is not, so there is currently no method to obtain the
needed GPIO offset parameter.

Hence introduce a public gpiod_set_config() helper, which invokes the
.set_config() callback through a gpio_desc pointer, like is done for
most other gpio_chip callbacks.

Rewrite the existing gpiod_set_debounce() helper as a wrapper around
gpiod_set_config(), to avoid duplication.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20200324135653.6676-5-geert+renesas@glider.be
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

8ced32ff
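
The wrapper relationship described above (a debounce helper expressed through a generic set-config helper) can be sketched as follows. All names are hypothetical, and the packing layout (parameter id in the low 8 bits, argument above it) is an assumption made for illustration, not a statement about the kernel's encoding.

```c
/* Hypothetical parameter id; the kernel uses an enum of pin config params. */
enum { DEMO_PARAM_DEBOUNCE = 1 };

/* Records the last config word "sent" to the stand-in controller. */
static unsigned long demo_last_config;

/* Pack a parameter id (low 8 bits) and its argument into one word. */
static unsigned long demo_pack(unsigned int param, unsigned long arg)
{
	return (arg << 8) | (param & 0xffUL);
}

/* Generic entry point: every configuration goes through here. */
static int demo_set_config(unsigned long config)
{
	demo_last_config = config;
	return 0;
}

/* Debounce becomes a thin wrapper instead of a duplicate code path. */
static int demo_set_debounce(unsigned int usec)
{
	return demo_set_config(demo_pack(DEMO_PARAM_DEBOUNCE, usec));
}
```

The point of the design is that only one function needs to know how to reach the controller; specific helpers just build the config word.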

Merge tag 'v5.6-rc7' into devel · 06dd3f31

Linus Walleij authored Mar 27, 2020

```
Linux 5.6-rc7
```

06dd3f31

tools: gpio: Fix out-of-tree build regression · 82f04bfe

Anssi Hannula authored Mar 25, 2020

Commit 0161a94e ("tools: gpio: Correctly add make dependencies for
gpio_utils") added a make rule for gpio-utils-in.o but used $(output)
instead of the correct $(OUTPUT) for the output directory, breaking
out-of-tree build (O=xx) with the following error:

No rule to make target 'out/tools/gpio/gpio-utils-in.o', needed by 'out/tools/gpio/lsgpio-in.o'. Stop.

Fix that.

Fixes: 0161a94e ("tools: gpio: Correctly add make dependencies for gpio_utils")
Cc: <stable@vger.kernel.org>
Cc: Laura Abbott <labbott@redhat.com>
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Link: https://lore.kernel.org/r/20200325103154.32235-1-anssi.hannula@bitwise.fi
Reviewed-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

82f04bfe

gpio: gpiolib: fix a doc warning · 35c6cfb4

Mauro Carvalho Chehab authored Mar 17, 2020

Use a different markup for the ERR_PTR, as %FOO doesn't work
if there are parentheses. So, use, instead:

	``ERR_PTR(-EINVAL)``

This fixes the following warning:

	./drivers/gpio/gpiolib.c:139: WARNING: Inline literal start-string without end-string.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/51197e3568f073e22c280f0584bfa20b44436708.1584456635.git.mchehab+huawei@kernel.org
Reviewed-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

35c6cfb4

gpio: tegra186: Add Tegra194 pin ranges for GG.0 and GG.1 · ffa91e7c

Thierry Reding authored Mar 19, 2020

The GG.0 and GG.1 GPIOs serve as CLKREQ and RST pins, respectively, for
PCIe controller 5 on Tegra194. When this controller is configured in
endpoint mode, these pins need to be used as GPIOs by the PCIe endpoint
driver. Typically the mode programming of these pins (GPIO vs. SFIO) is
performed by early boot firmware to ensure that the configuration is
consistent.

However, the GG.0 and GG.1 pins are part of a special power partition
that is not enabled during early boot, and hence the early boot firmware
cannot program these pins to be GPIOs (they are SFIO by default). Adding
them as pin ranges for the pin controller allows the pin controller to
be involved when these pins are requested as GPIOs and allows the proper
programming to take place.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200319122737.3063291-4-thierry.reding@gmail.com
Tested-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

ffa91e7c

gpio: tegra186: Add support for pin ranges · b64d6c9a

Thierry Reding authored Mar 19, 2020

Add support for Tegra SoC generations to specify a list of pin ranges
that map GPIOs to ranges of pins in the pin controller.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200319122737.3063291-3-thierry.reding@gmail.com
Tested-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

b64d6c9a

gpio: Support GPIO controllers without pin-ranges · 2ab73c6d

Thierry Reding authored Mar 19, 2020

Make gpiochip_generic_request() call into the pinctrl helpers only if a
GPIO controller had any pin-ranges assigned to it. This allows a driver
to unconditionally use this helper if it supports multiple devices of
which only a subset have pin-ranges assigned to them.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200319122737.3063291-2-thierry.reding@gmail.com
Tested-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

2ab73c6d
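
A minimal sketch of the behaviour this commit describes, using a hypothetical stand-in struct rather than the kernel's types: the request helper forwards to the pin controller only when the GPIO controller has pin ranges, and otherwise succeeds without doing anything.

```c
#include <stddef.h>

/* Hypothetical stand-in for a GPIO controller; not the kernel's struct. */
struct demo_gpio_dev {
	size_t num_pin_ranges;  /* number of registered pin ranges */
	int pinctrl_calls;      /* counts requests forwarded to pinctrl */
};

/* Stand-in for the pinctrl helper that would claim the pin. */
static int demo_pinctrl_request(struct demo_gpio_dev *dev, unsigned int offset)
{
	(void)offset;
	dev->pinctrl_calls++;
	return 0;
}

/* Forward to pinctrl only when the controller has pin ranges. */
static int demo_generic_request(struct demo_gpio_dev *dev, unsigned int offset)
{
	if (dev->num_pin_ranges == 0)
		return 0;  /* nothing to claim; succeed without touching pinctrl */
	return demo_pinctrl_request(dev, offset);
}
```

Because the empty case is handled inside the helper, a driver that serves several devices can assign it unconditionally, even when only some of them have pin ranges.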

26 Mar, 2020 1 commit

ARM: integrator: impd1: Use GPIO_LOOKUP() helper macro · da3f5947

Geert Uytterhoeven authored Mar 24, 2020

impd1_probe() fills in the GPIO lookup table by manually populating an
array of gpiod_lookup structures. Use the existing GPIO_LOOKUP() helper
macro instead, to relax a dependency on the gpiod_lookup structure's
member names.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20200324135653.6676-1-geert+renesas@glider.be
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

da3f5947

25 Mar, 2020 15 commits

gpio: brcmstb: support gpio-line-names property · 5eefcaed

Doug Berger authored Mar 09, 2020

The default handling of the gpio-line-names property by the
gpiolib-of implementation does not work with the multiple
gpiochip banks per device structure used by the gpio-brcmstb
driver.

This commit adds driver level support for the device tree
property so that GPIO lines can be assigned friendly names.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Link: https://lore.kernel.org/r/1583780521-45702-1-git-send-email-opendmb@gmail.com
Acked-by: Gregory Fong <gregory.0xf0@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

5eefcaed

Merge tag 'gpio-updates-for-v5.7-part4' of... · 30a464a8

Linus Walleij authored Mar 25, 2020

Merge tag 'gpio-updates-for-v5.7-part4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into devel

gpio updates for v5.7 part 4

- improve comments in the uapi header
- fix documentation issues
- add a warning to gpio-pl061 when the IRQ line is not configured
- allow building gpio-mxc and gpio-mxs with COMPILE_TEST enabled
- don't print an error message when an optional IRQ is missing in gpio-mvebu
- fix a potential segfault in gpio-hammer
- fix a couple typos and coding style issues in gpio tools
- provide a new flag in gpio-mmio and use it in mt7621 to fix an issue with
  the controller ignoring value setting when a GPIO is in input mode
- slightly refactor gpio_name_to_desc()

30a464a8

tools: gpio: Fix typo in gpio-utils · 97551625

Mykyta Poturai authored Mar 22, 2020

Replace COMSUMER with proper CONSUMER
Signed-off-by: Mykyta Poturai <mykyta.poturai@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

97551625

tools: gpio-hammer: Apply scripts/Lindent and retain good changes · 1003bc16

Gabriel Ravier authored Mar 16, 2020

"retain good changes" means that I left the help string split up instead
of having this weird thing where it tries to merge together the last three
lines and it looks **really** bad
Signed-off-by: Gabriel Ravier <gabravier@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

1003bc16

gpiolib: gpio_name_to_desc: factor out !name check · ee203bbd

Michał Mirosław authored Mar 15, 2020

Since name == NULL can't ever match, move the check out of
IRQ-disabled region.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

ee203bbd

tools: gpio-hammer: fix spelling mistake: "occurences" -> "occurrences" · 55f17e2a

Colin Ian King authored Mar 16, 2020

There is a spelling mistake in an error message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

55f17e2a

gpio: mt7621: add BGPIOF_NO_SET_ON_INPUT flag · 427cabed

Chuanhong Guo authored Mar 15, 2020

DSET/DCLR registers only work on output pins. Add corresponding
BGPIOF_NO_SET_ON_INPUT flag to bgpio_init call to fix direction_out
behavior.
Signed-off-by: Chuanhong Guo <gch981213@gmail.com>
Tested-by: René van Dorst <opensource@vdorst.com>
Reviewed-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

427cabed

gpio: mmio: introduce BGPIOF_NO_SET_ON_INPUT · d19d2de6

Chuanhong Guo authored Mar 15, 2020

Some gpio controllers ignore pin value writing when that pin is
configured as input mode. As a result, bgpio_dir_out should set
pin to output before configuring pin values or gpio pin values
can't be set up properly.
Introduce two variants of bgpio_dir_out: bgpio_dir_out_val_first
and bgpio_dir_out_dir_first, and assign direction_output according
to a new flag: BGPIOF_NO_SET_ON_INPUT.
Signed-off-by: Chuanhong Guo <gch981213@gmail.com>
Tested-by: René van Dorst <opensource@vdorst.com>
Reviewed-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

d19d2de6
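
The ordering problem can be modelled in a few lines. The sketch below uses a hypothetical one-pin model of a controller that drops value writes while the pin is an input; the function names only echo the two variants named in the commit message and are not the driver's code.

```c
#include <stdbool.h>

/* Hypothetical model of one pin on a controller that ignores value
 * writes while the pin is configured as an input. */
struct demo_pin {
	bool is_output;
	int value;
};

static void demo_set_value(struct demo_pin *p, int val)
{
	if (p->is_output)  /* writes are dropped in input mode */
		p->value = val;
}

static void demo_set_output(struct demo_pin *p)
{
	p->is_output = true;
}

/* Value first, then direction: the value is lost on this controller. */
static void demo_dir_out_val_first(struct demo_pin *p, int val)
{
	demo_set_value(p, val);
	demo_set_output(p);
}

/* Direction first, then value: the order the new flag selects. */
static void demo_dir_out_dir_first(struct demo_pin *p, int val)
{
	demo_set_output(p);
	demo_set_value(p, val);
}
```

On this model, the value-first variant leaves the pin as an output still holding its old value, while the direction-first variant ends with the requested value.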

tools: gpio-hammer: Avoid potential overflow in main · d1ee7e1f

Gabriel Ravier authored Mar 12, 2020

If '-o' was used more than 64 times in a single invocation of gpio-hammer,
this could lead to an overflow of the 'lines' array. This commit fixes
this by avoiding the overflow and giving a proper diagnostic back to the
user.
Signed-off-by: Gabriel Ravier <gabravier@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

d1ee7e1f
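
A bounds check of the kind described above might look like the following sketch. The capacity of 64 is taken from the commit message; the helper name and signature are hypothetical, and the actual tool may structure its option parsing differently.

```c
#include <stddef.h>

#define DEMO_MAX_LINES 64  /* capacity mentioned in the commit message */

/* Append one line offset; return 0 on success, -1 if the array is full. */
static int demo_add_line(unsigned int *lines, size_t *count, unsigned int line)
{
	if (*count >= DEMO_MAX_LINES)
		return -1;  /* caller should print a diagnostic and exit */
	lines[(*count)++] = line;
	return 0;
}
```

A caller would invoke this once per '-o' option and stop with an error message as soon as it returns -1.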

gpio: mvebu: avoid error message for optional IRQ · 525b0858

Chris Packham authored Mar 13, 2020

platform_get_irq() will generate an error message if the requested irq
is not present:

  mvebu-gpio f1010140.gpio: IRQ index 3 not found

Use platform_get_irq_optional() to avoid the error message being
generated.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

525b0858

gpio: mxs: add COMPILE_TEST support for GPIO_MXS · 6876ca31

Anson Huang authored Mar 09, 2020

Add COMPILE_TEST support to GPIO_MXS driver for better compile
testing coverage.
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

6876ca31

gpio: mxc: Add COMPILE_TEST support for GPIO_MXC · d4e93614

Anson Huang authored Mar 07, 2020

Add COMPILE_TEST support to GPIO_MXC driver for better compile
testing coverage.
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

d4e93614

gpio: pl061: Warn when IRQ line has not been configured · 1a555713

Alexander Sverdlin authored Mar 03, 2020

Existing (irq < 0) condition is always false because adev->irq has unsigned
type and contains 0 in case of failed irq_of_parse_and_map(). Up to now all
the mapping errors were silently ignored.

Seems that repairing this check would be backwards-incompatible and might
break the probe() for the implementations without IRQ support. Therefore
warn the user instead.
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

1a555713
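
The dead check is easy to reproduce in isolation. In this sketch the helper names are hypothetical; the only assumption carried over from the commit message is that the IRQ number is stored in an unsigned type and that 0 signals a failed mapping. Some compilers may warn about the first comparison, which is the point.

```c
#include <stdbool.h>

/* An unsigned value can never be below zero, so this is always false. */
static bool demo_check_negative(unsigned int irq)
{
	return irq < 0;
}

/* Comparing against 0 is what actually detects the failed mapping. */
static bool demo_check_zero(unsigned int irq)
{
	return irq == 0;
}
```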

docs: gpio: driver.rst: don't mark literal blocks twice · f8c3cea8

Mauro Carvalho Chehab authored Mar 03, 2020

Two literal blocks there are marked with both "::" and

.. code-block:: c

This causes Sphinx (2.4.1) to do the wrong thing, causing
lots of warnings:

Documentation/driver-api/gpio/driver.rst:425: WARNING: Unexpected indentation.
Documentation/driver-api/gpio/driver.rst:423: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:427: WARNING: Block quote ends without a blank line; unexpected unindent.
Documentation/driver-api/gpio/driver.rst:429: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:429: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:429: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:433: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:446: WARNING: Unexpected indentation.
Documentation/driver-api/gpio/driver.rst:440: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:440: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:447: WARNING: Block quote ends without a blank line; unexpected unindent.
Documentation/driver-api/gpio/driver.rst:449: WARNING: Definition list ends without a blank line; unexpected unindent.
Documentation/driver-api/gpio/driver.rst:462: WARNING: Unexpected indentation.
Documentation/driver-api/gpio/driver.rst:460: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:462: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:465: WARNING: Block quote ends without a blank line; unexpected unindent.
Documentation/driver-api/gpio/driver.rst:467: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:467: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:467: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:471: WARNING: Inline emphasis start-string without end-string.
Documentation/driver-api/gpio/driver.rst:478: WARNING: Inline emphasis start-string without end-string.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

f8c3cea8

gpio: uapi: Improve phrasing around arrays representing empty strings · 32f5f62d

Jonathan Neuschäfer authored Mar 03, 2020

Character arrays can be considered empty strings (if they are
immediately terminated), but they cannot be NULL.
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

32f5f62d
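
The distinction drawn here can be shown with a small sketch. The struct below is a hypothetical record with a fixed-size name field; the field size is an assumption and the type is not the uapi definition.

```c
#include <stdbool.h>

/* Hypothetical record with an embedded, fixed-size name field. */
struct demo_line_info {
	char name[32];
};

/* An array member always has storage, so "no name" can only mean an
 * empty string (first byte is the terminator), never a NULL pointer. */
static bool demo_has_name(const struct demo_line_info *info)
{
	return info->name[0] != '\0';
}
```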

23 Mar, 2020 1 commit

Linux 5.6-rc7 · 16fbf79b

Linus Torvalds authored Mar 22, 2020

16fbf79b

22 Mar, 2020 9 commits

Merge tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 67d584e3

Linus Torvalds authored Mar 22, 2020

Pull btrfs fixes from David Sterba:
 "Two fixes.

  The first is a regression: when dropping some incompat bits the
  conditions were reversed. The other is a fix for rename whiteout
  potentially leaving stack memory linked to a list"

* tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix removal of raid[56|1c34} incompat flags after removing block group
  btrfs: fix log context list corruption after rename whiteout error

67d584e3

Merge branch 'akpm' (patches from Andrew) · b3c03db6

Linus Torvalds authored Mar 22, 2020

Merge misc fixes from Andrew Morton:
 "10 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  x86/mm: split vmalloc_sync_all()
  mm, slub: prevent kmalloc_node crashes and memory leaks
  mm/mmu_notifier: silence PROVE_RCU_LIST warnings
  epoll: fix possible lost wakeup on epoll_ctl() path
  mm: do not allow MADV_PAGEOUT for CoW pages
  mm, memcg: throttle allocators based on ancestral memory.high
  mm, memcg: fix corruption on 64-bit divisor in memory.high throttling
  page-flags: fix a crash at SetPageError(THP_SWAP)
  mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case
  memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event

b3c03db6

x86/mm: split vmalloc_sync_all() · 763802b5

Joerg Roedel authored Mar 21, 2020

Commit 3f8fd02b ("mm/vmalloc: Sync unmappings in
__purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in
the vunmap() code-path.  While this change was necessary to maintain
correctness on x86-32-pae kernels, it also adds additional cycles for
architectures that don't need it.

Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported
severe performance regressions in micro-benchmarks because it now also
calls the x86-64 implementation of vmalloc_sync_all() on vunmap().  But
the vmalloc_sync_all() implementation on x86-64 is only needed for newly
created mappings.

To avoid the unnecessary work on x86-64 and to gain the performance
back, split up vmalloc_sync_all() into two functions:

	* vmalloc_sync_mappings(), and
	* vmalloc_sync_unmappings()

Most call-sites to vmalloc_sync_all() only care about new mappings being
synchronized.  The only exception is the new call-site added in the
above mentioned commit.

Shile Zhang directed us to a report of an 80% regression in reaim
throughput.

Fixes: 3f8fd02b ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Borislav Petkov <bp@suse.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	[GHES]
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20191009124418.8286-1-joro@8bytes.org
Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/
Link: http://lkml.kernel.org/r/20191113095530.228959-1-shile.zhang@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

763802b5

mm, slub: prevent kmalloc_node crashes and memory leaks · 0715e6c5

Vlastimil Babka authored Mar 21, 2020

Sachin reports [1] a crash in SLUB __slab_alloc():

  BUG: Kernel NULL pointer dereference on read at 0x000073b0
  Faulting instruction address: 0xc0000000003d55f4
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 19 PID: 1 Comm: systemd Not tainted 5.6.0-rc2-next-20200218-autotest #1
  NIP:  c0000000003d55f4 LR: c0000000003d5b94 CTR: 0000000000000000
  REGS: c0000008b37836d0 TRAP: 0300   Not tainted  (5.6.0-rc2-next-20200218-autotest)
  MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24004844  XER: 00000000
  CFAR: c00000000000dec4 DAR: 00000000000073b0 DSISR: 40000000 IRQMASK: 1
  GPR00: c0000000003d5b94 c0000008b3783960 c00000000155d400 c0000008b301f500
  GPR04: 0000000000000dc0 0000000000000002 c0000000003443d8 c0000008bb398620
  GPR08: 00000008ba2f0000 0000000000000001 0000000000000000 0000000000000000
  GPR12: 0000000024004844 c00000001ec52a00 0000000000000000 0000000000000000
  GPR16: c0000008a1b20048 c000000001595898 c000000001750c18 0000000000000002
  GPR20: c000000001750c28 c000000001624470 0000000fffffffe0 5deadbeef0000122
  GPR24: 0000000000000001 0000000000000dc0 0000000000000002 c0000000003443d8
  GPR28: c0000008b301f500 c0000008bb398620 0000000000000000 c00c000002287180
  NIP ___slab_alloc+0x1f4/0x760
  LR __slab_alloc+0x34/0x60
  Call Trace:
    ___slab_alloc+0x334/0x760 (unreliable)
    __slab_alloc+0x34/0x60
    __kmalloc_node+0x110/0x490
    kvmalloc_node+0x58/0x110
    mem_cgroup_css_online+0x108/0x270
    online_css+0x48/0xd0
    cgroup_apply_control_enable+0x2ec/0x4d0
    cgroup_mkdir+0x228/0x5f0
    kernfs_iop_mkdir+0x90/0xf0
    vfs_mkdir+0x110/0x230
    do_mkdirat+0xb0/0x1a0
    system_call+0x5c/0x68

This is a PowerPC platform with following NUMA topology:

  available: 2 nodes (0-1)
  node 0 cpus:
  node 0 size: 0 MB
  node 0 free: 0 MB
  node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
  node 1 size: 35247 MB
  node 1 free: 30907 MB
  node distances:
  node   0   1
    0:  10  40
    1:  40  10

  possible numa nodes: 0-31

This only happens with a mmotm patch "mm/memcontrol.c: allocate
shrinker_map on appropriate NUMA node" [2] which effectively calls
kmalloc_node for each possible node.  SLUB however only allocates
kmem_cache_node on online N_NORMAL_MEMORY nodes, and relies on
node_to_mem_node to return such valid node for other nodes since commit
a561ce00 ("slub: fall back to node_to_mem_node() node if allocating
on memoryless node").  This is however not true in this configuration
where the _node_numa_mem_ array is not initialized for nodes 0 and 2-31,
thus it contains zeroes and get_partial() ends up accessing
non-allocated kmem_cache_node.

A related issue was reported by Bharata (originally by Ramachandran) [3]
where a similar PowerPC configuration, but with mainline kernel without
patch [2] ends up allocating large amounts of pages by kmalloc-1k
and kmalloc-512.  This seems to have the same underlying issue with
node_to_mem_node() not behaving as expected, and might probably also
lead to an infinite loop with CONFIG_SLUB_CPU_PARTIAL [4].

This patch should fix both issues by not relying on node_to_mem_node()
anymore and instead simply falling back to NUMA_NO_NODE, when
kmalloc_node(node) is attempted for a node that's not online, or has no
usable memory.  The "usable memory" condition is also changed from
node_present_pages() to N_NORMAL_MEMORY node state, as that is exactly
the condition that SLUB uses to allocate kmem_cache_node structures.
The check in get_partial() is removed completely, as the checks in
___slab_alloc() are now sufficient to prevent get_partial() being
reached with an invalid node.

[1] https://lore.kernel.org/linux-next/3381CD91-AB3D-4773-BA04-E7A072A63968@linux.vnet.ibm.com/
[2] https://lore.kernel.org/linux-mm/fff0e636-4c36-ed10-281c-8cdb0687c839@virtuozzo.com/
[3] https://lore.kernel.org/linux-mm/20200317092624.GB22538@in.ibm.com/
[4] https://lore.kernel.org/linux-mm/088b5996-faae-8a56-ef9c-5b567125ae54@suse.cz/

Fixes: a561ce00 ("slub: fall back to node_to_mem_node() node if allocating on memoryless node")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Reported-by: PUVICHAKRAVARTHY RAMACHANDRAN <puvichakravarthy@in.ibm.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Tested-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christopher Lameter <cl@linux.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200320115533.9604-1-vbabka@suse.cz
Debugged-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0715e6c5

mm/mmu_notifier: silence PROVE_RCU_LIST warnings · 63886bad

Qian Cai authored Mar 21, 2020

It is safe to traverse mm->notifier_subscriptions->list either under
SRCU read lock or mm->notifier_subscriptions->lock using
hlist_for_each_entry_rcu().  Silence the PROVE_RCU_LIST false positives,
for example,

  WARNING: suspicious RCU usage
  -----------------------------
  mm/mmu_notifier.c:484 RCU-list traversed in non-reader section!!

  other info that might help us debug this:

  rcu_scheduler_active = 2, debug_locks = 1
  3 locks held by libvirtd/802:
   #0: ffff9321e3f58148 (&mm->mmap_sem#2){++++}, at: do_mprotect_pkey+0xe1/0x3e0
   #1: ffffffff91ae6160 (mmu_notifier_invalidate_range_start){+.+.}, at: change_p4d_range+0x5fa/0x800
   #2: ffffffff91ae6e08 (srcu){....}, at: __mmu_notifier_invalidate_range_start+0x178/0x460

  stack backtrace:
  CPU: 7 PID: 802 Comm: libvirtd Tainted: G          I       5.6.0-rc6-next-20200317+ #2
  Hardware name: HP ProLiant BL460c Gen8, BIOS I31 11/02/2014
  Call Trace:
    dump_stack+0xa4/0xfe
    lockdep_rcu_suspicious+0xeb/0xf5
    __mmu_notifier_invalidate_range_start+0x3ff/0x460
    change_p4d_range+0x746/0x800
    change_protection+0x1df/0x300
    mprotect_fixup+0x245/0x3e0
    do_mprotect_pkey+0x23b/0x3e0
    __x64_sys_mprotect+0x51/0x70
    do_syscall_64+0x91/0xae8
    entry_SYSCALL_64_after_hwframe+0x49/0xb3
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Link: http://lkml.kernel.org/r/20200317175640.2047-1-cai@lca.pw
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

63886bad

epoll: fix possible lost wakeup on epoll_ctl() path · 1b53734b

Roman Penyaev authored Mar 21, 2020

This fixes possible lost wakeup introduced by commit a218cc49.
Originally modifications to ep->wq were serialized by ep->wq.lock, but
in commit a218cc49 ("epoll: use rwlock in order to reduce
ep_poll_callback() contention") a new rw lock was introduced in order to
relax fd event path, i.e. callers of ep_poll_callback() function.

After the change ep_modify and ep_insert (both are called on epoll_ctl()
path) were switched to ep->lock, but ep_poll (epoll_wait) was using
ep->wq.lock on wqueue list modification.

The bug doesn't lead to any wqueue list corruptions, because wake up
path and list modifications were serialized by ep->wq.lock internally,
but the actual waitqueue_active() check prior to the wake_up() call can be
reordered with modifications of the ep ready list, thus a wake up can be lost.

And yes, this can be healed by an explicit smp_mb():

  list_add_tail(&epi->rdlink, &ep->rdllist);
  smp_mb();
  if (waitqueue_active(&ep->wq))
	wake_up(&ep->wq);

But let's keep it simple: the current patch replaces ep->wq.lock with
ep->lock for wqueue modifications, so the wake up path always observes
the activeness of the wqueue correctly.

Fixes: a218cc49 ("epoll: use rwlock in order to reduce ep_poll_callback() contention")
Reported-by: Max Neunhoeffer <max@arangodb.com>
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Max Neunhoeffer <max@arangodb.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Christopher Kohlhoff <chris.kohlhoff@clearpool.io>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Jes Sorensen <jes.sorensen@gmail.com>
Cc: <stable@vger.kernel.org>	[5.1+]
Link: http://lkml.kernel.org/r/20200214170211.561524-1-rpenyaev@suse.de
References: https://bugzilla.kernel.org/show_bug.cgi?id=205933
Bisected-by: Max Neunhoeffer <max@arangodb.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1b53734b

mm: do not allow MADV_PAGEOUT for CoW pages · 12e967fd

Michal Hocko authored Mar 21, 2020

Jann has brought up a very interesting point [1].  While shared pages
are excluded from MADV_PAGEOUT normally, CoW pages can be easily
reclaimed that way.  This can lead to all sorts of hard to debug
problems.  E.g.  performance problems outlined by Daniel [2].

There are runtime environments where a substantial amount of memory is
shared among security domains via CoW, and an easy way to reclaim that
memory, which MADV_{COLD,PAGEOUT} offers, can lead either to
performance degradation for the parent process, which might be more
privileged, or even to side channel attacks.

The feasibility of the latter is not really clear to me TBH but there is
no real reason for exposure at this stage.  It seems there is no real
use case to depend on reclaiming CoW memory via madvise at this stage so
it is much easier to simply disallow it and this is what this patch
does.  Put simply, MADV_{PAGEOUT,COLD} can operate only on
exclusively owned memory, which is a straightforward semantic.

[1] http://lkml.kernel.org/r/CAG48ez0G3JkMq61gUmyQAaCq=_TwHbi1XKzWRooxZkv08PQKuw@mail.gmail.com
[2] http://lkml.kernel.org/r/CAKOZueua_v8jHCpmEtTB6f3i9e2YnmX4mqdYVWhV4E=Z-n+zRQ@mail.gmail.com

Fixes: 9c276cc6 ("mm: introduce MADV_COLD")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Daniel Colascione <dancol@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200312082248.GS23944@dhcp22.suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

12e967fd

mm, memcg: throttle allocators based on ancestral memory.high · e26733e0

Chris Down authored Mar 21, 2020

Prior to this commit, we only directly check the affected cgroup's
memory.high against its usage.  However, it's possible that we are being
reclaimed as a result of hitting an ancestor memory.high and should be
penalised based on that, instead.

This patch changes memory.high overage throttling to use the largest
overage in its ancestors when considering how many penalty jiffies to
charge.  This makes sure that we penalise poorly behaving cgroups in the
same way regardless of at what level of the hierarchy memory.high was
breached.

Fixes: 0e4b01df ("mm, memcg: throttle allocators when failing reclaim over memory.high")
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>	[5.4.x+]
Link: http://lkml.kernel.org/r/8cd132f84bd7e16cdb8fde3378cdbf05ba00d387.1584036142.git.chris@chrisdown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e26733e0
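
The idea of taking the largest overage among ancestors can be sketched as a walk up the hierarchy. This is a simplified illustration with a hypothetical node type: it compares raw differences in pages, whereas the actual patch may scale or normalise the overage before turning it into penalty jiffies.

```c
#include <stddef.h>

/* Hypothetical cgroup node; field names are illustrative only. */
struct demo_cgroup {
	struct demo_cgroup *parent;
	unsigned long usage;  /* pages in use */
	unsigned long high;   /* memory.high limit in pages */
};

/* Return the largest (usage - high) found from cg up to the root,
 * or 0 if no level is over its limit. */
static unsigned long demo_max_overage(const struct demo_cgroup *cg)
{
	unsigned long max = 0;

	for (; cg != NULL; cg = cg->parent) {
		if (cg->usage > cg->high && cg->usage - cg->high > max)
			max = cg->usage - cg->high;
	}
	return max;
}
```

In this model a leaf that is under its own limit is still penalised when an ancestor is over its limit, which is the behaviour the commit message asks for.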

mm, memcg: fix corruption on 64-bit divisor in memory.high throttling · d397a45f

Chris Down authored Mar 21, 2020

Commit 0e4b01df had a bunch of fixups to use the right division
method.  However, it seems that after all that it still wasn't right --
div_u64 takes a 32-bit divisor.

The headroom is still large (2^32 pages), so on mundane systems you
won't hit this, but this should definitely be fixed.

Fixes: 0e4b01df ("mm, memcg: throttle allocators when failing reclaim over memory.high")
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: <stable@vger.kernel.org>	[5.4.x+]
Link: http://lkml.kernel.org/r/80780887060514967d414b3cd91f9a316a16ab98.1584036142.git.chris@chrisdown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d397a45f
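
The truncation hazard can be reproduced in user space. The two helpers below are stand-ins that only mirror the parameter widths described in the commit message (a 32-bit divisor versus a 64-bit one); they are not the kernel's implementations.

```c
#include <stdint.h>

/* Stand-in for a helper whose divisor parameter is only 32 bits wide:
 * a wider value passed by the caller loses its upper bits. */
static uint64_t demo_div_u64(uint64_t dividend, uint32_t divisor)
{
	return dividend / divisor;
}

/* Stand-in for a helper that accepts a full 64-bit divisor. */
static uint64_t demo_div64_u64(uint64_t dividend, uint64_t divisor)
{
	return dividend / divisor;
}
```

For example, a divisor of 2^32 + 2 is seen as 2 by the first helper, so 2^34 divided by it yields 2^33 instead of 3. If the low 32 bits happened to be zero, the truncated call would divide by zero.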