Commits · e6de179d7a88b833ccadd18da5099d435acdac65 · nexedi / linux

25 Mar, 2020 2 commits

nvmem: core: add root_only member to nvmem device struct · e6de179d

Srinivas Kandagatla authored Mar 25, 2020

As we are planning to move to use sysfs is_bin_visible callback,
having root_only as part of nvmem_device will help decide correct
permissions.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200325122116.15096-2-srinivas.kandagatla@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e6de179d

Merge tag 'extcon-next-for-5.7' of... · b83f6877

Greg Kroah-Hartman authored Mar 25, 2020

Merge tag 'extcon-next-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next

Chanwoo writes:

Update extcon for 5.7

Detailed description for this pull request:
1. Update the extcon provider driver as following:
- Add wakeup support for extcon-axp288.c
- Clean-up code of -EPROBE_DEFER error case for extcon-palmas.c
- Covert extcon-usbc-cros-ec.txt to yaml format
2. Export symbol of extcon_get_edev_name()

* tag 'extcon-next-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon:
  extcon: axp288: Add wakeup support
  extcon: Mark extcon_get_edev_name() function as exported symbol
  extcon: palmas: Hide error messages if gpio returns -EPROBE_DEFER
  dt-bindings: extcon: usbc-cros-ec: convert extcon-usbc-cros-ec.txt to yaml format

b83f6877

24 Mar, 2020 27 commits

extcon: axp288: Add wakeup support · 9c945530

Hans de Goede authored Mar 23, 2020

On devices with an AXP288, we need to wakeup from suspend when a charger
is plugged in, so that we can do charger-type detection and so that the
axp288-charger driver, which listens for our extcon events, can configure
the input-current-limit accordingly.

Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>

9c945530

extcon: Mark extcon_get_edev_name() function as exported symbol · 995bb109

Mayank Rana authored Mar 16, 2020

extcon_get_edev_name() function provides client driver to request
extcon dev's name. If extcon driver and client driver are compiled
as loadable modules, extcon_get_edev_name() function symbol is not
visible to client driver. Hence mark extcon_find_edev_name() function
as exported symbol.
Signed-off-by: Mayank Rana <mrana@codeaurora.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>

995bb109

extcon: palmas: Hide error messages if gpio returns -EPROBE_DEFER · 3426ad6d

H. Nikolaus Schaller authored Feb 17, 2020

If the gpios are probed after this driver (e.g. if they
come from an i2c expander) there is no need to print an
error message.
Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>

3426ad6d

dt-bindings: extcon: usbc-cros-ec: convert extcon-usbc-cros-ec.txt to yaml format · 1d279047

Dafna Hirschfeld authored Feb 13, 2020

convert the binding file extcon-usbc-cros-ec.txt to
yaml format extcon-usbc-cros-ec.yaml

This was tested and verified on ARM with:
make dt_binding_check DT_SCHEMA_FILES=Documentation/devicetree/bindings/extcon/extcon-usbc-cros-ec.yaml
make dtbs_check DT_SCHEMA_FILES=Documentation/devicetree/bindings/extcon/extcon-usbc-cros-ec.yaml
Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
Reviewed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>

1d279047

bus: mhi: core: Pass module owner during client driver registration · 82174738

Manivannan Sadhasivam authored Mar 24, 2020

The module owner field can be used to prevent the removal of kernel
modules when there are any device files associated with it opened in
userspace. Hence, modify the API to pass module owner field. For
convenience, module_mhi_driver() macro is used which takes care of
passing the module owner through THIS_MODULE of the module of the
driver and also avoiding the use of specifying the default MHI client
driver register/unregister routines.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20200324061050.14845-2-manivannan.sadhasivam@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

82174738

intel_th: msu: Make stopping the trace optional · 8622dfef

Alexander Shishkin authored Mar 19, 2020

Some use cases prefer to keep collecting the trace data into the last
available window while the other windows are being offloaded instead of
stopping the trace. In this scenario, the window switch happens
automatically when the next window becomes available again.

Add an option to allow this and a sysfs attribute to enable it.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200319085152.52183-1-alexander.shishkin@linux.intel.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

8622dfef

bus/mhi: fix printk format for size_t · 3baf89ab

Randy Dunlap authored Mar 23, 2020

Fix printk format warning by using %z for size_t modifier:

../drivers/bus/mhi/core/boot.c: In function `mhi_rddm_prepare':
../drivers/bus/mhi/core/boot.c:55:15: warning: format `%lx' expects argument of type `long unsigned int', but argument 5 has type `size_t {aka unsigned int}' [-Wformat=]
  dev_dbg(dev, "Address: %p and len: 0x%lx sequence: %u
",

Link: http://lkml.kernel.org/r/c4852a82-cdb9-6318-70a4-96ccb4ba5af2@infradead.org
Fixes: 6fdfdd27 ("bus: mhi: core: Add support for downloading RDDM image during panic")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Hemant Kumar <hemantk@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20200324022505.UiPPJZVXX%akpm@linux-foundation.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3baf89ab

Merge tag 'misc-habanalabs-next-2020-03-24' of... · 9d20328d

Greg Kroah-Hartman authored Mar 24, 2020

Merge tag 'misc-habanalabs-next-2020-03-24' of git://people.freedesktop.org/~gabbayo/linux into char-misc-next

Oded writes:

This tag contains the following changes for kernel 5.7:

- MMU code improvements that includes:
  - Flush MMU TLB cache only once, at the end of mapping/unmapping
    function, instead of flushing after mapping of every page.
  - Add future ASIC support by splitting properties of ASIC capabilities
    regarding mapping of host memory to regular and huge pages.

- Add debugfs interface to write and read 64-bit values from the device's
  memory/registers. Previously the driver provided interface for 32-bit
  values and this will allow the user to debug much more quickly. We saw it
  gives a boost of around 1.5 - 1.7 when reading internal memories.

- Support temperature offset via sysfs as defined in
  https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface

- Display historical maximum of various sensors.

- Print to kernel log when clock throttling occurs to due breach of power
  or thermal envelope. Also prints when clock throttling is finished
  (clock is back to optimal).

- Fix bug when moving from manual to auto power-management mode.

- Print a message ("unsupported device") to kernel log in case a GAUDI device
  is recognized.

- Small bug fixes and minor improvements to code.

* tag 'misc-habanalabs-next-2020-03-24' of git://people.freedesktop.org/~gabbayo/linux:
  habanalabs: fix pm manual->auto in GOYA
  habanalabs: show unsupported message for GAUDI
  habanalabs: add print upon clock change
  habanalabs: update goya firmware register map
  habanalabs: Add missing annotation for goya_hw_queues_unlock()
  habanalabs: Add missing annotation for goya_hw_queues_lock()
  habanalabs: Remove unused parse_cnt variable
  habanalabs: provide historical maximum of various sensors
  habanalabs: modify the return values of hl_read/write routines
  habanalabs: support temperature offset via sysfs
  habanalabs: ratelimit error prints of IRQs
  habanalabs: add debugfs write64/read64
  habanalabs: fix DDR bar address setting
  habanalabs: removing extra ;
  habanalabs: Avoid running restore chunks if no execute chunks
  habanalabs: Modify CS jobs counter to u16
  habanalabs: split the host MMU properties
  habanalabs: use the user CB size as a default job size
  habanalabs: flush only at the end of the map/unmap

9d20328d

habanalabs: fix pm manual->auto in GOYA · 11845501

Oded Gabbay authored Mar 22, 2020

When moving from manual to automatic power management mode in GOYA, the
driver didn't correctly place the device in LOW power mode. As a result, if
an application was run immediately after the move, it would have run with
low frequencies.
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

11845501

habanalabs: show unsupported message for GAUDI · 6966d9e1

Oded Gabbay authored Mar 21, 2020

If a GAUDI device is present in the system, display an error message that
it is not supported by the current kernel.
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

6966d9e1

habanalabs: add print upon clock change · 4f0e6ab7

Omer Shpigelman authored Mar 09, 2020

Add print upon clock slow down due to power consumption or overheating.
In addition, add print when back to optimal clock.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

4f0e6ab7

habanalabs: update goya firmware register map · bc6ed3aa

Oded Gabbay authored Mar 05, 2020

Use specific values in enum of register map to be able to deprecate old
values.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

bc6ed3aa

habanalabs: Add missing annotation for goya_hw_queues_unlock() · 8a7a88c1

Jules Irenge authored Feb 23, 2020

Sparse reports a warning at goya_hw_queues_unlock()
warning: context imbalance in goya_hw_queues_unlock() - unexpected unlock
The root cause is a missing annotation at goya_hw_queues_unlock()
Add the missing __releases(&goya->hw_queues_lock) annotation
Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

8a7a88c1

habanalabs: Add missing annotation for goya_hw_queues_lock() · cf87f966

Jules Irenge authored Feb 23, 2020

Sparse reports a warning at goya_hw_queues_lock()
warning: context imbalance in goya_hw_queues_lock() - wrong count at exit
The root cause is a missing annotation at goya_hw_queues_lock()
Add the missing __acquires(&goya->hw_queues_lock) annotation
Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

cf87f966

habanalabs: Remove unused parse_cnt variable · b41e9728

Tomer Tayar authored Feb 25, 2020

The "parse_cnt" variable is incremented while validating the CS chunks,
but it is actually not being used.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

b41e9728

habanalabs: provide historical maximum of various sensors · 0da10e68

Christine Gharzuzi authored Jan 28, 2020

Add support for hwmon_in_highest, hwmon_temp_highest and hwmon_curr_highest
attributes. These attributes retrieve the historical maximum voltage,
temperature and current that were sampled, respectively.
Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

0da10e68

habanalabs: modify the return values of hl_read/write routines · d57b83c3

Moti Haimovski authored Jan 23, 2020

The hl read and write routines implement the hwmon_ops read and write
interface routines respectively.
These routines are expected to return a completion status when called,
which was not the case until this commit.
This commit modifies these routines to return 0 upon success and a
negative error value upon failure.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

d57b83c3

habanalabs: support temperature offset via sysfs · 5557b138

Moti Haimovski authored Jan 21, 2020

This commit adds support for offsetting the temperatures reading
by a specified value as defined in
https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface
using the standard sysfs defined for hwmon.
This is required by system administrators to inject errors to test
their monitoring applications in data centers.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

5557b138

habanalabs: ratelimit error prints of IRQs · e5509d52

Oded Gabbay authored Jan 16, 2020

The compute engines can perform millions of transactions per second. If
there is a bug in the S/W stack, we could get a lot of interrupts and spam
the kernel log. Therefore, ratelimit these prints
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

e5509d52

habanalabs: add debugfs write64/read64 · 5cce5146

Moti Haimovski authored Nov 12, 2019

Allow debug user to write/read 64-bit data through debugfs.
This will expedite the dump process of the (large) internal
memories of the device done during debug.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

5cce5146

habanalabs: fix DDR bar address setting · 0c002ceb

Omer Shpigelman authored Feb 06, 2020

DRAM_PHYS_BASE is already taken into account in MMU_PAGE_TABLES_ADDR.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

0c002ceb

habanalabs: removing extra ; · 7491c036

Oded Gabbay authored Jan 07, 2020

There is an extra ; after the end of a function, which needs to be removed
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>

7491c036

habanalabs: Avoid running restore chunks if no execute chunks · 1718a45b

Tomer Tayar authored Jan 05, 2020

CS with no chunks for execute phase is invalid, so its
context_switch/restore phase should not be run.
Hence, move the check of the execute chunks number to the beginning of
hl_cs_ioctl().
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

1718a45b

habanalabs: Modify CS jobs counter to u16 · f3a838c0

Tomer Tayar authored Jan 05, 2020

As HL_MAX_JOBS_PER_CS is 512, it is possible that more than 255 CS jobs
will be submitted for a certain queue. Hence, modify the
"jobs_in_queue_cnt" parameter of the "hl_cs" structure to be u16 instead
of u8.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

f3a838c0

habanalabs: split the host MMU properties · 64a7e295

Omer Shpigelman authored Jan 05, 2020

Host memory may be allocated with huge pages.
A different virtual range may be used for mapping in this case.
Add Huge PCI MMU (HPMMU) properties to support it.
This patch is a prerequisite for future ASICs support and has no effect on
Goya ASIC as currently a single virtual host range is used for all page
sizes.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

64a7e295

habanalabs: use the user CB size as a default job size · 240c92fd

Omer Shpigelman authored Dec 16, 2019

When no patched command buffer (CB) is created, use the user CB size as
the job size.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

240c92fd

habanalabs: flush only at the end of the map/unmap · 7fc40bca

Pawel Piskorski authored Dec 06, 2019

Optimize hl_mmu_map and hl_mmu_unmap by not calling flush(ctx)
within per-page loop.
Signed-off-by: Pawel Piskorski <ppiskorski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

7fc40bca

23 Mar, 2020 7 commits

nvmem: mxs-ocotp: Use devm_add_action_or_reset() for cleanup · bbde5709

Anson Huang authored Mar 23, 2020

Use devm_add_action_or_reset() for cleanup to call clk_unprepare(),
which can simplify the error handling in .probe, and .remove callback
can be dropped.
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200323150007.7487-5-srinivas.kandagatla@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bbde5709

nvmem: sprd: Determine double data programming from device data · 4bd5a15d

Baolin Wang authored Mar 23, 2020

We've saved the double data flag in the device data, so we should
use it when programming a block.
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200323150007.7487-4-srinivas.kandagatla@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

4bd5a15d

nvmem: sprd: Optimize the block lock operation · 5af25388

Freeman Liu authored Mar 23, 2020

We have some cases that will programme the eFuse block partially multiple
times, so we should allow the block to be programmed again if it was
programmed partially. But we should lock the block if the whole block
was programmed. Thus add a condition to validate if we need lock the
block or not.

Moreover we only enable the auto-check function when locking the block.
Signed-off-by: Freeman Liu <freeman.liu@unisoc.com>
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200323150007.7487-3-srinivas.kandagatla@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

5af25388

nvmem: sprd: Fix the block lock operation · c66ebde4

Freeman Liu authored Mar 23, 2020

According to the Spreadtrum eFuse specification, we should write 0 to
the block to trigger the lock operation.

Fixes: 096030e7 ("nvmem: sprd: Add Spreadtrum SoCs eFuse support")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Freeman Liu <freeman.liu@unisoc.com>
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200323150007.7487-2-srinivas.kandagatla@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c66ebde4

Merge tag 'soundwire-5.7-rc1' of... · 33e12f6e

Greg Kroah-Hartman authored Mar 23, 2020

Merge tag 'soundwire-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire into char-misc-next

Vinod writes:

soundwire updates for v5.7-rc1

This contains updates to stream and pm handling in the core as well as
updates to Intel drivers for hw sequencing and multi-link.

Details:
Core:
  - Updates to stream handling for state machine checks
  - Changes to handle potential races for probe/enumeration and init of the bus
  - Add no pm version of read and writes
  - Support for multiple Slave on same link
  - Add read_only_wordlength for simple/reduced ports

Intel:
  - Updates to cadence lib to handle hw sequencing
  - Support for audio dai calls in intel driver
  - Multi link support for cadence lib

Qualcomm:
  - Support for get_sdw_stream()

* tag 'soundwire-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: (43 commits)
  soundwire: qcom: add support for get_sdw_stream()
  soundwire: stream: Add read_only_wordlength flag to port properties
  soundwire: cadence: clear FIFO to avoid pop noise issue on playback start
  soundwire: cadence: multi-link support
  soundwire: cadence: commit changes in the exit_reset() sequence
  soundwire: cadence: remove automatic command retries
  soundwire: cadence: remove PREQ_DELAY assignment
  soundwire: cadence: enable NORMAL operation in cdns_init()
  soundwire: cadence: reorder MCP_CONFIG settings
  soundwire: cadence: make SSP interval programmable
  soundwire: cadence: move clock/SSP related inits to dedicated function
  soundwire: cadence: merge routines to clear/set bits
  soundwire: cadence: mask Slave interrupt before stopping clock
  soundwire: cadence: fix a io timeout issue in S3 test
  soundwire: cadence: add clock_stop/restart routines
  soundwire: cadence: handle error cases with CONFIG_UPDATE
  soundwire: cadence: add interface to check clock status
  soundwire: cadence: simplifiy cdns_init()
  soundwire: cadence: s/update_config/config_update
  soundwire: stream: use sdw_write instead of update
  ...

33e12f6e

Merge 5.6-rc7 into char-misc-next · baca54d9

Greg Kroah-Hartman authored Mar 23, 2020

We need the char/misc driver fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

baca54d9

Linux 5.6-rc7 · 16fbf79b
Linus Torvalds authored Mar 22, 2020

16fbf79b

22 Mar, 2020 4 commits

Merge tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 67d584e3

Linus Torvalds authored Mar 22, 2020

Pull btrfs fixes from David Sterba:
 "Two fixes.

  The first is a regression: when dropping some incompat bits the
  conditions were reversed. The other is a fix for rename whiteout
  potentially leaving stack memory linked to a list"

* tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix removal of raid[56|1c34} incompat flags after removing block group
  btrfs: fix log context list corruption after rename whiteout error

67d584e3

Merge branch 'akpm' (patches from Andrew) · b3c03db6

Linus Torvalds authored Mar 22, 2020

Merge misc fixes from Andrew Morton:
 "10 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  x86/mm: split vmalloc_sync_all()
  mm, slub: prevent kmalloc_node crashes and memory leaks
  mm/mmu_notifier: silence PROVE_RCU_LIST warnings
  epoll: fix possible lost wakeup on epoll_ctl() path
  mm: do not allow MADV_PAGEOUT for CoW pages
  mm, memcg: throttle allocators based on ancestral memory.high
  mm, memcg: fix corruption on 64-bit divisor in memory.high throttling
  page-flags: fix a crash at SetPageError(THP_SWAP)
  mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case
  memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event

b3c03db6

x86/mm: split vmalloc_sync_all() · 763802b5

Joerg Roedel authored Mar 21, 2020

Commit 3f8fd02b ("mm/vmalloc: Sync unmappings in
__purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in
the vunmap() code-path.  While this change was necessary to maintain
correctness on x86-32-pae kernels, it also adds additional cycles for
architectures that don't need it.

Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported
severe performance regressions in micro-benchmarks because it now also
calls the x86-64 implementation of vmalloc_sync_all() on vunmap().  But
the vmalloc_sync_all() implementation on x86-64 is only needed for newly
created mappings.

To avoid the unnecessary work on x86-64 and to gain the performance
back, split up vmalloc_sync_all() into two functions:

	* vmalloc_sync_mappings(), and
	* vmalloc_sync_unmappings()

Most call-sites to vmalloc_sync_all() only care about new mappings being
synchronized.  The only exception is the new call-site added in the
above mentioned commit.

Shile Zhang directed us to a report of an 80% regression in reaim
throughput.

Fixes: 3f8fd02b ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Borislav Petkov <bp@suse.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	[GHES]
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20191009124418.8286-1-joro@8bytes.org
Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/
Link: http://lkml.kernel.org/r/20191113095530.228959-1-shile.zhang@linux.alibaba.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

763802b5

mm, slub: prevent kmalloc_node crashes and memory leaks · 0715e6c5

Vlastimil Babka authored Mar 21, 2020

Sachin reports [1] a crash in SLUB __slab_alloc():

  BUG: Kernel NULL pointer dereference on read at 0x000073b0
  Faulting instruction address: 0xc0000000003d55f4
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 19 PID: 1 Comm: systemd Not tainted 5.6.0-rc2-next-20200218-autotest #1
  NIP:  c0000000003d55f4 LR: c0000000003d5b94 CTR: 0000000000000000
  REGS: c0000008b37836d0 TRAP: 0300   Not tainted  (5.6.0-rc2-next-20200218-autotest)
  MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24004844  XER: 00000000
  CFAR: c00000000000dec4 DAR: 00000000000073b0 DSISR: 40000000 IRQMASK: 1
  GPR00: c0000000003d5b94 c0000008b3783960 c00000000155d400 c0000008b301f500
  GPR04: 0000000000000dc0 0000000000000002 c0000000003443d8 c0000008bb398620
  GPR08: 00000008ba2f0000 0000000000000001 0000000000000000 0000000000000000
  GPR12: 0000000024004844 c00000001ec52a00 0000000000000000 0000000000000000
  GPR16: c0000008a1b20048 c000000001595898 c000000001750c18 0000000000000002
  GPR20: c000000001750c28 c000000001624470 0000000fffffffe0 5deadbeef0000122
  GPR24: 0000000000000001 0000000000000dc0 0000000000000002 c0000000003443d8
  GPR28: c0000008b301f500 c0000008bb398620 0000000000000000 c00c000002287180
  NIP ___slab_alloc+0x1f4/0x760
  LR __slab_alloc+0x34/0x60
  Call Trace:
    ___slab_alloc+0x334/0x760 (unreliable)
    __slab_alloc+0x34/0x60
    __kmalloc_node+0x110/0x490
    kvmalloc_node+0x58/0x110
    mem_cgroup_css_online+0x108/0x270
    online_css+0x48/0xd0
    cgroup_apply_control_enable+0x2ec/0x4d0
    cgroup_mkdir+0x228/0x5f0
    kernfs_iop_mkdir+0x90/0xf0
    vfs_mkdir+0x110/0x230
    do_mkdirat+0xb0/0x1a0
    system_call+0x5c/0x68

This is a PowerPC platform with following NUMA topology:

  available: 2 nodes (0-1)
  node 0 cpus:
  node 0 size: 0 MB
  node 0 free: 0 MB
  node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
  node 1 size: 35247 MB
  node 1 free: 30907 MB
  node distances:
  node   0   1
    0:  10  40
    1:  40  10

  possible numa nodes: 0-31

This only happens with a mmotm patch "mm/memcontrol.c: allocate
shrinker_map on appropriate NUMA node" [2] which effectively calls
kmalloc_node for each possible node.  SLUB however only allocates
kmem_cache_node on online N_NORMAL_MEMORY nodes, and relies on
node_to_mem_node to return such valid node for other nodes since commit
a561ce00 ("slub: fall back to node_to_mem_node() node if allocating
on memoryless node").  This is however not true in this configuration
where the _node_numa_mem_ array is not initialized for nodes 0 and 2-31,
thus it contains zeroes and get_partial() ends up accessing
non-allocated kmem_cache_node.

A related issue was reported by Bharata (originally by Ramachandran) [3]
where a similar PowerPC configuration, but with mainline kernel without
patch [2] ends up allocating large amounts of pages by kmalloc-1k
kmalloc-512.  This seems to have the same underlying issue with
node_to_mem_node() not behaving as expected, and might probably also
lead to an infinite loop with CONFIG_SLUB_CPU_PARTIAL [4].

This patch should fix both issues by not relying on node_to_mem_node()
anymore and instead simply falling back to NUMA_NO_NODE, when
kmalloc_node(node) is attempted for a node that's not online, or has no
usable memory.  The "usable memory" condition is also changed from
node_present_pages() to N_NORMAL_MEMORY node state, as that is exactly
the condition that SLUB uses to allocate kmem_cache_node structures.
The check in get_partial() is removed completely, as the checks in
___slab_alloc() are now sufficient to prevent get_partial() being
reached with an invalid node.

[1] https://lore.kernel.org/linux-next/3381CD91-AB3D-4773-BA04-E7A072A63968@linux.vnet.ibm.com/
[2] https://lore.kernel.org/linux-mm/fff0e636-4c36-ed10-281c-8cdb0687c839@virtuozzo.com/
[3] https://lore.kernel.org/linux-mm/20200317092624.GB22538@in.ibm.com/
[4] https://lore.kernel.org/linux-mm/088b5996-faae-8a56-ef9c-5b567125ae54@suse.cz/

Fixes: a561ce00 ("slub: fall back to node_to_mem_node() node if allocating on memoryless node")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Reported-by: PUVICHAKRAVARTHY RAMACHANDRAN <puvichakravarthy@in.ibm.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Tested-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christopher Lameter <cl@linux.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200320115533.9604-1-vbabka@suse.czDebugged-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0715e6c5