- 25 Mar, 2020 2 commits
-
-
Srinivas Kandagatla authored
As we are planning to move to use sysfs is_bin_visible callback, having root_only as part of nvmem_device will help decide correct permissions. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20200325122116.15096-2-srinivas.kandagatla@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Greg Kroah-Hartman authored
Merge tag 'extcon-next-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next Chanwoo writes: Update extcon for 5.7 Detailed description for this pull request: 1. Update the extcon provider driver as following: - Add wakeup support for extcon-axp288.c - Clean-up code of -EPROBE_DEFER error case for extcon-palmas.c - Covert extcon-usbc-cros-ec.txt to yaml format 2. Export symbol of extcon_get_edev_name() * tag 'extcon-next-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon: extcon: axp288: Add wakeup support extcon: Mark extcon_get_edev_name() function as exported symbol extcon: palmas: Hide error messages if gpio returns -EPROBE_DEFER dt-bindings: extcon: usbc-cros-ec: convert extcon-usbc-cros-ec.txt to yaml format
-
- 24 Mar, 2020 27 commits
-
-
Hans de Goede authored
On devices with an AXP288, we need to wakeup from suspend when a charger is plugged in, so that we can do charger-type detection and so that the axp288-charger driver, which listens for our extcon events, can configure the input-current-limit accordingly. Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
-
Mayank Rana authored
extcon_get_edev_name() function provides client driver to request extcon dev's name. If extcon driver and client driver are compiled as loadable modules, extcon_get_edev_name() function symbol is not visible to client driver. Hence mark extcon_find_edev_name() function as exported symbol. Signed-off-by: Mayank Rana <mrana@codeaurora.org> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
-
H. Nikolaus Schaller authored
If the gpios are probed after this driver (e.g. if they come from an i2c expander) there is no need to print an error message. Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
-
Dafna Hirschfeld authored
convert the binding file extcon-usbc-cros-ec.txt to yaml format extcon-usbc-cros-ec.yaml This was tested and verified on ARM with: make dt_binding_check DT_SCHEMA_FILES=Documentation/devicetree/bindings/extcon/extcon-usbc-cros-ec.yaml make dtbs_check DT_SCHEMA_FILES=Documentation/devicetree/bindings/extcon/extcon-usbc-cros-ec.yaml Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com> Reviewed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
-
Manivannan Sadhasivam authored
The module owner field can be used to prevent the removal of kernel modules when there are any device files associated with it opened in userspace. Hence, modify the API to pass module owner field. For convenience, module_mhi_driver() macro is used which takes care of passing the module owner through THIS_MODULE of the module of the driver and also avoiding the use of specifying the default MHI client driver register/unregister routines. Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20200324061050.14845-2-manivannan.sadhasivam@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alexander Shishkin authored
Some use cases prefer to keep collecting the trace data into the last available window while the other windows are being offloaded instead of stopping the trace. In this scenario, the window switch happens automatically when the next window becomes available again. Add an option to allow this and a sysfs attribute to enable it. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20200319085152.52183-1-alexander.shishkin@linux.intel.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Randy Dunlap authored
Fix printk format warning by using %z for size_t modifier: ../drivers/bus/mhi/core/boot.c: In function `mhi_rddm_prepare': ../drivers/bus/mhi/core/boot.c:55:15: warning: format `%lx' expects argument of type `long unsigned int', but argument 5 has type `size_t {aka unsigned int}' [-Wformat=] dev_dbg(dev, "Address: %p and len: 0x%lx sequence: %u ", Link: http://lkml.kernel.org/r/c4852a82-cdb9-6318-70a4-96ccb4ba5af2@infradead.org Fixes: 6fdfdd27 ("bus: mhi: core: Add support for downloading RDDM image during panic") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Hemant Kumar <hemantk@codeaurora.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20200324022505.UiPPJZVXX%akpm@linux-foundation.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Greg Kroah-Hartman authored
Merge tag 'misc-habanalabs-next-2020-03-24' of git://people.freedesktop.org/~gabbayo/linux into char-misc-next Oded writes: This tag contains the following changes for kernel 5.7: - MMU code improvements that includes: - Flush MMU TLB cache only once, at the end of mapping/unmapping function, instead of flushing after mapping of every page. - Add future ASIC support by splitting properties of ASIC capabilities regarding mapping of host memory to regular and huge pages. - Add debugfs interface to write and read 64-bit values from the device's memory/registers. Previously the driver provided interface for 32-bit values and this will allow the user to debug much more quickly. We saw it gives a boost of around 1.5 - 1.7 when reading internal memories. - Support temperature offset via sysfs as defined in https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface - Display historical maximum of various sensors. - Print to kernel log when clock throttling occurs to due breach of power or thermal envelope. Also prints when clock throttling is finished (clock is back to optimal). - Fix bug when moving from manual to auto power-management mode. - Print a message ("unsupported device") to kernel log in case a GAUDI device is recognized. - Small bug fixes and minor improvements to code. * tag 'misc-habanalabs-next-2020-03-24' of git://people.freedesktop.org/~gabbayo/linux: habanalabs: fix pm manual->auto in GOYA habanalabs: show unsupported message for GAUDI habanalabs: add print upon clock change habanalabs: update goya firmware register map habanalabs: Add missing annotation for goya_hw_queues_unlock() habanalabs: Add missing annotation for goya_hw_queues_lock() habanalabs: Remove unused parse_cnt variable habanalabs: provide historical maximum of various sensors habanalabs: modify the return values of hl_read/write routines habanalabs: support temperature offset via sysfs habanalabs: ratelimit error prints of IRQs habanalabs: add debugfs write64/read64 habanalabs: fix DDR bar address setting habanalabs: removing extra ; habanalabs: Avoid running restore chunks if no execute chunks habanalabs: Modify CS jobs counter to u16 habanalabs: split the host MMU properties habanalabs: use the user CB size as a default job size habanalabs: flush only at the end of the map/unmap
-
Oded Gabbay authored
When moving from manual to automatic power management mode in GOYA, the driver didn't correctly place the device in LOW power mode. As a result, if an application was run immediately after the move, it would have run with low frequencies. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Oded Gabbay authored
If a GAUDI device is present in the system, display an error message that it is not supported by the current kernel. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Omer Shpigelman authored
Add print upon clock slow down due to power consumption or overheating. In addition, add print when back to optimal clock. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Oded Gabbay authored
Use specific values in enum of register map to be able to deprecate old values. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Jules Irenge authored
Sparse reports a warning at goya_hw_queues_unlock() warning: context imbalance in goya_hw_queues_unlock() - unexpected unlock The root cause is a missing annotation at goya_hw_queues_unlock() Add the missing __releases(&goya->hw_queues_lock) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Jules Irenge authored
Sparse reports a warning at goya_hw_queues_lock() warning: context imbalance in goya_hw_queues_lock() - wrong count at exit The root cause is a missing annotation at goya_hw_queues_lock() Add the missing __acquires(&goya->hw_queues_lock) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Tomer Tayar authored
The "parse_cnt" variable is incremented while validating the CS chunks, but it is actually not being used. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Christine Gharzuzi authored
Add support for hwmon_in_highest, hwmon_temp_highest and hwmon_curr_highest attributes. These attributes retrieve the historical maximum voltage, temperature and current that were sampled, respectively. Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Moti Haimovski authored
The hl read and write routines implement the hwmon_ops read and write interface routines respectively. These routines are expected to return a completion status when called, which was not the case until this commit. This commit modifies these routines to return 0 upon success and a negative error value upon failure. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Moti Haimovski authored
This commit adds support for offsetting the temperatures reading by a specified value as defined in https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface using the standard sysfs defined for hwmon. This is required by system administrators to inject errors to test their monitoring applications in data centers. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Oded Gabbay authored
The compute engines can perform millions of transactions per second. If there is a bug in the S/W stack, we could get a lot of interrupts and spam the kernel log. Therefore, ratelimit these prints Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Moti Haimovski authored
Allow debug user to write/read 64-bit data through debugfs. This will expedite the dump process of the (large) internal memories of the device done during debug. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Omer Shpigelman authored
DRAM_PHYS_BASE is already taken into account in MMU_PAGE_TABLES_ADDR. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Oded Gabbay authored
There is an extra ; after the end of a function, which needs to be removed Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>
-
Tomer Tayar authored
CS with no chunks for execute phase is invalid, so its context_switch/restore phase should not be run. Hence, move the check of the execute chunks number to the beginning of hl_cs_ioctl(). Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Tomer Tayar authored
As HL_MAX_JOBS_PER_CS is 512, it is possible that more than 255 CS jobs will be submitted for a certain queue. Hence, modify the "jobs_in_queue_cnt" parameter of the "hl_cs" structure to be u16 instead of u8. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Omer Shpigelman authored
Host memory may be allocated with huge pages. A different virtual range may be used for mapping in this case. Add Huge PCI MMU (HPMMU) properties to support it. This patch is a prerequisite for future ASICs support and has no effect on Goya ASIC as currently a single virtual host range is used for all page sizes. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Omer Shpigelman authored
When no patched command buffer (CB) is created, use the user CB size as the job size. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Pawel Piskorski authored
Optimize hl_mmu_map and hl_mmu_unmap by not calling flush(ctx) within per-page loop. Signed-off-by: Pawel Piskorski <ppiskorski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 23 Mar, 2020 7 commits
-
-
Anson Huang authored
Use devm_add_action_or_reset() for cleanup to call clk_unprepare(), which can simplify the error handling in .probe, and .remove callback can be dropped. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20200323150007.7487-5-srinivas.kandagatla@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Baolin Wang authored
We've saved the double data flag in the device data, so we should use it when programming a block. Signed-off-by: Baolin Wang <baolin.wang7@gmail.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20200323150007.7487-4-srinivas.kandagatla@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Freeman Liu authored
We have some cases that will programme the eFuse block partially multiple times, so we should allow the block to be programmed again if it was programmed partially. But we should lock the block if the whole block was programmed. Thus add a condition to validate if we need lock the block or not. Moreover we only enable the auto-check function when locking the block. Signed-off-by: Freeman Liu <freeman.liu@unisoc.com> Signed-off-by: Baolin Wang <baolin.wang7@gmail.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20200323150007.7487-3-srinivas.kandagatla@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Freeman Liu authored
According to the Spreadtrum eFuse specification, we should write 0 to the block to trigger the lock operation. Fixes: 096030e7 ("nvmem: sprd: Add Spreadtrum SoCs eFuse support") Cc: stable <stable@vger.kernel.org> Signed-off-by: Freeman Liu <freeman.liu@unisoc.com> Signed-off-by: Baolin Wang <baolin.wang7@gmail.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20200323150007.7487-2-srinivas.kandagatla@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Greg Kroah-Hartman authored
Merge tag 'soundwire-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire into char-misc-next Vinod writes: soundwire updates for v5.7-rc1 This contains updates to stream and pm handling in the core as well as updates to Intel drivers for hw sequencing and multi-link. Details: Core: - Updates to stream handling for state machine checks - Changes to handle potential races for probe/enumeration and init of the bus - Add no pm version of read and writes - Support for multiple Slave on same link - Add read_only_wordlength for simple/reduced ports Intel: - Updates to cadence lib to handle hw sequencing - Support for audio dai calls in intel driver - Multi link support for cadence lib Qualcomm: - Support for get_sdw_stream() * tag 'soundwire-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: (43 commits) soundwire: qcom: add support for get_sdw_stream() soundwire: stream: Add read_only_wordlength flag to port properties soundwire: cadence: clear FIFO to avoid pop noise issue on playback start soundwire: cadence: multi-link support soundwire: cadence: commit changes in the exit_reset() sequence soundwire: cadence: remove automatic command retries soundwire: cadence: remove PREQ_DELAY assignment soundwire: cadence: enable NORMAL operation in cdns_init() soundwire: cadence: reorder MCP_CONFIG settings soundwire: cadence: make SSP interval programmable soundwire: cadence: move clock/SSP related inits to dedicated function soundwire: cadence: merge routines to clear/set bits soundwire: cadence: mask Slave interrupt before stopping clock soundwire: cadence: fix a io timeout issue in S3 test soundwire: cadence: add clock_stop/restart routines soundwire: cadence: handle error cases with CONFIG_UPDATE soundwire: cadence: add interface to check clock status soundwire: cadence: simplifiy cdns_init() soundwire: cadence: s/update_config/config_update soundwire: stream: use sdw_write instead of update ...
-
Greg Kroah-Hartman authored
We need the char/misc driver fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Linus Torvalds authored
-
- 22 Mar, 2020 4 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linuxLinus Torvalds authored
Pull btrfs fixes from David Sterba: "Two fixes. The first is a regression: when dropping some incompat bits the conditions were reversed. The other is a fix for rename whiteout potentially leaving stack memory linked to a list" * tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix removal of raid[56|1c34} incompat flags after removing block group btrfs: fix log context list corruption after rename whiteout error
-
Linus Torvalds authored
Merge misc fixes from Andrew Morton: "10 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: x86/mm: split vmalloc_sync_all() mm, slub: prevent kmalloc_node crashes and memory leaks mm/mmu_notifier: silence PROVE_RCU_LIST warnings epoll: fix possible lost wakeup on epoll_ctl() path mm: do not allow MADV_PAGEOUT for CoW pages mm, memcg: throttle allocators based on ancestral memory.high mm, memcg: fix corruption on 64-bit divisor in memory.high throttling page-flags: fix a crash at SetPageError(THP_SWAP) mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event
-
Joerg Roedel authored
Commit 3f8fd02b ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in the vunmap() code-path. While this change was necessary to maintain correctness on x86-32-pae kernels, it also adds additional cycles for architectures that don't need it. Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported severe performance regressions in micro-benchmarks because it now also calls the x86-64 implementation of vmalloc_sync_all() on vunmap(). But the vmalloc_sync_all() implementation on x86-64 is only needed for newly created mappings. To avoid the unnecessary work on x86-64 and to gain the performance back, split up vmalloc_sync_all() into two functions: * vmalloc_sync_mappings(), and * vmalloc_sync_unmappings() Most call-sites to vmalloc_sync_all() only care about new mappings being synchronized. The only exception is the new call-site added in the above mentioned commit. Shile Zhang directed us to a report of an 80% regression in reaim throughput. Fixes: 3f8fd02b ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()") Reported-by: kernel test robot <oliver.sang@intel.com> Reported-by: Shile Zhang <shile.zhang@linux.alibaba.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Borislav Petkov <bp@suse.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [GHES] Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20191009124418.8286-1-joro@8bytes.org Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/ Link: http://lkml.kernel.org/r/20191113095530.228959-1-shile.zhang@linux.alibaba.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
Sachin reports [1] a crash in SLUB __slab_alloc(): BUG: Kernel NULL pointer dereference on read at 0x000073b0 Faulting instruction address: 0xc0000000003d55f4 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 19 PID: 1 Comm: systemd Not tainted 5.6.0-rc2-next-20200218-autotest #1 NIP: c0000000003d55f4 LR: c0000000003d5b94 CTR: 0000000000000000 REGS: c0000008b37836d0 TRAP: 0300 Not tainted (5.6.0-rc2-next-20200218-autotest) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24004844 XER: 00000000 CFAR: c00000000000dec4 DAR: 00000000000073b0 DSISR: 40000000 IRQMASK: 1 GPR00: c0000000003d5b94 c0000008b3783960 c00000000155d400 c0000008b301f500 GPR04: 0000000000000dc0 0000000000000002 c0000000003443d8 c0000008bb398620 GPR08: 00000008ba2f0000 0000000000000001 0000000000000000 0000000000000000 GPR12: 0000000024004844 c00000001ec52a00 0000000000000000 0000000000000000 GPR16: c0000008a1b20048 c000000001595898 c000000001750c18 0000000000000002 GPR20: c000000001750c28 c000000001624470 0000000fffffffe0 5deadbeef0000122 GPR24: 0000000000000001 0000000000000dc0 0000000000000002 c0000000003443d8 GPR28: c0000008b301f500 c0000008bb398620 0000000000000000 c00c000002287180 NIP ___slab_alloc+0x1f4/0x760 LR __slab_alloc+0x34/0x60 Call Trace: ___slab_alloc+0x334/0x760 (unreliable) __slab_alloc+0x34/0x60 __kmalloc_node+0x110/0x490 kvmalloc_node+0x58/0x110 mem_cgroup_css_online+0x108/0x270 online_css+0x48/0xd0 cgroup_apply_control_enable+0x2ec/0x4d0 cgroup_mkdir+0x228/0x5f0 kernfs_iop_mkdir+0x90/0xf0 vfs_mkdir+0x110/0x230 do_mkdirat+0xb0/0x1a0 system_call+0x5c/0x68 This is a PowerPC platform with following NUMA topology: available: 2 nodes (0-1) node 0 cpus: node 0 size: 0 MB node 0 free: 0 MB node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 node 1 size: 35247 MB node 1 free: 30907 MB node distances: node 0 1 0: 10 40 1: 40 10 possible numa nodes: 0-31 This only happens with a mmotm patch "mm/memcontrol.c: allocate shrinker_map on appropriate NUMA node" [2] which effectively calls kmalloc_node for each possible node. SLUB however only allocates kmem_cache_node on online N_NORMAL_MEMORY nodes, and relies on node_to_mem_node to return such valid node for other nodes since commit a561ce00 ("slub: fall back to node_to_mem_node() node if allocating on memoryless node"). This is however not true in this configuration where the _node_numa_mem_ array is not initialized for nodes 0 and 2-31, thus it contains zeroes and get_partial() ends up accessing non-allocated kmem_cache_node. A related issue was reported by Bharata (originally by Ramachandran) [3] where a similar PowerPC configuration, but with mainline kernel without patch [2] ends up allocating large amounts of pages by kmalloc-1k kmalloc-512. This seems to have the same underlying issue with node_to_mem_node() not behaving as expected, and might probably also lead to an infinite loop with CONFIG_SLUB_CPU_PARTIAL [4]. This patch should fix both issues by not relying on node_to_mem_node() anymore and instead simply falling back to NUMA_NO_NODE, when kmalloc_node(node) is attempted for a node that's not online, or has no usable memory. The "usable memory" condition is also changed from node_present_pages() to N_NORMAL_MEMORY node state, as that is exactly the condition that SLUB uses to allocate kmem_cache_node structures. The check in get_partial() is removed completely, as the checks in ___slab_alloc() are now sufficient to prevent get_partial() being reached with an invalid node. [1] https://lore.kernel.org/linux-next/3381CD91-AB3D-4773-BA04-E7A072A63968@linux.vnet.ibm.com/ [2] https://lore.kernel.org/linux-mm/fff0e636-4c36-ed10-281c-8cdb0687c839@virtuozzo.com/ [3] https://lore.kernel.org/linux-mm/20200317092624.GB22538@in.ibm.com/ [4] https://lore.kernel.org/linux-mm/088b5996-faae-8a56-ef9c-5b567125ae54@suse.cz/ Fixes: a561ce00 ("slub: fall back to node_to_mem_node() node if allocating on memoryless node") Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Reported-by: PUVICHAKRAVARTHY RAMACHANDRAN <puvichakravarthy@in.ibm.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Tested-by: Bharata B Rao <bharata@linux.ibm.com> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Christopher Lameter <cl@linux.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200320115533.9604-1-vbabka@suse.czDebugged-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-