- 03 Nov, 2018 3 commits
-
-
Andrea Arcangeli authored
THP allocation might be really disruptive when allocated on NUMA system with the local node full or hard to reclaim. Stefan has posted an allocation stall report on 4.12 based SLES kernel which suggests the same issue: kvm: page allocation stalls for 194572ms, order:9, mode:0x4740ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null) kvm cpuset=/ mems_allowed=0-1 CPU: 10 PID: 84752 Comm: kvm Tainted: G W 4.12.0+98-ph <a href="/view.php?id=1" title="[geschlossen] Integration Ramdisk" class="resolved">0000001</a> SLE15 (unreleased) Hardware name: Supermicro SYS-1029P-WTRT/X11DDW-NT, BIOS 2.0 12/05/2017 Call Trace: dump_stack+0x5c/0x84 warn_alloc+0xe0/0x180 __alloc_pages_slowpath+0x820/0xc90 __alloc_pages_nodemask+0x1cc/0x210 alloc_pages_vma+0x1e5/0x280 do_huge_pmd_wp_page+0x83f/0xf00 __handle_mm_fault+0x93d/0x1060 handle_mm_fault+0xc6/0x1b0 __do_page_fault+0x230/0x430 do_page_fault+0x2a/0x70 page_fault+0x7b/0x80 [...] Mem-Info: active_anon:126315487 inactive_anon:1612476 isolated_anon:5 active_file:60183 inactive_file:245285 isolated_file:0 unevictable:15657 dirty:286 writeback:1 unstable:0 slab_reclaimable:75543 slab_unreclaimable:2509111 mapped:81814 shmem:31764 pagetables:370616 bounce:0 free:32294031 free_pcp:6233 free_cma:0 Node 0 active_anon:254680388kB inactive_anon:1112760kB active_file:240648kB inactive_file:981168kB unevictable:13368kB isolated(anon):0kB isolated(file):0kB mapped:280240kB dirty:1144kB writeback:0kB shmem:95832kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 81225728kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no Node 1 active_anon:250583072kB inactive_anon:5337144kB active_file:84kB inactive_file:0kB unevictable:49260kB isolated(anon):20kB isolated(file):0kB mapped:47016kB dirty:0kB writeback:4kB shmem:31224kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 31897600kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no The defrag mode is "madvise" and from the above report it is clear that the THP has been allocated for MADV_HUGEPAGA vma. Andrea has identified that the main source of the problem is __GFP_THISNODE usage: : The problem is that direct compaction combined with the NUMA : __GFP_THISNODE logic in mempolicy.c is telling reclaim to swap very : hard the local node, instead of failing the allocation if there's no : THP available in the local node. : : Such logic was ok until __GFP_THISNODE was added to the THP allocation : path even with MPOL_DEFAULT. : : The idea behind the __GFP_THISNODE addition, is that it is better to : provide local memory in PAGE_SIZE units than to use remote NUMA THP : backed memory. That largely depends on the remote latency though, on : threadrippers for example the overhead is relatively low in my : experience. : : The combination of __GFP_THISNODE and __GFP_DIRECT_RECLAIM results in : extremely slow qemu startup with vfio, if the VM is larger than the : size of one host NUMA node. This is because it will try very hard to : unsuccessfully swapout get_user_pages pinned pages as result of the : __GFP_THISNODE being set, instead of falling back to PAGE_SIZE : allocations and instead of trying to allocate THP on other nodes (it : would be even worse without vfio type1 GUP pins of course, except it'd : be swapping heavily instead). Fix this by removing __GFP_THISNODE for THP requests which are requesting the direct reclaim. This effectivelly reverts 5265047a on the grounds that the zone/node reclaim was known to be disruptive due to premature reclaim when there was memory free. While it made sense at the time for HPC workloads without NUMA awareness on rare machines, it was ultimately harmful in the majority of cases. The existing behaviour is similar, if not as widespare as it applies to a corner case but crucially, it cannot be tuned around like zone_reclaim_mode can. The default behaviour should always be to cause the least harm for the common case. If there are specialised use cases out there that want zone_reclaim_mode in specific cases, then it can be built on top. Longterm we should consider a memory policy which allows for the node reclaim like behavior for the specific memory ranges which would allow a [1] http://lkml.kernel.org/r/20180820032204.9591-1-aarcange@redhat.com Mel said: : Both patches look correct to me but I'm responding to this one because : it's the fix. The change makes sense and moves further away from the : severe stalling behaviour we used to see with both THP and zone reclaim : mode. : : I put together a basic experiment with usemem configured to reference a : buffer multiple times that is 80% the size of main memory on a 2-socket : box with symmetric node sizes and defrag set to "always". The defrag : setting is not the default but it would be functionally similar to : accessing a buffer with madvise(MADV_HUGEPAGE). Usemem is configured to : reference the buffer multiple times and while it's not an interesting : workload, it would be expected to complete reasonably quickly as it fits : within memory. The results were; : : usemem : vanilla noreclaim-v1 : Amean Elapsd-1 42.78 ( 0.00%) 26.87 ( 37.18%) : Amean Elapsd-3 27.55 ( 0.00%) 7.44 ( 73.00%) : Amean Elapsd-4 5.72 ( 0.00%) 5.69 ( 0.45%) : : This shows the elapsed time in seconds for 1 thread, 3 threads and 4 : threads referencing buffers 80% the size of memory. With the patches : applied, it's 37.18% faster for the single thread and 73% faster with two : threads. Note that 4 threads showing little difference does not indicate : the problem is related to thread counts. It's simply the case that 4 : threads gets spread so their workload mostly fits in one node. : : The overall view from /proc/vmstats is more startling : : 4.19.0-rc1 4.19.0-rc1 : vanillanoreclaim-v1r1 : Minor Faults 35593425 708164 : Major Faults 484088 36 : Swap Ins 3772837 0 : Swap Outs 3932295 0 : : Massive amounts of swap in/out without the patch : : Direct pages scanned 6013214 0 : Kswapd pages scanned 0 0 : Kswapd pages reclaimed 0 0 : Direct pages reclaimed 4033009 0 : : Lots of reclaim activity without the patch : : Kswapd efficiency 100% 100% : Kswapd velocity 0.000 0.000 : Direct efficiency 67% 100% : Direct velocity 11191.956 0.000 : : Mostly from direct reclaim context as you'd expect without the patch. : : Page writes by reclaim 3932314.000 0.000 : Page writes file 19 0 : Page writes anon 3932295 0 : Page reclaim immediate 42336 0 : : Writes from reclaim context is never good but the patch eliminates it. : : We should never have default behaviour to thrash the system for such a : basic workload. If zone reclaim mode behaviour is ever desired but on a : single task instead of a global basis then the sensible option is to build : a mempolicy that enforces that behaviour. This was a severe regression compared to previous kernels that made important workloads unusable and it starts when __GFP_THISNODE was added to THP allocations under MADV_HUGEPAGE. It is not a significant risk to go to the previous behavior before __GFP_THISNODE was added, it worked like that for years. This was simply an optimization to some lucky workloads that can fit in a single node, but it ended up breaking the VM for others that can't possibly fit in a single node, so going back is safe. [mhocko@suse.com: rewrote the changelog based on the one from Andrea] Link: http://lkml.kernel.org/r/20180925120326.24392-2-mhocko@kernel.org Fixes: 5265047a ("mm, thp: really limit transparent hugepage allocation to local node") Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Debugged-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Mel Gorman <mgorman@techsingularity.net> Tested-by: Mel Gorman <mgorman@techsingularity.net> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: <stable@vger.kernel.org> [4.1+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sam Protsenko authored
ctags indexing ("make tags" command) throws this warning: ctags: Warning: include/linux/notifier.h:125: null expansion of name pattern "\1" This is the result of DEFINE_PER_CPU() macro expansion. Fix that by getting rid of line break. Similar fix was already done in commit 25528213 ("tags: Fix DEFINE_PER_CPU expansions"), but this one probably wasn't noticed. Link: http://lkml.kernel.org/r/20181030202808.28027-1-semen.protsenko@linaro.org Fixes: 9c80172b ("kernel/SRCU: provide a static initializer") Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Roman Gushchin authored
Mike Galbraith reported a regression caused by the commit 9b6f7e16 ("mm: rework memcg kernel stack accounting") on a system with "cgroup_disable=memory" boot option: the system panics with the following stack trace: BUG: unable to handle kernel NULL pointer dereference at 00000000000000f8 PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP PTI CPU: 0 PID: 1 Comm: systemd Not tainted 4.19.0-preempt+ #410 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fed4 RIP: 0010:page_counter_try_charge+0x22/0xc0 Code: 41 5d c3 c3 0f 1f 40 00 0f 1f 44 00 00 48 85 ff 0f 84 a7 00 00 00 41 56 48 89 f8 49 89 fe 49 Call Trace: try_charge+0xcb/0x780 memcg_kmem_charge_memcg+0x28/0x80 memcg_kmem_charge+0x8b/0x1d0 copy_process.part.41+0x1ca/0x2070 _do_fork+0xd7/0x3d0 do_syscall_64+0x5a/0x180 entry_SYSCALL_64_after_hwframe+0x49/0xbe The problem occurs because get_mem_cgroup_from_current() returns the NULL pointer if memory controller is disabled. Let's check if this is a case at the beginning of memcg_kmem_charge() and just return 0 if mem_cgroup_disabled() returns true. This is how we handle this case in many other places in the memory controller code. Link: http://lkml.kernel.org/r/20181029215123.17830-1-guro@fb.com Fixes: 9b6f7e16 ("mm: rework memcg kernel stack accounting") Signed-off-by: Roman Gushchin <guro@fb.com> Reported-by: Mike Galbraith <efault@gmx.de> Acked-by: Rik van Riel <riel@surriel.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 02 Nov, 2018 22 commits
-
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block layer fixes from Jens Axboe: "The biggest part of this pull request is the revert of the blkcg cleanup series. It had one fix earlier for a stacked device issue, but another one was reported. Rather than play whack-a-mole with this, revert the entire series and try again for the next kernel release. Apart from that, only small fixes/changes. Summary: - Indentation fixup for mtip32xx (Colin Ian King) - The blkcg cleanup series revert (Dennis Zhou) - Two NVMe fixes. One fixing a regression in the nvme request initialization in this merge window, causing nvme-fc to not work. The other is a suspend/resume p2p resource issue (James, Keith) - Fix sg discard merge, allowing us to merge in cases where we didn't before (Jianchao Wang) - Call rq_qos_exit() after the queue is frozen, preventing a hang (Ming) - Fix brd queue setup, fixing an oops if we fail setting up all devices (Ming)" * tag 'for-linus-20181102' of git://git.kernel.dk/linux-block: nvme-pci: fix conflicting p2p resource adds nvme-fc: fix request private initialization blkcg: revert blkcg cleanups series block: brd: associate with queue until adding disk block: call rq_qos_exit() after queue is frozen mtip32xx: clean an indentation issue, remove extraneous tabs block: fix the DISCARD request merge
-
Linus Torvalds authored
Merge tag 'pwm/for-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "This series contains a number of improvements to existing drivers, such as LPSS. Some drivers, such as renesas-tpu and rcar get support for more SoC generations. To round things off this fixes an issue with the sysfs interface" * tag 'pwm/for-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: pwm: lpss: Only set update bit if we are actually changing the settings pwm: lpss: Force runtime-resume on suspend on Cherry Trail pwm: Enable TI ECAP driver for ARCH_K3 dt-bindings: pwm: tiecap: Add TI AM654 SoC specific compatible dt-bindings: pwm: rcar: Add r8a774a1 support pwm: Send a uevent on the pwmchip device upon channel sysfs (un)export Revert "pwm: Set class for exported channels in sysfs" dt-bindings: pwm: renesas-tpu: Document r8a7744 support dt-bindings: pwm: rcar: Add r8a7744 support dt-bindings: pwm: renesas: tpu: Document R8A779{7|8}0 bindings dt-bindings: pwm: renesas: pwm-rcar: Document R8A779{7|8}0 bindings dt-bindings: pwm: renesas: tpu: Fix "compatible" prop description pwm: Use SPDX identifier for Renesas drivers pwm: lpss: Add get_state callback pwm: lpss: Release runtime-pm reference from the driver's remove callback pwm: lpss: Check PWM powerstate after resume on Cherry Trail devices pwm: lpss: Move struct pwm_lpss_chip definition to the header file pwm: lpss: Add ACPI HID for second PWM controller on Cherry Trail devices ACPI / PM: Export acpi_device_get_power() for use by modular build drivers pwm: tegra: Remove gratuituous blank line
-
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds authored
Pull more EDAC updates from Borislav Petkov: "The second part of the EDAC pile which contains the ADXL user and a build fix which addresses a not-so-sensical .config but fixes randconfig builds people do: - skx_edac: Address translation for NVDIMMs (Tony Luck and Qiuxu Zhuo) - ACPI_ADXL build fix" [ I don't think "sensical" is a word, particularly when used in the context of actually meaning "nonsensical", but I like it - Linus ] * tag 'edac_for_4.20_2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, skx: Fix randconfig builds EDAC, skx_edac: Add address translation for non-volatile DIMMs
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds authored
Pull sound fixes from Takashi Iwai: "A few device-specific fixes: a fix for SPDIF on old Creative PCI board, and two additional fixes for the recent changes in FireWire audio stack" * tag 'sound-fix-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: firewire-lib: fix insufficient PCM rule for period/buffer size ALSA: ca0106: Disable IZD on SB0570 DAC to fix audio pops ALSA: dice: fix to wait for releases of all ALSA character devices
-
git://anongit.freedesktop.org/drm/drmLinus Torvalds authored
Pull drm fixes from Dave Airlie: "Pretty much a normal fixes pull pre-rc1, mostly amdgpu fixes, one i915 link training regression fix, and a couple of minor panel/bridge fixes and a panel quirk" * tag 'drm-next-2018-11-02' of git://anongit.freedesktop.org/drm/drm: (37 commits) drm/amdgpu: revert "enable gfxoff in non-sriov and stutter mode by default" drm/amd/pp: Print warning if od_sclk/mclk out of range drm/amd/pp: Fix pp_sclk/mclk_od not work on Vega10 drm/amd/pp: Fix pp_sclk/mclk_od not work on smu7 drm/amd/powerplay: no MGPU fan boost enablement on DPM disabled drm/amdgpu: Fix skipping hangged job reset during gpu recover. drm/amd/powerplay: revise Vega20 pptable version check drm/amd/display: set backlight level limit to 1 drm/panel: simple: Innolux TV123WAM is actually P120ZDG-BF1 dt-bindings: drm/panel: simple: Innolux TV123WAM is actually P120ZDG-BF1 drm/bridge: ti-sn65dsi86: Remove the mystery delay drm/panel: simple: Add "no-hpd" delay for Innolux TV123WAM drm/panel: simple: Support panels with HPD where HPD isn't connected dt-bindings: drm/panel: simple: Add no-hpd property drm/edid: Add 6 bpc quirk for BOE panel. drm/amdgpu: fix reporting of failed msg sent to SMU (v2) drm/amdgpu: Fix compute ring 1.0.0 failure after reset drm/amdgpu: fix VM leaf walking drm/amdgpu: fix amdgpu_vm_fini drm/amd/powerplay: commonize the API for retrieving current clocks ...
-
Linus Torvalds authored
Merge tag 'apparmor-pr-2018-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull apparmor updates from John Johansen: "Features/Improvements: - replace spin_is_locked() with lockdep - add base support for secmark labeling and matching Cleanups: - clean an indentation issue, remove extraneous space - remove no-op permission check in policy_unpack - fix checkpatch missing spaces error in Parse secmark policy - fix network performance issue in aa_label_sk_perm Bug fixes: - add #ifdef checks for secmark filtering - fix an error code in __aa_create_ns() - don't try to replace stale label in ptrace checks - fix failure to audit context info in build_change_hat - check buffer bounds when mapping permissions mask - fully initialize aa_perms struct when answering userspace query - fix uninitialized value in aa_split_fqname" * tag 'apparmor-pr-2018-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: apparmor: clean an indentation issue, remove extraneous space apparmor: fix checkpatch error in Parse secmark policy apparmor: add #ifdef checks for secmark filtering apparmor: Fix uninitialized value in aa_split_fqname apparmor: don't try to replace stale label in ptraceme check apparmor: Replace spin_is_locked() with lockdep apparmor: Allow filtering based on secmark policy apparmor: Parse secmark policy apparmor: Add a wildcard secid apparmor: don't try to replace stale label in ptrace access check apparmor: Fix network performance issue in aa_label_sk_perm
-
git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds authored
Pull vfs dedup fixes from Dave Chinner: "This reworks the vfs data cloning infrastructure. We discovered many issues with these interfaces late in the 4.19 cycle - the worst of them (data corruption, setuid stripping) were fixed for XFS in 4.19-rc8, but a larger rework of the infrastructure fixing all the problems was needed. That rework is the contents of this pull request. Rework the vfs_clone_file_range and vfs_dedupe_file_range infrastructure to use a common .remap_file_range method and supply generic bounds and sanity checking functions that are shared with the data write path. The current VFS infrastructure has problems with rlimit, LFS file sizes, file time stamps, maximum filesystem file sizes, stripping setuid bits, etc and so they are addressed in these commits. We also introduce the ability for the ->remap_file_range methods to return short clones so that clones for vfs_copy_file_range() don't get rejected if the entire range can't be cloned. It also allows filesystems to sliently skip deduplication of partial EOF blocks if they are not capable of doing so without requiring errors to be thrown to userspace. Existing filesystems are converted to user the new remap_file_range method, and both XFS and ocfs2 are modified to make use of the new generic checking infrastructure" * tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (28 commits) xfs: remove [cm]time update from reflink calls xfs: remove xfs_reflink_remap_range xfs: remove redundant remap partial EOF block checks xfs: support returning partial reflink results xfs: clean up xfs_reflink_remap_blocks call site xfs: fix pagecache truncation prior to reflink ocfs2: remove ocfs2_reflink_remap_range ocfs2: support partial clone range and dedupe range ocfs2: fix pagecache truncation prior to reflink ocfs2: truncate page cache for clone destination file before remapping vfs: clean up generic_remap_file_range_prep return value vfs: hide file range comparison function vfs: enable remap callers that can handle short operations vfs: plumb remap flags through the vfs dedupe functions vfs: plumb remap flags through the vfs clone functions vfs: make remap_file_range functions take and return bytes completed vfs: remap helper should update destination inode metadata vfs: pass remap flags to generic_remap_checks vfs: pass remap flags to generic_remap_file_range_prep vfs: combine the clone and dedupe into a single remap_file_range ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds authored
Pull powerpc fixes from Michael Ellerman: "Some things that I missed due to travel, or that came in late. Two fixes also going to stable: - A revert of a buggy change to the 8xx TLB miss handlers. - Our flushing of SPE (Signal Processing Engine) registers on fork was broken. Other changes: - A change to the KVM decrementer emulation to use proper APIs. - Some cleanups to the way we do code patching in the 8xx code. - Expose the maximum possible memory for the system in /proc/powerpc/lparcfg. - Merge some updates from Scott: "a couple device tree updates, and a fix for a missing prototype warning" A few other minor fixes and a handful of fixes for our selftests. Thanks to: Aravinda Prasad, Breno Leitao, Camelia Groza, Christophe Leroy, Felipe Rechia, Joel Stanley, Naveen N. Rao, Paul Mackerras, Scott Wood, Tyrel Datwyler" * tag 'powerpc-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (21 commits) selftests/powerpc: Fix compilation issue due to asm label selftests/powerpc/cache_shape: Fix out-of-tree build selftests/powerpc/switch_endian: Fix out-of-tree build selftests/powerpc/pmu: Link ebb tests with -no-pie selftests/powerpc/signal: Fix out-of-tree build selftests/powerpc/ptrace: Fix out-of-tree build powerpc/xmon: Relax frame size for clang selftests: powerpc: Fix warning for security subdir selftests/powerpc: Relax L1d miss targets for rfi_flush test powerpc/process: Fix flush_all_to_thread for SPE powerpc/pseries: add missing cpumask.h include file selftests/powerpc: Fix ptrace tm failure KVM: PPC: Use exported tb_to_ns() function in decrementer emulation powerpc/pseries: Export maximum memory value powerpc/8xx: Use patch_site for perf counters setup powerpc/8xx: Use patch_site for memory setup patching powerpc/code-patching: Add a helper to get the address of a patch_site Revert "powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP" powerpc/8xx: add missing header in 8xx_mmu.c powerpc/8xx: Add DT node for using the SEC engine of the MPC885 ...
-
Linus Torvalds authored
Merge tag 'riscv-for-linus-4.20-mw3' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux Pull RISC-V defconfig update from Palmer Dabbelt: "Sorry for the last minute patches, but it was suggested we try to push this in before rc1 to make it easier for people to keep their branch rebases sane" * tag 'riscv-for-linus-4.20-mw3' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux: RISC-V: refresh defconfig
-
Keith Busch authored
The nvme pci driver had been adding its CMB resource to the P2P DMA subsystem everytime on on a controller reset. This results in the following warning: ------------[ cut here ]------------ nvme 0000:00:03.0: Conflicting mapping in same section WARNING: CPU: 7 PID: 81 at kernel/memremap.c:155 devm_memremap_pages+0xa6/0x380 ... Call Trace: pci_p2pdma_add_resource+0x153/0x370 nvme_reset_work+0x28c/0x17b1 [nvme] ? add_timer+0x107/0x1e0 ? dequeue_entity+0x81/0x660 ? dequeue_entity+0x3b0/0x660 ? pick_next_task_fair+0xaf/0x610 ? __switch_to+0xbc/0x410 process_one_work+0x1cf/0x350 worker_thread+0x215/0x3d0 ? process_one_work+0x350/0x350 kthread+0x107/0x120 ? kthread_park+0x80/0x80 ret_from_fork+0x1f/0x30 ---[ end trace f7ea76ac6ee72727 ]--- nvme nvme0: failed to register the CMB This patch fixes this by registering the CMB with P2P only once. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
James Smart authored
The patch made to avoid Coverity reporting of out of bounds access on aen_op moved the assignment of a pointer, leaving it null when it was subsequently used to calculate a private pointer. Thus the private pointer was bad. Move/correct the private pointer initialization to be in sync with the patch. Fixes: 0d2bdf9f ("nvme-fc: rework the request initialization code") Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Colin Ian King authored
Trivial fix to clean up an indentation issue, remove space Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
-
John Johansen authored
Fix missed spacing error reported by checkpatch for 9caafbe2 ("Parse secmark policy") Signed-off-by: John Johansen <john.johansen@canonical.com>
-
Dave Airlie authored
Merge tag 'drm-intel-next-fixes-2018-10-25' of git://anongit.freedesktop.org/drm/drm-intel into drm-next - Fix to avoid link retraining workaround on eDP (the other is a comment change) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181025131836.GA2296@jlahtine-desk.ger.corp.intel.com
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull misc vfs updates from Al Viro: "No common topic, really - a handful of assorted stuff; the least trivial bits are Mark's dedupe patches" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs/exofs: only use true/false for asignment of bool type variable fs/exofs: fix potential memory leak in mount option parsing Delete invalid assignment statements in do_sendfile iomap: remove duplicated include from iomap.c vfs: dedupe should return EPERM if permission is not granted vfs: allow dedupe of user owned read-only files ntfs: don't open-code ERR_CAST ext4: don't open-code ERR_CAST
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull AFS updates from Al Viro: "AFS series, with some iov_iter bits included" * 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits) missing bits of "iov_iter: Separate type from direction and use accessor functions" afs: Probe multiple fileservers simultaneously afs: Fix callback handling afs: Eliminate the address pointer from the address list cursor afs: Allow dumping of server cursor on operation failure afs: Implement YFS support in the fs client afs: Expand data structure fields to support YFS afs: Get the target vnode in afs_rmdir() and get a callback on it afs: Calc callback expiry in op reply delivery afs: Fix FS.FetchStatus delivery from updating wrong vnode afs: Implement the YFS cache manager service afs: Remove callback details from afs_callback_break struct afs: Commit the status on a new file/dir/symlink afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS afs: Don't invoke the server to read data beyond EOF afs: Add a couple of tracepoints to log I/O errors afs: Handle EIO from delivery function afs: Fix TTL on VL server and address lists afs: Implement VL server rotation afs: Improve FS server rotation error handling ...
-
git://people.freedesktop.org/~agd5f/linuxDave Airlie authored
- Fix flickering at low backlight levels on some systems - Fix some overclocking regressions - Vega20 updates for - GPU recovery fixes - Disable gfxoff on RV as some sbios/fw combinations are not stable yet Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181101151939.2828-1-alexander.deucher@amd.com
-
Dave Airlie authored
Merge tag 'drm-misc-next-fixes-2018-10-31' of git://anongit.freedesktop.org/drm/drm-misc into drm-next - Properly label Innolux TV123WAM as P120ZDG-BF1 (Doug) - Add optional delay for panels without hpd hooked up (which solves the mystery delay for TI SN65DSI86 bridge) (Doug) - Another 6bpc quirk for BOE panel 0x0771 (Shawn) Cc: Doug Anderson <dianders@chromium.org> Cc: Lee, Shawn C <shawn.c.lee@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> From: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20181031201944.GA262020@art_vandelay
-
Dennis Zhou authored
This reverts a series committed earlier due to null pointer exception bug report in [1]. It seems there are edge case interactions that I did not consider and will need some time to understand what causes the adverse interactions. The original series can be found in [2] with a follow up series in [3]. [1] https://www.spinics.net/lists/cgroups/msg20719.html [2] https://lore.kernel.org/lkml/20180911184137.35897-1-dennisszhou@gmail.com/ [3] https://lore.kernel.org/lkml/20181020185612.51587-1-dennis@kernel.org/ This reverts the following commits: d459d853, b2c3fa54, 101246ec, b3b9f24f, e2b09899, f0fcb3ec, c839e7a0, bdc24917, 74b7c02a, 5bf9a1f3, a7b39b4e, 07b05bcc, 49f4c2dc, 27e6fa99Signed-off-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
brd_free() may be called in failure path on one brd instance which disk isn't added yet, so release handler of gendisk may free the associated request_queue early and causes the following use-after-free[1]. This patch fixes this issue by associating gendisk with request_queue just before adding disk. [1] KASAN: use-after-free Read in del_timer_syncNon-volatile memory driver v1.3 Linux agpgart interface v0.103 [drm] Initialized vgem 1.0.0 20120112 for virtual device on minor 0 usbcore: registered new interface driver udl ================================================================== BUG: KASAN: use-after-free in __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218 Read of size 8 at addr ffff8801d1b6b540 by task swapper/0/1 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x244/0x39d lib/dump_stack.c:113 print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218 lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844 del_timer_sync+0xb7/0x270 kernel/time/timer.c:1283 blk_cleanup_queue+0x413/0x710 block/blk-core.c:809 brd_free+0x5d/0x71 drivers/block/brd.c:422 brd_init+0x2eb/0x393 drivers/block/brd.c:518 do_one_initcall+0x145/0x957 init/main.c:890 do_initcall_level init/main.c:958 [inline] do_initcalls init/main.c:966 [inline] do_basic_setup init/main.c:984 [inline] kernel_init_freeable+0x5c6/0x6b9 init/main.c:1148 kernel_init+0x11/0x1ae init/main.c:1068 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:350 Reported-by: syzbot+3701447012fe951dabb2@syzkaller.appspotmail.com Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
https://github.com/ojeda/linuxLinus Torvalds authored
Pull compiler attribute updates from Miguel Ojeda: "This is an effort to disentangle the include/linux/compiler*.h headers and bring them up to date. The main idea behind the series is to use feature checking macros (i.e. __has_attribute) instead of compiler version checks (e.g. GCC_VERSION), which are compiler-agnostic (so they can be shared, reducing the size of compiler-specific headers) and version-agnostic. Other related improvements have been performed in the headers as well, which on top of the use of __has_attribute it has amounted to a significant simplification of these headers (e.g. GCC_VERSION is now only guarding a few non-attribute macros). This series should also help the efforts to support compiling the kernel with clang and icc. A fair amount of documentation and comments have also been added, clarified or removed; and the headers are now more readable, which should help kernel developers in general. The series was triggered due to the move to gcc >= 4.6. In turn, this series has also triggered Sparse to gain the ability to recognize __has_attribute on its own. Finally, the __nonstring variable attribute series has been also applied on top; plus two related patches from Nick Desaulniers for unreachable() that came a bit afterwards" * tag 'compiler-attributes-for-linus-4.20-rc1' of https://github.com/ojeda/linux: compiler-gcc: remove comment about gcc 4.5 from unreachable() compiler.h: update definition of unreachable() Compiler Attributes: ext4: remove local __nonstring definition Compiler Attributes: auxdisplay: panel: use __nonstring Compiler Attributes: enable -Wstringop-truncation on W=1 (gcc >= 8) Compiler Attributes: add support for __nonstring (gcc >= 8) Compiler Attributes: add MAINTAINERS entry Compiler Attributes: add Doc/process/programming-language.rst Compiler Attributes: remove uses of __attribute__ from compiler.h Compiler Attributes: KENTRY used twice the "used" attribute Compiler Attributes: use feature checks instead of version checks Compiler Attributes: add missing SPDX ID in compiler_types.h Compiler Attributes: remove unneeded sparse (__CHECKER__) tests Compiler Attributes: homogenize __must_be_array Compiler Attributes: remove unneeded tests Compiler Attributes: always use the extra-underscores syntax Compiler Attributes: remove unused attributes
-
Anup Patel authored
This patch updates defconfig using savedefconfig on Linux-4.19. It is intended to have no functional change. Signed-off-by: Anup Patel <anup@brainfault.org> Reviewed-by: Palmer Dabbelt <palmer@sifive.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
-
- 01 Nov, 2018 15 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-securityLinus Torvalds authored
Pull keys updates from James Morris: "Provide five new operations in the key_type struct that can be used to provide access to asymmetric key operations. These will be implemented for the asymmetric key type in a later patch and may refer to a key retained in RAM by the kernel or a key retained in crypto hardware. int (*asym_query)(const struct kernel_pkey_params *params, struct kernel_pkey_query *info); int (*asym_eds_op)(struct kernel_pkey_params *params, const void *in, void *out); int (*asym_verify_signature)(struct kernel_pkey_params *params, const void *in, const void *in2); Since encrypt, decrypt and sign are identical in their interfaces, they're rolled together in the asym_eds_op() operation and there's an operation ID in the params argument to distinguish them. Verify is different in that we supply the data and the signature instead and get an error value (or 0) as the only result on the expectation that this may well be how a hardware crypto device may work" * 'next-keys2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (22 commits) KEYS: asym_tpm: Add support for the sign operation [ver #2] KEYS: asym_tpm: Implement tpm_sign [ver #2] KEYS: asym_tpm: Implement signature verification [ver #2] KEYS: asym_tpm: Implement the decrypt operation [ver #2] KEYS: asym_tpm: Implement tpm_unbind [ver #2] KEYS: asym_tpm: Add loadkey2 and flushspecific [ver #2] KEYS: Move trusted.h to include/keys [ver #2] KEYS: trusted: Expose common functionality [ver #2] KEYS: asym_tpm: Implement encryption operation [ver #2] KEYS: asym_tpm: Implement pkey_query [ver #2] KEYS: Add parser for TPM-based keys [ver #2] KEYS: asym_tpm: extract key size & public key [ver #2] KEYS: asym_tpm: add skeleton for asym_tpm [ver #2] crypto: rsa-pkcs1pad: Allow hash to be optional [ver #2] KEYS: Implement PKCS#8 RSA Private Key parser [ver #2] KEYS: Implement encrypt, decrypt and sign for software asymmetric key [ver #2] KEYS: Allow the public_key struct to hold a private key [ver #2] KEYS: Provide software public key query function [ver #2] KEYS: Make the X.509 and PKCS7 parsers supply the sig encoding type [ver #2] KEYS: Provide missing asymmetric key subops for new key type ops [ver #2] ...
-
Al Viro authored
sunrpc patches from nfs tree conflict with calling conventions change done in iov_iter work. Trivial fixup... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
git://git.linux-nfs.org/projects/trondmy/linux-nfsAl Viro authored
backmerge to do fixup of iov_iter_kvec() conflict
-
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfsLinus Torvalds authored
Pull overlayfs updates from Miklos Szeredi: "A mix of fixes and cleanups" * tag 'ovl-update-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: automatically enable redirect_dir on metacopy=on ovl: check whiteout in ovl_create_over_whiteout() ovl: using posix_acl_xattr_size() to get size instead of posix_acl_to_xattr() ovl: abstract ovl_inode lock with a helper ovl: remove the 'locked' argument of ovl_nlink_{start,end} ovl: relax requirement for non null uuid of lower fs ovl: fold copy-up helpers into callers ovl: untangle copy up call chain ovl: relax permission checking on underlying layers ovl: fix recursive oi->lock in ovl_link() vfs: fix FIGETBSZ ioctl on an overlayfs file ovl: clean up error handling in ovl_get_tmpfile() ovl: fix error handling in ovl_verify_set_fh()
-
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linuxLinus Torvalds authored
Pull Devicetree fixes from Rob Herring: - fix cpu node iterator for powerpc systems - clarify ARM CPU binding 'capacities-dmips-mhz' property calculations * tag 'devicetree-fixes-for-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: Fix cpu node iterator to not ignore disabled cpu nodes dt-bindings: arm: Explain capacities-dmips-mhz calculations in example
-
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds authored
Pull virtio/vhost updates from Michael Tsirkin: "Fixes and tweaks: - virtio balloon page hinting support - vhost scsi control queue - misc fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: MAINTAINERS: remove reference to bogus vsock file vhost/scsi: Use common handling code in request queue handler vhost/scsi: Extract common handling code from control queue handler vhost/scsi: Respond to control queue operations vhost/scsi: truncate T10 PI iov_iter to prot_bytes virtio-balloon: VIRTIO_BALLOON_F_PAGE_POISON mm/page_poison: expose page_poisoning_enabled to kernel modules virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT kvm_config: add CONFIG_VIRTIO_MENU
-
git://github.com/jcmvbkbc/linux-xtensaLinus Torvalds authored
Pull Xtensa fixes and cleanups from Max Filippov: - use ZONE_NORMAL instead of ZONE_DMA - fix Image.elf build error caused by assignment of incorrect address to the .note.Linux section - clean up debug and property sections in the vmlinux.lds.S * tag 'xtensa-20181101' of git://github.com/jcmvbkbc/linux-xtensa: xtensa: clean up xtensa-specific property sections xtensa: use DWARF_DEBUG in the vmlinux.lds.S xtensa: add NOTES section to the linker script xtensa: remove ZONE_DMA
-
Rob Herring authored
In most cases, nodes with 'status = "disabled";' are treated as if the node is not present though it is a common bug to forget to check that. However, cpu nodes are different in that "disabled" simply means offline and the OS can bring the CPU core online. Commit f1f207e4 ("of: Add cpu node iterator for_each_of_cpu_node()") followed the common behavior of ignoring disabled cpu nodes. This breaks some powerpc systems (at least NXP P50XX/e5500). Fix this by dropping the status check. Fixes: 651d44f9 ("of: use for_each_of_cpu_node iterator") Fixes: f1f207e4 ("of: Add cpu node iterator for_each_of_cpu_node()") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Frank Rowand <frowand.list@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org>
-
Miklos Szeredi authored
Current behavior is to automatically disable metacopy if redirect_dir is not enabled and proceed with the mount. If "metacopy=on" mount option was given, then this behavior can confuse the user: no mount failure, yet metacopy is disabled. This patch makes metacopy=on imply redirect_dir=on. The converse is also true: turning off full redirect with redirect_dir= {off|follow|nofollow} will disable metacopy. If both metacopy=on and redirect_dir={off|follow|nofollow} is specified, then mount will fail, since there's no way to correctly resolve the conflict. Reported-by: Daniel Walsh <dwalsh@redhat.com> Fixes: d5791044 ("ovl: Provide a mount option metacopy=on/off...") Cc: <stable@vger.kernel.org> # v4.19 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds authored
Pull stackleak gcc plugin from Kees Cook: "Please pull this new GCC plugin, stackleak, for v4.20-rc1. This plugin was ported from grsecurity by Alexander Popov. It provides efficient stack content poisoning at syscall exit. This creates a defense against at least two classes of flaws: - Uninitialized stack usage. (We continue to work on improving the compiler to do this in other ways: e.g. unconditional zero init was proposed to GCC and Clang, and more plugin work has started too). - Stack content exposure. By greatly reducing the lifetime of valid stack contents, exposures via either direct read bugs or unknown cache side-channels become much more difficult to exploit. This complements the existing buddy and heap poisoning options, but provides the coverage for stacks. The x86 hooks are included in this series (which have been reviewed by Ingo, Dave Hansen, and Thomas Gleixner). The arm64 hooks have already been merged through the arm64 tree (written by Laura Abbott and reviewed by Mark Rutland and Will Deacon). With VLAs having been removed this release, there is no need for alloca() protection, so it has been removed from the plugin" * tag 'stackleak-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: arm64: Drop unneeded stackleak_check_alloca() stackleak: Allow runtime disabling of kernel stack erasing doc: self-protection: Add information about STACKLEAK feature fs/proc: Show STACKLEAK metrics in the /proc file system lkdtm: Add a test for STACKLEAK gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds authored
Pull i2c fixes from Wolfram Sang: "I2C has a core bugfix & cleanup as well as an ID addition and MAINTAINERS update for you" * 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: MAINTAINERS: add maintainer for IMX LPI2C driver dt-bindings: i2c: i2c-imx-lpi2c: add imx8qxp compatible string i2c: Clear client->irq in i2c_device_remove i2c: Remove unnecessary call to irq_find_mapping
-
git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpuLinus Torvalds authored
Pull percpu fixes from Dennis Zhou: "Two small things for v4.20. The first fixes a clang uninitialized variable warning for arm64 in the default path calls BUILD_BUG(). The second removes an unnecessary unlikely() in a WARN_ON() use" * 'for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu: arm64: percpu: Initialize ret in the default case mm: percpu: remove unnecessary unlikely()
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) BPF verifier fixes from Daniel Borkmann. 2) HNS driver fixes from Huazhong Tan. 3) FDB only works for ethernet devices, reject attempts to install FDB rules for others. From Ido Schimmel. 4) Fix spectre V1 in vhost, from Jason Wang. 5) Don't pass on-stack object to irq_set_affinity_hint() in mvpp2 driver, from Marc Zyngier. 6) Fix mlx5e checksum handling when RXFCS is enabled, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (49 commits) openvswitch: Fix push/pop ethernet validation net: stmmac: Fix stmmac_mdio_reset() when building stmmac as modules bpf: test make sure to run unpriv test cases in test_verifier bpf: add various test cases to test_verifier bpf: don't set id on after map lookup with ptr_to_map_val return bpf: fix partial copy of map_ptr when dst is scalar libbpf: Fix compile error in libbpf_attach_type_by_name kselftests/bpf: use ping6 as the default ipv6 ping binary if it exists selftests: mlxsw: qos_mc_aware: Add a test for UC awareness selftests: mlxsw: qos_mc_aware: Tweak for min shaper mlxsw: spectrum: Set minimum shaper on MC TCs mlxsw: reg: QEEC: Add minimum shaper fields net: hns3: bugfix for rtnl_lock's range in the hclgevf_reset() net: hns3: bugfix for rtnl_lock's range in the hclge_reset() net: hns3: bugfix for handling mailbox while the command queue reinitialized net: hns3: fix incorrect return value/type of some functions net: hns3: bugfix for hclge_mdio_write and hclge_mdio_read net: hns3: bugfix for is_valid_csq_clean_head() net: hns3: remove unnecessary queue reset in the hns3_uninit_all_ring() net: hns3: bugfix for the initialization of command queue's spin lock ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds authored
Pull sparc fixes from David Miller: "Two small fixes" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Wire up compat getpeername and getsockname. sparc64: Remvoe set_fs() from perf_callchain_user().
-
https://github.com/c-sky/csky-linuxLinus Torvalds authored
Pull csky dtb fixups from Guo Ren: "These fix the csky dtb Kbuild to follow the new Devicetree dtb build rules" * tag 'csky-for-linus-4.20-fixup-dtb' of https://github.com/c-sky/csky-linux: csky: use common dtb build rules csky: remove builtin-dtb Kbuild
-