- 22 Jul, 2019 4 commits
-
-
Eiichi Tsukata authored
When arch_stack_walk_user() is called from atomic contexts, access_ok() can trigger the following warning if compiled with CONFIG_DEBUG_ATOMIC_SLEEP=y. Reproducer: // CONFIG_DEBUG_ATOMIC_SLEEP=y # cd /sys/kernel/debug/tracing # echo 1 > options/userstacktrace # echo 1 > events/irq/irq_handler_entry/enable WARNING: CPU: 0 PID: 2649 at arch/x86/kernel/stacktrace.c:103 arch_stack_walk_user+0x6e/0xf6 CPU: 0 PID: 2649 Comm: bash Not tainted 5.3.0-rc1+ #99 RIP: 0010:arch_stack_walk_user+0x6e/0xf6 Call Trace: <IRQ> stack_trace_save_user+0x10a/0x16d trace_buffer_unlock_commit_regs+0x185/0x240 trace_event_buffer_commit+0xec/0x330 trace_event_raw_event_irq_handler_entry+0x159/0x1e0 __handle_irq_event_percpu+0x22d/0x440 handle_irq_event_percpu+0x70/0x100 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12f/0x3f0 handle_irq+0x34/0x40 do_IRQ+0xa6/0x1f0 common_interrupt+0xf/0xf </IRQ> Fix it by calling __range_not_ok() directly instead of access_ok() as copy_from_user_nmi() does. This is fine here because the actual copy is inside a pagefault disabled region. Reported-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Eiichi Tsukata <devel@etsukata.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190722083216.16192-2-devel@etsukata.com
-
Joerg Roedel authored
On x86-32 with PTI enabled, parts of the kernel page-tables are not shared between processes. This can cause mappings in the vmalloc/ioremap area to persist in some page-tables after the region is unmapped and released. When the region is re-used the processes with the old mappings do not fault in the new mappings but still access the old ones. This causes undefined behavior, in reality often data corruption, kernel oopses and panics and even spontaneous reboots. Fix this problem by activly syncing unmaps in the vmalloc/ioremap area to all page-tables in the system before the regions can be re-used. References: https://bugzilla.suse.com/show_bug.cgi?id=1118689 Fixes: 5d72b4fb ('x86, mm: support huge I/O mapping capability I/F') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20190719184652.11391-4-joro@8bytes.org
-
Joerg Roedel authored
With huge-page ioremap areas the unmappings also need to be synced between all page-tables. Otherwise it can cause data corruption when a region is unmapped and later re-used. Make the vmalloc_sync_one() function ready to sync unmappings and make sure vmalloc_sync_all() iterates over all page-tables even when an unmapped PMD is found. Fixes: 5d72b4fb ('x86, mm: support huge I/O mapping capability I/F') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20190719184652.11391-3-joro@8bytes.org
-
Joerg Roedel authored
Do not require a struct page for the mapped memory location because it might not exist. This can happen when an ioremapped region is mapped with 2MB pages. Fixes: 5d72b4fb ('x86, mm: support huge I/O mapping capability I/F') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20190719184652.11391-2-joro@8bytes.org
-
- 21 Jul, 2019 15 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linuxLinus Torvalds authored
Pull Devicetree fixes from Rob Herring: "Fix several warnings/errors in validation of binding schemas" * tag 'devicetree-fixes-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: pinctrl: stm32: Fix missing 'clocks' property in examples dt-bindings: iio: ad7124: Fix dtc warnings in example dt-bindings: iio: avia-hx711: Fix avdd-supply typo in example dt-bindings: pinctrl: aspeed: Fix AST2500 example errors dt-bindings: pinctrl: aspeed: Fix 'compatible' schema errors dt-bindings: riscv: Limit cpus schema to only check RiscV 'cpu' nodes dt-bindings: Ensure child nodes are of type 'object'
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs documentation typo fix from Al Viro. * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: typo fix: it's d_make_root, not d_make_inode...
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull cifs fixes from Steve French: "Two fixes for stable, one that had dependency on earlier patch in this merge window and can now go in, and a perf improvement in SMB3 open" * tag '5.3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal module number cifs: flush before set-info if we have writeable handles smb3: optimize open to not send query file internal info cifs: copy_file_range needs to strip setuid bits and update timestamps CIFS: fix deadlock in cached root handling
-
Qian Cai authored
The commit b3aa14f0 ("iommu: remove the mapping_error dma_map_ops method") incorrectly changed the checking from dma_ops_alloc_iova() in map_sg() causes a crash under memory pressure as dma_ops_alloc_iova() never return DMA_MAPPING_ERROR on failure but 0, so the error handling is all wrong. kernel BUG at drivers/iommu/iova.c:801! Workqueue: kblockd blk_mq_run_work_fn RIP: 0010:iova_magazine_free_pfns+0x7d/0xc0 Call Trace: free_cpu_cached_iovas+0xbd/0x150 alloc_iova_fast+0x8c/0xba dma_ops_alloc_iova.isra.6+0x65/0xa0 map_sg+0x8c/0x2a0 scsi_dma_map+0xc6/0x160 pqi_aio_submit_io+0x1f6/0x440 [smartpqi] pqi_scsi_queue_command+0x90c/0xdd0 [smartpqi] scsi_queue_rq+0x79c/0x1200 blk_mq_dispatch_rq_list+0x4dc/0xb70 blk_mq_sched_dispatch_requests+0x249/0x310 __blk_mq_run_hw_queue+0x128/0x200 blk_mq_run_work_fn+0x27/0x30 process_one_work+0x522/0xa10 worker_thread+0x63/0x5b0 kthread+0x1d2/0x1f0 ret_from_fork+0x22/0x40 Fixes: b3aa14f0 ("iommu: remove the mapping_error dma_map_ops method") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mike Rapoport authored
The hexagon implementation pte_alloc_one(), pte_alloc_one_kernel(), pte_free_kernel() and pte_free() is identical to the generic except of lack of __GFP_ACCOUNT for the user PTEs allocation. Switch hexagon to use generic version of these functions. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://github.com/jonmason/ntbLinus Torvalds authored
Pull NTB updates from Jon Mason: "New feature to add support for NTB virtual MSI interrupts, the ability to test and use this feature in the NTB transport layer. Also, bug fixes for the AMD and Switchtec drivers, as well as some general patches" * tag 'ntb-5.3' of git://github.com/jonmason/ntb: (22 commits) NTB: Describe the ntb_msi_test client in the documentation. NTB: Add MSI interrupt support to ntb_transport NTB: Add ntb_msi_test support to ntb_test NTB: Introduce NTB MSI Test Client NTB: Introduce MSI library NTB: Rename ntb.c to support multiple source files in the module NTB: Introduce functions to calculate multi-port resource index NTB: Introduce helper functions to calculate logical port number PCI/switchtec: Add module parameter to request more interrupts PCI/MSI: Support allocating virtual MSI interrupts ntb_hw_switchtec: Fix setup MW with failure bug ntb_hw_switchtec: Skip unnecessary re-setup of shared memory window for crosslink case ntb_hw_switchtec: Remove redundant steps of switchtec_ntb_reinit_peer() function NTB: correct ntb_dev_ops and ntb_dev comment typos NTB: amd: Silence shift wrapping warning in amd_ntb_db_vector_mask() ntb_hw_switchtec: potential shift wrapping bug in switchtec_ntb_init_sndev() NTB: ntb_transport: Ensure qp->tx_mw_dma_addr is initaliazed NTB: ntb_hw_amd: set peer limit register NTB: ntb_perf: Clear stale values in doorbell and command SPAD register NTB: ntb_perf: Disable NTB link after clearing peer XLAT registers ...
-
Al Viro authored
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Rob Herring authored
Now that examples are validated against the DT schema, an error with required 'clocks' property missing is exposed: Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.example.dt.yaml: \ pinctrl@40020000: gpio@0: 'clocks' is a required property Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.example.dt.yaml: \ pinctrl@50020000: gpio@1000: 'clocks' is a required property Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.example.dt.yaml: \ pinctrl@50020000: gpio@2000: 'clocks' is a required property Add the missing 'clocks' properties to the examples to fix the errors. Fixes: 2c9239c1 ("dt-bindings: pinctrl: Convert stm32 pinctrl bindings to json-schema") Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: linux-gpio@vger.kernel.org Cc: linux-stm32@st-md-mailman.stormreply.com Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Rob Herring <robh@kernel.org>
-
Rob Herring authored
With the conversion to DT schema, the examples are now compiled with dtc. The ad7124 binding example has the following warning: Documentation/devicetree/bindings/iio/adc/adi,ad7124.example.dts:19.11-21: \ Warning (reg_format): /example-0/adc@0:reg: property has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) There's a default #size-cells and #address-cells values of 1 for examples. For examples needing different values such as this one on a SPI bus, they need to provide a SPI bus parent node. Fixes: 26ae15e6 ("Convert AD7124 bindings documentation to YAML format.") Cc: Jonathan Cameron <jic23@kernel.org> Cc: linux-iio@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
-
Rob Herring authored
Now that examples are validated against the DT schema, a typo in avia-hx711 example generates a warning: Documentation/devicetree/bindings/iio/adc/avia-hx711.example.dt.yaml: weight: 'avdd-supply' is a required property Fix the typo. Fixes: 5150ec3f ("avia-hx711.yaml: transform DT binding to YAML") Cc: Andreas Klinger <ak@it-klinger.de> Cc: Jonathan Cameron <jic23@kernel.org> Cc: linux-iio@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
-
Rob Herring authored
The schema examples are now validated against the schema itself. The AST2500 pinctrl schema has a couple of errors: Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.example.dt.yaml: \ example-0: $nodename:0: 'example-0' does not match '^(bus|soc|axi|ahb|apb)(@[0-9a-f]+)?$' Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.example.dt.yaml: \ pinctrl: aspeed,external-nodes: [[1, 2]] is too short Fixes: 0a617de1 ("dt-bindings: pinctrl: aspeed: Convert AST2500 bindings to json-schema") Cc: Andrew Jeffery <andrew@aj.id.au> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Joel Stanley <joel@jms.id.au> Cc: linux-aspeed@lists.ozlabs.org Cc: linux-gpio@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Acked-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Rob Herring <robh@kernel.org>
-
Rob Herring authored
The Aspeed pinctl schema have errors in the 'compatible' schema: Documentation/devicetree/bindings/pinctrl/aspeed,ast2400-pinctrl.yaml: \ properties:compatible:enum: ['aspeed', 'ast2400-pinctrl', 'aspeed', 'g4-pinctrl'] has non-unique elements Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml: \ properties:compatible:enum: ['aspeed', 'ast2500-pinctrl', 'aspeed', 'g5-pinctrl'] has non-unique elements Flow style sequences have to be quoted if the vales contain ','. Fix this by using the more common one line per entry formatting. Fixes: 0a617de1 ("dt-bindings: pinctrl: aspeed: Convert AST2500 bindings to json-schema") Fixes: 07457937 ("dt-bindings: pinctrl: aspeed: Convert AST2400 bindings to json-schema") Cc: Andrew Jeffery <andrew@aj.id.au> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Joel Stanley <joel@jms.id.au> Cc: linux-aspeed@lists.ozlabs.org Cc: linux-gpio@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Acked-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Rob Herring <robh@kernel.org>
-
Rob Herring authored
Matching on the 'cpus' node was a bad choice because the schema is incorrectly applied to non-RiscV cpus nodes. As we now have a common cpus schema which checks the general structure, it is also redundant to do so in the Risc-V CPU schema. The downside is one could conceivably mix different architecture's cpu nodes or have typos in the compatible string. The latter problem pretty much exists for every schema. Acked-by: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Rob Herring <robh@kernel.org>
-
Rob Herring authored
Properties which are child node definitions need to have an explict type. Otherwise, a matching (DT) property can silently match when an error is desired. Fix this up tree-wide. Once this is fixed, the meta-schema will enforce this on any child node definitions. Cc: Chen-Yu Tsai <wens@csie.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Brian Norris <computersforpeace@gmail.com> Cc: Marek Vasut <marek.vasut@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: linux-mtd@lists.infradead.org Cc: linux-gpio@vger.kernel.org Cc: linux-stm32@st-md-mailman.stormreply.com Cc: linux-spi@vger.kernel.org Acked-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Maxime Ripard <maxime.ripard@bootlin.com> Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Rob Herring <robh@kernel.org>
-
- 20 Jul, 2019 21 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/inputLinus Torvalds authored
Pull more input updates from Dmitry Torokhov: - Apple SPI keyboard and trackpad driver for newer Macs - ALPS driver will ignore trackpoint-only devices to give the trackpoint driver a chance to handle them properly - another Lenovo is switched over to SMbus from PS/2 - assorted driver fixups. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: alps - fix a mismatch between a condition check and its comment Input: psmouse - fix build error of multiple definition Input: applespi - remove set but not used variables 'sts' Input: add Apple SPI keyboard and trackpad driver Input: alps - don't handle ALPS cs19 trackpoint-only device Input: hyperv-keyboard - remove dependencies on PAGE_SIZE for ring buffer Input: adp5589 - initialize GPIO controller parent device Input: iforce - remove empty multiline comments Input: synaptics - fix misuse of strlcpy Input: auo-pixcir-ts - switch to using devm_add_action_or_reset() Input: gtco - bounds check collection indent level Input: mtk-pmic-keys - add of_node_put() before return Input: sun4i-lradc-keys - add of_node_put() before return Input: synaptics - whitelist Lenovo T580 SMBus intertouch
-
git://git.infradead.org/users/hch/dma-mappingLinus Torvalds authored
Pull dma-mapping fixes from Christoph Hellwig: "Fix various regressions: - force unencrypted dma-coherent buffers if encryption bit can't fit into the dma coherent mask (Tom Lendacky) - avoid limiting request size if swiotlb is not used (me) - fix swiotlb handling in dma_direct_sync_sg_for_cpu/device (Fugang Duan)" * tag 'dma-mapping-5.3-1' of git://git.infradead.org/users/hch/dma-mapping: dma-direct: correct the physical addr in dma_direct_sync_sg_for_cpu/device dma-direct: only limit the mapping size if swiotlb could be used dma-mapping: add a dma_addressing_limited helper dma-direct: Force unencrypted DMA under SME for certain DMA masks
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Thomas Gleixner: "A set of x86 specific fixes and updates: - The CR2 corruption fixes which store CR2 early in the entry code and hand the stored address to the fault handlers. - Revert a forgotten leftover of the dropped FSGSBASE series. - Plug a memory leak in the boot code. - Make the Hyper-V assist functionality robust by zeroing the shadow page. - Remove a useless check for dead processes with LDT - Update paravirt and VMware maintainers entries. - A few cleanup patches addressing various compiler warnings" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry/64: Prevent clobbering of saved CR2 value x86/hyper-v: Zero out the VP ASSIST PAGE on allocation x86, boot: Remove multiple copy of static function sanitize_boot_params() x86/boot/compressed/64: Remove unused variable x86/boot/efi: Remove unused variables x86/mm, tracing: Fix CR2 corruption x86/entry/64: Update comments and sanity tests for create_gap x86/entry/64: Simplify idtentry a little x86/entry/32: Simplify common_exception x86/paravirt: Make read_cr2() CALLEE_SAVE MAINTAINERS: Update PARAVIRT_OPS_INTERFACE and VMWARE_HYPERVISOR_INTERFACE x86/process: Delete useless check for dead process with LDT x86: math-emu: Hide clang warnings for 16-bit overflow x86/e820: Use proper booleans instead of 0/1 x86/apic: Silence -Wtype-limits compiler warnings x86/mm: Free sme_early_buffer after init x86/boot: Fix memory leak in default_get_smp_config() Revert "x86/ptrace: Prevent ptrace from clearing the FS/GS selector" and fix the test
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf tooling updates from Thomas Gleixner: "A set of perf improvements and fixes: perf db-export: - Improvements in how COMM details are exported to databases for post processing and use in the sql-viewer.py UI. - Export switch events to the database. BPF: - Bump rlimit(MEMLOCK) for 'perf test bpf' and 'perf trace', just like selftests/bpf/bpf_rlimit.h do, which makes errors due to exhaustion of this limit, which are kinda cryptic (EPERM sometimes) less frequent. perf version: - Fix segfault due to missing OPT_END(), noticed on PowerPC. perf vendor events: - Add JSON files for IBM s/390 machine type 8561. perf cs-etm (ARM): - Fix two cases of error returns not bing done properly: Invalid ERR_PTR() use and loss of propagation error codes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits) perf version: Fix segfault due to missing OPT_END() perf vendor events s390: Add JSON files for machine type 8561 perf cs-etm: Return errcode in cs_etm__process_auxtrace_info() perf cs-etm: Remove errnoeous ERR_PTR() usage in cs_etm__process_auxtrace_info perf scripts python: export-to-postgresql.py: Export switch events perf scripts python: export-to-sqlite.py: Export switch events perf db-export: Export switch events perf db-export: Factor out db_export__threads() perf script: Add scripting operation process_switch() perf scripts python: exported-sql-viewer.py: Use new 'has_calls' column perf scripts python: exported-sql-viewer.py: Remove redundant semi-colons perf scripts python: export-to-postgresql.py: Add has_calls column to comms table perf scripts python: export-to-sqlite.py: Add has_calls column to comms table perf db-export: Also export thread's current comm perf db-export: Factor out db_export__comm() perf scripts python: export-to-postgresql.py: Export comm details perf scripts python: export-to-sqlite.py: Export comm details perf db-export: Export comm details perf db-export: Fix a white space issue in db_export__sample() perf db-export: Move export__comm_thread into db_export__sample() ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull core fixes from Thomas Gleixner: - A collection of objtool fixes which address recent fallout partially exposed by newer toolchains, clang, BPF and general code changes. - Force USER_DS for user stack traces [ Note: the "objtool fixes" are not all to objtool itself, but for kernel code that triggers objtool warnings. Things like missing function size annotations, or code that confuses the unwinder etc. - Linus] * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) objtool: Support conditional retpolines objtool: Convert insn type to enum objtool: Fix seg fault on bad switch table entry objtool: Support repeated uses of the same C jump table objtool: Refactor jump table code objtool: Refactor sibling call detection logic objtool: Do frame pointer check before dead end check objtool: Change dead_end_function() to return boolean objtool: Warn on zero-length functions objtool: Refactor function alias logic objtool: Track original function across branches objtool: Add mcsafe_handle_tail() to the uaccess safe list bpf: Disable GCC -fgcse optimization for ___bpf_prog_run() x86/uaccess: Remove redundant CLACs in getuser/putuser error paths x86/uaccess: Don't leak AC flag into fentry from mcsafe_handle_tail() x86/uaccess: Remove ELF function annotation from copy_user_handle_tail() x86/head/64: Annotate start_cpu0() as non-callable x86/entry: Fix thunk function ELF sizes x86/kvm: Don't call kvm_spurious_fault() from .fixup x86/kvm: Replace vmx_vmenter()'s call to kvm_spurious_fault() with UD2 ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull smp fix from Thomas Gleixner: "Add warnings to the smp function calls so callers from wrong contexts get detected" * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp: Warn on function calls from softirq context
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull CONFIG_PREEMPT_RT stub config from Thomas Gleixner: "The real-time preemption patch set exists for almost 15 years now and while the vast majority of infrastructure and enhancements have found their way into the mainline kernel, the final integration of RT is still missing. Over the course of the last few years, we have worked on reducing the intrusivenness of the RT patches by refactoring kernel infrastructure to be more real-time friendly. Almost all of these changes were benefitial to the mainline kernel on their own, so there was no objection to integrate them. Though except for the still ongoing printk refactoring, the remaining changes which are required to make RT a first class mainline citizen are not longer arguable as immediately beneficial for the mainline kernel. Most of them are either reordering code flows or adding RT specific functionality. But this now has hit a wall and turned into a classic hen and egg problem: Maintainers are rightfully wary vs. these changes as they make only sense if the final integration of RT into the mainline kernel takes place. Adding CONFIG_PREEMPT_RT aims to solve this as a clear sign that RT will be fully integrated into the mainline kernel. The final integration of the missing bits and pieces will be of course done with the same careful approach as we have used in the past. While I'm aware that you are not entirely enthusiastic about that, I think that RT should receive the same treatment as any other widely used out of tree functionality, which we have accepted into mainline over the years. RT has become the de-facto standard real-time enhancement and is shipped by enterprise, embedded and community distros. It's in use throughout a wide range of industries: telecommunications, industrial automation, professional audio, medical devices, data acquisition, automotive - just to name a few major use cases. RT development is backed by a Linuxfoundation project which is supported by major stakeholders of this technology. The funding will continue over the actual inclusion into mainline to make sure that the functionality is neither introducing regressions, regressing itself, nor becomes subject to bitrot. There is also a lifely user community around RT as well, so contrary to the grim situation 5 years ago, it's a healthy project. As RT is still a good vehicle to exercise rarely used code paths and to detect hard to trigger issues, you could at least view it as a QA tool if nothing else" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/rt, Kconfig: Introduce CONFIG_PREEMPT_RT
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull more KVM updates from Paolo Bonzini: "Mostly bugfixes, but also: - s390 support for KVM selftests - LAPIC timer offloading to housekeeping CPUs - Extend an s390 optimization for overcommitted hosts to all architectures - Debugging cleanups and improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits) KVM: x86: Add fixed counters to PMU filter KVM: nVMX: do not use dangling shadow VMCS after guest reset KVM: VMX: dump VMCS on failed entry KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed KVM: s390: Use kvm_vcpu_wake_up in kvm_s390_vcpu_wakeup KVM: Boost vCPUs that are delivering interrupts KVM: selftests: Remove superfluous define from vmx.c KVM: SVM: Fix detection of AMD Errata 1096 KVM: LAPIC: Inject timer interrupt via posted interrupt KVM: LAPIC: Make lapic timer unpinned KVM: x86/vPMU: reset pmc->counter to 0 for pmu fixed_counters KVM: nVMX: Ignore segment base for VMX memory operand when segment not FS or GS kvm: x86: ioapic and apic debug macros cleanup kvm: x86: some tsc debug cleanup kvm: vmx: fix coccinelle warnings x86: kvm: avoid constant-conversion warning x86: kvm: avoid -Wsometimes-uninitized warning KVM: x86: expose AVX512_BF16 feature to guest KVM: selftests: enable pgste option for the linker on s390 KVM: selftests: Move kvm_create_max_vcpus test to generic code ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull SCSI fixes from James Bottomley: "This is the final round of mostly small fixes in our initial submit. It's mostly minor fixes and driver updates. The only change of note is adding a virt_boundary_mask to the SCSI host and host template to parametrise this for NVMe devices instead of having them do a call in slave_alloc. It's a fairly straightforward conversion except in the two NVMe handling drivers that didn't set it who now have a virtual infinity parameter added" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits) scsi: megaraid_sas: set an unlimited max_segment_size scsi: mpt3sas: set an unlimited max_segment_size for SAS 3.0 HBAs scsi: IB/srp: set virt_boundary_mask in the scsi host scsi: IB/iser: set virt_boundary_mask in the scsi host scsi: storvsc: set virt_boundary_mask in the scsi host template scsi: ufshcd: set max_segment_size in the scsi host template scsi: core: take the DMA max mapping size into account scsi: core: add a host / host template field for the virt boundary scsi: core: Fix race on creating sense cache scsi: sd_zbc: Fix compilation warning scsi: libfc: fix null pointer dereference on a null lport scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized scsi: zfcp: fix request object use-after-free in send path causing wrong traces scsi: zfcp: fix request object use-after-free in send path causing seqno errors scsi: megaraid_sas: Update driver version to 07.710.50.00 scsi: megaraid_sas: Add module parameter for FW Async event logging scsi: megaraid_sas: Enable msix_load_balance for Invader and later controllers scsi: megaraid_sas: Fix calculation of target ID scsi: lpfc: reduce stack size with CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE scsi: devinfo: BLIST_TRY_VPD_PAGES for SanDisk Cruzer Blade ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuildLinus Torvalds authored
Pull more Kbuild updates from Masahiro Yamada: - match the directory structure of the linux-libc-dev package to that of Debian-based distributions - fix incorrect include/config/auto.conf generation when Kconfig creates it along with the .config file - remove misleading $(AS) from documents - clean up precious tag files by distclean instead of mrproper - add a new coccinelle patch for devm_platform_ioremap_resource migration - refactor module-related scripts to read modules.order instead of $(MODVERDIR)/*.mod files to get the list of created modules - remove MODVERDIR - update list of header compile-test - add -fcf-protection=none flag to avoid conflict with the retpoline flags when CONFIG_RETPOLINE=y - misc cleanups * tag 'kbuild-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits) kbuild: add -fcf-protection=none when using retpoline flags kbuild: update compile-test header list for v5.3-rc1 kbuild: split out *.mod out of {single,multi}-used-m rules kbuild: remove 'prepare1' target kbuild: remove the first line of *.mod files kbuild: create *.mod with full directory path and remove MODVERDIR kbuild: export_report: read modules.order instead of .tmp_versions/*.mod kbuild: modpost: read modules.order instead of $(MODVERDIR)/*.mod kbuild: modsign: read modules.order instead of $(MODVERDIR)/*.mod kbuild: modinst: read modules.order instead of $(MODVERDIR)/*.mod scsi: remove pointless $(MODVERDIR)/$(obj)/53c700.ver kbuild: remove duplication from modules.order in sub-directories kbuild: get rid of kernel/ prefix from in-tree modules.{order,builtin} kbuild: do not create empty modules.order in the prepare stage coccinelle: api: add devm_platform_ioremap_resource script kbuild: compile-test headers listed in header-test-m as well kbuild: remove unused hostcc-option kbuild: remove tag files by distclean instead of mrproper kbuild: add --hash-style= and --build-id unconditionally kbuild: get rid of misleading $(AS) from documents ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull dcache and mountpoint updates from Al Viro: "Saner handling of refcounts to mountpoints. Transfer the counting reference from struct mount ->mnt_mountpoint over to struct mountpoint ->m_dentry. That allows us to get rid of the convoluted games with ordering of mount shutdowns. The cost is in teaching shrink_dcache_{parent,for_umount} to cope with mixed-filesystem shrink lists, which we'll also need for the Slab Movable Objects patchset" * 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: switch the remnants of releasing the mountpoint away from fs_pin get rid of detach_mnt() make struct mountpoint bear the dentry reference to mountpoint, not struct mount Teach shrink_dcache_parent() to cope with mixed-filesystem shrink lists fs/namespace.c: shift put_mountpoint() to callers of unhash_mnt() __detach_mounts(): lookup_mountpoint() can't return ERR_PTR() anymore nfs: dget_parent() never returns NULL ceph: don't open-code the check for dead lockref
-
Thomas Gleixner authored
The recent fix for CR2 corruption introduced a new way to reliably corrupt the saved CR2 value. CR2 is saved early in the entry code in RDX, which is the third argument to the fault handling functions. But it missed that between saving and invoking the fault handler enter_from_user_mode() can be called. RDX is a caller saved register so the invoked function can freely clobber it with the obvious consequences. The TRACE_IRQS_OFF call is safe as it calls through the thunk which preserves RDX, but TRACE_IRQS_OFF_DEBUG is not because it also calls into C-code outside of the thunk. Store CR2 in R12 instead which is a callee saved register and move R12 to RDX just before calling the fault handler. Fixes: a0d14b89 ("x86/mm, tracing: Fix CR2 corruption") Reported-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907201020540.1782@nanos.tec.linutronix.de
-
Peter Zijlstra authored
It's clearly documented that smp function calls cannot be invoked from softirq handling context. Unfortunately nothing enforces that or emits a warning. A single function call can be invoked from softirq context only via smp_call_function_single_async(). The only legit context is task context, so add a warning to that effect. Reported-by: luferry <luferry@163.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190718160601.GP3402@hirez.programming.kicks-ass.net
-
Eric Hankland authored
Updates KVM_CAP_PMU_EVENT_FILTER so it can also whitelist or blacklist fixed counters. Signed-off-by: Eric Hankland <ehankland@google.com> [No need to check padding fields for zero. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
If a KVM guest is reset while running a nested guest, free_nested will disable the shadow VMCS execution control in the vmcs01. However, on the next KVM_RUN vmx_vcpu_run would nevertheless try to sync the VMCS12 to the shadow VMCS which has since been freed. This causes a vmptrld of a NULL pointer on my machime, but Jan reports the host to hang altogether. Let's see how much this trivial patch fixes. Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Liran Alon <liran.alon@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
This is useful for debugging, and is ratelimited nowadays. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Like Xu authored
If a perf_event creation fails due to any reason of the host perf subsystem, it has no chance to log the corresponding event for guest which may cause abnormal sampling data in guest result. In debug mode, this message helps to understand the state of vPMC and we may not limit the number of occurrences but not in a spamming style. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Like Xu <like.xu@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Wanpeng Li authored
Use kvm_vcpu_wake_up() in kvm_s390_vcpu_wakeup(). Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Wanpeng Li authored
Inspired by commit 9cac38dd (KVM/s390: Set preempted flag during vcpu wakeup and interrupt delivery), we want to also boost not just lock holders but also vCPUs that are delivering interrupts. Most smp_call_function_many calls are synchronous, so the IPI target vCPUs are also good yield candidates. This patch introduces vcpu->ready to boost vCPUs during wakeup and interrupt delivery time; unlike s390 we do not reuse vcpu->preempted so that voluntarily preempted vCPUs are taken into account by kvm_vcpu_on_spin, but vmx_vcpu_pi_put is not affected (VT-d PI handles voluntary preemption separately, in pi_pre_block). Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM: ebizzy -M vanilla boosting improved 1VM 21443 23520 9% 2VM 2800 8000 180% 3VM 1800 3100 72% Testing on my Haswell desktop 8 HT, with 8 vCPUs VM 8GB RAM, two VMs, one running ebizzy -M, the other running 'stress --cpu 2': w/ boosting + w/o pv sched yield(vanilla) vanilla boosting improved 1570 4000 155% w/ boosting + w/ pv sched yield(vanilla) vanilla boosting improved 1844 5157 179% w/o boosting, perf top in VM: 72.33% [kernel] [k] smp_call_function_many 4.22% [kernel] [k] call_function_i 3.71% [kernel] [k] async_page_fault w/ boosting, perf top in VM: 38.43% [kernel] [k] smp_call_function_many 6.31% [kernel] [k] async_page_fault 6.13% libc-2.23.so [.] __memcpy_avx_unaligned 4.88% [kernel] [k] call_function_interrupt Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Thomas Huth authored
The code in vmx.c does not use "program_invocation_name", so there is no need to "#define _GNU_SOURCE" here. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Liran Alon authored
When CPU raise #NPF on guest data access and guest CR4.SMAP=1, it is possible that CPU microcode implementing DecodeAssist will fail to read bytes of instruction which caused #NPF. This is AMD errata 1096 and it happens because CPU microcode reading instruction bytes incorrectly attempts to read code as implicit supervisor-mode data accesses (that is, just like it would read e.g. a TSS), which are susceptible to SMAP faults. The microcode reads CS:RIP and if it is a user-mode address according to the page tables, the processor gives up and returns no instruction bytes. In this case, GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly return 0 instead of the correct guest instruction bytes. Current KVM code attemps to detect and workaround this errata, but it has multiple issues: 1) It mistakenly checks if guest CR4.SMAP=0 instead of guest CR4.SMAP=1, which is required for encountering a SMAP fault. 2) It assumes SMAP faults can only occur when guest CPL==3. However, in case guest CR4.SMEP=0, the guest can execute an instruction which reside in a user-accessible page with CPL<3 priviledge. If this instruction raise a #NPF on it's data access, then CPU DecodeAssist microcode will still encounter a SMAP violation. Even though no sane OS will do so (as it's an obvious priviledge escalation vulnerability), we still need to handle this semanticly correct in KVM side. Note that (2) *is* a useful optimization, because CR4.SMAP=1 is an easy triggerable condition and guests usually enable SMAP together with SMEP. If the vCPU has CR4.SMEP=1, the errata could indeed be encountered onlt at guest CPL==3; otherwise, the CPU would raise a SMEP fault to guest instead of #NPF. We keep this condition to avoid false positives in the detection of the errata. In addition, to avoid future confusion and improve code readbility, include details of the errata in code and not just in commit message. Fixes: 05d5a486 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)") Cc: Singh Brijesh <brijesh.singh@amd.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-