- 07 Aug, 2018 1 commit
-
-
Martin Schwidefsky authored
The memmove, memset, memcpy, __memset16, __memset32 and __memset64 functions have an additional indirect return branch in the form of a "bzr" instruction. These need to use expolines as well. Cc: <stable@vger.kernel.org> # v4.17+ Fixes: 97489e06 ("s390/lib: use expoline for indirect branches") Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 01 Aug, 2018 1 commit
-
-
Martin Schwidefsky authored
The numa_init_early initcall sets the node_to_cpumask_map[0] to the full cpu_possible_mask. Unfortunately this early_initcall is too late; the NUMA setup for numa=emu is done even earlier. The order of calls is numa_setup() -> emu_update_cpu_topology(), then the early_initcalls(), followed by sched_init_domains(). Starting with git commit 051f3ca0 "sched/topology: Introduce NUMA identity node sched domain" the incorrect node_to_cpumask_map[0] really screws up the domain setup and the kernel panics with the following oops: Cc: <stable@vger.kernel.org> # v4.15+ Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 31 Jul, 2018 3 commits
-
-
Philipp Rudo authored
Before the memory for the elfcorehdr is allocated the required size is estimated with alloc_size = 0x1000 + get_cpu_cnt() * 0x4a0 + mem_chunk_cnt * sizeof(Elf64_Phdr); where 0x4a0 is used as the size of the ELF notes that store the register content. This size is 8 bytes too small. Usually this does not immediately cause a problem because the page reserved for overhead (Elf_Ehdr, vmcoreinfo, etc.) is pretty generous, so there is enough spare memory to counter the miscalculated per-cpu size. However, with growing overhead and/or a huge cpu count the allocated size gets too small for the elfcorehdr. Ultimately a BUG_ON is triggered, causing the crash kernel to panic. Fix this by properly calculating the required size instead of relying on magic numbers. Fixes: a62bc073 ("s390/kdump: add support for vector extension") Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
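The arithmetic behind the old estimate can be sketched as follows. This is only an illustration in Python, not the kernel code: the function names are hypothetical, sizeof(Elf64_Phdr) is assumed to be 56 bytes (the standard ELF64 layout), and the 8-byte shortfall per cpu is taken from the commit message above.

```python
# Illustrative model of the old elfcorehdr size estimate (not kernel code).
ELF64_PHDR_SIZE = 56        # assumed sizeof(Elf64_Phdr), standard ELF64 layout
OVERHEAD_PAGE = 0x1000      # page reserved for Elf_Ehdr, vmcoreinfo, etc.
OLD_NOTES_PER_CPU = 0x4a0   # magic number used by the old estimate
SHORTFALL_PER_CPU = 8       # per the commit message, 0x4a0 is 8 bytes too small


def old_alloc_size(cpu_cnt, mem_chunk_cnt):
    """Size the old formula would have allocated (hypothetical helper)."""
    return (OVERHEAD_PAGE
            + cpu_cnt * OLD_NOTES_PER_CPU
            + mem_chunk_cnt * ELF64_PHDR_SIZE)


def missing_bytes(cpu_cnt):
    """Bytes the per-cpu notes need beyond what the old formula reserved."""
    return cpu_cnt * SHORTFALL_PER_CPU


if __name__ == "__main__":
    for cpus in (4, 64, 512):
        print(cpus, hex(old_alloc_size(cpus, 1)), missing_bytes(cpus))
```

Under these assumptions the shortfall grows linearly with the cpu count, and at 512 cpus it alone equals the 0x1000 bytes reserved for overhead, which is consistent with the problem only showing up on large configurations.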
-
Hendrik Brueckner authored
Processing the samples in the AUX-area by perf requires the computation of respective time stamps. The time stamps used by perf are based on the monotonic clock. To convert the TOD clock value contained in an SDB to a monotonic clock value, the TOD clock base is required. Hence, also save the TOD clock base in the SDB. Suggested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
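A minimal sketch of such a conversion, written in Python for illustration. It assumes the conversion amounts to subtracting the TOD clock base from the sample's TOD value and scaling the difference to nanoseconds; the scaling relies on the TOD clock format, in which 4096 units make up one microsecond. The helper names are hypothetical and this is not the perf implementation.

```python
def tod_to_ns(tod):
    """Scale a TOD clock delta to nanoseconds.

    One TOD unit is 1/4096 microsecond, i.e. 125/512 ns. The value is
    split into high and low parts so the multiplication would also stay
    within 64 bits in a fixed-width language.
    """
    return ((tod >> 9) * 125) + (((tod & 0x1ff) * 125) >> 9)


def sample_time_ns(sample_tod, tod_base):
    """Hypothetical helper: time of a sample relative to the TOD clock base."""
    return tod_to_ns(sample_tod - tod_base)


if __name__ == "__main__":
    base = 10_000
    print(sample_time_ns(base + 4096, base))
```

Without the base stored alongside the samples, a consumer would only have the raw TOD value and no reference point to relate it to the clock perf uses.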
-
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
Martin Schwidefsky authored
Pull hlp_stage1 from Christian Borntraeger with the following changes: KVM: s390: initial host large page support - must be enabled via module parameter hpage=1 - cannot be used together with nested - does support migration - does support hugetlbfs - no THP yet
-
- 30 Jul, 2018 13 commits
-
-
Janosch Frank authored
General KVM huge page support on s390 has to be enabled via the kvm.hpage module parameter. Either nested or hpage can be enabled, as we currently do not support vSIE for huge backed guests. Once the vSIE support is added we will either drop the parameter or enable it as default. For a guest the feature has to be enabled through the new KVM_CAP_S390_HPAGE_1M capability and the hpage module parameter. Enabling it means that cmm can't be enabled for the vm and disables pfmf and storage key interpretation. This is due to the fact that in some cases, in upcoming patches, we have to split huge pages in the guest mapping to be able to set more granular memory protection on 4k pages. These split pages have fake page tables that are not visible to the Linux memory management which subsequently will not manage its PGSTEs, while the SIE will. Disabling these features lets us manage PGSTE data in a consistent manner and solve that problem. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>
-
Janosch Frank authored
Let's allow huge pmd linking when enabled through the KVM_CAP_S390_HPAGE_1M capability. Also we can now restrict gmap invalidation and notification to the cases where the capability has been activated and save some cycles when that's not the case. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>
-
Dominik Dingel authored
Guests backed by huge pages could theoretically free unused pages via the diagnose 10 instruction. We currently don't allow that, so we don't have to refault it once it's needed again. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
-
Janosch Frank authored
When doing skey emulation for huge guests, we now need to fault in pmds, as we don't have PGSTEs anymore to store them when we do not have valid table entries. Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
-
Janosch Frank authored
Storage keys for guests with huge page mappings have to be managed in hardware. There are no PGSTEs for PMDs that we could use to retain the guest's logical view of the key. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>
-
Janosch Frank authored
Similarly to the pte skey handling, where we set the storage key to the default key for each newly mapped pte, we also have to do that for huge pmds. With the PG_arch_1 flag we keep track of whether the area has already been cleared of its skeys. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Dominik Dingel authored
When a guest starts using storage keys, we trap and set a default one for its whole valid address space. With this patch we are now able to do that for large pages. To speed up the storage key insertion, we use __storage_key_init_range, which in turn will use sske_frame to set multiple storage keys with one instruction. As it has been previously used for debugging we have to get rid of the default key check and make it quiescing. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> [replaced page_set_storage_key loop with __storage_key_init_range] Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>
-
Janosch Frank authored
To do dirty logging with huge pages, we protect huge pmds in the gmap. When they are written to, we unprotect them and mark them dirty. We introduce the function gmap_test_and_clear_dirty_pmd which handles dirty sync for huge pages. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com>
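The protect / unprotect-on-write / mark-dirty cycle described above can be modelled with a toy example. This is a Python sketch, not the gmap code: the class and method names are hypothetical, each list slot stands for one huge pmd, and re-protecting a segment when its dirty state is collected is an assumption made so that the next write is caught again.

```python
class ToyGmap:
    """Toy model of write-protection based dirty logging (hypothetical)."""

    def __init__(self, nr_segments):
        self.protected = [False] * nr_segments
        self.dirty = [False] * nr_segments

    def start_logging(self):
        # Write-protect every segment so the first write to it is noticed.
        self.protected = [True] * len(self.protected)

    def write(self, seg):
        # A write to a protected segment "faults": unprotect and mark dirty.
        if self.protected[seg]:
            self.protected[seg] = False
            self.dirty[seg] = True

    def test_and_clear_dirty(self, seg):
        # Report whether the segment was written since the last sync,
        # then reset the state and re-protect it (assumption, see above).
        was_dirty = self.dirty[seg]
        self.dirty[seg] = False
        self.protected[seg] = True
        return was_dirty


if __name__ == "__main__":
    gmap = ToyGmap(2)
    gmap.start_logging()
    gmap.write(1)
    print([gmap.test_and_clear_dirty(seg) for seg in range(2)])
```

The point of the model is that only the first write after a sync costs a fault; later writes to the same segment hit an unprotected entry until the next sync.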
-
Janosch Frank authored
If the host invalidates a pmd, we also have to invalidate the corresponding gmap pmds, as well as flush them from the TLB. This is necessary, as we don't share the pmd tables between host and guest as we do with ptes. The clearing part of these three new functions sets a guest pmd entry to _SEGMENT_ENTRY_EMPTY, so the guest will fault on it and we will re-link it. Flushing the gmap is not necessary in the host's lazy local and csp cases. Both purge the TLB completely. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: David Hildenbrand <david@redhat.com>
-
Janosch Frank authored
Like for ptes, we also need invalidation notification for pmds, to make sure the guest lowcore pages are always accessible and to allow the later addition of shadowed pmds. With PMDs we do not have PGSTEs or some other bits we could use in the host PMD. Instead we pick one of the free bits in the gmap PMD. Every time a host pmd is invalidated, we check whether the respective gmap PMD has the bit set and in that case fire the notifier. Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
-
Janosch Frank authored
Let's allow pmds to be linked into gmap for the upcoming s390 KVM huge page support. Before this patch we copied the full userspace pmd entry. This is not correct, as it contains SW defined bits that might be interpreted differently in the GMAP context. Now we only copy over all hardware relevant information leaving out the software bits. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>
-
Janosch Frank authored
Currently we use the software PGSTE bits PGSTE_IN_BIT and PGSTE_VSIE_BIT to notify before an invalidation occurs on a prefix page or a VSIE page respectively. Both bits are pgste specific, but are used when protecting a memory range. Let's introduce abstract GMAP_NOTIFY_* bits that will be realized into the respective bits when gmap DAT table entries are protected. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>
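The idea of abstract notification bits that are only translated into entry-specific bits at protection time can be sketched like this. The example is a Python illustration only: the two NOTIFY_* names, all numeric values and the helper function are placeholders chosen for the sketch, not the kernel's definitions.

```python
# Abstract notification requests (placeholder names and values).
NOTIFY_PREFIX = 0x1   # notify before a prefix page is invalidated
NOTIFY_VSIE = 0x2     # notify before a VSIE page is invalidated

# Entry-specific software bits (placeholder values for illustration).
PGSTE_IN_BIT = 1 << 47
PGSTE_VSIE_BIT = 1 << 45


def notify_to_pgste_bits(notify_bits):
    """Translate abstract notify bits into pgste-specific bits (hypothetical)."""
    pgste_bits = 0
    if notify_bits & NOTIFY_PREFIX:
        pgste_bits |= PGSTE_IN_BIT
    if notify_bits & NOTIFY_VSIE:
        pgste_bits |= PGSTE_VSIE_BIT
    return pgste_bits


if __name__ == "__main__":
    print(hex(notify_to_pgste_bits(NOTIFY_PREFIX | NOTIFY_VSIE)))
```

With this split, callers that protect a range only deal with the abstract bits, and a second translation function for another kind of table entry could be added without touching them.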
-
Janosch Frank authored
This patch reworks the gmap_protect_range logic and extracts the pte handling into its own function. We also now walk to the pmd and make it accessible in the function for later use. This way we can add huge page handling logic more easily. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 25 Jul, 2018 2 commits
-
-
Martin Schwidefsky authored
Now that the early boot rework is upstream we can enable the gcc plugins again. See git commit 72f108b308707f21499e0ac05bf7370360cf06d8 "s390: disable gcc plugins" for reference. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
The s390 build currently fails with the latent entropy plugin: arch/s390/kernel/als.o: In function `verify_facilities': als.c:(.init.text+0x24): undefined reference to `latent_entropy' als.c:(.init.text+0xae): undefined reference to `latent_entropy' make[3]: *** [arch/s390/boot/compressed/vmlinux] Error 1 make[2]: *** [arch/s390/boot/compressed/vmlinux] Error 2 make[1]: *** [bzImage] Error 2 This will be fixed with the early boot rework from Vasily, which is planned for the 4.19 merge window. For 4.18 the simplest solution is to disable the gcc plugins and reenable them after the early boot rework is upstream. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> (cherry picked from commit 2fba3573)
-
- 23 Jul, 2018 7 commits
-
-
Souptick Joarder authored
Use new return type vm_fault_t for fault handler vdso_fault. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Thomas Richter authored
Tools like 'perf stat' parse the trace point format files defined in /sys/kernel/debug/tracing/events/s390/.../format to handle the print fmt: statement. The kernel provides a library in directory linux/tools/lib/traceevent/* for this reason. This library can not handle structures or unions defined in the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro. There is no possibility to extract a structure member (which might be a bit field) since there is no packing information nor bit field offset by parsing the printf fmt line. Therefore rewrite the TRACE_EVENT macro and add the __field macro for the necessary members. Keep the __fieldstruct macro to extract the complete structure when dumps are analysed. Note that the same information is displayed, this is no interface change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Acked-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Thomas Richter authored
Tools like 'perf stat' parse the trace point format files defined in /sys/kernel/debug/tracing/events/s390/.../format to handle the print fmt: statement. The kernel provides a library in directory linux/tools/lib/traceevent/* for this reason. This library can not handle structures or unions defined in the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro. There is no possibility to extract a structure member (which might be a bit field) since there is no packing information nor bit field offset by parsing the printf fmt line. Therefore rewrite the TRACE_EVENT macro and add the __field macro for the necessary members. Keep the __fieldstruct macro to extract the complete structure when dumps are analysed. Note that the same information is displayed, this is no interface change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Acked-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Thomas Richter authored
Tools like 'perf stat' parse the trace point format files defined in /sys/kernel/debug/tracing/events/s390/.../format to handle the print fmt: statement. The kernel provides a library in directory linux/tools/lib/traceevent/* for this reason. This library can not handle structures or unions defined in the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro. There is no possibility to extract a structure member (which might be a bit field) since there is no packing information nor bit field offset by parsing the printf fmt line. Therefore rewrite the TRACE_EVENT macro and add the __field macro for the necessary members. Keep the __fieldstruct macro to extract the complete structure when dumps are analysed. Note that the same information is displayed, this is no interface change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Acked-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Thomas Richter authored
Tools like 'perf stat' parse the trace point format files defined in /sys/kernel/debug/tracing/events/s390/.../format to handle the print fmt: statement. The kernel provides a library in directory linux/tools/lib/traceevent/* for this reason. This library can not handle structures or unions defined in the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro. There is no possibility to extract a structure member (which might be a bit field) since there is no packing information nor bit field offset by parsing the printf fmt line. Therefore rewrite the TRACE_EVENT macro and add the __field macro for the missing members. Keep the __fieldstruct macro to extract the complete structure when dumps are analysed. Note that the same information is displayed, this is no interface change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Acked-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Thomas Richter authored
Tools like 'perf stat' parse the trace point format files defined in /sys/kernel/debug/tracing/events/s390/.../format to handle the print fmt: statement. The kernel provides a library in directory linux/tools/lib/traceevent/* for this reason. This library can not handle structures or unions defined in the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro. There is no possibility to extract a structure member (which might be a bit field) since there is no packing information nor bit field offset by parsing the printf fmt line. Therefore rewrite the TRACE_EVENT macro and add the __field macro for the members adapter_IO, isc and type of struct tpi_info. Note that the same information is displayed, this is no interface change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Acked-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Thomas Richter authored
Tools like 'perf stat' parse the trace point format files defined in /sys/kernel/debug/tracing/events/s390/.../format to handle the print fmt: statement. The kernel provides a library in directory linux/tools/lib/traceevent/* for this reason. This library can not handle structures or unions defined in the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro. There is no possibility to extract a structure member (which might be a bit field) since there is no packing information nor bit field offset by parsing the printf fmt line. Therefore rewrite the TRACE_EVENT macro and add the __field macro for the necessary fields. Keep the __fieldstruct macro to extract the complete structure when dumps are analysed. Note that the same information is displayed, this is no interface change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Acked-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 19 Jul, 2018 5 commits
-
-
Gustavo A. R. Silva authored
PTR_RET is deprecated, use PTR_ERR_OR_ZERO instead. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Gustavo A. R. Silva authored
PTR_RET is deprecated, use PTR_ERR_OR_ZERO instead. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Gustavo A. R. Silva authored
PTR_RET is deprecated, use PTR_ERR_OR_ZERO instead. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Gustavo A. R. Silva authored
PTR_RET is deprecated, use PTR_ERR_OR_ZERO instead. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
The kbd_ioctl uses two user controlled indexes for KDGKBENT/KDSKBENT. Use array_index_nospec to prevent any out of bounds speculation. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
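The effect of clamping an index without a conditional branch can be modelled as below. This is a Python sketch of the general masking idea on 64-bit values, not the kernel macro or any architecture-specific implementation; the function names are hypothetical, and the model assumes the array size is below 2**63.

```python
BITS = 64
MASK64 = (1 << BITS) - 1


def index_mask(index, size):
    """Return all ones if index < size, else zero, without comparing them.

    For an in-range index both `index` and `size - 1 - index` have a clear
    top bit; for an out-of-range index at least one of them has it set.
    """
    combined = (index | ((size - 1 - index) & MASK64)) & MASK64
    top_bit_clear = (~combined & MASK64) >> (BITS - 1)   # 1 if in range
    return -top_bit_clear & MASK64                       # spread to all bits


def clamp_index(index, size):
    """Hypothetical helper: in-range indexes pass through, others become 0."""
    return index & index_mask(index, size)


if __name__ == "__main__":
    print(clamp_index(3, 10), clamp_index(10, 10), clamp_index(2**63, 10))
```

Because the result is derived through data dependencies rather than a predicted branch, an out-of-bounds index cannot be used for a speculative array access; it is forced to 0 before the load.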
-
- 18 Jul, 2018 1 commit
-
-
Martin Schwidefsky authored
Detect and report the etoken facility. With spectre_v2=auto or CONFIG_EXPOLINE_AUTO=y automatically disable expolines and use the full branch prediction mode for the kernel. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 17 Jul, 2018 5 commits
-
-
Sebastian Ott authored
Remove attribute packed where possible; failing this, add proper alignment information to fix warnings like the one below: drivers/s390/cio/chsc.c: In function 'chsc_siosl': drivers/s390/cio/chsc.c:1287:2: warning: alignment 1 of 'struct <anonymous>' is less than 4 [-Wpacked-not-aligned] } __attribute__ ((packed)) *siosl_area; Note: this patch should be a nop since none of these structs use auto storage but allocated pages. However there are changes to the generated code because of additional padding at the end of some of the structs due to alignment when memset(foo, 0, sizeof(*foo)) is used. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Sebastian Ott authored
Both css_evaluate_new_subchannel and cio_validate_subchannel used stsch and css_sch_is_valid to check for a valid device. Reduce stsch calls during subchannel evaluation by re-using schib data. Also the type/devno valid information is only checked once. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Sebastian Ott authored
In css_alloc_subchannel we allocate the subchannel and do a validation of the subchannel (to decide if we should look for devices via this subchannel). On a typical LPAR we find lots of subchannels to be invalid (because there is no device attached or the device is blacklisted) leading to lots of useless kmalloc and kfree calls. This patch changes the order to only allocate the subchannels that have been found valid. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Sebastian Ott authored
The css bus code uses 2 initcalls: channel_subsystem_init to initialize internal data and channel_subsystem_init_sync to start scanning for devices and wait for it to finish. The start scanning for devices part is moved to the first initcall such that more work happens in parallel. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Sebastian Ott authored
Improve locking in chp_new to make sure that we don't register the same chpid twice. Chpid registration was synchronized via the machine check handler thread but we also have codepaths to look for new chpids triggered independent of that thread (during IPL or resume from hibernate). Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 16 Jul, 2018 2 commits
-
-
Claudio Imbrenda authored
When the oom killer kills a userspace process in the page fault handler while in guest context, the fault handler fails to release the mm_sem if the FAULT_FLAG_RETRY_NOWAIT option is set. This leads to a deadlock when tearing down the mm when the process terminates. This bug can only happen when pfault is enabled, so only KVM clients are affected. The problem arises in the rare cases in which handle_mm_fault does not release the mm_sem. This patch fixes the issue by manually releasing the mm_sem when needed. Fixes: 24eb3a82 ("KVM: s390: Add FAULT_FLAG_RETRY_NOWAIT for guest fault") Cc: <stable@vger.kernel.org> # 3.15+ Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
cmm_set_timer could be called concurrently from cmm_thread, the cmm proc handler, upon cmm smsg receive and from the timer function itself. To avoid a potential race condition and hitting the BUG_ON in add_timer on an already pending timer, simply reuse mod_timer, which is, according to the documentation, "the only safe way to modify the timeout" with multiple unserialized concurrent users. mod_timer can handle both active and inactive timers, which allows a minor code simplification as well. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
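The difference between the two calls can be illustrated with a toy model. This Python sketch is not the kernel timer API: the class is hypothetical, a RuntimeError stands in for the BUG_ON, and concurrency itself is not modelled, only the "already pending" state that a racing caller could run into.

```python
class ToyTimer:
    """Toy timer with add/mod semantics as described above (hypothetical)."""

    def __init__(self):
        self.expires = None          # None means the timer is not pending

    def pending(self):
        return self.expires is not None

    def add(self, expires):
        # Arming an already pending timer is treated as a fatal error.
        if self.pending():
            raise RuntimeError("timer already pending")
        self.expires = expires

    def mod(self, expires):
        # Works for both inactive and pending timers; reports which it was.
        was_pending = self.pending()
        self.expires = expires
        return was_pending


if __name__ == "__main__":
    timer = ToyTimer()
    print(timer.mod(10), timer.mod(20), timer.expires)
```

A caller using mod needs no "is it pending?" check before re-arming, which is where the code simplification mentioned above comes from.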
-