- 13 Jan, 2023 9 commits
-
-
Alexander Gordeev authored
The setup of the kernel virtual address space is spread throughout the sources, boot stages and config options like this: 1. The available physical memory regions are queried and stored as mem_detect information for later use in the decompressor. 2. Based on the physical memory availability the virtual memory layout is established in the decompressor; 3. If CONFIG_KASAN is disabled the kernel paging setup code populates kernel pgtables and turns DAT mode on. It uses the information stored at step [1]. 4. If CONFIG_KASAN is enabled the kernel early boot kasan setup populates kernel pgtables and turns DAT mode on. It uses the information stored at step [1]. The kasan setup creates early_pg_dir directory and directly overwrites swapper_pg_dir entries to make shadow memory pages available. Move the kernel virtual memory setup to the decompressor and start the kernel with DAT turned on right from the very first istruction. That completely eliminates the boot phase when the kernel runs in DAT-off mode, simplies the overall design and consolidates pgtables setup. The identity mapping is created in the decompressor, while kasan shadow mappings are still created by the early boot kernel code. Share with decompressor the existing kasan memory allocator. It decreases the size of a newly requested memory block from pgalloc_pos and ensures that kernel image is not overwritten. pgalloc_low and pgalloc_pos pointers are made preserved boot variables for that. Use the bootdata infrastructure to setup swapper_pg_dir and invalid_pg_dir directories used by the kernel later. The interim early_pg_dir directory established by the kasan initialization code gets eliminated as result. As the kernel runs in DAT-on mode only the PSW_KERNEL_BITS define gets PSW_MASK_DAT bit by default. Additionally, the setup_lowcore_dat_off() and setup_lowcore_dat_on() routines get merged, since there is no DAT-off mode stage anymore. The memory mappings are created with RW+X protection that allows the early boot code setting up all necessary data and services for the kernel being booted. Just before the paging is enabled the memory protection is changed to RO+X for text, RO+NX for read-only data and RW+NX for kernel data and the identity mapping. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Alexander Gordeev authored
Detect and enable memory facilities which is a prerequisite for pgtables setup in the decompressor. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Alexander Gordeev authored
Similar to existing PAGE_KERNEL_EXEC and SEGMENT_KERNEL_EXEC memory protection add REGION3_KERNEL_EXEC attribute that could be set on PUD pgtable entries. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Alexander Gordeev authored
Convert setup of pgtable entries to use set_pXe_bit() helpers as the preferred way in MM code. Locally introduce pgprot_clear_bit() helper, which is strictly speaking a generic function. However, it is only x86 pgprot_clear_protnone_bits() helper, which does a similar thing, so do not make it public. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Alexander Gordeev authored
Avoid duplicate IS_ENABLED(CONFIG_KASAN_VMALLOC) condition check. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Alexander Gordeev authored
Fix variables initialization coding style and setup zero pgtable same way region and segment pgtables are set up. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Alexander Gordeev authored
The kasan early boot memory allocators operate on pgalloc_pos and segment_pos physical address pointers, but fail to convert it to the corresponding virtual pointers. Currently it is not a problem, since virtual and physical addresses on s390 are the same. Nevertheless, should they ever differ, this would cause an invalid pointer access. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Alexander Gordeev authored
Commit ada1da31 ("s390/sclp: sort out physical vs virtual pointers usage") fixed the notion of virtual address for sclp_early_sccb pointer. However, it did not take into account that kasan_early_init() can also output messages and sclp_early_sccb should be adjusted by the time kasan_early_init() is called. Currently it is not a problem, since virtual and physical addresses on s390 are the same. Nevertheless, should they ever differ, this would cause an invalid pointer access. Fixes: ada1da31 ("s390/sclp: sort out physical vs virtual pointers usage") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Alexander Gordeev authored
Move declarations to appropriate header files. Instead of cryptic casting directly assign struct vmlinux_info type to _vmlinux_info linker script variable - wich it actually is. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
- 10 Jan, 2023 2 commits
-
-
Heiko Carstens authored
Add missing header include to get rid of arch/s390/crypto/arch_random.c:15:1: warning: symbol 's390_arch_random_available' was not declared. Should it be static? arch/s390/crypto/arch_random.c:17:12: warning: symbol 's390_arch_random_counter' was not declared. Should it be static? Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Heiko Carstens authored
Fix this for allmodconfig: drivers/s390/char/con3270.c:43:24: error: 'condev' defined but not used [-Werror=unused-variable] static struct tty3270 *condev; ^~~~~~ Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: c17fe081 ("s390/3270: unify con3270 + tty3270") Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
- 09 Jan, 2023 29 commits
-
-
Xu Panda authored
The implementation of strscpy() is more robust and safer. That's now the recommended way to copy NUL-terminated strings. Signed-off-by: Xu Panda <xu.panda@zte.com.cn> Signed-off-by: Yang Yang <yang.yang29@zte.com.cn> Link: https://lore.kernel.org/r/202301052024349365834@zte.com.cnSigned-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
By this point, all the pieces are in place to properly support a 2K Format-2 IDAL, and to convert a guest Format-1 IDAL to the 2K Format-2 variety. Let's remove the fence that prohibits them, and allow a guest to submit them if desired. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
The vfio_pin_pages() interface allows contiguous pages to be pinned as a single request, which is great for the 4K pages that are normally processed. Old IDA formats operate on 2K chunks, which makes this logic more difficult. Since these formats are rare, let's just invoke the page pinning one-at-a-time, instead of trying to group them. We can rework this code at a later date if needed. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
There are two scenarios that need to be addressed here. First, an ORB that does NOT have the Format-2 IDAL bit set could have both a direct-addressed CCW and an indirect-data-address CCW chained together. This means that the IDA CCW will contain a Format-1 IDAL, and can be easily converted to a 2K Format-2 IDAL. But it also means that the direct-addressed CCW needs to be converted to the same 2K Format-2 IDAL for consistency with the ORB settings. Secondly, a Format-1 IDAL is comprised of 31-bit addresses. Thus, we need to cast this IDAL to a pointer of ints while populating the list of addresses that are sent to vfio. Since the result of both of these is the use of the 2K IDAL variants, and the output of vfio-ccw is always a Format-2 IDAL (in order to use 64-bit addresses), make sure that the correct control bit gets set in the ORB when these scenarios occur. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
Today, we allocate memory for a list of IDAWs, and if the CCW being processed contains an IDAL we read that data from the guest into that space. We then copy each IDAW into the pa_iova array, or fabricate that pa_iova array with a list of addresses based on a direct-addressed CCW. Combine the reading of the guest IDAL with the creation of a pseudo-IDAL for direct-addressed CCWs, so that both CCW types have a "guest" IDAL that can be populated straight into the pa_iova array. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
The idal_nr_words() routine works well for 4K IDAWs, but lost its ability to handle the old 2K formats with the removal of 31-bit builds in commit 5a79859a ("s390: remove 31 bit support"). Since there's nothing preventing a guest from generating this IDAW format, let's re-introduce the math for them and use both when calculating the number of IDAWs based on the bits specified in the ORB. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
The intention is to read the first IDAW to determine the starting location of an I/O operation, knowing that the second and any/all subsequent IDAWs will be aligned per architecture. But, this read receives 64-bits of data, which is the size of a Format-2 IDAW. In the event that Format-1 IDAWs are presented, adjust the size of the read to 32-bits. The data will end up occupying the upper word of the target iova variable, so shift it down to the lower word for use as an address. (By definition, this IDAW format uses a 31-bit address, so the "sign" bit will always be off and there is no concern about sign extension.) Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
The rules of an IDAW are fairly simple: Each one can move no more than a defined amount of data, must not cross the boundary defined by that length, and must be aligned to that length as well. The first IDAW in a list is special, in that it does not need to adhere to that alignment, but the other rules still apply. Thus, by reading the first IDAW in a list, the number of IDAWs that will comprise a data transfer of a particular size can be calculated. Let's factor out the reading of that first IDAW with the logic that calculates the length of the list, to simplify the rest of the routine that handles the individual IDAWs. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
There are two possible ways the list of addresses that get passed to vfio are calculated. One is from a guest IDAL, which would be an array of (probably) non-contiguous addresses. The other is built from contiguous pages that follow the starting address provided by ccw->cda. page_array_alloc() attempts to simplify things by pre-populating this array from the starting address, but that's not needed for a CCW with an IDAL anyway so doesn't need to be in the allocator. Move it to the caller in the non-IDAL case, since it will be overwritten when reading the guest IDAL. Remove the initialization of the pa_page output pointers, since it won't be explicitly needed for either case. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
The allocation of our page_array struct calculates the number of 4K pages that would be needed to hold a certain number of bytes. But, since the number of pages that will be pinned is also calculated by the length of the IDAL, this logic is unnecessary. Let's pass that information in directly, and avoid the math within the allocator. Also, let's make this two allocations instead of one, to make it apparent what's happening within here. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
Everything about this allocation is harder than necessary, since the memory allocation is already aligned to our needs. Break them apart for readability, instead of doing the funky arithmetic. Of the structures that are involved, only ch_ccw needs the GFP_DMA flag, so the others can be allocated without it. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
The act of processing a fetched CCW has two components: 1) Process a Transfer-in-channel (TIC) CCW 2) Process any other CCW The former needs to look at whether the TIC jumps backwards into the current channel program or forwards into a new segment, while the latter just processes the CCW data address itself. Rather than passing the chain segment and index within it to the handlers for the above, and requiring each to calculate the elements it needs, simply pass the needed pointers directly. For the TIC, that means the CCW being processed and the location of the entire channel program which holds all segments. For the other CCWs, the page_array pointer is also needed to perform the page pinning, etc. While at it, rename ccwchain_fetch_direct to _ccw, to indicate what it is. The name "_direct" is historical, when it used to process a direct-addressed CCW, but IDAs are processed here too. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
It was suggested [1] that we replace the old copy_from_iova() routine (which pins a page, does a memcpy, and unpins the page) with the newer vfio_dma_rw() interface. This has a modest improvement in the overall time spent through the fsm_io_request() path, and simplifies some of the code to boot. [1] https://lore.kernel.org/r/20220706170553.GK693670@nvidia.com/Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
The output of vfio_ccw is always a Format-2 IDAL, but the code that explicitly sets this is buried in cp_init(). In fact the input is often already a Format-2 IDAL, and would be rejected (via the check in ccwchain_calc_length()) if it weren't, so explicitly setting it doesn't do much. Setting it way down here only makes it impossible to make decisions in support of other IDAL formats. Let's move that to where the rest of the ORB is set up, so that the CCW processing in cp_prefetch() is performed according to the contents of the unmodified guest ORB. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
Currently, vfio-ccw copies the ORB from the io_region to the channel_program struct being built. It then adjusts various pieces of that ORB to the values needed to be used by the SSCH issued by vfio-ccw in the host. This includes setting the subchannel key to the default, presumably because Linux doesn't do anything with non-zero storage keys itself. But it seems wrong to convert every I/O to the default key if the guest itself requested a non-zero subchannel (access) key. Any channel program that sets a non-zero key would expect the same key returned in the SCSW of the IRB, not zero, so best to allow that to occur unimpeded. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
There's no need to send in both the address of the subchannel struct, and an element within it, to populate the ORB. Pass the whole pointer and let cp_get_orb() take the pieces that are needed. Suggested-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Eric Farman authored
There is no longer an mdev struct accessible via a channel program struct, but there are some artifacts remaining that mention it. Clean them up. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
Normally a user can scroll back with PF7/PF8 if printed messages are outside of the visible screen area. This doesn't work when the kernel crashes, because the scrollback handling is done by the kernel, which is no longer alive after the kernel crash. Add code to always print all dirty lines in the screen buffer, so the user can scroll back with the terminal scrollback keys (Page Up/Down). Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
Now that lines are converted during output, the RA and SBA no longer need to get updated as an additional step. Instead set them when converting the line. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
Make TTY3270_UPDATE_ALL the sum of all TTY3270_* flags, so we don't need any special handling for it. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
When activating the view fails (in this case because the 3270 is disconnected) return from the notifer callback. Otherwise the system will deadlock. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
Use __packed __aligned instead of __attribute__((packed, aligned(X))); to match the rest of the file. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
In order to use the fs3270 one would need at least the ioctl definitions in uapi. Add two new include files in uapi, which contain: fs3270: ioctl number declarations + returned struct for TUBGETMOD. raw3270: all the orders, attributes and similar stuff used with 3270 terminals. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
fs3270 uses EWRITEA to clear the screen when a user opens /dev/3270/tub. However it misses the attribute byte after the EWRITEA, so (at least) x3270 complains about 'Record too short, missing write flags'. Add the missing flag byte to fix this. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
fix function prototypes split over two lines like: static void foobar(void) Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
remove a duplicate assignment reported by checkpatch. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
Fix a few missing braces and wrong placement of braces reported by checkpatch. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Sven Schnelle authored
Fix a few whitespace errors reported by checkpatch, namely superfluous whitespace, missing spaces and empty lines. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-