1. 21 Sep, 2020 3 commits
  2. 17 Sep, 2020 5 commits
  3. 16 Sep, 2020 8 commits
• s390/kasan: support protvirt with 4-level paging · c360c9a2
      Vasily Gorbik authored
Currently the kernel crashes in Kasan instrumentation code if
CONFIG_KASAN_S390_4_LEVEL_PAGING is used on a protected virtualization
capable machine where the ultravisor imposes addressing limitations on
the host and those limitations are lower than KASAN_SHADOW_OFFSET.
      
The problem is that Kasan has to know in advance where the vmalloc/modules
areas will be. With protected virtualization enabled, the vmalloc/modules
areas are moved down to the ultravisor secure storage limit, while Kasan
still expects them at the very end of the 4-level paging address space.
      
To fix that, make Kasan recognize when protected virtualization is enabled
and predefine the vmalloc/modules area positions so that they comply with
the ultravisor secure storage limit.
      
The Kasan shadow itself stays in place and might reside above the
ultravisor secure storage limit.
      
One slight difference compared to a kernel without Kasan enabled is that
the vmalloc/modules area position is not reverted to the default if
ultravisor initialization fails. It will still be below the ultravisor
secure storage limit.
      
      Kernel layout with kasan, 4-level paging and protected virtualization
      enabled (ultravisor secure storage limit is at 0x0000800000000000):
      ---[ vmemmap Area Start ]---
      0x0000400000000000-0x0000400080000000
      ---[ vmemmap Area End ]---
      ---[ vmalloc Area Start ]---
      0x00007fe000000000-0x00007fff80000000
      ---[ vmalloc Area End ]---
      ---[ Modules Area Start ]---
      0x00007fff80000000-0x0000800000000000
      ---[ Modules Area End ]---
      ---[ Kasan Shadow Start ]---
      0x0018000000000000-0x001c000000000000
      ---[ Kasan Shadow End ]---
      0x001c000000000000-0x0020000000000000         1P PGD I
      
      Kernel layout with kasan, 4-level paging and protected virtualization
      disabled/unsupported:
      ---[ vmemmap Area Start ]---
      0x0000400000000000-0x0000400060000000
      ---[ vmemmap Area End ]---
      ---[ Kasan Shadow Start ]---
      0x0018000000000000-0x001c000000000000
      ---[ Kasan Shadow End ]---
      ---[ vmalloc Area Start ]---
      0x001fffe000000000-0x001fffff80000000
      ---[ vmalloc Area End ]---
      ---[ Modules Area Start ]---
      0x001fffff80000000-0x0020000000000000
      ---[ Modules Area End ]---
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
• s390/protvirt: support ultravisor without secure storage limit · c2314cb2
      Vasily Gorbik authored
Avoid a potential crash due to the lack of a secure storage limit. Check
that max_sec_stor_addr is not 0 before adjusting the vmalloc position.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
• s390/protvirt: parse prot_virt option in the decompressor · 1d6671ae
      Vasily Gorbik authored
To make an early kernel address space layout definition possible, parse
the prot_virt option in the decompressor and pass it to the uncompressed
kernel. This enables Kasan to take the ultravisor secure storage limit into
consideration and pre-define the vmalloc position correctly.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
• s390/kasan: avoid unnecessary moving of vmemmap · 8f78657c
      Vasily Gorbik authored
Currently the vmemmap area is unconditionally moved beyond the Kasan
shadow memory. When Kasan is not enabled, the vmemmap area position is
calculated in setup_memory_end() and depends on limiting factors like the
ultravisor secure storage limit. Follow the same logic with Kasan enabled
as well and avoid unnecessary vmemmap area position changes unless the
area really intersects with the Kasan shadow.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
• s390/mm,ptdump: sort markers · ee4b2ce6
      Vasily Gorbik authored
Kasan configuration options and the amount of physical memory present can
affect the kernel memory layout. In particular, vmemmap, vmalloc and
modules might come before the Kasan shadow or after it. To make ptdump
output the markers in the right order, they have to be sorted.
      
To preserve the original order of markers with the same start address,
avoid sort() from lib/sort.c (which is not a stable sorting algorithm)
and sort the markers in place instead.
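The stable in-place sort described above can be sketched with a plain insertion sort. The struct layout below is illustrative (the real `struct addr_marker` lives in arch/s390/mm/dump_pagetables.c); the point is that insertion sort never reorders equal keys:

```c
/* Hypothetical marker layout for illustration only; the real struct is
 * defined in arch/s390/mm/dump_pagetables.c. */
struct addr_marker {
	unsigned long start_address;
	const char *name;
};

/* Insertion sort is stable: markers sharing a start address keep their
 * original relative order, which lib/sort.c's heapsort does not
 * guarantee. The marker arrays here are tiny, so O(n^2) is fine. */
static void sort_markers(struct addr_marker *m, int n)
{
	for (int i = 1; i < n; i++) {
		struct addr_marker tmp = m[i];
		int j = i - 1;

		/* shift strictly-greater entries right; "==" entries
		 * are never moved past, preserving their order */
		while (j >= 0 && m[j].start_address > tmp.start_address) {
			m[j + 1] = m[j];
			j--;
		}
		m[j + 1] = tmp;
	}
}
```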
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
• s390/pci: add missing pci_iov.h include · 4904e194
      Niklas Schnelle authored
This fixes a missing-prototype compiler warning spotted by the kernel
test robot.
      
      Fixes: abb95b75 ("s390/pci: consolidate SR-IOV specific code")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
• s390/mm,ptdump: add proper ifdefs · 48111b48
      Heiko Carstens authored
Use ifdefs instead of IS_ENABLED() to avoid a compile error
for !PTDUMP_DEBUGFS:
      
      arch/s390/mm/dump_pagetables.c: In function ‘pt_dump_init’:
      arch/s390/mm/dump_pagetables.c:248:64: error: ‘ptdump_fops’ undeclared (first use in this function); did you mean ‘pidfd_fops’?
         debugfs_create_file("kernel_page_tables", 0400, NULL, NULL, &ptdump_fops);
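The distinction matters because `if (IS_ENABLED(...))` keeps its body visible to the compiler even when the condition is constant-false, so every identifier in it must still be declared. A toy sketch (not the actual kernel code; `CONFIG_FEATURE` and `feature_status()` are made-up names) shows the difference:

```c
#include <stdio.h>

/* Toy illustration: an #ifdef removes the guarded code before the
 * compiler ever sees it, while an IS_ENABLED()-style "if (0)" still
 * compiles its body -- exactly the undeclared-ptdump_fops problem. */

/* #define CONFIG_FEATURE */	/* uncomment to compile the feature in */

#ifdef CONFIG_FEATURE
static void feature_init(void) { puts("feature enabled"); }
#endif

static const char *feature_status(void)
{
#ifdef CONFIG_FEATURE
	feature_init();		/* only compiled when feature_init() exists */
	return "feature enabled";
#else
	/* "if (IS_ENABLED(CONFIG_FEATURE)) feature_init();" would not
	 * build here: feature_init() is undeclared in this configuration */
	return "feature disabled";
#endif
}
```

The trade-off is that `#ifdef` sections get no compile coverage when the option is off, which is why the kernel prefers `IS_ENABLED()` whenever all referenced symbols exist in both configurations.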
Reported-by: Julian Wiedmann <jwi@linux.ibm.com>
Fixes: 08c8e685 ("s390: add ARCH_HAS_DEBUG_WX support")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
• s390/boot: enable .bss section for compressed kernel · 980d5f9a
      Alexander Egorenkov authored
- Support static uninitialized variables in the compressed kernel.
- Remove the chkbss script.
- Get rid of workarounds for not having a .bss section.
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
  4. 14 Sep, 2020 19 commits
  5. 26 Aug, 2020 5 commits
• s390: convert to GENERIC_VDSO · 4bff8cb5
      Sven Schnelle authored
      Convert s390 to generic vDSO. There are a few special things on s390:
      
- The vDSO can be called without a stack frame - glibc did this in the
  past, so we need to allocate a stack frame of our own.
      
- The former assembly code used stcke to get the TOD clock and applied
  time steering to it. We need to do the same in the new code. This is done
  in the architecture specific __arch_get_hw_counter function. The steering
  information is stored in an architecture specific area in the vDSO data.
      
      - CPUCLOCK_VIRT is now handled with a syscall fallback, which might
        be slower/less accurate than the old implementation.
      
      The getcpu() function stays as an assembly function because there is no
      generic implementation and the code is just a few lines.
      
Performance numbers from my system, doing 100 million gettimeofday() calls:
      
      Plain syscall: 8.6s
      Generic VDSO:  1.3s
      old ASM VDSO:  1s
      
      So it's a bit slower but still much faster than syscalls.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
• s390/checksum: coding style changes · 98ad45fb
      Heiko Carstens authored
Apply some coding style changes which hopefully make the code
look a bit less odd.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
• s390/checksum: have consistent calculations · 612ad078
      Heiko Carstens authored
      Use "|" instead of "+" within csum_fold() for consistency reasons,
      like in the rest of the file.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
• s390/checksum: make ip_fast_csum() faster · 614b4f5d
      Heiko Carstens authored
Convert ip_fast_csum() so it doesn't call csum_partial() but instead
open codes the checksum calculation. The problem with csum_partial() is
that it makes use of the cksm instruction, which has high startup
costs and therefore is only very fast when used on larger memory
regions.
      
      IPv4 headers however are small in size (5-16 32-bit words). The open
      coded variant calculates the checksum in ~30% of the time compared to
      the old variant (z14, march=z196).
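A portable C sketch of such an open-coded header checksum (illustrative only; the actual s390 version is architecture-tuned and the function name here is made up) sums the header as 32-bit words in a wide accumulator and folds at the end:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of an open-coded IPv4 header checksum: sum the header as
 * 32-bit words into a 64-bit accumulator -- no cksm instruction, no
 * startup cost -- then fold down to 16 bits and complement.
 * ihl is the header length in 32-bit words (5..15 for IPv4). */
static uint16_t ip_fast_csum_sketch(const void *iph, unsigned int ihl)
{
	const uint8_t *p = iph;
	uint64_t sum = 0;
	uint32_t word;

	for (unsigned int i = 0; i < ihl; i++) {
		memcpy(&word, p + 4 * i, sizeof(word)); /* alignment-safe load */
		sum += word;
	}
	sum = (sum & 0xffffffff) + (sum >> 32);	/* fold 64 -> 32 bits */
	while (sum >> 16)			/* fold 32 -> 16 bits, */
		sum = (sum & 0xffff) + (sum >> 16); /* absorbing carries */
	return (uint16_t)~sum;
}
```

Verifying a header whose checksum field is already correct yields 0 regardless of host byte order, since the folded one's-complement sum is then all-ones on either endianness.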
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
• s390/checksum: rewrite csum_tcpudp_nofold() · bb4644b1
      Heiko Carstens authored
      Rewrite csum_tcpudp_nofold() so that the generated code will not
      contain branches. The old implementation was also optimized for
      machines which came with "add logical with carry" instructions,
      however the compiler doesn't generate them anymore. This is most
      likely because those instructions are slower.
      
However, with the old code the compiler generates a lot of branches,
which usually isn't helpful. Therefore rewrite the code.
      
      In a tight loop this doesn't make any difference since the branch
      prediction unit does its job.
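A branch-free pseudo-header sum can be sketched as follows (a sketch only, not the actual s390 code; the function name is made up and the len/proto term assumes the big-endian layout used on s390). Accumulating in 64 bits absorbs every carry, so no "jump on carry, add one" sequences are needed:

```c
#include <stdint.h>

/* Branch-free sketch: all additions go into a 64-bit accumulator, and
 * the end-around carries of one's-complement arithmetic are folded back
 * in with plain shifts and masks -- no conditional code at all. */
static uint32_t csum_tcpudp_nofold_sketch(uint32_t saddr, uint32_t daddr,
					  uint32_t len, uint8_t proto,
					  uint32_t sum)
{
	uint64_t s = sum;

	s += saddr;
	s += daddr;
	s += len + proto;	/* big-endian pseudo-header term */
	s = (s & 0xffffffff) + (s >> 32);	/* fold carries back in */
	s = (s & 0xffffffff) + (s >> 32);	/* second fold for the last carry */
	return (uint32_t)s;
}
```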
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>