- 09 Nov, 2022 1 commit
-
-
Kefeng Wang authored
Most architectures (except arm64/x86/sparc) simply return 1 for kern_addr_valid(), which is only used in read_kcore(). That code calls copy_from_kernel_nofault(), which already checks whether the address is a valid kernel address, so there is no need for kern_addr_valid(); remove it.
Link: https://lkml.kernel.org/r/20221018074014.185687-1-wangkefeng.wang@huawei.com
Signed-off-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Heiko Carstens <hca@linux.ibm.com> [s390] Acked-by:
Christoph Hellwig <hch@lst.de> Acked-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: <aou@eecs.berkeley.edu> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jonas Bonn <jonas@southpole.se> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Xuerui Wang <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
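A minimal sketch of the pattern this change relies on, assuming only that copy_from_kernel_nofault() returns 0 on success and a negative error code when the source is not a readable kernel address; the helper name is made up for illustration:

    #include <linux/uaccess.h>

    /* Probe a kernel address by actually copying from it; no kern_addr_valid()
     * style page table check is needed. (Illustrative helper, not from the patch.) */
    static bool kernel_addr_readable(const void *addr)
    {
            unsigned long tmp;

            return copy_from_kernel_nofault(&tmp, addr, sizeof(tmp)) == 0;
    }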
-
- 14 Sep, 2022 2 commits
-
-
Alexander Gordeev authored
Function memcpy_real() is a universal data mover that does not require DAT mode to be able to read from a physical address. Its advantage is the ability to read from any address, even those for which no kernel virtual mapping exists. Although memcpy_real() is interrupt-safe, there are no handlers that make use of this function. Compiler instrumentation has to be disabled and a separate no-DAT stack has to be used to allow execution of the function once DAT mode is disabled. Rework memcpy_real() to overcome these shortcomings. As a result, data copying (which is primarily reading out a crashed system's memory by a user process) is executed on a regular stack with interrupts enabled. Also, the memcpy_real_buf swap buffer becomes unnecessary and the swapping is eliminated. The above is achieved by using a fixed virtual address range that spans a single page and remapping that page repeatedly when memcpy_real() is called for a particular physical address.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Alexander Gordeev authored
Temporary unsetting of the prefix page in the memcpy_absolute() routine poses a risk of executing a code path with an unexpectedly disabled prefix page. This rework avoids the prefix page uninstalling and the disabling of normal and machine check interrupts when accessing absolute zero memory.

Although the memcpy_absolute() routine can access the whole memory, it is only used to update the absolute zero lowcore. This rework therefore introduces a new mechanism for the absolute zero lowcore access and scraps the memcpy_absolute() routine for good. Instead, an area is reserved in the virtual memory that is used for the absolute lowcore access only. That area holds an array of 8KB virtual mappings - one per CPU. Whenever a CPU is brought online, the corresponding item is mapped to the real address of the previously installed prefix page.

The absolute zero lowcore access works like this: a CPU calls the new primitive get_abs_lowcore() to obtain its 8KB mapping as a pointer to the struct lowcore. Virtual address references to that pointer get translated to the real addresses of the prefix page, which in turn gets swapped with the absolute zero memory addresses due to prefixing. Once the pointer is not needed it must be released with the put_abs_lowcore() primitive:

    struct lowcore *abs_lc;
    unsigned long flags;

    abs_lc = get_abs_lowcore(&flags);
    abs_lc->... = ...;
    put_abs_lowcore(abs_lc, flags);

To ensure the described mechanism works, large segment- and region-table entries must be avoided for the 8KB mappings. Failure to do so results in usage of the Region-Frame Absolute Address (RFAA) or Segment-Frame Absolute Address (SFAA) large page fields. In that case absolute addresses would be used to address the prefix page instead of the real ones and the prefixing would get bypassed.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 06 Aug, 2022 1 commit
-
-
Alexander Gordeev authored
This reverts commit 7d06fed7, which introduced vmem_mutex locking in the vmem_map_4k_page() function called from smp_reinit_ipl_cpu() with interrupts disabled. While this is a pre-SMP early initcall, so no other CPUs are running in parallel and no other code takes vmem_mutex at this boot stage, it still needs to be fixed.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
-
- 28 Jul, 2022 1 commit
-
-
Alexander Gordeev authored
Temporary unsetting of the prefix page in the memcpy_absolute() routine poses a risk of executing a code path with an unexpectedly disabled prefix page. This rework avoids the prefix page uninstalling and the disabling of normal and machine check interrupts when accessing absolute zero memory.

Although the memcpy_absolute() routine can access the whole memory, it is only used to update the absolute zero lowcore. This rework therefore introduces a new mechanism for the absolute zero lowcore access and scraps the memcpy_absolute() routine for good. Instead, an area is reserved in the virtual memory that is used for the absolute lowcore access only. That area holds an array of 8KB virtual mappings - one per CPU. Whenever a CPU is brought online, the corresponding item is mapped to the real address of the previously installed prefix page.

The absolute zero lowcore access works like this: a CPU calls the new primitive get_abs_lowcore() to obtain its 8KB mapping as a pointer to the struct lowcore. Virtual address references to that pointer get translated to the real addresses of the prefix page, which in turn gets swapped with the absolute zero memory addresses due to prefixing. Once the pointer is not needed it must be released with the put_abs_lowcore() primitive:

    struct lowcore *abs_lc;
    unsigned long flags;

    abs_lc = get_abs_lowcore(&flags);
    abs_lc->... = ...;
    put_abs_lowcore(abs_lc, flags);

To ensure the described mechanism works, large segment- and region-table entries must be avoided for the 8KB mappings. Failure to do so results in usage of the Region-Frame Absolute Address (RFAA) or Segment-Frame Absolute Address (SFAA) large page fields. In that case absolute addresses would be used to address the prefix page instead of the real ones and the prefixing would get bypassed.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
-
- 19 Jul, 2022 1 commit
-
-
Claudio Imbrenda authored
When ptep_get_and_clear_full is called for a mm teardown, we will now attempt to destroy the secure pages. This will be faster than export. In case it was not a teardown, or if for some reason the destroy page UVC failed, we try with an export page, like before.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Link: https://lore.kernel.org/r/20220628135619.32410-11-imbrenda@linux.ibm.com
Message-Id: <20220628135619.32410-11-imbrenda@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
-
- 18 Jul, 2022 1 commit
-
-
Anshuman Khandual authored
This enables ARCH_HAS_VM_GET_PAGE_PROT on the platform and exports the standard vm_get_page_prot() implementation via DECLARE_VM_GET_PAGE_PROT, which looks up a private and static protection_map[] array. Subsequently all the __SXXX and __PXXX macros, which are no longer needed, can be dropped.
Link: https://lkml.kernel.org/r/20220711070600.2378316-19-anshuman.khandual@arm.com
Signed-off-by:
Anshuman Khandual <anshuman.khandual@arm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Christoph Hellwig <hch@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
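For context, a hedged sketch of the pattern the commit describes; the concrete s390 protection values and table entries shown here are assumptions, only the protection_map[]/DECLARE_VM_GET_PAGE_PROT shape is the point:

    /* Private, static protection map indexed by VM_READ|VM_WRITE|VM_EXEC|VM_SHARED. */
    static pgprot_t protection_map[16] __ro_after_init = {
            [VM_NONE]                               = PAGE_NONE,
            [VM_READ]                               = PAGE_RO,
            [VM_SHARED | VM_READ | VM_WRITE]        = PAGE_RW,
            /* ... remaining VM_WRITE/VM_EXEC/VM_SHARED combinations ... */
    };

    /* Emits the standard vm_get_page_prot() that looks up protection_map[]. */
    DECLARE_VM_GET_PAGE_PROT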
-
- 13 Jul, 2022 1 commit
-
-
Claudio Imbrenda authored
Use the new protected_count field as a counter instead of the old is_protected flag. This will be used in upcoming patches. Increment the counter when a secure configuration is created, and decrement it when it is destroyed. Previously the flag was set when the set secure parameters UVC was performed.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20220628135619.32410-6-imbrenda@linux.ibm.com
Message-Id: <20220628135619.32410-6-imbrenda@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
-
- 10 May, 2022 2 commits
-
-
David Hildenbrand authored
Let's use bit 52, which is unused.
Link: https://lkml.kernel.org/r/20220329164329.208407-7-david@redhat.com
Signed-off-by:
David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Don Dutile <ddutile@redhat.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Liang Zhang <zhangliang5@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Nadav Amit <namit@vmware.com> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
Bit 52 and bit 55 don't have to be zero: they only trigger a translation-specification exception if the PTE is marked as valid, which is not the case for swap ptes. Document which bits are used for what, and which ones are unused.
Link: https://lkml.kernel.org/r/20220329164329.208407-6-david@redhat.com
Signed-off-by:
David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Don Dutile <ddutile@redhat.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Liang Zhang <zhangliang5@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Nadav Amit <namit@vmware.com> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- 10 Mar, 2022 1 commit
-
-
Vasily Gorbik authored
With z10 as the minimum supported machine generation, many ".insn" encodings can now be converted to instruction names. There are a couple of exceptions:
- stfle is used from the als code built for z900 and cannot be converted
- a few ".insn" directives encode unsupported instruction formats
The generated code is identical before/after this change.
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 01 Mar, 2022 4 commits
-
-
Heiko Carstens authored
Convert the pgtable code so pte_val()/pXd_val() aren't used as lvalues anymore. This allows, in a later step, converting pte_val()/pXd_val() to functions, which in turn makes it impossible to use these macros to modify page table entries like they have been used before. Therefore a construct like this, which writes directly into a page table:

    pte_val(*pte) = __pa(addr) | prot;

isn't possible anymore with the last step of this series.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
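As an illustration only (not a hunk from the patch), here is the same page table update written the old way and the new way; the wrapper function is hypothetical and set_pte() is the helper introduced further down this page:

    static void example_map_page(pte_t *pte, unsigned long addr, unsigned long prot)
    {
            /* old style, using pte_val() as an lvalue - no longer possible: */
            /* pte_val(*pte) = __pa(addr) | prot; */

            /* new style: build the entry value, then store it via set_pte() */
            set_pte(pte, __pte(__pa(addr) | prot));
    }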
-
Heiko Carstens authored
Use the new set_pXd()/set_pte() helper functions at all places where page table entries are modified.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Heiko Carstens authored
Add set_pte_bit()/clear_pte_bit() and set_pXd_bit()/clear_pXd_bit() helper functions which are supposed to be used when bits within ptes/pXds are set/cleared. The only point of these helper functions is to get more readable code. This is quite similar to what arm64 has.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
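A hedged sketch of what such helpers typically look like (pte_val()/pgprot_val() accessors assumed; the real patch also adds the pmd/pud/p4d/pgd variants):

    /* Return a new pte value with the given protection bits set ... */
    static inline pte_t set_pte_bit(pte_t pte, pgprot_t prot)
    {
            return __pte(pte_val(pte) | pgprot_val(prot));
    }

    /* ... or cleared. */
    static inline pte_t clear_pte_bit(pte_t pte, pgprot_t prot)
    {
            return __pte(pte_val(pte) & ~pgprot_val(prot));
    }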
-
Heiko Carstens authored
Add set_pXd()/set_pte() helper functions which must be used to update page table entries. The new helpers use WRITE_ONCE() to make sure that a page table entry is written to only once. Without this the compiler could otherwise generate code which writes several times to a page table entry when updating its contents from invalid to valid, which could lead to surprising results especially for multithreaded processes...
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
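A minimal sketch of the idea (the actual helpers cover pmd/pud/p4d/pgd as well): the single WRITE_ONCE() store keeps the compiler from splitting or repeating the page table update:

    static inline void set_pte(pte_t *ptep, pte_t pte)
    {
            /* one store, never torn or duplicated by the compiler */
            WRITE_ONCE(*ptep, pte);
    }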
-
- 27 Oct, 2021 1 commit
-
-
Claudio Imbrenda authored
Introduce variants of the convert and destroy page functions that also clear the PG_arch_1 bit used to mark them as secure pages. The PG_arch_1 flag is always allowed to overindicate; using the new functions introduced here makes it possible to reduce the extent of overindication and thus improve performance. These new functions can only be called on pages for which a reference is already being held.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lore.kernel.org/r/20210920132502.36111-7-imbrenda@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
- 26 Oct, 2021 1 commit
-
-
Alexander Gordeev authored
Instructions IPTE, IDTE and CRDTE accept a Page-Table Origin as one of their arguments, but the pgtable virtual address is passed instead. Fix that and also update the crdte() prototype to conform to its csp() and cspg() friends.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 27 Jul, 2021 2 commits
-
-
Heiko Carstens authored
Print the real pte, pmd, etc. values instead of some hashed value. Otherwise debugging would be even more difficult. This also matches what most other architectures are doing.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
Heiko Carstens authored
Use pr_err() to use a proper printk level.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
- 01 Jul, 2021 2 commits
-
-
Anshuman Khandual authored
Currently most platforms define pmd_pgtable() as pmd_page(), duplicating the same code all over. Instead just define a default value, i.e. pmd_page(), for pmd_pgtable() and let platforms override it when required via <asm/pgtable.h>. All the existing platforms that override pmd_pgtable() have had their definition moved into their respective <asm/pgtable.h> header so that it precedes the new generic definition. This makes it much cleaner with reduced code.
Link: https://lkml.kernel.org/r/1623646133-20306-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by:
Anshuman Khandual <anshuman.khandual@arm.com> Acked-by:
Geert Uytterhoeven <geert@linux-m68k.org> Acked-by:
Mike Rapoport <rppt@linux.ibm.com> Cc: Nick Hu <nickhu@andestech.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
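The generic default described above boils down to roughly this (a sketch, not the literal patch hunk):

    /* Generic fallback: architectures that need something else define their
     * own pmd_pgtable() in <asm/pgtable.h> before this default is seen. */
    #ifndef pmd_pgtable
    #define pmd_pgtable(pmd) pmd_page(pmd)
    #endif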
-
Anshuman Khandual authored
Currently most platforms define FIRST_USER_ADDRESS as 0UL, duplicating the same code all over. Instead just define a generic default value (i.e. 0UL) for FIRST_USER_ADDRESS and let the platforms override it when required. This makes it much cleaner with reduced code. The default FIRST_USER_ADDRESS is skipped in <linux/pgtable.h> when the given platform overrides its value via <asm/pgtable.h>.
Link: https://lkml.kernel.org/r/1620615725-24623-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by:
Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by:
Mike Rapoport <rppt@linux.ibm.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> [RISC-V] Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
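Again a sketch of the generic default this describes:

    /* Lowest user-space address unless <asm/pgtable.h> overrides it. */
    #ifndef FIRST_USER_ADDRESS
    #define FIRST_USER_ADDRESS 0UL
    #endif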
-
- 29 Jun, 2021 1 commit
-
-
Daniel Axtens authored
Commit c65e774f ("x86/mm: Make PGDIR_SHIFT and PTRS_PER_P4D variable") made PTRS_PER_P4D variable on x86 and introduced MAX_PTRS_PER_P4D as a constant for cases which need a compile-time constant (e.g. fixed-size arrays). powerpc likewise has boot-time selectable MMU features which can cause other mm "constants" to vary. For KASAN, we have some static PTE/PMD/PUD/P4D arrays so we need compile-time maximums for all these constants. Extend the MAX_PTRS_PER_ idiom, and place default definitions in include/pgtable.h. These define MAX_PTRS_PER_x to be PTRS_PER_x unless an architecture has defined MAX_PTRS_PER_x in its arch headers. Clean up pgtable-nop4d.h and s390's MAX_PTRS_PER_P4D definitions while we're at it: both can just pick up the default now. Link: https://lkml.kernel.org/r/20210624034050.511391-4-dja@axtens.netSigned-off-by:
Daniel Axtens <dja@axtens.net> Acked-by:
Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by:
Marco Elver <elver@google.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
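The MAX_PTRS_PER_ idiom reduces to defaults of this shape (a sketch; the same pattern covers PTE/PMD/PUD as well):

    /* Compile-time maximum: equal to the (possibly boot-time variable) runtime
     * value unless the architecture supplies its own upper bound. */
    #ifndef MAX_PTRS_PER_P4D
    #define MAX_PTRS_PER_P4D PTRS_PER_P4D
    #endif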
-
- 18 Jun, 2021 2 commits
-
-
Heiko Carstens authored
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Vasily Gorbik authored
Currently there are two separate places where the kernel memory layout has to be known and adjusted:
1. early kasan setup.
2. paging setup later.
Those two places had to be kept in sync and adjusted to reflect peculiar technical details of one another. With additional factors which influence the kernel memory layout, like the ultravisor secure storage limit, the complexity of keeping the two in sync grew even more. Besides that, if we look forward towards creating an identity mapping and enabling DAT before jumping into the uncompressed kernel - that would also require full knowledge of and control over the kernel memory layout. So, de-duplicate and move the kernel memory layout setup logic into the decompressor.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 07 Jun, 2021 1 commit
-
-
Niklas Schnelle authored
In commit b02002cc ("s390/pci: Implement ioremap_wc/prot() with MIO") we implemented both ioremap_wc() and ioremap_prot() however until now we had not set HAVE_IOREMAP_PROT in Kconfig, do so now. This also requires implementing pte_pgprot() as this is used in the generic_access_phys() code enabled by CONFIG_HAVE_IOREMAP_PROT. As with ioremap_wc() we need to take the MMIO Write Back bit index into account. Moreover since the pgprot value returned from pte_pgprot() is to be used for mappings into kernel address space we must make sure that it uses appropriate kernel page table protection bits. In particular a pgprot value originally coming from userspace could have the _PAGE_PROTECT bit set to enable fault based dirty bit accounting which would then make the mapping inaccessible when used in kernel address space. Fixes: b02002cc ("s390/pci: Implement ioremap_wc/prot() with MIO") Reviewed-by:
Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by:
Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com>
-
- 23 Feb, 2021 2 commits
-
-
Alexander Gordeev authored
There is little sense in applying __pa() to a physical address, but that is what the pfn_pXd() macros do.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Alexander Gordeev authored
This update fixes the semantics of the pXd_deref macros, which are expected to return a CPU-addressable pointer.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 23 Nov, 2020 1 commit
-
-
Heiko Carstens authored
Create a region 3 page table which contains only invalid entries, and use that via "s390_invalid_asce" instead of the kernel ASCE
- whenever there is no user address space available, e.g. during early startup
- as an intermediate ASCE when address spaces are switched
This makes sure that user space accesses in such situations are guaranteed to fail.
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
- 09 Nov, 2020 1 commit
-
-
Heiko Carstens authored
We've seen several occurrences in the past where the default vmalloc size of 128GB is not sufficient. Therefore extend the default size.
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
- 03 Nov, 2020 1 commit
-
-
Gerald Schaefer authored
pmd/pud_deref() assume that they will never operate on large pmd/pud entries, and therefore only use the non-large _xxx_ENTRY_ORIGIN mask. With commit 9ec8fa8d ("s390/vmemmap: extend modify_pagetable() to handle vmemmap"), that assumption is no longer true, at least for pmd_deref(). In theory, we could end up with wrong addresses because some of the non-address bits of a large entry would not be masked out. In practice, this does not (yet) show any impact, because vmemmap_free() is currently never used for s390. Fix pmd/pud_deref() to check for the entry type and use the _xxx_ENTRY_ORIGIN_LARGE mask for large entries. While at it, also move pmd/pud_pfn() around, in order to avoid code duplication, because they do the same thing.
Fixes: 9ec8fa8d ("s390/vmemmap: extend modify_pagetable() to handle vmemmap")
Cc: <stable@vger.kernel.org> # 5.9
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
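A hedged sketch of the fixed pmd_deref() along the lines described (s390 mask and predicate names assumed; pud_deref() is analogous with the region 3 masks):

    static inline unsigned long pmd_deref(pmd_t pmd)
    {
            unsigned long origin_mask;

            /* large entries keep their origin in a different field, so pick
             * the matching mask before extracting the address */
            origin_mask = _SEGMENT_ENTRY_ORIGIN;
            if (pmd_large(pmd))
                    origin_mask = _SEGMENT_ENTRY_ORIGIN_LARGE;
            return pmd_val(pmd) & origin_mask;
    }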
-
- 26 Sep, 2020 1 commit
-
-
Vasily Gorbik authored
Currently, to make sure that every page table entry is read just once, the gup_fast walks perform READ_ONCE and pass the pXd value down to the next gup_pXd_range function by value, e.g.:

    static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
                             unsigned int flags, struct page **pages, int *nr)
    ...
            pudp = pud_offset(&p4d, addr);

This function passes a reference to that local value copy to pXd_offset, and might get the very same pointer in return. This happens when the level is folded (on most arches), and that pointer should not be iterated. On s390, due to the fact that each task might have different 5-, 4- or 3-level address translation and hence different levels folded, the logic is more complex and a non-iterable pointer to a local copy leads to severe problems.

Here is an example of what happens with gup_fast on s390, for a task with 3-level paging, crossing a 2 GB pud boundary:

    // addr = 0x1007ffff000, end = 0x10080001000
    static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
                             unsigned int flags, struct page **pages, int *nr)
    {
            unsigned long next;
            pud_t *pudp;

            // pud_offset returns &p4d itself (a pointer to a value on stack)
            pudp = pud_offset(&p4d, addr);
            do {
                    // on second iteration reading "random" stack value
                    pud_t pud = READ_ONCE(*pudp);

                    // next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390
                    next = pud_addr_end(addr, end);
                    ...
            } while (pudp++, addr = next, addr != end); // pudp++ iterating over stack

            return 1;
    }

This happens since s390 moved to the common gup code with commit d1874a0c ("s390/mm: make the pxd_offset functions more robust") and commit 1a42010c ("s390/mm: convert to the generic get_user_pages_fast code"). s390 tried to mimic static level folding by changing the pXd_offset primitives to always calculate the top level page table offset in pgd_offset and just return the value passed when pXd_offset has to act as folded.

What is crucial for gup_fast, and what has been overlooked, is that PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly. And the latter is not possible with dynamic folding.

To fix the issue, in addition to the pXd values pass the original pXdp pointers down to the gup_pXd_range functions. And introduce pXd_offset_lockless helpers, which take an additional pXd entry value parameter. This has already been discussed in https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1

Fixes: 1a42010c ("s390/mm: convert to the generic get_user_pages_fast code")
Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Reviewed-by:
Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by:
Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by:
Jason Gunthorpe <jgg@nvidia.com> Reviewed-by:
Mike Rapoport <rppt@linux.ibm.com> Reviewed-by:
John Hubbard <jhubbard@nvidia.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: <stable@vger.kernel.org> [5.2+] Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856292-ext-8676@work.hoursSigned-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
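For architectures with static level folding the new lockless helpers can simply fall back to the existing ones; a sketch of the generic fallbacks (the s390 variants additionally make use of the passed-in pXdp pointer):

    /* Take both the original pXdp pointer and the local entry copy; the default
     * keeps the old behaviour of pointing at the caller's local copy. */
    #ifndef pud_offset_lockless
    #define pud_offset_lockless(p4dp, p4d, address) pud_offset(&(p4d), address)
    #endif
    #ifndef pmd_offset_lockless
    #define pmd_offset_lockless(pudp, pud, address) pmd_offset(&(pud), address)
    #endif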
-
- 14 Sep, 2020 2 commits
-
-
Vasily Gorbik authored
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
[hca@linux.ibm.com: add more markers, rename some markers]
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Niklas Schnelle authored
With our current support for the new MIO PCI instructions, write combining/write back MMIO memory can be obtained via the pci_iomap_wc() and pci_iomap_wc_range() functions. This is achieved by using the write back address for a specific bar, as provided in clp_store_query_pci_fn(). These functions are however not widely used, and instead drivers often rely on ioremap_wc() and ioremap_prot(), which on other platforms enable write combining using a PTE flag set through the pgprot value. While we do not have a write combining flag in the low order flag bits of the PTE like x86_64 does, with MIO support there is a write back bit in the physical address (bit 1 on z15) and thus also in the PTE. Which bit is used to toggle write back, and whether it is available at all, is however not fixed in the architecture. Instead we get this information from the CLP Store Logical Processor Characteristics for PCI command. When the write back bit is not provided we fall back to the existing behavior.
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 01 Jul, 2020 1 commit
-
-
David Hildenbrand authored
I can't come up with a satisfying reason why we still need the memory segment list. We used to represent in the list:
- boot memory
- standby memory added via add_memory()
- loaded dcss segments
When loading/unloading dcss segments, we already track them in a separate list and check for overlaps (arch/s390/mm/extmem.c:segment_overlaps_others()) when loading segments. The overlap check was introduced for some segments in commit b2300b9e ("[S390] dcssblk: add >2G DCSSs support and stacked contiguous DCSSs support.") and was extended to cover all dcss segments in commit ca571146 ("s390/extmem: remove code for 31 bit addressing mode"). Although I doubt that overlaps with boot memory and standby memory are relevant, let's reshuffle the checks in load_segment() to request the resource first. This will bail out in case we have overlaps with other resources (esp. boot memory and standby memory). The order is now different compared to segment_unload() and segment_unload(), but that should not matter. This smells like a leftover from ancient times, let's get rid of it. We can now convert vmem_remove_mapping() into a void function - everybody ignored the return value already.
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200625150029.45019-1-david@redhat.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [DCSS]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
-
- 09 Jun, 2020 2 commits
-
-
Mike Rapoport authored
All architectures define pte_index() as (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1), and all architectures define pte_offset_kernel() as an entry in the array of PTEs indexed by the pte_index(). For most architectures the pte_offset_kernel() implementation relies on the availability of pmd_page_vaddr(), which converts a PMD entry value to the virtual address of the page containing the PTEs array. Let's move the x86 definitions of the PTE accessors to the generic place in <linux/pgtable.h> and then simply drop the respective definitions from the other architectures. The architectures that didn't provide pmd_page_vaddr() are updated to have that defined. The generic implementation of pte_offset_kernel() can be overridden by an architecture, and alpha makes use of this because it has special ordering requirements for its version of pte_offset_kernel().
[rppt@linux.ibm.com: v2]
Link: http://lkml.kernel.org/r/20200514170327.31389-11-rppt@kernel.org
[rppt@linux.ibm.com: update]
Link: http://lkml.kernel.org/r/20200514170327.31389-12-rppt@kernel.org
[rppt@linux.ibm.com: update]
Link: http://lkml.kernel.org/r/20200514170327.31389-13-rppt@kernel.org
[akpm@linux-foundation.org: fix x86 warning]
[sfr@canb.auug.org.au: fix powerpc build]
Link: http://lkml.kernel.org/r/20200607153443.GB738695@linux.ibm.com
Signed-off-by:
Mike Rapoport <rppt@linux.ibm.com> Signed-off-by:
Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-10-rppt@kernel.orgSigned-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
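A sketch of the consolidated generic accessors described above (close to, but not necessarily identical with, the final <linux/pgtable.h> code):

    static inline unsigned long pte_index(unsigned long address)
    {
            return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
    }

    #ifndef pte_offset_kernel
    static inline pte_t *pte_offset_kernel(pmd_t *pmd, unsigned long address)
    {
            /* pmd_page_vaddr() turns the PMD entry into the virtual address
             * of the page holding the PTE array */
            return (pte_t *)pmd_page_vaddr(*pmd) + pte_index(address);
    }
    #define pte_offset_kernel pte_offset_kernel
    #endif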
-
Mike Rapoport authored
The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by:
Mike Rapoport <rppt@linux.ibm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.orgSigned-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 05 May, 2020 1 commit
-
-
Aneesh Kumar K.V authored
We will use this in a later patch to do a tlb flush when clearing pmd entries.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-22-aneesh.kumar@linux.ibm.com
-
- 04 Mar, 2020 1 commit
-
-
Gerald Schaefer authored
On s390 there currently is no implementation of pud_write(). That was ok as long as we had our own implementation of get_user_pages_fast(), which checked for pud protection by testing the bit directly w/o using pud_write(). The other callers of pud_write() are not reachable on s390. After commit 1a42010c ("s390/mm: convert to the generic get_user_pages_fast code") we use the generic get_user_pages_fast(), which does call pud_write() in pud_access_permitted() for FOLL_WRITE access on a large pud. Without an s390 specific pud_write(), the generic version is called, which contains a BUG() statement to remind us that we don't have a proper implementation. This results in a kernel panic. Fix this by providing an implementation of pud_write().
Cc: <stable@vger.kernel.org> # 5.2+
Fixes: 1a42010c ("s390/mm: convert to the generic get_user_pages_fast code")
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
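A hedged sketch of what an s390 pud_write() as described can look like (region 3 entry bit name assumed), overriding the generic stub that BUG()s:

    #define pud_write pud_write
    static inline int pud_write(pud_t pud)
    {
            /* a large pud is writable when its region 3 entry write bit is
             * set, mirroring what pmd_write() does for segment entries */
            return (pud_val(pud) & _REGION3_ENTRY_WRITE) != 0;
    }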
-
- 27 Feb, 2020 1 commit
-
-
Claudio Imbrenda authored
This provides the basic ultravisor calls and page table handling to cope with secure guests:
- provide arch_make_page_accessible
- make pages accessible after unmapping of secure guests
- provide the ultravisor commands convert to/from secure
- provide the ultravisor commands pin/unpin shared
- provide callbacks to make pages secure (inaccessible)
 - we check for the expected pin count to only make pages secure if the host is not accessing them
 - we fence hugetlbfs for secure pages
- add missing radix-tree include into gmap.h

The basic idea is that a page can have 3 states: secure, normal or shared. The hypervisor can call into a firmware function called ultravisor that allows to change the state of a page: convert from/to secure. The convert from secure will encrypt the page and make it available to the host and host I/O. The convert to secure will remove the host capability to access this page. The design is that on convert to secure we will wait until writeback and page refs are indicating no host usage. At the same time the convert from secure (export to host) will be called in common code when the refcount or the writeback bit is already set. This avoids races between convert from and to secure.

Then there is also the concept of shared pages. Those are kind of secure where the host can still access those pages. We need to be notified when the guest "unshares" such a page, basically doing a convert to secure by then. There is a call "pin shared page" that we use instead of convert from secure when possible. We do use PG_arch_1 as an optimization to minimize the convert from secure/pin shared. Several comments have been added in the code to explain the logic in the relevant places.
Co-developed-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
- 04 Feb, 2020 1 commit
-
-
Steven Price authored
walk_page_range() is going to be allowed to walk page tables other than those of user space. For this it needs to know when it has reached a 'leaf' entry in the page tables. This information is provided by the p?d_leaf() functions/macros. For s390, pud_large() and pmd_large() are already implemented as static inline functions. Add a macro to provide the p?d_leaf names for the generic code to use.
Link: http://lkml.kernel.org/r/20191218162402.45610-9-steven.price@arm.com
Signed-off-by:
Steven Price <steven.price@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <jhogan@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: "Liang, Kan" <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Burton <paul.burton@mips.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Zong Li <zong.li@sifive.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
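The addition amounts to a couple of one-line macros; a sketch of the shape (the s390 large-entry helpers already exist, only the leaf spelling expected by the generic walker is new):

    /* Generic page table walkers test p?d_leaf(); on s390 a "large" entry
     * is exactly such a leaf, so reuse the existing predicates. */
    #define pmd_leaf  pmd_large
    #define pud_leaf  pud_large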
-