    binfmt_elf_fdpic: stop using dump_emit() on user pointers on !MMU · 8f942eea
    Jann Horn authored
    Patch series "Fix ELF / FDPIC ELF core dumping, and use mmap_lock properly in there", v5.
    
    At the moment, we have that rather ugly mmget_still_valid() helper to work
    around <https://crbug.com/project-zero/1790>: ELF core dumping doesn't
    take the mmap_sem while traversing the task's VMAs, and if anything (like
    userfaultfd) then remotely messes with the VMA tree, fireworks ensue.  So
    at the moment we use mmget_still_valid() to bail out in any writers that
    might be operating on a remote mm's VMAs.
    
    With this series, I'm trying to get rid of the need for that as cleanly as
    possible.  ("cleanly" meaning "avoid holding the mmap_lock across
    unbounded sleeps".)
    
    Patches 1, 2, 3 and 4 are relatively unrelated cleanups in the core
    dumping code.
    
    Patches 5 and 6 implement the main change: Instead of repeatedly accessing
    the VMA list with sleeps in between, we snapshot it at the start with
    proper locking, and then later we just use our copy of the VMA list.  This
    ensures that the kernel won't crash, that VMA metadata in the coredump is
    consistent even in the presence of concurrent modifications, and that any
    virtual addresses that aren't being concurrently modified have their
    contents show up in the core dump properly.
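
    As a rough illustration of the snapshot idea, here is a minimal
    userspace sketch in C. It is an analogy only, not the kernel code from
    this series: struct region, struct addr_space, struct region_meta,
    snapshot_regions() and snapshot_total_size() are hypothetical names,
    and a pthread mutex stands in for the mmap_lock.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VMA in the live, mutable list. */
struct region {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
	struct region *next;
};

/* Hypothetical stand-in for an mm: a list of regions guarded by a lock. */
struct addr_space {
	pthread_mutex_t lock;
	struct region *head;
	size_t count;
};

/* Plain-value copy of one region; it holds no pointers into the live list. */
struct region_meta {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/*
 * Copy the metadata of every region while holding the lock, then drop the
 * lock. Returns 0 on success and -1 if the allocation fails. The caller
 * owns *out and must free() it.
 */
int snapshot_regions(struct addr_space *as, struct region_meta **out,
		     size_t *out_count)
{
	struct region_meta *snap;
	struct region *r;
	size_t i = 0;

	pthread_mutex_lock(&as->lock);
	/* Allocate at least one slot so an empty list still yields a valid pointer. */
	snap = calloc(as->count ? as->count : 1, sizeof(*snap));
	if (!snap) {
		pthread_mutex_unlock(&as->lock);
		return -1;
	}
	for (r = as->head; r && i < as->count; r = r->next, i++) {
		snap[i].start = r->start;
		snap[i].end = r->end;
		snap[i].flags = r->flags;
	}
	pthread_mutex_unlock(&as->lock);

	*out = snap;
	*out_count = i;
	return 0;
}

/*
 * Sum the sizes of all snapshotted regions. This reads only the private
 * copy, so it needs no lock and may run across arbitrarily long sleeps.
 */
unsigned long snapshot_total_size(const struct region_meta *snap, size_t n)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += snap[i].end - snap[i].start;
	return total;
}
```

    The point of the sketch is that the lock is held only for the bounded
    copy loop. Everything that may sleep afterwards works on the private
    array, so later changes to the live list cannot affect it; the cost is
    the extra memory for the copy.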
    
    The disadvantage of this approach is that we need a bit more memory during
    core dumping for storing metadata about all VMAs.
    
    At the end of the series, patch 7 removes the old workaround for this
    issue (mmget_still_valid()).
    
    I have tested:
    
     - Creating a simple core dump on X86-64 still works.
     - The created coredump on X86-64 opens in GDB and looks plausible.
     - X86-64 core dumps contain the first page for executable mappings at
       offset 0, and don't contain the first page for non-executable file
       mappings or executable mappings at offset !=0.
     - NOMMU 32-bit ARM can still generate plausible-looking core dumps
       through the FDPIC implementation. (I can't test this with GDB because
       GDB is missing some structure definition for nommu ARM, but I've
       poked around in the hexdump and it looked decent.)
    
    This patch (of 7):
    
    dump_emit() is for kernel pointers, and VMAs describe userspace memory.
    Let's be tidy here and avoid accessing userspace pointers under KERNEL_DS,
    even if it probably doesn't matter much on !MMU systems - especially given
    that it looks like we can just use the same get_dump_page() as on MMU if
    we move it out of the CONFIG_MMU block.
    
    One small change we have to make in get_dump_page() is to use
    __get_user_pages_locked() instead of __get_user_pages(), since the latter
    doesn't exist on nommu.  On mmu builds, __get_user_pages_locked() will
    just call __get_user_pages() for us.

    Signed-off-by: Jann Horn <jannh@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: "Eric W . Biederman" <ebiederm@xmission.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Hugh Dickins <hughd@google.com>
    Link: http://lkml.kernel.org/r/20200827114932.3572699-1-jannh@google.com
    Link: http://lkml.kernel.org/r/20200827114932.3572699-2-jannh@google.com
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
binfmt_elf_fdpic.c 47 KB