    secretmem: disable memfd_secret() if arch cannot set direct map · 532b53ce
    Patrick Roy authored
    Return -ENOSYS from memfd_secret() syscall if !can_set_direct_map().  This
    is the case for example on some arm64 configurations, where marking 4k
    PTEs in the direct map not present can only be done if the direct map is
    set up at 4k granularity in the first place (as ARM's break-before-make
    semantics do not easily allow breaking apart large/gigantic pages).
    
    More precisely, on arm64 systems with !can_set_direct_map(),
    set_direct_map_invalid_noflush() is a no-op that nevertheless returns
    success (0) instead of an error.  This means that memfd_secret will seemingly
    "work" (e.g. the syscall succeeds, you can mmap the fd and fault in pages),
    but it does not actually achieve its goal of removing its memory from the
    direct map.
    
    Note that with this patch, memfd_secret() will start erroring on systems
    where can_set_direct_map() returns false (arm64 with
    CONFIG_RODATA_FULL_DEFAULT_ENABLED=n, CONFIG_DEBUG_PAGEALLOC=n and
    CONFIG_KFENCE=n), but that still seems better than the current silent
    failure.  Since CONFIG_RODATA_FULL_DEFAULT_ENABLED defaults to 'y', most
    arm64 systems actually have a working memfd_secret() and are not
    affected.
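For userspace, the practical consequence is that callers should treat ENOSYS from memfd_secret() as "not usable here" and fall back to another mechanism. The following is a minimal sketch of such a probe, not part of the patch. The helper names try_memfd_secret() and describe_memfd_secret_result() are hypothetical, and the fallback syscall number 447 is an assumption that holds for x86-64 and arm64 when the libc headers do not define SYS_memfd_secret.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_memfd_secret
/* Assumption: 447 is the syscall number on x86-64 and arm64. */
#define SYS_memfd_secret 447
#endif

/*
 * Hypothetical helper: invoke memfd_secret() with no flags.
 * Returns a file descriptor (>= 0) on success, or -errno on failure.
 */
static int try_memfd_secret(void)
{
	long fd = syscall(SYS_memfd_secret, 0u);

	if (fd < 0)
		return -errno;

	assert(fd <= INT_MAX);
	return (int)fd;
}

/*
 * Hypothetical helper: map the result of try_memfd_secret() to a
 * human-readable description.
 */
static const char *describe_memfd_secret_result(int ret)
{
	if (ret >= 0)
		return "available";

	/*
	 * ENOSYS covers several cases that userspace cannot tell apart:
	 * a kernel built without secretmem support, secretmem disabled
	 * at boot, or (with this patch) a direct map that cannot be
	 * modified, i.e. !can_set_direct_map().
	 */
	if (ret == -ENOSYS)
		return "unavailable (ENOSYS)";

	return strerror(-ret);
}
```

A caller would typically run try_memfd_secret() once at startup, print or log describe_memfd_secret_result() for the returned value, and close() the descriptor if the probe succeeded and the fd is not needed.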
    
    From going through the iterations of the original memfd_secret patch
    series, it seems that disabling the syscall in these scenarios was the
    intended behavior [1] (preferred over having
    set_direct_map_invalid_noflush return an error as that would result in
    SIGBUSes at page-fault time), however the check for it got dropped between
    v16 [2] and v17 [3], when secretmem moved away from CMA allocations.
    
    [1]: https://lore.kernel.org/lkml/20201124164930.GK8537@kernel.org/
    [2]: https://lore.kernel.org/lkml/20210121122723.3446-11-rppt@kernel.org/#t
    [3]: https://lore.kernel.org/lkml/20201125092208.12544-10-rppt@kernel.org/
    
    Link: https://lkml.kernel.org/r/20241001080056.784735-1-roypat@amazon.co.uk
    Fixes: 1507f512 ("mm: introduce memfd_secret system call to create "secret" memory areas")
    Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
    Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
    Cc: Alexander Graf <graf@amazon.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: James Gowans <jgowans@amazon.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>