    mm/x86: change pXd_huge() behavior to exclude swap entries · d0973cb9
    Peter Xu authored
    This patch partly reverts the commits below:
    
    3a194f3f ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry")
    cbef8478 ("mm/hugetlb: pmd_huge() returns true for non-present hugepage")
    
    Right now, the definition of pXd_huge() across the kernel is unclear.  We
    have two groups that treat swap entries differently:
    
      - x86/sparc:     Allow pXd_huge() to accept swap entries
      - all the rest:  Don't allow pXd_huge() to accept swap entries
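
As a rough illustration of the two groups, here is a minimal user-space
sketch, not kernel code.  It assumes a toy entry in which only the x86
PRESENT (bit 0) and PSE (bit 7) flags matter, and it models a swap entry
as a non-zero value with both flags clear.  All model_* names are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: only two flag bits of an x86-style entry are considered. */
#define MODEL_PAGE_PRESENT (UINT64_C(1) << 0)
#define MODEL_PAGE_PSE     (UINT64_C(1) << 7)

typedef uint64_t model_pxd_t;	/* hypothetical stand-in for pmd_t/pud_t */

static bool model_none(model_pxd_t e)
{
	return e == 0;
}

/*
 * "x86/sparc" group: every non-none entry counts as huge unless it is a
 * present, non-PSE pointer to a lower-level table.  Swap entries
 * (PRESENT == 0, PSE == 0) are therefore accepted.
 */
static bool model_huge_incl_swap(model_pxd_t e)
{
	return !model_none(e) &&
	       (e & (MODEL_PAGE_PRESENT | MODEL_PAGE_PSE)) != MODEL_PAGE_PRESENT;
}

/*
 * "all the rest" group, simplified to a plain PSE test: swap entries
 * are rejected because they never carry the PSE bit in this model.
 */
static bool model_huge_excl_swap(model_pxd_t e)
{
	return (e & MODEL_PAGE_PSE) != 0;
}

/* The two predicates disagree only on the swap-like entry. */
static void model_check_groups(void)
{
	model_pxd_t swap  = UINT64_C(0x1000);
	model_pxd_t table = UINT64_C(0x2000) | MODEL_PAGE_PRESENT;
	model_pxd_t huge  = UINT64_C(0x200000) | MODEL_PAGE_PRESENT |
			    MODEL_PAGE_PSE;

	assert(model_huge_incl_swap(swap) && !model_huge_excl_swap(swap));
	assert(!model_huge_incl_swap(table) && !model_huge_excl_swap(table));
	assert(model_huge_incl_swap(huge) && model_huge_excl_swap(huge));
}
```

In this model the only input on which the two predicates differ is the
swap-like entry, which is the ambiguity discussed here.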
    
    This is confusing.  The sparc helpers seem to have been added in 2016,
    after x86's (2015), so sparc may simply have followed the trend.  x86
    introduced this swap handling in 2015 to resolve hugetlb swap entries hit
    in GUP, but GUP now guards swap entries with !pXd_present() in all layers,
    so we should be safe.
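
The effect of such a guard can be sketched in user space as follows.
This is only a simplified model under stated assumptions, not the GUP
code: the presence test looks at PRESENT and PSE only, the walker and
all model_* names are hypothetical, and the two predicates stand for the
swap-including and swap-excluding semantics respectively.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_PRESENT (UINT64_C(1) << 0)
#define MODEL_PAGE_PSE     (UINT64_C(1) << 7)

typedef uint64_t model_pxd_t;	/* hypothetical stand-in for pmd_t/pud_t */
typedef bool (*model_huge_fn)(model_pxd_t);

enum model_kind { MODEL_NONE, MODEL_NOT_PRESENT, MODEL_HUGE, MODEL_TABLE };

/* Simplified presence test: either flag set means "present" here. */
static bool model_present(model_pxd_t e)
{
	return (e & (MODEL_PAGE_PRESENT | MODEL_PAGE_PSE)) != 0;
}

/* Swap-including semantics (accepts PRESENT == 0, PSE == 0 entries). */
static bool model_huge_incl_swap(model_pxd_t e)
{
	return e != 0 &&
	       (e & (MODEL_PAGE_PRESENT | MODEL_PAGE_PSE)) != MODEL_PAGE_PRESENT;
}

/* Swap-excluding semantics, modeled as a plain PSE test. */
static bool model_huge_excl_swap(model_pxd_t e)
{
	return (e & MODEL_PAGE_PSE) != 0;
}

/*
 * Hypothetical walker step: the not-present guard runs before the huge
 * predicate, so a swap-like entry never reaches huge().
 */
static enum model_kind model_classify(model_pxd_t e, model_huge_fn huge)
{
	if (e == 0)
		return MODEL_NONE;
	if (!model_present(e))
		return MODEL_NOT_PRESENT;	/* swap/migration path */
	if (huge(e))
		return MODEL_HUGE;
	return MODEL_TABLE;
}
```

With the guard in front, both predicates classify every entry of this
model identically, which is why dropping swap entries from the predicate
is not expected to change what a guarded caller sees.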
    
    We should define this API properly, one way or another, rather than keep
    them defined differently across archs.
    
    Gut feeling tells me that pXd_huge() shouldn't include swap entries, and
    it turns out that I am not the only one thinking so.  Ville Syrjälä raised
    the question when the current pmd_huge() for x86 was proposed:
    
    https://lore.kernel.org/all/Y2WQ7I4LXh8iUIRd@intel.com/
    
      I might also be missing something obvious, but why is it even necessary
      to treat PRESENT==0+PSE==0 as a huge entry?
    
    It was also questioned when Jason Gunthorpe reviewed the other patchset
    on swap entry handling:
    
    https://lore.kernel.org/all/20240221125753.GQ13330@nvidia.com/
    
    Revert its meaning back to the original.  This shouldn't cause any
    functional change, as explicit !pXd_present() guards should already be in
    place everywhere.
    
    Note that I also dropped the "#if CONFIG_PGTABLE_LEVELS > 2".  It was
    probably there because things broke when 3a194f3f was proposed, according
    to the report here:
    
    https://lore.kernel.org/all/Y2LYXItKQyaJTv8j@intel.com/
    
    Now we shouldn't need that.
    
    Instead of reverting to the raw _PAGE_PSE check, leverage pXd_leaf().
    
    Link: https://lkml.kernel.org/r/20240318200404.448346-5-peterx@redhat.com
    
    Signed-off-by: Peter Xu <peterx@redhat.com>
    Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Alistair Popple <apopple@nvidia.com>
    Cc: Andreas Larsson <andreas@gaisler.com>
    Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Bjorn Andersson <andersson@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Fabio Estevam <festevam@denx.de>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: Konrad Dybcio <konrad.dybcio@linaro.org>
    Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Cc: Lucas Stach <l.stach@pengutronix.de>
    Cc: Mark Salter <msalter@redhat.com>
    Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Mike Rapoport (IBM) <rppt@kernel.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Russell King <linux@armlinux.org.uk>
    Cc: Shawn Guo <shawnguo@kernel.org>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>