1. 02 Apr, 2022 10 commits
    • Merge tag 'for-5.18/drivers-2022-04-02' of git://git.kernel.dk/linux-block · 6f34f8c3
      Linus Torvalds authored
      Pull block driver fix from Jens Axboe:
       "Got two reports on nbd spewing warnings on load now, which is a
        regression from a commit that went into your tree yesterday.
      
        Revert the problematic change for now"
      
      * tag 'for-5.18/drivers-2022-04-02' of git://git.kernel.dk/linux-block:
        Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()"
      6f34f8c3
    • Merge tag 'pci-v5.18-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 9a212aaf
      Linus Torvalds authored
      Pull pci fix from Bjorn Helgaas:
      
       - Fix Hyper-V "defined but not used" build issue added during merge
         window (YueHaibing)
      
      * tag 'pci-v5.18-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: hv: Remove unused hv_set_msi_entry_from_desc()
      9a212aaf
    • Merge tag 'tag-chrome-platform-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux · 02d4f8a3
      Linus Torvalds authored
      Pull chrome platform updates from Benson Leung:
       "cros_ec_typec:
      
         - Check for EC device - Fix a crash when using the cros_ec_typec
           driver on older hardware not capable of typec commands
      
         - Make try power role optional
      
         - Mux configuration reorganization series from Prashant
      
        cros_ec_debugfs:
      
         - Fix use after free. Thanks Tzung-bi
      
        sensorhub:
      
         - cros_ec_sensorhub fixup - Split trace include file
      
        misc:
      
         - Add new mailing list for chrome-platform development:
      
      	chrome-platform@lists.linux.dev
      
           Now with patchwork!"
      
      * tag 'tag-chrome-platform-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
        platform/chrome: cros_ec_debugfs: detach log reader wq from devm
        platform: chrome: Split trace include file
        platform/chrome: cros_ec_typec: Update mux flags during partner removal
        platform/chrome: cros_ec_typec: Configure muxes at start of port update
        platform/chrome: cros_ec_typec: Get mux state inside configure_mux
        platform/chrome: cros_ec_typec: Move mux flag checks
        platform/chrome: cros_ec_typec: Check for EC device
        platform/chrome: cros_ec_typec: Make try power role optional
        MAINTAINERS: platform-chrome: Add new chrome-platform@lists.linux.dev list
      02d4f8a3
    • Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()" · 7198bfc2
      Jens Axboe authored
      This reverts commit 6d35d04a.
      
      Both Gabriel and Borislav report that this commit causes a regression
      with nbd:
      
      sysfs: cannot create duplicate filename '/dev/block/43:0'
      
      Revert it before 5.18-rc1 and we'll investigate this separately in
      due time.
      
      Link: https://lore.kernel.org/all/YkiJTnFOt9bTv6A2@zn.tnic/
      Reported-by: Gabriel L. Somlo <somlo@cmu.edu>
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      7198bfc2
    • watch_queue: Free the page array when watch_queue is dismantled · b4902070
      Eric Dumazet authored
      Commit 7ea1a012 ("watch_queue: Free the alloc bitmap when the
      watch_queue is torn down") took care of the bitmap, but not the page
      array.
      
        BUG: memory leak
        unreferenced object 0xffff88810d9bc140 (size 32):
        comm "syz-executor335", pid 3603, jiffies 4294946994 (age 12.840s)
        hex dump (first 32 bytes):
          40 a7 40 04 00 ea ff ff 00 00 00 00 00 00 00 00  @.@.............
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
           kmalloc_array include/linux/slab.h:621 [inline]
           kcalloc include/linux/slab.h:652 [inline]
           watch_queue_set_size+0x12f/0x2e0 kernel/watch_queue.c:251
           pipe_ioctl+0x82/0x140 fs/pipe.c:632
           vfs_ioctl fs/ioctl.c:51 [inline]
           __do_sys_ioctl fs/ioctl.c:874 [inline]
           __se_sys_ioctl fs/ioctl.c:860 [inline]
           __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:860
           do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      
      Reported-by: syzbot+25ea042ae28f3888727a@syzkaller.appspotmail.com
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Link: https://lore.kernel.org/r/20220322004654.618274-1-eric.dumazet@gmail.com/
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b4902070
    • tracing: mark user_events as BROKEN · 1cd927ad
      Steven Rostedt (Google) authored
      After being merged, user_events became more visible to a wider audience
      that has concerns with the current API.
      
      It is too late to fix this for this release, but instead of a full
      revert, just mark it as BROKEN (which prevents it from being selected in
      make config).  Then we can work on finding a better API.  If that fails,
      then it will need to be completely reverted.
      
      To not have the code silently bitrot, still allow building it with
      COMPILE_TEST.
      
      And to prevent the uapi header from being installed, then later changed,
      and then have an old distro user space see the old version, move the
      header file out of the uapi directory.
      
      Guard the include of the header's current location with
      CONFIG_COMPILE_TEST.  When the BROKEN tag is taken off, the include
      will use the uapi directory and fail to compile.  This is a good way
      to remind us to move the header back.
      
      Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home
      Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com
      Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1cd927ad
    • Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 88e6c020
      Linus Torvalds authored
      Pull vfs updates from Al Viro:
       "Assorted bits and pieces"
      
      * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        aio: drop needless assignment in aio_read()
        clean overflow checks in count_mounts() a bit
        seq_file: fix NULL pointer arithmetic warning
        uml/x86: use x86 load_unaligned_zeropad()
        asm/user.h: killed unused macros
        constify struct path argument of finish_automount()/do_add_mount()
        fs: Remove FIXME comment in generic_write_checks()
      88e6c020
    • Merge tag 'vfs-5.18-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · a4251ab9
      Linus Torvalds authored
      Pull vfs fix from Darrick Wong:
       "The erofs developers felt that FIEMAP should handle ranged requests
        starting at s_maxbytes by returning EFBIG instead of passing the
        filesystem implementation a nonsense 0-byte request.
      
        Not sure why they keep tagging this 'iomap', but the VFS shouldn't be
        asking for information about ranges of a file that the filesystem
        already declared that it does not support.
      
         - Fix a potential infinite loop in FIEMAP by fixing an off by one
           error when comparing the requested range against s_maxbytes"
      
      * tag 'vfs-5.18-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        fs: fix an infinite loop in iomap_fiemap
      a4251ab9
    • Merge tag 'xfs-5.18-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · b32e3819
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "This fixes multiple problems in the reserve pool sizing functions: an
        incorrect free space calculation, a pointless infinite loop, and even
        more braindamage that could result in the pool being overfilled. The
        pile of patches from Dave fixes myriad races and UAF bugs in the log
        recovery code that much to our mutual surprise nobody's tripped over.
        Dave also fixed a performance optimization that had turned into a
        regression.
      
        Dave Chinner is taking over as XFS maintainer starting Sunday and
        lasting until 5.19-rc1 is tagged so that I can focus on starting a
        massive design review for the (feature complete after five years)
        online repair feature. From then on, he and I will be moving XFS to a
        co-maintainership model by trading duties every other release.
      
        NOTE: I hope very strongly that the other pieces of the (X)FS
        ecosystem (fstests and xfsprogs) will make similar changes to spread
        their maintenance load.
      
        Summary:
      
         - Fix an incorrect free space calculation in xfs_reserve_blocks that
           could lead to a request for free blocks that will never succeed.
      
         - Fix a hang in xfs_reserve_blocks caused by an infinite loop and the
           incorrect free space calculation.
      
         - Fix yet a third problem in xfs_reserve_blocks where multiple racing
           threads can overfill the reserve pool.
      
         - Fix an accounting error that led to us reporting reserved space as
           "available".
      
         - Fix a race condition during abnormal fs shutdown that could cause
           UAF problems when memory reclaim and log shutdown try to clean up
           inodes.
      
         - Fix a bug where log shutdown can race with unmount to tear down the
           log, thereby causing UAF errors.
      
         - Disentangle log and filesystem shutdown to reduce confusion.
      
         - Fix some confusion in xfs_trans_commit such that a race between
           transaction commit and filesystem shutdown can cause unlogged dirty
           inode metadata to be committed, thereby corrupting the filesystem.
      
         - Remove a performance optimization in the log as it was discovered
           that certain storage hardware handles async log flushes so poorly as
           to cause serious performance regressions. Recent restructuring of
           other parts of the logging code means that no performance benefit is
           seen on hardware that handles it well"
      
      * tag 'xfs-5.18-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: drop async cache flushes from CIL commits.
        xfs: shutdown during log recovery needs to mark the log shutdown
        xfs: xfs_trans_commit() path must check for log shutdown
        xfs: xfs_do_force_shutdown needs to block racing shutdowns
        xfs: log shutdown triggers should only shut down the log
        xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks
        xfs: shutdown in intent recovery has non-intent items in the AIL
        xfs: aborting inodes on shutdown may need buffer lock
        xfs: don't report reserved bnobt space as available
        xfs: fix overfilling of reserve pool
        xfs: always succeed at setting the reserve pool size
        xfs: remove infinite loop when reserving free block pool
        xfs: don't include bnobt blocks when reserving free block pool
        xfs: document the XFS_ALLOC_AGFL_RESERVE constant
      b32e3819
    • Merge tag 'riscv-for-linus-5.18-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 1fdff407
      Linus Torvalds authored
      Pull RISC-V fix from Palmer Dabbelt:
      
       - Fix the RISC-V section of the generic CPU idle bindings to comply
         with the recently tightened DT schema.
      
      * tag 'riscv-for-linus-5.18-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        dt-bindings: Fix phandle-array issues in the idle-states bindings
      1fdff407
  2. 01 Apr, 2022 30 commits
    • Merge tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-block · 8467b0ed
      Linus Torvalds authored
      Pull block driver fixes from Jens Axboe:
       "Followup block driver updates and fixes for the 5.18-rc1 merge window.
        In detail:
      
         - NVMe pull request
             - Fix multipath hang when disk goes live over reconnect (Anton
               Eidelman)
             - fix RCU hole that allowed for endless looping in multipath
               round robin (Chris Leech)
             - remove redundant assignment after left shift (Colin Ian King)
             - add quirks for Samsung X5 SSDs (Monish Kumar R)
             - fix the read-only state for zoned namespaces with unsupported
               features (Pankaj Raghav)
             - use a private workqueue instead of the system workqueue in
               nvmet (Sagi Grimberg)
             - allow duplicate NSIDs for private namespaces (Sungup Moon)
             - expose use_threaded_interrupts read-only in sysfs (Xin Hao)
      
         - nbd minor allocation fix (Zhang)
      
         - drbd fixes and maintainer addition (Lars, Jakob, Christoph)
      
         - n64cart build fix (Jackie)
      
         - loop compat ioctl fix (Carlos)
      
         - misc fixes (Colin, Dongli)"
      
      * tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-block:
        drbd: remove check of list iterator against head past the loop body
        drbd: remove usage of list iterator variable after loop
        nbd: fix possible overflow on 'first_minor' in nbd_dev_add()
        MAINTAINERS: add drbd co-maintainer
        drbd: fix potential silent data corruption
        loop: fix ioctl calls using compat_loop_info
        nvme-multipath: fix hang when disk goes live over reconnect
        nvme: fix RCU hole that allowed for endless looping in multipath round robin
        nvme: allow duplicate NSIDs for private namespaces
        nvmet: remove redundant assignment after left shift
        nvmet: use a private workqueue instead of the system workqueue
        nvme-pci: add quirks for Samsung X5 SSDs
        nvme-pci: expose use_threaded_interrupts read-only in sysfs
        nvme: fix the read-only state for zoned namespaces with unsupposed features
        n64cart: convert bi_disk to bi_bdev->bd_disk fix build
        xen/blkfront: fix comment for need_copy
        xen-blkback: remove redundant assignment to variable i
      8467b0ed
    • Merge tag 'for-5.18/block-2022-04-01' of git://git.kernel.dk/linux-block · d589ae0d
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Either fixes or a few additions that got missed in the initial merge
        window pull. In detail:
      
         - List iterator fix to avoid leaking value post loop (Jakob)
      
         - Off-by-one fix in minor count (Christophe)
      
         - Fix for a regression in how io priority setting works for an
           exiting task (Jiri)
      
         - Fix a regression in this merge window with blkg_free() being called
           in an inappropriate context (Ming)
      
         - Misc fixes (Ming, Tom)"
      
      * tag 'for-5.18/block-2022-04-01' of git://git.kernel.dk/linux-block:
        blk-wbt: remove wbt_track stub
        block: use dedicated list iterator variable
        block: Fix the maximum minor value is blk_alloc_ext_minor()
        block: restore the old set_task_ioprio() behaviour wrt PF_EXITING
        block: avoid calling blkg_free() in atomic context
        lib/sbitmap: allocate sb->map via kvzalloc_node
      d589ae0d
    • Merge tag 'for-5.18/io_uring-2022-04-01' of git://git.kernel.dk/linux-block · 3b1509f2
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "A little bit all over the map, some regression fixes for this merge
        window, and some general fixes that are stable bound. In detail:
      
         - Fix an SQPOLL memory ordering issue (Almog)
      
         - Accept fixes (Dylan)
      
         - Poll fixes (me)
      
         - Fixes for provided buffers and recycling (me)
      
         - Tweak to IORING_OP_MSG_RING command added in this merge window (me)
      
         - Memory leak fix (Pavel)
      
         - Misc fixes and tweaks (Pavel, me)"
      
      * tag 'for-5.18/io_uring-2022-04-01' of git://git.kernel.dk/linux-block:
        io_uring: defer msg-ring file validity check until command issue
        io_uring: fail links if msg-ring doesn't succeeed
        io_uring: fix memory leak of uid in files registration
        io_uring: fix put_kbuf without proper locking
        io_uring: fix invalid flags for io_put_kbuf()
        io_uring: improve req fields comments
        io_uring: enable EPOLLEXCLUSIVE for accept poll
        io_uring: improve task work cache utilization
        io_uring: fix async accept on O_NONBLOCK sockets
        io_uring: remove IORING_CQE_F_MSG
        io_uring: add flag for disabling provided buffer recycling
        io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly
        io_uring: don't recycle provided buffer if punted to async worker
        io_uring: fix assuming triggered poll waitqueue is the single poll
        io_uring: bump poll refs to full 31-bits
        io_uring: remove poll entry from list when canceling all
        io_uring: fix memory ordering when SQPOLL thread goes to sleep
        io_uring: ensure that fsnotify is always called
        io_uring: recycle provided before arming poll
      3b1509f2
    • Merge tag 'for-5.18/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · fe35fdb3
      Linus Torvalds authored
      Pull device mapper fixes from Mike Snitzer:
      
       - Fix DM integrity shrink crash due to journal entry not being marked
         unused.
      
       - Fix DM bio polling to handle possibility that underlying device(s)
         return BLK_STS_AGAIN during submission.
      
       - Fix dm_io and dm_target_io flags race condition on Alpha.
      
       - Add some pr_err debugging to help debug cases when DM ioctl structure
         is corrupted.
      
      * tag 'for-5.18/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm: fix bio polling to handle possibile BLK_STS_AGAIN
        dm: fix dm_io and dm_target_io flags race condition on Alpha
        dm integrity: set journal entry unused when shrinking device
        dm ioctl: log an error if the ioctl structure is corrupted
      fe35fdb3
    • dt-bindings: Fix phandle-array issues in the idle-states bindings · 2524257b
      Palmer Dabbelt authored
      As per 39bd2b6a ("dt-bindings: Improve phandle-array schemas"), the
      phandle-array bindings have been disambiguated.  This fixes the new
      RISC-V idle-states bindings to comply with the schema.
      
      Fixes: 1bd524f7 ("dt-bindings: Add common bindings for ARM and RISC-V idle states")
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
      2524257b
    • Merge tag '5.18-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd · 7a3ecddc
      Linus Torvalds authored
      Pull ksmbd updates from Steve French:
      
       - three cleanup fixes
      
       - shorten module load warning
      
       - two documentation fixes
      
      * tag '5.18-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
        ksmbd: replace usage of found with dedicated list iterator variable
        ksmbd: Remove a redundant zeroing of memory
        MAINTAINERS: ksmbd: switch Sergey to reviewer
        ksmbd: shorten experimental warning on loading the module
        ksmbd: use netif_is_bridge_port
        Documentation: ksmbd: update Feature Status table
      7a3ecddc
    • Merge tag '5.18-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · 9a005bea
      Linus Torvalds authored
      Pull more cifs updates from Steve French:
      
       - three fixes for big endian issues in how Persistent and Volatile file
         ids were stored
      
       - Various misc. fixes: including some for oops, 2 for ioctls, 1 for
         writeback
      
       - cleanup of how tcon (tree connection) status is tracked
      
       - Four changesets to move various duplicated protocol definitions
         (defined both in cifs.ko and ksmbd) into smbfs_common/smb2pdu.h
      
       - important performance improvement to use cached handles in some key
         compounding code paths (reduces numbers of opens/closes sent in some
         workloads)
      
       - fix to allow alternate DFS target to be used to retry on a failed i/o
      
      * tag '5.18-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: fix NULL ptr dereference in smb2_ioctl_query_info()
        cifs: prevent bad output lengths in smb2_ioctl_query_info()
        smb3: fix ksmbd bigendian bug in oplock break, and move its struct to smbfs_common
        smb3: cleanup and clarify status of tree connections
        smb3: move defines for query info and query fsinfo to smbfs_common
        smb3: move defines for ioctl protocol header and SMB2 sizes to smbfs_common
        [smb3] move more common protocol header definitions to smbfs_common
        cifs: fix incorrect use of list iterator after the loop
        ksmbd: store fids as opaque u64 integers
        cifs: fix bad fids sent over wire
        cifs: change smb2_query_info_compound to use a cached fid, if available
        cifs: convert the path to utf16 in smb2_query_info_compound
        cifs: writeback fix
        cifs: do not skip link targets when an I/O fails
      9a005bea
    • Merge tag 'exfat-for-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat · ec251f3e
      Linus Torvalds authored
      Pull exfat updates from Namjae Jeon:
      
       - Add keep_last_dots mount option to allow access to paths with
         trailing dots
      
       - Avoid repetitive volume dirty bit set/clear to improve storage
         lifetime
      
      * tag 'exfat-for-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
        exfat: do not clear VolumeDirty in writeback
        exfat: allow access to paths with trailing dots
      ec251f3e
    • Merge tag 'folio-5.18d' of git://git.infradead.org/users/willy/pagecache · cda43512
      Linus Torvalds authored
      Pull more filesystem folio updates from Matthew Wilcox:
       "A mixture of odd changes that didn't quite make it into the original
        pull and fixes for things that did. Also the readpages changes had to
        wait for the NFS tree to be pulled first.
      
         - Remove ->readpages infrastructure
      
         - Remove AOP_FLAG_CONT_EXPAND
      
         - Move read_descriptor_t to networking code
      
         - Pass the iocb to generic_perform_write
      
         - Minor updates to iomap, btrfs, ext4, f2fs, ntfs"
      
      * tag 'folio-5.18d' of git://git.infradead.org/users/willy/pagecache:
        btrfs: Remove a use of PAGE_SIZE in btrfs_invalidate_folio()
        ntfs: Correct mark_ntfs_record_dirty() folio conversion
        f2fs: Get the superblock from the mapping instead of the page
        f2fs: Correct f2fs_dirty_data_folio() conversion
        ext4: Correct ext4_journalled_dirty_folio() conversion
        filemap: Remove AOP_FLAG_CONT_EXPAND
        fs: Pass an iocb to generic_perform_write()
        fs, net: Move read_descriptor_t to net.h
        fs: Remove read_actor_t
        iomap: Simplify is_partially_uptodate a little
        readahead: Update comments
        mm: remove the skip_page argument to read_pages
        mm: remove the pages argument to read_pages
        fs: Remove ->readpages address space operation
        readahead: Remove read_cache_pages()
      cda43512
    • Merge tag 'xarray-5.18' of git://git.infradead.org/users/willy/xarray · 5a3fe95d
      Linus Torvalds authored
      Pull XArray updates from Matthew Wilcox:
      
       - Documentation update
      
       - Fix test-suite build after move of bitmap.h
      
       - Fix xas_create_range() when a large entry is already present
      
       - Fix xas_split() of a shadow entry
      
      * tag 'xarray-5.18' of git://git.infradead.org/users/willy/xarray:
        XArray: Update the LRU list in xas_split()
        XArray: Fix xas_create_range() when multi-order entry present
        XArray: Include bitmap.h from xarray.h
        XArray: Document the locking requirement for the xa_state
      5a3fe95d
    • Merge tag 'riscv-for-linus-5.18-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · a3dfc532
      Linus Torvalds authored
      Pull more RISC-V updates from Palmer Dabbelt:
       "This has a handful of new features:
      
         - Support for CURRENT_STACK_POINTER, which enables some extra stack
           debugging for HARDENED_USERCOPY.
      
         - Support for the new SBI CPU idle extension, via cpuidle and suspend
           drivers.
      
         - Profiling has been enabled in the defconfigs.
      
        but is mostly fixes and cleanups"
      
      * tag 'riscv-for-linus-5.18-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits)
        RISC-V: K210 defconfigs: Drop redundant MEMBARRIER=n
        RISC-V: defconfig: Drop redundant SBI HVC and earlycon
        Documentation: riscv: remove non-existent directory from table of contents
        riscv: cpu.c: don't use kernel-doc markers for comments
        RISC-V: Enable profiling by default
        RISC-V: module: fix apply_r_riscv_rcv_branch_rela typo
        RISC-V: Declare per cpu boot data as static
        RISC-V: Fix a comment typo in riscv_of_parent_hartid()
        riscv: Increase stack size under KASAN
        riscv: Fix fill_callchain return value
        riscv: dts: canaan: Fix SPI3 bus width
        riscv: Rename "sp_in_global" to "current_stack_pointer"
        riscv module: remove (NOLOAD)
        RISC-V: Enable RISC-V SBI CPU Idle driver for QEMU virt machine
        dt-bindings: Add common bindings for ARM and RISC-V idle states
        cpuidle: Add RISC-V SBI CPU idle driver
        cpuidle: Factor-out power domain related code from PSCI domain driver
        RISC-V: Add SBI HSM suspend related defines
        RISC-V: Add arch functions for non-retentive suspend entry/exit
        RISC-V: Rename relocate() and make it global
        ...
      a3dfc532
    • Merge tag 's390-5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 9ae24d5a
      Linus Torvalds authored
      Pull more s390 updates from Vasily Gorbik:
      
       - Add kretprobes framepointer verification and return address recovery
         in stacktrace.
      
       - Support control domain masks on custom zcrypt devices and filter
         admin requests.
      
       - Cleanup timer API usage.
      
       - Rework absolute lowcore access helpers.
      
       - Other various small improvements and fixes.
      
      * tag 's390-5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (26 commits)
        s390/alternatives: avoid using jgnop mnemonic
        s390/pci: rename get_zdev_by_bus() to zdev_from_bus()
        s390/pci: improve zpci_dev reference counting
        s390/smp: use physical address for SIGP_SET_PREFIX command
        s390: cleanup timer API use
        s390/zcrypt: fix using the correct variable for sizeof()
        s390/vfio-ap: fix kernel doc and signature of group notifier functions
        s390/maccess: rework absolute lowcore accessors
        s390/smp: cleanup control register update routines
        s390/smp: cleanup target CPU callback starting
        s390/test_unwind: verify __kretprobe_trampoline is replaced
        s390/unwind: avoid duplicated unwinding entries for kretprobes
        s390/unwind: recover kretprobe modified return address in stacktrace
        s390/kprobes: enable kretprobes framepointer verification
        s390/test_unwind: extend kretprobe test
        s390/ap: adjust whitespace
        s390/ap: use insn format for new instructions
        s390/alternatives: use insn format for new instructions
        s390/alternatives: use instructions instead of byte patterns
        s390/traps: improve panic message for translation-specification exception
        ...
      9ae24d5a
    • Merge tag 'soc-fixes-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · ba2d6201
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "The introduction of vmap-stack on 32-bit arm caused a regression on a
        few omap3/omap4 machines that pass a stack variable into a firmware
        interface.
      
        The early pre-ACPI AMD Seattle machines have been broken for a while;
        Ard Biesheuvel has a series to bring them back for now.
      
        A few machines with multiple DMA channels used on a device have the
        channels in the wrong order according to the binding, which causes a
        harmless warning. Reversing the order is easier than fixing the tools
        to suppress the warning"
      
      * tag 'soc-fixes-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
        arm64: dts: ls1046a: Update i2c node dma properties
        arm64: dts: ls1043a: Update i2c dma properties
        ARM: dts: spear1340: Update serial node properties
        ARM: dts: spear13xx: Update SPI dma properties
        ARM: OMAP2+: Fix regression for smc calls for vmap stack
        dt: amd-seattle: add a description of the CPUs and caches
        dt: amd-seattle: disable IPMI controller and some GPIO blocks on B0
        dt: amd-seattle: add description of the SATA/CCP SMMUs
        dt: amd-seattle: add a description of the PCIe SMMU
        dt: amd-seattle: fix PCIe legacy interrupt routing
        dt: amd-seattle: upgrade AMD Seattle XGBE to new SMMU binding
        dt: amd-seattle: remove Overdrive revision A0 support
        dt: amd-seattle: remove Husky platform
      ba2d6201
    • Merge branch 'akpm' (patches from Andrew) · b012b323
      Linus Torvalds authored
      Merge still more updates from Andrew Morton:
       "16 patches.
      
        Subsystems affected by this patch series: ocfs2, nilfs2, mailmap, and
        mm (madvise, mlock, kfence, memory-failure, kasan, debug, kmemleak,
        and damon)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm/damon: prevent activated scheme from sleeping by deactivated schemes
        mm/kmemleak: reset tag when compare object pointer
        doc/vm/page_owner.rst: remove content related to -c option
        tools/vm/page_owner_sort.c: remove -c option
        mm, kasan: fix __GFP_BITS_SHIFT definition breaking LOCKDEP
        mm,hwpoison: unmap poisoned page before invalidation
        mailmap: update Kirill's email
        mm: kfence: fix objcgs vector allocation
        mm/munlock: protect the per-CPU pagevec by a local_lock_t
        mm/munlock: update Documentation/vm/unevictable-lru.rst
        mm/munlock: add lru_add_drain() to fix memcg_stat_test
        nilfs2: get rid of nilfs_mapping_init()
        nilfs2: fix lockdep warnings during disk space reclamation
        nilfs2: fix lockdep warnings in page operations for btree nodes
        ocfs2: fix crash when mount with quota enabled
        Revert "mm: madvise: skip unmapped vma holes passed to process_madvise"
      b012b323
    • Jonghyeon Kim's avatar
      mm/damon: prevent activated scheme from sleeping by deactivated schemes · 78049e94
      Jonghyeon Kim authored
      In DAMON, the minimum wait time of the schemes decides whether the
      kernel wakes up 'kdamond_fn()'.  But since the minimum wait time is
      initialized to zero, there are corner cases that defeat the original
      objective.
      
      For example, if we have several schemes for one target, and the wait
      time of the first scheme is zero, the minimum wait time will be set to
      zero, which means 'kdamond_fn()' should wake up to apply this scheme.
      However, the wait time of a following scheme can be non-zero.  The
      minimum wait time is then set to that non-zero value, which can cause
      'kdamond_fn()' to sleep for this interval because of one deactivated
      last scheme.
      
      This commit prevents DAMON monitoring from going inactive because of
      other deactivated schemes.
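      
      The corner case can be reproduced in a few lines of userspace C.  This
      is only a sketch: 'struct scheme', 'min_wait_zero_init()' and
      'min_wait_early_return()' are hypothetical names, and the early return
      is one plausible remedy, not necessarily the change that was merged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheme: only its wait time matters here.
 * A wait time of 0 means the scheme is active and should run now. */
struct scheme {
	unsigned long wait_time;
};

/* Corner case from the message: the running minimum starts at zero, and
 * zero also means "not set yet", so a later non-zero value replaces it. */
unsigned long min_wait_zero_init(const struct scheme *s, size_t n)
{
	unsigned long min_wait = 0;
	size_t i;

	assert(s != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!min_wait || s[i].wait_time < min_wait)
			min_wait = s[i].wait_time;
	}
	return min_wait;
}

/* One possible remedy (an assumption): stop as soon as any scheme is
 * active, so a later deactivated scheme cannot force a sleep. */
unsigned long min_wait_early_return(const struct scheme *s, size_t n)
{
	unsigned long min_wait = 0;
	size_t i;

	assert(s != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!s[i].wait_time)
			return 0;
		if (!min_wait || s[i].wait_time < min_wait)
			min_wait = s[i].wait_time;
	}
	return min_wait;
}
```

      With wait times {0, 5}, the zero-initialized variant reports 5 (sleep),
      while the early-return variant reports 0 (wake up now).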
      
      Link: https://lkml.kernel.org/r/20220330105302.32114-1-tome01@ajou.ac.kr
      Signed-off-by: Jonghyeon Kim <tome01@ajou.ac.kr>
      Reviewed-by: SeongJae Park <sj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      78049e94
    • Kuan-Ying Lee's avatar
      mm/kmemleak: reset tag when compare object pointer · bfc8089f
      Kuan-Ying Lee authored
      When HW-tag based KASAN is used with vmalloc support enabled, we hit
      the following bug.  It is caused by a comparison between a tagged
      object and a non-tagged pointer.
      
      We need to reset the KASAN tag whenever we compare a tagged object with
      a non-tagged pointer.
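      
      A rough userspace illustration of why the comparison fails, using the
      two addresses from the log below.  It assumes the tag occupies the top
      byte of a 64-bit pointer and that resetting a tag means setting that
      byte to 0xff; the helper names and the 64-byte area size are made up
      for the example and are not kmemleak's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the tag lives in bits 63..56 of the address. */
#define TAG_SHIFT 56
#define TAG_MASK  ((uint64_t)0xff << TAG_SHIFT)

/* Hypothetical helper: force the tag byte to 0xff ("no tag"). */
static uint64_t reset_tag(uint64_t addr)
{
	return addr | TAG_MASK;
}

/* Range check on raw values: a tagged object and an untagged area
 * pointer differ in the top byte, so the bounds check goes wrong. */
bool area_fits_raw(uint64_t obj, uint64_t obj_size,
		   uint64_t area, uint64_t area_size)
{
	return area >= obj && area + area_size <= obj + obj_size;
}

/* Same check after resetting the tag on both sides. */
bool area_fits_untagged(uint64_t obj, uint64_t obj_size,
			uint64_t area, uint64_t area_size)
{
	uint64_t o = reset_tag(obj);
	uint64_t a = reset_tag(area);

	assert(obj_size > 0);
	return a >= o && a + area_size <= o + obj_size;
}
```

      For object 0xf5ffffe77076b000 (size 32768) and area 0xffffffe77076f440,
      the raw check rejects the area even though it lies inside the object;
      the untagged check accepts it.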
      
        kmemleak: [name:kmemleak&]Scan area larger than object 0xffffffe77076f440
        CPU: 4 PID: 1 Comm: init Tainted: G S      W         5.15.25-android13-0-g5cacf919c2bc #1
        Hardware name: MT6983(ENG) (DT)
        Call trace:
         add_scan_area+0xc4/0x244
         kmemleak_scan_area+0x40/0x9c
         layout_and_allocate+0x1e8/0x288
         load_module+0x2c8/0xf00
         __se_sys_finit_module+0x190/0x1d0
         __arm64_sys_finit_module+0x20/0x30
         invoke_syscall+0x60/0x170
         el0_svc_common+0xc8/0x114
         do_el0_svc+0x28/0xa0
         el0_svc+0x60/0xf8
         el0t_64_sync_handler+0x88/0xec
         el0t_64_sync+0x1b4/0x1b8
        kmemleak: [name:kmemleak&]Object 0xf5ffffe77076b000 (size 32768):
        kmemleak: [name:kmemleak&]  comm "init", pid 1, jiffies 4294894197
        kmemleak: [name:kmemleak&]  min_count = 0
        kmemleak: [name:kmemleak&]  count = 0
        kmemleak: [name:kmemleak&]  flags = 0x1
        kmemleak: [name:kmemleak&]  checksum = 0
        kmemleak: [name:kmemleak&]  backtrace:
             module_alloc+0x9c/0x120
             move_module+0x34/0x19c
             layout_and_allocate+0x1c4/0x288
             load_module+0x2c8/0xf00
             __se_sys_finit_module+0x190/0x1d0
             __arm64_sys_finit_module+0x20/0x30
             invoke_syscall+0x60/0x170
             el0_svc_common+0xc8/0x114
             do_el0_svc+0x28/0xa0
             el0_svc+0x60/0xf8
             el0t_64_sync_handler+0x88/0xec
             el0t_64_sync+0x1b4/0x1b8
      
      Link: https://lkml.kernel.org/r/20220318034051.30687-1-Kuan-Ying.Lee@mediatek.com
      Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Matthias Brugger <matthias.bgg@gmail.com>
      Cc: Chinwen Chang <chinwen.chang@mediatek.com>
      Cc: Nicholas Tang <nicholas.tang@mediatek.com>
      Cc: Yee Lee <yee.lee@mediatek.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bfc8089f
    • Yinan Zhang's avatar
      doc/vm/page_owner.rst: remove content related to -c option · c89b3ad2
      Yinan Zhang authored
      The -c option has been removed from page_owner_sort.c.
      
      Remove the usage of the -c option from the documentation.
      
      This work is coauthored by
              Shenghong Han
              Yixuan Cao
              Chongxi Zhao
              Jiajian Ye
              Yuhong Feng
              Yongqiang Liu
      
      Link: https://lkml.kernel.org/r/20220326085920.1470081-2-zhangyinan2019@email.szu.edu.cn
      Signed-off-by: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Sean Anderson <seanga2@gmail.com>
      Cc: Tang Bin <tangbin@cmss.chinamobile.com>
      Cc: Zhenliang Wei <weizhenliang@huawei.com>
      Cc: Georgi Djakov <georgi.djakov@linaro.org>
      Cc: Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn>
      Cc: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
      Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
      Cc: Yuhong Feng <yuhongf@szu.edu.cn>
      Cc: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c89b3ad2
    • Yinan Zhang's avatar
      tools/vm/page_owner_sort.c: remove -c option · d8b7b3fa
      Yinan Zhang authored
      The -c option is used to cull by stacktrace.  Now that the --cull
      option has been added to page_owner_sort.c, culling by stacktrace is
      one of the functions of "--cull", so there is no need for an extra
      parameter.  Remove the -c option.
      
      Remove parsing of -c when parsing parameters and remove "-c" from usage.
      
      This work is coauthored by
              Shenghong Han
              Yixuan Cao
              Chongxi Zhao
              Jiajian Ye
              Yuhong Feng
              Yongqiang Liu
      
      Link: https://lkml.kernel.org/r/20220326085920.1470081-1-zhangyinan2019@email.szu.edu.cn
      Signed-off-by: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
      Cc: Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn>
      Cc: Georgi Djakov <georgi.djakov@linaro.org>
      Cc: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Sean Anderson <seanga2@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Tang Bin <tangbin@cmss.chinamobile.com>
      Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
      Cc: Yongqiang Liu <liuyongqiang13@huawei.com>
      Cc: Yuhong Feng <yuhongf@szu.edu.cn>
      Cc: Zhenliang Wei <weizhenliang@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d8b7b3fa
    • Andrey Konovalov's avatar
      mm, kasan: fix __GFP_BITS_SHIFT definition breaking LOCKDEP · ada543af
      Andrey Konovalov authored
      KASAN changes that added new GFP flags mistakenly updated
      __GFP_BITS_SHIFT as the total number of GFP bits instead of as a shift
      used to define __GFP_BITS_MASK.
      
      This broke LOCKDEP, as __GFP_BITS_MASK now gets the 25th bit enabled
      instead of the 28th for __GFP_NOLOCKDEP.
      
      Update __GFP_BITS_SHIFT to always count KASAN GFP bits.
      
      In the future, we could handle all combinations of KASAN and LOCKDEP to
      occupy as few bits as possible.  For now, we have enough GFP bits to be
      inefficient in this quick fix.
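      
      The masking problem can be shown with plain bit arithmetic.  The sketch
      below assumes the mask is built as (1 << shift) - 1 and uses the bit
      positions quoted in the message; 'bits_mask()' and 'flag_survives()' are
      hypothetical helpers, and the kernel's real definitions differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Mask covering bits 0 .. shift-1, the usual "(1 << shift) - 1" form. */
unsigned int bits_mask(unsigned int shift)
{
	assert(shift < 32);
	return (1u << shift) - 1;
}

/* A flag is kept by the mask only if its bit position is below the
 * shift; otherwise masking silently drops it. */
bool flag_survives(unsigned int flag, unsigned int shift)
{
	return (flag & bits_mask(shift)) == flag;
}
```

      A flag in the 28th bit (1u << 27) is dropped by a mask built from a
      shift of 25 and kept by one built from a shift of 28, which is why the
      shift has to count every bit that precedes the flag.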
      
      Link: https://lkml.kernel.org/r/462ff52742a1fcc95a69778685737f723ee4dfb3.1648400273.git.andreyknvl@google.com
      Fixes: 9353ffa6 ("kasan, page_alloc: allow skipping memory init for HW_TAGS")
      Fixes: 53ae233c ("kasan, page_alloc: allow skipping unpoisoning for HW_TAGS")
      Fixes: f49d9c5b ("kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS")
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ada543af
    • Rik van Riel's avatar
      mm,hwpoison: unmap poisoned page before invalidation · 3149c79f
      Rik van Riel authored
      In some cases it appears the invalidation of a hwpoisoned page fails
      because the page is still mapped in another process.  This can cause a
      program to be continuously restarted and die when it page faults on the
      page that was not invalidated.  Avoid that problem by unmapping the
      hwpoisoned page when we find it.
      
      Another issue is that sometimes we end up oopsing in finish_fault, if
      the code tries to do something with the now-NULL vmf->page.  I did not
      hit this error when submitting the previous patch because there are
      several opportunities for alloc_set_pte to bail out before accessing
      vmf->page, and that apparently happened on those systems, and most of
      the time on other systems, too.
      
      However, across several million systems that error does occur a handful
      of times a day.  It can be avoided by returning VM_FAULT_NOPAGE which
      will cause do_read_fault to return before calling finish_fault.
      
      Link: https://lkml.kernel.org/r/20220325161428.5068d97e@imladris.surriel.com
      Fixes: e53ac737 ("mm: invalidate hwpoison page cache page in fault path")
      Signed-off-by: Rik van Riel <riel@surriel.com>
      Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
      Tested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3149c79f
    • Kirill Tkhai's avatar
      4f1f9698
    • Muchun Song's avatar
      mm: kfence: fix objcgs vector allocation · 8f0b3649
      Muchun Song authored
      If the kfence object is allocated to be used for an objects vector,
      then this slot of the pool ends up being occupied permanently since the
      vector is never freed.  The solutions could be (1) freeing the vector
      when the kfence object is freed or (2) allocating all vectors statically.
      
      Since the memory consumption of object vectors is low, it is better to
      choose (2) to fix the issue, and it can also reduce the overhead of
      allocating vectors in the future.
      
      Link: https://lkml.kernel.org/r/20220328132843.16624-1-songmuchun@bytedance.com
      Fixes: d3fb45f3 ("mm, kfence: insert KFENCE hooks for SLAB")
      Signed-off-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8f0b3649
    • Sebastian Andrzej Siewior's avatar
      mm/munlock: protect the per-CPU pagevec by a local_lock_t · adb11e78
      Sebastian Andrzej Siewior authored
      The access to mlock_pvec is protected by disabling preemption via
      get_cpu_var() or implicitly by having preemption disabled by the caller
      (in the mlock_page_drain() case).  This breaks on PREEMPT_RT since
      folio_lruvec_lock_irq() acquires a sleeping lock in this section.
      
      Create struct mlock_pvec which consists of the local_lock_t and the
      pagevec.  Acquire the local_lock() before accessing the per-CPU pagevec.
      Replace mlock_page_drain() with a _local() version which is invoked on
      the local CPU and acquires the local_lock_t and a _remote() version
      which uses the pagevec from a remote CPU which is offline.
      
      Link: https://lkml.kernel.org/r/YjizWi9IY0mpvIfb@linutronix.de
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      adb11e78
    • Hugh Dickins's avatar
      mm/munlock: update Documentation/vm/unevictable-lru.rst · 577e9846
      Hugh Dickins authored
      Update Documentation/vm/unevictable-lru.rst to reflect the changes made
      by the mm/munlock series: keeping an mlock_count instead of page_mlock()
      (formerly try_to_munlock()) and munlock_vma_pages_all() etc.  Also make
      other little updates or cleanups wherever noticed.
      
      But, I apologize, this is already out of date, in that "folio" appears
      nowhere: 5.18 will be in a transitional state from "page" to "folio",
      and documenting its current mix of the two does not help to understand
      "the Unevictable LRU".  Should be revisited when naming is more settled.
      
      Link: https://lkml.kernel.org/r/3753962-d491-bf60-f59f-51bfe84fd6a0@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      577e9846
    • Hugh Dickins's avatar
      mm/munlock: add lru_add_drain() to fix memcg_stat_test · ece369c7
      Hugh Dickins authored
      Mike reports that LTP memcg_stat_test usually leads to
      
        memcg_stat_test 3 TINFO: Test unevictable with MAP_LOCKED
        memcg_stat_test 3 TINFO: Running memcg_process --mmap-lock1 -s 135168
        memcg_stat_test 3 TINFO: Warming up pid: 3460
        memcg_stat_test 3 TINFO: Process is still here after warm up: 3460
        memcg_stat_test 3 TFAIL: unevictable is 122880, 135168 expected
      
      but may also lead to
      
        memcg_stat_test 4 TINFO: Test unevictable with mlock
        memcg_stat_test 4 TINFO: Running memcg_process --mmap-lock2 -s 135168
        memcg_stat_test 4 TINFO: Warming up pid: 4271
        memcg_stat_test 4 TINFO: Process is still here after warm up: 4271
        memcg_stat_test 4 TFAIL: unevictable is 122880, 135168 expected
      
      or both.  A wee bit flaky.
      
      follow_page_pte() used to have an lru_add_drain() per each page mlocked,
      and the test came to rely on accurate stats.  The pagevec to be drained
      is different now, but still covered by lru_add_drain(); and, never mind
      the test, I believe it's in everyone's interest that a bulk faulting
      interface like populate_vma_page_range() or faultin_vma_page_range()
      should drain its local pagevecs at the end, to save others sometimes
      needing the much more expensive lru_add_drain_all().
      
      This does not absolutely guarantee exact stats - the mlocking task can
      be migrated between CPUs as it proceeds - but it's good enough and the
      tests pass.
      
      Link: https://lkml.kernel.org/r/47f6d39c-a075-50cb-1cfb-26dd957a48af@google.com
      Fixes: b67bf49c ("mm/munlock: delete FOLL_MLOCK and FOLL_POPULATE")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reported-by: Mike Galbraith <efault@gmx.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ece369c7
    • Ryusuke Konishi's avatar
      nilfs2: get rid of nilfs_mapping_init() · cdd81b31
      Ryusuke Konishi authored
      After applying the lockdep warning fixes, nilfs_mapping_init() is no
      longer used, so delete it.
      
      Link: https://lkml.kernel.org/r/1647867427-30498-4-git-send-email-konishi.ryusuke@gmail.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Hao Sun <sunhao.th@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cdd81b31
    • Ryusuke Konishi's avatar
      nilfs2: fix lockdep warnings during disk space reclamation · 6e211930
      Ryusuke Konishi authored
      During disk space reclamation, nilfs2 still emits the following lockdep
      warning due to page/folio operations on shadowed page caches that nilfs2
      uses to get a snapshot of DAT file in memory:
      
        WARNING: CPU: 0 PID: 2643 at include/linux/backing-dev.h:272 __folio_mark_dirty+0x645/0x670
        ...
        RIP: 0010:__folio_mark_dirty+0x645/0x670
        ...
        Call Trace:
          filemap_dirty_folio+0x74/0xd0
          __set_page_dirty_nobuffers+0x85/0xb0
          nilfs_copy_dirty_pages+0x288/0x510 [nilfs2]
          nilfs_mdt_save_to_shadow_map+0x50/0xe0 [nilfs2]
          nilfs_clean_segments+0xee/0x5d0 [nilfs2]
          nilfs_ioctl_clean_segments.isra.19+0xb08/0xf40 [nilfs2]
          nilfs_ioctl+0xc52/0xfb0 [nilfs2]
          __x64_sys_ioctl+0x11d/0x170
      
      This fixes the remaining warning by using inode objects to hold those
      page caches.
      
      Link: https://lkml.kernel.org/r/1647867427-30498-3-git-send-email-konishi.ryusuke@gmail.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Hao Sun <sunhao.th@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6e211930
    • Ryusuke Konishi's avatar
      nilfs2: fix lockdep warnings in page operations for btree nodes · e897be17
      Ryusuke Konishi authored
      Patch series "nilfs2 lockdep warning fixes".
      
      The first two are to resolve the lockdep warning issue, and the last one
      is the accompanying cleanup and low priority.
      
      Based on your comment, this series solves the issue by separating inode
      objects as needed.  Since I was worried about the impact of the object
      composition changes, I tested the series carefully to make sure it does
      not cause regressions, especially for delicate functions such as disk
      space reclamation and snapshots.
      
      This patch (of 3):
      
      If CONFIG_LOCKDEP is enabled, nilfs2 hits lockdep warnings at
      inode_to_wb() during page/folio operations for btree nodes:
      
        WARNING: CPU: 0 PID: 6575 at include/linux/backing-dev.h:269 inode_to_wb include/linux/backing-dev.h:269 [inline]
        WARNING: CPU: 0 PID: 6575 at include/linux/backing-dev.h:269 folio_account_dirtied mm/page-writeback.c:2460 [inline]
        WARNING: CPU: 0 PID: 6575 at include/linux/backing-dev.h:269 __folio_mark_dirty+0xa7c/0xe30 mm/page-writeback.c:2509
        Modules linked in:
        ...
        RIP: 0010:inode_to_wb include/linux/backing-dev.h:269 [inline]
        RIP: 0010:folio_account_dirtied mm/page-writeback.c:2460 [inline]
        RIP: 0010:__folio_mark_dirty+0xa7c/0xe30 mm/page-writeback.c:2509
        ...
        Call Trace:
          __set_page_dirty include/linux/pagemap.h:834 [inline]
          mark_buffer_dirty+0x4e6/0x650 fs/buffer.c:1145
          nilfs_btree_propagate_p fs/nilfs2/btree.c:1889 [inline]
          nilfs_btree_propagate+0x4ae/0xea0 fs/nilfs2/btree.c:2085
          nilfs_bmap_propagate+0x73/0x170 fs/nilfs2/bmap.c:337
          nilfs_collect_dat_data+0x45/0xd0 fs/nilfs2/segment.c:625
          nilfs_segctor_apply_buffers+0x14a/0x470 fs/nilfs2/segment.c:1009
          nilfs_segctor_scan_file+0x47a/0x700 fs/nilfs2/segment.c:1048
          nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1224 [inline]
          nilfs_segctor_collect fs/nilfs2/segment.c:1494 [inline]
          nilfs_segctor_do_construct+0x14f3/0x6c60 fs/nilfs2/segment.c:2036
          nilfs_segctor_construct+0x7a7/0xb30 fs/nilfs2/segment.c:2372
          nilfs_segctor_thread_construct fs/nilfs2/segment.c:2480 [inline]
          nilfs_segctor_thread+0x3c3/0xf90 fs/nilfs2/segment.c:2563
          kthread+0x405/0x4f0 kernel/kthread.c:327
          ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
      
      This is because nilfs2 uses two page caches for each inode and
      inode->i_mapping never points to one of them, the btree node cache.
      
      This causes inode_to_wb(inode) to refer to a different page cache than
      the one for which the calling page/folio operations, such as
      __folio_start_writeback(), __folio_end_writeback(), or
      __folio_mark_dirty(), acquired the lock.
      
      This patch resolves the issue by allocating and using an additional
      inode to hold the page cache of btree nodes.  The inode is attached
      one-to-one to the traditional nilfs2 inode if it requires a block
      mapping with b-tree.  This setup change is in memory only and does not
      affect the disk format.
      
      Link: https://lkml.kernel.org/r/1647867427-30498-1-git-send-email-konishi.ryusuke@gmail.com
      Link: https://lkml.kernel.org/r/1647867427-30498-2-git-send-email-konishi.ryusuke@gmail.com
      Link: https://lore.kernel.org/r/YXrYvIo8YRnAOJCj@casper.infradead.org
      Link: https://lore.kernel.org/r/9a20b33d-b38f-b4a2-4742-c1eb5b8e4d6c@redhat.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: syzbot+0d5b462a6f07447991b3@syzkaller.appspotmail.com
      Reported-by: syzbot+34ef28bb2aeb28724aa0@syzkaller.appspotmail.com
      Reported-by: Hao Sun <sunhao.th@gmail.com>
      Reported-by: David Hildenbrand <david@redhat.com>
      Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e897be17
    • Joseph Qi's avatar
      ocfs2: fix crash when mount with quota enabled · de194334
      Joseph Qi authored
      There is a reported crash when mounting ocfs2 with quota enabled.
      
        RIP: 0010:ocfs2_qinfo_lock_res_init+0x44/0x50 [ocfs2]
        Call Trace:
          ocfs2_local_read_info+0xb9/0x6f0 [ocfs2]
          dquot_load_quota_sb+0x216/0x470
          dquot_load_quota_inode+0x85/0x100
          ocfs2_enable_quotas+0xa0/0x1c0 [ocfs2]
          ocfs2_fill_super.cold+0xc8/0x1bf [ocfs2]
          mount_bdev+0x185/0x1b0
          legacy_get_tree+0x27/0x40
          vfs_get_tree+0x25/0xb0
          path_mount+0x465/0xac0
          __x64_sys_mount+0x103/0x140
      
      It is caused by dqi_type and dqi_sb not being properly initialized at
      the point where dqi_gqlock is initialized.
      
      This issue was introduced by commit 6c85c2c7, which was meant to avoid
      accessing uninitialized variables in error cases.  So make sure the
      global quota info is properly initialized.
      
      Link: https://lkml.kernel.org/r/20220323023644.40084-1-joseph.qi@linux.alibaba.com
      Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1007141
      Fixes: 6c85c2c7 ("ocfs2: quota_local: fix possible uninitialized-variable access in ocfs2_local_read_info()")
      Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Reported-by: Dayvison <sathlerds@gmail.com>
      Tested-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      de194334
    • Charan Teja Kalla's avatar
      Revert "mm: madvise: skip unmapped vma holes passed to process_madvise" · e6b0a7b3
      Charan Teja Kalla authored
      This reverts commit 08095d63 ("mm: madvise: skip unmapped vma holes
      passed to process_madvise") as process_madvise() fails to return the
      exact processed bytes in other cases too.
      
      As an example: if process_madvise() hits mlocked pages after processing
      some initial bytes passed in [start, end), it just returns EINVAL
      although some bytes have been processed.  Making an exception only for
      ENOMEM therefore fixes the problem of returning the proper advised
      bytes only partially.
      
      Thus revert this patch and return proper bytes advised.
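      
      The convention argued for here, reporting the bytes advised whenever
      any progress was made and an error code only when none was, can be
      sketched as follows.  'struct range' and 'advise_all()' are hypothetical
      names for illustration; this is not the process_madvise() code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical per-range outcome: 0 on success, negative errno on
 * failure (for example -EINVAL or -ENOMEM). */
struct range {
	size_t len;
	int result;
};

/* Walk the ranges in order and stop at the first failure.  The caller
 * gets the number of bytes advised if there was any progress, and the
 * error code otherwise, whichever error stopped the walk. */
ssize_t advise_all(const struct range *r, size_t n)
{
	size_t done = 0;
	int err = 0;
	size_t i;

	assert(r != NULL || n == 0);
	for (i = 0; i < n; i++) {
		err = r[i].result;
		if (err)
			break;
		done += r[i].len;
	}
	return done ? (ssize_t)done : err;
}
```

      A walk that advises 4096 bytes and then fails with -EINVAL reports
      4096, exactly as it would if the failure had been -ENOMEM; no error
      code is treated as a special case.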
      
      Link: https://lkml.kernel.org/r/e73da1304a88b6a8a11907045117cccf4c2b8374.1648046642.git.quic_charante@quicinc.com
      Fixes: 08095d63 ("mm: madvise: skip unmapped vma holes passed to process_madvise")
      Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e6b0a7b3