1. 13 Feb, 2024 9 commits
    • xfs: place intent recovery under NOFS allocation context · 2c1e31ed
      Dave Chinner authored
      When recovery starts processing intents, all of the initial intent
      allocations are done outside of transaction contexts. That means
      they need to specifically use GFP_NOFS as we do not want memory
      reclaim to attempt to run direct reclaim of filesystem objects while
      we have lots of objects added into deferred operations.
      
      Rather than use GFP_NOFS for these specific allocations, just place
      the entire intent recovery process under NOFS context and we can
      then just use GFP_KERNEL for these allocations.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
    • xfs: use GFP_KERNEL in pure transaction contexts · 0b3a76e9
      Dave Chinner authored
      When running in a transaction context, memory allocations are scoped
      to GFP_NOFS. Hence we don't need to use GFP_NOFS contexts in pure
      transaction context allocations - GFP_KERNEL will automatically get
      converted to GFP_NOFS as appropriate.
      
      Go through the code and convert all the obvious GFP_NOFS allocations
      in transaction context to use GFP_KERNEL. This further reduces the
      explicit use of GFP_NOFS in XFS.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
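The scoping idea behind the two commits above can be modelled in userspace. What follows is a minimal sketch under stated assumptions, not kernel code: every `model_`/`MODEL_` name and every flag value is hypothetical. In the kernel the scope is entered and left with memalloc_nofs_save() and memalloc_nofs_restore(), which toggle per-task state that the allocator consults; here a plain static variable stands in for that state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this model; not the kernel's real values. */
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_IO)

/* Stand-in for the per-task "NOFS scope active" state. */
static bool model_task_nofs;

/* Enter a NOFS scope and return the previous state so scopes can nest. */
static bool model_nofs_save(void)
{
	bool old = model_task_nofs;

	model_task_nofs = true;
	return old;
}

/* Leave the scope by restoring whatever the matching save returned. */
static void model_nofs_restore(bool old)
{
	model_task_nofs = old;
}

/* What the allocator would act on: the FS bit is masked inside a scope. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	return model_task_nofs ? (gfp & ~MODEL_GFP_FS) : gfp;
}

/* Usage: a GFP_KERNEL request behaves as GFP_NOFS only inside the scope. */
static void model_nofs_selfcheck(void)
{
	bool old = model_nofs_save();

	assert(model_effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_NOFS);
	model_nofs_restore(old);
	assert(model_effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_KERNEL);
}
```

Returning the previous state from the save call is what lets a nested scope end without prematurely lifting the restriction set by an outer one.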
    • xfs: use __GFP_NOLOCKDEP instead of GFP_NOFS · 94a69db2
      Dave Chinner authored
      In the past we've had problems with lockdep false positives stemming
      from inode locking occurring in memory reclaim contexts (e.g. from
      superblock shrinkers). Lockdep doesn't know that inodes accessed from
      above memory reclaim cannot be accessed from below memory reclaim
      (and vice versa) but there has never been a good solution to solving
      this problem with lockdep annotations.
      
      This situation isn't unique to inode locks - buffers are also locked
      above and below memory reclaim, and we have to maintain lock
      ordering for them - and against inodes - appropriately. IOWs, the
      same code paths and locks are taken both above and below memory
      reclaim and so we always need to make sure the lock orders are
      consistent. We are spared the lockdep problems this might cause
      by the fact that semaphores and bit locks aren't covered by lockdep.
      
      In general, this sort of lockdep false positive detection is caused
      by code that runs GFP_KERNEL memory allocation with an actively
      referenced inode locked. When it is run from a transaction, memory
      allocation is automatically GFP_NOFS, so we don't have reclaim
      recursion issues. So in the places where we do memory allocation
      with inodes locked outside of a transaction, we have explicitly set
      them to use GFP_NOFS allocations to prevent lockdep false positives
      from being reported if the allocation dips into direct memory
      reclaim.
      
      More recently, __GFP_NOLOCKDEP was added to the memory allocation
      flags to tell lockdep not to track that particular allocation for
      the purposes of reclaim recursion detection. This is a much better
      way of preventing false positives - it allows us to use GFP_KERNEL
      context outside of transactions, and allows direct memory reclaim to
      proceed normally without throwing out false positive deadlock
      warnings.
      
      The obvious places that lock inodes and do memory allocation are the
      lookup paths and inode extent list initialisation. These occur in
      non-transactional GFP_KERNEL contexts, and so can run direct reclaim
      and lock inodes.
      
      This patch makes a first pass through all the explicit GFP_NOFS
      allocations in XFS and converts the obvious ones to GFP_KERNEL |
      __GFP_NOLOCKDEP as a first step towards removing explicit GFP_NOFS
      allocations from the XFS code.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
    • xfs: use an empty transaction for fstrim · 178231af
      Dave Chinner authored
      We currently use a btree walk in the fstrim code. This requires a
      btree cursor and btree cursors are only used inside transactions
      except for the fstrim code. This means that all the btree operations
      that allocate memory operate in both GFP_KERNEL and GFP_NOFS
      contexts.
      
      This causes problems with lockdep being unable to determine the
      difference between objects that are safe to lock both above and
      below memory reclaim. Free space btree buffers are definitely locked
      both above and below reclaim and that means we have to mark all
      btree infrastructure allocations with GFP_NOFS to avoid potential
      lockdep false positives.
      
      If we wrap this btree walk in an empty transaction, all btree walks are
      now done under transaction context and so all allocations inherit
      GFP_NOFS context from the transaction. This enables us to move all
      the btree allocations to GFP_KERNEL context and hence help remove
      the explicit use of GFP_NOFS in XFS.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
    • xfs: convert remaining kmem_free() to kfree() · d4c75a1b
      Dave Chinner authored
      The remaining callers of kmem_free() are freeing heap memory, so
      we can convert them directly to kfree() and get rid of kmem_free()
      altogether.
      
      This conversion was done with:
      
      $ for f in `git grep -l kmem_free fs/xfs`; do
      > sed -i s/kmem_free/kfree/ $f
      > done
      $
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
    • xfs: convert kmem_free() for kvmalloc users to kvfree() · 49292576
      Dave Chinner authored
      Start getting rid of kmem_free() by converting all the cases where
      memory can come from vmalloc interfaces to calling kvfree()
      directly.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
    • xfs: move kmem_to_page() · afdc1155
      Dave Chinner authored
      Move kmem_to_page() to the general xfs linux wrapper header file so
      we can prepare to remove kmem.h.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
    • xfs: convert kmem_alloc() to kmalloc() · f078d4ea
      Dave Chinner authored
      kmem_alloc() is just a thin wrapper around kmalloc() these days.
      Convert everything to use kmalloc() so we can get rid of the
      wrapper.
      
      Note: the transaction region allocation in xlog_add_to_transaction()
      can be a high order allocation. Converting it to use
      kmalloc(__GFP_NOFAIL) results in warnings in the page allocation
      code being triggered because the mm subsystem does not want us to
      use __GFP_NOFAIL with high order allocations like we've been doing
      with the kmem_alloc() wrapper for a couple of decades. Hence this
      specific case gets converted to xlog_kvmalloc() rather than
      kmalloc() to avoid this issue.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
    • xfs: convert kmem_zalloc() to kzalloc() · 10634530
      Dave Chinner authored
      There's no reason to keep kmem_zalloc() around anymore: it's
      just a thin wrapper around kmalloc(), so let's get rid of it.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
  2. 11 Feb, 2024 3 commits
  3. 10 Feb, 2024 8 commits
    • Merge tag 'mm-hotfixes-stable-2024-02-10-11-16' of... · 7521f258
      Linus Torvalds authored
      Merge tag 'mm-hotfixes-stable-2024-02-10-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull misc fixes from Andrew Morton:
       "21 hotfixes. 12 are cc:stable and the remainder pertain to post-6.7
        issues or aren't considered to be needed in earlier kernel versions"
      
      * tag 'mm-hotfixes-stable-2024-02-10-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits)
        nilfs2: fix potential bug in end_buffer_async_write
        mm/damon/sysfs-schemes: fix wrong DAMOS tried regions update timeout setup
        nilfs2: fix hang in nilfs_lookup_dirty_data_buffers()
        MAINTAINERS: Leo Yan has moved
        mm/zswap: don't return LRU_SKIP if we have dropped lru lock
        fs,hugetlb: fix NULL pointer dereference in hugetlbs_fill_super
        mailmap: switch email address for John Moon
        mm: zswap: fix objcg use-after-free in entry destruction
        mm/madvise: don't forget to leave lazy MMU mode in madvise_cold_or_pageout_pte_range()
        arch/arm/mm: fix major fault accounting when retrying under per-VMA lock
        selftests: core: include linux/close_range.h for CLOSE_RANGE_* macros
        mm/memory-failure: fix crash in split_huge_page_to_list from soft_offline_page
        mm: memcg: optimize parent iteration in memcg_rstat_updated()
        nilfs2: fix data corruption in dsync block recovery for small block sizes
        mm/userfaultfd: UFFDIO_MOVE implementation should use ptep_get()
        exit: wait_task_zombie: kill the no longer necessary spin_lock_irq(siglock)
        fs/proc: do_task_stat: use sig->stats_lock to gather the threads/children stats
        fs/proc: do_task_stat: move thread_group_cputime_adjusted() outside of lock_task_sighand()
        getrusage: use sig->stats_lock rather than lock_task_sighand()
        getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand()
        ...
    • Merge tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linux · a5b6244c
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - NVMe pull request via Keith:
           - Update a potentially stale firmware attribute (Maurizio)
           - Fixes for the recent verbose error logging (Keith, Chaitanya)
           - Protection information payload size fix for passthrough (Francis)
      
       - Fix for a queue freezing issue in virtblk (Yi)
      
       - blk-iocost underflow fix (Tejun)
      
       - blk-wbt task detection fix (Jan)
      
      * tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linux:
        virtio-blk: Ensure no requests in virtqueues before deleting vqs.
        blk-iocost: Fix an UBSAN shift-out-of-bounds warning
        nvme: use ns->head->pi_size instead of t10_pi_tuple structure size
        nvme-core: fix comment to reflect right functions
        nvme: move passthrough logging attribute to head
        blk-wbt: Fix detection of dirty-throttled tasks
        nvme-host: fix the updating of the firmware version
    • Merge tag 'firewire-fixes-6.8-rc4' of... · a38ff5bb
      Linus Torvalds authored
      Merge tag 'firewire-fixes-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
      
      Pull firewire fix from Takashi Sakamoto:
       "A change to accelerate the device detection step in some cases.
      
        In the self-identification step after bus-reset, all nodes in the same
        bus broadcast a selfID packet including the value of gap count. The
        value is related to the cable hops between nodes, and used to
        calculate the subaction gap and the arbitration reset gap.

        When nodes have different gap count values, the asynchronous
        communication between them is unreliable, since an asynchronous
        transaction could be interrupted by another asynchronous
        transaction before completion. The gap count inconsistency can be
        resolved in several ways; e.g. the transfer of PHY configuration
        packet and generation of bus-reset.

        The current implementation of the firewire stack can correctly detect
        the gap count inconsistency, however the recovery action from the
        inconsistency tends to be delayed after reading configuration ROM of
        root node. This results in a long time to probe devices in some
        combinations of hardware.
      
        Here the stack is changed to schedule the action as soon as possible"
      
      * tag 'firewire-fixes-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
        firewire: core: send bus reset promptly on gap count error
    • Merge tag '6.8-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd · 5a7ec870
      Linus Torvalds authored
      Pull smb server fixes from Steve French:
       "Two ksmbd server fixes:
      
         - memory leak fix
      
         - a minor kernel-doc fix"
      
      * tag '6.8-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
        ksmbd: free aux buffer if ksmbd_iov_pin_rsp_read fails
        ksmbd: Add kernel-doc for ksmbd_extract_sharename() function
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 4a7bbe75
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Three small driver fixes and one core fix.
      
        The core fix being a fixup to the one in the last pull request which
        didn't entirely move checking of scsi_host_busy() out from under the
        host lock"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: ufs: core: Remove the ufshcd_release() in ufshcd_err_handling_prepare()
        scsi: ufs: core: Fix shift issue in ufshcd_clear_cmd()
        scsi: lpfc: Use unsigned type for num_sge
        scsi: core: Move scsi_host_busy() out of host lock if it is for per-command
    • Merge tag '6.8-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · ca00c700
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
      
       - reconnect fix
      
       - multichannel channel selection fix
      
       - minor mount warning fix
      
       - reparse point fix
      
       - null pointer check improvement
      
      * tag '6.8-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        smb3: clarify mount warning
        cifs: handle cases where multiple sessions share connection
        cifs: change tcon status when need_reconnect is set on it
        smb: client: set correct d_type for reparse points under DFS mounts
        smb3: add missing null server pointer check
    • Merge tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-client · e1e3f530
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "Some fscrypt-related fixups (sparse reads are used only for encrypted
        files) and two cap handling fixes from Xiubo and Rishabh"
      
      * tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-client:
        ceph: always check dir caps asynchronously
        ceph: prevent use-after-free in encode_cap_msg()
        ceph: always set initial i_blkbits to CEPH_FSCRYPT_BLOCK_SHIFT
        libceph: just wait for more data to be available on the socket
        libceph: rename read_sparse_msg_*() to read_partial_sparse_msg_*()
        libceph: fail sparse-read if the data length doesn't match
    • Merge tag 'ntfs3_for_6.8' of https://github.com/Paragon-Software-Group/linux-ntfs3 · a2343df3
      Linus Torvalds authored
      Pull ntfs3 fixes from Konstantin Komarov:
       "Fixed:
         - size update for compressed file
         - some logic errors, overflows
         - memory leak
         - some code was refactored
      
        Added:
         - implement super_operations::shutdown
      
        Improved:
         - alternative boot processing
         - reduced stack usage"
      
      * tag 'ntfs3_for_6.8' of https://github.com/Paragon-Software-Group/linux-ntfs3: (28 commits)
        fs/ntfs3: Slightly simplify ntfs_inode_printk()
        fs/ntfs3: Add ioctl operation for directories (FITRIM)
        fs/ntfs3: Fix oob in ntfs_listxattr
        fs/ntfs3: Fix an NULL dereference bug
        fs/ntfs3: Update inode->i_size after success write into compressed file
        fs/ntfs3: Fixed overflow check in mi_enum_attr()
        fs/ntfs3: Correct function is_rst_area_valid
        fs/ntfs3: Use i_size_read and i_size_write
        fs/ntfs3: Prevent generic message "attempt to access beyond end of device"
        fs/ntfs3: use non-movable memory for ntfs3 MFT buffer cache
        fs/ntfs3: Use kvfree to free memory allocated by kvmalloc
        fs/ntfs3: Disable ATTR_LIST_ENTRY size check
        fs/ntfs3: Fix c/mtime typo
        fs/ntfs3: Add NULL ptr dereference checking at the end of attr_allocate_frame()
        fs/ntfs3: Add and fix comments
        fs/ntfs3: ntfs3_forced_shutdown use int instead of bool
        fs/ntfs3: Implement super_operations::shutdown
        fs/ntfs3: Drop suid and sgid bits as a part of fpunch
        fs/ntfs3: Add file_modified
        fs/ntfs3: Correct use bh_read
        ...
  4. 09 Feb, 2024 20 commits
    • work around gcc bugs with 'asm goto' with outputs · 4356e9f8
      Linus Torvalds authored
      We've had issues with gcc and 'asm goto' before, and we created a
      'asm_volatile_goto()' macro for that in the past: see commits
      3f0116c3 ("compiler/gcc4: Add quirk for 'asm goto' miscompilation
      bug") and a9f18034 ("compiler/gcc4: Make quirk for
      asm_volatile_goto() unconditional").
      
      Then, much later, we ended up removing the workaround in commit
      43c249ea ("compiler-gcc.h: remove ancient workaround for gcc PR
      58670") because we no longer supported building the kernel with the
      affected gcc versions, but we left the macro uses around.
      
      Now, Sean Christopherson reports a new version of a very similar
      problem, which is fixed by re-applying that ancient workaround.  But the
      problem in question is limited to only the 'asm goto with outputs'
      cases, so instead of re-introducing the old workaround as-is, let's
      rename and limit the workaround to just that much less common case.
      
      It looks like there are at least two separate issues that all hit in
      this area:
      
       (a) some versions of gcc don't mark the asm goto as 'volatile' when it
           has outputs:
      
              https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
              https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420
      
           which is easy to work around by just adding the 'volatile' by hand.
      
       (b) Internal compiler errors:
      
              https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422
      
           which are worked around by adding the extra empty 'asm' as a
           barrier, as in the original workaround.
      
      but the problem Sean sees may be a third thing since it involves bad
      code generation (not an ICE) even with the manually added 'volatile'.
      
      Still, the same old workaround works for this case, even if this feels a
      bit like voodoo programming and may only be hiding the issue.
      Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
      Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Uros Bizjak <ubizjak@gmail.com>
      Cc: Jakub Jelinek <jakub@redhat.com>
      Cc: Andrew Pinski <quic_apinski@quicinc.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
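The two ingredients named in the message above, a hand-added 'volatile' and an extra empty 'asm' acting as a barrier, can be combined into one macro. The sketch below is an assumption about the general shape only: the macro name `model_asm_goto_output` and the helper functions are hypothetical, and the message itself does not show the final definition. It needs a compiler that accepts 'asm goto' with output operands (roughly GCC 11 or later, or a recent Clang).

```c
#include <assert.h>

/*
 * Assumed shape of the workaround: force 'volatile' onto the asm goto and
 * follow it with an empty asm statement that acts as a barrier.
 */
#define model_asm_goto_output(...) \
	do { __asm__ __volatile__ goto(__VA_ARGS__); __asm__ (""); } while (0)

/*
 * The asm template is empty, so it never branches to 'fail' and leaves the
 * in/out operand 'v' untouched. Real users would put instructions here.
 */
static int model_pass_through(int v)
{
	model_asm_goto_output("" : "+r"(v) : : : fail);
	return v;
fail:
	return -1;
}

/* Usage: the fall-through path returns the value unchanged. */
static void model_asm_selfcheck(void)
{
	assert(model_pass_through(41) == 41);
}
```

Wrapping both statements in do { } while (0) keeps the macro usable wherever a single statement is expected.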
    • smb3: clarify mount warning · a5cc98eb
      Steve French authored
      When a user tries to use the "sec=krb5p" mount parameter to encrypt
      data on connection to a server (when authenticating with Kerberos), we
      indicate that it is not supported, but do not note the equivalent
      recommended mount parameter ("sec=krb5,seal") which turns on encryption
      for that mount (and uses Kerberos for auth).  Update the warning message.
      Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: handle cases where multiple sessions share connection · a39c757b
      Shyam Prasad N authored
      Based on our implementation of multichannel, it is entirely
      possible that a server struct may not be found in any channel
      of an SMB session.
      
      In such cases, we should be prepared to move on and search for
      the server struct in the next session.
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: change tcon status when need_reconnect is set on it · c6e02eef
      Shyam Prasad N authored
      When a tcon is marked for need_reconnect, the intention
      is to have it reconnected.
      
      This change adjusts tcon->status in cifs_tree_connect
      when need_reconnect is set. Also, this change has a minor
      correction in resetting need_reconnect on success. It makes
      sure that it is done with tc_lock held.
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • Merge tag 'riscv-for-linus-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 9ed18b0b
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - fix missing TLB flush during early boot on SPARSEMEM_VMEMMAP
         configurations
      
       - fixes to correctly implement the break-before-make behavior required
         by the ISA for NAPOT mappings
      
       - fix a missing TLB flush on intermediate mapping changes
      
       - fix build warning about a missing declaration of overflow_stack
      
       - fix performance regression related to incorrect tracking of completed
         batch TLB flushes
      
      * tag 'riscv-for-linus-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: Fix arch_tlbbatch_flush() by clearing the batch cpumask
        riscv: declare overflow_stack as exported from traps.c
        riscv: Fix arch_hugetlb_migration_supported() for NAPOT
        riscv: Flush the tlb when a page directory is freed
        riscv: Fix hugetlb_mask_last_page() when NAPOT is enabled
        riscv: Fix set_huge_pte_at() for NAPOT mapping
        riscv: mm: execute local TLB flush after populating vmemmap
    • Merge tag 'trace-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · ca8a6673
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Fix broken direct trampolines being called when another callback is
         attached to the same function.
      
         ARM64 does not support FTRACE_WITH_REGS, and when it added direct
         trampoline calls from ftrace, it removed the "WITH_REGS" flag from
         the ftrace_ops for direct trampolines. This broke x86 as x86 requires
         direct trampolines to have WITH_REGS.
      
         This wasn't noticed because direct trampolines work as long as the
         function they are attached to is not shared with other callbacks (like
         the function tracer). When there are other callbacks, a helper
         trampoline is called, to call all the non direct callbacks and when
         it returns, the direct trampoline is called.
      
         For x86, the direct trampoline sets a flag in the regs field to tell
         the x86 specific code to call the direct trampoline. But this only
         works if the ftrace_ops had WITH_REGS set. ARM does things
         differently and does not require this. For now, set WITH_REGS if the
         arch supports WITH_REGS (which ARM does not), and this makes it work
         for both ARM64 and x86.
      
       - Fix wasted memory in the saved_cmdlines logic.
      
         The saved_cmdlines is a cache that maps PIDs to COMMs that tracing
         can use. Most trace events only save the PID in the event. The
         saved_cmdlines file lists PIDs to COMMs so that the tracing tools can
         show an actual name and not just a PID for each event. There's an
         array of PIDs that map to a small set of saved COMM strings. The
         array is set to PID_MAX_DEFAULT which is usually set to 32768. When a
         PID comes in, it will add itself to this array along with the index
         into the COMM array (note if the system allows more than
         PID_MAX_DEFAULT, this cache is similar to cache lines as an update of
         a PID that has the same PID_MAX_DEFAULT bits set will flush out
         another task with the same matching bits set).
      
         A while ago, the size of this cache was changed to be dynamic and the
         array was moved into a structure and created with kmalloc(). But this
         new structure had the size of 131104 bytes, or 0x20020 in hex. As
         kmalloc allocates in powers of two, it was actually allocating
         0x40000 bytes (262144) leaving 131040 bytes of wasted memory. The
         last element of this structure was a pointer to the COMM string array
         which defaulted to just saving 128 COMMs.
      
         By changing the last field of this structure to a variable length
         string, and just having it round up to fill the allocated memory, the
         default size of the saved COMM cache is now 8190. This not only uses
         the wasted space, but actually saves space by removing the extra
         allocation for the COMM names.
      
      * tag 'trace-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracing: Fix wasted memory in saved_cmdlines logic
        ftrace: Fix DIRECT_CALLS to use SAVE_REGS by default
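The saved_cmdlines figures quoted in the pull message above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the allocation is simply rounded up to the next power of two as the message describes; the function names are hypothetical, and the 16-byte COMM length used in the usage note is an assumption (it matches the kernel's TASK_COMM_LEN but is not stated in the message).

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next power of two (model of the sizing described above). */
static size_t model_round_up_pow2(size_t n)
{
	size_t p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}

/* Bytes left over when a request of 'size' lands in a power-of-two bucket. */
static size_t model_slack_bytes(size_t size)
{
	return model_round_up_pow2(size) - size;
}

/* How many fixed-length COMM strings fit into that slack. */
static size_t model_comms_in_slack(size_t size, size_t comm_len)
{
	assert(comm_len > 0);
	return model_slack_bytes(size) / comm_len;
}
```

With the numbers from the message, model_round_up_pow2(131104) is 262144 (0x40000), model_slack_bytes(131104) is 131040, and model_comms_in_slack(131104, 16) is 8190, which agrees with the new default cache size quoted above.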
    • Merge tag 'probes-fixes-v6.8-rc3' of... · 6dc512a0
      Linus Torvalds authored
      Merge tag 'probes-fixes-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
      
      Pull probes fixes from Masami Hiramatsu:
      
       - remove unnecessary initial values of kprobes local variables
      
       - probe-events parser bug fixes:
      
          - calculate the argument size and format string after setting type
            information from BTF, because BTF can change the size and format
            string.
      
          - show $comm parse error correctly instead of failing silently.
      
      * tag 'probes-fixes-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        kprobes: Remove unnecessary initial values of variables
        tracing/probes: Fix to set arg size and fmt after setting type from BTF
        tracing/probes: Fix to show a parse error for bad type for $comm
    • Merge tag 'efi-fixes-for-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · e6f39a90
      Linus Torvalds authored
      Pull EFI fixes from Ard Biesheuvel:
       "The only notable change here is the patch that changes the way we deal
        with spurious errors from the EFI memory attribute protocol. This will
        be backported to v6.6, and is intended to ensure that we will not
        paint ourselves into a corner when we tighten this further in order to
        comply with MS requirements on signed EFI code.
      
        Note that this protocol does not currently exist in x86 production
        systems in the field, only in Microsoft's fork of OVMF, but it will be
        mandatory for Windows logo certification for x86 PCs in the future.
      
         - Tighten ELF relocation checks on the RISC-V EFI stub
      
         - Give up if the new EFI memory attributes protocol fails spuriously
           on x86
      
         - Take care not to place the kernel in the lowest 16 MB of DRAM on
           x86
      
         - Omit special purpose EFI memory from memblock
      
         - Some fixes for the CXL CPER reporting code
      
         - Make the PE/COFF layout of mixed-mode capable images comply with a
           strict interpretation of the spec"
      
      * tag 'efi-fixes-for-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat section
        cxl/trace: Remove unnecessary memcpy's
        cxl/cper: Fix errant CPER prints for CXL events
        efi: Don't add memblocks for soft-reserved memory
        efi: runtime: Fix potential overflow of soft-reserved region size
        efi/libstub: Add one kernel-doc comment
        x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR
        x86/efistub: Give up if memory attribute protocol returns an error
        riscv/efistub: Tighten ELF relocation check
        riscv/efistub: Ensure GP-relative addressing is not used
    • Merge tag 'pci-v6.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci · 5ddfc246
      Linus Torvalds authored
      Pull pci fixes from Bjorn Helgaas:
      
       - Fix an unintentional truncation of DWC MSI-X address to 32 bits and
         update similar MSI code to match (Dan Carpenter)
      
      * tag 'pci-v6.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
        PCI: dwc: Clean up dw_pcie_ep_raise_msi_irq() alignment
        PCI: dwc: Fix a 64bit bug in dw_pcie_ep_raise_msix_irq()
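The pull message above does not show the offending code, so the following only illustrates one common way a 64-bit address ends up truncated to 32 bits in C: complementing a 32-bit mask before it is widened. The function names are hypothetical, and the sketch assumes a platform where unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Buggy: ~mask is evaluated in 32 bits and only then zero-extended,
 * so the AND also clears bits 63..32 of the address.
 */
static uint64_t model_align_down_buggy(uint64_t addr, uint32_t mask)
{
	return addr & ~mask;
}

/* Fixed: widen the mask first, so the complement has all upper bits set. */
static uint64_t model_align_down_fixed(uint64_t addr, uint32_t mask)
{
	return addr & ~(uint64_t)mask;
}

/* Usage: the two versions agree only while the address fits in 32 bits. */
static void model_align_selfcheck(void)
{
	assert(model_align_down_buggy(0x23456789u, 0xfffu) ==
	       model_align_down_fixed(0x23456789u, 0xfffu));
	assert(model_align_down_buggy(0x123456789ULL, 0xfffu) !=
	       model_align_down_fixed(0x123456789ULL, 0xfffu));
}
```

The bug is silent for addresses below 4 GiB, which is why this class of error tends to surface only on systems that place the target above that boundary.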
    • Merge tag 'hwmon-for-v6.8-rc4' of... · 5ca243c2
      Linus Torvalds authored
      Merge tag 'hwmon-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
      
       - coretemp: Various fixes, and increase number of supported CPU cores
      
       - aspeed-pwm-tacho: Add missing mutex protection
      
      * tag 'hwmon-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (coretemp) Enlarge per package core count limit
        hwmon: (coretemp) Fix bogus core_id to attr name mapping
        hwmon: (coretemp) Fix out-of-bounds memory access
        hwmon: (aspeed-pwm-tacho) mutex for tach reading
      5ca243c2
    • Linus Torvalds's avatar
      Merge tag 'mmc-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · eb747bcc
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "MMC core:
         - Allow non-sleeping read-only slot-gpio
      
        MMC host:
         - sdhci-pci-o2micro: Fix a warm reboot BIOS issue"
      
      * tag 'mmc-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: slot-gpio: Allow non-sleeping GPIO ro
        mmc: sdhci-pci-o2micro: Fix a warm reboot issue that disk can't be detected by BIOS
      eb747bcc
    • Linus Torvalds's avatar
      Merge tag 'pmdomain-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm · 3760081f
      Linus Torvalds authored
      Pull pmdomain fixes from Ulf Hansson:
       "Core:
         - Move the unused cleanup to a _sync initcall
      
        Providers:
         - mediatek: Fix race conditions at probe/remove with genpd
         - renesas: r8a77980-sysc: CR7 must be always on"
      
      * tag 'pmdomain-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
        pmdomain: mediatek: fix race conditions with genpd
        pmdomain: renesas: r8a77980-sysc: CR7 must be always on
        pmdomain: core: Move the unused cleanup to a _sync initcall
      3760081f
    • Linus Torvalds's avatar
      Merge tag 'gpio-fixes-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 4a8e4b3c
      Linus Torvalds authored
      Pull gpio fix from Bartosz Golaszewski:
      
       - remove the new GPIO device from the global list unconditionally in
         error path in core GPIOLIB
      
      * tag 'gpio-fixes-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpio: remove GPIO device from the list unconditionally in error path
      4a8e4b3c
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2024-02-09' of git://anongit.freedesktop.org/drm/drm · c76b766e
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Regular weekly fixes, xe, amdgpu and msm are most of them, with some
        misc in i915, ivpu and nouveau, scattered but nothing too intense at
        this point.
      
        i915:
         - gvt: docs fix, uninit var, MAINTAINERS
      
        ivpu:
         - add aborted job status
         - disable d3 hot delay
         - mmu fixes
      
        nouveau:
         - fix gsp rpc size request
         - fix dma buffer leaks
         - use common code for gsp mem ctor
      
        xe:
         - Fix a loop in an error path
         - Fix a missing dma-fence reference
         - Fix a retry path on userptr REMAP
         - Workaround for a false gcc warning
         - Fix missing map of the usm batch buffer in the migrate vm
         - Fix a memory leak
         - Fix a bad assumption of used page size
         - Fix hitting a BUG() due to zero pages to map
         - Remove some leftover async bind queue relics
      
        amdgpu:
         - Misc NULL/bounds check fixes
         - ODM pipe policy fix
         - Aborted suspend fixes
         - JPEG 4.0.5 fix
         - DCN 3.5 fixes
         - PSP fix
         - DP MST fix
         - Phantom pipe fix
         - VRAM vendor fix
         - Clang fix
         - SR-IOV fix
      
        msm:
         - DPU:
            - fix for kernel doc warnings and smatch warnings in dpu_encoder
            - fix for smatch warning in dpu_encoder
            - fix the bus bandwidth value for SDM670
         - DP:
            - fixes to handle unknown bpc case correctly for DP
            - fix for MISC0 programming
         - GPU:
            - dmabuf vmap fix
            - a610 UBWC corruption fix (incorrect hbb)
            - revert a commit that was making GPU recovery unreliable"
      
      * tag 'drm-fixes-2024-02-09' of git://anongit.freedesktop.org/drm/drm: (43 commits)
        drm/xe: Remove TEST_VM_ASYNC_OPS_ERROR
        drm/xe/vm: don't ignore error when in_kthread
        drm/xe: Assume large page size if VMA not yet bound
        drm/xe/display: Fix memleak in display initialization
        drm/xe: Map both mem.kernel_bb_pool and usm.bb_pool
        drm/xe: circumvent bogus stringop-overflow warning
        drm/xe: Pick correct userptr VMA to repin on REMAP op failure
        drm/xe: Take a reference in xe_exec_queue_last_fence_get()
        drm/xe: Fix loop in vm_bind_ioctl_ops_unwind
        drm/amdgpu: Fix HDP flush for VFs on nbio v7.9
        drm/amd/display: Implement bounds check for stream encoder creation in DCN301
        drm/amd/display: Increase frame-larger-than for all display_mode_vba files
        drm/amd/display: Clear phantom stream count and plane count
        drm/amdgpu: Avoid fetching VRAM vendor info
        drm/amd/display: Disable ODM by default for DCN35
        drm/amd/display: Update phantom pipe enable / disable sequence
        drm/amd/display: Fix MST Null Ptr for RV
        drm/amdgpu: Fix shared buff copy to user
        drm/amd/display: Increase eval/entry delay for DCN35
        drm/amdgpu: remove asymmetrical irq disabling in jpeg 4.0.5 suspend
        ...
      c76b766e
    • Aleksander Mazur's avatar
      x86/Kconfig: Transmeta Crusoe is CPU family 5, not 6 · f6a18925
      Aleksander Mazur authored
      The kernel built with MCRUSOE is unbootable on Transmeta Crusoe.  It shows
      the following error message:
      
        This kernel requires an i686 CPU, but only detected an i586 CPU.
        Unable to boot - please use a kernel appropriate for your CPU.
      
      Remove MCRUSOE from the condition introduced by the commit named in the
      Fixes tag, effectively changing X86_MINIMUM_CPU_FAMILY back to 5 on that
      machine, which matches the CPU family given by CPUID.
      
        [ bp: Massage commit message. ]
      
      Fixes: 25d76ac8 ("x86/Kconfig: Explicitly enumerate i686-class CPUs in Kconfig")
      Signed-off-by: Aleksander Mazur <deweloper@wp.pl>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Cc: <stable@kernel.org>
      Link: https://lore.kernel.org/r/20240123134309.1117782-1-deweloper@wp.pl
      f6a18925
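As a rough illustration of the mismatch described in the commit above, the sketch below decodes the CPU family from the EAX value returned by CPUID leaf 1 and compares it with a required minimum. The helper names and the signature value 0x543 are assumptions chosen for illustration (any family-5 signature would do); the real check is done by the kernel's early boot code, not by this snippet.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode the family from CPUID leaf 1 EAX: bits 11:8 hold the base
 * family, and bits 27:20 (extended family) are added only when the
 * base family is 0xf.
 */
unsigned int cpuid_family(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;

	if (family == 0xf)
		family += (eax >> 20) & 0xff;
	return family;
}

/* Nonzero when the detected family satisfies the build-time minimum. */
int cpu_meets_minimum(uint32_t eax, unsigned int min_family)
{
	return cpuid_family(eax) >= min_family;
}

/* Usage sketch: 0x543 is an illustrative family-5 signature. */
void crusoe_family_example(void)
{
	assert(cpuid_family(0x543) == 5);
	assert(!cpu_meets_minimum(0x543, 6)); /* an i686-only kernel refuses */
	assert(cpu_meets_minimum(0x543, 5));  /* minimum family 5 boots */
}
```

A CPU that reports family 5 fails a minimum of 6, which corresponds to the "requires an i686 CPU, but only detected an i586 CPU" message quoted in the commit.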
    • Steven Rostedt (Google)'s avatar
      tracing: Fix wasted memory in saved_cmdlines logic · 44dc5c41
      Steven Rostedt (Google) authored
      While looking at improving the saved_cmdlines cache I found a huge amount
      of wasted memory that should be used for the cmdlines.
      
      The tracing data saves pids during the trace. At sched switch, if a trace
      occurred, it will save the comm of the task that did the trace. This is
      saved in a "cache" that maps pids to comms and exposed to user space via
      the /sys/kernel/tracing/saved_cmdlines file. By default it currently
      caches only 128 comms.
      
      The structure that uses this creates an array to store the pids using
      PID_MAX_DEFAULT (which is usually set to 32768). This makes the structure
      131104 bytes in size on 64-bit machines.
      
      In hex: 131104 = 0x20020, and since the kernel allocates generic memory in
      powers of two, the kernel would allocate 0x40000 or 262144 bytes to store
      this structure. That leaves 131040 bytes of wasted space.
      
      Worse, the structure points to an allocated array to store the comm names,
      which is 16 bytes times the amount of names to save (currently 128), which
      is 2048 bytes. Instead of allocating a separate array, make the structure
      end with a variable length string and use the extra space for that.
      
      This is similar to a recommendation that Linus had made about eventfs_inode names:
      
        https://lore.kernel.org/all/20240130190355.11486-5-torvalds@linux-foundation.org/
      
      Instead of allocating a separate string array to hold the saved comms,
      have the structure end with: char saved_cmdlines[]; and round up to the
      next power of two over sizeof(struct saved_cmdline_buffers) + num_cmdlines * TASK_COMM_LEN.
      The extra space is then used for the saved_cmdlines portion.
      
      Now, instead of saving only 128 comms by default, by using this wasted
      space at the end of the structure it can save over 8000 comms and even
      saves space by removing the need for allocating the other array.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20240209063622.1f7b6d5f@rorschach.local.home
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Vincent Donnefort <vdonnefort@google.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Mete Durlu <meted@linux.ibm.com>
      Fixes: 939c7a4f ("tracing: Introduce saved_cmdlines_size file")
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      44dc5c41
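The arithmetic in the saved_cmdlines commit above can be checked with a small userspace sketch. It takes the 131104-byte figure from the commit message as the header size and assumes plain power-of-two rounding as the message describes; the helper names `round_up_pow2` and `comm_slots` are hypothetical, and the real kernel code works on the actual structure layout rather than these constants.

```c
#include <assert.h>
#include <stddef.h>

#define TASK_COMM_LEN 16 /* bytes per saved comm, as stated in the commit */

/* Smallest power of two that is >= n. */
size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Number of comm slots available when the names are stored in the tail
 * of the same power-of-two allocation as a header of header_bytes,
 * given that at least `requested` slots must fit.
 */
size_t comm_slots(size_t header_bytes, size_t requested)
{
	size_t alloc = round_up_pow2(header_bytes + requested * TASK_COMM_LEN);

	return (alloc - header_bytes) / TASK_COMM_LEN;
}
```

With a 131104-byte header, `round_up_pow2(131104)` gives 262144, which leaves 131040 spare bytes, and `comm_slots(131104, 128)` gives 8190, consistent with the "over 8000 comms" figure in the message.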
    • Masami Hiramatsu (Google)'s avatar
      ftrace: Fix DIRECT_CALLS to use SAVE_REGS by default · a8b9cf62
      Masami Hiramatsu (Google) authored
      Commit 60c89718 ("ftrace: Make DIRECT_CALLS work WITH_ARGS
      and !WITH_REGS") changed DIRECT_CALLS to use SAVE_ARGS when there
      are multiple ftrace_ops at the same function. However, x86 only
      supports jumping to direct_call from ftrace_regs_caller, so when the
      function tracer is set on the same target function on x86, ftrace-direct
      does not work, as shown below (this does work on arm64).
      
      First, insmod ftrace-direct.ko to put a direct_call on
      'wake_up_process()'.
      
       # insmod kernel/samples/ftrace/ftrace-direct.ko
       # less trace
      ...
                <idle>-0       [006] ..s1.   564.686958: my_direct_func: waking up rcu_preempt-17
                <idle>-0       [007] ..s1.   564.687836: my_direct_func: waking up kcompactd0-63
                <idle>-0       [006] ..s1.   564.690926: my_direct_func: waking up rcu_preempt-17
                <idle>-0       [006] ..s1.   564.696872: my_direct_func: waking up rcu_preempt-17
                <idle>-0       [007] ..s1.   565.191982: my_direct_func: waking up kcompactd0-63
      
      Set up a function filter on 'wake_up_process' too, and enable it.
      
       # cd /sys/kernel/tracing/
       # echo wake_up_process > set_ftrace_filter
       # echo function > current_tracer
       # less trace
      ...
                <idle>-0       [006] ..s3.   686.180972: wake_up_process <-call_timer_fn
                <idle>-0       [006] ..s3.   686.186919: wake_up_process <-call_timer_fn
                <idle>-0       [002] ..s3.   686.264049: wake_up_process <-call_timer_fn
                <idle>-0       [002] d.h6.   686.515216: wake_up_process <-kick_pool
                <idle>-0       [002] d.h6.   686.691386: wake_up_process <-kick_pool
      
      Then only the function tracer output is shown on x86.
      But if you enable a 'kprobe on ftrace' event (which uses the SAVE_REGS
      flag) on the same function, the direct call output is shown again.
      
       # echo 'p wake_up_process' >> dynamic_events
       # echo 1 > events/kprobes/p_wake_up_process_0/enable
       # echo > trace
       # less trace
      ...
                <idle>-0       [006] ..s2.  2710.345919: p_wake_up_process_0: (wake_up_process+0x4/0x20)
                <idle>-0       [006] ..s3.  2710.345923: wake_up_process <-call_timer_fn
                <idle>-0       [006] ..s1.  2710.345928: my_direct_func: waking up rcu_preempt-17
                <idle>-0       [006] ..s2.  2710.349931: p_wake_up_process_0: (wake_up_process+0x4/0x20)
                <idle>-0       [006] ..s3.  2710.349934: wake_up_process <-call_timer_fn
                <idle>-0       [006] ..s1.  2710.349937: my_direct_func: waking up rcu_preempt-17
      
      To fix this issue, use the SAVE_REGS flag by default for direct_call
      when there are multiple ftrace_ops.
      
      Link: https://lore.kernel.org/linux-trace-kernel/170484558617.178953.1590516949390270842.stgit@devnote2
      
      Fixes: 60c89718 ("ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS")
      Cc: stable@vger.kernel.org
      Cc: Florent Revest <revest@chromium.org>
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      a8b9cf62
    • Dave Airlie's avatar
      Merge tag 'drm-msm-fixes-2024-02-07' of https://gitlab.freedesktop.org/drm/msm into drm-fixes · 31152088
      Dave Airlie authored
      Fixes for v6.8-rc4
      
      DPU:
      - fix for kernel doc warnings and smatch warnings in dpu_encoder
      - fix for smatch warning in dpu_encoder
      - fix the bus bandwidth value for SDM670
      
      DP:
      - fixes to handle unknown bpc case correctly for DP. The current code was
        spilling over into other bits of the DP configuration register and had
        to be fixed to avoid the extra shifts that were causing the spillover
      - fix for MISC0 programming in DP driver to program the correct
        colorimetry value
      
      GPU:
      - dmabuf vmap fix
      - a610 UBWC corruption fix (incorrect hbb)
      - revert a commit that was making GPU recovery unreliable
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Rob Clark <robdclark@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGv+tb1+_cp7ftxcMZbbxE9810rvxeaC50eL=msQ+zkm0g@mail.gmail.com
      31152088
    • Dave Airlie's avatar
      Merge tag 'amd-drm-fixes-6.8-2024-02-08' of... · b30bed9d
      Dave Airlie authored
      Merge tag 'amd-drm-fixes-6.8-2024-02-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
      
      amd-drm-fixes-6.8-2024-02-08:
      
      amdgpu:
      - Misc NULL/bounds check fixes
      - ODM pipe policy fix
      - Aborted suspend fixes
      - JPEG 4.0.5 fix
      - DCN 3.5 fixes
      - PSP fix
      - DP MST fix
      - Phantom pipe fix
      - VRAM vendor fix
      - Clang fix
      - SR-IOV fix
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Alex Deucher <alexander.deucher@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240208165500.4887-1-alexander.deucher@amd.com
      b30bed9d
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2024-02-08' of... · 9da93fe4
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2024-02-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      - Just includes gvt-fixes-2024-02-05
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/ZcTETgXsejwVwat6@jlahtine-mobl.ger.corp.intel.com
      9da93fe4