1. 09 Jan, 2024 28 commits
    • Merge tag 'lsm-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm · 063a7ce3
      Linus Torvalds authored
      Pull security module updates from Paul Moore:
      
       - Add three new syscalls: lsm_list_modules(), lsm_get_self_attr(), and
         lsm_set_self_attr().
      
         The first syscall simply lists the LSMs enabled, while the second and
         third get and set the current process' LSM attributes. Yes, these
         syscalls may provide similar functionality to what can be found under
         /proc or /sys, but they were designed to support multiple,
         simultaneous (stacked) LSMs from the start as opposed to the current
         /proc based solutions which were created at a time when only one LSM
         was allowed to be active at a given time.
      
         We have spent considerable time discussing ways to extend the
         existing /proc interfaces to support multiple, simultaneous LSMs and
         even our best ideas have been far too ugly to support as a kernel
         API; after 20+ years in the kernel, I felt the LSM layer had
         established itself enough to justify a handful of syscalls.
      
         Support amongst the individual LSM developers has been nearly
         unanimous, with a single objection coming from Tetsuo (TOMOYO) as he
         is worried that the LSM_ID_XXX token concept will make it more
         difficult for out-of-tree LSMs to survive. Several members of the LSM
         community have demonstrated the ability for out-of-tree LSMs to
         continue to exist by picking high/unused LSM_ID values as well as
         pointing out that many kernel APIs rely on integer identifiers, e.g.
         syscalls (!), but unfortunately Tetsuo's objections remain.
      
         My personal opinion is that while I have no interest in penalizing
         out-of-tree LSMs, I'm not going to penalize in-tree development to
         support out-of-tree development, and I view this as a necessary step
         forward to support the push for expanded LSM stacking and reduce our
         reliance on /proc and /sys which has occasionally been problematic
         for some container users. Finally, we have included the linux-api
         folks on (all?) recent revisions of the patchset and addressed all of
         their concerns.
      
       - Add a new security_file_ioctl_compat() LSM hook to handle the 32-bit
         ioctls on 64-bit systems problem.
      
         This patch includes support for all of the existing LSMs which
         provide ioctl hooks, although it turns out only SELinux actually
         cares about the individual ioctls. It is worth noting that while
         Casey (Smack) and Tetsuo (TOMOYO) did not give explicit ACKs to this
         patch, they did both indicate they are okay with the changes.
      
       - Fix a potential memory leak in the CALIPSO code when IPv6 is disabled
         at boot.
      
         While it's good that we are fixing this, I doubt this is something
         users are seeing in the wild as you need to both disable IPv6 and
         then attempt to configure IPv6 labeled networking via
         NetLabel/CALIPSO; that just doesn't make much sense.
      
         Normally this would go through netdev, but Jakub asked me to take
         this patch and of all the trees I maintain, the LSM tree seemed like
         the best fit.
      
       - Update the LSM MAINTAINERS entry with additional information about
         our process docs, patchwork, bug reporting, etc.
      
         I also noticed that the Lockdown LSM is missing a dedicated
         MAINTAINERS entry so I've added that to the pull request. I've been
         working with one of the major Lockdown authors/contributors to see if
         they are willing to step up and assume a Lockdown maintainer role;
         hopefully that will happen soon, but in the meantime I'll continue to
         look after it.
      
       - Add a handful of mailmap entries for Serge Hallyn and myself.
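
As a rough illustration of why the new LSM syscalls suit stacked LSMs, the sketch below walks a buffer holding one variable-length record per LSM, the kind of layout lsm_get_self_attr() is described as filling in. The header layout is a local, hypothetical mirror (demo_lsm_ctx_hdr) modeled on the kernel's lsm_ctx structure, the 8-byte padding is an assumption, and the helper names demo_put_ctx() and demo_find_ctx() are invented for this example. Real code should use the definitions from the kernel's UAPI headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical mirror of the fixed-size record header. The context bytes
 * (ctx_len of them) follow the header directly inside the buffer.
 */
struct demo_lsm_ctx_hdr {
	uint64_t id;      /* LSM_ID_* style token naming the owning LSM */
	uint64_t flags;
	uint64_t len;     /* whole record: header + context + padding */
	uint64_t ctx_len; /* context bytes actually in use */
};

/*
 * Append one record at dst, which has `space` bytes free.
 * Returns the number of bytes written, or 0 if the record does not fit.
 */
static size_t demo_put_ctx(uint8_t *dst, size_t space, uint64_t id,
			   const char *label)
{
	struct demo_lsm_ctx_hdr hdr = { .id = id, .flags = 0 };

	hdr.ctx_len = strlen(label) + 1; /* keep the terminating NUL */
	/* Pad each record to a multiple of 8 bytes (assumed alignment). */
	hdr.len = (sizeof(hdr) + hdr.ctx_len + 7) & ~(uint64_t)7;
	if (hdr.len > space)
		return 0;

	memset(dst, 0, hdr.len);
	memcpy(dst, &hdr, sizeof(hdr));
	memcpy(dst + sizeof(hdr), label, hdr.ctx_len);
	return hdr.len;
}

/*
 * Walk the records in buf and return a pointer to the context bytes that
 * belong to lsm_id, filling *out with that record's header. Returns NULL
 * when the id is absent or a record's length fields are inconsistent.
 */
static const uint8_t *demo_find_ctx(const uint8_t *buf, size_t total,
				    uint64_t lsm_id,
				    struct demo_lsm_ctx_hdr *out)
{
	size_t off = 0;

	while (total - off >= sizeof(*out)) {
		/* Copy the header out to avoid unaligned accesses. */
		memcpy(out, buf + off, sizeof(*out));
		if (out->len < sizeof(*out) || out->len > total - off)
			return NULL;
		if (out->ctx_len > out->len - sizeof(*out))
			return NULL;
		if (out->id == lsm_id)
			return buf + off + sizeof(*out);
		off += out->len; /* the len field gives the stride */
	}
	return NULL;
}
```

Because every record carries its own id and total length, a caller can skip records from LSMs it does not know about, which a single-value /proc file cannot express.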
      
      * tag 'lsm-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (27 commits)
        lsm: new security_file_ioctl_compat() hook
        lsm: Add a __counted_by() annotation to lsm_ctx.ctx
        calipso: fix memory leak in netlbl_calipso_add_pass()
        selftests: remove the LSM_ID_IMA check in lsm/lsm_list_modules_test
        MAINTAINERS: add an entry for the lockdown LSM
        MAINTAINERS: update the LSM entry
        mailmap: add entries for Serge Hallyn's dead accounts
        mailmap: update/replace my old email addresses
        lsm: mark the lsm_id variables are marked as static
        lsm: convert security_setselfattr() to use memdup_user()
        lsm: align based on pointer length in lsm_fill_user_ctx()
        lsm: consolidate buffer size handling into lsm_fill_user_ctx()
        lsm: correct error codes in security_getselfattr()
        lsm: cleanup the size counters in security_getselfattr()
        lsm: don't yet account for IMA in LSM_CONFIG_COUNT calculation
        lsm: drop LSM_ID_IMA
        LSM: selftests for Linux Security Module syscalls
        SELinux: Add selfattr hooks
        AppArmor: Add selfattr hooks
        Smack: implement setselfattr and getselfattr hooks
        ...
    • Merge tag 'selinux-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 9f9310bf
      Linus Torvalds authored
      Pull selinux updates from Paul Moore:
      
       - Add a new SELinux initial SID, SECINITSID_INIT, to represent
         userspace processes started before the SELinux policy is loaded in
         early boot.
      
         Prior to this patch all processes were marked as SECINITSID_KERNEL
         before the SELinux policy was loaded, making it difficult to
         distinguish early boot userspace processes from the kernel in the
         SELinux policy.
      
         For most users this will be a non-issue as the policy is loaded early
         enough during boot, but for users who load their SELinux policy
         relatively late, this should make it easier to construct meaningful
         security policies.
      
       - Cleanups to the selinuxfs code by Al, mostly on VFS related issues
         during a policy reload.
      
         The commit description has more detail, but the quick summary is that
         we are replacing a disconnected directory approach with a temporary
         directory that we swap over at the end of the reload.
      
       - Fix an issue where the input sanity checking on socket bind()
         operations was slightly different depending on the presence of
         SELinux.
      
         This is caused by the placement of the LSM hooks in the generic
         socket layer as opposed to the protocol specific bind() handler where
         the protocol specific sanity checks are performed. Mickaël has
         mentioned that he is working to fix this, but in the meantime we just
         ensure that we are replicating the checks properly.
      
         We need to balance the placement of the LSM hooks with the number of
         LSM hooks; pushing the hooks down into the protocol layers is likely
         not the right answer.
      
       - Update the avc_has_perm_noaudit() prototype to better match the
         function definition.
      
       - Migrate the filename transition hash table from partial_name_hash()
         to full_name_hash().
      
         This improves the quality of the code and has the potential for a
         minor performance bump.
      
       - Consolidate some open coded SELinux access vector comparisons into a
         single new function, avtab_node_cmp(), and use that instead.
      
         A small, but nice win for code quality and maintainability.
      
       - Updated the SELinux MAINTAINERS entry with additional information
         around process, bug reporting, etc.
      
         We're also updating some of our "official" roles: dropping Eric Paris
         and adding Ondrej as a reviewer.
      
       - Cleanup the coding style crimes in security/selinux/include.
      
         While I'm not a fan of code churn, I am pushing for more automated
         code checks that can be done at the developer level and one of the
         obvious things to check for is coding style.
      
         In an effort to start from a "good" base I'm slowly working through
         our source files cleaning them up with the help of clang-format and
         good ol' fashioned human eyeballs; this has the first batch of these
         changes.
      
         I've been splitting the changes up per-file to help reduce the impact
         if backports are required (either for LTS or distro kernels), and I
         expect some of the larger files, e.g. hooks.c and ss/services.c,
         will likely need to be split even further.
      
       - Cleanup old, outdated comments.
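
The idea behind consolidating open coded comparisons into one function can be sketched as follows. This is not the kernel's avtab_node_cmp(); the key structure, its field names, and the helpers demo_key_cmp() and demo_key_find() are hypothetical and only show the general pattern of a single three-way comparator shared by every caller.

```c
#include <stdint.h>

/* Hypothetical lookup key; the field names are illustrative only. */
struct demo_key {
	uint16_t source;
	uint16_t target;
	uint16_t class_id;
};

/*
 * Three-way compare: returns <0, 0 or >0, ordering keys by source,
 * then target, then class_id.
 */
static int demo_key_cmp(const struct demo_key *a, const struct demo_key *b)
{
	if (a->source != b->source)
		return a->source < b->source ? -1 : 1;
	if (a->target != b->target)
		return a->target < b->target ? -1 : 1;
	if (a->class_id != b->class_id)
		return a->class_id < b->class_id ? -1 : 1;
	return 0;
}

/*
 * Search a sorted array using the shared comparator. The scan stops as
 * soon as it passes the slot where the key would live.
 * Returns the index of the match, or -1 if the key is not present.
 */
static int demo_key_find(const struct demo_key *keys, int n,
			 const struct demo_key *key)
{
	for (int i = 0; i < n; i++) {
		int c = demo_key_cmp(&keys[i], key);

		if (c == 0)
			return i;
		if (c > 0)
			break;
	}
	return -1;
}
```

With one comparator, every search or insert loop agrees on the ordering by construction, instead of repeating the same chain of field checks in each place.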
      
      * tag 'selinux-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: (24 commits)
        selinux: Fix error priority for bind with AF_UNSPEC on PF_INET6 socket
        selinux: fix style issues in security/selinux/include/initial_sid_to_string.h
        selinux: fix style issues in security/selinux/include/xfrm.h
        selinux: fix style issues in security/selinux/include/security.h
        selinux: fix style issues with security/selinux/include/policycap_names.h
        selinux: fix style issues in security/selinux/include/policycap.h
        selinux: fix style issues in security/selinux/include/objsec.h
        selinux: fix style issues with security/selinux/include/netlabel.h
        selinux: fix style issues in security/selinux/include/netif.h
        selinux: fix style issues in security/selinux/include/ima.h
        selinux: fix style issues in security/selinux/include/conditional.h
        selinux: fix style issues in security/selinux/include/classmap.h
        selinux: fix style issues in security/selinux/include/avc_ss.h
        selinux: align avc_has_perm_noaudit() prototype with definition
        selinux: fix style issues in security/selinux/include/avc.h
        selinux: fix style issues in security/selinux/include/audit.h
        MAINTAINERS: drop Eric Paris from his SELinux role
        MAINTAINERS: add Ondrej Mosnacek as a SELinux reviewer
        selinux: remove the wrong comment about multithreaded process handling
        selinux: introduce an initial SID for early boot processes
        ...
    • Merge tag 'audit-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit · eab23bc8
      Linus Torvalds authored
      Pull audit updates from Paul Moore:
       "The audit updates are fairly minor with only two patches:
      
         - Send an audit ACK to userspace immediately upon receiving an auditd
           registration event as opposed to waiting until the registration has
           been fully processed and the audit backlog starts filling the
           netlink buffers.
      
           Sending the ACK earlier, as done here, is still safe as the
           operation should not fail at the point when the ACK is done, and
           doing so helps avoid the ACK being dropped in extreme situations.
      
         - Update the audit MAINTAINERS entry with additional information.
      
           There isn't anything in this update that should be new to regular
           contributors or list subscribers, but I'm pushing to start
           documenting our processes, conventions, etc. and this seems like an
           important part of that"
      
      * tag 'audit-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
        MAINTAINERS: update the audit entry
        audit: Send netlink ACK before setting connection in auditd_set
    • Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of... · 9f2a6352
      Linus Torvalds authored
      Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull non-MM updates from Andrew Morton:
       "Quite a lot of kexec work this time around. Many singleton patches in
        many places. The notable patch series are:
      
         - nilfs2 folio conversion from Matthew Wilcox in 'nilfs2: Folio
           conversions for file paths'.
      
         - Additional nilfs2 folio conversion from Ryusuke Konishi in 'nilfs2:
           Folio conversions for directory paths'.
      
         - IA64 remnant removal in Heiko Carstens's 'Remove unused code after
           IA-64 removal'.
      
         - Arnd Bergmann has enabled the -Wmissing-prototypes warning
           everywhere in 'Treewide: enable -Wmissing-prototypes'. This had
           some followup fixes:
      
            - Nathan Chancellor has cleaned up the hexagon build in the series
              'hexagon: Fix up instances of -Wmissing-prototypes'.
      
            - Nathan also addressed some s390 warnings in 's390: A couple of
              fixes for -Wmissing-prototypes'.
      
            - Arnd Bergmann addresses the same warnings for MIPS in his series
              'mips: address -Wmissing-prototypes warnings'.
      
         - Baoquan He has made kexec_file operate in a top-down-fitting manner
           similar to kexec_load in the series 'kexec_file: Load kernel at top
           of system RAM if required'
      
         - Baoquan He has also added the self-explanatory 'kexec_file: print
           out debugging message if required'.
      
         - Some checkstack maintenance work from Tiezhu Yang in the series
           'Modify some code about checkstack'.
      
         - Douglas Anderson has disentangled the watchdog code's logging when
           multiple reports are occurring simultaneously. The series is
           'watchdog: Better handling of concurrent lockups'.
      
         - Yuntao Wang has contributed some maintenance work on the crash code
           in 'crash: Some cleanups and fixes'"
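
For readers unfamiliar with the -Wmissing-prototypes warning mentioned in the list above, here is a minimal sketch of the two usual ways code avoids it. All function names are made up for illustration. The warning fires when a non-static function is defined without a prototype having been seen first.

```c
/* Try: cc -Wmissing-prototypes -c demo.c */

/*
 * Fix 1: declare the function before defining it. In a real project this
 * declaration would normally live in a header shared with the callers.
 */
int demo_add(int a, int b);

int demo_add(int a, int b)
{
	return a + b;
}

/*
 * Fix 2: make file-local helpers static. A static function has no
 * external linkage, so no separate prototype is expected.
 */
static int demo_double(int x)
{
	return demo_add(x, x);
}

#if 0
/*
 * This form is what the warning is about: a global definition with no
 * previous prototype. Enable the block to see the diagnostic.
 */
int demo_orphan(int x)
{
	return x;
}
#endif

int demo_quadruple(int x);

int demo_quadruple(int x)
{
	return demo_double(demo_double(x));
}
```

The warning tends to expose functions that should have been static, or whose header was never included by the defining file, which is where prototype and definition can silently drift apart.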
      
      * tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (157 commits)
        crash_core: fix and simplify the logic of crash_exclude_mem_range()
        x86/crash: use SZ_1M macro instead of hardcoded value
        x86/crash: remove the unused image parameter from prepare_elf_headers()
        kdump: remove redundant DEFAULT_CRASH_KERNEL_LOW_SIZE
        scripts/decode_stacktrace.sh: strip unexpected CR from lines
        watchdog: if panicking and we dumped everything, don't re-enable dumping
        watchdog/hardlockup: use printk_cpu_sync_get_irqsave() to serialize reporting
        watchdog/softlockup: use printk_cpu_sync_get_irqsave() to serialize reporting
        watchdog/hardlockup: adopt softlockup logic avoiding double-dumps
        kexec_core: fix the assignment to kimage->control_page
        x86/kexec: fix incorrect end address passed to kernel_ident_mapping_init()
        lib/trace_readwrite.c:: replace asm-generic/io with linux/io
        nilfs2: cpfile: fix some kernel-doc warnings
        stacktrace: fix kernel-doc typo
        scripts/checkstack.pl: fix no space expression between sp and offset
        x86/kexec: fix incorrect argument passed to kexec_dprintk()
        x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs
        nilfs2: add missing set_freezable() for freezable kthread
        kernel: relay: remove relay_file_splice_read dead code, doesn't work
        docs: submit-checklist: remove all of "make namespacecheck"
        ...
    • Merge tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · fb46e22a
      Linus Torvalds authored
      Pull MM updates from Andrew Morton:
       "Many singleton patches against the MM code. The patch series which are
        included in this merge do the following:
      
         - Peng Zhang has done some mapletree maintenance work in the series
      
      	'maple_tree: add mt_free_one() and mt_attr() helpers'
      	'Some cleanups of maple tree'
      
         - In the series 'mm: use memmap_on_memory semantics for dax/kmem'
           Vishal Verma has altered the interworking between memory-hotplug
           and dax/kmem so that newly added 'device memory' can more easily
           have its memmap placed within that newly added memory.
      
         - Matthew Wilcox continues folio-related work (including a few fixes)
           in the patch series
      
      	'Add folio_zero_tail() and folio_fill_tail()'
      	'Make folio_start_writeback return void'
      	'Fix fault handler's handling of poisoned tail pages'
      	'Convert aops->error_remove_page to ->error_remove_folio'
      	'Finish two folio conversions'
      	'More swap folio conversions'
      
         - Kefeng Wang has also contributed folio-related work in the series
      
      	'mm: cleanup and use more folio in page fault'
      
         - Jim Cromie has improved the kmemleak reporting output in the series
           'tweak kmemleak report format'.
      
         - In the series 'stackdepot: allow evicting stack traces' Andrey
           Konovalov permits clients (in this case KASAN) to cause eviction
           of no longer needed stack traces.
      
         - Charan Teja Kalla has fixed some accounting issues in the page
           allocator's atomic reserve calculations in the series 'mm:
           page_alloc: fixes for high atomic reserve caluculations'.
      
         - Dmitry Rokosov has added to the samples/ directory some sample code
           for a userspace memcg event listener application. See the series
           'samples: introduce cgroup events listeners'.
      
         - Some mapletree maintenance work from Liam Howlett in the series
           'maple_tree: iterator state changes'.
      
         - Nhat Pham has improved zswap's approach to writeback in the series
           'workload-specific and memory pressure-driven zswap writeback'.
      
         - DAMON/DAMOS feature and maintenance work from SeongJae Park in the
           series
      
      	'mm/damon: let users feed and tame/auto-tune DAMOS'
      	'selftests/damon: add Python-written DAMON functionality tests'
      	'mm/damon: misc updates for 6.8'
      
         - Yosry Ahmed has improved memcg's stats flushing in the series 'mm:
           memcg: subtree stats flushing and thresholds'.
      
         - In the series 'Multi-size THP for anonymous memory' Ryan Roberts
           has added a runtime opt-in feature to transparent hugepages which
           improves performance by allocating larger chunks of memory during
           anonymous page faults.
      
         - Matthew Wilcox has also contributed some cleanup and maintenance
           work against the buffer_head code in the series 'More buffer_head
           cleanups'.
      
         - Suren Baghdasaryan has done work on Andrea Arcangeli's series
           'userfaultfd move option'. UFFDIO_MOVE permits userspace heap
           compaction algorithms to move userspace's pages around rather than
           UFFDIO_COPY's alloc/copy/free.
      
         - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm:
           Add ksm advisor'. This is a governor which tunes KSM's scanning
           aggressiveness in response to userspace's current needs.
      
         - Chengming Zhou has optimized zswap's temporary working memory use
           in the series 'mm/zswap: dstmem reuse optimizations and cleanups'.
      
         - Matthew Wilcox has performed some maintenance work on the writeback
           code, both in core code and within filesystems. The series is 'Clean up the
           writeback paths'.
      
         - Andrey Konovalov has optimized KASAN's handling of alloc and free
           stack traces for secondary-level allocators, in the series 'kasan:
           save mempool stack traces'.
      
         - Andrey also performed some KASAN maintenance work in the series
           'kasan: assorted clean-ups'.
      
         - David Hildenbrand has gone to town on the rmap code. Cleanups, more
           pte batching, folio conversions and more. See the series 'mm/rmap:
           interface overhaul'.
      
         - Kinsey Ho has contributed some maintenance work on the MGLRU code
           in the series 'mm/mglru: Kconfig cleanup'.
      
         - Matthew Wilcox has contributed lruvec page accounting code cleanups
           in the series 'Remove some lruvec page accounting functions'"
      
      * tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits)
        mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER
        mm, treewide: introduce NR_PAGE_ORDERS
        selftests/mm: add separate UFFDIO_MOVE test for PMD splitting
        selftests/mm: skip test if application doesn't has root privileges
        selftests/mm: conform test to TAP format output
        selftests: mm: hugepage-mmap: conform to TAP format output
        selftests/mm: gup_test: conform test to TAP format output
        mm/selftests: hugepage-mremap: conform test to TAP format output
        mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING
        mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large
        mm/memcontrol: remove __mod_lruvec_page_state()
        mm/khugepaged: use a folio more in collapse_file()
        slub: use a folio in __kmalloc_large_node
        slub: use folio APIs in free_large_kmalloc()
        slub: use alloc_pages_node() in alloc_slab_page()
        mm: remove inc/dec lruvec page state functions
        mm: ratelimit stat flush from workingset shrinker
        kasan: stop leaking stack trace handles
        mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE
        mm/mglru: add dummy pmd_dirty()
        ...
    • Merge tag 'slab-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab · d30e51aa
      Linus Torvalds authored
      Pull slab updates from Vlastimil Babka:
      
       - SLUB: delayed freezing of CPU partial slabs (Chengming Zhou)
      
         Freezing is an operation involving cmpxchg_double() that makes a slab
         exclusive for a particular CPU. Chengming noticed that we use it also
         in situations where we are not yet installing the slab as the CPU
         slab, because freezing also indicates that the slab is not on the
         shared list. This results in redundant freeze/unfreeze operations,
         which can be avoided by separately marking presence on the shared
         list, reusing the PG_workingset flag.
      
         This approach neatly avoids the issues described in 9b1ea29b
         ("Revert "mm, slub: consider rest of partial list if acquire_slab()
         fails"") as we can now grab a slab from the shared list in a quick
         and guaranteed way without the cmpxchg_double() operation that
         amplifies the lock contention and can fail.
      
         As a result, lkp has reported a 34.2% improvement in
         stress-ng.rawudp.ops_per_sec.
      
       - SLAB removal and SLUB cleanups (Vlastimil Babka)
      
         The SLAB allocator has been deprecated since 6.5 and nobody has
         objected so far. We agreed at LSF/MM to wait until the next LTS,
         which is 6.6, so we should be good to go now.
      
         This doesn't yet erase all traces of SLAB outside of mm/ so some dead
         code, comments or documentation remain, and will be cleaned up
         gradually (some series are already in the works).
      
         Removing the choice of allocators has already allowed us to simplify and
         optimize the code wiring up the kmalloc APIs to the SLUB
         implementation.
      
      * tag 'slab-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (34 commits)
        mm/slub: free KFENCE objects in slab_free_hook()
        mm/slub: handle bulk and single object freeing separately
        mm/slub: introduce __kmem_cache_free_bulk() without free hooks
        mm/slub: fix bulk alloc and free stats
        mm/slub: optimize free fast path code layout
        mm/slub: optimize alloc fastpath code layout
        mm/slub: remove slab_alloc() and __kmem_cache_alloc_lru() wrappers
        mm/slab: move kmalloc() functions from slab_common.c to slub.c
        mm/slab: move kmalloc_slab() to mm/slab.h
        mm/slab: move kfree() from slab_common.c to slub.c
        mm/slab: move struct kmem_cache_node from slab.h to slub.c
        mm/slab: move memcg related functions from slab.h to slub.c
        mm/slab: move pre/post-alloc hooks from slab.h to slub.c
        mm/slab: consolidate includes in the internal mm/slab.h
        mm/slab: move the rest of slub_def.h to mm/slab.h
        mm/slab: move struct kmem_cache_cpu declaration to slub.c
        mm/slab: remove mm/slab.c and slab_def.h
        mm/mempool/dmapool: remove CONFIG_DEBUG_SLAB ifdefs
        mm/slab: remove CONFIG_SLAB code from slab common code
        cpu/hotplug: remove CPUHP_SLAB_PREPARE hooks
        ...
    • Merge tag 'cgroup-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 9f8413c4
      Linus Torvalds authored
      Pull cgroup updates from Tejun Heo:
      
       - Yafang Shao added task_get_cgroup1() helper to enable a similar BPF
         helper so that BPF progs can be more useful on cgroup1 hierarchies.
         While cgroup1 is mostly in maintenance mode, this addition is very
         small while having an outsized usefulness for users who are still on
         cgroup1. Yafang also optimized root cgroup list access by making it
         RCU protected in the process.
      
       - Waiman Long optimized rstat operation leading to substantially lower
         and more consistent lock hold time while flushing the hierarchical
         statistics. As the lock can be acquired briefly in various hot paths,
         this reduction has cascading benefits.
      
       - Waiman also improved the quality of isolation for cpuset's isolated
         partitions. CPUs which are allocated to isolated partitions are now
         excluded from running unbound work items, and the cpu_is_isolated()
         test, which is used by vmstat and memcg to reduce interference, now
         includes cpuset isolated CPUs. While it isn't there yet, the hope is
         to eventually reach parity with the isolation level provided by the
         `isolcpus` boot param but in a dynamic manner.
      
      * tag 'cgroup-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: Move rcu_head up near the top of cgroup_root
        cgroup/cpuset: Include isolated cpuset CPUs in cpu_is_isolated() check
        cgroup: Avoid false cacheline sharing of read mostly rstat_cpu
        cgroup/rstat: Optimize cgroup_rstat_updated_list()
        cgroup: Fix documentation for cpu.idle
        cgroup/cpuset: Expose cpuset.cpus.isolated
        workqueue: Move workqueue_set_unbound_cpumask() and its helpers inside CONFIG_SYSFS
        cgroup/rstat: Reduce cpu_lock hold time in cgroup_rstat_flush_locked()
        cgroup/cpuset: Take isolated CPUs out of workqueue unbound cpumask
        cgroup/cpuset: Keep track of CPUs in isolated partitions
        selftests/cgroup: Minor code cleanup and reorganization of test_cpuset_prs.sh
        workqueue: Add workqueue_unbound_exclude_cpumask() to exclude CPUs from wq_unbound_cpumask
        selftests: cgroup: Fixes a typo in a comment
        cgroup: Add a new helper for cgroup1 hierarchy
        cgroup: Add annotation for holding namespace_sem in current_cgns_cgroup_from_root()
        cgroup: Eliminate the need for cgroup_mutex in proc_cgroup_show()
        cgroup: Make operations on the cgroup root_list RCU safe
        cgroup: Remove unnecessary list_empty()
    • Merge tag 'sched-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bfe8eb3b
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
       "Energy scheduling:
      
         - Consolidate how the max compute capacity is used in the scheduler
           and how we calculate the frequency for a level of utilization.
      
         - Rework interface between the scheduler and the schedutil governor
      
         - Simplify the util_est logic
      
        Deadline scheduler:
      
         - Work more towards reducing SCHED_DEADLINE starvation of low
           priority tasks (e.g., SCHED_OTHER tasks) when higher priority tasks
           monopolize CPU cycles, via the introduction of 'deadline servers'
           (nested/2-level scheduling).
      
           "Fair servers" to make use of this facility are not introduced yet.
      
        EEVDF:
      
         - Introduce O(1) fastpath for EEVDF task selection
      
        NUMA balancing:
      
         - Tune the NUMA-balancing vma scanning logic some more, to better
           distribute the probability of a particular vma getting scanned.
      
        Plus misc fixes, cleanups and updates"
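
To make "the frequency for a level of utilization" in the energy scheduling items above more concrete, the sketch below uses the relation long documented for the schedutil governor: requested frequency is roughly 1.25 x reference frequency x utilization / capacity. It is not the reworked code from this series. The function name demo_next_freq_khz(), the 1024 capacity scale, and the final clamp to the reference frequency are assumptions of the sketch.

```c
#include <stdint.h>

/*
 * Map a utilization value to a frequency request.
 *
 * util and capacity are assumed to share the scheduler-style fixed-point
 * scale in which a fully busy CPU of the largest type reads 1024.
 * ref_freq_khz is the fixed reference frequency that corresponds to
 * running at full capacity.
 */
static uint64_t demo_next_freq_khz(uint64_t util, uint64_t capacity,
				   uint64_t ref_freq_khz)
{
	uint64_t boosted, freq;

	if (capacity == 0)
		return ref_freq_khz; /* nothing sensible to scale by */

	/* Add 25% headroom so the CPU is not run right at its limit. */
	boosted = util + (util >> 2);

	/* Scale the reference frequency by the boosted utilization ratio. */
	freq = ref_freq_khz * boosted / capacity;

	/* Sketch-only clamp: never ask for more than the reference. */
	return freq > ref_freq_khz ? ref_freq_khz : freq;
}
```

For instance, a CPU that is half utilized (512 of 1024) with a 2,000,000 kHz reference would get a request of 1,250,000 kHz under these assumptions. Using one fixed reference frequency keeps the utilization-to-frequency mapping consistent wherever it is computed.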
      
      * tag 'sched-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
        sched/fair: Fix tg->load when offlining a CPU
        sched/fair: Remove unused 'next_buddy_marked' local variable in check_preempt_wakeup_fair()
        sched/fair: Use all little CPUs for CPU-bound workloads
        sched/fair: Simplify util_est
        sched/fair: Remove SCHED_FEAT(UTIL_EST_FASTUP, true)
        arm64/amu: Use capacity_ref_freq() to set AMU ratio
        cpufreq/cppc: Set the frequency used for computing the capacity
        cpufreq/cppc: Move and rename cppc_cpufreq_{perf_to_khz|khz_to_perf}()
        energy_model: Use a fixed reference frequency
        cpufreq/schedutil: Use a fixed reference frequency
        cpufreq: Use the fixed and coherent frequency for scaling capacity
        sched/topology: Add a new arch_scale_freq_ref() method
        freezer,sched: Clean saved_state when restoring it during thaw
        sched/fair: Update min_vruntime for reweight_entity() correctly
        sched/doc: Update documentation after renames and synchronize Chinese version
        sched/cpufreq: Rework iowait boost
        sched/cpufreq: Rework schedutil governor performance estimation
        sched/pelt: Avoid underestimation of task utilization
        sched/timers: Explain why idle task schedules out on remote timer enqueue
        sched/cpuidle: Comment about timers requirements VS idle handler
        ...
    • Merge tag 'perf-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · aac4de46
      Linus Torvalds authored
      Pull performance events updates from Ingo Molnar:
      
       - Add branch stack counters ABI extension to better capture the growing
         amount of information the PMU exposes via branch stack sampling.
         There's matching tooling support.
      
       - Fix race when creating the nr_addr_filters sysfs file
      
       - Add Intel Sierra Forest and Grand Ridge intel/cstate PMU support
      
       - Add Intel Granite Rapids, Sierra Forest and Grand Ridge uncore PMU
         support
      
       - Misc cleanups & fixes
      
      * tag 'perf-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel/uncore: Factor out topology_gidnid_map()
        perf/x86/intel/uncore: Fix NULL pointer dereference issue in upi_fill_topology()
        perf/x86/amd: Reject branch stack for IBS events
        perf/x86/intel/uncore: Support Sierra Forest and Grand Ridge
        perf/x86/intel/uncore: Support IIO free-running counters on GNR
        perf/x86/intel/uncore: Support Granite Rapids
        perf/x86/uncore: Use u64 to replace unsigned for the uncore offsets array
        perf/x86/intel/uncore: Generic uncore_get_uncores and MMIO format of SPR
        perf: Fix the nr_addr_filters fix
        perf/x86/intel/cstate: Add Grand Ridge support
        perf/x86/intel/cstate: Add Sierra Forest support
        x86/smp: Export symbol cpu_clustergroup_mask()
        perf/x86/intel/cstate: Cleanup duplicate attr_groups
        perf/core: Fix narrow startup race when creating the perf nr_addr_filters sysfs file
        perf/x86/intel: Support branch counters logging
        perf/x86/intel: Reorganize attrs and is_visible
        perf: Add branch_sample_call_stack
        perf/x86: Add PERF_X86_EVENT_NEEDS_BRANCH_STACK flag
        perf: Add branch stack counters
      aac4de46
    • Linus Torvalds's avatar
      Merge tag 'irq-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0bdf0621
      Linus Torvalds authored
      Pull irq subsystem updates from Ingo Molnar:
      
       - Add support for the IA55 interrupt controller on RZ/G3S SoC's
      
       - Update/fix the Qualcomm MPM Interrupt Controller driver's register
         enumeration within the somewhat exotic "RPM Message RAM" MMIO-mapped
         shared memory region that is used for other purposes as well
      
       - Clean up the Xtensa built-in Programmable Interrupt Controller driver
         (xtensa-pic) a bit
      
      * tag 'irq-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/irq-xtensa-pic: Clean up
        irqchip/qcom-mpm: Support passing a slice of SRAM as reg space
        dt-bindings: interrupt-controller: mpm: Pass MSG RAM slice through phandle
        dt-bindings: interrupt-controller: renesas,rzg2l-irqc: Document RZ/G3S
        irqchip/renesas-rzg2l: Add support for suspend to RAM
        irqchip/renesas-rzg2l: Add macro to retrieve TITSR register offset based on register's index
        irqchip/renesas-rzg2l: Implement restriction when writing ISCR register
        irqchip/renesas-rzg2l: Document structure members
        irqchip/renesas-rzg2l: Align struct member names to tabs
        irqchip/renesas-rzg2l: Use tabs instead of spaces
      0bdf0621
    • Linus Torvalds's avatar
      Merge tag 'timers-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f24dc33f
      Linus Torvalds authored
      Pull timer subsystem updates from Ingo Molnar:
      
       - Various preparatory cleanups & enhancements of the timer-wheel code,
         in preparation for the WIP 'pull timers at expiry' timer migration
         model series (which will replace the current 'push timers at enqueue'
         migration model), by Anna-Maria Behnsen:
      
            - Update comments and clean up confusing variable names
      
            - Add debug check to warn about time travel
      
            - Improve/expand timer-wheel tracepoints
      
            - Optimize away unnecessary IPIs for deferrable timers
      
            - Restructure & clean up next_expiry_recalc()
      
            - Clean up forward_timer_base()
      
            - Introduce __forward_timer_base() and use it to simplify and
              micro-optimize get_next_timer_interrupt()
      
       - Restructure the get_next_timer_interrupt()'s idle logic for better
         readability and to enable a minor optimization.
      
       - Fix the nextevt calculation when no timers are pending
      
       - Fix the sysfs_get_uname() prototype declaration
      
      * tag 'timers-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timers: Fix nextevt calculation when no timers are pending
        timers: Rework idle logic
        timers: Use already existing function for forwarding timer base
        timers: Split out forward timer base functionality
        timers: Clarify check in forward_timer_base()
        timers: Move store of next event into __next_timer_interrupt()
        timers: Do not IPI for deferrable timers
        tracing/timers: Add tracepoint for tracking timer base is_idle flag
        tracing/timers: Enhance timer_start tracepoint
        tick-sched: Warn when next tick seems to be in the past
        tick/sched: Cleanup confusing variables
        tick-sched: Fix function names in comments
        time: Make sysfs_get_uname() function visible in header
      f24dc33f
    • Linus Torvalds's avatar
      Merge tag 'smp-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 46a08b4d
      Linus Torvalds authored
      Pull CPU hotplug updates from Ingo Molnar:
      
       - Remove unused CPU hotplug states
      
       - Increase the number of dynamic CPU hotplug states
         from 30 to 40, because existing drivers can exhaust
         the allocation space
      
      * tag 'smp-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpu/hotplug: Increase the number of dynamic states
        cpu/hotplug: Remove unused CPU hotplug states
      46a08b4d
    • Linus Torvalds's avatar
      Merge tag 'core-entry-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · cdc20228
      Linus Torvalds authored
      Pull generic syscall updates from Ingo Molnar:
       "Move various entry functions from kernel/entry/common.c to a header
        file, and always-inline them, to improve syscall entry performance
        on s390 by ~11%"
      
      * tag 'core-entry-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        entry: Move syscall_enter_from_user_mode() to header file
        entry: Move enter_from_user_mode() to header file
        entry: Move exit to usermode functions to header file
      cdc20228
    • Linus Torvalds's avatar
      Merge tag 'core-debugobjects-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ab9517fa
      Linus Torvalds authored
      Pull debugobject update from Ingo Molnar:
      
       - Make tracking object use more robust: it's not safe to access a
         tracking object after releasing the hashbucket lock. Create a
         persistent copy for debug printouts instead.
      
      * tag 'core-debugobjects-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        debugobjects: Stop accessing objects after releasing hash bucket lock
      ab9517fa
    • Linus Torvalds's avatar
      Merge tag 'objtool-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 669d089a
      Linus Torvalds authored
      Pull objtool fixlet from Ingo Molnar:
       "Address a GCC-14 warning: there's no real bug, but indeed the calloc
        order doesn't match the prototype.
      
        (Side note: we should really add zalloc() for such cases)"
      
      * tag 'objtool-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        objtool: Fix calloc call for new -Walloc-size
      669d089a
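      The "calloc order" remark above refers to the standard prototype,
      calloc(nmemb, size): element count first, element size second. A minimal
      sketch of a call in prototype order follows; the record type and helper
      names are hypothetical and are not taken from objtool.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record type, used only for this illustration. */
struct demo_rec {
	int id;
	long payload;
};

/* Allocate 'count' zeroed records, passing the arguments in prototype order. */
static struct demo_rec *alloc_recs(size_t count)
{
	/*
	 * calloc(sizeof(struct demo_rec), count) would request the same number
	 * of bytes, which is why the swapped order is not a functional bug.
	 */
	return calloc(count, sizeof(struct demo_rec));
}

/* Exercise the helper: every field of every record starts out as zero. */
void demo_calloc_order(void)
{
	struct demo_rec *recs = alloc_recs(4);

	assert(recs != NULL);
	for (size_t i = 0; i < 4; i++)
		assert(recs[i].id == 0 && recs[i].payload == 0);
	free(recs);
}
```

      Both argument orders multiply out to the same allocation size, so the
      fix is about matching the prototype rather than changing behaviour.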
    • Linus Torvalds's avatar
      Merge tag 'locking-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6cbf5b31
      Linus Torvalds authored
      Pull locking updates from Ingo Molnar:
       "Lock guards:
      
         - Use lock guards in the ptrace code
      
         - Introduce conditional guards to extend to conditional lock
           primitives like mutex_trylock()/mutex_lock_interruptible()/etc.
      
        lockdep:
      
         - Optimize 'struct lock_class' to be smaller
      
         - Update file patterns in MAINTAINERS
      
        mutexes:
      
         - Document mutex lifetime rules a bit more"
      
      * tag 'locking-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, can still use the lock object after it's unlocked
        locking/mutex: Document that mutex_unlock() is non-atomic
        ptrace: Convert ptrace_attach() to use lock guards
        locking/lockdep: Slightly reorder 'struct lock_class' to save some memory
        MAINTAINERS: Add include/linux/lockdep*.h
        cleanup: Add conditional guard support
      6cbf5b31
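      The idea behind a conditional guard (take the lock only if a trylock-style
      primitive succeeds, and release it automatically at scope exit) can be
      sketched in userspace C. This is not the kernel's cleanup.h API: the lock
      type, the DEMO_TRY_GUARD macro and the helpers are hypothetical, and the
      sketch assumes the cleanup variable attribute provided by GCC and Clang.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical try-lock built on a C11 atomic flag. */
struct demo_lock {
	atomic_flag held;
};

static bool demo_trylock(struct demo_lock *l)
{
	/* test_and_set returns the previous state: false means we got the lock */
	return !atomic_flag_test_and_set(&l->held);
}

static void demo_unlock(struct demo_lock *l)
{
	atomic_flag_clear(&l->held);
}

/* Cleanup handler: runs when the guard variable goes out of scope. */
static void demo_guard_release(struct demo_lock **g)
{
	if (*g)
		demo_unlock(*g);
}

/* The guard is NULL when the trylock failed, so nothing is released then. */
#define DEMO_TRY_GUARD(name, lock)                                        \
	struct demo_lock *name __attribute__((cleanup(demo_guard_release))) = \
		demo_trylock(lock) ? (lock) : NULL

/* Increment the counter only if the lock could be taken without waiting. */
static bool demo_bump_if_unlocked(struct demo_lock *l, int *counter)
{
	DEMO_TRY_GUARD(guard, l);

	if (!guard)
		return false;
	(*counter)++;
	return true; /* the lock is dropped here by the cleanup handler */
}

/* Exercise the guard on a free lock: it is taken and released again. */
void demo_guard_selfcheck(void)
{
	struct demo_lock l = { ATOMIC_FLAG_INIT };
	int n = 0;

	assert(demo_bump_if_unlocked(&l, &n));
	assert(demo_bump_if_unlocked(&l, &n));
	assert(n == 2);
}
```

      The early return on a failed trylock needs no explicit unlock path, which
      is the property that makes the guard style attractive for trylock and
      interruptible lock variants.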
    • Florian Fainelli's avatar
      arm64: Update __NR_compat_syscalls for statmount/listmount · f0a78b3e
      Florian Fainelli authored
      Commit d8b0f546 ("wire up syscalls for statmount/listmount") added
      two new system calls to arch/arm64/include/asm/unistd32.h but forgot to
      update the __NR_compat_syscalls number, thus causing the following build
      failures:
      
        arch/arm64/include/asm/unistd32.h:922:24: error: array index in initializer exceeds array bounds
          922 | #define __NR_statmount 457
              |                        ^~~
        arch/arm64/kernel/sys32.c:130:34: note: in definition of macro '__SYSCALL'
          130 | #define __SYSCALL(nr, sym)      [nr] = __arm64_##sym,
              |                                  ^~
      
      Bump up the number by two to accommodate the new system calls added.
      
      Fixes: d8b0f546 ("wire up syscalls for statmount/listmount")
      Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f0a78b3e
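      The build failure quoted above comes from a fixed-size table filled with
      designated initializers: the size macro has to exceed the highest index
      used. A reduced sketch of that pattern follows; every DEMO_ and demo_
      name and all the numbers are hypothetical, and the real arm64 table is
      far larger.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical syscall numbers for the illustration. */
#define DEMO_NR_old_call   0
#define DEMO_NR_statmount  1
#define DEMO_NR_listmount  2

/*
 * Must be one more than the highest number used below. Leaving this at 1
 * after adding the two new entries would fail to compile with
 * "array index in initializer exceeds array bounds".
 */
#define DEMO_NR_syscalls   3

static_assert(DEMO_NR_listmount < DEMO_NR_syscalls,
	      "table size must cover the highest syscall number");

typedef long (*demo_syscall_fn)(void);

static long demo_old_call(void)  { return 100; }
static long demo_statmount(void) { return 101; }
static long demo_listmount(void) { return 102; }

/* Same shape as the macro in the error message: index the table by number. */
#define DEMO_SYSCALL(nr, sym) [nr] = demo_##sym,

static const demo_syscall_fn demo_table[DEMO_NR_syscalls] = {
	DEMO_SYSCALL(DEMO_NR_old_call, old_call)
	DEMO_SYSCALL(DEMO_NR_statmount, statmount)
	DEMO_SYSCALL(DEMO_NR_listmount, listmount)
};

/* Look up and call a handler, rejecting numbers outside the table. */
static long demo_dispatch(size_t nr)
{
	if (nr >= DEMO_NR_syscalls || !demo_table[nr])
		return -1;
	return demo_table[nr]();
}
```

      Adding two entries therefore means raising the size macro by two, which
      is the change this commit describes.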
    • Linus Torvalds's avatar
      Merge tag 'x86-entry-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2fdbcf71
      Linus Torvalds authored
      Pull x86 entry updates from Ingo Molnar:
      
       - Optimize common_interrupt_return()
      
       - Harden the return-to-user code by making a CONFIG_DEBUG_ENTRY=y check
         unconditional & moving it closer to the IRET.
      
      * tag 'x86-entry-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/entry: Harden return-to-user
        x86/entry: Optimize common_interrupt_return()
      2fdbcf71
    • Linus Torvalds's avatar
      Merge tag 'x86-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 33677aef
      Linus Torvalds authored
      Pull x86 core updates from Ingo Molnar:
      
       - Add comments about the magic behind the shadow STI
         before MWAIT in __sti_mwait().
      
       - Fix possible unintended timer delays caused by a race
         in mwait_idle_with_hints().
      
      * tag 'x86-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86: Fix CPUIDLE_FLAG_IRQ_ENABLE leaking timer reprogram
        x86: Add a comment about the "magic" behind shadow sti before mwait
      33677aef
    • Linus Torvalds's avatar
      Merge tag 'x86-cleanups-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b51cc5d0
      Linus Torvalds authored
      Pull x86 cleanups from Ingo Molnar:
      
       - Change global variables to local
      
       - Add missing kernel-doc function parameter descriptions
      
       - Remove unused parameter from a macro
      
       - Remove obsolete Kconfig entry
      
       - Fix comments
      
       - Fix typos, mostly scripted, manually reviewed
      
      and a micro-optimization got misplaced as a cleanup:
      
       - Micro-optimize the asm code in secondary_startup_64_no_verify()
      
      * tag 'x86-cleanups-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        arch/x86: Fix typos
        x86/head_64: Use TESTB instead of TESTL in secondary_startup_64_no_verify()
        x86/docs: Remove reference to syscall trampoline in PTI
        x86/Kconfig: Remove obsolete config X86_32_SMP
        x86/io: Remove the unused 'bw' parameter from the BUILDIO() macro
        x86/mtrr: Document missing function parameters in kernel-doc
        x86/setup: Make relocated_ramdisk a local variable of relocate_initrd()
      b51cc5d0
    • Linus Torvalds's avatar
      Merge tag 'x86-build-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 42c371f8
      Linus Torvalds authored
      Pull x86 build updates from Ingo Molnar:
      
       - Update the objdump & instruction decoder self-test code for better
         LLVM toolchain compatibility
      
       - Rework CONFIG_X86_PAE dependencies, for better readability and higher
         robustness.
      
       - Misc cleanups
      
      * tag 'x86-build-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/tools: objdump_reformat.awk: Skip bad instructions from llvm-objdump
        x86/Kconfig: Rework CONFIG_X86_PAE dependency
        x86/tools: Remove chkobjdump.awk
        x86/tools: objdump_reformat.awk: Allow for spaces
        x86/tools: objdump_reformat.awk: Ensure regex matches fwait
      42c371f8
    • Linus Torvalds's avatar
      Merge tag 'x86-boot-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f73857ec
      Linus Torvalds authored
      Pull x86 boot updates from Ingo Molnar:
      
       - Ignore NMIs during very early boot, to address kexec crashes
      
       - Remove redundant initialization in boot/string.c's strcmp()
      
      * tag 'x86-boot-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot: Remove redundant initialization of the 'delta' variable in strcmp()
        x86/boot: Ignore NMIs during very early boot
      f73857ec
    • Linus Torvalds's avatar
      Merge tag 'x86-asm-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 106b88d7
      Linus Torvalds authored
      Pull x86 asm updates from Ingo Molnar:
       "Replace magic numbers in GDT descriptor definitions & handling:
      
         - Introduce symbolic names via macros for descriptor
           types/fields/flags, and then use these symbolic names.
      
         - Clean up definitions a bit, such as GDT_ENTRY_INIT()
      
         - Fix/clean up details that became visibly inconsistent after the
           symbol-based code was introduced:
      
            - Unify accessed flag handling
      
            - Set the D/B size flag consistently & according to the HW
              specification"
      
      * tag 'x86-asm-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/asm: Add DB flag to 32-bit percpu GDT entry
        x86/asm: Always set A (accessed) flag in GDT descriptors
        x86/asm: Replace magic numbers in GDT descriptors, script-generated change
        x86/asm: Replace magic numbers in GDT descriptors, preparations
        x86/asm: Provide new infrastructure for GDT descriptors
      106b88d7
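      To illustrate what replacing magic numbers with symbolic names looks
      like, here is a sketch that composes the 16-bit flags word of a flat
      32-bit segment descriptor from named bits. The DEMO_ names are
      hypothetical and need not match the macros these commits introduce; the
      bit positions follow the x86 segment descriptor layout (access byte in
      bits 0-7, AVL/L/D-B/G in bits 12-15).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for bits of the descriptor flags word. */
#define DEMO_DESC_ACCESSED      0x0001u /* A: segment has been accessed */
#define DEMO_DESC_CODE_READABLE 0x0002u /* code segments: readable */
#define DEMO_DESC_DATA_WRITABLE 0x0002u /* data segments: writable */
#define DEMO_DESC_CODE_EXEC     0x0008u /* executable, i.e. a code segment */
#define DEMO_DESC_S             0x0010u /* code/data (non-system) descriptor */
#define DEMO_DESC_DPL(dpl)      ((unsigned int)(dpl) << 5)
#define DEMO_DESC_PRESENT       0x0080u /* P: segment is present */
#define DEMO_DESC_DB            0x4000u /* D/B: 32-bit default size */
#define DEMO_DESC_GRAN_4K       0x8000u /* G: limit counted in 4 KiB units */

/* Flags for a flat 32-bit code segment at the given privilege level. */
static uint16_t demo_code32_flags(unsigned int dpl)
{
	return (uint16_t)(DEMO_DESC_GRAN_4K | DEMO_DESC_DB | DEMO_DESC_PRESENT |
			  DEMO_DESC_DPL(dpl) | DEMO_DESC_S |
			  DEMO_DESC_CODE_EXEC | DEMO_DESC_CODE_READABLE |
			  DEMO_DESC_ACCESSED);
}

/* Flags for a flat 32-bit data segment at the given privilege level. */
static uint16_t demo_data32_flags(unsigned int dpl)
{
	return (uint16_t)(DEMO_DESC_GRAN_4K | DEMO_DESC_DB | DEMO_DESC_PRESENT |
			  DEMO_DESC_DPL(dpl) | DEMO_DESC_S |
			  DEMO_DESC_DATA_WRITABLE | DEMO_DESC_ACCESSED);
}

/* The named bits reproduce the familiar magic numbers. */
void demo_gdt_selfcheck(void)
{
	assert(demo_code32_flags(0) == 0xc09b);
	assert(demo_data32_flags(0) == 0xc093);
}
```

      Spelling the bits out makes it visible at a glance whether the accessed
      and D/B flags are set, which is where the inconsistencies mentioned above
      tend to hide when raw constants are used.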
    • Linus Torvalds's avatar
      Merge tag 'x86-apic-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 33034c4f
      Linus Torvalds authored
      Pull x86 apic updates from Ingo Molnar:
      
       - Clean up 'struct apic':
          - Drop ::delivery_mode
          - Drop 'enum apic_delivery_modes'
          - Drop 'struct local_apic'
      
       - Fix comments
      
      * tag 'x86-apic-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ioapic: Remove unfinished sentence from comment
        x86/apic: Drop struct local_apic
        x86/apic: Drop enum apic_delivery_modes
        x86/apic: Drop apic::delivery_mode
      33034c4f
    • Linus Torvalds's avatar
      Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · ab5f3fcb
      Linus Torvalds authored
      Pull arm64 updates from Will Deacon:
       "CPU features:
      
         - Remove ARM64_HAS_NO_HW_PREFETCH copy_page() optimisation for ye
           olde Thunder-X machines
      
         - Avoid mapping KPTI trampoline when it is not required
      
         - Make CPU capability API more robust during early initialisation
      
        Early idreg overrides:
      
         - Remove dependencies on core kernel helpers from the early
           command-line parsing logic in preparation for moving this code
           before the kernel is mapped
      
        FPsimd:
      
         - Restore kernel-mode fpsimd context lazily, allowing us to run
           fpsimd code sequences in the kernel with pre-emption enabled
      
        KBuild:
      
         - Install 'vmlinuz.efi' when CONFIG_EFI_ZBOOT=y
      
         - Makefile cleanups
      
        LPA2 prep:
      
         - Preparatory work for enabling the 'LPA2' extension, which will
           introduce 52-bit virtual and physical addressing even with 4KiB
           pages (including for KVM guests).
      
        Misc:
      
         - Remove dead code and fix a typo
      
        MM:
      
         - Pass NUMA node information for IRQ stack allocations
      
        Perf:
      
         - Add perf support for the Synopsys DesignWare PCIe PMU
      
         - Add support for event counting thresholds (FEAT_PMUv3_TH)
           introduced in Armv8.8
      
         - Add support for i.MX8DXL SoCs to the IMX DDR PMU driver.
      
         - Minor PMU driver fixes and optimisations
      
        RIP VPIPT:
      
         - Remove what support we had for the obsolete VPIPT I-cache policy
      
        Selftests:
      
         - Improvements to the SVE and SME selftests
      
        Stacktrace:
      
         - Refactor kernel unwind logic so that it can be used by BPF unwinding
           and, eventually, reliable backtracing
      
        Sysregs:
      
         - Update a bunch of register definitions based on the latest XML drop
           from Arm"
      
      * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits)
        kselftest/arm64: Don't probe the current VL for unsupported vector types
        efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad
        arm64: properly install vmlinuz.efi
        arm64/sysreg: Add missing system instruction definitions for FGT
        arm64/sysreg: Add missing system register definitions for FGT
        arm64/sysreg: Add missing ExtTrcBuff field definition to ID_AA64DFR0_EL1
        arm64/sysreg: Add missing Pauth_LR field definitions to ID_AA64ISAR1_EL1
        arm64: memory: remove duplicated include
        arm: perf: Fix ARCH=arm build with GCC
        arm64: Align boot cpucap handling with system cpucap handling
        arm64: Cleanup system cpucap handling
        MAINTAINERS: add maintainers for DesignWare PCIe PMU driver
        drivers/perf: add DesignWare PCIe PMU driver
        PCI: Move pci_clear_and_set_dword() helper to PCI header
        PCI: Add Alibaba Vendor ID to linux/pci_ids.h
        docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
        arm64: irq: set the correct node for shadow call stack
        Revert "perf/arm_dmc620: Remove duplicate format attribute #defines"
        arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD
        arm64: fpsimd: Preserve/restore kernel mode NEON at context switch
        ...
      ab5f3fcb
    • Linus Torvalds's avatar
      Merge tag 'm68k-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · 3cf1d6a5
      Linus Torvalds authored
      Pull m68k updates from Geert Uytterhoeven:
      
        - make the NuBus bus type static and constant
      
        - defconfig updates
      
      * tag 'm68k-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k: defconfig: Update defconfigs for v6.7-rc1
        nubus: Make nubus_bus_type static and constant
      3cf1d6a5
    • Linus Torvalds's avatar
      Merge tag 'powerpc-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 968b8033
      Linus Torvalds authored
      Pull powerpc updates from Michael Ellerman:
      
       - Add initial support to recognise the HeXin C2000 processor.
      
       - Add papr-vpd and papr-sysparm character device drivers for VPD &
         sysparm retrieval, so userspace tools can be adapted to avoid doing
         raw firmware calls from userspace.
      
       - Sched domains optimisations for shared processor partitions on
         P9/P10.
      
       - A series of optimisations for KVM running as a nested HV under
         PowerVM.
      
       - Other small features and fixes.
      
      Thanks to Aditya Gupta, Aneesh Kumar K.V, Arnd Bergmann, Christophe
      Leroy, Colin Ian King, Dario Binacchi, David Heidelberg, Geoff Levand,
      Gustavo A. R. Silva, Haoran Liu, Jordan Niethe, Kajol Jain, Kevin Hao,
      Kunwu Chan, Li kunyu, Li zeming, Masahiro Yamada, Michal Suchánek,
      Nathan Lynch, Naveen N Rao, Nicholas Piggin, Randy Dunlap, Sathvika
      Vasireddy, Srikar Dronamraju, Stephen Rothwell, Vaibhav Jain, and
      Zhao Ke.
      
      * tag 'powerpc-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (96 commits)
        powerpc/ps3_defconfig: Disable PPC64_BIG_ENDIAN_ELF_ABI_V2
        powerpc/86xx: Drop unused CONFIG_MPC8610
        powerpc/powernv: Add error handling to opal_prd_range_is_valid
        selftests/powerpc: Fix spelling mistake "EACCESS" -> "EACCES"
        powerpc/hvcall: Reorder Nestedv2 hcall opcodes
        powerpc/ps3: Add missing set_freezable() for ps3_probe_thread()
        powerpc/mpc83xx: Use wait_event_freezable() for freezable kthread
        powerpc/mpc83xx: Add the missing set_freezable() for agent_thread_fn()
        powerpc/fsl: Fix fsl,tmu-calibration to match the schema
        powerpc/smp: Dynamically build Powerpc topology
        powerpc/smp: Avoid asym packing within thread_group of a core
        powerpc/smp: Add __ro_after_init attribute
        powerpc/smp: Disable MC domain for shared processor
        powerpc/smp: Enable Asym packing for cores on shared processor
        powerpc/sched: Cleanup vcpu_is_preempted()
        powerpc: add cpu_spec.cpu_features to vmcoreinfo
        powerpc/imc-pmu: Add a null pointer check in update_events_in_group()
        powerpc/powernv: Add a null pointer check in opal_powercap_init()
        powerpc/powernv: Add a null pointer check in opal_event_init()
        powerpc/powernv: Add a null pointer check to scom_debug_init_one()
        ...
      968b8033
    • Linus Torvalds's avatar
      Merge tag 'ras_core_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3edbe8af
      Linus Torvalds authored
      Pull x86 RAS updates from Borislav Petkov:
      
       - Convert the hw error storm handling into a finer-grained, per-bank
         solution which allows for more timely detection and reporting of
         errors
      
       - Start a documentation section which will hold descriptions of the
         relevant RAS features and how they should be used
      
       - Add new AMD error bank types
      
       - Slim down the kernel side of error decoding by moving the error type
         descriptions to rasdaemon, which can be used from now on to decode
         hw errors on AMD
      
       - Mark pages containing uncorrectable errors as poison so that kdump
         can avoid them and thus not cause another panic
      
       - The usual cleanups and fixlets
      
      * tag 'ras_core_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce: Handle Intel threshold interrupt storms
        x86/mce: Add per-bank CMCI storm mitigation
        x86/mce: Remove old CMCI storm mitigation code
        Documentation: Begin a RAS section
        x86/MCE/AMD: Add new MA_LLC, USR_DP, and USR_CP bank types
        EDAC/mce_amd: Remove SMCA Extended Error code descriptions
        x86/mce/amd, EDAC/mce_amd: Move long names to decoder module
        x86/mce/inject: Clear test status value
        x86/mce: Remove redundant check from mce_device_create()
        x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel
      3edbe8af
  2. 08 Jan, 2024 12 commits
    • Linus Torvalds's avatar
      Merge tag 'x86_cpu_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bef91c28
      Linus Torvalds authored
      Pull x86 cpu feature updates from Borislav Petkov:
      
       - Add synthetic X86_FEATURE flags for the different AMD Zen generations
         and use them everywhere instead of ad-hoc family/model checks. Drop
         an ancient AMD errata checking facility as a result
      
       - Fix a fragile initcall ordering in intel_epb
      
       - Do not issue the MFENCE+LFENCE barrier for the TSC deadline and
         X2APIC MSRs on AMD as it is not needed there
      
      * tag 'x86_cpu_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/CPU/AMD: Add X86_FEATURE_ZEN1
        x86/CPU/AMD: Drop now unused CPU erratum checking function
        x86/CPU/AMD: Get rid of amd_erratum_1485[]
        x86/CPU/AMD: Get rid of amd_erratum_400[]
        x86/CPU/AMD: Get rid of amd_erratum_383[]
        x86/CPU/AMD: Get rid of amd_erratum_1054[]
        x86/CPU/AMD: Move the DIV0 bug detection to the Zen1 init function
        x86/CPU/AMD: Move Zenbleed check to the Zen2 init function
        x86/CPU/AMD: Rename init_amd_zn() to init_amd_zen_common()
        x86/CPU/AMD: Call the spectral chicken in the Zen2 init function
        x86/CPU/AMD: Move erratum 1076 fix into the Zen1 init function
        x86/CPU/AMD: Move the Zen3 BTC_NO detection to the Zen3 init function
        x86/CPU/AMD: Carve out the erratum 1386 fix
        x86/CPU/AMD: Add ZenX generations flags
        x86/cpu/intel_epb: Don't rely on link order
        x86/barrier: Do not serialize MSR accesses on AMD
      bef91c28
    • Linus Torvalds's avatar
      Merge tag 'x86_sev_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e900042f
      Linus Torvalds authored
      Pull x86 SEV updates from Borislav Petkov:
      
       - Convert the sev-guest platform ->remove callback to return void
      
       - Move the SEV C-bit verification to the BSP as it needs to happen only
         once and not on every AP
      
      * tag 'x86_sev_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        virt: sev-guest: Convert to platform remove callback returning void
        x86/sev: Do the C-bit verification only on the BSP
      e900042f
    • Kirill A. Shutemov's avatar
      mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER · 5e0a760b
      Kirill A. Shutemov authored
      commit 23baf831 ("mm, treewide: redefine MAX_ORDER sanely") has
      changed the definition of MAX_ORDER to be inclusive.  This has caused
      issues with code that was not yet upstream and depended on the previous
      definition.
      
      To draw attention to the altered meaning of the define, rename MAX_ORDER
      to MAX_PAGE_ORDER.
      
      Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      5e0a760b
    • Kirill A. Shutemov's avatar
      mm, treewide: introduce NR_PAGE_ORDERS · fd377218
      Kirill A. Shutemov authored
      NR_PAGE_ORDERS defines the number of page orders supported by the page
      allocator, ranging from 0 to MAX_ORDER, MAX_ORDER + 1 in total.
      
      NR_PAGE_ORDERS assists in defining arrays of page orders and allows for
      more natural iteration over them.
      
      [kirill.shutemov@linux.intel.com: fixup for kerneldoc warning]
        Link: https://lkml.kernel.org/r/20240101111512.7empzyifq7kxtzk3@box
      Link: https://lkml.kernel.org/r/20231228144704.14033-1-kirill.shutemov@linux.intel.com
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reviewed-by: Zi Yan <ziy@nvidia.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      fd377218
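      A count-style constant lets an array of per-order data be declared and
      walked with an ordinary "<" loop instead of "<= MAX_ORDER". A minimal
      sketch follows, using hypothetical DEMO_ names and an arbitrary maximum
      order of 10; the kernel's own value is a configuration matter and is not
      stated in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for MAX_ORDER and NR_PAGE_ORDERS. */
#define DEMO_MAX_ORDER      10
#define DEMO_NR_PAGE_ORDERS (DEMO_MAX_ORDER + 1) /* orders 0..DEMO_MAX_ORDER */

/* Hypothetical per-order bookkeeping: number of free blocks of that order. */
struct demo_free_area {
	unsigned long nr_free;
};

/* One slot per order; sizing with the count avoids "+ 1" at each use. */
static struct demo_free_area demo_areas[DEMO_NR_PAGE_ORDERS];

static_assert(sizeof(demo_areas) / sizeof(demo_areas[0]) == DEMO_MAX_ORDER + 1,
	      "one entry for every order from 0 to the maximum, inclusive");

/* Total free pages: a free block of order o spans (1 << o) pages. */
static unsigned long demo_total_free_pages(void)
{
	unsigned long pages = 0;

	for (unsigned int order = 0; order < DEMO_NR_PAGE_ORDERS; order++)
		pages += demo_areas[order].nr_free << order;
	return pages;
}
```

      With an inclusive maximum, both the array bound and the loop condition
      would otherwise need an explicit "+ 1" or "<=", which is easy to get
      wrong.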
    • Linus Torvalds's avatar
      Merge tag 'x86_paravirt_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fc5e5c59
      Linus Torvalds authored
      Pull x86 paravirt updates from Borislav Petkov:
      
       - Replace the paravirt patching functionality using the alternatives
         infrastructure and remove the former
      
       - Misc other improvements
      
      * tag 'x86_paravirt_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/alternative: Correct feature bit debug output
        x86/paravirt: Remove no longer needed paravirt patching code
        x86/paravirt: Switch mixed paravirt/alternative calls to alternatives
        x86/alternative: Add indirect call patching
        x86/paravirt: Move some functions and defines to alternative.c
        x86/paravirt: Introduce ALT_NOT_XEN
        x86/paravirt: Make the struct paravirt_patch_site packed
        x86/paravirt: Use relative reference for the original instruction offset
      fc5e5c59
    • Linus Torvalds's avatar
      Merge tag 'x86_misc_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 41a80ca4
      Linus Torvalds authored
      Pull misc x86 updates from Borislav Petkov:
      
       - Add an informational message which gets issued when IA32 emulation
         has been disabled on the cmdline
      
       - Clarify in detail how /proc/cpuinfo is used on x86
      
       - Fix a theoretical overflow in num_digits()
      
      * tag 'x86_misc_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ia32: State that IA32 emulation is disabled
        Documentation/x86: Document what /proc/cpuinfo is for
        x86/lib: Fix overflow when counting digits
      41a80ca4
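      The digit-counting overflow can be illustrated with a small sketch. This
      is not the kernel's num_digits() source; demo_num_digits() is a
      hypothetical reimplementation that assumes a 32-bit int and shows one way
      to keep the power-of-ten comparison value from overflowing, namely by
      widening it.

```c
#include <assert.h>
#include <limits.h>

/*
 * Number of characters needed to print 'val' in decimal, counting a
 * leading minus sign for negative values.
 */
static int demo_num_digits(int val)
{
	long long v = val; /* widened so that negating INT_MIN is well defined */
	long long m = 10;  /* widened so that m *= 10 stays in range below */
	int d = 1;

	if (v < 0) {
		d++;
		v = -v;
	}

	/*
	 * With a plain int 'm', a 10-digit value such as INT_MAX would push
	 * m past the int range on the last multiplication.
	 */
	while (v >= m) {
		m *= 10;
		d++;
	}
	return d;
}

/* Check the small cases and the single- to double-digit boundary. */
void demo_num_digits_selfcheck(void)
{
	assert(demo_num_digits(0) == 1);
	assert(demo_num_digits(9) == 1);
	assert(demo_num_digits(10) == 2);
	assert(demo_num_digits(-5) == 2);
}
```

      The overflow is called theoretical in the pull message, presumably
      because it needs a value with the maximum number of digits to trigger.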
    • Linus Torvalds's avatar
      Merge tag 'x86_microcode_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6e0b9391
      Linus Torvalds authored
      Pull x86 microcode updates from Borislav Petkov:
      
       - Correct minor issues after the microcode revision reporting
         sanitization
      
      * tag 'x86_microcode_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/microcode/intel: Set new revision only after a successful update
        x86/microcode/intel: Remove redundant microcode late updated message
      6e0b9391
    • Linus Torvalds's avatar
      Merge tag 'edac_updates_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · 1dee7f50
      Linus Torvalds authored
      Pull EDAC updates from Borislav Petkov:
      
       - The EDAC drivers part of the effort to make the ->remove() platform
         driver callback return void
      
       - Add support for AMD AI accelerators
      
       - Add support for a number of Intel SoCs: Alder Lake-N, Raptor Lake-P,
         Meteor Lake-{P,PS}
      
       - Random fixes and cleanups all over the place
      
      * tag 'edac_updates_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: (39 commits)
        EDAC/skx_common: Filter out the invalid address
        EDAC, pnd2: Sort headers alphabetically
        EDAC, pnd2: Correct misleading error message in mk_region_mask()
        EDAC, pnd2: Apply bit macros and helpers where it makes sense
        EDAC, pnd2: Replace custom definition by one from sizes.h
        EDAC/igen6: Add Intel Meteor Lake-P SoCs support
        EDAC/igen6: Add Intel Meteor Lake-PS SoCs support
        EDAC/igen6: Add Intel Raptor Lake-P SoCs support
        EDAC/igen6: Add Intel Alder Lake-N SoCs support
        EDAC/igen6: Make get_mchbar() helper function
        EDAC/amd64: Add support for family 0x19, models 0x90-9f devices
        EDAC/mc: Add support for HBM3 memory type
        EDAC/{sb,i7core}_edac: Do not use a plain integer for a NULL pointer
        EDAC/armada_xp: Explicitly include correct DT includes
        EDAC/pci_sysfs: Use PCI_HEADER_TYPE_MASK instead of literals
        EDAC/thunderx: Fix possible out-of-bounds string access
        EDAC/fsl_ddr: Convert to platform remove callback returning void
        EDAC/zynqmp: Convert to platform remove callback returning void
        EDAC/xgene: Convert to platform remove callback returning void
        EDAC/ti: Convert to platform remove callback returning void
        ...
    • Merge tag 'vfs-6.8.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 5db8752c
      Linus Torvalds authored
      Pull vfs iov_iter cleanups from Christian Brauner:
       "This contains a minor cleanup. The patches drop an unused argument
        from import_single_range(), which allows replacing
        import_single_range() with import_ubuf() and dropping
        import_single_range() completely"
      
      * tag 'vfs-6.8.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        iov_iter: replace import_single_range() with import_ubuf()
        iov_iter: remove unused 'iov' argument from import_single_range()
    • Merge tag 'vfs-6.8.cachefiles' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 26458409
      Linus Torvalds authored
      Pull vfs cachefiles updates from Christian Brauner:
       "This contains improvements for on-demand cachefiles.
      
        If the daemon crashes and the on-demand cachefiles fd is unexpectedly
        closed, in-flight requests and subsequent read operations associated
        with the fd will fail with EIO. This causes issues in various
        scenarios as this failure is currently unrecoverable.
      
        The work contained in this pull request introduces a failover mode and
        enables the daemon to recover in-flight request-related objects. A
        restarted daemon will be able to process requests as usual.
      
        This requires that in-flight requests are stored during daemon crash
        or while the daemon is offline. In addition, a handle to
        /dev/cachefiles needs to be stored.
      
        This can be done by, e.g., systemd's fdstore (cf. [1]), which enables
        the restarted daemon to recover state.
      
        Three new states are introduced in this patchset:
      
         (1) CLOSE
             Object is closed by the daemon.
      
         (2) OPEN
             Object is open and ready for processing. IOW, the open request
             has been handled successfully.
      
         (3) REOPENING
             Object has been previously closed and is now reopened due to a
             read request.
      
        A restarted daemon can recover the /dev/cachefiles fd from systemd's
        fdstore and write "restore" to the device. This causes the object
        state to be reset from CLOSE to REOPENING and reinitializes the
        object.
      
        The daemon may now handle the open request. Any in-flight operations
        are restored and handled, avoiding interruptions for users"
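
The state transitions described above can be modelled in a few lines of
user-space C. This is only an illustrative sketch: the enum, struct and
function names are hypothetical and are not the kernel's cachefiles
identifiers. It assumes that a read on a closed object moves it to REOPENING,
that a handled open request moves it to OPEN, and that "restore" resets CLOSE
objects to REOPENING, as the pull message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the three on-demand object states. */
enum obj_state { OBJ_CLOSE, OBJ_OPEN, OBJ_REOPENING };

struct object {
    enum obj_state state;
};

/* The daemon closed the object. */
void on_daemon_close(struct object *obj)
{
    obj->state = OBJ_CLOSE;
}

/* The daemon handled an open request successfully. */
void on_open_handled(struct object *obj)
{
    obj->state = OBJ_OPEN;
}

/*
 * A read request arrives. If the object is closed it moves to REOPENING
 * and the caller is told (return value true) that an open request has to
 * be sent to the daemon again.
 */
bool on_read_request(struct object *obj)
{
    if (obj->state == OBJ_CLOSE) {
        obj->state = OBJ_REOPENING;
        return true;
    }
    return false;
}

/* Model of the "restore" command: every CLOSE object becomes REOPENING. */
void on_restore(struct object *objs, size_t n)
{
    assert(objs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (objs[i].state == OBJ_CLOSE)
            objs[i].state = OBJ_REOPENING;
}
```

In this model, objects that are already OPEN are left untouched by
on_restore(); whether the kernel does anything else to them is not covered by
the text above.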
      
      Link: https://systemd.io/FILE_DESCRIPTOR_STORE [1]
      
      * tag 'vfs-6.8.cachefiles' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        cachefiles: add restore command to recover inflight ondemand read requests
        cachefiles: narrow the scope of triggering EPOLLIN events in ondemand mode
        cachefiles: resend an open request if the read request's object is closed
        cachefiles: extract ondemand info field from cachefiles_object
        cachefiles: introduce object ondemand state
    • Merge tag 'vfs-6.8.rw' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · bb93c5ed
      Linus Torvalds authored
      Pull vfs rw updates from Christian Brauner:
       "This contains updates from Amir for read-write backing file helpers
        for stacking filesystems such as overlayfs:
      
         - Fanotify is currently in the process of introducing pre-content
           events. Roughly, a new permission event will be added indicating
           that it is safe to write to the file being accessed. These events
           are used by hierarchical storage managers to, e.g., fill the
           content of files on first access.
      
           During that work we noticed that our current permission checking is
           inconsistent in rw_verify_area() and remap_verify_area().
           Especially in the splice code permission checking is done multiple
           times. For example, one time for the whole range and then again for
           partial ranges inside the iterator.
      
           In addition, we mostly do permission checking before we call
           file_start_write() except for a few places where we call it after.
           For pre-content events we need such permission checking to be done
           before file_start_write(). So this is a nice reason to clean this
           all up.
      
           After this series, all permission checking is done before
           file_start_write().
      
           As part of this cleanup we also massaged the splice code a bit. We
           got rid of a few helpers because we are already drowning in special
           read-write helpers. We also cleaned up the return types for splice
           helpers.
      
         - Introduce generic read-write helpers for backing files. This lifts
           some overlayfs code to common code so it can be used by the FUSE
           passthrough work coming in over the next cycles. Make Amir and
           Miklos the maintainers for this new subsystem of the vfs"
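
The ordering rule from the first item (all permission checking happens before
file_start_write()) can be illustrated with a toy user-space model. Everything
here is hypothetical: toy_file, toy_permission(), toy_start_write() and
toy_write() are stand-ins invented for this sketch, not kernel functions, and
the -EDEADLK return merely models the "not held in permission hooks"
assertion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_file {
    bool writable;      /* stand-in for the outcome of a permission check */
    int write_started;  /* stand-in for the write section being held */
    int bytes;          /* bytes "written" so far */
};

/* Permission hook: must run before the write section is entered. */
static int toy_permission(const struct toy_file *f)
{
    if (f->write_started)
        return -EDEADLK;  /* hook called while the write section is held */
    return f->writable ? 0 : -EPERM;
}

static void toy_start_write(struct toy_file *f)
{
    f->write_started++;
}

static void toy_end_write(struct toy_file *f)
{
    assert(f->write_started > 0);
    f->write_started--;
}

/* Returns the number of bytes written or a negative errno-style value. */
int toy_write(struct toy_file *f, int len)
{
    int err = toy_permission(f);  /* 1. check permission first */

    if (err)
        return err;

    toy_start_write(f);           /* 2. only then enter the write section */
    f->bytes += len;
    toy_end_write(f);
    return len;
}
```

A denied write never enters the write section in this model, which is the
property the cleanup is described as establishing.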
      
      * tag 'vfs-6.8.rw' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (30 commits)
        fs: fix __sb_write_started() kerneldoc formatting
        fs: factor out backing_file_mmap() helper
        fs: factor out backing_file_splice_{read,write}() helpers
        fs: factor out backing_file_{read,write}_iter() helpers
        fs: prepare for stackable filesystems backing file helpers
        fsnotify: optionally pass access range in file permission hooks
        fsnotify: assert that file_start_write() is not held in permission hooks
        fsnotify: split fsnotify_perm() into two hooks
        fs: use splice_copy_file_range() inline helper
        splice: return type ssize_t from all helpers
        fs: use do_splice_direct() for nfsd/ksmbd server-side-copy
        fs: move file_start_write() into direct_splice_actor()
        fs: fork splice_file_range() from do_splice_direct()
        fs: create {sb,file}_write_not_started() helpers
        fs: create file_write_started() helper
        fs: create __sb_write_started() helper
        fs: move kiocb_start_write() into vfs_iocb_iter_write()
        fs: move permission hook out of do_iter_read()
        fs: move permission hook out of do_iter_write()
        fs: move file_start_write() into vfs_iter_write()
        ...
    • Merge tag 'vfs-6.8.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 8c9440fe
      Linus Torvalds authored
      Pull vfs mount updates from Christian Brauner:
       "This contains the work to retrieve detailed information about mounts
        via two new system calls. This is hopefully the beginning of the end
        of the saga that started with fsinfo() years ago.
      
        The LWN articles in [1] and [2] can serve as a summary so we can avoid
        rehashing everything here.
      
        At LSFMM in May 2022 we got into a room and agreed on what we want to
        do about fsinfo(). Basically, split it into pieces. This is the first
        part of that agreement. Specifically, it is concerned with retrieving
        information about mounts. So this only concerns the mount information
        retrieval, not the mount table change notification, or the extended
        filesystem specific mount option work. That is separate work.
      
        Currently mounts have a 32bit id. Mount ids are already in heavy use
        by libmount and other low-level userspace but they can't be relied
        upon because they're recycled very quickly. We agreed that mounts
        should carry a unique 64bit id by which they can be referenced
        directly. This is now implemented as part of this work.
      
        The new 64bit mount id is exposed in statx() through the new
        STATX_MNT_ID_UNIQUE flag. If the flag isn't raised the old mount id is
        returned. If it is raised and the kernel supports the new 64bit mount
        id the flag is raised in the result mask and the new 64bit mount id is
        returned. New and old mount ids do not overlap so they cannot be
        conflated.
      
        Two new system calls are introduced that operate on the 64bit mount
        id: statmount() and listmount(). A summary of the api and usage can be
        found on LWN as well (cf. [3]) but of course, I'll provide a summary
        here as well.
      
        Both system calls rely on struct mnt_id_req, which is the request
        struct used to pass the 64bit mount id identifying the mount to
        operate on. It is extensible to allow for the addition of new
        parameters and for future use in other apis that make use of mount
        ids.
      
        statmount() mimics the semantics of statx() and exposes a set of
        flags that userspace may raise in mnt_id_req to request specific
        information to be retrieved. A statmount() call returns a struct
        statmount filled in with information about the requested mount.
        Supported requests are indicated by raising the request flag passed
        in struct mnt_id_req in the @mask argument in struct statmount.
      
        Currently we do support:
      
         - STATMOUNT_SB_BASIC
           Basic filesystem info
      
         - STATMOUNT_MNT_BASIC
           Mount information (mount id, parent mount id, mount attributes etc)
      
         - STATMOUNT_PROPAGATE_FROM
           Propagation from what mount in current namespace
      
         - STATMOUNT_MNT_ROOT
           Path of the root of the mount (e.g., mount --bind /bla /mnt returns /bla)
      
         - STATMOUNT_MNT_POINT
           Path of the mount point (e.g., mount --bind /bla /mnt returns /mnt)
      
         - STATMOUNT_FS_TYPE
           Name of the filesystem type as the magic number isn't enough due to submounts
      
        The string options STATMOUNT_MNT_{ROOT,POINT} and STATMOUNT_FS_TYPE
        are appended to the end of the struct. Userspace can use the offsets
        in @fs_type, @mnt_root, and @mnt_point to reference those strings
        easily.
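
The "strings appended to the end of the struct, referenced by offset" layout
can be sketched as below. Note that struct toy_reply is a deliberately
simplified, hypothetical layout and NOT the real struct statmount; only the
idea of storing offsets into a trailing string area is taken from the text.
The helpers toy_fill() and toy_str() are likewise invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified reply layout (not the real uapi struct). */
struct toy_reply {
    uint32_t size;       /* bytes used: header plus string area */
    uint32_t fs_type;    /* offset of the fs type string in str[] */
    uint32_t mnt_root;   /* offset of the mount root path in str[] */
    uint32_t mnt_point;  /* offset of the mount point path in str[] */
    char str[];          /* NUL-terminated strings, stored back to back */
};

/* Append one string to the string area and return its offset. */
static uint32_t put_str(struct toy_reply *r, uint32_t *used, const char *s)
{
    uint32_t off = *used;
    size_t len = strlen(s) + 1;  /* include the terminating NUL */

    memcpy(r->str + off, s, len);
    *used += (uint32_t)len;
    return off;
}

/* Producer side: build a reply; the caller frees it with free(). */
struct toy_reply *toy_fill(const char *fs, const char *root, const char *point)
{
    size_t strings = strlen(fs) + strlen(root) + strlen(point) + 3;
    struct toy_reply *r = calloc(1, sizeof(*r) + strings);
    uint32_t used = 0;

    if (r == NULL)
        return NULL;
    r->fs_type = put_str(r, &used, fs);
    r->mnt_root = put_str(r, &used, root);
    r->mnt_point = put_str(r, &used, point);
    r->size = (uint32_t)(sizeof(*r) + used);
    return r;
}

/* Consumer side: turn an offset back into a string pointer. */
const char *toy_str(const struct toy_reply *r, uint32_t off)
{
    assert(sizeof(*r) + off < r->size);
    return r->str + off;
}
```

With this layout a consumer needs no parsing: toy_str(r, r->mnt_point) yields
the mount point directly, and new string fields only need a new offset member.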
      
        The struct statmount reserves quite a bit of space currently for
        future extensibility. This isn't really a problem and if this bothers
        us we can just send a follow-up pull request during this cycle.
      
        listmount() is given a 64bit mount id via mnt_id_req just as
        statmount(). It takes a buffer and a size to return an array of the
        64bit ids of the child mounts of the requested mount. Userspace can
        thus choose to either retrieve child mounts for a mount in batches or
        iterate through the child mounts. For most use-cases it will be
        sufficient to just leave space for a few child mounts. But for big
        mount tables having an iterator is really helpful. Iterating through a
        mount table works by setting @param in mnt_id_req to the mount id of
        the last child mount retrieved in the previous listmount() call"
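
The iteration scheme (feed the last id of one batch back in as the cursor for
the next call) can be sketched without touching the real syscall.
toy_listmount() below is a hypothetical in-memory stand-in for listmount(),
with a made-up child table and the assumption that ids come back in ascending
order and that a cursor of 0 means "start from the beginning". Neither the
signature nor those conventions are taken from the kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up child mount ids for the toy model. */
static const uint64_t toy_children[] = { 101, 102, 103, 104, 105 };
#define TOY_NCHILDREN (sizeof(toy_children) / sizeof(toy_children[0]))

/*
 * Hypothetical stand-in for listmount(): copy up to `cap` child ids that
 * come after `last_id` into buf and return how many were written.
 */
size_t toy_listmount(uint64_t last_id, uint64_t *buf, size_t cap)
{
    size_t n = 0;

    for (size_t i = 0; i < TOY_NCHILDREN && n < cap; i++)
        if (toy_children[i] > last_id)
            buf[n++] = toy_children[i];
    return n;
}

/*
 * Walk all children in batches of `batch` ids, using the last id of each
 * batch as the cursor for the next call. Returns the number of ids stored
 * in out (at most out_cap).
 */
size_t toy_walk(size_t batch, uint64_t *out, size_t out_cap)
{
    uint64_t buf[8];
    uint64_t cursor = 0;  /* toy convention: 0 starts from the beginning */
    size_t total = 0;
    size_t n;

    assert(batch > 0 && batch <= 8);
    while ((n = toy_listmount(cursor, buf, batch)) > 0) {
        for (size_t i = 0; i < n && total < out_cap; i++)
            out[total++] = buf[i];
        cursor = buf[n - 1];  /* resume after the last id seen */
    }
    return total;
}
```

A small batch size keeps the buffer tiny at the cost of more calls, while a
large one retrieves everything at once; both paths go through the same loop.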
      
      Link: https://lwn.net/Articles/934469 [1]
      Link: https://lwn.net/Articles/829212 [2]
      Link: https://lwn.net/Articles/950569 [3]
      
      * tag 'vfs-6.8.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        add selftest for statmount/listmount
        fs: keep struct mnt_id_req extensible
        wire up syscalls for statmount/listmount
        add listmount(2) syscall
        statmount: simplify string option retrieval
        statmount: simplify numeric option retrieval
        add statmount(2) syscall
        namespace: extract show_path() helper
        mounts: keep list of mounts in an rbtree
        add unique mount ID