1. 23 Sep, 2024 5 commits
    • Merge tag 'fs_for_v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · d0359e4c
      Linus Torvalds authored
      Pull quota and isofs updates from Jan Kara:
       "A few small cleanups in quota and isofs"
      
      * tag 'fs_for_v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        isofs: Annotate struct SL_component with __counted_by()
        quota: remove unnecessary error code translation in dquot_quota_enable
        quota: remove redundant return at end of void function
        quota: remove unneeded return value of register_quota_format
        quota: avoid missing put_quota_format when DQUOT_SUSPENDED is passed
    • Merge tag 'bcachefs-2024-09-21' of git://evilpiepirate.org/bcachefs · b3f391fd
      Linus Torvalds authored
      Pull bcachefs updates from Kent Overstreet:
      
       - rcu_pending, btree key cache rework: this solves lock contention in
         the key cache, eliminating the biggest source of the srcu lock hold
         time warnings, and drastically improving performance on some
         metadata-heavy workloads - on multithreaded creates we're now 3-4x
         faster than XFS.
      
       - We're now using an rhashtable instead of the system inode hash table;
         this is another significant performance improvement on multithreaded
         metadata workloads, eliminating more lock contention.
      
       - for_each_btree_key_in_subvolume_upto(): new helper for iterating over
         keys within a specific subvolume, eliminating a lot of open coded
         "subvolume_get_snapshot()" and also fixing another source of srcu
         lock time warnings, by running each loop iteration in its own
         transaction (as the existing for_each_btree_key() does).
      
       - More work on btree_trans locking asserts; we now assert that we don't
         hold btree node locks when trans->locked is false, which is important
         because we don't use lockdep for tracking individual btree node
         locks.
      
       - Some cleanups and improvements in the bset.c btree node lookup code,
         from Alan.
      
       - Rework of btree node pinning, which we use in backpointers fsck. The
         old hacky implementation, where the shrinker just skipped over nodes
         in the pinned range, was causing OOMs; instead we now use another
         shrinker with a much higher seeks number for pinned nodes.
      
       - Rebalance now uses BCH_WRITE_ONLY_SPECIFIED_DEVS; this fixes an issue
         where rebalance would sometimes fall back to allocating from the full
         filesystem, which is not what we want when it's trying to move data
         to a specific target.
      
       - Use __GFP_ACCOUNT, GFP_RECLAIMABLE for btree node, key cache
         allocations.
      
       - Idmap mounts are now supported (Hongbo Li)
      
       - Rename whiteouts are now supported (Hongbo Li)
      
       - Erasure coding can now handle devices being marked as failed, or
         forcibly removed. We still need the evacuate path for erasure coding,
         but it's getting very close to ready for people to start using.
      
      * tag 'bcachefs-2024-09-21' of git://evilpiepirate.org/bcachefs: (99 commits)
        bcachefs: return err ptr instead of null in read sb clean
        bcachefs: Remove duplicated include in backpointers.c
        bcachefs: Don't drop devices with stripe pointers
        bcachefs: bch2_ec_stripe_head_get() now checks for change in rw devices
        bcachefs: bch_fs.rw_devs_change_count
        bcachefs: bch2_dev_remove_stripes()
        bcachefs: bch2_trigger_ptr() calculates sectors even when no device
        bcachefs: improve error messages in bch2_ec_read_extent()
        bcachefs: improve error message on too few devices for ec
        bcachefs: improve bch2_new_stripe_to_text()
        bcachefs: ec_stripe_head.nr_created
        bcachefs: bch_stripe.disk_label
        bcachefs: stripe_to_mem()
        bcachefs: EIO errcode cleanup
        bcachefs: Rework btree node pinning
        bcachefs: split up btree cache counters for live, freeable
        bcachefs: btree cache counters should be size_t
        bcachefs: Don't count "skipped access bit" as touched in btree cache scan
        bcachefs: Failed devices no longer require mounting in degraded mode
        bcachefs: bch2_dev_rcu_noerror()
        ...
    • Merge tag 'pull-stable-struct_fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · f8ffbc36
      Linus Torvalds authored
      Pull 'struct fd' updates from Al Viro:
       "Just the 'struct fd' layout change, with conversion to accessor
        helpers"
      
      * tag 'pull-stable-struct_fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        add struct fd constructors, get rid of __to_fd()
        struct fd: representation change
        introduce fd_file(), convert all accessors to it.
    • mm: fix build on 32-bit targets without MAX_PHYSMEM_BITS · f8eb5bd9
      Linus Torvalds authored
      The merge resolution to deal with the conflict between commits
      ea72ce5d ("x86/kaslr: Expose and use the end of the physical memory
      address space") and 99185c10 ("resource, kunit: add test case for
      region_intersects()") ended up being broken in configurations didn't
      define a MAX_PHYSMEM_BITS and that had a 32-bit 'phys_addr_t'.
      
      The fallback to using all bits set (i.e. "(-1ULL)") ended up causing a
      build error:
      
          kernel/resource.c: In function ‘gfr_start’:
          include/linux/minmax.h:93:30: error: conversion from ‘long long unsigned int’ to ‘resource_size_t’ {aka ‘unsigned int’} changes value from ‘18446744073709551615’ to ‘4294967295’ [-Werror=overflow]
      
      This was reported by Geert for m68k, but he points out that it happens
      on other 32-bit architectures too, e.g. mips, xtensa, parisc, and
      powerpc.
      
      Limiting 'PHYSMEM_END' to a 'phys_addr_t' (which is the same as
      'resource_size_t') fixes the build, but Geert points out that it will
      then cause a silent overflow in mm/sparse.c:
      
      	unsigned long max_sparsemem_pfn = (PHYSMEM_END + 1) >> PAGE_SHIFT;
      
      so we actually do want PHYSMEM_END to be defined as a 64-bit type -
      just not all ones, and not larger than 'phys_addr_t'.
      
      The proper fix is probably to not have some kind of default fallback at
      all, but just make sure every architecture has a valid MAX_PHYSMEM_BITS.
      But in the meantime, this just applies the rule that PHYSMEM_END is the
      largest value that fits in a 'phys_addr_t', but does not have the high
      bit set in 64 bits.
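
      In code form, the rule comes out as something like this sketch of the
      fallback (hedged; the exact guard structure in include/linux/mm.h may
      differ):

      	/* largest phys_addr_t value without the 64-bit high bit set */
      	#define PHYSMEM_END	(((phys_addr_t)-1) & ~(1ULL << 63))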
      
      Ugly, ugly.
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hexagon: vdso: Fix build failure · 9631042b
      Guenter Roeck authored
      Hexagon images fail to build with the following error.
      
      arch/hexagon/kernel/vdso.c:57:3: error: use of undeclared identifier 'name'
                      name = "[vdso]",
                      ^
      
      Add the missing '.' to fix the problem.
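
      With the fix applied, the designated initializer line reads:

      	.name = "[vdso]",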
      
      Fixes: 497258df ("mm: remove legacy install_special_mapping() code")
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Reviewed-by: Brian Cain <bcain@quicinc.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 22 Sep, 2024 7 commits
    • seqcount: replace smp_rmb() in read_seqcount() with load acquire · d0dd066a
      Christoph Lameter (Ampere) authored
      Many architectures support load acquire which can replace a memory
      barrier and save some cycles.
      
      A typical sequence
      
      	do {
      		seq = read_seqcount_begin(&s);
      		<something>
      	} while (read_seqcount_retry(&s, seq));
      
      requires 13 cycles on an N1 Neoverse arm64 core (Ampere Altra, to be
      specific) for an empty loop.  Two read memory barriers are needed,
      one for each of the seqcount_*() functions.
      
      We can replace the first read barrier with a load acquire of the
      seqcount which saves us one barrier.
      
      On the Altra doing so reduces the cycle count from 13 to 8.
      
      According to ARM, this is a general improvement for the ARM64
      architecture and not specific to a certain processor.
      
      See
      
        https://developer.arm.com/documentation/102336/0100/Load-Acquire-and-Store-Release-instructions
      
       "Weaker ordering requirements that are imposed by Load-Acquire and
        Store-Release instructions allow for micro-architectural
        optimizations, which could reduce some of the performance impacts that
        are otherwise imposed by an explicit memory barrier.
      
        If the ordering requirement is satisfied using either a Load-Acquire
        or Store-Release, then it would be preferable to use these
        instructions instead of a DMB"
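
      As a minimal sketch of the before/after (simplified; the real
      read_seqcount_begin() also spins while a write is in progress and
      handles lockdep, and the helper names here are illustrative):

      	/* before: plain load plus an explicit read barrier */
      	static inline unsigned read_begin_rmb(const seqcount_t *s)
      	{
      		unsigned seq = READ_ONCE(s->sequence);
      		smp_rmb();
      		return seq;
      	}

      	/* after: a single load-acquire orders all later reads */
      	static inline unsigned read_begin_acquire(const seqcount_t *s)
      	{
      		return smp_load_acquire(&s->sequence);
      	}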
      
      [ NOTE! This is my original minimal patch that unconditionally switches
        over to using smp_load_acquire(), instead of the much more involved
        and subtle patch that Christoph Lameter wrote that made it
        conditional.
      
        But Christoph gets authorship credit because I had initially thought
        that we needed the more complex model, and Christoph ran with it and
        did the work. Only after looking at code generation for all the
        relevant architectures, did I come to the conclusion that nobody
        actually really needs the old "smp_rmb()" model.
      
        Even architectures without load-acquire support generally do as well
        or better with smp_load_acquire().
      
        So credit to Christoph, but if this then causes issues on other
        architectures, put the blame solidly on me.
      
        Also note as part of the ruthless simplification, this gets rid of the
        overly subtle optimization where some code uses a non-barrier version
        of the sequence count (see the __read_seqcount_begin() users in
        fs/namei.c). They then play games with their own barriers and/or with
        nested sequence counts.
      
        Those optimizations are literally meaningless on x86, and questionable
        elsewhere. If somebody can show that they matter, we need to re-do
        them more cleanly than "use an internal helper".       - Linus ]
      Signed-off-by: Christoph Lameter (Ampere) <cl@gentwo.org>
      Link: https://lore.kernel.org/all/20240912-seq_optimize-v3-1-8ee25e04dffa@gentwo.org/
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'address-masking' · de5cb0dc
      Linus Torvalds authored
      Merge user access fast validation using address masking.
      
      This allows architectures to optionally use a data dependent address
      masking model instead of a conditional branch for validating user
      accesses.  That avoids the Spectre-v1 speculation barriers.
      
      Right now only x86-64 takes advantage of this, and not all architectures
      will be able to do it.  It requires a guard region between the user and
      kernel address spaces (so that you can't overflow from one to the
      other), and an easy way to generate a guaranteed-to-fault address for
      invalid user pointers.
      
      Also note that this currently assumes that there is no difference
      between user read and write accesses.  If extended to architectures like
      powerpc, we'll also need to separate out the user read-vs-write cases.
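
      As a hedged sketch of the concept on x86-64, where kernel addresses
      have the top bit set (the real implementation is a few instructions
      of asm; the helper below is illustrative):

      	static inline void __user *mask_user_address(void __user *ptr)
      	{
      		unsigned long addr = (unsigned long)ptr;

      		/*
      		 * Smear the sign bit: any kernel address becomes all ones,
      		 * a guaranteed-to-fault address, using a data dependency
      		 * instead of a speculatable conditional branch.
      		 */
      		addr |= (unsigned long)((long)addr >> 63);
      		return (void __user *)addr;
      	}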
      
      * address-masking:
        x86: make the masked_user_access_begin() macro use its argument only once
        x86: do the user address masking outside the user access area
        x86: support user address masking instead of non-speculative conditional
    • x86: make the masked_user_access_begin() macro use its argument only once · 533ab223
      Linus Torvalds authored
      This doesn't actually matter for any of the current users, but before
      merging it mainline, make sure we don't have any surprising semantics.
      
      We don't actually want to use an inline function here, because we want
      to allow - but not require - const pointer arguments, and return them as
      such.  But we already had a local auto-type variable, so let's just use
      it to avoid any possible double evaluation.
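
      A sketch of the resulting shape (hedged; mask_user_address() and
      __uaccess_begin() stand in for the actual x86 internals):

      	#define masked_user_access_begin(x) ({				\
      		__auto_type __masked_ptr = (x);	/* evaluated only once */ \
      		__masked_ptr = mask_user_address(__masked_ptr);		\
      		__uaccess_begin();					\
      		__masked_ptr; })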
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'trace-ring-buffer-v6.12' of... · af9c191a
      Linus Torvalds authored
      Merge tag 'trace-ring-buffer-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
      
      Pull ring-buffer updates from Steven Rostedt:
      
       - tracing/ring-buffer: persistent buffer across reboots
      
         This allows for the tracing instance ring buffer to stay persistent
         across reboots. The way this is done is by adding to the kernel
         command line:
      
           trace_instance=boot_map@0x285400000:12M
      
         This will reserve 12 megabytes at the address 0x285400000, and then
         map the tracing instance "boot_map" ring buffer to that memory. This
         will appear as a normal instance in the tracefs system:
      
           /sys/kernel/tracing/instances/boot_map
      
         A user could enable tracing in that instance, and on reboot or kernel
         crash, if the memory is not wiped by the firmware, it will recreate
         the trace in that instance. For example, if one were debugging the
         shutdown path of a kernel reboot:
      
           # cd /sys/kernel/tracing
           # echo function > instances/boot_map/current_tracer
           # reboot
           [..]
           # cd /sys/kernel/tracing
           # tail instances/boot_map/trace
                 swapper/0-1       [000] d..1.   164.549800: restore_boot_irq_mode <-native_machine_shutdown
                 swapper/0-1       [000] d..1.   164.549801: native_restore_boot_irq_mode <-native_machine_shutdown
                 swapper/0-1       [000] d..1.   164.549802: disconnect_bsp_APIC <-native_machine_shutdown
                 swapper/0-1       [000] d..1.   164.549811: hpet_disable <-native_machine_shutdown
                 swapper/0-1       [000] d..1.   164.549812: iommu_shutdown_noop <-native_machine_restart
                 swapper/0-1       [000] d..1.   164.549813: native_machine_emergency_restart <-__do_sys_reboot
                 swapper/0-1       [000] d..1.   164.549813: tboot_shutdown <-native_machine_emergency_restart
                 swapper/0-1       [000] d..1.   164.549820: acpi_reboot <-native_machine_emergency_restart
                 swapper/0-1       [000] d..1.   164.549821: acpi_reset <-acpi_reboot
                 swapper/0-1       [000] d..1.   164.549822: acpi_os_write_port <-acpi_reboot
      
         On reboot, the buffer is examined to make sure it is valid. The
         validation check even steps through every event to make sure the meta
         data of the event is correct. If any test fails, it will simply reset
         the buffer, and the buffer will be empty on boot.
      
       - Allow the tracing persistent boot buffer to use the "reserve_mem"
         option
      
         Instead of having the admin find a physical address to store the
         persistent buffer, which can be very tedious if they have to
         administer several different machines, allow them to use the
         "reserve_mem" option that will find a location for them. It is not as
         reliable because of KASLR, as the loading of the kernel in different
         locations can cause the memory allocated to be inconsistent. Booting
         with "nokaslr" can make reserve_mem more reliable.
      
       - Have function graph tracer handle offsets from a previous boot.
      
         The ring buffer output from a previous boot may have different
         addresses due to KASLR. Have the function graph tracer handle these
         by using the delta from the previous boot to the new boot address
         space.
      
       - Only reset the saved meta offset when the buffer is started or reset
      
         The persistent memory meta data holds the previous boot's address
         space information so that the delta needed for function tracing can
         be calculated. This used to be updated to the new address space as
         soon as it was read, so if the buffer wasn't used during that boot,
         the next reboot would calculate the delta from the wrong boot
         rather than from the boot that actually wrote the data in the ring
         buffer, and the functions would not be shown. Do not save the
         address space information of the current kernel until it is being
         recorded.
      
       - Add a magic variable to test the valid meta data
      
         Add a magic number to the meta data that can also be used for
         validation. The existing validator of the previous buffer doesn't
         need it, but it guards against a newer kernel whose meta data
         happens to pass the validator while being laid out differently.
         The magic number can also serve as a "versioning" of the meta
         data.
      
       - Align user space mapped ring buffer sub-buffers to improve TLB
         entries

         Linus mentioned that the mapped ring buffer sub-buffers were
         misaligned between the meta page and the sub-buffers, so that if
         the sub-buffers were bigger than PAGE_SIZE, the TLB couldn't use
         bigger entries.
      
       - Add new kernel command line "traceoff" to disable tracing on boot for
         instances
      
         If tracing is enabled for a boot instance, there needs to be a way
         to disable it at boot so that new events are not written into the
         ring buffer and mixed with events from a previous boot, as that
         can be confusing.
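
         A hedged example, assuming the '^' flag delimiter for boot
         instances:

           trace_instance=boot_map^traceoff@0x285400000:12M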
      
       - Allow trace_printk() to go to other instances
      
         Currently, trace_printk() can only go to the top level instance.
         When debugging with a persistent buffer, it is really useful to be
         able to direct trace_printk() to that buffer, so that the messages
         are still accessible after a crash.
      
       - Do not use "bin_printk()" for traces to a boot instance
      
         bin_printk() saves only a pointer to the printk format in the ring
         buffer, since a reader of a live buffer still has access to the
         format string.
         But this is not the case if the buffer is from a previous boot. If
         the trace_printk() is going to a "persistent" buffer, it will use the
         slower version that writes the printk format into the buffer.
      
       - Add command line option to allow trace_printk() to go to an instance
      
         Allow the kernel command line to define which instance the
         trace_printk() goes to, instead of forcing the admin to set it for
         every boot via the tracefs options.
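
         A hedged example, reusing the flag syntax above:

           trace_instance=boot_map^traceprintk@0x285400000:12M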
      
       - Start a document that explains how to use tracefs to debug the kernel
      
       - Add some more kernel selftests to test user mapped ring buffer
      
      * tag 'trace-ring-buffer-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (28 commits)
        selftests/ring-buffer: Handle meta-page bigger than the system
        selftests/ring-buffer: Verify the entire meta-page padding
        tracing/Documentation: Start a document on how to debug with tracing
        tracing: Add option to set an instance to be the trace_printk destination
        tracing: Have trace_printk not use binary prints if boot buffer
        tracing: Allow trace_printk() to go to other instance buffers
        tracing: Add "traceoff" flag to boot time tracing instances
        ring-buffer: Align meta-page to sub-buffers for improved TLB usage
        ring-buffer: Add magic and struct size to boot up meta data
        ring-buffer: Don't reset persistent ring-buffer meta saved addresses
        tracing/fgraph: Have fgraph handle previous boot function addresses
        tracing: Allow boot instances to use reserve_mem boot memory
        tracing: Fix ifdef of snapshots to not prevent last_boot_info file
        ring-buffer: Use vma_pages() helper function
        tracing: Fix NULL vs IS_ERR() check in enable_instances()
        tracing: Add last boot delta offset for stack traces
        tracing: Update function tracing output for previous boot buffer
        tracing: Handle old buffer mappings for event strings and functions
        tracing/ring-buffer: Add last_boot_info file to boot instance
        ring-buffer: Save text and data locations in mapped meta data
        ...
    • Merge tag 'ktest-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest · dd609b8a
      Linus Torvalds authored
      Pull ktest updates from Steven Rostedt:
      
       - Add notification of build warnings for all tests
      
         Currently, the build will only fail on warnings if the ktest config
         file states that it should fail or if the compile is done with
         '-Werror'. This has allowed warnings to sneak in unnoticed.
      
         Add a notification at the end of the test that will state that
         warnings were found in the build so that the developer will be aware
         of it.
      
       - Fix the grub2 parser to not return the wrong kernel index
      
         ktest.pl can read the grub.cfg file to know which kernel to boot
         via grub-reboot. This requires knowing the index by which the
         kernel is referenced in the grub.cfg file. Some distros have
         menuentry logic that can cause ktest.pl to come up with the wrong
         index and boot the wrong kernel.
      
      * tag 'ktest-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
        ktest.pl: Avoid false positives with grub2 skip regex
        ktest.pl: Always warn on build warnings
    • Merge tag 'perf-tools-for-v6.12-1-2024-09-19' of... · 891e8abe
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v6.12-1-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
      
      Pull perf tools updates from Arnaldo Carvalho de Melo:
      
       - Use BPF + BTF to collect and pretty print syscall and tracepoint
         arguments in 'perf trace', done as a GSoC activity
      
       - Data-type profiling improvements:
      
           - Cache debuginfo to speed up data type resolution
      
           - Add the 'typecln' sort order, to show which cacheline in a target
             is hot or cold. The following shows members in the cfs_rq's first
             cache line:
      
               $ perf report -s type,typecln,typeoff -H
               ...
               -    2.67%        struct cfs_rq
                  +    1.23%        struct cfs_rq: cache-line 2
                  +    0.57%        struct cfs_rq: cache-line 4
                  +    0.46%        struct cfs_rq: cache-line 6
                  -    0.41%        struct cfs_rq: cache-line 0
                          0.39%        struct cfs_rq +0x14 (h_nr_running)
                          0.02%        struct cfs_rq +0x38 (tasks_timeline.rb_leftmost)
      
           - When a typedef resolves to an unnamed struct, use the typedef name
      
           - When a struct has just one basic type field (int, etc), resolve
             the type sort order to the name of the struct, not the type of
             the field
      
           - Support type folding/unfolding in the data-type annotation TUI
      
           - Fix bitfields offsets and sizes
      
           - Initial support for PowerPC, using libcapstone and the usual
             objdump disassembly parsing routines
      
       - Add support for disassembling and addr2line using the LLVM libraries,
         speeding up those operations
      
       - Support --addr2line option in 'perf script' as with other tools
      
       - Intel branch counters (LBR event logging) support, only available
         in recent Intel processors; for instance, the new "brcntr" field
         can be requested from 'perf script' to print the information
         collected from this feature:
      
           $ perf script -F +brstackinsn,+brcntr
      
           # Branch counter abbr list:
           # branch-instructions:ppp = A
           # branch-misses = B
           # '-' No event occurs
           # '+' Event occurrences may be lost due to branch counter saturated
               tchain_edit  332203 3366329.405674:  53030 branch-instructions:ppp:    401781 f3+0x2c (home/sdp/test/tchain_edit)
                  f3+31:
               0000000000401774   insn: eb 04                  br_cntr: AA  # PRED 5 cycles [5]
               000000000040177a   insn: 81 7d fc 0f 27 00 00
               0000000000401781   insn: 7e e3                  br_cntr: A   # PRED 1 cycles [6] 2.00 IPC
               0000000000401766   insn: 8b 45 fc
               0000000000401769   insn: 83 e0 01
               000000000040176c   insn: 85 c0
               000000000040176e   insn: 74 06                  br_cntr: A   # PRED 1 cycles [7] 4.00 IPC
               0000000000401776   insn: 83 45 fc 01
               000000000040177a   insn: 81 7d fc 0f 27 00 00
               0000000000401781   insn: 7e e3                  br_cntr: A   # PRED 7 cycles [14] 0.43 IPC
      
       - Support Timed PEBS (Precise Event-Based Sampling), a recent hardware
         feature in Intel processors
      
       - Add 'perf ftrace profile' subcommand, using ftrace's function-graph
         tracer so that users can see the total, average, max execution time
         as well as the number of invocations easily, for instance:
      
           $ sudo perf ftrace profile -G __x64_sys_perf_event_open -- \
             perf stat -e cycles -C1 true 2> /dev/null | head
           # Total (us)  Avg (us)  Max (us)  Count  Function
                 65.611    65.611    65.611      1  __x64_sys_perf_event_open
                 30.527    30.527    30.527      1  anon_inode_getfile
                 30.260    30.260    30.260      1  __anon_inode_getfile
                 29.700    29.700    29.700      1  alloc_file_pseudo
                 17.578    17.578    17.578      1  d_alloc_pseudo
                 17.382    17.382    17.382      1  __d_alloc
                 16.738    16.738    16.738      1  kmem_cache_alloc_lru
                 15.686    15.686    15.686      1  perf_event_alloc
                 14.012     7.006    11.264      2  obj_cgroup_charge
      
       - 'perf sched timehist' improvements, including the addition of
         priority showing/filtering command line options
      
       - Various improvements to 'perf probe', including 'perf test'
         regression tests

       - Introduce 'perf check', initially to check whether some feature is
         in place, using it in 'perf test'
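
         A possible invocation (hedged; feature names and the output format
         may differ):

           $ perf check feature libtraceevent
           libtraceevent: [ on ]  # HAVE_LIBTRACEEVENT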
      
       - Various fixes for 32-bit systems
      
       - Address more leak sanitizer failures
      
       - Fix memory leaks (LBR, disasm lock ops, etc)
      
       - More reference counting fixes (branch_info, etc)
      
       - Constify 'struct perf_tool' parameters to improve code generation
         and reduce the chances of having its internals changed, which isn't
         expected
      
       - More constifications in various other places
      
       - Add more build tests, including for JEVENTS
      
       - Add more 'perf test' entries ('perf record LBR', pipe/inject,
         --setup-filter, 'perf ftrace', 'cgroup sampling', etc)
      
       - Inject build ids for all entries in a call chain in 'perf inject',
         not just for the main sample
      
       - Improve the BPF based sample filter, allowing root to setup filters
         in bpffs that then can be used by non-root users
      
       - Allow filtering by cgroups with the BPF based sample filter
      
       - Allow a more compact output for 'perf mem report' using
         -T/--type-profile, and also provide a --sort option, similar to the
         one in 'perf report' and 'perf top', to set up the sort order
         manually
      
       - Fix --group behavior in 'perf annotate' when leader has no samples,
         where it was not showing anything even when other events in the group
         had samples
      
       - Fix spinlock and rwlock accounting in 'perf lock contention'
      
       - Fix libsubcmd fixdep Makefile dependencies
      
       - Improve 'perf ftrace' error message when ftrace isn't available
      
       - Update various Intel JSON vendor event files
      
       - ARM64 CoreSight hardware tracing infrastructure improvements, mostly
         not visible to users
      
       - Update power10 JSON events
      
      * tag 'perf-tools-for-v6.12-1-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (310 commits)
        perf trace: Mark the 'head' arg in the set_robust_list syscall as coming from user space
        perf trace: Mark the 'rseq' arg in the rseq syscall as coming from user space
        perf env: Find correct branch counter info on hybrid
        perf evlist: Print hint for group
        tools: Drop nonsensical -O6
        perf pmu: To info add event_type_desc
        perf evsel: Add accessor for tool_event
        perf pmus: Fake PMU clean up
        perf list: Avoid potential out of bounds memory read
        perf help: Fix a typo ("bellow")
        perf ftrace: Detect whether ftrace is enabled on system
        perf test shell probe_vfs_getname: Remove extraneous '=' from probe line number regex
        perf build: Require at least clang 16.0.6 to build BPF skeletons
        perf trace: If a syscall arg is marked as 'const', assume it is coming _from_ userspace
        perf parse-events: Remove duplicated include in parse-events.c
        perf callchain: Allow symbols to be optional when resolving a callchain
        perf inject: Lazy build-id mmap2 event insertion
        perf inject: Add new mmap2-buildid-all option
        perf inject: Fix build ID injection
        perf annotate-data: Add pr_debug_scope()
        ...
    • perf: Fix topology_sibling_cpumask check warning on ARM · 673a5009
      Kan Liang authored
      The below warning is triggered when building with arm
      multi_v7_defconfig.
      
        kernel/events/core.c: In function 'perf_event_setup_cpumask':
        kernel/events/core.c:14012:13: warning: the comparison will always evaluate as 'true' for the address of 'thread_sibling' will never be NULL [-Waddress]
        14012 |         if (!topology_sibling_cpumask(cpu)) {
      
      perf_event_init_cpu() may be invoked at the early boot stage, while
      the topology_*_cpumask hasn't been initialized yet.  The check is
      there to specially handle that case and initialize the
      perf_online_<domain>_masks on the boot CPU.
      
      X86 uses a per-cpu cpumask pointer, which could be NULL at the early
      boot stage.  However, ARM uses a global variable, which can never be
      NULL.
      
      Use perf_online_mask as an indicator instead.  Only initialize the
      perf_online_<domain>_masks when perf_online_mask is empty.
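
      In sketch form (hedged; using the names from this message, the exact
      code may differ):

      	/* perf_online_mask is empty until the boot CPU has set it up */
      	if (!cpumask_empty(perf_online_mask))
      		return;

      	/* first time here: initialize the perf_online_<domain>_masks */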
      
      Fix a typo as well.
      
      Fixes: 4ba4f1af ("perf: Generic hotplug support for a PMU with a scope")
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Closes: https://lore.kernel.org/lkml/20240911153854.240bbc1f@canb.auug.org.au/
      Reported-by: Steven Price <steven.price@arm.com>
      Closes: https://lore.kernel.org/lkml/1835eb6d-3e05-47f3-9eae-507ce165c3bf@arm.com/
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 21 Sep, 2024 28 commits