  1. 02 Nov, 2023 7 commits
    • bcachefs: Enumerate fsck errors · b65db750
      Kent Overstreet authored
      This patch adds a superblock error counter for every distinct fsck
      error; this means that when analyzing filesystems out in the wild we'll
      be able to see what sorts of inconsistencies are being found and repaired,
      and hence what bugs to look for.
      
      Errors validating bkeys are not yet considered distinct fsck errors, but
      this patch adds a new helper, bkey_fsck_err(), in order to add distinct
      error types for them as well.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
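One way to give every distinct error its own identifier is an x-macro list, from which both an enum and a name table are generated. The sketch below is a minimal illustration under that assumption; the macro, the error names, and the helper `example_fsck_err_str()` are all hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical error list: names and numbers are illustrative only. */
#define EXAMPLE_FSCK_ERRS()            \
	x(example_bad_checksum,    0)  \
	x(example_dangling_dirent, 1)  \
	x(example_wrong_i_size,    2)

/* Generate one enum constant per listed error. */
enum example_fsck_err_id {
#define x(name, nr) EXAMPLE_ERR_##name = nr,
	EXAMPLE_FSCK_ERRS()
#undef x
	EXAMPLE_ERR_NR
};

/* Generate a matching table of printable names from the same list. */
static const char * const example_fsck_err_names[] = {
#define x(name, nr) [nr] = #name,
	EXAMPLE_FSCK_ERRS()
#undef x
};

/* Map an id to its name, tolerating ids this build does not know about. */
static const char *example_fsck_err_str(unsigned id)
{
	return id < EXAMPLE_ERR_NR ? example_fsck_err_names[id] : "(unknown)";
}
```

Keeping the list in one place means a new error type only needs one added line, and the explicit numbers stay stable if entries are reordered.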
    • bcachefs: bch_sb_field_errors · f5d26fa3
      Kent Overstreet authored
      Add a new superblock section to keep counts of errors seen since
      filesystem creation: we'll be adding counters for every distinct fsck
      error.
      
      The new superblock section has entries of the form [ id, count,
      time_of_last_error ]; this is intended to let us see what errors are
      occurring - and getting fixed - via show-super output.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
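The [ id, count, time_of_last_error ] entry form can be pictured with a small in-memory table. This is only a sketch: the struct layout, field widths, fixed capacity, and the helper `example_err_count()` are assumptions made for illustration, not the on-disk format introduced by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry mirroring the [ id, count, time_of_last_error ] form. */
struct example_err_entry {
	uint16_t id;
	uint64_t count;
	uint64_t time_of_last_error;
};

#define EXAMPLE_MAX_ERRS 8	/* arbitrary capacity for the sketch */

struct example_err_table {
	size_t nr;
	struct example_err_entry e[EXAMPLE_MAX_ERRS];
};

/*
 * Bump the counter for @id, appending a new entry the first time it is seen.
 * Returns 0 on success, -1 if the table is full.
 */
static int example_err_count(struct example_err_table *t, uint16_t id, uint64_t now)
{
	size_t i;

	for (i = 0; i < t->nr; i++)
		if (t->e[i].id == id)
			goto found;

	if (t->nr == EXAMPLE_MAX_ERRS)
		return -1;

	i = t->nr++;
	t->e[i] = (struct example_err_entry) { .id = id };
found:
	t->e[i].count++;
	t->e[i].time_of_last_error = now;
	return 0;
}
```

Only errors that have actually occurred take up an entry, so a filesystem that has never hit an inconsistency carries an empty table.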
    • bcachefs: Add IO error counts to bch_member · 94119eeb
      Kent Overstreet authored
      We now track IO errors per device since filesystem creation.
      
      IO error counts can be viewed in sysfs, or with the 'bcachefs
      show-super' command.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • 5394fe94
    • bcachefs: Fix a kasan splat in bch2_dev_add() · e8484348
      Kent Overstreet authored
      This fixes a use after free - mi is dangling after the resize call.
      
      Additionally, resizing the device's member info section was useless - we
      were attempting to preallocate the space required before adding it to
      the filesystem superblock, but there's other sections that we should
      have been preallocating as well for that to work.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
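The class of bug described here, a pointer left dangling because the buffer it points into was resized, can be shown in userspace with `realloc()`. The types and helpers below are hypothetical and only illustrate the general pattern; they do not reproduce the code in bch2_dev_add() or the way the patch fixes it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container whose backing buffer may move when resized. */
struct example_sb {
	size_t nr_members;
	unsigned *members;
};

/* Grow the member array; pointers into the old buffer are invalid afterwards. */
static int example_sb_resize(struct example_sb *sb, size_t nr)
{
	unsigned *n = realloc(sb->members, nr * sizeof(*n));

	if (!n)
		return -1;
	if (nr > sb->nr_members)
		memset(n + sb->nr_members, 0, (nr - sb->nr_members) * sizeof(*n));
	sb->members = n;
	sb->nr_members = nr;
	return 0;
}

/* Append a member, looking the slot up only after the resize has happened. */
static int example_sb_add_member(struct example_sb *sb, unsigned val)
{
	size_t idx = sb->nr_members;	/* remember an index, not a pointer */

	if (example_sb_resize(sb, idx + 1))
		return -1;
	sb->members[idx] = val;		/* fresh lookup through sb->members */
	return 0;
}
```

Holding an index across the resize instead of a pointer is one common way to avoid the use-after-free; dropping the resize altogether, as the message suggests was appropriate here, avoids it as well.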
    • bcachefs: Fix kasan splat in members_v1_get() · 5c1ab40e
      Kent Overstreet authored
      This fixes an incorrect memcpy() in the recent members_v2 code - a
      members_v1 member is BCH_MEMBER_V1_BYTES, not sizeof(struct bch_member).
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
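The size mismatch can be illustrated with a record format that is shorter than the current in-memory struct. Everything in this sketch is hypothetical (the struct fields, the 16-byte v1 size, and the helper name); it only shows why the copy length must be the old record size rather than `sizeof()` of the newer struct.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical current layout: one field was added after v1 was defined. */
struct example_member {
	uint64_t uuid;
	uint64_t nbuckets;
	uint64_t errors;	/* not present in the v1 record */
};

#define EXAMPLE_MEMBER_V1_BYTES 16	/* uuid + nbuckets only */

/* Read the i-th v1 record from a packed array into a zero-extended struct. */
static struct example_member example_member_v1_get(const uint8_t *v1, unsigned i)
{
	struct example_member m;

	memset(&m, 0, sizeof(m));
	/*
	 * Copy only the v1 record size. Using sizeof(m) here would read into
	 * the next record, or past the end of the buffer for the last one.
	 */
	memcpy(&m, v1 + (size_t) i * EXAMPLE_MEMBER_V1_BYTES, EXAMPLE_MEMBER_V1_BYTES);
	return m;
}
```

Fields that did not exist in the old format come back as zero, which is the usual convention for extending a fixed-size record.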
    • bcachefs: rebalance_work · fb3f57bb
      Kent Overstreet authored
      This adds a new btree, rebalance_work, to eliminate scanning required
      for finding extents that need work done on them in the background - i.e.
      for the background_target and background_compression options.
      
      rebalance_work is a bitset btree, where a KEY_TYPE_set corresponds to an
      extent in the extents or reflink btree at the same pos.
      
      A new extent field is added, bch_extent_rebalance, which indicates that
      this extent has work that needs to be done in the background - and which
      options to use. This allows per-inode options to be propagated to
      indirect extents - at least in some circumstances. In this patch,
      changing IO options on a file will not propagate the new options to
      indirect extents pointed to by that file.
      
      Updating (setting/clearing) the rebalance_work btree is done by the
      extent trigger, which looks at the bch_extent_rebalance field.
      
      Scanning is still required after changing IO path options - either just
      for a given inode, or for the whole filesystem. We indicate that
      scanning is required by adding a KEY_TYPE_cookie key to the
      rebalance_work btree: the cookie counter is so that we can detect that
      scanning is still required when an option has been flipped mid-way
      through an existing scan.
      
      Future possible work:
       - Propagate options to indirect extents when being changed
       - Add other IO path options - nr_replicas, ec, to rebalance_work so
         they can be applied in the background when they change
       - Add a counter, for bcachefs fs usage output, showing the pending
         amount of rebalance work: we'll probably want to do this after the
         disk space accounting rewrite (moving it to a new btree)
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
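The role of the cookie counter, detecting that an option changed while a scan was already running, can be sketched with a plain counter and a pending flag. This is a simplified userspace model with hypothetical names; it does not show how KEY_TYPE_cookie keys are stored or updated in the rebalance_work btree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cookie key. */
struct example_scan_cookie {
	uint64_t counter;	/* bumped on every option change */
	bool pending;		/* true while a scan is still required */
};

/* Called when an IO path option changes: request (or re-request) a scan. */
static void example_option_changed(struct example_scan_cookie *c)
{
	c->counter++;
	c->pending = true;
}

/* A scan remembers the counter value it started with. */
static uint64_t example_scan_begin(const struct example_scan_cookie *c)
{
	return c->counter;
}

/*
 * Finish a scan. The request is cleared only if no option changed while the
 * scan ran; otherwise the scan must be repeated. Returns true when done.
 */
static bool example_scan_end(struct example_scan_cookie *c, uint64_t seen)
{
	if (c->counter != seen)
		return false;	/* option flipped mid-scan */
	c->pending = false;
	return true;
}
```

A plain boolean would not be enough: a scan that finishes after a second option change would clear the flag even though it may have missed extents affected by that change.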
  2. 31 Oct, 2023 27 commits
  3. 30 Oct, 2023 6 commits
    • Merge tag 'objtool-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · cd063c8b
      Linus Torvalds authored
      Pull objtool updates from Ingo Molnar:
       "Misc fixes and cleanups:
      
         - Fix potential MAX_NAME_LEN limit related build failures
      
         - Fix scripts/faddr2line symbol filtering bug
      
         - Fix scripts/faddr2line on LLVM=1
      
         - Fix scripts/faddr2line to accept readelf output with mapping
           symbols
      
         - Minor cleanups"
      
      * tag 'objtool-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        scripts/faddr2line: Skip over mapping symbols in output from readelf
        scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1
        scripts/faddr2line: Don't filter out non-function symbols from readelf
        objtool: Remove max symbol name length limitation
        objtool: Propagate early errors
        objtool: Use 'the fallthrough' pseudo-keyword
        x86/speculation, objtool: Use absolute relocations for annotations
        x86/unwind/orc: Remove redundant initialization of 'mid' pointer in __orc_find()
    • Merge tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 63ce50ff
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
       "Fair scheduler (SCHED_OTHER) improvements:
         - Remove the old and now unused SIS_PROP code & option
         - Scan cluster before LLC in the wake-up path
         - Use candidate prev/recent_used CPU if scanning failed for cluster
           wakeup
      
        NUMA scheduling improvements:
         - Improve the VMA access-PID code to better skip/scan VMAs
         - Extend tracing to cover VMA-skipping decisions
         - Improve/fix the recently introduced sched_numa_find_nth_cpu() code
         - Generalize numa_map_to_online_node()
      
        Energy scheduling improvements:
         - Remove the EM_MAX_COMPLEXITY limit
         - Add tracepoints to track energy computation
         - Make the behavior of the 'sched_energy_aware' sysctl more
           consistent
         - Consolidate and clean up access to a CPU's max compute capacity
         - Fix uclamp code corner cases
      
        RT scheduling improvements:
         - Drive dl_rq->overloaded with dl_rq->pushable_dl_tasks updates
         - Drive the ->rto_mask with rt_rq->pushable_tasks updates
      
        Scheduler scalability improvements:
         - Rate-limit updates to tg->load_avg
         - On x86 disable IBRS when CPU is offline to improve single-threaded
           performance
         - Micro-optimize in_task() and in_interrupt()
         - Micro-optimize the PSI code
         - Avoid updating PSI triggers and ->rtpoll_total when there are no
           state changes
      
        Core scheduler infrastructure improvements:
         - Use saved_state to reduce some spurious freezer wakeups
         - Bring in a handful of fast-headers improvements to scheduler
           headers
         - Make the scheduler UAPI headers more widely usable by user-space
         - Simplify the control flow of scheduler syscalls by using lock
           guards
         - Fix sched_setaffinity() vs. CPU hotplug race
      
        Scheduler debuggability improvements:
         - Disallow writing invalid values to sched_rt_period_us
         - Fix a race in the rq-clock debugging code triggering warnings
         - Fix a warning in the bandwidth distribution code
         - Micro-optimize in_atomic_preempt_off() checks
         - Enforce that the tasklist_lock is held in for_each_thread()
         - Print the TGID in sched_show_task()
         - Remove the /proc/sys/kernel/sched_child_runs_first sysctl
      
        ... and misc cleanups & fixes"
      
      * tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
        sched/fair: Remove SIS_PROP
        sched/fair: Use candidate prev/recent_used CPU if scanning failed for cluster wakeup
        sched/fair: Scan cluster before scanning LLC in wake-up path
        sched: Add cpus_share_resources API
        sched/core: Fix RQCF_ACT_SKIP leak
        sched/fair: Remove unused 'curr' argument from pick_next_entity()
        sched/nohz: Update comments about NEWILB_KICK
        sched/fair: Remove duplicate #include
        sched/psi: Update poll => rtpoll in relevant comments
        sched: Make PELT acronym definition searchable
        sched: Fix stop_one_cpu_nowait() vs hotplug
        sched/psi: Bail out early from irq time accounting
        sched/topology: Rename 'DIE' domain to 'PKG'
        sched/psi: Delete the 'update_total' function parameter from update_triggers()
        sched/psi: Avoid updating PSI triggers and ->rtpoll_total when there are no state changes
        sched/headers: Remove comment referring to rq::cpu_load, since this has been removed
        sched/numa: Complete scanning of inactive VMAs when there is no alternative
        sched/numa: Complete scanning of partial VMAs regardless of PID activity
        sched/numa: Move up the access pid reset logic
        sched/numa: Trace decisions related to skipping VMAs
        ...
    • Merge tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3cf3fabc
      Linus Torvalds authored
      Pull locking updates from Ingo Molnar:
       "Futex improvements:
      
         - Add the 'futex2' syscall ABI, which is an attempt to get away from
           the multiplex syscall and adds a little room for extensions, while
           lifting some limitations.
      
         - Fix futex PI recursive rt_mutex waiter state bug
      
         - Fix inter-process shared futexes on no-MMU systems
      
         - Use folios instead of pages
      
        Micro-optimizations of locking primitives:
      
         - Improve arch_spin_value_unlocked() on asm-generic ticket spinlock
           architectures, to improve lockref code generation
      
         - Improve the x86-32 lockref_get_not_zero() main loop by adding
           build-time CMPXCHG8B support detection for the relevant lockref
           code, and by better interfacing the CMPXCHG8B assembly code with
           the compiler
      
         - Introduce arch_sync_try_cmpxchg() on x86 to improve
           sync_try_cmpxchg() code generation. Convert some sync_cmpxchg()
           users to sync_try_cmpxchg().
      
         - Micro-optimize rcuref_put_slowpath()
      
        Locking debuggability improvements:
      
         - Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well
      
         - Enforce atomicity of sched_submit_work(), which is de-facto atomic
           but was un-enforced previously.
      
         - Extend <linux/cleanup.h>'s no_free_ptr() with __must_check
           semantics
      
         - Fix ww_mutex self-tests
      
         - Clean up const-propagation in <linux/seqlock.h> and simplify the
           API-instantiation macros a bit
      
        RT locking improvements:
      
         - Provide the rt_mutex_*_schedule() primitives/helpers and use them
           in the rtmutex code to avoid recursion vs. rtlock on the PI state.
      
         - Add nested blocking lockdep asserts to rt_mutex_lock(),
           rtlock_lock() and rwbase_read_lock()
      
        .. plus misc fixes & cleanups"
      
      * tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
        futex: Don't include process MM in futex key on no-MMU
        locking/seqlock: Fix grammar in comment
        alpha: Fix up new futex syscall numbers
        locking/seqlock: Propagate 'const' pointers within read-only methods, remove forced type casts
        locking/lockdep: Fix string sizing bug that triggers a format-truncation compiler-warning
        locking/seqlock: Change __seqprop() to return the function pointer
        locking/seqlock: Simplify SEQCOUNT_LOCKNAME()
        locking/atomics: Use atomic_try_cmpxchg_release() to micro-optimize rcuref_put_slowpath()
        locking/atomic, xen: Use sync_try_cmpxchg() instead of sync_cmpxchg()
        locking/atomic/x86: Introduce arch_sync_try_cmpxchg()
        locking/atomic: Add generic support for sync_try_cmpxchg() and its fallback
        locking/seqlock: Fix typo in comment
        futex/requeue: Remove unnecessary ‘NULL’ initialization from futex_proxy_trylock_atomic()
        locking/local, arch: Rewrite local_add_unless() as a static inline function
        locking/debug: Fix debugfs API return value checks to use IS_ERR()
        locking/ww_mutex/test: Make sure we bail out instead of livelock
        locking/ww_mutex/test: Fix potential workqueue corruption
        locking/ww_mutex/test: Use prng instead of rng to avoid hangs at bootup
        futex: Add sys_futex_requeue()
        futex: Add flags2 argument to futex_requeue()
        ...
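The conversions from cmpxchg-style to try_cmpxchg-style helpers mentioned above follow a general pattern: the "try" form returns a success flag and refreshes the expected value on failure, so retry loops need no separate reload or comparison. The sketch below shows that shape with C11 atomics in userspace; `example_add_unless()` is a hypothetical helper, and `atomic_compare_exchange_weak()` stands in for the kernel primitives, which are not used here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Add @a to @v unless it currently equals @u; returns true if the add happened. */
static bool example_add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	do {
		if (old == u)
			return false;
		/* On failure 'old' is updated to the current value, so no reload is needed. */
	} while (!atomic_compare_exchange_weak(v, &old, old + a));

	return true;
}
```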
    • Merge tag 'x86_fpu_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9cda4eb0
      Linus Torvalds authored
      Pull x86 fpu fixlet from Borislav Petkov:
      
       - kernel-doc fix
      
      * tag 'x86_fpu_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/fpu/xstate: Address kernel-doc warning
    • Merge tag 'x86_platform_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f155f3b3
      Linus Torvalds authored
      Pull x86 platform updates from Borislav Petkov:
      
       - Make sure PCI function 4 IDs of AMD family 0x19, models 0x60-0x7f are
         actually used in the amd_nb.c enumeration
      
       - Add support for extracting NUMA information from devicetree for
         Hyper-V usages
      
       - Add PCI device IDs for the new AMD MI300 AI accelerators
      
       - Annotate an array in struct uv_rtc_timer_head with the new
         __counted_by attribute
      
       - Rework UV's NMI action parameter handling
      
      * tag 'x86_platform_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/amd_nb: Use Family 19h Models 60h-7Fh Function 4 IDs
        x86/numa: Add Devicetree support
        x86/of: Move the x86_flattree_get_config() call out of x86_dtb_init()
        x86/amd_nb: Add AMD Family MI300 PCI IDs
        x86/platform/uv: Annotate struct uv_rtc_timer_head with __counted_by
        x86/platform/uv: Rework NMI "action" modparam handling
    • Merge tag 'x86_cpu_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ca2e9c3b
      Linus Torvalds authored
      Pull x86 cpuid updates from Borislav Petkov:
      
       - Make sure the "svm" feature flag is cleared from /proc/cpuinfo when
         virtualization support is disabled in the BIOS on AMD and Hygon
         platforms
      
       - A minor cleanup
      
      * tag 'x86_cpu_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu/amd: Remove redundant 'break' statement
        x86/cpu: Clear SVM feature if disabled by BIOS