1. 15 Feb, 2024 1 commit
  2. 14 Feb, 2024 2 commits
    • Merge tag 'for-6.8-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 1f3a3e2a
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "A few regular fixes and one fix for space reservation regression since
        6.7 that users have been reporting:
      
         - fix over-reservation of metadata chunks due to not keeping proper
           balance between global block reserve and delayed refs reserve; in
           practice this leaves behind empty metadata block groups, the
           workaround is to reclaim them by using the '-musage=1' balance
           filter
      
         - other space reservation fixes:
            - do not delete unused block group if it may be used soon
            - do not reserve space for checksums for NOCOW files
      
         - fix extent map assertion failure when writing out free space inode
      
         - reject encoded write if inode has nodatasum flag set
      
         - fix chunk map leak when loading block group zone info"
      
      * tag 'for-6.8-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: don't refill whole delayed refs block reserve when starting transaction
        btrfs: zoned: fix chunk map leak when loading block group zone info
        btrfs: reject encoded write if inode has nodatasum flag set
        btrfs: don't reserve space for checksums when writing to nocow files
        btrfs: add new unused block groups to the list of unused block groups
        btrfs: do not delete unused block group if it may be used soon
        btrfs: add and use helper to check if block group is used
        btrfs: don't drop extent_map for free space inode on write error
    • Merge tag 'linux_kselftest-kunit-fixes-6.8-rc5' of... · 91f842ff
      Linus Torvalds authored
      Merge tag 'linux_kselftest-kunit-fixes-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull KUnit fix from Shuah Khan:
       "One important fix to unregister kunit_bus when KUnit module is
        unloaded.
      
        Not doing so causes an error when KUnit module tries to re-register
        the bus when it gets reloaded"
      
      * tag 'linux_kselftest-kunit-fixes-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        kunit: device: Unregister the kunit_bus on shutdown
  3. 13 Feb, 2024 5 commits
    • btrfs: don't refill whole delayed refs block reserve when starting transaction · 2f6397e4
      Filipe Manana authored
      Since commit 28270e25 ("btrfs: always reserve space for delayed refs
      when starting transaction") we started not only to reserve metadata space
      for the delayed refs a caller of btrfs_start_transaction() might generate
      but also to try to fully refill the delayed refs block reserve, because
      there are several cases where we generate delayed refs and haven't reserved
      space for them, relying on the global block reserve. Relying too much on
      the global block reserve is not always safe, and can result in hitting
      -ENOSPC during transaction commits or worse, in rare cases, being unable
      to mount a filesystem that needs to do orphan cleanup or anything that
      requires modifying the filesystem during mount, and has no more
      unallocated space and the metadata space is nearly full. This was
      explained in detail in that commit's change log.
      
      However the gap between the reserved amount and the size of the delayed
      refs block reserve can be huge, so attempting to reserve space for such
      a gap can result in allocating many metadata block groups that end up
      not being used. After a recent patch, with the subject:
      
        "btrfs: add new unused block groups to the list of unused block groups"
      
      We started to add new block groups that are unused to the list of unused
      block groups, to avoid having them around for a very long time in case
      they are never used, because a block group is only added to the list of
      unused block groups when we deallocate the last extent or when mounting
      the filesystem and the block group has 0 bytes used. This is not a problem
      introduced by the commit mentioned earlier, it always existed as our
      metadata space reservations are, most of the time, pessimistic and end up
      not using all the space they reserved, so we can occasionally end up with
      one or two unused metadata block groups for a long period. However after
      that commit mentioned earlier, we are just more pessimistic in the
      metadata space reservations when starting a transaction and therefore the
      issue is more likely to happen.
      
      This however is not always enough because we might create unused metadata
      block groups when reserving metadata space at a high rate if there's
      always a gap in the delayed refs block reserve and the cleaner kthread
      isn't triggered often enough or is busy with other work (running delayed
      iputs, cleaning deleted roots, etc), not to mention the block group's
      allocated space is only usable for a new block group after the transaction
      used to remove it is committed.
      
      A user reported that he's getting a lot of allocated metadata block groups
      but the usage percentage of metadata space was very low compared to the
      total allocated space, especially after running a series of block group
      relocations.
      
      So for now stop trying to refill the gap in the delayed refs block reserve
      and reserve space only for the delayed refs we are expected to generate
      when starting a transaction.
      
      CC: stable@vger.kernel.org # 6.7+
      Reported-by: Ivan Shapovalov <intelfx@intelfx.name>
      Link: https://lore.kernel.org/linux-btrfs/9cdbf0ca9cdda1b4c84e15e548af7d7f9f926382.camel@intelfx.name/
      Link: https://lore.kernel.org/linux-btrfs/CAL3q7H6802ayLHUJFztzZAVzBLJAGdFx=6FHNNy87+obZXXZpQ@mail.gmail.com/
      Tested-by: Ivan Shapovalov <intelfx@intelfx.name>
      Reported-by: Heddxh <g311571057@gmail.com>
      Link: https://lore.kernel.org/linux-btrfs/CAE93xANEby6RezOD=zcofENYZOT-wpYygJyauyUAZkLv6XVFOA@mail.gmail.com/
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: zoned: fix chunk map leak when loading block group zone info · 88e81a67
      Filipe Manana authored
      At btrfs_load_block_group_zone_info() we never drop a reference on the
      chunk map we have looked up, therefore leaking a reference on it. So
      add the missing btrfs_free_chunk_map() at the end of the function.
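
The shape of this kind of leak can be illustrated with a minimal userspace sketch. `struct chunk_map`, `lookup_chunk_map()`, `free_chunk_map()` and the two `load_zone_info_*()` functions are hypothetical stand-ins, not the btrfs API; the only assumption carried over from the description above is that a lookup takes a reference which the caller must drop.

```c
/* Hypothetical refcounted object standing in for a chunk map. */
struct chunk_map {
	int refs;
};

/* A lookup hands the object back with one extra reference taken. */
static struct chunk_map *lookup_chunk_map(struct chunk_map *map)
{
	map->refs++;
	return map;
}

/* Drop one reference (a real implementation would free at zero). */
static void free_chunk_map(struct chunk_map *map)
{
	map->refs--;
}

/* Leaky shape: the reference taken by the lookup is never dropped. */
static int load_zone_info_leaky(struct chunk_map *cache)
{
	struct chunk_map *map = lookup_chunk_map(cache);

	(void)map;	/* ... use map ... */
	return 0;
}

/* Fixed shape: drop the reference at the end of the function. */
static int load_zone_info_fixed(struct chunk_map *cache)
{
	struct chunk_map *map = lookup_chunk_map(cache);

	/* ... use map ... */
	free_chunk_map(map);
	return 0;
}
```

In this model every call to the leaky variant raises `refs` by one permanently, while the fixed variant leaves the count unchanged.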
      
      Fixes: 7dc66abb ("btrfs: use a dedicated data structure for chunk maps")
      Reported-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: reject encoded write if inode has nodatasum flag set · 1bd96c92
      Filipe Manana authored
      Currently we allow an encoded write against inodes that have the NODATASUM
      flag set, either because they are NOCOW files or they were created while
      the filesystem was mounted with "-o nodatasum". This results in having
      compressed extents without corresponding checksums, which is a filesystem
      inconsistency reported by 'btrfs check'.
      
      For example, running btrfs/281 with MOUNT_OPTIONS="-o nodatacow" triggers
      this and 'btrfs check' errors out with:
      
         [1/7] checking root items
         [2/7] checking extents
         [3/7] checking free space tree
         [4/7] checking fs roots
         root 256 inode 257 errors 1040, bad file extent, some csum missing
         root 256 inode 258 errors 1040, bad file extent, some csum missing
         ERROR: errors found in fs roots
         (...)
      
      So reject encoded writes if the target inode has NODATASUM set.
      
      CC: stable@vger.kernel.org # 6.1+
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: don't reserve space for checksums when writing to nocow files · feefe1f4
      Filipe Manana authored
      Currently when doing a write to a file we always reserve metadata space
      for inserting data checksums. However we don't need to do it if we have
      a nodatacow file (-o nodatacow mount option or chattr +C) or if checksums
      are disabled (-o nodatasum mount option), as in that case we are only
      adding unnecessary pressure to metadata reservations.
      
      For example on x86_64, with the default node size of 16K, a 4K buffered
      write into a nodatacow file is reserving 655360 bytes of metadata space,
      as it's accounting for checksums. After this change, which stops reserving
      space for checksums if we have a nodatacow file or checksums are disabled,
      we only need to reserve 393216 bytes of metadata.
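
The two figures can be reproduced with a small worked calculation. The breakdown below is an assumption made for illustration, not something stated in the commit message: it supposes a worst-case tree height of 8 levels, that inserting an item is charged twice a full path (to allow for node splits), and that the write is charged for one file extent item, optionally one checksum leaf, and one inode item update. All function and macro names are hypothetical.

```c
#include <stdint.h>

#define MAX_LEVEL 8	/* assumed worst-case b-tree height */

/* Assumed worst case for an insertion: a full path, doubled for splits. */
static uint64_t insert_metadata_size(uint64_t nodesize, unsigned int num_items)
{
	return nodesize * MAX_LEVEL * 2 * num_items;
}

/* Assumed worst case for updating an existing item: one full path. */
static uint64_t update_metadata_size(uint64_t nodesize, unsigned int num_items)
{
	return nodesize * MAX_LEVEL * num_items;
}

/* Metadata bytes reserved for one small buffered write in this model. */
static uint64_t write_reservation(uint64_t nodesize, int reserve_csums)
{
	uint64_t bytes = insert_metadata_size(nodesize, 1);	/* file extent item */

	if (reserve_csums)
		bytes += insert_metadata_size(nodesize, 1);	/* checksum leaf */
	bytes += update_metadata_size(nodesize, 1);		/* inode item update */
	return bytes;
}
```

With a 16K node size this model gives 262144 + 262144 + 131072 = 655360 bytes when checksums are accounted for and 262144 + 131072 = 393216 bytes when they are not, matching the numbers quoted above.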
      
      CC: stable@vger.kernel.org # 6.1+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • Merge tag 'trace-tools-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 7e90b5c2
      Linus Torvalds authored
      Pull tracing tooling fixes from Steven Rostedt:
       "RTLA:
      
         - rtla tools are exiting with a positive value when usage() is
           called. Make them return 0 if the usage was called via -h/--help
      
         - the -P priority sets the sched priority for rtla workload. When the
           SCHED_OTHER scheduler is selected, it sets the rt_priority instead
           of the nice parameter. Setting the nice value is the correct thing,
           so fix it
      
         - rtla is failing to compile with clang due to unsupported options
           from gcc. Adjusting the compiler/linker options makes clang work
           properly
      
         - Remove the unused sched_getattr() function in utils.c
      
         - Fixes for variable initialization and size, reported by clang
      
        Verification:
      
         - rv is failing to compile with clang due to unsupported options from
           gcc. Adjusting the compiler/linker options makes clang work
           properly
      
         - Fix an uninitialized variable in in_kernel.c reported by clang"
      
      * tag 'trace-tools-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tools/rtla: Exit with EXIT_SUCCESS when help is invoked
        tools/rtla: Replace setting prio with nice for SCHED_OTHER
        tools/rv: Fix curr_reactor uninitialized variable
        tools/rv: Fix Makefile compiler options for clang
        tools/rtla: Remove unused sched_getattr() function
        tools/rtla: Fix clang warning about mount_point var size
        tools/rtla: Fix uninitialized bucket/data->bucket_size warning
        tools/rtla: Fix Makefile compiler options for clang
  4. 12 Feb, 2024 10 commits
    • Merge tag 'docs-6.8-fixes2' of git://git.lwn.net/linux · c664e16b
      Linus Torvalds authored
      Pull documentation fix from Jonathan Corbet:
       "A single fix to the kernel_feat extension for a bug that will crash
        the docs build in some situations"
      
      * tag 'docs-6.8-fixes2' of git://git.lwn.net/linux:
        docs: kernel_feat.py: fix build error for missing files
    • Merge tag 'vfs-6.8-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 716f4aaa
      Linus Torvalds authored
      Pull vfs fixes from Christian Brauner:
      
       - Fix performance regression introduced by moving the security
         permission hook out of do_clone_file_range() and into its caller
         vfs_clone_file_range().
      
         This causes the security hook to be called in situations where it
         wasn't called before as the fast permission checks were left in
         do_clone_file_range().
      
         Fix this by merging the two implementations back together and
         restoring the old ordering: fast permission checks first, expensive
         ones later.
      
       - Tweak mount_setattr() permission checking so that mount properties on
         the real rootfs can be changed.
      
         When we added mount_setattr() we added additional checks compared to
         legacy mount(2). If the mount had a parent then verify that the
         caller and the mount namespace the mount is attached to match and if
         not make sure that it's an anonymous mount.
      
         But the real rootfs falls into neither category. It isn't an
         anonymous mount, because it is obviously attached to the initial mount
         namespace, but it also obviously doesn't have a parent mount. So that
         means legacy mount(2) allows changing mount properties on the real
         rootfs but mount_setattr(2) blocks this. This causes regressions (See
         the commit for details).
      
         Fix this by relaxing the check. If the mount has a parent or if it
         isn't a detached mount, verify that the mount namespaces of the
         caller and the mount are the same. Technically, we could probably
         write this even simpler and check that the mount namespaces match if
         it isn't a detached mount. But the slightly longer check makes it
         clearer what conditions one needs to think about.
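
The difference between the two checks can be modelled as a pair of predicates. This is a sketch of one reading of the description above, not the kernel code: `struct mount_model`, its fields, and both function names are hypothetical, and mount namespaces are reduced to integer ids.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a mount, for illustration only. */
struct mount_model {
	bool has_parent;	/* attached below another mount */
	bool detached;		/* anonymous (detached) mount */
	int  ns_id;		/* id of the mount namespace it belongs to */
};

/*
 * Original check as described: with a parent, the namespaces must
 * match; without a parent, the mount must be anonymous.
 */
static bool may_setattr_old(const struct mount_model *m, int caller_ns)
{
	if (m->has_parent)
		return m->ns_id == caller_ns;
	return m->detached;
}

/*
 * Relaxed check as described: if the mount has a parent or is not
 * detached, the namespaces must match; detached mounts pass.
 */
static bool may_setattr_relaxed(const struct mount_model *m, int caller_ns)
{
	if (m->has_parent || !m->detached)
		return m->ns_id == caller_ns;
	return true;
}
```

In this model the real rootfs is `{ .has_parent = false, .detached = false }` in the caller's own namespace: the old predicate rejects it, the relaxed one accepts it, and a caller in a different namespace is still rejected.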
      
      * tag 'vfs-6.8-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        fs: relax mount_setattr() permission checks
        remap_range: merge do_clone_file_range() into vfs_clone_file_range()
    • tools/rtla: Exit with EXIT_SUCCESS when help is invoked · b5f31936
      John Kacur authored
      Fix rtla so that the following commands exit with 0 when help is invoked:
      
      rtla osnoise top -h
      rtla osnoise hist -h
      rtla timerlat top -h
      rtla timerlat hist -h
      
      Link: https://lore.kernel.org/linux-trace-devel/20240203001607.69703-1-jkacur@redhat.com
      
      Cc: stable@vger.kernel.org
      Fixes: 1eeb6328 ("rtla/timerlat: Add timerlat hist mode")
      Signed-off-by: John Kacur <jkacur@redhat.com>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
    • tools/rtla: Replace setting prio with nice for SCHED_OTHER · 14f08c97
      limingming3 authored
      Since the sched_priority for SCHED_OTHER is always 0, it makes no
      sense to set it.
      Setting nice for SCHED_OTHER seems more meaningful.
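
A minimal sketch of the idea, assuming a Linux/glibc userspace: for SCHED_OTHER the user-supplied value is applied as a nice value, while for other policies it is used as the scheduling priority. `apply_priority()` is a hypothetical helper, and it uses the standard `setpriority(2)` and `sched_setscheduler(2)` calls rather than whatever interface rtla itself uses.

```c
#include <sched.h>
#include <sys/resource.h>

/*
 * Apply a user-supplied priority value to the calling process.
 * Returns 0 on success, -1 on error (errno is set by the libc call).
 */
static int apply_priority(int policy, int value)
{
	struct sched_param param = { .sched_priority = value };

	if (policy == SCHED_OTHER) {
		/* sched_priority is always 0 here, so set the nice value. */
		return setpriority(PRIO_PROCESS, 0, value);
	}

	/* Real-time policies take the value as their priority. */
	return sched_setscheduler(0, policy, &param);
}
```

Raising the nice value (for example to 19) needs no privileges, whereas lowering it or selecting a real-time policy usually does.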
      
      Link: https://lkml.kernel.org/r/20240207065142.1753909-1-limingming3@lixiang.com
      
      Cc: stable@vger.kernel.org
      Fixes: b1696371 ("rtla: Helper functions for rtla")
      Signed-off-by: limingming3 <limingming3@lixiang.com>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
    • tools/rv: Fix curr_reactor uninitialized variable · 61ec586b
      Daniel Bristot de Oliveira authored
      clang is reporting:
      
      $ make HOSTCC=clang CC=clang LLVM_IAS=1
      
      clang -O -g -DVERSION=\"6.8.0-rc3\" -flto=auto -fexceptions
      	-fstack-protector-strong -fasynchronous-unwind-tables
      	-fstack-clash-protection  -Wall -Werror=format-security
      	-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
      	$(pkg-config --cflags libtracefs)  -I include
      	-c -o src/in_kernel.o src/in_kernel.c
      [...]
      
      src/in_kernel.c:227:6: warning: variable 'curr_reactor' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
        227 |         if (!end)
            |             ^~~~
      src/in_kernel.c:242:9: note: uninitialized use occurs here
        242 |         return curr_reactor;
            |                ^~~~~~~~~~~~
      src/in_kernel.c:227:2: note: remove the 'if' if its condition is always false
        227 |         if (!end)
            |         ^~~~~~~~~
        228 |                 goto out_free;
            |                 ~~~~~~~~~~~~~
      src/in_kernel.c:221:6: warning: variable 'curr_reactor' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
        221 |         if (!start)
            |             ^~~~~~
      src/in_kernel.c:242:9: note: uninitialized use occurs here
        242 |         return curr_reactor;
            |                ^~~~~~~~~~~~
      src/in_kernel.c:221:2: note: remove the 'if' if its condition is always false
        221 |         if (!start)
            |         ^~~~~~~~~~~
        222 |                 goto out_free;
            |                 ~~~~~~~~~~~~~
      src/in_kernel.c:215:20: note: initialize the variable 'curr_reactor' to silence this warning
        215 |         char *curr_reactor;
            |                           ^
            |                            = NULL
      2 warnings generated.
      
      Which is correct. Setting curr_reactor to NULL avoids the problem.
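
The pattern behind the warning is a shared cleanup label reached by early `goto` exits before the result variable has been assigned. The sketch below shows the fixed shape; it is not the code from in_kernel.c. The bracketed input format, `current_reactor()` and `dup_range()` are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Copy len bytes of s into a new NUL-terminated string. */
static char *dup_range(const char *s, size_t len)
{
	char *p = malloc(len + 1);

	if (p) {
		memcpy(p, s, len);
		p[len] = '\0';
	}
	return p;
}

/*
 * Return a newly allocated copy of the text between '[' and ']' in
 * reactors, or NULL if there is no complete bracketed entry.
 */
static char *current_reactor(const char *reactors)
{
	char *curr_reactor = NULL;	/* defined on every path to out_free */
	char *copy, *start, *end;

	copy = dup_range(reactors, strlen(reactors));
	if (!copy)
		return NULL;

	start = strchr(copy, '[');
	if (!start)
		goto out_free;		/* early exit: result stays NULL */
	start++;

	end = strchr(start, ']');
	if (!end)
		goto out_free;		/* early exit: result stays NULL */

	curr_reactor = dup_range(start, (size_t)(end - start));
out_free:
	free(copy);
	return curr_reactor;
}
```

Without the `= NULL` initializer, both early exits would return an indeterminate pointer, which is exactly what the two warnings point at.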
      
      Link: https://lkml.kernel.org/r/3a35551149e5ee0cb0950035afcb8082c3b5d05b.1707217097.git.bristot@kernel.org
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Bill Wendling <morbo@google.com>
      Cc: Justin Stitt <justinstitt@google.com>
      Cc: Donald Zickus <dzickus@redhat.com>
      Fixes: 6d60f896 ("tools/rv: Add in-kernel monitor interface")
      Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
    • tools/rv: Fix Makefile compiler options for clang · f9b2c871
      Daniel Bristot de Oliveira authored
      The following errors are showing up when compiling rv with clang:
      
       $ make HOSTCC=clang CC=clang LLVM_IAS=1
       [...]
        clang -O -g -DVERSION=\"6.8.0-rc1\" -flto=auto -ffat-lto-objects
        -fexceptions -fstack-protector-strong -fasynchronous-unwind-tables
        -fstack-clash-protection  -Wall -Werror=format-security
        -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
        -Wno-maybe-uninitialized $(pkg-config --cflags libtracefs)
        -I include   -c -o src/utils.o src/utils.c
        clang: warning: optimization flag '-ffat-lto-objects' is not supported [-Wignored-optimization-argument]
        warning: unknown warning option '-Wno-maybe-uninitialized'; did you mean '-Wno-uninitialized'? [-Wunknown-warning-option]
        1 warning generated.
      
        clang -o rv -ggdb  src/in_kernel.o src/rv.o src/trace.o src/utils.o $(pkg-config --libs libtracefs)
        src/in_kernel.o: file not recognized: file format not recognized
        clang: error: linker command failed with exit code 1 (use -v to see invocation)
        make: *** [Makefile:110: rv] Error 1
      
      Solve these issues by:
        - removing -ffat-lto-objects and -Wno-maybe-uninitialized if using clang
        - informing the linker about -flto=auto
      
      Link: https://lkml.kernel.org/r/ed94a8ddc2ca8c8ef663cfb7ae9dd196c4a66b33.1707217097.git.bristot@kernel.org
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Bill Wendling <morbo@google.com>
      Cc: Justin Stitt <justinstitt@google.com>
      Fixes: 4bc4b131 ("rv: Add rv tool")
      Suggested-by: Donald Zickus <dzickus@redhat.com>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
    • tools/rtla: Remove unused sched_getattr() function · 084ce16d
      Daniel Bristot de Oliveira authored
      Clang is reporting:
      
      $ make HOSTCC=clang CC=clang LLVM_IAS=1
      [...]
      clang -O -g -DVERSION=\"6.8.0-rc3\" -flto=auto -fexceptions -fstack-protector-strong -fasynchronous-unwind-tables -fstack-clash-protection  -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS $(pkg-config --cflags libtracefs)    -c -o src/utils.o src/utils.c
      src/utils.c:241:19: warning: unused function 'sched_getattr' [-Wunused-function]
        241 | static inline int sched_getattr(pid_t pid, struct sched_attr *attr,
            |                   ^~~~~~~~~~~~~
      1 warning generated.
      
      Which is correct, so remove the unused function.
      
      Link: https://lkml.kernel.org/r/eaed7ba122c4ae88ce71277c824ef41cbf789385.1707217097.git.bristot@kernel.org
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Bill Wendling <morbo@google.com>
      Cc: Justin Stitt <justinstitt@google.com>
      Cc: Donald Zickus <dzickus@redhat.com>
      Fixes: b1696371 ("rtla: Helper functions for rtla")
      Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
    • tools/rtla: Fix clang warning about mount_point var size · 30369084
      Daniel Bristot de Oliveira authored
      clang is reporting this warning:
      
      $ make HOSTCC=clang CC=clang LLVM_IAS=1
      [...]
      clang -O -g -DVERSION=\"6.8.0-rc3\" -flto=auto -fexceptions
      	-fstack-protector-strong -fasynchronous-unwind-tables
      	-fstack-clash-protection  -Wall -Werror=format-security
      	-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
      	$(pkg-config --cflags libtracefs)    -c -o src/utils.o src/utils.c
      
      src/utils.c:548:66: warning: 'fscanf' may overflow; destination buffer in argument 3 has size 1024, but the corresponding specifier may require size 1025 [-Wfortify-source]
        548 |         while (fscanf(fp, "%*s %" STR(MAX_PATH) "s %99s %*s %*d %*d\n", mount_point, type) == 2) {
            |                                                                         ^
      
      Increase mount_point variable size to MAX_PATH+1 to avoid the overflow.
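
The off-by-one comes from how `%Ns` works: the field width counts only the characters stored, and the terminating NUL is written on top of that. A minimal sketch, assuming `MAX_PATH` is 1024 as the warning text implies; `read_token()` and the `STR_()` helper are hypothetical names.

```c
#include <stdio.h>

#define MAX_PATH 1024
#define STR_(x) #x
#define STR(x) STR_(x)

/*
 * Read one whitespace-delimited token from fp into out.
 * "%1024s" stores up to 1024 characters plus a terminating NUL,
 * so the destination needs MAX_PATH + 1 bytes.
 * Returns 0 on success, -1 if no token could be read.
 */
static int read_token(FILE *fp, char out[MAX_PATH + 1])
{
	return fscanf(fp, "%" STR(MAX_PATH) "s", out) == 1 ? 0 : -1;
}
```

A caller would declare the buffer as `char mount_point[MAX_PATH + 1];`, so that a token of exactly `MAX_PATH` characters still fits together with its terminator.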
      
      Link: https://lkml.kernel.org/r/1b46712e93a2f4153909514a36016959dcc4021c.1707217097.git.bristot@kernel.org
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Bill Wendling <morbo@google.com>
      Cc: Justin Stitt <justinstitt@google.com>
      Cc: Donald Zickus <dzickus@redhat.com>
      Fixes: a957cbc0 ("rtla: Add -C cgroup support")
      Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
    • tools/rtla: Fix uninitialized bucket/data->bucket_size warning · 64dc40f7
      Daniel Bristot de Oliveira authored
      When compiling rtla with clang, I am getting the following warnings:
      
      $ make HOSTCC=clang CC=clang LLVM_IAS=1
      
      [..]
      clang -O -g -DVERSION=\"6.8.0-rc3\" -flto=auto -fexceptions
      	-fstack-protector-strong -fasynchronous-unwind-tables
      	-fstack-clash-protection  -Wall -Werror=format-security
      	-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
      	$(pkg-config --cflags libtracefs)
      	-c -o src/osnoise_hist.o src/osnoise_hist.c
      src/osnoise_hist.c:138:6: warning: variable 'bucket' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
        138 |         if (data->bucket_size)
            |             ^~~~~~~~~~~~~~~~~
      src/osnoise_hist.c:149:6: note: uninitialized use occurs here
        149 |         if (bucket < entries)
            |             ^~~~~~
      src/osnoise_hist.c:138:2: note: remove the 'if' if its condition is always true
        138 |         if (data->bucket_size)
            |         ^~~~~~~~~~~~~~~~~~~~~~
        139 |                 bucket = duration / data->bucket_size;
      src/osnoise_hist.c:132:12: note: initialize the variable 'bucket' to silence this warning
        132 |         int bucket;
            |                   ^
            |                    = 0
      1 warning generated.
      
      [...]
      
      clang -O -g -DVERSION=\"6.8.0-rc3\" -flto=auto -fexceptions
      	-fstack-protector-strong -fasynchronous-unwind-tables
      	-fstack-clash-protection  -Wall -Werror=format-security
      	-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
      	$(pkg-config --cflags libtracefs)
      	-c -o src/timerlat_hist.o src/timerlat_hist.c
      src/timerlat_hist.c:181:6: warning: variable 'bucket' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
        181 |         if (data->bucket_size)
            |             ^~~~~~~~~~~~~~~~~
      src/timerlat_hist.c:204:6: note: uninitialized use occurs here
        204 |         if (bucket < entries)
            |             ^~~~~~
      src/timerlat_hist.c:181:2: note: remove the 'if' if its condition is always true
        181 |         if (data->bucket_size)
            |         ^~~~~~~~~~~~~~~~~~~~~~
        182 |                 bucket = latency / data->bucket_size;
      src/timerlat_hist.c:175:12: note: initialize the variable 'bucket' to silence this warning
        175 |         int bucket;
            |                   ^
            |                    = 0
      1 warning generated.
      
      This is a legit warning, but data->bucket_size is always > 0 (see
      timerlat_hist_parse_args()), so the if is not necessary.
      
      Remove the unneeded if (data->bucket_size) to avoid the warning.
      
      Link: https://lkml.kernel.org/r/6e1b1665cd99042ae705b3e0fc410858c4c42346.1707217097.git.bristot@kernel.org
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Bill Wendling <morbo@google.com>
      Cc: Justin Stitt <justinstitt@google.com>
      Cc: Donald Zickus <dzickus@redhat.com>
      Fixes: 1eeb6328 ("rtla/timerlat: Add timerlat hist mode")
      Fixes: 829a6c0b ("rtla/osnoise: Add the hist mode")
      Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
    • tools/rtla: Fix Makefile compiler options for clang · bc4cbc9d
      Daniel Bristot de Oliveira authored
      The following errors are showing up when compiling rtla with clang:
      
       $ make HOSTCC=clang CC=clang LLVM_IAS=1
       [...]
      
        clang -O -g -DVERSION=\"6.8.0-rc1\" -flto=auto -ffat-lto-objects
      	-fexceptions -fstack-protector-strong
      	-fasynchronous-unwind-tables -fstack-clash-protection  -Wall
      	-Werror=format-security -Wp,-D_FORTIFY_SOURCE=2
      	-Wp,-D_GLIBCXX_ASSERTIONS -Wno-maybe-uninitialized
      	$(pkg-config --cflags libtracefs)    -c -o src/utils.o src/utils.c
      
        clang: warning: optimization flag '-ffat-lto-objects' is not supported [-Wignored-optimization-argument]
        warning: unknown warning option '-Wno-maybe-uninitialized'; did you mean '-Wno-uninitialized'? [-Wunknown-warning-option]
        1 warning generated.
      
        clang -o rtla -ggdb  src/osnoise.o src/osnoise_hist.o src/osnoise_top.o
        src/rtla.o src/timerlat_aa.o src/timerlat.o src/timerlat_hist.o
        src/timerlat_top.o src/timerlat_u.o src/trace.o src/utils.o $(pkg-config --libs libtracefs)
      
        src/osnoise.o: file not recognized: file format not recognized
        clang: error: linker command failed with exit code 1 (use -v to see invocation)
        make: *** [Makefile:110: rtla] Error 1
      
      Solve these issues by:
        - removing -ffat-lto-objects and -Wno-maybe-uninitialized if using clang
        - informing the linker about -flto=auto
      
      Link: https://lore.kernel.org/linux-trace-kernel/567ac1b94effc228ce9a0225b9df7232a9b35b55.1707217097.git.bristot@kernel.org
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Bill Wendling <morbo@google.com>
      Cc: Justin Stitt <justinstitt@google.com>
      Fixes: 1a7b22ab ("tools/rtla: Build with EXTRA_{C,LD}FLAGS")
      Suggested-by: Donald Zickus <dzickus@redhat.com>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
  5. 11 Feb, 2024 3 commits
  6. 10 Feb, 2024 8 commits
    • Merge tag 'mm-hotfixes-stable-2024-02-10-11-16' of... · 7521f258
      Linus Torvalds authored
      Merge tag 'mm-hotfixes-stable-2024-02-10-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull misc fixes from Andrew Morton:
       "21 hotfixes. 12 are cc:stable and the remainder pertain to post-6.7
        issues or aren't considered to be needed in earlier kernel versions"
      
      * tag 'mm-hotfixes-stable-2024-02-10-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits)
        nilfs2: fix potential bug in end_buffer_async_write
        mm/damon/sysfs-schemes: fix wrong DAMOS tried regions update timeout setup
        nilfs2: fix hang in nilfs_lookup_dirty_data_buffers()
        MAINTAINERS: Leo Yan has moved
        mm/zswap: don't return LRU_SKIP if we have dropped lru lock
        fs,hugetlb: fix NULL pointer dereference in hugetlbs_fill_super
        mailmap: switch email address for John Moon
        mm: zswap: fix objcg use-after-free in entry destruction
        mm/madvise: don't forget to leave lazy MMU mode in madvise_cold_or_pageout_pte_range()
        arch/arm/mm: fix major fault accounting when retrying under per-VMA lock
        selftests: core: include linux/close_range.h for CLOSE_RANGE_* macros
        mm/memory-failure: fix crash in split_huge_page_to_list from soft_offline_page
        mm: memcg: optimize parent iteration in memcg_rstat_updated()
        nilfs2: fix data corruption in dsync block recovery for small block sizes
        mm/userfaultfd: UFFDIO_MOVE implementation should use ptep_get()
        exit: wait_task_zombie: kill the no longer necessary spin_lock_irq(siglock)
        fs/proc: do_task_stat: use sig->stats_lock to gather the threads/children stats
        fs/proc: do_task_stat: move thread_group_cputime_adjusted() outside of lock_task_sighand()
        getrusage: use sig->stats_lock rather than lock_task_sighand()
        getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand()
        ...
      7521f258
    • Linus Torvalds's avatar
      Merge tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linux · a5b6244c
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - NVMe pull request via Keith:
           - Update a potentially stale firmware attribute (Maurizio)
           - Fixes for the recent verbose error logging (Keith, Chaitanya)
           - Protection information payload size fix for passthrough (Francis)
      
       - Fix for a queue freezing issue in virtblk (Yi)
      
       - blk-iocost underflow fix (Tejun)
      
       - blk-wbt task detection fix (Jan)
      
      * tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linux:
        virtio-blk: Ensure no requests in virtqueues before deleting vqs.
        blk-iocost: Fix an UBSAN shift-out-of-bounds warning
        nvme: use ns->head->pi_size instead of t10_pi_tuple structure size
        nvme-core: fix comment to reflect right functions
        nvme: move passthrough logging attribute to head
        blk-wbt: Fix detection of dirty-throttled tasks
        nvme-host: fix the updating of the firmware version
      a5b6244c
    • Linus Torvalds's avatar
      Merge tag 'firewire-fixes-6.8-rc4' of... · a38ff5bb
      Linus Torvalds authored
      Merge tag 'firewire-fixes-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
      
      Pull firewire fix from Takashi Sakamoto:
       "A change to accelerate the device detection step in some cases.
      
        In the self-identification step after a bus reset, all nodes on the
        same bus broadcast a selfID packet that includes the gap count value.
        The value is related to the number of cable hops between nodes, and
        is used to calculate the subaction gap and the arbitration reset gap.
      
        When nodes have different gap count values, asynchronous
        communication between them is unreliable, since an asynchronous
        transaction could be interrupted by another asynchronous
        transaction before completion. The gap count inconsistency can be
        resolved in several ways, e.g. by transferring a PHY configuration
        packet and generating a bus reset.
      
        The current implementation of the firewire stack correctly detects
        the gap count inconsistency, but the recovery action tends to be
        delayed until after the configuration ROM of the root node has been
        read. This results in a long time to probe devices with some
        combinations of hardware.
      
        Here the stack is changed to schedule the action as soon as possible"
      
      * tag 'firewire-fixes-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
        firewire: core: send bus reset promptly on gap count error
      a38ff5bb
    • Linus Torvalds's avatar
      Merge tag '6.8-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd · 5a7ec870
      Linus Torvalds authored
      Pull smb server fixes from Steve French:
       "Two ksmbd server fixes:
      
         - memory leak fix
      
         - a minor kernel-doc fix"
      
      * tag '6.8-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
        ksmbd: free aux buffer if ksmbd_iov_pin_rsp_read fails
        ksmbd: Add kernel-doc for ksmbd_extract_sharename() function
      5a7ec870
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 4a7bbe75
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Three small driver fixes and one core fix.
      
        The core fix is a fixup to the one in the last pull request, which
        didn't entirely move the checking of scsi_host_busy() out from under
        the host lock"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: ufs: core: Remove the ufshcd_release() in ufshcd_err_handling_prepare()
        scsi: ufs: core: Fix shift issue in ufshcd_clear_cmd()
        scsi: lpfc: Use unsigned type for num_sge
        scsi: core: Move scsi_host_busy() out of host lock if it is for per-command
      4a7bbe75
    • Linus Torvalds's avatar
      Merge tag '6.8-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · ca00c700
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
      
       - reconnect fix
      
       - multichannel channel selection fix
      
       - minor mount warning fix
      
       - reparse point fix
      
       - null pointer check improvement
      
      * tag '6.8-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        smb3: clarify mount warning
        cifs: handle cases where multiple sessions share connection
        cifs: change tcon status when need_reconnect is set on it
        smb: client: set correct d_type for reparse points under DFS mounts
        smb3: add missing null server pointer check
      ca00c700
    • Linus Torvalds's avatar
      Merge tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-client · e1e3f530
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "Some fscrypt-related fixups (sparse reads are used only for encrypted
        files) and two cap handling fixes from Xiubo and Rishabh"
      
      * tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-client:
        ceph: always check dir caps asynchronously
        ceph: prevent use-after-free in encode_cap_msg()
        ceph: always set initial i_blkbits to CEPH_FSCRYPT_BLOCK_SHIFT
        libceph: just wait for more data to be available on the socket
        libceph: rename read_sparse_msg_*() to read_partial_sparse_msg_*()
        libceph: fail sparse-read if the data length doesn't match
      e1e3f530
    • Linus Torvalds's avatar
      Merge tag 'ntfs3_for_6.8' of https://github.com/Paragon-Software-Group/linux-ntfs3 · a2343df3
      Linus Torvalds authored
      Pull ntfs3 fixes from Konstantin Komarov:
       "Fixed:
         - size update for compressed file
         - some logic errors, overflows
         - memory leak
         - some code was refactored
      
        Added:
         - implement super_operations::shutdown
      
        Improved:
         - alternative boot processing
         - reduced stack usage"
      
      * tag 'ntfs3_for_6.8' of https://github.com/Paragon-Software-Group/linux-ntfs3: (28 commits)
        fs/ntfs3: Slightly simplify ntfs_inode_printk()
        fs/ntfs3: Add ioctl operation for directories (FITRIM)
        fs/ntfs3: Fix oob in ntfs_listxattr
        fs/ntfs3: Fix an NULL dereference bug
        fs/ntfs3: Update inode->i_size after success write into compressed file
        fs/ntfs3: Fixed overflow check in mi_enum_attr()
        fs/ntfs3: Correct function is_rst_area_valid
        fs/ntfs3: Use i_size_read and i_size_write
        fs/ntfs3: Prevent generic message "attempt to access beyond end of device"
        fs/ntfs3: use non-movable memory for ntfs3 MFT buffer cache
        fs/ntfs3: Use kvfree to free memory allocated by kvmalloc
        fs/ntfs3: Disable ATTR_LIST_ENTRY size check
        fs/ntfs3: Fix c/mtime typo
        fs/ntfs3: Add NULL ptr dereference checking at the end of attr_allocate_frame()
        fs/ntfs3: Add and fix comments
        fs/ntfs3: ntfs3_forced_shutdown use int instead of bool
        fs/ntfs3: Implement super_operations::shutdown
        fs/ntfs3: Drop suid and sgid bits as a part of fpunch
        fs/ntfs3: Add file_modified
        fs/ntfs3: Correct use bh_read
        ...
      a2343df3
  7. 09 Feb, 2024 11 commits
    • Linus Torvalds's avatar
      work around gcc bugs with 'asm goto' with outputs · 4356e9f8
      Linus Torvalds authored
      We've had issues with gcc and 'asm goto' before, and we created a
      'asm_volatile_goto()' macro for that in the past: see commits
      3f0116c3 ("compiler/gcc4: Add quirk for 'asm goto' miscompilation
      bug") and a9f18034 ("compiler/gcc4: Make quirk for
      asm_volatile_goto() unconditional").
      
      Then, much later, we ended up removing the workaround in commit
      43c249ea ("compiler-gcc.h: remove ancient workaround for gcc PR
      58670") because we no longer supported building the kernel with the
      affected gcc versions, but we left the macro uses around.
      
      Now, Sean Christopherson reports a new version of a very similar
      problem, which is fixed by re-applying that ancient workaround.  But the
      problem in question is limited to only the 'asm goto with outputs'
      cases, so instead of re-introducing the old workaround as-is, let's
      rename and limit the workaround to just that much less common case.
      
      It looks like there are at least two separate issues that all hit in
      this area:
      
       (a) some versions of gcc don't mark the asm goto as 'volatile' when it
           has outputs:
      
              https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
              https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420
      
           which is easy to work around by just adding the 'volatile' by hand.
      
       (b) Internal compiler errors:
      
              https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422
      
           which are worked around by adding the extra empty 'asm' as a
           barrier, as in the original workaround.
      
      but the problem Sean sees may be a third thing since it involves bad
      code generation (not an ICE) even with the manually added 'volatile'.
      
      But the same old workaround works for this case, even if this feels a
      bit like voodoo programming and may only be hiding the issue.
      Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
      Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Uros Bizjak <ubizjak@gmail.com>
      Cc: Jakub Jelinek <jakub@redhat.com>
      Cc: Andrew Pinski <quic_apinski@quicinc.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4356e9f8
    • Steve French's avatar
      smb3: clarify mount warning · a5cc98eb
      Steve French authored
      When a user tries to use the "sec=krb5p" mount parameter to encrypt
      data on connection to a server (when authenticating with Kerberos), we
      indicate that it is not supported, but do not note the equivalent
      recommended mount parameter ("sec=krb5,seal") which turns on encryption
      for that mount (and uses Kerberos for auth).  Update the warning message.
      Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      a5cc98eb
    • Shyam Prasad N's avatar
      cifs: handle cases where multiple sessions share connection · a39c757b
      Shyam Prasad N authored
      Based on our implementation of multichannel, it is entirely
      possible that a server struct may not be found in any channel
      of an SMB session.
      
      In such cases, we should be prepared to move on and search for
      the server struct in the next session.
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      a39c757b
    • Shyam Prasad N's avatar
      cifs: change tcon status when need_reconnect is set on it · c6e02eef
      Shyam Prasad N authored
      When a tcon is marked for need_reconnect, the intention
      is to have it reconnected.
      
      This change adjusts tcon->status in cifs_tree_connect
      when need_reconnect is set. Also, this change has a minor
      correction in resetting need_reconnect on success. It makes
      sure that it is done with tc_lock held.
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      c6e02eef
    • Filipe Manana's avatar
      btrfs: add new unused block groups to the list of unused block groups · 12c5128f
      Filipe Manana authored
      Space reservations for metadata are, most of the time, pessimistic as we
      reserve space for worst possible cases - where tree heights are at the
      maximum possible height (8), we need to COW every extent buffer in a tree
      path, need to split extent buffers, etc.
      
      For data, we generally reserve the exact amount of space we are going to
      allocate. The exception here is when using compression, in which case we
      reserve space matching the uncompressed size, as the compression only
      happens at writeback time and in the worst possible case we need that
      amount of space in case the data is not compressible.
      
      This means that when there is no available space in the corresponding
      space_info object, we may need to allocate a new block group, and then
      that block group might not be used after all. In this case the block
      group is never added to the list of unused block groups and ends up
      never being deleted - except if we unmount and mount the fs again, as
      when reading block groups from disk we add unused ones to the list of
      unused block groups (fs_info->unused_bgs). Otherwise a block group is
      only added to the list of unused block groups when we deallocate the
      last extent from it, so if no extent is ever allocated, the block group
      is kept around forever.
      
      This also means that if we have a bunch of tasks reserving space in
      parallel we can end up allocating many block groups that are never
      used, or are kept around for too long without being used. This has
      the potential to result in ENOSPC failures, for example if we allocate
      too many metadata block groups and then end up in a state without
      enough unallocated space to allocate a new data block group.
      
      This is more likely to happen with metadata reservations as of kernel
      6.7, namely since commit 28270e25 ("btrfs: always reserve space for
      delayed refs when starting transaction"), because we started to always
      reserve space for delayed references when starting a transaction handle
      for a non-zero number of items, and also to try to reserve space to fill
      the gap between the delayed block reserve's reserved space and its size.
      
      So to avoid this, when finishing the creation of a new block group, add the
      block group to the list of unused block groups if it's still unused at
      that time. This way the next time the cleaner kthread runs, it will delete
      the block group if it's still unused and not needed to satisfy existing
      space reservations.
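
      The change described above can be sketched roughly as follows. This is
      a simplified user-space model, not the btrfs code: the struct layouts
      and the names bg_model, fs_model, bg_is_used(), mark_unused() and
      finish_bg_creation() are hypothetical, and locking is omitted. Only the
      idea (queue a freshly created block group on the unused list if nothing
      has been allocated from it yet) comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a block group. */
struct bg_model {
	uint64_t used;       /* bytes allocated to extents */
	uint64_t reserved;   /* bytes reserved for pending allocations */
	uint64_t pinned;     /* bytes pinned until transaction commit */
	bool on_unused_list;
	struct bg_model *next_unused;
};

/* Hypothetical stand-in for fs_info; models fs_info->unused_bgs. */
struct fs_model {
	struct bg_model *unused_bgs;
};

/* Assumed definition of "used": any of the counters is non-zero. */
static bool bg_is_used(const struct bg_model *bg)
{
	return bg->used > 0 || bg->reserved > 0 || bg->pinned > 0;
}

/* Push the block group onto the unused list, at most once. */
static void mark_unused(struct fs_model *fs, struct bg_model *bg)
{
	if (bg->on_unused_list)
		return;
	bg->on_unused_list = true;
	bg->next_unused = fs->unused_bgs;
	fs->unused_bgs = bg;
}

/*
 * Called once creation of a new block group has finished. If nothing was
 * allocated from it in the meantime, queue it so that a later cleaner
 * pass can reconsider it for deletion.
 */
static void finish_bg_creation(struct fs_model *fs, struct bg_model *bg)
{
	if (!bg_is_used(bg))
		mark_unused(fs, bg);
}
```

      In this model a block group that already holds extents is left alone,
      while an empty one becomes visible to whatever later walks unused_bgs.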
      Reported-by: Ivan Shapovalov <intelfx@intelfx.name>
      Link: https://lore.kernel.org/linux-btrfs/9cdbf0ca9cdda1b4c84e15e548af7d7f9f926382.camel@intelfx.name/
      CC: stable@vger.kernel.org # 6.7+
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: Boris Burkov <boris@bur.io>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      12c5128f
    • Filipe Manana's avatar
      btrfs: do not delete unused block group if it may be used soon · f4a9f219
      Filipe Manana authored
      Before deleting a block group that is in the list of unused block groups
      (fs_info->unused_bgs), we check if the block group became used before
      deleting it, as extents from it may have been allocated after it was added
      to the list.
      
      However even if the block group was not yet used, there may be tasks that
      have only reserved space and have not yet allocated extents, and they
      might be relying on the availability of the unused block group in order
      to allocate extents. The reservation works first by increasing the
      "bytes_may_use" field of the corresponding space_info object (which may
      first require flushing delayed items, allocating a new block group, etc),
      and only later a task does the actual allocation of extents.
      
      For metadata we usually don't end up using all reserved space, as we are
      pessimistic and typically account for the worst cases (need to COW every
      single node in a path of a tree at maximum possible height, etc). For
      data we usually reserve the exact amount of space we're going to allocate
      later, except when using compression where we always reserve space based
      on the uncompressed size, as compression is only triggered when writeback
      starts so we don't know in advance how much space we'll actually need, or
      if the data is compressible.
      
      So don't delete an unused block group if the total size of its space_info
      object minus the block group's size is less than the sum of used space and
      space that may be used (space_info->bytes_may_use), as that means we have
      tasks that reserved space and may need to allocate extents from the block
      group. In this case, besides skipping the deletion, re-add the block group
      to the list of unused block groups so that it may be reconsidered later,
      in case the tasks that reserved space end up not needing to allocate
      extents from it.
      
      Allowing the deletion of the block group while we have reserved space can
      result in tasks failing to allocate metadata extents (-ENOSPC) while under
      a transaction handle, resulting in a transaction abort, or failure during
      writeback for the case of data extents.
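
      The deletion rule stated above can be illustrated with a small check.
      This is only a sketch under simplifying assumptions: space_info_model
      and can_delete_unused_bg() are hypothetical names, "used space" is
      collapsed into a single bytes_used counter, and the block group is
      assumed to be no larger than the space_info total.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced model of a space_info object. */
struct space_info_model {
	uint64_t total_bytes;    /* sum of the sizes of all block groups */
	uint64_t bytes_used;     /* space already used (simplified) */
	uint64_t bytes_may_use;  /* space reserved but not yet allocated */
};

/*
 * Returns true if an unused block group of bg_length bytes can be deleted
 * without leaving existing reservations unbacked. Assumes
 * bg_length <= si->total_bytes.
 */
static bool can_delete_unused_bg(const struct space_info_model *si,
				 uint64_t bg_length)
{
	uint64_t remaining = si->total_bytes - bg_length;
	uint64_t needed = si->bytes_used + si->bytes_may_use;

	/* Skip deletion when the remaining capacity cannot cover the need. */
	return remaining >= needed;
}
```

      When the check fails, the commit message says the block group is put
      back on the unused list so that it can be reconsidered later.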
      
      CC: stable@vger.kernel.org # 6.0+
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: Boris Burkov <boris@bur.io>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      f4a9f219
    • Filipe Manana's avatar
      btrfs: add and use helper to check if block group is used · 1693d544
      Filipe Manana authored
      Add a helper function to determine if a block group is being used and make
      use of it at btrfs_delete_unused_bgs(). This helper will also be used in
      future code changes.
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: Boris Burkov <boris@bur.io>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      1693d544
    • Josef Bacik's avatar
      btrfs: don't drop extent_map for free space inode on write error · 5571e41e
      Josef Bacik authored
      While running the CI for an unrelated change I hit the following panic
      with generic/648 on btrfs_holes_spacecache.
      
      assertion failed: block_start != EXTENT_MAP_HOLE, in fs/btrfs/extent_io.c:1385
      ------------[ cut here ]------------
      kernel BUG at fs/btrfs/extent_io.c:1385!
      invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 1 PID: 2695096 Comm: fsstress Kdump: loaded Tainted: G        W          6.8.0-rc2+ #1
      RIP: 0010:__extent_writepage_io.constprop.0+0x4c1/0x5c0
      Call Trace:
       <TASK>
       extent_write_cache_pages+0x2ac/0x8f0
       extent_writepages+0x87/0x110
       do_writepages+0xd5/0x1f0
       filemap_fdatawrite_wbc+0x63/0x90
       __filemap_fdatawrite_range+0x5c/0x80
       btrfs_fdatawrite_range+0x1f/0x50
       btrfs_write_out_cache+0x507/0x560
       btrfs_write_dirty_block_groups+0x32a/0x420
       commit_cowonly_roots+0x21b/0x290
       btrfs_commit_transaction+0x813/0x1360
       btrfs_sync_file+0x51a/0x640
       __x64_sys_fdatasync+0x52/0x90
       do_syscall_64+0x9c/0x190
       entry_SYSCALL_64_after_hwframe+0x6e/0x76
      
      This happens because we fail to write out the free space cache in one
      instance, come back around and attempt to write it again.  However on
      the second pass through we go to call btrfs_get_extent() on the inode to
      get the extent mapping.  Because this is a new block group, and with the
      free space inode we always search the commit root to avoid deadlocking
      with the tree, we find nothing and return an EXTENT_MAP_HOLE for the
      requested range.
      
      This happens because the first time we try to write the space cache out
      we hit an error, and on an error we drop the extent mapping.  This is
      fine for regular files, but the free space cache inode is special.  We
      always expect the extent map to be correct.  Thus the second time
      through we end up with a bogus extent map.
      
      Since we're deprecating this feature, the most straightforward way to
      fix this is to simply skip dropping the extent map range for this failed
      range.
      
      I shortened the test by using error injection to stress the area to make
      it easier to reproduce.  With this patch in place we no longer panic
      with my error injection test.
      
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      5571e41e
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 9ed18b0b
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - fix missing TLB flush during early boot on SPARSEMEM_VMEMMAP
         configurations
      
       - fixes to correctly implement the break-before-make behavior required
         by the ISA for NAPOT mappings
      
       - fix a missing TLB flush on intermediate mapping changes
      
       - fix build warning about a missing declaration of overflow_stack
      
       - fix performance regression related to incorrect tracking of completed
         batch TLB flushes
      
      * tag 'riscv-for-linus-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: Fix arch_tlbbatch_flush() by clearing the batch cpumask
        riscv: declare overflow_stack as exported from traps.c
        riscv: Fix arch_hugetlb_migration_supported() for NAPOT
        riscv: Flush the tlb when a page directory is freed
        riscv: Fix hugetlb_mask_last_page() when NAPOT is enabled
        riscv: Fix set_huge_pte_at() for NAPOT mapping
        riscv: mm: execute local TLB flush after populating vmemmap
      9ed18b0b
    • Linus Torvalds's avatar
      Merge tag 'trace-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · ca8a6673
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Fix broken direct trampolines being called when another callback is
         attached to the same function.
      
         ARM 64 does not support FTRACE_WITH_REGS, and when it added direct
         trampoline calls from ftrace, it removed the "WITH_REGS" flag from
         the ftrace_ops for direct trampolines. This broke x86 as x86 requires
         direct trampolines to have WITH_REGS.
      
         This wasn't noticed because direct trampolines work as long as the
         function it is attached to is not shared with other callbacks (like
         the function tracer). When there are other callbacks, a helper
         trampoline is called, to call all the non direct callbacks and when
         it returns, the direct trampoline is called.
      
         For x86, the direct trampoline sets a flag in the regs field to tell
         the x86 specific code to call the direct trampoline. But this only
         works if the ftrace_ops had WITH_REGS set. ARM64 does things
         differently and does not require this. For now, set WITH_REGS if the
         arch supports WITH_REGS (which ARM64 does not), and this makes it work
         for both ARM64 and x86.
      
       - Fix wasted memory in the saved_cmdlines logic.
      
         The saved_cmdlines is a cache that maps PIDs to COMMs that tracing
         can use. Most trace events only save the PID in the event. The
         saved_cmdlines file lists PIDs to COMMs so that the tracing tools can
         show an actual name and not just a PID for each event. There's an
         array of PIDs that map to a small set of saved COMM strings. The
         array is set to PID_MAX_DEFAULT which is usually set to 32768. When a
         PID comes in, it will add itself to this array along with the index
         into the COMM array (note if the system allows more than
         PID_MAX_DEFAULT, this cache is similar to cache lines as an update of
         a PID that has the same PID_MAX_DEFAULT bits set will flush out
         another task with the same matching bits set).
      
         A while ago, the size of this cache was changed to be dynamic and the
         array was moved into a structure and created with kmalloc(). But this
         new structure had the size of 131104 bytes, or 0x20020 in hex. As
         kmalloc allocates in powers of two, it was actually allocating
         0x40000 bytes (262144) leaving 131040 bytes of wasted memory. The
         last element of this structure was a pointer to the COMM string array
         which defaulted to just saving 128 COMMs.
      
         By changing the last field of this structure to a variable length
         string, and just having it round up to fill the allocated memory, the
         default size of the saved COMM cache is now 8190. This not only uses
         the wasted space, but actually saves space by removing the extra
         allocation for the COMM names.
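
      The arithmetic in the saved_cmdlines item can be reproduced with a few
      lines of C. This is an illustrative sketch, not kernel code: the
      131104-byte size and the power-of-two rounding come from the text
      above, while ASSUMED_HEADER_SIZE (the fixed part of the structure
      before the COMM strings) and ASSUMED_COMM_LEN (bytes per COMM string)
      are assumptions chosen for illustration, and the helper names are
      hypothetical.

```c
#include <stddef.h>

#define OLD_STRUCT_SIZE      131104u  /* size quoted in the text (0x20020) */
#define ASSUMED_HEADER_SIZE  131096u  /* assumption: fixed fields only */
#define ASSUMED_COMM_LEN     16u      /* assumption: bytes per COMM string */

/* Round a request up to the next power of two, as the text describes. */
static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Bytes allocated but not covered by the request. */
static size_t wasted_bytes(size_t request)
{
	return round_up_pow2(request) - request;
}

/* COMM strings that fit after the header in an allocation of alloc bytes. */
static size_t comm_slots(size_t alloc)
{
	return (alloc - ASSUMED_HEADER_SIZE) / ASSUMED_COMM_LEN;
}
```

      Under these assumptions, round_up_pow2(OLD_STRUCT_SIZE) gives 262144,
      wasted_bytes(OLD_STRUCT_SIZE) gives 131040, and comm_slots(262144)
      gives 8190, matching the figures quoted in the pull message.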
      
      * tag 'trace-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracing: Fix wasted memory in saved_cmdlines logic
        ftrace: Fix DIRECT_CALLS to use SAVE_REGS by default
      ca8a6673
    • Linus Torvalds's avatar
      Merge tag 'probes-fixes-v6.8-rc3' of... · 6dc512a0
      Linus Torvalds authored
      Merge tag 'probes-fixes-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
      
      Pull probes fixes from Masami Hiramatsu:
      
       - remove unnecessary initial values of kprobes local variables
      
       - probe-events parser bug fixes:
      
          - calculate the argument size and format string after setting type
            information from BTF, because BTF can change the size and format
            string.
      
          - show $comm parse error correctly instead of failing silently.
      
      * tag 'probes-fixes-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        kprobes: Remove unnecessary initial values of variables
        tracing/probes: Fix to set arg size and fmt after setting type from BTF
        tracing/probes: Fix to show a parse error for bad type for $comm
      6dc512a0