1. 01 Feb, 2024 11 commits
    • soc/tegra: fuse: Fix crash in tegra_fuse_readl() · 81b3f0ef
      Jon Hunter authored
      Commit c5b2d43e67bb ("soc/tegra: fuse: Add ACPI support for Tegra194 and
      Tegra234") updated the Tegra fuse driver to add ACPI support and added a
      test to the tegra_fuse_readl() function to check if the device is
      booting with device-tree. This test passes the 'fuse->dev' variable to
      dev_fwnode() but does not first check if 'fuse->dev' is valid. This
      causes a crash in the Tegra XUSB PHY driver, which calls the
      tegra_fuse_readl() function before the 'fuse->dev' variable has been
      initialised ...
      
       Unable to handle kernel NULL pointer dereference at virtual address 0000000000000290
       Mem abort info:
         ESR = 0x0000000096000004
         EC = 0x25: DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
         FSC = 0x04: level 0 translation fault
       Data abort info:
         ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
         CM = 0, WnR = 0, TnD = 0, TagAccess = 0
         GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
       [0000000000000290] user address but active_mm is swapper
       Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
       Modules linked in:
       CPU: 7 PID: 70 Comm: kworker/u16:4 Not tainted 6.8.0-rc1-next-20240129-02825-g596764183be8 #1
       Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
       Workqueue: events_unbound deferred_probe_work_func
       pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
       pc : __dev_fwnode+0x0/0x18
       lr : tegra_fuse_readl+0x24/0x98
       sp : ffff80008393ba10
       x29: ffff80008393ba10 x28: 0000000000000000 x27: ffff800081233c10
       x26: 00000000000001c8 x25: ffff000080b7bc10 x24: ffff000082df3b00
       x23: fffffffffffffff4 x22: 0000000000000004 x21: ffff80008393ba84
       x20: 00000000000000f0 x19: ffff800082f1e000 x18: ffff800081d72000
       x17: 0000000000000001 x16: 0000000000000001 x15: ffff800082fcdfff
       x14: 0000000000000000 x13: 0000000003541000 x12: 0000000000000020
       x11: 0140000000000000 x10: ffff800080000000 x9 : 0000000000000000
       x8 : ffff000082df3b40 x7 : 0000000000000000 x6 : 000000000000003f
       x5 : 00000000ffffffff x4 : 0000000000000dc0 x3 : 00000000000000c0
       x2 : 0000000000000001 x1 : ffff80008393ba84 x0 : 0000000000000000
       Call trace:
        __dev_fwnode+0x0/0x18
        tegra186_xusb_padctl_probe+0xb0/0x1a8
        tegra_xusb_padctl_probe+0x7c/0xebc
        platform_probe+0x90/0xd8
        really_probe+0x13c/0x29c
        __driver_probe_device+0x7c/0x124
        driver_probe_device+0x38/0x11c
        __device_attach_driver+0x90/0xdc
        bus_for_each_drv+0x78/0xdc
        __device_attach+0xfc/0x188
        device_initial_probe+0x10/0x18
        bus_probe_device+0xa4/0xa8
        deferred_probe_work_func+0x80/0xb4
        process_scheduled_works+0x178/0x3e0
        worker_thread+0x164/0x2e8
        kthread+0xfc/0x11c
        ret_from_fork+0x10/0x20
       Code: a8c27bfd d65f03c0 128002a0 d65f03c0 (f9414801)
       ---[ end trace 0000000000000000 ]---
      
      Fix this by verifying that 'fuse->dev' is valid before passing to
      dev_fwnode().
      
      Fixes: c5b2d43e67bb ("soc/tegra: fuse: Add ACPI support for Tegra194 and Tegra234")
      Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
      Reviewed-by: Kartik <kkartik@nvidia.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
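The guard described in this commit can be sketched in plain C. This is a userspace mock, not the actual patch: `struct fwnode`, `struct device`, `struct fuse_state`, `mock_dev_fwnode()`, `mock_fuse_readl()` and the `EPROBE_DEFER_SKETCH` value are stand-ins invented for illustration, and returning a deferral error when the device is not yet set is an assumption about how the caller is asked to retry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; names are illustrative only. */
struct fwnode { bool is_of_node; };
struct device { struct fwnode *fwnode; };
struct fuse_state {
    struct device *dev;  /* NULL until the fuse driver has probed */
    bool have_clk;       /* only needed for device-tree boot */
};

/* Assumed value for this sketch; the kernel defines its own EPROBE_DEFER. */
#define EPROBE_DEFER_SKETCH 517

/* Like dev_fwnode(), this dereferences dev without checking it. */
static struct fwnode *mock_dev_fwnode(const struct device *dev)
{
    return dev->fwnode;
}

static int mock_fuse_readl(const struct fuse_state *fuse, unsigned int *value)
{
    struct fwnode *fw;

    /* Guard first: without this, a NULL dev is dereferenced below. */
    if (!fuse->dev)
        return -EPROBE_DEFER_SKETCH;

    fw = mock_dev_fwnode(fuse->dev);
    /* Device-tree boot still has to wait for the clock. */
    if (fw && fw->is_of_node && !fuse->have_clk)
        return -EPROBE_DEFER_SKETCH;

    *value = 0;  /* placeholder for the real register read */
    return 0;
}
```

A caller that runs before the fuse device exists gets a retry code instead of a NULL dereference, which matches the failure mode shown in the trace.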
    • soc/tegra: fuse: Define tegra194_soc_attr_group for Tegra241 · 7a849d0b
      Kartik authored
      Tegra241 SoC data uses tegra194_soc_attr_group, which is only defined
      if CONFIG_ARCH_TEGRA_194_SOC, CONFIG_ARCH_TEGRA_234_SOC, or both are
      enabled. This causes a build failure if both of these configs are
      disabled and CONFIG_ARCH_TEGRA_241_SOC is enabled.
      
      Define tegra194_soc_attr_group if CONFIG_ARCH_TEGRA_241_SOC is enabled.
      Signed-off-by: Kartik <kkartik@nvidia.com>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
      Signed-off-by: Thierry Reding <treding@nvidia.com>
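The build failure and its fix come down to a preprocessor guard. The sketch below is a simplified illustration, not the driver source: the struct layouts and the "tegra194-soc" name are hypothetical, plain `defined()` is used where the kernel may use a different config test, and only the Tegra241 option is assumed to be enabled.

```c
#include <stddef.h>

/* Assumption for this sketch: only the Tegra241 option is enabled. */
#define CONFIG_ARCH_TEGRA_241_SOC 1

struct attribute_group { const char *name; };  /* simplified stand-in */
struct soc_data { const struct attribute_group *soc_attr_group; };

/* Adding the Tegra241 term keeps the definition visible in this config. */
#if defined(CONFIG_ARCH_TEGRA_194_SOC) || \
    defined(CONFIG_ARCH_TEGRA_234_SOC) || \
    defined(CONFIG_ARCH_TEGRA_241_SOC)
static const struct attribute_group tegra194_soc_attr_group = { "tegra194-soc" };
#endif

#if defined(CONFIG_ARCH_TEGRA_241_SOC)
/* Would fail to compile if the group above were guarded by 194/234 only. */
static const struct soc_data tegra241_soc = {
    .soc_attr_group = &tegra194_soc_attr_group,
};
#endif
```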
    • soc/tegra: fuse: Add support for Tegra241 · 8402074f
      Kartik authored
      Add support for Tegra241, which uses ACPI boot.
      Signed-off-by: Kartik <kkartik@nvidia.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
    • soc/tegra: fuse: Add ACPI support for Tegra194 and Tegra234 · 972167c6
      Kartik authored
      Add ACPI support for Tegra194 and Tegra234 SoCs. This requires the
      following modifications to the probe when ACPI boot is used:
       - Initialize soc data.
       - Add nvmem lookups.
       - Register soc device.
       - Use devm_clk_get_optional() instead of devm_clk_get() to get
         fuse->clk, as fuse clocks are not required when using ACPI boot.
      
      Also, drop the '__init' keyword from tegra_soc_device_register(), as it is
      also used by tegra_fuse_probe(), and use dev_err_probe() wherever applicable.
      Signed-off-by: Kartik <kkartik@nvidia.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
    • soc/tegra: fuse: Add function to print SKU info · 13a69354
      Kartik authored
      Add helper function tegra_fuse_print_sku_info() to print Tegra SKU
      information, so that it can be shared between tegra_fuse_init() and
      the ACPI probe, which is to be introduced later.
      Signed-off-by: Kartik <kkartik@nvidia.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
    • soc/tegra: fuse: Add function to add lookups · 71661c1c
      Kartik authored
      Add helper function tegra_fuse_add_lookups() to register Tegra fuse
      nvmem lookups, so that it can be shared between tegra_fuse_init() and
      the ACPI probe, which is to be introduced later.
      
      Use kmemdup_array() to duplicate fuse->soc->lookups.
      Signed-off-by: Kartik <kkartik@nvidia.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
    • soc/tegra: fuse: Add tegra_acpi_init_apbmisc() · 7b0c505e
      Kartik authored
      In preparation for ACPI support in the Tegra fuse driver, add the function
      tegra_acpi_init_apbmisc() to initialize the tegra-apbmisc driver.
      Also, document the reason for calling tegra_init_apbmisc() at early init.
      
      Note that the function tegra_acpi_init_apbmisc() is not placed in the
      __init section, because it will be called during probe.
      Signed-off-by: Kartik <kkartik@nvidia.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
    • soc/tegra: fuse: Refactor resource mapping · f0139d66
      Kartik authored
      To prepare for adding ACPI support to the tegra-apbmisc driver,
      relocate the code responsible for mapping memory resources from
      the function ‘tegra_init_apbmisc’ to the function
      ‘tegra_init_apbmisc_resources’. This adjustment will allow the
      code to be shared between ‘tegra_init_apbmisc’ and the upcoming
      ‘tegra_acpi_init_apbmisc’ function.
      Signed-off-by: Kartik <kkartik@nvidia.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
    • soc/tegra: fuse: Use dev_err_probe for probe failures · 4569e604
      Kartik authored
      Currently, in tegra_fuse_probe(), if getting the clock or reset fails,
      the driver prints an error unless the error is -EPROBE_DEFER.
      This can be simplified by using dev_err_probe() instead.
      
      So, return dev_err_probe() if getting the clock or reset fails.
      Signed-off-by: Kartik <kkartik@nvidia.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
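The pattern this commit adopts can be illustrated with a small userspace mock. `err_probe_sketch()`, `probe_sketch()`, the message text and the `EPROBE_DEFER_SKETCH` value are all hypothetical names for this sketch. It only imitates the general idea of dev_err_probe(), which is to stay quiet on deferral and otherwise report the error, and in both cases return the error code so the caller can write a single `return`.

```c
#include <stdio.h>

#define EPROBE_DEFER_SKETCH 517  /* assumed value for this sketch */

static int errors_logged;  /* lets a caller observe whether a message was printed */

/* Print unless the failure is a deferral, then hand the code back. */
static int err_probe_sketch(int err, const char *msg)
{
    if (err != -EPROBE_DEFER_SKETCH) {
        fprintf(stderr, "probe failed (%d): %s\n", err, msg);
        errors_logged++;
    }
    return err;
}

/* clk_err stands in for the result of looking up the clock. */
static int probe_sketch(int clk_err)
{
    if (clk_err)
        return err_probe_sketch(clk_err, "failed to get FUSE clock");
    return 0;
}
```

Folding the "is this a deferral?" test into the helper removes the open-coded `if` around every error message in the probe path.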
    • mm/util: Introduce kmemdup_array() · 7092e9b3
      Kartik authored
      Introduce the kmemdup_array() API to duplicate `n` elements
      from a given array. It internally uses kmemdup() to allocate and
      duplicate the `src` array.
      Signed-off-by: Kartik <kkartik@nvidia.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
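A userspace analogue of an array-duplicating helper might look like the sketch below. The name `memdup_array_sketch()` is hypothetical, `malloc()`/`memcpy()` stand in for the kernel allocator, and the explicit overflow check on `element_size * count` is an addition of this sketch; the commit message itself only states that kmemdup is used internally.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate count elements of element_size bytes each; NULL on failure. */
static void *memdup_array_sketch(const void *src, size_t element_size,
                                 size_t count)
{
    size_t bytes;
    void *dst;

    /* Refuse sizes whose product would wrap around. */
    if (element_size != 0 && count > SIZE_MAX / element_size)
        return NULL;

    bytes = element_size * count;
    dst = malloc(bytes ? bytes : 1);  /* avoid a zero-size allocation */
    if (!dst)
        return NULL;

    if (bytes)
        memcpy(dst, src, bytes);
    return dst;
}
```

The caller passes the element size and count separately, for example `memdup_array_sketch(table, sizeof(table[0]), n)`, and frees the result with `free()`.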
    • soc/tegra: pmc: Remove some old and deprecated functions and constants · 9863084d
      Christophe JAILLET authored
      These TEGRA_IO_RAIL_... functions and constants were deprecated in
      commit 21b49910 ("soc/tegra: pmc: Add I/O pad voltage support") in
      2016-11.
      
      There seem to be no users since kernel 4.16.
      
      Remove them now.
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
  2. 21 Jan, 2024 29 commits
    • Linux 6.8-rc1 · 6613476e
      Linus Torvalds authored
    • Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs · 35a4474b
      Linus Torvalds authored
      Pull more bcachefs updates from Kent Overstreet:
       "Some fixes, some refactoring, some minor features:
      
         - Assorted prep work for disk space accounting rewrite
      
         - BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
           makes our trigger context more explicit
      
         - A few fixes to avoid excessive transaction restarts on
           multithreaded workloads: fstests (in addition to ktest tests) are
           now checking slowpath counters, and that's shaking out a few bugs
      
         - Assorted tracepoint improvements
      
         - Starting to break up bcachefs_format.h and move on disk types so
           they're with the code they belong to; this will make room to start
           documenting the on disk format better.
      
         - A few minor fixes"
      
      * tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
        bcachefs: Improve inode_to_text()
        bcachefs: logged_ops_format.h
        bcachefs: reflink_format.h
        bcachefs; extents_format.h
        bcachefs: ec_format.h
        bcachefs: subvolume_format.h
        bcachefs: snapshot_format.h
        bcachefs: alloc_background_format.h
        bcachefs: xattr_format.h
        bcachefs: dirent_format.h
        bcachefs: inode_format.h
        bcachefs; quota_format.h
        bcachefs: sb-counters_format.h
        bcachefs: counters.c -> sb-counters.c
        bcachefs: comment bch_subvolume
        bcachefs: bch_snapshot::btime
        bcachefs: add missing __GFP_NOWARN
        bcachefs: opts->compression can now also be applied in the background
        bcachefs: Prep work for variable size btree node buffers
        bcachefs: grab s_umount only if snapshotting
        ...
    • Merge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4fbbed78
      Linus Torvalds authored
      Pull timer updates from Thomas Gleixner:
       "Updates for time and clocksources:
      
         - A fix for the idle and iowait time accounting vs CPU hotplug.
      
           The time is reset on CPU hotplug which makes the accumulated
           systemwide time jump backwards.
      
         - Assorted fixes and improvements for clocksource/event drivers"
      
      * tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
        clocksource/drivers/ep93xx: Fix error handling during probe
        clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
        clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
        clocksource/timer-riscv: Add riscv_clock_shutdown callback
        dt-bindings: timer: Add StarFive JH8100 clint
        dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs
    • Merge tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 7b297a5c
      Linus Torvalds authored
      Pull powerpc fixes from Aneesh Kumar:
      
       - Increase default stack size to 32KB for Book3S
      
      Thanks to Michael Ellerman.
      
      * tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64s: Increase default stack size to 32KB
    • bcachefs: Improve inode_to_text() · 249f441f
      Kent Overstreet authored
      Add line breaks - inode_to_text() is now much easier to read.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: logged_ops_format.h · d826cc57
      Kent Overstreet authored
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: reflink_format.h · 8d52ba60
      Kent Overstreet authored
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs; extents_format.h · b2fa1b63
      Kent Overstreet authored
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: ec_format.h · 0560eb9a
      Kent Overstreet authored
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: subvolume_format.h · c6c4ff65
      Kent Overstreet authored
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: snapshot_format.h · 8fed323b
      Kent Overstreet authored
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • Kent Overstreet's avatar
      d455179f
    • bcachefs: xattr_format.h · 72e08010
      Kent Overstreet authored
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: dirent_format.h · 7ffc4daa
      Kent Overstreet authored
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: inode_format.h · b36425da
      Kent Overstreet authored
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs; quota_format.h · 82de6207
      Kent Overstreet authored
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: sb-counters_format.h · 43314801
      Kent Overstreet authored
      bcachefs_format.h has gotten too big; let's do some organizing.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • Kent Overstreet's avatar
      3a58dfbc
    • Kent Overstreet's avatar
      12207f49
    • bcachefs: bch_snapshot::btime · d32088f2
      Kent Overstreet authored
      Add a field to bch_snapshot for creation time; this will be important
      when we start exposing the snapshot tree to userspace.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • Kent Overstreet's avatar
      7be0208f
    • bcachefs: opts->compression can now also be applied in the background · d7e77f53
      Kent Overstreet authored
      The "apply this compression method in the background" paths now use the
      compression option if background_compression is not set; this means that
      setting or changing the compression option will cause existing data to
      be compressed accordingly in the background.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: Prep work for variable size btree node buffers · ec4edd7b
      Kent Overstreet authored
      bcachefs btree nodes are big - typically 256k - and btree roots are
      pinned in memory. As we're now up to 18 btrees, we now have significant
      memory overhead in mostly empty btree roots.
      
      And in the future we're going to start enforcing that certain btree node
      boundaries exist, to solve lock contention issues - analogous to XFS's
      AGIs.
      
      Thus, we need to start allocating smaller btree node buffers when we
      can. This patch changes code that refers to the filesystem constant
      c->opts.btree_node_size to refer to the btree node buffer size -
      btree_buf_bytes() - where appropriate.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: grab s_umount only if snapshotting · 2acc59dd
      Su Yue authored
      When I was testing mongodb over bcachefs with compression,
      there was a lockdep warning when snapshotting the mongodb data volume.
      
      $ cat test.sh
      prog=bcachefs
      
      $prog subvolume create /mnt/data
      $prog subvolume create /mnt/data/snapshots
      
      while true;do
          $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s)
          sleep 1s
      done
      
      $ cat /etc/mongodb.conf
      systemLog:
        destination: file
        logAppend: true
        path: /mnt/data/mongod.log
      
      storage:
        dbPath: /mnt/data/
      
      lockdep reports:
      [ 3437.452330] ======================================================
      [ 3437.452750] WARNING: possible circular locking dependency detected
      [ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G            E
      [ 3437.453562] ------------------------------------------------------
      [ 3437.453981] bcachefs/35533 is trying to acquire lock:
      [ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190
      [ 3437.454875]
                     but task is already holding lock:
      [ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
      [ 3437.456009]
                     which lock already depends on the new lock.
      
      [ 3437.456553]
                     the existing dependency chain (in reverse order) is:
      [ 3437.457054]
                     -> #3 (&type->s_umount_key#48){.+.+}-{3:3}:
      [ 3437.457507]        down_read+0x3e/0x170
      [ 3437.457772]        bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
      [ 3437.458206]        __x64_sys_ioctl+0x93/0xd0
      [ 3437.458498]        do_syscall_64+0x42/0xf0
      [ 3437.458779]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [ 3437.459155]
                     -> #2 (&c->snapshot_create_lock){++++}-{3:3}:
      [ 3437.459615]        down_read+0x3e/0x170
      [ 3437.459878]        bch2_truncate+0x82/0x110 [bcachefs]
      [ 3437.460276]        bchfs_truncate+0x254/0x3c0 [bcachefs]
      [ 3437.460686]        notify_change+0x1f1/0x4a0
      [ 3437.461283]        do_truncate+0x7f/0xd0
      [ 3437.461555]        path_openat+0xa57/0xce0
      [ 3437.461836]        do_filp_open+0xb4/0x160
      [ 3437.462116]        do_sys_openat2+0x91/0xc0
      [ 3437.462402]        __x64_sys_openat+0x53/0xa0
      [ 3437.462701]        do_syscall_64+0x42/0xf0
      [ 3437.462982]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [ 3437.463359]
                     -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
      [ 3437.463843]        down_write+0x3b/0xc0
      [ 3437.464223]        bch2_write_iter+0x5b/0xcc0 [bcachefs]
      [ 3437.464493]        vfs_write+0x21b/0x4c0
      [ 3437.464653]        ksys_write+0x69/0xf0
      [ 3437.464839]        do_syscall_64+0x42/0xf0
      [ 3437.465009]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [ 3437.465231]
                     -> #0 (sb_writers#10){.+.+}-{0:0}:
      [ 3437.465471]        __lock_acquire+0x1455/0x21b0
      [ 3437.465656]        lock_acquire+0xc6/0x2b0
      [ 3437.465822]        mnt_want_write+0x46/0x1a0
      [ 3437.465996]        filename_create+0x62/0x190
      [ 3437.466175]        user_path_create+0x2d/0x50
      [ 3437.466352]        bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
      [ 3437.466617]        __x64_sys_ioctl+0x93/0xd0
      [ 3437.466791]        do_syscall_64+0x42/0xf0
      [ 3437.466957]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [ 3437.467180]
                     other info that might help us debug this:
      
      [ 3437.467507] Chain exists of:
                       sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48
      
      [ 3437.467979]  Possible unsafe locking scenario:
      
      [ 3437.468223]        CPU0                    CPU1
      [ 3437.468405]        ----                    ----
      [ 3437.468585]   rlock(&type->s_umount_key#48);
      [ 3437.468758]                                lock(&c->snapshot_create_lock);
      [ 3437.469030]                                lock(&type->s_umount_key#48);
      [ 3437.469291]   rlock(sb_writers#10);
      [ 3437.469434]
                      *** DEADLOCK ***
      
      [ 3437.469670] 2 locks held by bcachefs/35533:
      [ 3437.469838]  #0: ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs]
      [ 3437.470294]  #1: ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
      [ 3437.470744]
                     stack backtrace:
      [ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G            E      6.7.0-rc7-custom+ #85
      [ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
      [ 3437.471694] Call Trace:
      [ 3437.471795]  <TASK>
      [ 3437.471884]  dump_stack_lvl+0x57/0x90
      [ 3437.472035]  check_noncircular+0x132/0x150
      [ 3437.472202]  __lock_acquire+0x1455/0x21b0
      [ 3437.472369]  lock_acquire+0xc6/0x2b0
      [ 3437.472518]  ? filename_create+0x62/0x190
      [ 3437.472683]  ? lock_is_held_type+0x97/0x110
      [ 3437.472856]  mnt_want_write+0x46/0x1a0
      [ 3437.473025]  ? filename_create+0x62/0x190
      [ 3437.473204]  filename_create+0x62/0x190
      [ 3437.473380]  user_path_create+0x2d/0x50
      [ 3437.473555]  bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
      [ 3437.473819]  ? lock_acquire+0xc6/0x2b0
      [ 3437.474002]  ? __fget_files+0x2a/0x190
      [ 3437.474195]  ? __fget_files+0xbc/0x190
      [ 3437.474380]  ? lock_release+0xc5/0x270
      [ 3437.474567]  ? __x64_sys_ioctl+0x93/0xd0
      [ 3437.474764]  ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs]
      [ 3437.475090]  __x64_sys_ioctl+0x93/0xd0
      [ 3437.475277]  do_syscall_64+0x42/0xf0
      [ 3437.475454]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [ 3437.475691] RIP: 0033:0x7f2743c313af
      ======================================================
      
      In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally
      and unlock it at the end of the function. There is a comment
      "why do we need this lock?" about the lock, coming from
      commit 42d23732 ("bcachefs: Snapshot creation, deletion").
      The reason is that __bch2_ioctl_subvolume_create() calls
      sync_inodes_sb(), which requires s_umount to be held, to write back all
      dirty inodes before doing the snapshot work.
      
      Fix it by read-locking s_umount for snapshotting only and unlocking
      s_umount after sync_inodes_sb().
      Signed-off-by: Su Yue <glass.su@suse.com>
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
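The locking change described here, taking s_umount only when snapshotting and releasing it right after the sync, can be sketched with a mock lock. Everything below is hypothetical scaffolding (`struct rwsem_mock`, the `*_mock()` helpers, `subvolume_create_sketch()`); it models only the order of operations, not the bcachefs ioctl itself.

```c
#include <assert.h>
#include <stdbool.h>

struct rwsem_mock { int readers; };  /* hypothetical stand-in for s_umount */

static int sync_calls;

static void down_read_mock(struct rwsem_mock *s) { s->readers++; }
static void up_read_mock(struct rwsem_mock *s)   { s->readers--; }

/* Like the real sync step, this expects the caller to hold the lock. */
static void sync_inodes_mock(const struct rwsem_mock *s_umount)
{
    assert(s_umount->readers > 0);
    sync_calls++;
}

/* Returns how many readers still hold s_umount when path creation starts. */
static int subvolume_create_sketch(struct rwsem_mock *s_umount, bool snapshot)
{
    if (snapshot) {
        down_read_mock(s_umount);
        sync_inodes_mock(s_umount);
        up_read_mock(s_umount);  /* released before any further locks are taken */
    }
    /* Path creation (which takes sb_writers) would happen here. */
    return s_umount->readers;
}
```

Because the lock is dropped before path creation, s_umount is no longer held while sb_writers is acquired, which is the ordering lockdep complained about.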
    • bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit · 369acf97
      Su Yue authored
      bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut.
      It should be freed by kvfree, not kfree.
      Otherwise umount will trigger:
      
      [  406.829178 ] BUG: unable to handle page fault for address: ffffe7b487148008
      [  406.830676 ] #PF: supervisor read access in kernel mode
      [  406.831643 ] #PF: error_code(0x0000) - not-present page
      [  406.832487 ] PGD 0 P4D 0
      [  406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI
      [  406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G           OE      6.7.0-rc7-custom+ #90
      [  406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
      [  406.835796 ] RIP: 0010:kfree+0x62/0x140
      [  406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6
      [  406.837810 ] RSP: 0018:ffffb9d641607e48 EFLAGS: 00010286
      [  406.838213 ] RAX: ffffe7b487148000 RBX: ffffb9d645200000 RCX: ffffb9d641607dc4
      [  406.838738 ] RDX: 000065bb00000000 RSI: ffffffffc0d88b84 RDI: ffffb9d645200000
      [  406.839217 ] RBP: ffff9a4625d00068 R08: 0000000000000001 R09: 0000000000000001
      [  406.839650 ] R10: 0000000000000001 R11: 000000000000001f R12: ffff9a4625d4da80
      [  406.840055 ] R13: ffff9a4625d00000 R14: ffffffffc0e2eb20 R15: 0000000000000000
      [  406.840451 ] FS:  00007f0a264ffb80(0000) GS:ffff9a4e2d500000(0000) knlGS:0000000000000000
      [  406.840851 ] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  406.841125 ] CR2: ffffe7b487148008 CR3: 000000018c4d2000 CR4: 00000000000006f0
      [  406.841464 ] Call Trace:
      [  406.841583 ]  <TASK>
      [  406.841682 ]  ? __die+0x1f/0x70
      [  406.841828 ]  ? page_fault_oops+0x159/0x470
      [  406.842014 ]  ? fixup_exception+0x22/0x310
      [  406.842198 ]  ? exc_page_fault+0x1ed/0x200
      [  406.842382 ]  ? asm_exc_page_fault+0x22/0x30
      [  406.842574 ]  ? bch2_fs_release+0x54/0x280 [bcachefs]
      [  406.842842 ]  ? kfree+0x62/0x140
      [  406.842988 ]  ? kfree+0x104/0x140
      [  406.843138 ]  bch2_fs_release+0x54/0x280 [bcachefs]
      [  406.843390 ]  kobject_put+0xb7/0x170
      [  406.843552 ]  deactivate_locked_super+0x2f/0xa0
      [  406.843756 ]  cleanup_mnt+0xba/0x150
      [  406.843917 ]  task_work_run+0x59/0xa0
      [  406.844083 ]  exit_to_user_mode_prepare+0x197/0x1a0
      [  406.844302 ]  syscall_exit_to_user_mode+0x16/0x40
      [  406.844510 ]  do_syscall_64+0x4e/0xf0
      [  406.844675 ]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [  406.844907 ] RIP: 0033:0x7f0a2664e4fb
      Signed-off-by: Su Yue <glass.su@suse.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: bios must be 512 byte algined · 00fff4dd
      Kent Overstreet authored
      Fixes: 023f9ac9 bcachefs: Delete dio read alignment check
      Reported-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: remove redundant variable tmp · aead3428
      Colin Ian King authored
      The variable tmp is being assigned a value but it isn't being
      read afterwards. The assignment is redundant and so tmp can be
      removed.
      
      Cleans up clang scan build warning:
      warning: Although the value stored to 'ret' is used in the enclosing
      expression, the value is never actually read from 'ret'
      [deadcode.DeadStores]
      Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • Kent Overstreet's avatar
    • bcachefs: Fix excess transaction restarts in __bchfs_fallocate() · 46bf2e9c
      Kent Overstreet authored
      drop_locks_do() should not be used in a fastpath without first trying
      the operation in nonblocking mode - the unlock and relock will cause
      excessive transaction restarts and a potential livelock with other
      threads that are contending for the same locks.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
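The "try nonblocking first" rule in this commit can be illustrated with a generic sketch. None of the names below come from bcachefs: `struct txn_sketch`, `op_fn`, `op_ready()`, `op_would_block()` and `run_op_sketch()` are hypothetical. The convention that an operation returns `-EAGAIN` when it would have to block is an assumption of this example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct txn_sketch { bool locks_held; unsigned int restarts; };

/* An operation returns -EAGAIN when it would have to block in nonblocking mode. */
typedef int (*op_fn)(void *arg, bool nonblocking);

/* Example operation that never needs to block. */
static int op_ready(void *arg, bool nonblocking)
{
    (void)arg;
    (void)nonblocking;
    return 0;
}

/* Example operation that only succeeds when allowed to block. */
static int op_would_block(void *arg, bool nonblocking)
{
    (void)arg;
    return nonblocking ? -EAGAIN : 0;
}

static int run_op_sketch(struct txn_sketch *trans, op_fn op, void *arg)
{
    int ret = op(arg, true);  /* fast path: locks stay held */

    if (ret != -EAGAIN)
        return ret;

    trans->locks_held = false;  /* slow path: drop locks ... */
    ret = op(arg, false);
    trans->locks_held = true;   /* ... retake them, which costs a restart */
    trans->restarts++;
    return ret;
}
```

With this shape, the unlock and relock cost is paid only when the nonblocking attempt reports that it cannot make progress.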