1. 29 May, 2023 5 commits
  2. 16 May, 2023 6 commits
    • ext2: Add direct-io trace points · 6e335cd7
      Ritesh Harjani (IBM) authored
      This patch adds trace points to the ext2 direct-io APIs
      in fs/ext2/file.c.
      
      Here is what the output looks like:
      
              a.out-467865 [006]  6758.170968: ext2_dio_write_begin: dev 7:12 ino 0xe isize 0x1000 pos 0x0 len 4096 flags DIRECT|WRITE aio 1 ret 0
              a.out-467865 [006]  6758.171061: ext2_dio_write_end:   dev 7:12 ino 0xe isize 0x1000 pos 0x0 len 0 flags DIRECT|WRITE aio 1 ret -529
      kworker/3:153-444162 [003]  6758.171252: ext2_dio_write_endio: dev 7:12 ino 0xe isize 0x1000 pos 0x0 len 4096 flags DIRECT|WRITE aio 1 ret 0
              a.out-468222 [001]  6761.628924: ext2_dio_read_begin:  dev 7:12 ino 0xe isize 0x1000 pos 0x0 len 4096 flags DIRECT aio 1 ret 0
              a.out-468222 [001]  6761.629063: ext2_dio_read_end:    dev 7:12 ino 0xe isize 0x1000 pos 0x0 len 0 flags DIRECT aio 1 ret -529
              a.out-468428 [005]  6763.937454: ext2_dio_write_begin: dev 7:12 ino 0xe isize 0x1000 pos 0x0 len 4096 flags DIRECT aio 0 ret 0
              a.out-468428 [005]  6763.937829: ext2_dio_write_endio: dev 7:12 ino 0xe isize 0x1000 pos 0x0 len 4096 flags DIRECT aio 0 ret 0
              a.out-468428 [005]  6763.937847: ext2_dio_write_end:   dev 7:12 ino 0xe isize 0x1000 pos 0x1000 len 0 flags DIRECT aio 0 ret 4096
              a.out-468609 [000]  6765.702878: ext2_dio_read_begin:  dev 7:12 ino 0xe isize 0x1000 pos 0x0 len 4096 flags DIRECT aio 0 ret 0
              a.out-468609 [000]  6765.703243: ext2_dio_read_end:    dev 7:12 ino 0xe isize 0x1000 pos 0x1000 len 0 flags DIRECT aio 0 ret 4096
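For post-processing output like the lines above, the fields after the event name can be pulled apart with `sscanf`. This is a minimal user-space sketch, not part of the patch: `struct dio_trace` and `parse_dio_trace()` are hypothetical names, and the field layout is assumed to match the sample output exactly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one ext2_dio_* trace record. */
struct dio_trace {
	unsigned int major, minor;          /* "dev 7:12" */
	unsigned long ino, isize, pos, len; /* ino/isize/pos are printed in hex */
	char flags[32];                     /* e.g. "DIRECT|WRITE" */
	int aio, ret;
};

/* Parse one trace line; returns 0 on success, -1 if the line does not match. */
static int parse_dio_trace(const char *line, struct dio_trace *t)
{
	const char *p;
	int n;

	assert(line != NULL && t != NULL);

	/* Skip the task, CPU, timestamp and event-name prefix. */
	p = strstr(line, "dev ");
	if (!p)
		return -1;

	/* %lx accepts the "0x" prefix used for ino, isize and pos. */
	n = sscanf(p,
		   "dev %u:%u ino %lx isize %lx pos %lx len %lu flags %31s aio %d ret %d",
		   &t->major, &t->minor, &t->ino, &t->isize, &t->pos, &t->len,
		   t->flags, &t->aio, &t->ret);
	return n == 9 ? 0 : -1;
}
```

In the samples, the aio requests (aio 1) end with ret -529; 529 is EIOCBQUEUED in the kernel's include/linux/errno.h, meaning the request was queued and completes later in the `ext2_dio_write_endio` event.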
      Reported-and-tested-by: Disha Goel <disgoel@linux.ibm.com>
      [Need to add CFLAGS_trace to fix the "unable to find trace file" problem]
      Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Message-Id: <b8b0897fa2b273a448d7b4ba7317357ac73c08bc.1682069716.git.ritesh.list@gmail.com>
    • ext2: Move direct-io to use iomap · fb5de435
      Ritesh Harjani (IBM) authored
      This patch converts the ext2 direct-io path to the iomap interface.
      - This also takes care of the DIO_SKIP_HOLES part, in which we return
        -ENOTBLK from ext2_iomap_begin() if the write is done on a hole.
      - This falls back to buffered-io in case of DIO_SKIP_HOLES, in case of
        a partial write, or if any error is detected in ext2_iomap_end().
        We try to return -ENOTBLK in such cases.
      - For any unaligned or extending DIO writes, we pass the
        IOMAP_DIO_FORCE_WAIT flag to ensure synchronous writes.
      - For extending writes we set IOMAP_F_DIRTY in ext2_iomap_begin() because
        otherwise, with dsync writes on devices that support FUA,
        generic_write_sync() won't be called and we might miss inode metadata
        updates.
      - Since ext2 now uses the _nolock variant of sync write, there is no
        inode lock problem with iomap in this patch.
      - ext2_iomap_ops are now shared by the DIO, DAX & fiemap paths.
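The "unaligned or extending" condition in the list above can be modelled in user space as below. This is only a sketch of the decision as the message describes it, not the code from the patch: `model_dio_write_force_wait()` and its parameters are hypothetical, and the block size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: returns true when a direct-io write should be forced
 * to complete synchronously, i.e. when it is not block aligned or when it
 * extends the file past its current size.
 */
static bool model_dio_write_force_wait(uint64_t pos, uint64_t count,
				       uint64_t isize, uint32_t blocksize)
{
	bool unaligned, extending;

	/* Assumption: blocksize is a non-zero power of two. */
	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);

	/* Either the start offset or the length is not a block multiple. */
	unaligned = ((pos | count) & (blocksize - 1)) != 0;
	/* The write ends beyond the current end of file. */
	extending = pos + count > isize;

	return unaligned || extending;
}
```

For example, a 4096-byte write at offset 0 into a 4096-byte file with 4096-byte blocks is neither unaligned nor extending, while the same write into an empty file is extending.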
      Tested-by: Disha Goel <disgoel@linux.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Message-Id: <610b672a52f2a7ff6dc550fd14d0f995806232a5.1682069716.git.ritesh.list@gmail.com>
    • ext2: Use generic_buffers_fsync() implementation · d0530704
      Ritesh Harjani (IBM) authored
      The next patch converts ext2 to use the iomap interface for DIO.
      The iomap layer can call generic_write_sync() -> ext2_fsync() from
      iomap_dio_complete() while still holding the inode_lock().
      
      Now writeback from other paths doesn't need inode_lock().
      It seems there is also no need for an inode_lock() in
      sync_mapping_buffers(). It uses its own mapping->private_lock
      for its buffer list handling.
      Hence this patch is in preparation to move ext2 to iomap.
      This uses generic_buffers_fsync() which does not take any inode_lock()
      in ext2_fsync().
      Tested-by: Disha Goel <disgoel@linux.ibm.com>
      Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Message-Id: <76d206a464574ff91db25bc9e43479b51ca7e307.1682069716.git.ritesh.list@gmail.com>
    • ext4: Use generic_buffers_fsync_noflush() implementation · 5b5b4ff8
      Ritesh Harjani (IBM) authored
      When ext4 was converted to iomap for dio, it copied the
      __generic_file_fsync() implementation so as not to take inode_lock and
      thereby avoid any deadlock (since iomap takes an inode_lock while
      calling generic_write_sync()).
      
      The previous patch already added generic_buffers_fsync*() which does not
      take any inode_lock(). Hence kill the redundant code and use
      generic_buffers_fsync_noflush() function instead.
      Tested-by: Disha Goel <disgoel@linux.ibm.com>
      Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Message-Id: <b43d4bb4403061ed86510c9587673e30a461ba14.1682069716.git.ritesh.list@gmail.com>
    • fs/buffer.c: Add generic_buffers_fsync*() implementation · 31b2ebc0
      Ritesh Harjani (IBM) authored
      Some of the higher layers, like iomap, take inode_lock() when calling
      generic_write_sync().
      Also writeback already happens from other paths without inode lock,
      so it's difficult to say that we really need sync_mapping_buffers() to
      take any inode locking here. Having said that, let's add
      generic_buffers_fsync/_noflush() implementation in buffer.c with no
      inode_lock/unlock() for now so that filesystems like ext2 and
      ext4's nojournal mode can use it.
      
      When ext4 was converted to iomap for direct-io, it already copied its
      own variant of __generic_file_fsync() without the lock.
      
      This patch adds generic_buffers_fsync()
      & generic_buffers_fsync_noflush() implementations for use in filesystems
      like ext2 & ext4 respectively.
      
      Later we can review other filesystems as well to see if we can make
      generic_buffers_fsync/_noflush(), which do not take any inode_lock(),
      the default path.
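The locking problem described above can be illustrated with a small user-space model. This is a toy sketch under stated assumptions, not kernel code: `struct model_inode` and every `model_*` function are hypothetical, the lock is a plain flag, and the self-deadlock is reported as -EDEADLK instead of actually hanging.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode with a non-recursive lock. */
struct model_inode {
	bool locked;
	int syncs;	/* number of completed fsync calls */
};

/* Models an fsync that takes the inode lock itself. */
static int model_fsync_locked(struct model_inode *inode)
{
	if (inode->locked)
		return -EDEADLK;	/* a real lock would block forever here */
	inode->locked = true;
	inode->syncs++;
	inode->locked = false;
	return 0;
}

/* Models an fsync that never touches the inode lock. */
static int model_fsync_lockless(struct model_inode *inode)
{
	inode->syncs++;
	return 0;
}

/* Models a caller that invokes fsync while already holding the inode lock. */
static int model_dio_complete(struct model_inode *inode,
			      int (*fsync)(struct model_inode *))
{
	int ret;

	assert(!inode->locked);
	inode->locked = true;
	ret = fsync(inode);
	inode->locked = false;
	return ret;
}
```

Passing `model_fsync_locked` to `model_dio_complete()` reports the deadlock, while `model_fsync_lockless` completes. That is the reason the message gives for an fsync helper with no inode_lock/unlock().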
      Tested-by: Disha Goel <disgoel@linux.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Message-Id: <d573408ac8408627d23a3d2d166e748c172c4c9e.1682069716.git.ritesh.list@gmail.com>
    • ext2/dax: Fix ext2_setsize when len is page aligned · fcced95b
      Ritesh Harjani (IBM) authored
      The PAGE_ALIGN(x) macro rounds x up to the next multiple of the
      page size. But if x is already page aligned then it simply returns x.
      So, if the x passed is 0, dax_zero_range() gets a length of 0, and
      that length gets passed as 0 to ->iomap_begin().
      
      ext2 then calls ext2_get_blocks() with maxblocks as 0 and hits this
      BUG_ON in ext2_get_blocks():
      	BUG_ON(maxblocks == 0);
      
      Instead we should be calling dax_truncate_page() here which takes
      care of it. i.e. it only calls dax_zero_range if the offset is not
      page/block aligned.
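The arithmetic behind the zero length can be reproduced in user space. This is a sketch under assumptions, not the kernel code: it assumes a 4 KiB page, assumes the zeroed length is computed as PAGE_ALIGN(newsize) - newsize, and the `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4 KiB page size for illustration; the kernel value is per-architecture. */
#define MODEL_PAGE_SIZE 4096ULL

/* User-space model of PAGE_ALIGN(): round x up to a multiple of the page size. */
static uint64_t model_page_align(uint64_t x)
{
	return (x + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

/* Bytes from newsize to the end of its page (assumed form of the zeroing length). */
static uint64_t model_zero_len(uint64_t newsize)
{
	uint64_t len = model_page_align(newsize) - newsize;

	assert(len < MODEL_PAGE_SIZE);
	return len;
}

/* Hypothetical guard: zeroing is only needed when newsize is not page aligned. */
static int model_needs_zeroing(uint64_t newsize)
{
	return model_zero_len(newsize) != 0;
}
```

With these definitions, a new size of 512 leaves 3584 bytes to zero, while a new size of 0 or 4096 yields a length of 0. That is the case in which the zeroing call has to be skipped rather than issued with an empty range.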
      
      This can be easily triggered with the following on an fsdax-mounted
      pmem device:
      
      dd if=/dev/zero of=file count=1 bs=512
      truncate -s 0 file
      
      [79.525838] EXT2-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
      [79.529376] ext2 filesystem being mounted at /mnt1/test supports timestamps until 2038 (0x7fffffff)
      [93.793207] ------------[ cut here ]------------
      [93.795102] kernel BUG at fs/ext2/inode.c:637!
      [93.796904] invalid opcode: 0000 [#1] PREEMPT SMP PTI
      [93.798659] CPU: 0 PID: 1192 Comm: truncate Not tainted 6.3.0-rc2-xfstests-00056-g131086faa369 #139
      [93.806459] RIP: 0010:ext2_get_blocks.constprop.0+0x524/0x610
      <...>
      [93.835298] Call Trace:
      [93.836253]  <TASK>
      [93.837103]  ? lock_acquire+0xf8/0x110
      [93.838479]  ? d_lookup+0x69/0xd0
      [93.839779]  ext2_iomap_begin+0xa7/0x1c0
      [93.841154]  iomap_iter+0xc7/0x150
      [93.842425]  dax_zero_range+0x6e/0xa0
      [93.843813]  ext2_setsize+0x176/0x1b0
      [93.845164]  ext2_setattr+0x151/0x200
      [93.846467]  notify_change+0x341/0x4e0
      [93.847805]  ? lock_acquire+0xf8/0x110
      [93.849143]  ? do_truncate+0x74/0xe0
      [93.850452]  ? do_truncate+0x84/0xe0
      [93.851739]  do_truncate+0x84/0xe0
      [93.852974]  do_sys_ftruncate+0x2b4/0x2f0
      [93.854404]  do_syscall_64+0x3f/0x90
      [93.855789]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      CC: stable@vger.kernel.org
      Fixes: 2aa3048e ("iomap: switch iomap_zero_range to use iomap_iter")
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Message-Id: <046a58317f29d9603d1068b2bbae47c2332c17ae.1682069716.git.ritesh.list@gmail.com>
  3. 14 May, 2023 13 commits
  4. 13 May, 2023 16 commits