- 02 Jul, 2024 28 commits
-
-
Christoph Hellwig authored
All callers of xfs_perag_intent_get have a fsbno and need boilerplate code to turn that into an agno. Just pass the fsbno to xfs_perag_intent_get and look up the agno there. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org>
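A minimal sketch of the idea, not the kernel code: the lookup helper takes the filesystem block number and derives the allocation group number itself, so callers no longer repeat the conversion. All `demo_*` names and the intent counter are hypothetical; the only assumption carried over from XFS is that the AG number is the fsbno shifted right by the log2 of the AG size in blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-AG structure with a counter of active intents. */
struct demo_perag {
	uint32_t agno;
	int      intent_count;
};

/* Hypothetical mount structure; agblklog is log2 of the AG size in blocks. */
struct demo_mount {
	uint8_t            agblklog;
	uint32_t           agcount;
	struct demo_perag *perags;
};

/* Convert a filesystem block number into an allocation group number. */
uint32_t demo_fsb_to_agno(const struct demo_mount *mp, uint64_t fsbno)
{
	assert(mp->agblklog < 64);
	return (uint32_t)(fsbno >> mp->agblklog);
}

/*
 * Take the fsbno directly and do the agno lookup here, so that callers
 * do not need their own conversion boilerplate.
 */
struct demo_perag *demo_perag_intent_get(struct demo_mount *mp, uint64_t fsbno)
{
	uint32_t agno = demo_fsb_to_agno(mp, fsbno);

	if (agno >= mp->agcount)
		return NULL;
	mp->perags[agno].intent_count++;
	return &mp->perags[agno];
}
```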
-
Darrick J. Wong authored
Convert the boolean to skip discard on free into a proper flags field so that we can add more flags in the next patch. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
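The conversion described above can be illustrated roughly as follows. This is only a sketch: the structure, the flag names, and the second flag are hypothetical and do not reflect the actual kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; the names are illustrative, not the kernel's. */
#define DEMO_FREE_SKIP_DISCARD  (1u << 0)  /* replaces the old boolean */
#define DEMO_FREE_FUTURE_OPTION (1u << 1)  /* room for a later patch */
#define DEMO_FREE_ALL_FLAGS     (DEMO_FREE_SKIP_DISCARD | DEMO_FREE_FUTURE_OPTION)

struct demo_free_request {
	uint64_t     start_block;
	uint32_t     block_count;
	unsigned int flags;        /* was: bool skip_discard */
};

/* Returns true if the freed extent should be discarded afterwards. */
bool demo_free_wants_discard(const struct demo_free_request *req)
{
	assert(req != NULL);
	/* Unknown bits indicate a caller bug. */
	assert((req->flags & ~DEMO_FREE_ALL_FLAGS) == 0);
	return !(req->flags & DEMO_FREE_SKIP_DISCARD);
}
```

With a flags word, adding a new behavior only needs a new bit and leaves existing call sites untouched, whereas a second boolean would change every caller's argument list.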
-
Darrick J. Wong authored
Pass the incore EFI structure to the tracepoints instead of open-coding the argument passing. This cleans up the call sites a bit. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Currently, the XFS_SB_CRC_OFF macro uses the incore superblock struct (xfs_sb) to compute the address of sb_crc within the ondisk superblock struct (xfs_dsb). This is a landmine if we ever change the layout of the incore superblock (as we're about to do), so redefine the macro to use xfs_dsb to compute the layout of xfs_dsb. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
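To illustrate why this matters, here is a small sketch with two hypothetical, heavily simplified structures (they are not the real xfs_sb or xfs_dsb layouts). Once the incore structure gains a field, an offset computed from it no longer matches the ondisk structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ondisk layout: fixed and never reordered. */
struct demo_dsb {
	uint32_t magic;
	uint32_t blocksize;
	uint64_t dblocks;
	uint32_t crc;
};

/* Hypothetical incore layout: free to gain or reorder fields. */
struct demo_sb {
	uint32_t magic;
	uint32_t blocksize;
	uint64_t dblocks;
	uint64_t incore_only;  /* exists only in memory */
	uint32_t crc;
};

/* Compute the CRC offset from the structure it actually describes. */
#define DEMO_SB_CRC_OFF offsetof(struct demo_dsb, crc)

/* Pin the ondisk offset at compile time so layout changes are caught. */
static_assert(offsetof(struct demo_dsb, crc) == 16,
	      "ondisk crc offset must not change");

size_t demo_crc_offset_ondisk(void)
{
	return DEMO_SB_CRC_OFF;
}

/* What the offset would be if it were derived from the incore struct. */
size_t demo_crc_offset_incore(void)
{
	return offsetof(struct demo_sb, crc);
}
```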
-
Darrick J. Wong authored
Get rid of the largely pointless xfs_cross_rename and xfs_finish_rename now that we've refactored its parent. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Move the directory entry update hook code to xfs_dir2 so that it is mostly consolidated with the higher level directory functions. Retain the exports so that online fsck can still send notifications through the hooks. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Create a new libxfs function to rename two directory entries. The upcoming metadata directory feature will need this to replace a metadata inode directory entry. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Create a new libxfs function to exchange two directory entries. The upcoming metadata directory feature will need this to replace a metadata inode directory entry. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Create a new libxfs function to remove a (name, inode) entry from a directory. The upcoming metadata directory feature will need this to create a metadata directory tree. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Create a libxfs helper function that marks an inode free on disk. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Create a new libxfs function to link an existing inode into a directory. The upcoming metadata directory feature will need this to create a metadata directory tree. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Create a new libxfs function to link a newly created inode into a directory. The upcoming metadata directory feature will need this to create a metadata directory tree. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
INIT_XATTRS is overloaded here -- it's set during the creat process when we think that we're immediately going to set some ACL xattrs to save time. However, it's also used by the parent pointers code to enable the attr fork in preparation to receive pptr xattrs. This results in xfs_has_parent() branches scattered around the codebase to turn on INIT_XATTRS. Linkable files are created far more commonly than unlinkable temporary files or directory tree roots, so we should centralize this logic in xfs_inode_init. For the three callers that don't want parent pointers (online repair tempfiles, unlinkable tempfiles, rootdir creation) we provide an UNLINKABLE flag to skip attr fork initialization. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Move xfs_bumplink and xfs_droplink to libxfs. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Move xfs_iunlink and xfs_iunlink_remove to libxfs. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Create a helper that calls dqalloc to allocate and grab a reference to dquots for the user, group, and project ids listed in an icreate structure. This simplifies the creat-related dqalloc callsites scattered around the code base. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Move the initialization of the xfs_icreate_args structure out of xfs_create and xfs_create_tempfile into their callers so that we can set the new inode's attributes in one place and pass that through instead of open coding the collection of attributes all over the code. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Move all the code that initializes a new inode's attributes from the icreate_args structure and the parent directory into libxfs. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
There are two parts to initializing a newly allocated inode: setting up the incore structures, and initializing the new inode core based on the parent inode and the current user's environment. The initialization code is not specific to the kernel, so we would like to share that with userspace by hoisting it to libxfs. Therefore, split xfs_icreate into separate functions to prepare for the next few patches. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Use xfs_trans_ichgtime to set the inode times when allocating an inode, instead of open-coding them here. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Enable xfs_trans_ichgtime to change the inode access time so that we can use this function to set inode times when allocating inodes instead of open-coding it. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Callers that want to create an inode currently pass all possible file attribute values for the new inode into xfs_init_new_inode as ten separate parameters. This causes two code maintenance issues: first, we have large multi-line call sites which programmers must read carefully to make sure they did not accidentally invert a value. Second, all three file id parameters must be passed separately to the quota functions; any discrepancy results in quota count errors. Clean this up by creating a new icreate_args structure to hold all this information, some helpers to initialize them properly, and make the callers pass this structure through to the creation function, whose name we shorten to xfs_icreate. This eliminates the issues, enables us to keep the inode init code in sync with userspace via libxfs, and is needed for future metadata directory tree management. (A subsequent cleanup will also fix the quota alloc calls.) Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
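A rough sketch of the argument-bundling pattern described above. The `demo_*` structures, fields, and helpers are hypothetical and far smaller than the real icreate_args; the point is only that the ids are stored once and both the quota lookup and the inode initialization read them from the same place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument bundle replacing a long parameter list. */
struct demo_icreate_args {
	uint32_t uid;
	uint32_t gid;
	uint32_t prid;
	uint16_t mode;
	uint32_t nlink;
};

struct demo_quota_ids {
	uint32_t uid, gid, prid;
};

struct demo_inode {
	uint32_t uid, gid, prid;
	uint16_t mode;
	uint32_t nlink;
};

/* Initialize the bundle in one place, with sane defaults. */
void demo_icreate_args_init(struct demo_icreate_args *args, uint32_t uid,
			    uint32_t gid, uint32_t prid, uint16_t mode)
{
	assert(args != NULL);
	*args = (struct demo_icreate_args){
		.uid = uid, .gid = gid, .prid = prid,
		.mode = mode, .nlink = 1,
	};
}

/* The quota code derives its ids from the same bundle. */
struct demo_quota_ids demo_icreate_quota_ids(const struct demo_icreate_args *args)
{
	return (struct demo_quota_ids){ args->uid, args->gid, args->prid };
}

/* The creation function takes the bundle instead of separate values. */
void demo_icreate(struct demo_inode *ip, const struct demo_icreate_args *args)
{
	assert(ip != NULL && args != NULL);
	ip->uid = args->uid;
	ip->gid = args->gid;
	ip->prid = args->prid;
	ip->mode = args->mode;
	ip->nlink = args->nlink;
}
```

Because both consumers read the same structure, the ids charged to quota cannot drift from the ids stored in the inode.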
-
Darrick J. Wong authored
Move the project id get and set functions into libxfs. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Hoist the inode flag conversion functions into libxfs so that we can keep them in sync. Do this by creating a new xfs_inode_util.c file in libxfs. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Move the extent size helpers to xfs_bmap.c in libxfs since they're used there already. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Move these inode predicate functions to xfs_inode.[ch] since they're not reflink functions. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
I noticed that callers of xfs_qm_vop_dqalloc use the following code to compute the anticipated uid of the new file: mapped_fsuid(idmap, &init_user_ns); whereas the VFS uses a slightly different computation for actually assigning i_uid: mapped_fsuid(idmap, i_user_ns(inode)); Technically, these are not the same things. According to Christian Brauner, the only time that inode->i_sb->s_user_ns != &init_user_ns is when the filesystem was mounted in a new mount namespace by an unprivileged user. XFS does not allow this, which is why we've never seen bug reports about quotas being incorrect or the uid checks in xfs_qm_vop_create_dqattach tripping debug assertions. However, this /is/ a logic bomb, so let's make the code consistent. Link: https://lore.kernel.org/linux-fsdevel/20240617-weitblick-gefertigt-4a41f37119fa@brauner/ Fixes: c14329d3 ("fs: port fs{g,u}id helpers to mnt_idmap") Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
generic/388 has an annoying tendency to fail like this during log recovery:

XFS (sda4): Unmounting Filesystem 435fe39b-82b6-46ef-be56-819499585130
XFS (sda4): Mounting V5 Filesystem 435fe39b-82b6-46ef-be56-819499585130
XFS (sda4): Starting recovery (logdev: internal)
00000000: 49 4e 81 b6 03 02 00 00 00 00 00 07 00 00 00 07 IN..............
00000010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 10 ................
00000020: 35 9a 8b c1 3e 6e 81 00 35 9a 8b c1 3f dc b7 00 5...>n..5...?...
00000030: 35 9a 8b c1 3f dc b7 00 00 00 00 00 00 3c 86 4f 5...?........<.O
00000040: 00 00 00 00 00 00 02 f3 00 00 00 00 00 00 00 00 ................
00000050: 00 00 1f 01 00 00 00 00 00 00 00 02 b2 74 c9 0b .............t..
00000060: ff ff ff ff d7 45 73 10 00 00 00 00 00 00 00 2d .....Es........-
00000070: 00 00 07 92 00 01 fe 30 00 00 00 00 00 00 00 1a .......0........
00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000090: 35 9a 8b c1 3b 55 0c 00 00 00 00 00 04 27 b2 d1 5...;U.......'..
000000a0: 43 5f e3 9b 82 b6 46 ef be 56 81 94 99 58 51 30 C_....F..V...XQ0
XFS (sda4): Internal error Bad dinode after recovery at line 539 of file fs/xfs/xfs_inode_item_recover.c. Caller xlog_recover_items_pass2+0x4e/0xc0 [xfs]
CPU: 0 PID: 2189311 Comm: mount Not tainted 6.9.0-rc4-djwx #rc4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20171121_152543-x86-ol7-builder-01.us.oracle.com-4.el7.1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x4f/0x60
xfs_corruption_error+0x90/0xa0
xlog_recover_inode_commit_pass2+0x5f1/0xb00
xlog_recover_items_pass2+0x4e/0xc0
xlog_recover_commit_trans+0x2db/0x350
xlog_recovery_process_trans+0xab/0xe0
xlog_recover_process_data+0xa7/0x130
xlog_do_recovery_pass+0x398/0x840
xlog_do_log_recovery+0x62/0xc0
xlog_do_recover+0x34/0x1d0
xlog_recover+0xe9/0x1a0
xfs_log_mount+0xff/0x260
xfs_mountfs+0x5d9/0xb60
xfs_fs_fill_super+0x76b/0xa30
get_tree_bdev+0x124/0x1d0
vfs_get_tree+0x17/0xa0
path_mount+0x72b/0xa90
__x64_sys_mount+0x112/0x150
do_syscall_64+0x49/0x100
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
XFS (sda4): Corruption detected. Unmount and run xfs_repair
XFS (sda4): Metadata corruption detected at xfs_dinode_verify.part.0+0x739/0x920 [xfs], inode 0x427b2d1
XFS (sda4): Filesystem has been shut down due to log error (0x2).
XFS (sda4): Please unmount the filesystem and rectify the problem(s).
XFS (sda4): log mount/recovery failed: error -117
XFS (sda4): log mount failed

This is inode log item recovery failing the dinode verifier after replaying the contents of the inode log item into the ondisk inode. Looking back into what the kernel was doing at the time of the fs shutdown, a thread was in the middle of running a series of transactions, each of which committed changes to the inode. At some point in the middle of that chain, an invalid (at least according to the verifier) change was committed. Had the filesystem not shut down in the middle of the chain, a subsequent transaction would have corrected the invalid state and nobody would have noticed. But that's not what happened here. Instead, the invalid inode state was committed to the ondisk log, so log recovery tripped over it.
The actual defect here was an overzealous inode verifier, which was fixed in a separate patch. This patch adds some transaction precommit functions for CONFIG_XFS_DEBUG=y mode so that we can detect these kinds of transient errors at transaction commit time, where it's much easier to find the root cause. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
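The general approach of checking state at commit time in debug builds can be sketched as below. Everything here is hypothetical: the `DEMO_DEBUG` switch stands in for a debug config option, and the invariant in `demo_inode_verify` is invented purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a debug-only build option. */
#ifndef DEMO_DEBUG
#define DEMO_DEBUG 1
#endif

/* Hypothetical inode state that a transaction wants to log. */
struct demo_inode_state {
	uint32_t nextents;
	uint32_t nblocks;
};

/* Invented invariant: an inode cannot map more extents than blocks. */
bool demo_inode_verify(const struct demo_inode_state *s)
{
	return s->nextents <= s->nblocks;
}

/*
 * Commit the dirty state to the log copy. In debug builds, run the
 * verifier first so that a transient bad state is reported at the
 * commit that introduced it, not later during log recovery.
 */
int demo_trans_commit(const struct demo_inode_state *dirty,
		      struct demo_inode_state *log)
{
	assert(dirty != NULL && log != NULL);
#if DEMO_DEBUG
	if (!demo_inode_verify(dirty))
		return -1;
#endif
	*log = *dirty;
	return 0;
}
```

Failing at commit time keeps the offending call chain on the stack, which is what makes the root cause easier to find than a verifier failure during recovery.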
-
- 01 Jul, 2024 12 commits
-
-
Darrick J. Wong authored
Implement FITRIM for the realtime device by pretending that it's "space" immediately after the data device. We have to hold the rtbitmap ILOCK while the discard operations are ongoing because there's no busy extent tracking for the rt volume to prevent reallocations. Cc: Konst Mayer <cdlscpmv@gmail.com> Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
-
Wenchao Hao authored
The following warnings are reported, so remove the duplicated header includes: ./fs/xfs/libxfs/xfs_trans_resv.c: xfs_da_format.h is included more than once. ./fs/xfs/scrub/quota_repair.c: xfs_format.h is included more than once. ./fs/xfs/xfs_handle.c: xfs_da_btree.h is included more than once. ./fs/xfs/xfs_qm_bhv.c: xfs_mount.h is included more than once. ./fs/xfs/xfs_trace.c: xfs_bmap.h is included more than once. This is just a code cleanup; no logic is changed. Signed-off-by:
Wenchao Hao <haowenchao22@gmail.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
-
Christoph Hellwig authored
Now that the page fault handler has been refactored, the only caller of xfs_ilock_for_write_fault is simple enough and calls it unconditionally. Fold the logic and expand the comments explaining it. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
-
Christoph Hellwig authored
After the previous refactoring, xfs_dax_fault is now never used for write faults, so don't bother with the xfs_ilock_for_write_fault logic to protect against writes when remapping is in progress. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
-
Christoph Hellwig authored
Split the write fault and DAX fault handling into separate helpers so that the main fault handler is easier to follow. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
-
Christoph Hellwig authored
Replace the separate stub with an IS_ENABLED check, and take the call to dax_finish_sync_fault into xfs_dax_fault instead of leaving it in the caller. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
-
Christoph Hellwig authored
Move the relock path out of the straight line and add a comment explaining why it exists. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
-
Christoph Hellwig authored
About half of xfs_ilock_for_iomap deals with a special case for direct I/O writes to COW files that need to take the ilock exclusively. Move this code into the one caller that cares and simplify xfs_ilock_for_iomap. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
-
lei lu authored
This adds sanity checks for xfs_dir2_data_unused and xfs_dir2_data_entry to make sure we don't stray beyond the valid memory region. Before patching, the loop simply checks that the start offset of the dup and dep is within the range. So in a crafted image, if the last entry is an xfs_dir2_data_unused, we can change dup->length to dup->length-1 and leave 1 byte of space. In the next traversal, this space will be considered as a dup or dep. We may encounter an out-of-bounds read when accessing the fixed members. In the patch, we make sure that the remaining bytes are large enough to hold an unused entry before accessing xfs_dir2_data_unused, and that xfs_dir2_data_unused is XFS_DIR2_DATA_ALIGN byte aligned. We also make sure that the remaining bytes are large enough to hold a dirent with a single-byte name before accessing xfs_dir2_data_entry. Signed-off-by:
lei lu <llfamsec@gmail.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
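A simplified sketch of this kind of check, under stated assumptions: the record header, the alignment constant, and the walker below are hypothetical and much simpler than the real directory data block format. The point is that the walker verifies the remaining bytes can hold the fixed header before reading any header field, and also validates alignment and length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ALIGN 8u

/* Hypothetical record header: tag, then total record length in bytes. */
struct demo_rec_hdr {
	uint16_t tag;
	uint16_t length;
};

/*
 * Walk variable-length records in buf. Returns false if any record
 * would require reading or stepping past the end of the buffer.
 */
bool demo_walk_records(const uint8_t *buf, size_t size, size_t *nrecs)
{
	size_t off = 0;

	assert(buf != NULL && nrecs != NULL);
	*nrecs = 0;
	while (off < size) {
		struct demo_rec_hdr hdr;

		/* Is there room for the fixed members at all? */
		if (size - off < sizeof(hdr))
			return false;
		memcpy(&hdr, buf + off, sizeof(hdr));

		/* Length must be sane, aligned, and within the buffer. */
		if (hdr.length < sizeof(hdr) ||
		    hdr.length % DEMO_ALIGN != 0 ||
		    hdr.length > size - off)
			return false;

		off += hdr.length;
		(*nrecs)++;
	}
	return true;
}
```

Checking only that the start offset is in range, as the old loop did, would accept a buffer with a single stray byte after the last record and then read a header from it.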
-
lei lu authored
There is a lack of verification of the space occupied by fixed members of xlog_op_header in xlog_recover_process_data. We can create a crafted image to trigger an out-of-bounds read by following these steps:
1) Mount an XFS image and do some file operations to leave records.
2) Before unmounting, copy the image for the subsequent steps to simulate an abnormal exit. This is needed because umount ensures that tail_blk and head_blk are the same, which would make it impossible to enter xlog_recover_process_data.
3) Write a tool to parse and modify the image copied in step 2.
4) Make the end of the xlog_op_header entries only 1 byte away from xlog_rec_header->h_size.
5) xlog_rec_header->h_num_logops++
6) Modify xlog_rec_header->h_crc.
Fix: Add a check to make sure there is sufficient space to access fixed members of xlog_op_header. Signed-off-by:
lei lu <llfamsec@gmail.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
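The scenario in the steps above, where the record claims one more operation than the buffer can hold, can be sketched like this. The op header layout and function names are hypothetical; only the shape of the check (confirm the fixed header fits before touching it) mirrors the fix described.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical op header: transaction id and payload length. */
struct demo_op_hdr {
	uint32_t tid;
	uint32_t len;
};

/*
 * Process up to num_ops operations stored between dp and end.
 * Returns 0 on success or -1 if the data looks corrupt.
 */
int demo_process_ops(const uint8_t *dp, const uint8_t *end, uint32_t num_ops)
{
	assert(dp != NULL && end != NULL && dp <= end);

	while (dp < end && num_ops > 0) {
		struct demo_op_hdr oph;

		/* Make sure the fixed members are fully inside the buffer. */
		if ((size_t)(end - dp) < sizeof(oph))
			return -1;
		memcpy(&oph, dp, sizeof(oph));
		dp += sizeof(oph);

		/* The payload must fit as well. */
		if (oph.len > (size_t)(end - dp))
			return -1;
		dp += oph.len;
		num_ops--;
	}
	return 0;
}
```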
-
John Garry authored
The RT extent range must be considered in the xfs_flush_unmap_range() call to stabilize the boundary. This code change is originally from Dave Chinner. Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
John Garry <john.g.garry@oracle.com> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
-
John Garry authored
Currently xfs_flush_unmap_range() does unmap for a full RT extent range, which we also want to ensure is clean and idle. This code change is originally from Dave Chinner. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
John Garry <john.g.garry@oracle.com> Signed-off-by:
Chandan Babu R <chandanbabu@kernel.org>
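As a rough illustration of what taking the RT extent range into account means, the sketch below widens a byte range to whole allocation units. The helper names and parameters are hypothetical and the real kernel code differs; the rounding uses modulo arithmetic rather than bit masks because the unit is not assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_range {
	uint64_t start;
	uint64_t end;
};

uint64_t demo_round_down(uint64_t x, uint64_t unit)
{
	assert(unit > 0);
	return x - (x % unit);
}

uint64_t demo_round_up(uint64_t x, uint64_t unit)
{
	uint64_t rem;

	assert(unit > 0);
	rem = x % unit;
	return rem ? x + (unit - rem) : x;
}

/*
 * Widen [offset, offset + len) to allocation unit boundaries. For a
 * realtime file the unit is a whole RT extent, not a single block.
 */
struct demo_range demo_flush_range(uint64_t offset, uint64_t len,
				   uint64_t blocksize,
				   uint64_t rextsize_blocks, bool is_realtime)
{
	uint64_t unit = is_realtime ? blocksize * rextsize_blocks : blocksize;
	struct demo_range r;

	r.start = demo_round_down(offset, unit);
	r.end = demo_round_up(offset + len, unit);
	return r;
}
```

For example, with 4096-byte blocks and a 3-block RT extent, a 1000-byte range starting at offset 5000 widens to bytes 0 through 12288 on a realtime file, but only to 4096 through 8192 otherwise.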
-