- 09 Aug, 2021 3 commits
-
-
Darrick J. Wong authored
These two features were merged a year ago, the userspace tooling has been merged, and the developers have reported no serious errors. Drop the experimental tag to encourage wider testing. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by:
Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by:
Bill O'Donnell <billodo@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Now that we have the infrastructure to switch background workers on and off at will, fix the block gc worker code so that we don't actually run the worker when the filesystem is frozen, same as we do for deferred inactivation. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Dave Chinner <dchinner@redhat.com>
-
Darrick J. Wong authored
Users have come to expect that the space accounting information in statfs and getquota reports is fairly accurate. Now that we inactivate inodes from a background queue, these numbers can be thrown off by whatever resources are singly-owned by the inodes in the queue. Flush the pending inactivations when userspace asks for a space usage report. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Dave Chinner <dchinner@redhat.com>
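The effect described in this commit can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `ToyFs` and its methods are hypothetical names, not kernel interfaces, and the model tracks nothing but block counts.

```python
class ToyFs:
    """Toy model: unlinked inodes keep their blocks until they are inactivated."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.used_blocks = 0
        self.pending = []  # block counts owned by inodes queued for inactivation

    def write(self, blocks):
        self.used_blocks += blocks

    def unlink(self, blocks):
        # The blocks are not freed yet; the inode only joins the background queue.
        self.pending.append(blocks)

    def flush_inactivations(self):
        self.used_blocks -= sum(self.pending)
        self.pending.clear()

    def free_blocks_unflushed(self):
        # What a space report would show without flushing the queue first.
        return self.total_blocks - self.used_blocks

    def statfs(self):
        # Flush pending inactivations before reporting, as the commit describes.
        self.flush_inactivations()
        return self.total_blocks - self.used_blocks
```

In this model, a report taken without the flush undercounts free space by exactly the blocks still owned by queued inodes.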
-
- 06 Aug, 2021 5 commits
-
-
Dave Chinner authored
Move inode inactivation to background work contexts so that it no longer runs in the context that releases the final reference to an inode. This will allow process work that ends up blocking on inactivation to continue doing work while the filesystem processes the inactivation in the background. A typical demonstration of this is unlinking an inode with lots of extents. The extents are removed during inactivation, so this blocks the process that unlinked the inode from the directory structure. By moving the inactivation to the background process, the userspace application can keep working (e.g. unlinking the next inode in the directory) while the inactivation work on the previous inode is done by a different CPU. The implementation of the queue is relatively simple. We use a per-cpu lockless linked list (llist) to queue inodes for inactivation without requiring serialisation mechanisms, and a work item to allow the queue to be processed by a CPU bound worker thread. We also keep a count of the queue depth so that we can trigger work after a number of deferred inactivations have been queued. The use of a bound workqueue with a single work depth allows the workqueue to run one work item per CPU. We queue the work item on the CPU we are currently running on, and so this essentially gives us affine per-cpu worker threads for the per-cpu queues. This maintains the effective CPU affinity that occurs within XFS at the AG level due to all objects in a directory being local to an AG. Hence inactivation work tends to run on the same CPU that last accessed all the objects that inactivation accesses and this maintains hot CPU caches for unlink workloads. A depth of 32 inodes was chosen to match the number of inodes in an inode cluster buffer. This hopefully allows sequential allocation/unlink behaviours to defer inactivation of all the inodes in a single cluster buffer at a time, further helping maintain hot CPU and buffer cache accesses while running inactivations.
A hard per-cpu queue throttle of 256 inodes has been set to avoid runaway queuing when inodes that take a long time to inactivate are being processed, for example when unlinking inodes with large numbers of extents that can take a lot of processing to free. Signed-off-by:
Dave Chinner <dchinner@redhat.com> [djwong: tweak comments and tracepoints, convert opflags to state bits] Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org>
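The two thresholds named in this commit message (work triggered at a depth of 32, a hard throttle at 256) can be sketched as a small single-CPU model. This is an illustration only: the class and method names are hypothetical, a plain deque stands in for the lockless llist, and modelling the throttle as a synchronous drain is an assumption rather than a description of the kernel code.

```python
from collections import deque

BATCH_DEPTH = 32      # depth that triggers background work (from the commit message)
THROTTLE_DEPTH = 256  # hard per-CPU queue limit (from the commit message)


class PerCpuInactivationQueue:
    """Toy model of one CPU's deferred-inactivation queue (hypothetical names)."""

    def __init__(self):
        self.pending = deque()       # stands in for the per-cpu llist
        self.work_scheduled = False  # stands in for the queued work item
        self.inactivated = []

    def queue_inode(self, ino):
        """Queue an inode; return True if the caller was throttled."""
        self.pending.append(ino)
        if len(self.pending) >= BATCH_DEPTH:
            self.work_scheduled = True
        if len(self.pending) >= THROTTLE_DEPTH:
            # Assumption: the throttled caller waits until the queue is drained.
            self.run_worker()
            return True
        return False

    def run_worker(self):
        """Process every queued inode, as the bound worker would."""
        while self.pending:
            self.inactivated.append(self.pending.popleft())
        self.work_scheduled = False
```

The model never runs the worker on its own, so a test can observe the state between "work scheduled" and "work done"; in a real system the worker would normally empty the queue long before the throttle is reached.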
-
Darrick J. Wong authored
Move the xfs_inactive call and all the other debugging checks and stats updates into xfs_inode_mark_reclaimable, because most of them are implementation details of the inode cache. We also move the function around within xfs_icache.c. Both changes are preparation for the deferred inactivation that is coming up. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Dave Chinner <dchinner@redhat.com>
-
Dave Chinner authored
The inode inactivation and CIL tracking percpu structures are per-xfs_mount structures. That means when we get a CPU dead notification, we need to then iterate all the per-cpu structure instances to process them. Rather than keeping linked lists of per-cpu structures in each subsystem, add a list of all xfs_mounts that the generic xfs_cpu_dead() function will iterate and call into each subsystem appropriately. This allows us to handle both per-mount and global XFS percpu state from xfs_cpu_dead(), and avoids the need to link subsystem structures that can be easily found from the xfs_mount into their own global lists. Signed-off-by:
Dave Chinner <dchinner@redhat.com> [djwong: expand some comments about mount list setup ordering rules] Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org>
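The structure this commit describes, one global list of mounts walked by a single CPU-dead handler, can be sketched as follows. All names here (`Mount`, `MOUNT_LIST`, `cpu_dead`, `inodegc_cpu_dead`) are hypothetical stand-ins for illustration, and moving the dead CPU's queued items to another CPU is an assumed policy, not something the commit message states.

```python
class Mount:
    """Toy per-mount state holding one per-CPU queue (hypothetical)."""

    def __init__(self, name, ncpus):
        self.name = name
        self.inodegc = {cpu: [] for cpu in range(ncpus)}

    def inodegc_cpu_dead(self, dead_cpu, target_cpu):
        # Assumed policy: hand the dead CPU's queued items to a live CPU.
        if dead_cpu == target_cpu:
            return
        self.inodegc[target_cpu].extend(self.inodegc[dead_cpu])
        self.inodegc[dead_cpu].clear()


MOUNT_LIST = []  # one global list of mounts instead of per-subsystem lists


def cpu_dead(dead_cpu, target_cpu=0):
    """Generic handler: walk every mount and call into each subsystem."""
    for mp in MOUNT_LIST:
        mp.inodegc_cpu_dead(dead_cpu, target_cpu)
        # Further per-mount subsystems (e.g. CIL tracking) would be called here.
```

The point of the design is visible in `cpu_dead`: a subsystem only needs a per-mount hook, because its per-CPU structures can be found from the mount.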
-
Dave Chinner authored
We need to move to per-cpu state for both deferred inode inactivation and CIL tracking, but to do that we need to handle CPUs being removed from the system by the hot-plug code. Introduce generic XFS infrastructure to handle CPU hotplug events that is set up at module init time and torn down at module exit time. Initially, we only need CPU dead notifications, so we only set up a callback for these notifications. The infrastructure can be updated in future for other CPU hotplug state machine notifications easily if ever needed. Signed-off-by:
Dave Chinner <dchinner@redhat.com> [djwong: rearrange some macros, fix function prototypes] Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org>
-
Christoph Hellwig authored
These only made a difference when quotaoff supported disabling quota accounting on a mounted file system, so we can switch everyone to use a single set of flags and helpers now. Note that the *QUOTA_ON naming for the helpers is kept as it was the much more commonly used one. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org>
-
- 21 Jun, 2021 1 commit
-
-
Dave Chinner authored
It's a one line wrapper around blkdev_issue_flush(). Just replace it with direct calls to blkdev_issue_flush(). Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Allison Henderson <allison.henderson@oracle.com> Signed-off-by:
Darrick J. Wong <djwong@kernel.org>
-
- 03 Jun, 2021 1 commit
-
-
Darrick J. Wong authored
In preparation for adding another incore inode tree tag, refactor the code that sets and clears tags from the per-AG inode tree and the tree of per-AG structures, and remove the open-coded versions used by the blockgc code. Note: For reclaim, we now rely on the radix tree tags instead of the reclaimable inode count more heavily than we used to. The conversion should be fine, but the logic isn't 100% identical. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Dave Chinner <dchinner@redhat.com>
-
- 02 Jun, 2021 1 commit
-
-
Dave Chinner authored
They are AG functions, not superblock functions, so move them to the appropriate location. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org>
-
- 07 Apr, 2021 1 commit
-
-
Christoph Hellwig authored
In preparation of removing the historic icinode struct, move the flags field into the containing xfs_inode structure. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org>
-
- 25 Mar, 2021 4 commits
-
-
Anthony Iliopoulos authored
Removal of kmem_zone_init wrappers accidentally changed a slab cache name from "xfs_trans" to "xf_trans". Fix this so that userspace consumers of /proc/slabinfo and /sys/kernel/slab can find it again. Fixes: b1231760 ("xfs: Remove slab init wrappers") Signed-off-by:
Anthony Iliopoulos <ailiop@suse.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org>
-
Pavel Reichl authored
Skip the warnings about a mount option being deprecated if we are remounting and the state of the deprecated option is not changing. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211605 Fix-suggested-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
Pavel Reichl <preichl@redhat.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by:
Darrick J. Wong <djwong@kernel.org>
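The rule in this commit reduces to a single predicate. The sketch below is a hypothetical illustration of that rule, not the kernel function; the name `should_warn_deprecated` and its parameters are assumptions.

```python
def should_warn_deprecated(remounting, currently_set, requested):
    """Return True if a deprecation warning should be printed.

    remounting:    True when the options come from a remount request
    currently_set: whether the deprecated option is active on the mount
    requested:     whether the new options ask for it to be active
    """
    if remounting and currently_set == requested:
        # The option's state does not change, so stay quiet.
        return False
    return True
```

A fresh mount always warns, and a remount warns only when it flips the deprecated option on or off.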
-
Pavel Reichl authored
Rename the mp variable to parsing_mp so that it is easy to distinguish between the handle of the current mount point and the handle of the mount point whose mount options are being parsed. Suggested-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
Pavel Reichl <preichl@redhat.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by:
Darrick J. Wong <djwong@kernel.org>
-
Darrick J. Wong authored
Since we're about to start using the blockgc workqueue to dispose of inactivated inodes, strip the "block" prefix from the name; now it's merely the general garbage collection (gc) workqueue. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
- 03 Feb, 2021 6 commits
-
-
Darrick J. Wong authored
Expose the workqueue sysfs knobs for the speculative preallocation gc workers on all kernels, and update the sysadmin information. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Split the block preallocation garbage collection work into per-AG work items so that we can take advantage of parallelization. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Shorten the names of the two functions that start and stop block preallocation garbage collection and move them up to the other blockgc functions. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Remove the separate cowblocks work items and knob so that we can control and run everything from a single blockgc work queue. Note that the speculative_prealloc_lifetime sysfs knob retains its historical name even though the functions move to prefix xfs_blockgc_*. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Change the one remaining caller of xfs_icache_free_cowblocks to use our new combined blockgc scan function instead, since we will soon be combining the two scans. This introduces a slight behavior change, since a readonly remount now clears out post-EOF preallocations and not just CoW staging extents. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
When CONFIG_XFS_DEBUG=y, set WQ_SYSFS on all workqueues that we create so that we (developers) have a means to monitor cpu affinity and whatnot for background workers. In the next patchset we'll expose knobs for more of the workqueues publicly and document it, but not now. Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Brian Foster <bfoster@redhat.com>
-
- 27 Jan, 2021 1 commit
-
-
Christoph Hellwig authored
There is no point in allocating memory for a synchronous flush. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Acked-by:
Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 24 Jan, 2021 1 commit
-
-
Christoph Hellwig authored
Enable idmapped mounts for xfs. This basically just means passing down the user_namespace argument from the VFS methods down to where it is passed to the relevant helpers. Note that full-filesystem bulkstat is not supported from inside idmapped mounts as it is an administrative operation that acts on the whole file system. The limitation is not applied to the bulkstat single operation that just operates on a single inode. Link: https://lore.kernel.org/r/20210121131959.646623-40-christian.brauner@ubuntu.com Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Christian Brauner <christian.brauner@ubuntu.com>
-
- 23 Jan, 2021 5 commits
-
-
Brian Foster authored
Filesystem freeze cleans the log and immediately redirties it so log recovery runs if a crash occurs after the filesystem is frozen. Now that log quiesce covers the log, there is no need to clean the log and redirty it to trigger log recovery because covering has the same effect. Update xfs_fs_freeze() to quiesce (and thus cover) the log. Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Allison Henderson <allison.henderson@oracle.com>
-
Brian Foster authored
xfs_quiesce_attr() is now a wrapper for xfs_log_clean(). Remove it and call xfs_log_clean() directly. Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Allison Henderson <allison.henderson@oracle.com>
-
Brian Foster authored
These two calls are repeated at the beginning of xfs_log_quiesce(). Drop them from xfs_quiesce_attr(). Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Allison Henderson <allison.henderson@oracle.com>
-
Brian Foster authored
xfs_log_sbcount() calls xfs_sync_sb() to sync superblock counters to disk when lazy superblock accounting is enabled. This occurs on unmount, freeze, and read-only (re)mount and ensures the final values are calculated and persisted to disk before each form of quiesce completes. Now that log covering occurs in all of these contexts and uses the same xfs_sync_sb() mechanism to update log state, there is no need to log the superblock separately for any reason. Update the log quiesce path to sync the superblock at least once for any mount where lazy superblock accounting is enabled. If the log is already covered, it will remain in the covered state. Otherwise, the next sync as part of the normal covering sequence will carry the associated superblock update with it. Remove xfs_log_sbcount() now that it is no longer needed. Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Reviewed-by:
Allison Henderson <allison.henderson@oracle.com>
-
Brian Foster authored
Log quiesce is currently associated with cleaning the log, which is accomplished by writing an unmount record as the last step of the quiesce sequence. The quiesce codepath is a bit convoluted in this regard due to how it is reused from various contexts. In preparation to create separate log cleaning and log covering interfaces, lift the write of the unmount record into a new cleaning helper and call that wherever xfs_log_quiesce() is currently invoked. No functional changes. Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Allison Henderson <allison.henderson@oracle.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Darrick J. Wong <djwong@kernel.org>
-
- 09 Dec, 2020 5 commits
-
-
Kaixu Xia authored
The quota option 'usrquota' should be shown if both the XFS_UQUOTA_ACCT and XFS_UQUOTA_ENFD flags are set. The option 'uqnoenforce' should be shown when only the XFS_UQUOTA_ACCT flag is set. The current code logic seems wrong; fix it and show the proper options. Signed-off-by:
Kaixu Xia <kaixuxia@tencent.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
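The rule stated in this commit message can be written as a small decision function. This is a sketch of the rule only: the flag names come from the message, but the bit values and the function name `user_quota_option` are illustrative assumptions.

```python
XFS_UQUOTA_ACCT = 0x0001  # illustrative bit value: user quota accounting
XFS_UQUOTA_ENFD = 0x0002  # illustrative bit value: user quota enforcement


def user_quota_option(qflags):
    """Return the mount option to display for user quota, or None."""
    if qflags & XFS_UQUOTA_ACCT:
        if qflags & XFS_UQUOTA_ENFD:
            return "usrquota"    # accounting and enforcement
        return "uqnoenforce"     # accounting only
    return None                  # user quota accounting is off
```

Accounting is tested first because enforcement without accounting does not correspond to either option in the rule above.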
-
Darrick J. Wong authored
Get rid of this one-off namespace since we're done converting things to fscontext now. Suggested-by:
Dave Chinner <david@fromorbit.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Brian Foster <bfoster@redhat.com>
-
Darrick J. Wong authored
Refactor all the open-coded validation of file block ranges into a single helper, and teach the bmap scrubber to check the ranges. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Define an incompat feature flag to indicate that the filesystem needs to be repaired. While libxfs will recognize this feature, the kernel will refuse to mount if the feature flag is set, and only xfs_repair will be able to clear the flag. The goal here is to force the admin to run xfs_repair to completion after upgrading the filesystem, or if we otherwise detect anomalies. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Eric Sandeen <sandeen@redhat.com>
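The behaviour described here, where the kernel refuses to mount while the flag is set and only a completed repair clears it, can be sketched as below. The bit value and the names `NEEDSREPAIR`, `MountRefused`, `kernel_mount` and `repair_complete` are hypothetical and chosen only for illustration.

```python
NEEDSREPAIR = 1 << 4  # illustrative incompat feature bit


class MountRefused(Exception):
    """Raised by the toy kernel when the filesystem must be repaired first."""


def kernel_mount(features_incompat):
    """Model the kernel side: refuse to mount while the flag is set."""
    if features_incompat & NEEDSREPAIR:
        raise MountRefused("filesystem needs repair; run xfs_repair")
    return True


def repair_complete(features_incompat):
    """Model a repair that ran to completion: only then is the flag cleared."""
    return features_incompat & ~NEEDSREPAIR
```

Because the flag lives in the incompat set, a kernel that does not know the bit would also decline the mount, which is what makes it a reliable gate.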
-
Darrick J. Wong authored
A couple of the superblock validation checks apply only to the kernel, so move them to xfs_fc_fill_super before we add the needsrepair "feature", which will prevent the kernel (but not xfsprogs) from mounting the filesystem. This also reduces the diff between kernel and userspace libxfs. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Eric Sandeen <sandeen@redhat.com>
-
- 25 Sep, 2020 1 commit
-
-
Pavel Reichl authored
ikeep/noikeep was a workaround for old DMAPI code which is no longer relevant. attr2/noattr2 controls the upgrade behaviour from fixed attribute fork sizes in the inode (attr1) to dynamic attribute fork sizes (attr2). mkfs has defaulted to setting attr2 since 2007, hence just about every XFS filesystem out there in production right now uses attr2. Signed-off-by:
Pavel Reichl <preichl@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix minor typos] Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 18 Sep, 2020 1 commit
-
-
Al Viro authored
Get rid of boilerplate in most of ->statfs() instances... Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 16 Sep, 2020 4 commits
-
-
Darrick J. Wong authored
The V4 filesystem format contains known weaknesses in the on-disk format that make metadata verification difficult. In addition, the format does not support dates past 2038 and will not be upgraded to do so. We should start the process of retiring the old format to close off attack surfaces and to encourage users to migrate onto V5. Therefore, make XFS V4 support a configurable option. For the first period it will be default Y in case some distributors want to withdraw support early; for the second period it will be default N so that anyone who wishes to continue support can do so; and after that, support will be removed from the kernel. Dates for these events have been added to the upstream kernel. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Eric Sandeen <sandeen@redhat.com>
-
Darrick J. Wong authored
Add a couple of tracepoints so that we can check the timestamp limits being set on inodes and quotas. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Allison Collins <allison.henderson@oracle.com> Reviewed-by:
Gao Xiang <hsiangkao@redhat.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com>
-
Darrick J. Wong authored
Redesign the ondisk inode timestamps to be a simple unsigned 64-bit counter of nanoseconds since 14 Dec 1901 (i.e. the minimum time in the 32-bit unix time epoch). This enables us to handle dates up to 2486, which solves the y2038 problem. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Gao Xiang <hsiangkao@redhat.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com>
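The arithmetic behind the "up to 2486" figure can be checked with a short sketch. It assumes only what the message states: an unsigned 64-bit count of nanoseconds measured from the minimum of the 32-bit Unix time range. The function names are hypothetical, and any additional limits the kernel may apply are not modelled.

```python
from datetime import datetime, timedelta, timezone

NSEC_PER_SEC = 10**9
EPOCH_OFFSET = 2**31  # seconds from the 32-bit minimum time to the Unix epoch
U64_MAX = 2**64 - 1


def encode_timestamp(unix_seconds, nanoseconds=0):
    """Convert (seconds since 1970, nanoseconds) to the 64-bit counter."""
    value = (unix_seconds + EPOCH_OFFSET) * NSEC_PER_SEC + nanoseconds
    if not 0 <= value <= U64_MAX:
        raise ValueError("timestamp outside the representable range")
    return value


def decode_timestamp(value):
    """Convert the 64-bit counter back to (seconds since 1970, nanoseconds)."""
    seconds, nanoseconds = divmod(value, NSEC_PER_SEC)
    return seconds - EPOCH_OFFSET, nanoseconds


def to_datetime(unix_seconds):
    """Render Unix seconds as a UTC datetime, also for negative values."""
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=unix_seconds)
```

Decoding `U64_MAX` gives roughly 16.3 billion seconds after 1970, which `to_datetime` places in the year 2486, while a counter value of zero decodes to the 32-bit minimum in December 1901.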
-
Darrick J. Wong authored
Formally define the inode timestamp ranges that existing filesystems support, and switch the vfs timestamp ranges to use it. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Allison Collins <allison.henderson@oracle.com> Reviewed-by:
Gao Xiang <hsiangkao@redhat.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com>
-