An error occurred fetching the project authors.
- 07 Jul, 2020 2 commits
-
-
Dave Chinner authored
xfs_ail_delete_one() is called directly from dquot and inode IO completion, as well as from the generic xfs_trans_ail_delete() function. Inodes are about to have their own failure handling, and dquots will in future, too. Pull the clearing of the LI_FAILED flag up into the callers so we can customise the code appropriately. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Dave Chinner authored
When an buffer IO error occurs, we want to mark all the log items attached to the buffer as failed. Open code the error handling loop so that we can modify the flagging for the different types of objects directly and independently of each other. This also allows us to remove the ->iop_error method from the log item operations. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 06 Jul, 2020 3 commits
-
-
Dave Chinner authored
Having different io completion callbacks for different inode states makes things complex. We can detect if the inode is stale via the XFS_ISTALE flag in IO completion, so we don't need a special callback just for this. This means inodes only have a single iodone callback, and inode IO completion is entirely buffer centric at this point. Hence we no longer need to use a log item callback at all as we can just call xfs_iflush_done() directly from the buffer completions and walk the buffer log item list to complete the all inodes under IO. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Dave Chinner authored
The inode log item is kind of special in that it can be aggregating new changes in memory at the same time time existing changes are being written back to disk. This means there are fields in the log item that are accessed concurrently from contexts that don't share any locking at all. e.g. updating ili_last_fields occurs at flush time under the ILOCK_EXCL and flush lock at flush time, under the flush lock at IO completion time, and is read under the ILOCK_EXCL when the inode is logged. Hence there is no actual serialisation between reading the field during logging of the inode in transactions vs clearing the field in IO completion. We currently get away with this by the fact that we are only clearing fields in IO completion, and nothing bad happens if we accidentally log more of the inode than we actually modify. Worst case is we consume a tiny bit more memory and log bandwidth. However, if we want to do more complex state manipulations on the log item that requires updates at all three of these potential locations, we need to have some mechanism of serialising those operations. To do this, introduce a spinlock into the log item to serialise internal state. This could be done via the xfs_inode i_flags_lock, but this then leads to potential lock inversion issues where inode flag updates need to occur inside locks that best nest inside the inode log item locks (e.g. marking inodes stale during inode cluster freeing). Using a separate spinlock avoids these sorts of problems and simplifies future code. This does not touch the use of ili_fields in the item formatting code - that is entirely protected by the ILOCK_EXCL at this point in time, so it remains untouched. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Dave Chinner authored
This was used to track if the item had logged fields being flushed to disk. We log everything in the inode these days, so this logic is no longer needed. Remove it. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 19 May, 2020 2 commits
-
-
Christoph Hellwig authored
Both the data and attr fork have a format that is stored in the legacy idinode. Move it into the xfs_ifork structure instead, where it uses up padding. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
There are there are three extents counters per inode, one for each of the forks. Two are in the legacy icdinode and one is directly in struct xfs_inode. Switch to a single counter in the xfs_ifork structure where it uses up padding at the end of the structure. This simplifies various bits of code that just wants the number of extents counter and can now directly dereference it. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 07 May, 2020 4 commits
-
-
Brian Foster authored
The stale parameter was used to control the now unused shutdown parameter of xfs_trans_ail_remove(). Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Allison Collins <allison.henderson@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Brian Foster authored
Now that the functions and callers of xfs_trans_ail_[remove|delete]() have been fixed up appropriately, the only difference between the two is the shutdown behavior. There are only a few callers of the _remove() variant, so make the shutdown conditional on the parameter and combine the two functions. Suggested-by:
Dave Chinner <david@fromorbit.com> Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Allison Collins <allison.henderson@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Brian Foster authored
The shutdown parameter of xfs_trans_ail_remove() is no longer used. The remaining callers use it for items that legitimately might not be in the AIL or from contexts where AIL state has already been checked. Remove the unnecessary parameter and fix up the callers. Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Allison Collins <allison.henderson@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Brian Foster authored
Flush locked log items whose underlying buffers fail metadata writeback are tagged with a special flag to indicate that the flush lock is already held. This is currently implemented in the type specific ->iop_push() callback, but the processing required for such items is not type specific because we're only doing basic state management on the underlying buffer. Factor the failed log item handling out of the inode and dquot ->iop_push() callbacks and open code the buffer resubmit helper into a single helper called from xfsaild_push_item(). This provides a generic mechanism for handling failed metadata buffer writeback with a bit less code. Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Allison Collins <allison.henderson@oracle.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 04 May, 2020 1 commit
-
-
Christoph Hellwig authored
Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 28 Mar, 2020 1 commit
-
-
Brian Foster authored
If the inode buffer backing a particular inode is locked, xfs_iflush() returns -EAGAIN and xfs_inode_item_push() skips the inode. It still returns success to xfsaild, however, which bypasses the xfsaild backoff heuristic. Update xfs_inode_item_push() to return locked status if the inode buffer couldn't be locked. Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 27 Mar, 2020 2 commits
-
-
Dave Chinner authored
We currently wake anything waiting on the log tail to move whenever the log item at the tail of the log is removed. Historically this was fine behaviour because there were very few items at any given LSN. But with delayed logging, there may be thousands of items at any given LSN, and we can't move the tail until they are all gone. Hence if we are removing them in near tail-first order, we might be waking up processes waiting on the tail LSN to change (e.g. log space waiters) repeatedly without them being able to make progress. This also occurs with the new sync push waiters, and can result in thousands of spurious wakeups every second when under heavy direct reclaim pressure. To fix this, check that the tail LSN has actually changed on the AIL before triggering wakeups. This will reduce the number of spurious wakeups when doing bulk AIL removal and make this code much more efficient. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Allison Collins <allison.henderson@oracle.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Dave Chinner authored
Factor the common AIL deletion code that does all the wakeups into a helper so we only have one copy of this somewhat tricky code to interface with all the wakeups necessary when the LSN of the log tail changes. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Allison Collins <allison.henderson@oracle.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 19 Mar, 2020 2 commits
-
-
Christoph Hellwig authored
We know the version is 3 if on a v5 file system. For earlier file systems formats we always upgrade the remaining v1 inodes to v2 and thus only use v2 inodes. Use the xfs_sb_version_has_large_dinode helper to check if we deal with small or large dinodes, and thus remove the need for the di_version field in struct icdinode. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Chandan Rajendra <chandanrlinux@gmail.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
The size of the dinode structure is only dependent on the file system version, so instead of checking the individual inode version just use the newly added xfs_sb_version_has_large_dinode helper, and simplify various calling conventions. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Chandan Rajendra <chandanrlinux@gmail.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 03 Mar, 2020 2 commits
-
-
Christoph Hellwig authored
Remove the XFS wrappers for converting from and to the kuid/kgid types. Mostly this means switching to VFS i_{u,g}id_{read,write} helpers, but in a few spots the calls to the conversion functions is open coded. To match the use of sb->s_user_ns in the helpers and other file systems, sb->s_user_ns is also used in the quota code. The ACL code already does the conversion in a grotty layering violation in the VFS xattr code, so it keeps using init_user_ns for the identity mapping. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
Use the Linux inode i_uid/i_gid members everywhere and just convert from/to the scalar value when reading or writing the on-disk inode. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 18 Nov, 2019 1 commit
-
-
Carlos Maiolino authored
We can remove it now, without needing to rework the KM_ flags. Use kmem_cache_free() directly. Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 13 Nov, 2019 2 commits
-
-
Christoph Hellwig authored
There is no point in splitting the fields like this in an purely in-memory structure. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
struct xfs_icdinode is purely an in-memory data structure, so don't use a log on-disk structure for it. This simplifies the code a bit, and also reduces our include hell slightly. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix a minor indenting problem in xfs_trans_ichgtime] Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 04 Nov, 2019 1 commit
-
-
Darrick J. Wong authored
Make sure we log something to dmesg whenever we return -EFSCORRUPTED up the call stack. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
- 26 Aug, 2019 1 commit
-
-
Tetsuo Handa authored
Since no caller is using KM_NOSLEEP and no callee branches on KM_SLEEP, we can remove KM_NOSLEEP and replace KM_SLEEP with 0. Signed-off-by:
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 29 Jun, 2019 4 commits
-
-
Eric Sandeen authored
There are many, many xfs header files which are included but unneeded (or included twice) in the xfs code, so remove them. nb: xfs_linux.h includes about 9 headers for everyone, so those explicit includes get removed by this. I'm not sure what the preference is, but if we wanted explicit includes everywhere, a followup patch could remove those xfs_*.h includes from xfs_linux.h and move them into the files that need them. Or it could be left as-is. Signed-off-by:
Eric Sandeen <sandeen@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
The iop_unlock method is called when comitting or cancelling a transaction. In the latter case, the transaction may or may not be aborted. While there is no known problem with the current code in practice, this implementation is limited in that any log item implementation that might want to differentiate between a commit and a cancellation must rely on the aborted state. The aborted bit is only set when the cancelled transaction is dirty, however. This means that there is no way to distinguish between a commit and a clean transaction cancellation. For example, intent log items currently rely on this distinction. The log item is either transferred to the CIL on commit or released on transaction cancel. There is currently no possibility for a clean intent log item in a transaction, but if that state is ever introduced a cancel of such a transaction will immediately result in memory leaks of the associated log item(s). This is an interface deficiency and landmine. To clean this up, replace the iop_unlock method with an iop_release method that is specific to transaction cancel. The existing iop_committing method occurs at the same time as iop_unlock in the commit path and there is no need for two separate callbacks here. Overload the iop_committing method with the current commit time iop_unlock implementations to eliminate the need for the latter and further simplify the interface. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Darrick J. Wong authored
The inode geometry structure isn't related to ondisk format; it's support for the mount structure. Move it to xfs_shared.h. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
- 30 Jul, 2018 1 commit
-
-
Christoph Hellwig authored
The field is only used for asserts, and to track if we really need to do realloc when growing the inode fork data. But the krealloc function already performs this check internally, so there is no need to keep track of the real allocation size. This will free space in the inode fork for keeping a sequence counter of changes to the extent list. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 06 Jun, 2018 1 commit
-
-
Dave Chinner authored
Remove the verbose license text from XFS files and replace them with SPDX tags. This does not change the license of any of the code, merely refers to the common, up-to-date license files in LICENSES/ This change was mostly scripted. fs/xfs/Makefile and fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected and modified by the following command: for f in `git grep -l "GNU General" fs/xfs/` ; do echo $f cat $f | awk -f hdr.awk > $f.new mv -f $f.new $f done And the hdr.awk script that did the modification (including detecting the difference between GPL-2.0 and GPL-2.0+ licenses) is as follows: $ cat hdr.awk BEGIN { hdr = 1.0 tag = "GPL-2.0" str = "" } /^ \* This program is free software/ { hdr = 2.0; next } /any later version./ { tag = "GPL-2.0+" next } /^ \*\// { if (hdr > 0.0) { print "// SPDX-License-Identifier: " tag print str print $0 str="" hdr = 0.0 next } print $0 next } /^ \* / { if (hdr > 1.0) next if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 next } /^ \*/ { if (hdr > 0.0) next print $0 next } // { if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 } END { } $ Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 10 May, 2018 1 commit
-
-
Dave Chinner authored
The log item flags contain a field that is protected by the AIL lock - the XFS_LI_IN_AIL flag. We use non-atomic RMW operations to set and clear these flags, but most of the updates and checks are not done with the AIL lock held and so are susceptible to update races. Fix this by changing the log item flags to use atomic bitops rather than be reliant on the AIL lock for update serialisation. Signed-Off-By:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 14 Mar, 2018 2 commits
-
-
Christoph Hellwig authored
The function now does something, and that something is central to our inode logging scheme. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 12 Mar, 2018 1 commit
-
-
Matthew Wilcox authored
This is a simple rename, except that xa_ail becomes ail_head. Signed-off-by:
Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 29 Jan, 2018 3 commits
-
-
Carlos Maiolino authored
Now that buffer's b_fspriv has been split, just replace the current singly linked list of xfs_log_items, by the list_head infrastructure. Also, remove the xfs_log_item argument from xfs_buf_resubmit_failed_buffers(), there is no need for this argument, once the log items can be walked through the list_head in the buffer. Signed-off-by:
Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by:
Bill O'Donnell <billodo@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> [darrick: minor style cleanups] Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Carlos Maiolino authored
By splitting the b_fspriv field into two different fields (b_log_item and b_li_list). It's possible to get rid of an old ABI workaround, by using the new b_log_item field to store xfs_buf_log_item separated from the log items attached to the buffer, which will be linked in the new b_li_list field. This way, there is no more need to reorder the log items list to place the buf_log_item at the beginning of the list, simplifying a bit the logic to handle buffer IO. This also opens the possibility to change buffer's log items list into a proper list_head. b_log_item field is still defined as a void *, because it is still used by the log buffers to store xlog_in_core structures, and there is no need to add an extra field on xfs_buf just for xlog_in_core. Signed-off-by:
Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by:
Bill O'Donnell <billodo@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> [darrick: minor style changes] Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Jeff Layton authored
Signed-off-by:
Jeff Layton <jlayton@redhat.com> Acked-by:
Darrick J. Wong <darrick.wong@oracle.com> Acked-by:
Dave Chinner <dchinner@redhat.com>
-
- 06 Nov, 2017 1 commit
-
-
Christoph Hellwig authored
Replace the current linear list and the indirection array for the in-core extent list with a b+tree to avoid the need for larger memory allocations for the indirection array when lots of extents are present. The current extent list implementations leads to heavy pressure on the memory allocator when modifying files with a high extent count, and can lead to high latencies because of that. The replacement is a b+tree with a few quirks. The leaf nodes directly store the extent record in two u64 values. The encoding is a little bit different from the existing in-core extent records so that the start offset and length which are required for lookups can be retreived with simple mask operations. The inner nodes store a 64-bit key containing the start offset in the first half of the node, and the pointers to the next lower level in the second half. In either case we walk the node from the beginninig to the end and do a linear search, as that is more efficient for the low number of cache lines touched during a search (2 for the inner nodes, 4 for the leaf nodes) than a binary search. We store termination markers (zero length for the leaf nodes, an otherwise impossible high bit for the inner nodes) to terminate the key list / records instead of storing a count to use the available cache lines as efficiently as possible. One quirk of the algorithm is that while we normally split a node half and half like usual btree implementations we just spill over entries added at the very end of the list to a new node on its own. This means we get a 100% fill grade for the common cases of bulk insertion when reading an inode into memory, and when only sequentially appending to a file. The downside is a slightly higher chance of splits on the first random insertions. Both insert and removal manually recurse into the lower levels, but the bulk deletion of the whole tree is still implemented as a recursive function call, although one limited by the overall depth and with very little stack usage in every iteration. For the first few extents we dynamically grow the list from a single extent to the next powers of two until we have a first full leaf block and that building the actual tree. The code started out based on the generic lib/btree.c code from Joern Engel based on earlier work from Peter Zijlstra, but has since been rewritten beyond recognition. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 26 Oct, 2017 2 commits
-
-
Christoph Hellwig authored
We can simply use the i_rdev field in the Linux inode and just convert to and from the XFS dev_t when reading or logging/writing the inode. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
Remove the dead code dealing with the UUID fork format that was never implemented in Linux (and neither in IRIX as far as I know). Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-