  1. 07 Jul, 2020 2 commits
  2. 06 Jul, 2020 3 commits
    • xfs: make inode IO completion buffer centric · aac855ab
      Dave Chinner authored
      Having different io completion callbacks for different inode states
      makes things complex. We can detect if the inode is stale via the
      XFS_ISTALE flag in IO completion, so we don't need a special
      callback just for this.
      
      This means inodes only have a single iodone callback, and inode IO
      completion is entirely buffer centric at this point. Hence we no
      longer need to use a log item callback at all as we can just call
      xfs_iflush_done() directly from the buffer completions and walk the
      buffer log item list to complete all the inodes under IO.
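
The idea of a single, buffer-driven completion path can be sketched in userspace C. This is only an illustration under stated assumptions: `demo_buf`, `demo_inode_item`, `demo_buf_iodone()` and the outcome enum are hypothetical names, not the kernel's structures or functions. The sketch walks the items attached to a buffer and decides per item, from a stale flag, how to complete it, so no per-state callback is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical completion outcomes, for illustration only. */
enum demo_outcome { DEMO_PENDING, DEMO_DONE_CLEAN, DEMO_DONE_STALE };

/* Hypothetical stand-in for an inode log item attached to a buffer. */
struct demo_inode_item {
	struct demo_inode_item *next;	/* next item on the same buffer */
	bool stale;			/* plays the role of a stale flag */
	bool flushing;			/* item is currently under IO */
	enum demo_outcome outcome;
};

/* Hypothetical stand-in for a buffer carrying a list of log items. */
struct demo_buf {
	struct demo_inode_item *items;
};

/*
 * Single completion path: walk every item on the buffer and complete
 * the ones under IO, checking the stale flag instead of relying on a
 * separate callback per inode state. Returns the number completed.
 */
static int demo_buf_iodone(struct demo_buf *bp)
{
	int completed = 0;

	assert(bp != NULL);
	for (struct demo_inode_item *it = bp->items; it; it = it->next) {
		if (!it->flushing)
			continue;	/* not part of this IO */
		it->flushing = false;
		it->outcome = it->stale ? DEMO_DONE_STALE : DEMO_DONE_CLEAN;
		completed++;
	}
	return completed;
}
```

The point of the sketch is the control flow: one walk of the list from the buffer completion, with the item state inspected in place.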
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
    • xfs: add an inode item lock · 1319ebef
      Dave Chinner authored
      The inode log item is kind of special in that it can be aggregating
      new changes in memory at the same time that existing changes are
      being written back to disk. This means there are fields in the log
      item that are accessed concurrently from contexts that don't share
      any locking at all.
      
      e.g. ili_last_fields is updated under the ILOCK_EXCL and flush lock
      at flush time, updated under only the flush lock at IO completion
      time, and read under the ILOCK_EXCL when the inode is logged. Hence
      there is no actual serialisation between reading the field during
      logging of the inode in transactions vs clearing the field in IO
      completion.
      
      We currently get away with this by the fact that we are only
      clearing fields in IO completion, and nothing bad happens if we
      accidentally log more of the inode than we actually modify. Worst
      case is we consume a tiny bit more memory and log bandwidth.
      
      However, if we want to do more complex state manipulations on the
      log item that require updates at all three of these potential
      locations, we need to have some mechanism of serialising those
      operations. To do this, introduce a spinlock into the log item to
      serialise internal state.
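
A minimal userspace sketch of that serialisation, assuming a tiny C11 `atomic_flag` spinlock in place of a kernel spinlock. The structure `demo_log_item`, its field names and the three `demo_*` functions are hypothetical and only mirror the three contexts described above (logging, flush, IO completion); they are not the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a log item; not the kernel structure. */
struct demo_log_item {
	atomic_flag lock;	/* tiny spinlock guarding the fields below */
	uint32_t fields;	/* changes aggregated since the last flush */
	uint32_t last_fields;	/* changes currently being written back */
};

static void demo_lock(struct demo_log_item *lip)
{
	while (atomic_flag_test_and_set_explicit(&lip->lock,
						 memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void demo_unlock(struct demo_log_item *lip)
{
	atomic_flag_clear_explicit(&lip->lock, memory_order_release);
}

static void demo_item_init(struct demo_log_item *lip)
{
	atomic_flag_clear(&lip->lock);
	lip->fields = 0;
	lip->last_fields = 0;
}

/* Logging context: record dirty fields, including any still in flight. */
static uint32_t demo_log_fields(struct demo_log_item *lip, uint32_t flags)
{
	uint32_t all;

	demo_lock(lip);
	lip->fields |= flags | lip->last_fields;
	all = lip->fields;
	demo_unlock(lip);
	return all;
}

/* Flush context: move the aggregated fields into the in-flight set. */
static void demo_flush_start(struct demo_log_item *lip)
{
	demo_lock(lip);
	assert(lip->last_fields == 0);	/* assumes one flush at a time */
	lip->last_fields = lip->fields;
	lip->fields = 0;
	demo_unlock(lip);
}

/* IO completion context: the in-flight fields are now on disk. */
static void demo_flush_done(struct demo_log_item *lip)
{
	demo_lock(lip);
	lip->last_fields = 0;
	demo_unlock(lip);
}
```

With the lock held around every access, a logging thread can no longer observe `last_fields` halfway through being cleared by completion.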
      
      This could be done via the xfs_inode i_flags_lock, but this then
      leads to potential lock inversion issues where inode flag updates
      need to occur inside locks that best nest inside the inode log item
      locks (e.g. marking inodes stale during inode cluster freeing).
      Using a separate spinlock avoids these sorts of problems and
      simplifies future code.
      
      This does not touch the use of ili_fields in the item formatting
      code - that is entirely protected by the ILOCK_EXCL at this point in
      time, so it remains untouched.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
    • xfs: remove logged flag from inode log item · 1dfde687
      Dave Chinner authored
      This was used to track if the item had logged fields being flushed
      to disk. We log everything in the inode these days, so this logic is
      no longer needed. Remove it.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
  3. 19 May, 2020 2 commits
  4. 07 May, 2020 4 commits
  5. 04 May, 2020 1 commit
  6. 28 Mar, 2020 1 commit
  7. 27 Mar, 2020 2 commits
  8. 19 Mar, 2020 2 commits
  9. 03 Mar, 2020 2 commits
  10. 18 Nov, 2019 1 commit
  11. 13 Nov, 2019 2 commits
  12. 04 Nov, 2019 1 commit
  13. 26 Aug, 2019 1 commit
  14. 29 Jun, 2019 4 commits
  15. 30 Jul, 2018 1 commit
  16. 06 Jun, 2018 1 commit
    • xfs: convert to SPDX license tags · 0b61f8a4
      Dave Chinner authored
      Remove the verbose license text from XFS files and replace them
      with SPDX tags. This does not change the license of any of the code,
      merely refers to the common, up-to-date license files in LICENSES/.
      
      This change was mostly scripted. fs/xfs/Makefile and
      fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
      and modified by the following command:
      
      for f in `git grep -l "GNU General" fs/xfs/` ; do
      	echo $f
      	cat $f | awk -f hdr.awk > $f.new
      	mv -f $f.new $f
      done
      
      And the hdr.awk script that did the modification (including
      detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
      is as follows:
      
      $ cat hdr.awk
      BEGIN {
      	hdr = 1.0
      	tag = "GPL-2.0"
      	str = ""
      }
      
      /^ \* This program is free software/ {
      	hdr = 2.0;
      	next
      }
      
      /any later version./ {
      	tag = "GPL-2.0+"
      	next
      }
      
      /^ \*\// {
      	if (hdr > 0.0) {
      		print "// SPDX-License-Identifier: " tag
      		print str
      		print $0
      		str=""
      		hdr = 0.0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \* / {
      	if (hdr > 1.0)
      		next
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \*/ {
      	if (hdr > 0.0)
      		next
      	print $0
      	next
      }
      
      // {
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      }
      
      END { }
      $
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
  17. 10 May, 2018 1 commit
  18. 14 Mar, 2018 2 commits
  19. 12 Mar, 2018 1 commit
  20. 29 Jan, 2018 3 commits
  21. 06 Nov, 2017 1 commit
    • xfs: use a b+tree for the in-core extent list · 6bdcf26a
      Christoph Hellwig authored
      Replace the current linear list and the indirection array for the in-core
      extent list with a b+tree to avoid the need for larger memory allocations
      for the indirection array when lots of extents are present.  The current
      extent list implementation leads to heavy pressure on the memory
      allocator when modifying files with a high extent count, and can lead
      to high latencies because of that.
      
      The replacement is a b+tree with a few quirks.  The leaf nodes directly
      store the extent record in two u64 values.  The encoding is a little bit
      different from the existing in-core extent records so that the start
      offset and length which are required for lookups can be retrieved with
      simple mask operations.  The inner nodes store a 64-bit key containing
      the start offset in the first half of the node, and the pointers to the
      next lower level in the second half.  In either case we walk the node
      from the beginning to the end and do a linear search, as that is more
      efficient for the low number of cache lines touched during a search
      (2 for the inner nodes, 4 for the leaf nodes) than a binary search.
      We store termination markers (zero length for the leaf nodes, an
      otherwise impossible high bit for the inner nodes) to terminate the key
      list / records instead of storing a count to use the available cache
      lines as efficiently as possible.
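
The leaf lookup described above can be sketched as follows. The bit layout (start offset in the low 54 bits of the first word, length in the low 21 bits of the second), the 15-record leaf size and all `demo_*` names are assumptions made for this illustration; the commit message does not spell out the actual encoding. What the sketch does show is the technique itself: decoding with simple masks, a linear scan from the start of the node, and a zero-length record acting as the terminator instead of a stored count.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration only; not the on-disk or in-core format. */
#define DEMO_RECS_PER_LEAF	15
#define DEMO_STARTOFF_MASK	((UINT64_C(1) << 54) - 1)
#define DEMO_LEN_MASK		((UINT64_C(1) << 21) - 1)

/* One extent record packed into two u64 values. */
struct demo_rec {
	uint64_t lo;	/* low bits hold the start offset */
	uint64_t hi;	/* low bits hold the length; 0 means "end of list" */
};

struct demo_leaf {
	struct demo_rec recs[DEMO_RECS_PER_LEAF];
};

static uint64_t demo_startoff(const struct demo_rec *r)
{
	return r->lo & DEMO_STARTOFF_MASK;
}

static uint64_t demo_len(const struct demo_rec *r)
{
	return r->hi & DEMO_LEN_MASK;
}

static struct demo_rec demo_make_rec(uint64_t startoff, uint64_t len)
{
	struct demo_rec r;

	/* Zero length is reserved as the termination marker. */
	assert(len != 0 && len <= DEMO_LEN_MASK);
	r.lo = startoff & DEMO_STARTOFF_MASK;
	r.hi = len;
	return r;
}

/* Linear search: index of the record containing off, or -1 if none. */
static int demo_leaf_lookup(const struct demo_leaf *leaf, uint64_t off)
{
	for (int i = 0; i < DEMO_RECS_PER_LEAF; i++) {
		const struct demo_rec *r = &leaf->recs[i];
		uint64_t len = demo_len(r);
		uint64_t start;

		if (len == 0)
			break;		/* termination marker reached */
		start = demo_startoff(r);
		if (off < start)
			break;		/* records are sorted; off is in a hole */
		if (off - start < len)
			return i;
	}
	return -1;
}
```

A zero-initialised leaf is an empty list under this scheme, since its first record already has length zero.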
      
      One quirk of the algorithm is that while we normally split a node half and
      half like usual btree implementations, we just spill over entries added at
      the very end of the list to a new node on its own.  This means we get a
      100% fill grade for the common cases of bulk insertion when reading an
      inode into memory, and when only sequentially appending to a file.  The
      downside is a slightly higher chance of splits on the first random
      insertions.
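
A small sketch of that split policy on a toy node of integer keys. The node capacity of 4, the `demo_node` structure and `demo_split_insert()` are hypothetical and chosen only to keep the example short; the real nodes hold extent records or keys and pointers.

```c
#include <assert.h>
#include <string.h>

#define DEMO_NODE_CAP	4	/* toy capacity, for illustration only */

struct demo_node {
	int nr;				/* number of used slots */
	int keys[DEMO_NODE_CAP];
};

/*
 * Insert key at slot pos of a full node, using an empty right sibling.
 * An insert at the very end spills only the new key into the sibling,
 * leaving the old node 100% full. Any other position uses the usual
 * half-and-half split.
 */
static void demo_split_insert(struct demo_node *left,
			      struct demo_node *right, int pos, int key)
{
	struct demo_node *target = left;
	int keep = DEMO_NODE_CAP / 2;
	int move = DEMO_NODE_CAP - keep;

	assert(left->nr == DEMO_NODE_CAP && right->nr == 0);
	assert(pos >= 0 && pos <= DEMO_NODE_CAP);

	if (pos == DEMO_NODE_CAP) {
		/* Sequential append: the new node holds just the new key. */
		right->keys[0] = key;
		right->nr = 1;
		return;
	}

	/* Regular split: move the upper half to the new node. */
	memcpy(right->keys, &left->keys[keep], move * sizeof(int));
	right->nr = move;
	left->nr = keep;

	/* Insert into whichever half now covers pos. */
	if (pos >= keep) {
		target = right;
		pos -= keep;
	}
	memmove(&target->keys[pos + 1], &target->keys[pos],
		(target->nr - pos) * sizeof(int));
	target->keys[pos] = key;
	target->nr++;
}
```

Appending 50 to a full node {10, 20, 30, 40} leaves that node untouched and creates a sibling holding only 50, which is why bulk and sequential insertion end up with fully packed nodes.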
      
      Both insert and removal manually recurse into the lower levels, but
      the bulk deletion of the whole tree is still implemented as a recursive
      function call, although one limited by the overall depth and with very
      little stack usage in every iteration.
      
      For the first few extents we dynamically grow the list from a single
      extent to the next powers of two until we have a first full leaf block,
      and only then build the actual tree.
      
      The code started out based on the generic lib/btree.c code from Joern
      Engel based on earlier work from Peter Zijlstra, but has since been
      rewritten beyond recognition.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
  22. 26 Oct, 2017 2 commits