1. 03 Dec, 2019 3 commits
    • Brian Foster's avatar
      xfs: fix mount failure crash on invalid iclog memory access · 798a9cad
      Brian Foster authored
      syzbot (via KASAN) reports a use-after-free in the error path of
      xlog_alloc_log(). Specifically, the iclog freeing loop doesn't
      handle the case of a fully initialized ->l_iclog linked list.
      Instead, it assumes that the list is partially constructed and NULL
      terminated.
      
      This bug manifested because there was no possible error scenario
      after iclog list setup when the original code was added.  Subsequent
      code and associated error conditions were added some time later,
      while the original error handling code was never updated. Fix up the
      error loop to terminate either on a NULL iclog or reaching the end
      of the list.
      
      Reported-by: syzbot+c732f8644185de340492@syzkaller.appspotmail.com
      Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      798a9cad
    • Omar Sandoval's avatar
      xfs: don't check for AG deadlock for realtime files in bunmapi · 69ffe596
      Omar Sandoval authored
      Commit 5b094d6d ("xfs: fix multi-AG deadlock in xfs_bunmapi") added
      a check in __xfs_bunmapi() to stop early if we would touch multiple AGs
      in the wrong order. However, this check isn't applicable for realtime
      files. In most cases, it just makes us do unnecessary commits. However,
      without the fix from the previous commit ("xfs: fix realtime file data
      space leak"), if the last and second-to-last extents also happen to have
      different "AG numbers", then the break actually causes __xfs_bunmapi()
      to return without making any progress, which sends
      xfs_itruncate_extents_flags() into an infinite loop.
      
      Fixes: 5b094d6d ("xfs: fix multi-AG deadlock in xfs_bunmapi")
      Signed-off-by: default avatarOmar Sandoval <osandov@fb.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      69ffe596
    • Omar Sandoval's avatar
      xfs: fix realtime file data space leak · 0c4da70c
      Omar Sandoval authored
      Realtime files in XFS allocate extents in rextsize units. However, the
      written/unwritten state of those extents is still tracked in blocksize
      units. Therefore, a realtime file can be split up into written and
      unwritten extents that are not necessarily aligned to the realtime
      extent size. __xfs_bunmapi() has some logic to handle these various
      corner cases. Consider how it handles the following case:
      
      1. The last extent is unwritten.
      2. The last extent is smaller than the realtime extent size.
      3. startblock of the last extent is not aligned to the realtime extent
         size, but startblock + blockcount is.
      
      In this case, __xfs_bunmapi() calls xfs_bmap_add_extent_unwritten_real()
      to set the second-to-last extent to unwritten. This should merge the
      last and second-to-last extents, so __xfs_bunmapi() moves on to the
      second-to-last extent.
      
      However, if the size of the last and second-to-last extents combined is
      greater than MAXEXTLEN, xfs_bmap_add_extent_unwritten_real() does not
      merge the two extents. When that happens, __xfs_bunmapi() skips past the
      last extent without unmapping it, thus leaking the space.
      
      Fix it by only unwriting the minimum amount needed to align the last
      extent to the realtime extent size, which is guaranteed to merge with
      the last extent.
      Signed-off-by: default avatarOmar Sandoval <osandov@fb.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      0c4da70c
  2. 27 Nov, 2019 1 commit
  3. 22 Nov, 2019 10 commits
  4. 18 Nov, 2019 4 commits
  5. 16 Nov, 2019 2 commits
    • Brian Foster's avatar
      xfs: fix attr leaf header freemap.size underflow · 2a2b5932
      Brian Foster authored
      The leaf format xattr addition helper xfs_attr3_leaf_add_work()
      adjusts the block freemap in a couple places. The first update drops
      the size of the freemap that the caller had already selected to
      place the xattr name/value data. Before the function returns, it
      also checks whether the entries array has encroached on a freemap
      range by virtue of the new entry addition. This is necessary because
      the entries array grows from the start of the block (but end of the
      block header) towards the end of the block while the name/value data
      grows from the end of the block in the opposite direction. If the
      associated freemap is already empty, however, size is zero and the
      subtraction underflows the field and causes corruption.
      
      This is reproduced rarely by generic/070. The observed behavior is
      that a smaller sized freemap is aligned to the end of the entries
      list, several subsequent xattr additions land in larger freemaps and
      the entries list expands into the smaller freemap until it is fully
      consumed and then underflows. Note that it is not otherwise a
      corruption for the entries array to consume an empty freemap because
      the nameval list (i.e. the firstused pointer in the xattr header)
      starts beyond the end of the corrupted freemap.
      
      Update the freemap size modification to account for the fact that
      the freemap entry can be empty and thus stale.
      Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      2a2b5932
    • Darrick J. Wong's avatar
      xfs: fix some memory leaks in log recovery · 050552cb
      Darrick J. Wong authored
      Fix a few places where we xlog_alloc_buffer a buffer, hit an error, and
      then bail out without freeing the buffer.
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      050552cb
  6. 14 Nov, 2019 9 commits
  7. 13 Nov, 2019 11 commits
    • Pavel Reichl's avatar
      xfs: remove the xfs_disk_dquot_t and xfs_dquot_t · aefe69a4
      Pavel Reichl authored
      Signed-off-by: default avatarPavel Reichl <preichl@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      [darrick: fix some of the comments]
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      aefe69a4
    • Arnd Bergmann's avatar
      xfs: avoid time_t in user api · e8777b27
      Arnd Bergmann authored
      The ioctl definitions for XFS_IOC_SWAPEXT, XFS_IOC_FSBULKSTAT and
      XFS_IOC_FSBULKSTAT_SINGLE are part of libxfs and based on time_t.
      
      The definition for time_t differs between current kernels and coming
      32-bit libc variants that define it as 64-bit. For most ioctls, that
      means the kernel has to be able to handle two different command codes
      based on the different structure sizes.
      
      The same solution could be applied for XFS_IOC_SWAPEXT, but it would
      not work for XFS_IOC_FSBULKSTAT and XFS_IOC_FSBULKSTAT_SINGLE because
      the structure with the time_t is passed through an indirect pointer,
      and the command number itself is based on struct xfs_fsop_bulkreq,
      which does not differ based on time_t.
      
      This means any solution that can be applied requires a change of the
      ABI definition in the xfs_fs.h header file, as well as doing the same
      change in any user application that contains a copy of this header.
      
      The usual solution would be to define a replacement structure and
      use conditional compilation for the ioctl command codes to use
      one or the other, such as
      
       #define XFS_IOC_FSBULKSTAT_OLD _IOWR('X', 101, struct xfs_fsop_bulkreq)
       #define XFS_IOC_FSBULKSTAT_NEW _IOWR('X', 129, struct xfs_fsop_bulkreq)
       #define XFS_IOC_FSBULKSTAT ((sizeof(time_t) == sizeof(__kernel_long_t)) ? \
      			     XFS_IOC_FSBULKSTAT_OLD : XFS_IOC_FSBULKSTAT_NEW)
      
      After this, the kernel would be able to implement both
      XFS_IOC_FSBULKSTAT_OLD and XFS_IOC_FSBULKSTAT_NEW handlers on
      32-bit architectures with the correct ABI for either definition
      of time_t.
      
      However, as long as two observations are true, a much simpler solution
      can be used:
      
      1. xfsprogs is the only user space project that has a copy of this header
      2. xfsprogs already has a replacement for all three affected ioctl commands,
         based on the xfs_bulkstat structure to pass 64-bit timestamps
         regardless of the architecture
      
      Based on those assumptions, changing xfs_bstime to use __kernel_long_t
      instead of time_t in both the kernel and in xfsprogs preserves the current
      ABI for any libc definition of time_t and solves the problem of passing
      64-bit timestamps to 32-bit user space.
      
      If either of the two assumptions is invalid, more discussion is needed
      for coming up with a way to fix as much of the affected user space
      code as possible.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      e8777b27
    • kaixuxia's avatar
      xfs: Fix deadlock between AGI and AGF when target_ip exists in xfs_rename() · 93597ae8
      kaixuxia authored
      When target_ip exists in xfs_rename(), the xfs_dir_replace() call may
      need to hold the AGF lock to allocate more blocks, and then invoking
      the xfs_droplink() call to hold AGI lock to drop target_ip onto the
      unlinked list, so we get the lock order AGF->AGI. This would break the
      ordering constraint on AGI and AGF locking - inode allocation locks
      the AGI, then can allocate a new extent for new inodes, locking the
      AGF after the AGI.
      
      In this patch we check whether the replace operation need more
      blocks firstly. If so, acquire the agi lock firstly to preserve
      locking order(AGI/AGF). Actually, the locking order problem only
      occurs when we are locking the AGI/AGF of the same AG. For multiple
      AGs the AGI lock will be released after the transaction committed.
      Signed-off-by: default avatarkaixuxia <kaixuxia@tencent.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      [darrick: reword the comment]
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      93597ae8
    • Christoph Hellwig's avatar
      xfs: don't reset the "inode core" in xfs_iread · 048a35d2
      Christoph Hellwig authored
      We have the exact same memset in xfs_inode_alloc, which is always called
      just before xfs_iread.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      048a35d2
    • Christoph Hellwig's avatar
      xfs: merge the projid fields in struct xfs_icdinode · de7a866f
      Christoph Hellwig authored
      There is no point in splitting the fields like this in an purely
      in-memory structure.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      de7a866f
    • Christoph Hellwig's avatar
      xfs: use a struct timespec64 for the in-core crtime · 8d2d878d
      Christoph Hellwig authored
      struct xfs_icdinode is purely an in-memory data structure, so don't use
      a log on-disk structure for it.  This simplifies the code a bit, and
      also reduces our include hell slightly.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      [darrick: fix a minor indenting problem in xfs_trans_ichgtime]
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      8d2d878d
    • Christoph Hellwig's avatar
      xfs: devirtualize ->m_dirnameops · d8d11fc7
      Christoph Hellwig authored
      Instead of causing a relatively expensive indirect call for each
      hashing and comparism of a file name in a directory just use an
      inline function and a simple branch on the ASCII CI bit.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      [darrick: fix unused variable warning]
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      d8d11fc7
    • Christoph Hellwig's avatar
    • Darrick J. Wong's avatar
      xfs: convert open coded corruption check to use XFS_IS_CORRUPT · a71895c5
      Darrick J. Wong authored
      Convert the last of the open coded corruption check and report idioms to
      use the XFS_IS_CORRUPT macro.
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      a71895c5
    • Darrick J. Wong's avatar
      xfs: kill the XFS_WANT_CORRUPT_* macros · f9e03706
      Darrick J. Wong authored
      The XFS_WANT_CORRUPT_* macros conceal subtle side effects such as the
      creation of local variables and redirections of the code flow.  This is
      pretty ugly, so replace them with explicit XFS_IS_CORRUPT tests that
      remove both of those ugly points.  The change was performed with the
      following coccinelle script:
      
      @@
      expression mp, test;
      identifier label;
      @@
      
      - XFS_WANT_CORRUPTED_GOTO(mp, test, label);
      + if (XFS_IS_CORRUPT(mp, !test)) { error = -EFSCORRUPTED; goto label; }
      
      @@
      expression mp, test;
      @@
      
      - XFS_WANT_CORRUPTED_RETURN(mp, test);
      + if (XFS_IS_CORRUPT(mp, !test)) return -EFSCORRUPTED;
      
      @@
      expression mp, lval, rval;
      @@
      
      - XFS_IS_CORRUPT(mp, !(lval == rval))
      + XFS_IS_CORRUPT(mp, lval != rval)
      
      @@
      expression mp, e1, e2;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 && e2))
      + XFS_IS_CORRUPT(mp, !e1 || !e2)
      
      @@
      expression e1, e2;
      @@
      
      - !(e1 == e2)
      + e1 != e2
      
      @@
      expression e1, e2, e3, e4, e5, e6;
      @@
      
      - !(e1 == e2 && e3 == e4) || e5 != e6
      + e1 != e2 || e3 != e4 || e5 != e6
      
      @@
      expression e1, e2, e3, e4, e5, e6;
      @@
      
      - !(e1 == e2 || (e3 <= e4 && e5 <= e6))
      + e1 != e2 && (e3 > e4 || e5 > e6)
      
      @@
      expression mp, e1, e2;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 <= e2))
      + XFS_IS_CORRUPT(mp, e1 > e2)
      
      @@
      expression mp, e1, e2;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 < e2))
      + XFS_IS_CORRUPT(mp, e1 >= e2)
      
      @@
      expression mp, e1;
      @@
      
      - XFS_IS_CORRUPT(mp, !!e1)
      + XFS_IS_CORRUPT(mp, e1)
      
      @@
      expression mp, e1, e2;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 || e2))
      + XFS_IS_CORRUPT(mp, !e1 && !e2)
      
      @@
      expression mp, e1, e2, e3, e4;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 == e2) && !(e3 == e4))
      + XFS_IS_CORRUPT(mp, e1 != e2 && e3 != e4)
      
      @@
      expression mp, e1, e2, e3, e4;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 <= e2) || !(e3 >= e4))
      + XFS_IS_CORRUPT(mp, e1 > e2 || e3 < e4)
      
      @@
      expression mp, e1, e2, e3, e4;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 == e2) && !(e3 <= e4))
      + XFS_IS_CORRUPT(mp, e1 != e2 && e3 > e4)
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      f9e03706
    • Darrick J. Wong's avatar
      xfs: add a XFS_IS_CORRUPT macro · 1ec28615
      Darrick J. Wong authored
      Add a new macro, XFS_IS_CORRUPT, which we will use to integrate some
      corruption reporting when the corruption test expression is true.  This
      will be used in the next patch to remove the ugly XFS_WANT_CORRUPT*
      macros.
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      1ec28615