1. 05 May, 2017 1 commit
    • Eryu Guan's avatar
      xfs: fix use-after-free in xfs_finish_page_writeback · 161f55ef
      Eryu Guan authored
      Commit 28b783e4 ("xfs: bufferhead chains are invalid after
      end_page_writeback") fixed one use-after-free issue by
      pre-calculating the loop conditionals before calling bh->b_end_io()
      in the end_io processing loop, but it assigned 'next' pointer before
      checking end offset boundary & breaking the loop, at which point the
      bh might be freed already, and caused use-after-free.
      
      This is caught by KASAN when running fstests generic/127 on sub-page
      block size XFS.
      
      [ 2517.244502] run fstests generic/127 at 2017-04-27 07:30:50
      [ 2747.868840] ==================================================================
      [ 2747.876949] BUG: KASAN: use-after-free in xfs_destroy_ioend+0x3d3/0x4e0 [xfs] at addr ffff8801395ae698
      ...
      [ 2747.918245] Call Trace:
      [ 2747.920975]  dump_stack+0x63/0x84
      [ 2747.924673]  kasan_object_err+0x21/0x70
      [ 2747.928950]  kasan_report+0x271/0x530
      [ 2747.933064]  ? xfs_destroy_ioend+0x3d3/0x4e0 [xfs]
      [ 2747.938409]  ? end_page_writeback+0xce/0x110
      [ 2747.943171]  __asan_report_load8_noabort+0x19/0x20
      [ 2747.948545]  xfs_destroy_ioend+0x3d3/0x4e0 [xfs]
      [ 2747.953724]  xfs_end_io+0x1af/0x2b0 [xfs]
      [ 2747.958197]  process_one_work+0x5ff/0x1000
      [ 2747.962766]  worker_thread+0xe4/0x10e0
      [ 2747.966946]  kthread+0x2d3/0x3d0
      [ 2747.970546]  ? process_one_work+0x1000/0x1000
      [ 2747.975405]  ? kthread_create_on_node+0xc0/0xc0
      [ 2747.980457]  ? syscall_return_slowpath+0xe6/0x140
      [ 2747.985706]  ? do_page_fault+0x30/0x80
      [ 2747.989887]  ret_from_fork+0x2c/0x40
      [ 2747.993874] Object at ffff8801395ae690, in cache buffer_head size: 104
      [ 2748.001155] Allocated:
      [ 2748.003782] PID = 8327
      [ 2748.006411]  save_stack_trace+0x1b/0x20
      [ 2748.010688]  save_stack+0x46/0xd0
      [ 2748.014383]  kasan_kmalloc+0xad/0xe0
      [ 2748.018370]  kasan_slab_alloc+0x12/0x20
      [ 2748.022648]  kmem_cache_alloc+0xb8/0x1b0
      [ 2748.027024]  alloc_buffer_head+0x22/0xc0
      [ 2748.031399]  alloc_page_buffers+0xd1/0x250
      [ 2748.035968]  create_empty_buffers+0x30/0x410
      [ 2748.040730]  create_page_buffers+0x120/0x1b0
      [ 2748.045493]  __block_write_begin_int+0x17a/0x1800
      [ 2748.050740]  iomap_write_begin+0x100/0x2f0
      [ 2748.055308]  iomap_zero_range_actor+0x253/0x5c0
      [ 2748.060362]  iomap_apply+0x157/0x270
      [ 2748.064347]  iomap_zero_range+0x5a/0x80
      [ 2748.068624]  iomap_truncate_page+0x6b/0xa0
      [ 2748.073227]  xfs_setattr_size+0x1f7/0xa10 [xfs]
      [ 2748.078312]  xfs_vn_setattr_size+0x68/0x140 [xfs]
      [ 2748.083589]  xfs_file_fallocate+0x4ac/0x820 [xfs]
      [ 2748.088838]  vfs_fallocate+0x2cf/0x780
      [ 2748.093021]  SyS_fallocate+0x48/0x80
      [ 2748.097006]  do_syscall_64+0x18a/0x430
      [ 2748.101186]  return_from_SYSCALL_64+0x0/0x6a
      [ 2748.105948] Freed:
      [ 2748.108189] PID = 8327
      [ 2748.110816]  save_stack_trace+0x1b/0x20
      [ 2748.115093]  save_stack+0x46/0xd0
      [ 2748.118788]  kasan_slab_free+0x73/0xc0
      [ 2748.122969]  kmem_cache_free+0x7a/0x200
      [ 2748.127247]  free_buffer_head+0x41/0x80
      [ 2748.131524]  try_to_free_buffers+0x178/0x250
      [ 2748.136316]  xfs_vm_releasepage+0x2e9/0x3d0 [xfs]
      [ 2748.141563]  try_to_release_page+0x100/0x180
      [ 2748.146325]  invalidate_inode_pages2_range+0x7da/0xcf0
      [ 2748.152087]  xfs_shift_file_space+0x37d/0x6e0 [xfs]
      [ 2748.157557]  xfs_collapse_file_space+0x49/0x120 [xfs]
      [ 2748.163223]  xfs_file_fallocate+0x2a7/0x820 [xfs]
      [ 2748.168462]  vfs_fallocate+0x2cf/0x780
      [ 2748.172642]  SyS_fallocate+0x48/0x80
      [ 2748.176629]  do_syscall_64+0x18a/0x430
      [ 2748.180810]  return_from_SYSCALL_64+0x0/0x6a
      
      Fixed it by checking on offset against end & breaking out first,
      dereference bh only if there're still bufferheads to process.
      Signed-off-by: default avatarEryu Guan <eguan@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      161f55ef
  2. 03 May, 2017 1 commit
    • Darrick J. Wong's avatar
      xfs: reserve enough blocks to handle btree splits when remapping · fe0be23e
      Darrick J. Wong authored
      In xfs_reflink_end_cow, we erroneously reserve only enough blocks to
      handle adding 1 extent.  This is problematic if we fragment free space,
      have to do CoW, and then have to perform multiple bmap btree expansions.
      Furthermore, the BUI recovery routine doesn't reserve /any/ blocks to
      handle btree splits, so log recovery fails after our first error causes
      the filesystem to go down.
      
      Therefore, refactor the transaction block reservation macros until we
      have a macro that works for our deferred (re)mapping activities, and fix
      both problems by using that macro.
      
      With 1k blocks we can hit this fairly often in g/187 if the scratch fs
      is big enough.
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      fe0be23e
  3. 28 Apr, 2017 4 commits
    • Brian Foster's avatar
      xfs: wait on new inodes during quotaoff dquot release · e20c8a51
      Brian Foster authored
      The quotaoff operation has a race with inode allocation that results
      in a livelock. An inode allocation that occurs before the quota
      status flags are updated acquires the appropriate dquots for the
      inode via xfs_qm_vop_dqalloc(). It then inserts the XFS_INEW inode
      into the perag radix tree, sometime later attaches the dquots to the
      inode and finally clears the XFS_INEW flag. Quotaoff expects to
      release the dquots from all inodes in the filesystem via
      xfs_qm_dqrele_all_inodes(). This invokes the AG inode iterator,
      which skips inodes in the XFS_INEW state because they are not fully
      constructed. If the scan occurs after dquots have been attached to
      an inode, but before XFS_INEW is cleared, the newly allocated inode
      will continue to hold a reference to the applicable dquots. When
      quotaoff invokes xfs_qm_dqpurge_all(), the reference count of those
      dquot(s) remain elevated and the dqpurge scan spins indefinitely.
      
      To address this problem, update the xfs_qm_dqrele_all_inodes() scan
      to wait on inodes marked on the XFS_INEW state. We wait on the
      inodes explicitly rather than skip and retry to avoid continuous
      retry loops due to a parallel inode allocation workload. Since
      quotaoff updates the quota state flags and uses a synchronous
      transaction before the dqrele scan, and dquots are attached to
      inodes after radix tree insertion iff quota is enabled, one INEW
      waiting pass through the AG guarantees that the scan has processed
      all inodes that could possibly hold dquot references.
      Reported-by: default avatarEryu Guan <eguan@redhat.com>
      Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      e20c8a51
    • Brian Foster's avatar
      xfs: update ag iterator to support wait on new inodes · ae2c4ac2
      Brian Foster authored
      The AG inode iterator currently skips new inodes as such inodes are
      inserted into the inode radix tree before they are fully
      constructed. Certain contexts require the ability to wait on the
      construction of new inodes, however. The fs-wide dquot release from
      the quotaoff sequence is an example of this.
      
      Update the AG inode iterator to support the ability to wait on
      inodes flagged with XFS_INEW upon request. Create a new
      xfs_inode_ag_iterator_flags() interface and support a set of
      iteration flags to modify the iteration behavior. When the
      XFS_AGITER_INEW_WAIT flag is set, include XFS_INEW flags in the
      radix tree inode lookup and wait on them before the callback is
      executed.
      Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      ae2c4ac2
    • Brian Foster's avatar
      xfs: support ability to wait on new inodes · 756baca2
      Brian Foster authored
      Inodes that are inserted into the perag tree but still under
      construction are flagged with the XFS_INEW bit. Most contexts either
      skip such inodes when they are encountered or have the ability to
      handle them.
      
      The runtime quotaoff sequence introduces a context that must wait
      for construction of such inodes to correctly ensure that all dquots
      in the fs are released. In anticipation of this, support the ability
      to wait on new inodes. Wake the appropriate bit when XFS_INEW is
      cleared.
      Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      756baca2
    • Amir Goldstein's avatar
      xfs: publish UUID in struct super_block · 8f720d9f
      Amir Goldstein authored
      Copy the uuid of the filesystem to struct super_block s_uuid field,
      as several other filesystems already do.  Copy regardless of the nouuid
      mount option, because other filesystems also do not guaranty uniqueness
      of the s_uuid field in super_block struct.
      Signed-off-by: default avatarAmir Goldstein <amir73il@gmail.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      8f720d9f
  4. 27 Apr, 2017 1 commit
  5. 25 Apr, 2017 24 commits
  6. 12 Apr, 2017 3 commits
  7. 06 Apr, 2017 2 commits
  8. 03 Apr, 2017 4 commits