1. 14 Jun, 2017 30 commits
  2. 07 Jun, 2017 10 commits
    • Linux 4.9.31 · f1aa865a
      Greg Kroah-Hartman authored
      f1aa865a
    • xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff() · 11214bd2
      Jan Kara authored
      commit d7fd2425 upstream.
      
      There is an off-by-one error in loop termination conditions in
      xfs_find_get_desired_pgoff() since 'end' may index a page beyond end of
      desired range if 'endoff' is page aligned. It doesn't have any visible
      effects but still it is good to fix it.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      11214bd2
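The off-by-one described above can be illustrated with a minimal sketch. This is not the kernel code: the helper names and `SKETCH_PAGE_SHIFT` are hypothetical, and the sketch assumes 4 KiB pages and an exclusive end offset `endoff`.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Index of the last page covered by the byte range [0, endoff).
 * Shifting endoff directly overshoots by one page whenever endoff
 * is page aligned, because endoff itself is outside the range.
 */
static uint64_t last_page_buggy(uint64_t endoff)
{
    return endoff >> SKETCH_PAGE_SHIFT;
}

/* Shift the last byte that is inside the range instead. */
static uint64_t last_page_fixed(uint64_t endoff)
{
    assert(endoff > 0);
    return (endoff - 1) >> SKETCH_PAGE_SHIFT;
}
```

For an unaligned `endoff` both helpers agree; they differ only when `endoff` falls exactly on a page boundary, which matches the commit's note that the bug has no visible effect in most cases.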
    • xfs: fix unaligned access in xfs_btree_visit_blocks · 75c5afd5
      Eric Sandeen authored
      commit a4d768e7 upstream.
      
      This structure copy was throwing unaligned access warnings on sparc64:
      
      Kernel unaligned access at TPC[1043c088] xfs_btree_visit_blocks+0x88/0xe0 [xfs]
      
      xfs_btree_copy_ptrs does a memcpy, which avoids it.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      75c5afd5
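A minimal userspace sketch of why a memcpy avoids the warning: a structure assignment through a cast pointer lets the compiler emit an aligned load, while memcpy makes no alignment assumption. `union sketch_ptr` and `read_ptr` are hypothetical stand-ins, not the XFS types or `xfs_btree_copy_ptrs` itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a short/long btree pointer. */
union sketch_ptr {
    uint32_t s;
    uint64_t l;
};

/*
 * Read a pointer value from a possibly unaligned location.
 * memcpy copies byte-wise (or with whatever access the target
 * permits), so strict-alignment machines do not trap here.
 */
static union sketch_ptr read_ptr(const unsigned char *raw)
{
    union sketch_ptr p;

    assert(raw != NULL);
    memcpy(&p, raw, sizeof(p));
    return p;
}
```

Compilers typically lower a small fixed-size memcpy like this to plain loads and stores on architectures that tolerate unaligned access, so the safer form usually costs nothing there.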
    • xfs: avoid mount-time deadlock in CoW extent recovery · 7fb8ab8f
      Darrick J. Wong authored
      commit 3ecb3ac7 upstream.
      
      If a malicious user corrupts the refcount btree to cause a cycle between
      different levels of the tree, the next mount attempt will deadlock in
      the CoW recovery routine while grabbing buffer locks.  We can use the
      ability to re-grab a buffer that was previously locked to a transaction to
      avoid deadlocks, so do that here.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7fb8ab8f
    • xfs: xfs_trans_alloc_empty · e40c145c
      Christoph Hellwig authored
      This is a partial cherry-pick of commit e89c0413
      ("xfs: implement the GETFSMAP ioctl"), which also adds this helper, and
      a great example of why feature patches should be properly split into
      their parts.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      [hch: split from the larger patch for -stable]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      e40c145c
    • xfs: bad assertion for delalloc an extent that start at i_size · 0e542792
      Zorro Lang authored
      commit 892d2a5f upstream.
      
      By running fsstress for long enough on RHEL-7, I hit an
      assertion failure (harder to reproduce on linux-4.11, but the problem
      is still there):
      
        XFS: Assertion failed: (iflags & BMV_IF_DELALLOC) != 0, file: fs/xfs/xfs_bmap_util.c
      
      The assertion is in the xfs_getbmap() function:
      
        if (map[i].br_startblock == DELAYSTARTBLOCK &&
      -->   map[i].br_startoff <= XFS_B_TO_FSB(mp, XFS_ISIZE(ip)))
                ASSERT((iflags & BMV_IF_DELALLOC) != 0);
      
      When map[i].br_startoff == XFS_B_TO_FSB(mp, XFS_ISIZE(ip)), the
      startoff is just at EOF. But we only need to check delalloc
      extents that start within EOF, not those that start at EOF.
      Signed-off-by: Zorro Lang <zlang@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0e542792
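The boundary case can be sketched in plain C. The helpers below are hypothetical; the sketch assumes the byte-to-block conversion rounds up (as XFS_B_TO_FSB is understood to do) and that the fix amounts to making the comparison strict.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round a byte count up to filesystem blocks of size 2^blocklog. */
static uint64_t bytes_to_blocks_roundup(uint64_t bytes, unsigned blocklog)
{
    assert(blocklog < 64);
    return (bytes + ((uint64_t)1 << blocklog) - 1) >> blocklog;
}

/* Old check: also matches an extent that starts exactly at EOF. */
static bool within_eof_old(uint64_t startoff, uint64_t isize, unsigned blocklog)
{
    return startoff <= bytes_to_blocks_roundup(isize, blocklog);
}

/* Strict check: only extents that start before the EOF block match. */
static bool within_eof_new(uint64_t startoff, uint64_t isize, unsigned blocklog)
{
    return startoff < bytes_to_blocks_roundup(isize, blocklog);
}
```

With a file size of 8192 bytes and 4 KiB blocks, EOF is at block 2. An extent starting at block 2 passes the old check and can trip the assertion, but it is excluded by the strict one.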
    • xfs: BMAPX shouldn't barf on inline-format directories · f60d76ef
      Darrick J. Wong authored
      commit 6eadbf4c upstream.
      
      When we're fulfilling a BMAPX request, jump out early if the data fork
      is in local format.  This prevents us from hitting a debugging check in
      bmapi_read and barfing errors back to userspace.  The on-disk extent
      count check later isn't sufficient for IF_DELALLOC mode because da
      extents are in memory and not on disk.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f60d76ef
    • xfs: fix indlen accounting error on partial delalloc conversion · 53c44c23
      Brian Foster authored
      commit 0daaecac upstream.
      
      The delalloc -> real block conversion path uses an incorrect
      calculation in the case where the middle part of a delalloc extent
      is being converted. This is documented as a rare situation because
      XFS generally attempts to maximize contiguity by converting as much
      of a delalloc extent as possible.
      
      If this situation does occur, the indlen reservation for the two new
      delalloc extents left behind by the conversion of the middle range
      is calculated and compared with the original reservation. If more
      blocks are required, the delta is allocated from the global block
      pool. This delta value can be characterized as the difference
      between the new total requirement (temp + temp2) and the currently
      available reservation minus those blocks that have already been
      allocated (startblockval(PREV.br_startblock) - allocated).
      
      The problem is that the current code does not account for previously
      allocated blocks correctly. It subtracts the current allocation
      count from the (new - old) delta rather than the old indlen
      reservation. This means that more indlen blocks than have been
      allocated end up stashed in the remaining extents and free space
      accounting is broken as a result.
      
      Fix up the calculation to subtract the allocated block count from
      the original extent indlen and thus correctly allocate the
      reservation delta based on the difference between the new total
      requirement and the unused blocks from the original reservation.
      Also remove a bogus assert that contradicts the fact that the new
      indlen reservation can be larger than the original indlen
      reservation.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      53c44c23
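The arithmetic in the message above can be worked through with a small sketch. The function and parameter names are hypothetical; `temp` and `temp2` follow the message, `old_indlen` stands for startblockval(PREV.br_startblock), and the sketch assumes a non-positive delta means nothing extra is taken from the global pool.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy form: subtracts 'allocated' from the (new - old) delta. */
static int64_t indlen_delta_buggy(int64_t temp, int64_t temp2,
                                  int64_t old_indlen, int64_t allocated)
{
    int64_t diff = (temp + temp2) - old_indlen - allocated;

    return diff > 0 ? diff : 0;
}

/*
 * Form described by the fix: first remove the already allocated
 * blocks from the old reservation, then compare the new total
 * requirement against what is actually left over.
 */
static int64_t indlen_delta_fixed(int64_t temp, int64_t temp2,
                                  int64_t old_indlen, int64_t allocated)
{
    int64_t unused;
    int64_t diff;

    assert(allocated <= old_indlen); /* assumption for this sketch */
    unused = old_indlen - allocated;
    diff = (temp + temp2) - unused;
    return diff > 0 ? diff : 0;
}
```

With temp = 5, temp2 = 6, an old reservation of 10, and 2 blocks already allocated, only 8 blocks remain for a requirement of 11. The buggy form computes 11 - 10 - 2 = -1 and reserves nothing, while the fixed form computes 11 - 8 = 3, so the buggy path leaves the new extents claiming 3 blocks that were never reserved.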
    • xfs: fix use-after-free in xfs_finish_page_writeback · 54894ea3
      Eryu Guan authored
      commit 161f55ef upstream.
      
      Commit 28b783e4 ("xfs: bufferhead chains are invalid after
      end_page_writeback") fixed one use-after-free issue by
      pre-calculating the loop conditionals before calling bh->b_end_io()
      in the end_io processing loop, but it assigned the 'next' pointer before
      checking the end offset boundary and breaking out of the loop, at which
      point the bh might already be freed, causing a use-after-free.
      
      This is caught by KASAN when running fstests generic/127 on sub-page
      block size XFS.
      
      [ 2517.244502] run fstests generic/127 at 2017-04-27 07:30:50
      [ 2747.868840] ==================================================================
      [ 2747.876949] BUG: KASAN: use-after-free in xfs_destroy_ioend+0x3d3/0x4e0 [xfs] at addr ffff8801395ae698
      ...
      [ 2747.918245] Call Trace:
      [ 2747.920975]  dump_stack+0x63/0x84
      [ 2747.924673]  kasan_object_err+0x21/0x70
      [ 2747.928950]  kasan_report+0x271/0x530
      [ 2747.933064]  ? xfs_destroy_ioend+0x3d3/0x4e0 [xfs]
      [ 2747.938409]  ? end_page_writeback+0xce/0x110
      [ 2747.943171]  __asan_report_load8_noabort+0x19/0x20
      [ 2747.948545]  xfs_destroy_ioend+0x3d3/0x4e0 [xfs]
      [ 2747.953724]  xfs_end_io+0x1af/0x2b0 [xfs]
      [ 2747.958197]  process_one_work+0x5ff/0x1000
      [ 2747.962766]  worker_thread+0xe4/0x10e0
      [ 2747.966946]  kthread+0x2d3/0x3d0
      [ 2747.970546]  ? process_one_work+0x1000/0x1000
      [ 2747.975405]  ? kthread_create_on_node+0xc0/0xc0
      [ 2747.980457]  ? syscall_return_slowpath+0xe6/0x140
      [ 2747.985706]  ? do_page_fault+0x30/0x80
      [ 2747.989887]  ret_from_fork+0x2c/0x40
      [ 2747.993874] Object at ffff8801395ae690, in cache buffer_head size: 104
      [ 2748.001155] Allocated:
      [ 2748.003782] PID = 8327
      [ 2748.006411]  save_stack_trace+0x1b/0x20
      [ 2748.010688]  save_stack+0x46/0xd0
      [ 2748.014383]  kasan_kmalloc+0xad/0xe0
      [ 2748.018370]  kasan_slab_alloc+0x12/0x20
      [ 2748.022648]  kmem_cache_alloc+0xb8/0x1b0
      [ 2748.027024]  alloc_buffer_head+0x22/0xc0
      [ 2748.031399]  alloc_page_buffers+0xd1/0x250
      [ 2748.035968]  create_empty_buffers+0x30/0x410
      [ 2748.040730]  create_page_buffers+0x120/0x1b0
      [ 2748.045493]  __block_write_begin_int+0x17a/0x1800
      [ 2748.050740]  iomap_write_begin+0x100/0x2f0
      [ 2748.055308]  iomap_zero_range_actor+0x253/0x5c0
      [ 2748.060362]  iomap_apply+0x157/0x270
      [ 2748.064347]  iomap_zero_range+0x5a/0x80
      [ 2748.068624]  iomap_truncate_page+0x6b/0xa0
      [ 2748.073227]  xfs_setattr_size+0x1f7/0xa10 [xfs]
      [ 2748.078312]  xfs_vn_setattr_size+0x68/0x140 [xfs]
      [ 2748.083589]  xfs_file_fallocate+0x4ac/0x820 [xfs]
      [ 2748.088838]  vfs_fallocate+0x2cf/0x780
      [ 2748.093021]  SyS_fallocate+0x48/0x80
      [ 2748.097006]  do_syscall_64+0x18a/0x430
      [ 2748.101186]  return_from_SYSCALL_64+0x0/0x6a
      [ 2748.105948] Freed:
      [ 2748.108189] PID = 8327
      [ 2748.110816]  save_stack_trace+0x1b/0x20
      [ 2748.115093]  save_stack+0x46/0xd0
      [ 2748.118788]  kasan_slab_free+0x73/0xc0
      [ 2748.122969]  kmem_cache_free+0x7a/0x200
      [ 2748.127247]  free_buffer_head+0x41/0x80
      [ 2748.131524]  try_to_free_buffers+0x178/0x250
      [ 2748.136316]  xfs_vm_releasepage+0x2e9/0x3d0 [xfs]
      [ 2748.141563]  try_to_release_page+0x100/0x180
      [ 2748.146325]  invalidate_inode_pages2_range+0x7da/0xcf0
      [ 2748.152087]  xfs_shift_file_space+0x37d/0x6e0 [xfs]
      [ 2748.157557]  xfs_collapse_file_space+0x49/0x120 [xfs]
      [ 2748.163223]  xfs_file_fallocate+0x2a7/0x820 [xfs]
      [ 2748.168462]  vfs_fallocate+0x2cf/0x780
      [ 2748.172642]  SyS_fallocate+0x48/0x80
      [ 2748.176629]  do_syscall_64+0x18a/0x430
      [ 2748.180810]  return_from_SYSCALL_64+0x0/0x6a
      
      Fix it by checking the offset against the end and breaking out first,
      and dereference bh only if there are still bufferheads to process.
      Signed-off-by: Eryu Guan <eguan@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      54894ea3
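The ordering problem can be modelled in userspace without real frees. In this sketch every name is hypothetical: the chain is marked "freed" once the last in-range buffer completes, and reads of a freed buffer are only counted, standing in for what KASAN would report.

```c
#include <assert.h>
#include <stddef.h>

#define NBLOCKS 4

struct sketch_bh {
    struct sketch_bh *next; /* circular chain, like b_this_page */
    int freed;              /* set once the chain is "released" */
};

struct sketch_page {
    struct sketch_bh bh[NBLOCKS];
    int pending;     /* in-range buffers not yet completed */
    int stale_reads; /* dereferences of an already freed bh */
};

static void init_page(struct sketch_page *pg, int in_range)
{
    assert(in_range > 0 && in_range <= NBLOCKS);
    for (int i = 0; i < NBLOCKS; i++) {
        pg->bh[i].next = &pg->bh[(i + 1) % NBLOCKS];
        pg->bh[i].freed = 0;
    }
    pg->pending = in_range;
    pg->stale_reads = 0;
}

/* Completing the last pending buffer lets the whole chain go away. */
static void end_io(struct sketch_page *pg)
{
    if (--pg->pending == 0)
        for (int i = 0; i < NBLOCKS; i++)
            pg->bh[i].freed = 1;
}

static struct sketch_bh *read_next(struct sketch_page *pg, struct sketch_bh *bh)
{
    if (bh->freed)
        pg->stale_reads++;
    return bh->next;
}

/* Old order: fetch 'next' first, check the end boundary afterwards. */
static int walk_buggy(struct sketch_page *pg, int end)
{
    struct sketch_bh *head = &pg->bh[0], *bh = head, *next;
    int off = 0;

    do {
        next = read_next(pg, bh);
        if (off > end)
            break;
        end_io(pg);
        off++;
    } while ((bh = next) != head);
    return pg->stale_reads;
}

/* New order: check the boundary before touching bh at all. */
static int walk_fixed(struct sketch_page *pg, int end)
{
    struct sketch_bh *head = &pg->bh[0], *bh = head, *next;
    int off = 0;

    do {
        if (off > end)
            break;
        next = read_next(pg, bh);
        end_io(pg);
        off++;
    } while ((bh = next) != head);
    return pg->stale_reads;
}
```

When only the first two of four buffers belong to the I/O, the old order reads the third buffer after the chain has been released, while the new order breaks out before dereferencing it.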
    • xfs: reserve enough blocks to handle btree splits when remapping · d457f822
      Darrick J. Wong authored
      commit fe0be23e upstream.
      
      In xfs_reflink_end_cow, we erroneously reserve only enough blocks to
      handle adding 1 extent.  This is problematic if we fragment free space,
      have to do CoW, and then have to perform multiple bmap btree expansions.
      Furthermore, the BUI recovery routine doesn't reserve /any/ blocks to
      handle btree splits, so log recovery fails after our first error causes
      the filesystem to go down.
      
      Therefore, refactor the transaction block reservation macros until we
      have a macro that works for our deferred (re)mapping activities, and fix
      both problems by using that macro.
      
      With 1k blocks we can hit this fairly often in g/187 if the scratch fs
      is big enough.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d457f822