1. 20 Nov, 2018 4 commits
    • Dave Chinner's avatar
      xfs: extent shifting doesn't fully invalidate page cache · 7f9f71be
      Dave Chinner authored
      The extent shifting code uses a flush and invalidate mechainsm prior
      to shifting extents around. This is similar to what
      xfs_free_file_space() does, but it doesn't take into account things
      like page cache vs block size differences, and it will fail if there
      is a page that it currently busy.
      
      xfs_flush_unmap_range() handles all of these cases, so just convert
      xfs_prepare_shift() to us that mechanism rather than having it's own
      special sauce.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      7f9f71be
    • Dave Chinner's avatar
      xfs: finobt AG reserves don't consider last AG can be a runt · c0876897
      Dave Chinner authored
      The last AG may be very small comapred to all other AGs, and hence
      AG reservations based on the superblock AG size may actually consume
      more space than the AG actually has. This results on assert failures
      like:
      
      XFS: Assertion failed: xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved + xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved <= pag->pagf_freeblks + pag->pagf_flcount, file: fs/xfs/libxfs/xfs_ag_resv.c, line: 319
      [   48.932891]  xfs_ag_resv_init+0x1bd/0x1d0
      [   48.933853]  xfs_fs_reserve_ag_blocks+0x37/0xb0
      [   48.934939]  xfs_mountfs+0x5b3/0x920
      [   48.935804]  xfs_fs_fill_super+0x462/0x640
      [   48.936784]  ? xfs_test_remount_options+0x60/0x60
      [   48.937908]  mount_bdev+0x178/0x1b0
      [   48.938751]  mount_fs+0x36/0x170
      [   48.939533]  vfs_kern_mount.part.43+0x54/0x130
      [   48.940596]  do_mount+0x20e/0xcb0
      [   48.941396]  ? memdup_user+0x3e/0x70
      [   48.942249]  ksys_mount+0xba/0xd0
      [   48.943046]  __x64_sys_mount+0x21/0x30
      [   48.943953]  do_syscall_64+0x54/0x170
      [   48.944835]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Hence we need to ensure the finobt per-ag space reservations take
      into account the size of the last AG rather than treat it like all
      the other full size AGs.
      
      Note that both refcountbt and rmapbt already take the size of the AG
      into account via reading the AGF length directly.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      c0876897
    • Dave Chinner's avatar
      xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers · d43aaf16
      Dave Chinner authored
      When retrying a failed inode or dquot buffer,
      xfs_buf_resubmit_failed_buffers() clears all the failed flags from
      the inde/dquot log items. In doing so, it also drops all the
      reference counts on the buffer that the failed log items hold. This
      means it can drop all the active references on the buffer and hence
      free the buffer before it queues it for write again.
      
      Putting the buffer on the delwri queue takes a reference to the
      buffer (so that it hangs around until it has been written and
      completed), but this goes bang if the buffer has already been freed.
      
      Hence we need to add the buffer to the delwri queue before we remove
      the failed flags from the log items attached to the buffer to ensure
      it always remains referenced during the resubmit process.
      Reported-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      d43aaf16
    • Dave Chinner's avatar
      xfs: uncached buffer tracing needs to print bno · d61fa8cb
      Dave Chinner authored
      Useless:
      
      xfs_buf_get_uncached: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ...
      xfs_buf_unlock:       dev 253:32 bno 0xffffffffffffffff nblks 0x1 ...
      xfs_buf_submit:       dev 253:32 bno 0xffffffffffffffff nblks 0x1 ...
      xfs_buf_hold:         dev 253:32 bno 0xffffffffffffffff nblks 0x1 ...
      xfs_buf_iowait:       dev 253:32 bno 0xffffffffffffffff nblks 0x1 ...
      xfs_buf_iodone:       dev 253:32 bno 0xffffffffffffffff nblks 0x1 ...
      xfs_buf_iowait_done:  dev 253:32 bno 0xffffffffffffffff nblks 0x1 ...
      xfs_buf_rele:         dev 253:32 bno 0xffffffffffffffff nblks 0x1 ...
      
      Useful:
      
      
      xfs_buf_get_uncached: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ...
      xfs_buf_unlock:       dev 253:32 bno 0xffffffffffffffff nblks 0x1 ...
      xfs_buf_submit:       dev 253:32 bno 0x200b5 nblks 0x1 ...
      xfs_buf_hold:         dev 253:32 bno 0x200b5 nblks 0x1 ...
      xfs_buf_iowait:       dev 253:32 bno 0x200b5 nblks 0x1 ...
      xfs_buf_iodone:       dev 253:32 bno 0x200b5 nblks 0x1 ...
      xfs_buf_iowait_done:  dev 253:32 bno 0x200b5 nblks 0x1 ...
      xfs_buf_rele:         dev 253:32 bno 0x200b5 nblks 0x1 ...
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      d61fa8cb
  2. 19 Nov, 2018 2 commits
    • Eric Biggers's avatar
      xfs: make xfs_file_remap_range() static · da034bcc
      Eric Biggers authored
      xfs_file_remap_range() is only used in fs/xfs/xfs_file.c, so make it
      static.
      
      This addresses a gcc warning when -Wmissing-prototypes is enabled.
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      da034bcc
    • Brian Foster's avatar
      xfs: fix shared extent data corruption due to missing cow reservation · 59e42931
      Brian Foster authored
      Page writeback indirectly handles shared extents via the existence
      of overlapping COW fork blocks. If COW fork blocks exist, writeback
      always performs the associated copy-on-write regardless if the
      underlying blocks are actually shared. If the blocks are shared,
      then overlapping COW fork blocks must always exist.
      
      fstests shared/010 reproduces a case where a buffered write occurs
      over a shared block without performing the requisite COW fork
      reservation.  This ultimately causes writeback to the shared extent
      and data corruption that is detected across md5 checks of the
      filesystem across a mount cycle.
      
      The problem occurs when a buffered write lands over a shared extent
      that crosses an extent size hint boundary and that also happens to
      have a partial COW reservation that doesn't cover the start and end
      blocks of the data fork extent.
      
      For example, a buffered write occurs across the file offset (in FSB
      units) range of [29, 57]. A shared extent exists at blocks [29, 35]
      and COW reservation already exists at blocks [32, 34]. After
      accommodating a COW extent size hint of 32 blocks and the existing
      reservation at offset 32, xfs_reflink_reserve_cow() allocates 32
      blocks of reservation at offset 0 and returns with COW reservation
      across the range of [0, 34]. The associated data fork extent is
      still [29, 35], however, which isn't fully covered by the COW
      reservation.
      
      This leads to a buffered write at file offset 35 over a shared
      extent without associated COW reservation. Writeback eventually
      kicks in, performs an overwrite of the underlying shared block and
      causes the associated data corruption.
      
      Update xfs_reflink_reserve_cow() to accommodate the fact that a
      delalloc allocation request may not fully cover the extent in the
      data fork. Trim the data fork extent appropriately, just as is done
      for shared extent boundaries and/or existing COW reservations that
      happen to overlap the start of the data fork extent. This prevents
      shared/010 failures due to data corruption on reflink enabled
      filesystems.
      Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      59e42931
  3. 06 Nov, 2018 3 commits
    • Dave Chinner's avatar
      xfs: fix overflow in xfs_attr3_leaf_verify · 837514f7
      Dave Chinner authored
      generic/070 on 64k block size filesystems is failing with a verifier
      corruption on writeback or an attribute leaf block:
      
      [   94.973083] XFS (pmem0): Metadata corruption detected at xfs_attr3_leaf_verify+0x246/0x260, xfs_attr3_leaf block 0x811480
      [   94.975623] XFS (pmem0): Unmount and run xfs_repair
      [   94.976720] XFS (pmem0): First 128 bytes of corrupted metadata buffer:
      [   94.978270] 000000004b2e7b45: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00  ........;.......
      [   94.980268] 000000006b1db90b: 00 00 00 00 00 81 14 80 00 00 00 00 00 00 00 00  ................
      [   94.982251] 00000000433f2407: 22 7b 5c 82 2d 5c 47 4c bb 31 1c 37 fa a9 ce d6  "{\.-\GL.1.7....
      [   94.984157] 0000000010dc7dfb: 00 00 00 00 00 81 04 8a 00 0a 18 e8 dd 94 01 00  ................
      [   94.986215] 00000000d5a19229: 00 a0 dc f4 fe 98 01 68 f0 d8 07 e0 00 00 00 00  .......h........
      [   94.988171] 00000000521df36c: 0c 2d 32 e2 fe 20 01 00 0c 2d 58 65 fe 0c 01 00  .-2.. ...-Xe....
      [   94.990162] 000000008477ae06: 0c 2d 5b 66 fe 8c 01 00 0c 2d 71 35 fe 7c 01 00  .-[f.....-q5.|..
      [   94.992139] 00000000a4a6bca6: 0c 2d 72 37 fc d4 01 00 0c 2d d8 b8 f0 90 01 00  .-r7.....-......
      [   94.994789] XFS (pmem0): xfs_do_force_shutdown(0x8) called from line 1453 of file fs/xfs/xfs_buf.c. Return address = ffffffff815365f3
      
      This is failing this check:
      
                      end = ichdr.freemap[i].base + ichdr.freemap[i].size;
                      if (end < ichdr.freemap[i].base)
      >>>>>                   return __this_address;
                      if (end > mp->m_attr_geo->blksize)
                              return __this_address;
      
      And from the buffer output above, the freemap array is:
      
      	freemap[0].base = 0x00a0
      	freemap[0].size = 0xdcf4	end = 0xdd94
      	freemap[1].base = 0xfe98
      	freemap[1].size = 0x0168	end = 0x10000
      	freemap[2].base = 0xf0d8
      	freemap[2].size = 0x07e0	end = 0xf8b8
      
      These all look valid - the block size is 0x10000 and so from the
      last check in the above verifier fragment we know that the end
      of freemap[1] is valid. The problem is that end is declared as:
      
      	uint16_t	end;
      
      And (uint16_t)0x10000 = 0. So we have a verifier bug here, not a
      corruption. Fix the verifier to use uint32_t types for the check and
      hence avoid the overflow.
      
      Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=201577Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      837514f7
    • Darrick J. Wong's avatar
      xfs: print buffer offsets when dumping corrupt buffers · bdec055b
      Darrick J. Wong authored
      Use DUMP_PREFIX_OFFSET when printing hex dumps of corrupt buffers
      because modern Linux now prints a 32-bit hash of our 64-bit pointer when
      using DUMP_PREFIX_ADDRESS:
      
      00000000b4bb4297: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00  ........;.......
      00000005ec77e26: 00 00 00 00 02 d0 5a 00 00 00 00 00 00 00 00 00  ......Z.........
      000000015938018: 21 98 e8 b4 fd de 4c 07 bc ea 3c e5 ae b4 7c 48  !.....L...<...|H
      
      This is totally worthless for a sequential dump since we probably only
      care about tracking the buffer offsets and afaik there's no way to
      recover the actual pointer from the hashed value.
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      bdec055b
    • Christophe JAILLET's avatar
      xfs: Fix error code in 'xfs_ioc_getbmap()' · 132bf672
      Christophe JAILLET authored
      In this function, once 'buf' has been allocated, we unconditionally
      return 0.
      However, 'error' is set to some error codes in several error handling
      paths.
      Before commit 232b5194 ("xfs: simplify the xfs_getbmap interface")
      this was not an issue because all error paths were returning directly,
      but now that some cleanup at the end may be needed, we must propagate the
      error code.
      
      Fixes: 232b5194 ("xfs: simplify the xfs_getbmap interface")
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      132bf672
  4. 04 Nov, 2018 9 commits
    • Linus Torvalds's avatar
      Linux 4.20-rc1 · 65102238
      Linus Torvalds authored
      65102238
    • Linus Torvalds's avatar
      Merge tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifs · 42bd06e9
      Linus Torvalds authored
      Pull UBIFS updates from Richard Weinberger:
      
       - Full filesystem authentication feature, UBIFS is now able to have the
         whole filesystem structure authenticated plus user data encrypted and
         authenticated.
      
       - Minor cleanups
      
      * tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifs: (26 commits)
        ubifs: Remove unneeded semicolon
        Documentation: ubifs: Add authentication whitepaper
        ubifs: Enable authentication support
        ubifs: Do not update inode size in-place in authenticated mode
        ubifs: Add hashes and HMACs to default filesystem
        ubifs: authentication: Authenticate super block node
        ubifs: Create hash for default LPT
        ubfis: authentication: Authenticate master node
        ubifs: authentication: Authenticate LPT
        ubifs: Authenticate replayed journal
        ubifs: Add auth nodes to garbage collector journal head
        ubifs: Add authentication nodes to journal
        ubifs: authentication: Add hashes to index nodes
        ubifs: Add hashes to the tree node cache
        ubifs: Create functions to embed a HMAC in a node
        ubifs: Add helper functions for authentication support
        ubifs: Add separate functions to init/crc a node
        ubifs: Format changes for authentication support
        ubifs: Store read superblock node
        ubifs: Drop write_node
        ...
      42bd06e9
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 4710e789
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
       "Highlights include:
      
        Bugfix:
         - Fix build issues on architectures that don't provide 64-bit cmpxchg
      
        Cleanups:
         - Fix a spelling mistake"
      
      * tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFS: fix spelling mistake, EACCESS -> EACCES
        SUNRPC: Use atomic(64)_t for seq_send(64)
      4710e789
    • Linus Torvalds's avatar
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 35e74524
      Linus Torvalds authored
      Pull more timer updates from Thomas Gleixner:
       "A set of commits for the new C-SKY architecture timers"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        dt-bindings: timer: gx6605s SOC timer
        clocksource/drivers/c-sky: Add gx6605s SOC system timer
        dt-bindings: timer: C-SKY Multi-processor timer
        clocksource/drivers/c-sky: Add C-SKY SMP timer
      35e74524
    • Linus Torvalds's avatar
      Merge tag 'ntb-4.20' of git://github.com/jonmason/ntb · 04578e84
      Linus Torvalds authored
      Pull NTB updates from Jon Mason:
       "Fairly minor changes and bug fixes:
      
        NTB IDT thermal changes and hook into hwmon, ntb_netdev clean-up of
        private struct, and a few bug fixes"
      
      * tag 'ntb-4.20' of git://github.com/jonmason/ntb:
        ntb: idt: Alter the driver info comments
        ntb: idt: Discard temperature sensor IRQ handler
        ntb: idt: Add basic hwmon sysfs interface
        ntb: idt: Alter temperature read method
        ntb_netdev: Simplify remove with client device drvdata
        NTB: transport: Try harder to alloc an aligned MW buffer
        ntb: ntb_transport: Mark expected switch fall-throughs
        ntb: idt: Set PCIe bus address to BARLIMITx
        NTB: ntb_hw_idt: replace IS_ERR_OR_NULL with regular NULL checks
        ntb: intel: fix return value for ndev_vec_mask()
        ntb_netdev: fix sleep time mismatch
      04578e84
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 71e56028
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
       "A memory (under-)allocation fix and a comment fix"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/topology: Fix off by one bug
        sched/rt: Update comment in pick_next_task_rt()
      71e56028
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 601a8807
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "A number of fixes and some late updates:
      
         - make in_compat_syscall() behavior on x86-32 similar to other
           platforms, this touches a number of generic files but is not
           intended to impact non-x86 platforms.
      
         - objtool fixes
      
         - PAT preemption fix
      
         - paravirt fixes/cleanups
      
         - cpufeatures updates for new instructions
      
         - earlyprintk quirk
      
         - make microcode version in sysfs world-readable (it is already
           world-readable in procfs)
      
         - minor cleanups and fixes"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        compat: Cleanup in_compat_syscall() callers
        x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT
        objtool: Support GCC 9 cold subfunction naming scheme
        x86/numa_emulation: Fix uniform-split numa emulation
        x86/paravirt: Remove unused _paravirt_ident_32
        x86/mm/pat: Disable preemption around __flush_tlb_all()
        x86/paravirt: Remove GPL from pv_ops export
        x86/traps: Use format string with panic() call
        x86: Clean up 'sizeof x' => 'sizeof(x)'
        x86/cpufeatures: Enumerate MOVDIR64B instruction
        x86/cpufeatures: Enumerate MOVDIRI instruction
        x86/earlyprintk: Add a force option for pciserial device
        objtool: Support per-function rodata sections
        x86/microcode: Make revision and processor flags world-readable
      601a8807
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 01897f3e
      Linus Torvalds authored
      Pull perf updates and fixes from Ingo Molnar:
       "These are almost all tooling updates: 'perf top', 'perf trace' and
        'perf script' fixes and updates, an UAPI header sync with the merge
        window versions, license marker updates, much improved Sparc support
        from David Miller, and a number of fixes"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (66 commits)
        perf intel-pt/bts: Calculate cpumode for synthesized samples
        perf intel-pt: Insert callchain context into synthesized callchains
        perf tools: Don't clone maps from parent when synthesizing forks
        perf top: Start display thread earlier
        tools headers uapi: Update linux/if_link.h header copy
        tools headers uapi: Update linux/netlink.h header copy
        tools headers: Sync the various kvm.h header copies
        tools include uapi: Update linux/mmap.h copy
        perf trace beauty: Use the mmap flags table generated from headers
        perf beauty: Wire up the mmap flags table generator to the Makefile
        perf beauty: Add a generator for MAP_ mmap's flag constants
        tools include uapi: Update asound.h copy
        tools arch uapi: Update asm-generic/unistd.h and arm64 unistd.h copies
        tools include uapi: Update linux/fs.h copy
        perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc}
        perf cs-etm: Correct CPU mode for samples
        perf unwind: Take pgoff into account when reporting elf to libdwfl
        perf top: Do not use overwrite mode by default
        perf top: Allow disabling the overwrite mode
        perf trace: Beautify mount's first pathname arg
        ...
      01897f3e
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e9ebc215
      Linus Torvalds authored
      Pull irq fixes from Ingo Molnar:
       "An irqchip driver fix and a memory (over-)allocation fix"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/irq-mvebu-sei: Fix a NULL vs IS_ERR() bug in probe function
        irq/matrix: Fix memory overallocation
      e9ebc215
  5. 03 Nov, 2018 22 commits