1. 02 Jan, 2018 2 commits
  2. 21 Dec, 2017 7 commits
  3. 14 Dec, 2017 8 commits
  4. 09 Dec, 2017 3 commits
    • Darrick J. Wong's avatar
      xfs: make iomap_begin functions trim iomaps consistently · b7e0b6ff
      Darrick J. Wong authored
      Historically, the XFS iomap_begin function only returned mappings for
      exactly the range queried, i.e. it doesn't do XFS_BMAPI_ENTIRE lookups.
      The current vfs iomap consumers are only set up to deal with trimmed
      mappings.  xfs_xattr_iomap_begin does BMAPI_ENTIRE lookups, which is
      inconsistent with the current iomap usage.  Remove the flag so that both
      iomap_begin functions behave the same way.
      
      FWIW this also fixes a behavioral regression in xattr FIEMAP that was
      introduced in 4.8 wherein attr fork extents are no longer trimmed like
      they used to be.
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      b7e0b6ff
    • Christoph Hellwig's avatar
      xfs: remove "no-allocation" reservations for file creations · f59cf5c2
      Christoph Hellwig authored
      If we create a new file we will need an inode, and usually some metadata
      in the parent direction.  Aiming for everything to go well despite the
      lack of a reservation leads to dirty transactions cancelled under a heavy
      create/delete load.  This patch removes those nospace transactions, which
      will lead to slightly earlier ENOSPC on some workloads, but instead
      prevent file system shutdowns due to cancelling dirty transactions for
      others.
      
      A customer could observe assertations failures and shutdowns due to
      cancelation of dirty transactions during heavy NFS workloads as shown
      below:
      
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728125] XFS: Assertion failed: error != -ENOSPC, file: fs/xfs/xfs_inode.c, line: 1262
      
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728222] Call Trace:
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728246]  [<ffffffff81795daf>] dump_stack+0x63/0x81
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728262]  [<ffffffff810a1a5a>] warn_slowpath_common+0x8a/0xc0
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728264]  [<ffffffff810a1b8a>] warn_slowpath_null+0x1a/0x20
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728285]  [<ffffffffa01bf403>] asswarn+0x33/0x40 [xfs]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728308]  [<ffffffffa01bb07e>] xfs_create+0x7be/0x7d0 [xfs]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728329]  [<ffffffffa01b6ffb>] xfs_generic_create+0x1fb/0x2e0 [xfs]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728348]  [<ffffffffa01b7114>] xfs_vn_mknod+0x14/0x20 [xfs]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728366]  [<ffffffffa01b7153>] xfs_vn_create+0x13/0x20 [xfs]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728380]  [<ffffffff81231de5>] vfs_create+0xd5/0x140
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728390]  [<ffffffffa045ddb9>] do_nfsd_create+0x499/0x610 [nfsd]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728396]  [<ffffffffa0465fa5>] nfsd3_proc_create+0x135/0x210 [nfsd]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728401]  [<ffffffffa04561e3>] nfsd_dispatch+0xc3/0x210 [nfsd]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728416]  [<ffffffffa03bfa43>] svc_process_common+0x453/0x6f0 [sunrpc]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728423]  [<ffffffffa03bfdf3>] svc_process+0x113/0x1f0 [sunrpc]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728427]  [<ffffffffa0455bcf>] nfsd+0x10f/0x180 [nfsd]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728432]  [<ffffffffa0455ac0>] ? nfsd_destroy+0x80/0x80 [nfsd]
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728438]  [<ffffffff810c0d58>] kthread+0xd8/0xf0
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728441]  [<ffffffff810c0c80>] ? kthread_create_on_node+0x1b0/0x1b0
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728451]  [<ffffffff8179d962>] ret_from_fork+0x42/0x70
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728453]  [<ffffffff810c0c80>] ? kthread_create_on_node+0x1b0/0x1b0
      2017-05-30 21:17:06 kernel: WARNING: [ 2670.728454] ---[ end trace f9822c842fec81d4 ]---
      
      2017-05-30 21:17:06 kernel: ALERT: [ 2670.728477] XFS (sdb): Internal error xfs_trans_cancel at line 983 of file fs/xfs/xfs_trans.c.  Caller xfs_create+0x4ee/0x7d0 [xfs]
      
      2017-05-30 21:17:06 kernel: ALERT: [ 2670.728684] XFS (sdb): Corruption of in-memory data detected. Shutting down filesystem
      2017-05-30 21:17:06 kernel: ALERT: [ 2670.728685] XFS (sdb): Please umount the filesystem and rectify the problem(s)
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      f59cf5c2
    • Pravin Shedge's avatar
      fs: xfs: remove duplicate includes · eaf0ec30
      Pravin Shedge authored
      These duplicate includes have been found with scripts/checkincludes.pl but
      they have been removed manually to avoid removing false positives.
      Signed-off-by: default avatarPravin Shedge <pravin.shedge4linux@gmail.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      eaf0ec30
  5. 30 Nov, 2017 4 commits
  6. 28 Nov, 2017 4 commits
  7. 27 Nov, 2017 3 commits
    • Darrick J. Wong's avatar
      xfs: log recovery should replay deferred ops in order · 50995582
      Darrick J. Wong authored
      As part of testing log recovery with dm_log_writes, Amir Goldstein
      discovered an error in the deferred ops recovery that lead to corruption
      of the filesystem metadata if a reflink+rmap filesystem happened to shut
      down midway through a CoW remap:
      
      "This is what happens [after failed log recovery]:
      
      "Phase 1 - find and verify superblock...
      "Phase 2 - using internal log
      "        - zero log...
      "        - scan filesystem freespace and inode maps...
      "        - found root inode chunk
      "Phase 3 - for each AG...
      "        - scan (but don't clear) agi unlinked lists...
      "        - process known inodes and perform inode discovery...
      "        - agno = 0
      "data fork in regular inode 134 claims CoW block 376
      "correcting nextents for inode 134
      "bad data fork in inode 134
      "would have cleared inode 134"
      
      Hou Tao dissected the log contents of exactly such a crash:
      
      "According to the implementation of xfs_defer_finish(), these ops should
      be completed in the following sequence:
      
      "Have been done:
      "(1) CUI: Oper (160)
      "(2) BUI: Oper (161)
      "(3) CUD: Oper (194), for CUI Oper (160)
      "(4) RUI A: Oper (197), free rmap [0x155, 2, -9]
      
      "Should be done:
      "(5) BUD: for BUI Oper (161)
      "(6) RUI B: add rmap [0x155, 2, 137]
      "(7) RUD: for RUI A
      "(8) RUD: for RUI B
      
      "Actually be done by xlog_recover_process_intents()
      "(5) BUD: for BUI Oper (161)
      "(6) RUI B: add rmap [0x155, 2, 137]
      "(7) RUD: for RUI B
      "(8) RUD: for RUI A
      
      "So the rmap entry [0x155, 2, -9] for COW should be freed firstly,
      then a new rmap entry [0x155, 2, 137] will be added. However, as we can see
      from the log record in post_mount.log (generated after umount) and the trace
      print, the new rmap entry [0x155, 2, 137] are added firstly, then the rmap
      entry [0x155, 2, -9] are freed."
      
      When reconstructing the internal log state from the log items found on
      disk, it's required that deferred ops replay in exactly the same order
      that they would have had the filesystem not gone down.  However,
      replaying unfinished deferred ops can create /more/ deferred ops.  These
      new deferred ops are finished in the wrong order.  This causes fs
      corruption and replay crashes, so let's create a single defer_ops to
      handle the subsequent ops created during replay, then use one single
      transaction at the end of log recovery to ensure that everything is
      replayed in the same order as they're supposed to be.
      Reported-by: default avatarAmir Goldstein <amir73il@gmail.com>
      Analyzed-by: default avatarHou Tao <houtao1@huawei.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Tested-by: default avatarAmir Goldstein <amir73il@gmail.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      50995582
    • Darrick J. Wong's avatar
      xfs: always free inline data before resetting inode fork during ifree · 98c4f78d
      Darrick J. Wong authored
      In xfs_ifree, we reset the data/attr forks to extents format without
      bothering to free any inline data buffer that might still be around
      after all the blocks have been truncated off the file.  Prior to commit
      43518812 ("xfs: remove support for inlining data/extents into the
      inode fork") nobody noticed because the leftover inline data after
      truncation was small enough to fit inside the inline buffer inside the
      fork itself.
      
      However, now that we've removed the inline buffer, we /always/ have to
      free the inline data buffer or else we leak them like crazy.  This test
      was found by turning on kmemleak for generic/001 or generic/388.
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      98c4f78d
    • Linus Torvalds's avatar
      Linux 4.15-rc1 · 4fbd8d19
      Linus Torvalds authored
      4fbd8d19
  8. 26 Nov, 2017 8 commits
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm · bbecb1cf
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
      
       - LPAE fixes for kernel-readonly regions
      
       - Fix for get_user_pages_fast on LPAE systems
      
       - avoid tying decompressor to a particular platform if DEBUG_LL is
         enabled
      
       - BUG if we attempt to return to userspace but the to-be-restored PSR
         value keeps us in privileged mode (defeating an issue that ftracetest
         found)
      
      * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: BUG if jumping to usermode address in kernel mode
        ARM: 8722/1: mm: make STRICT_KERNEL_RWX effective for LPAE
        ARM: 8721/1: mm: dump: check hardware RO bit for LPAE
        ARM: make decompressor debug output user selectable
        ARM: fix get_user_pages_fast
      bbecb1cf
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · dec0029a
      Linus Torvalds authored
      Pull irq fixes from Thomas Glexiner:
      
       - unbreak the irq trigger type check for legacy platforms
      
       - a handful fixes for ARM GIC v3/4 interrupt controllers
      
       - a few trivial fixes all over the place
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/matrix: Make - vs ?: Precedence explicit
        irqchip/imgpdc: Use resource_size function on resource object
        irqchip/qcom: Fix u32 comparison with value less than zero
        irqchip/exiu: Fix return value check in exiu_init()
        irqchip/gic-v3-its: Remove artificial dependency on PCI
        irqchip/gic-v4: Add forward definition of struct irq_domain_ops
        irqchip/gic-v3: pr_err() strings should end with newlines
        irqchip/s3c24xx: pr_err() strings should end with newlines
        irqchip/gic-v3: Fix ppi-partitions lookup
        irqchip/gic-v4: Clear IRQ_DISABLE_UNLAZY again if mapping fails
        genirq: Track whether the trigger type has been set
      dec0029a
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 02fc87b1
      Linus Torvalds authored
      Pull misc x86 fixes from Ingo Molnar:
       - topology enumeration fixes
       - KASAN fix
       - two entry fixes (not yet the big series related to KASLR)
       - remove obsolete code
       - instruction decoder fix
       - better /dev/mem sanity checks, hopefully working better this time
       - pkeys fixes
       - two ACPI fixes
       - 5-level paging related fixes
       - UMIP fixes that should make application visible faults more debuggable
       - boot fix for weird virtualization environment
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
        x86/decoder: Add new TEST instruction pattern
        x86/PCI: Remove unused HyperTransport interrupt support
        x86/umip: Fix insn_get_code_seg_params()'s return value
        x86/boot/KASLR: Remove unused variable
        x86/entry/64: Add missing irqflags tracing to native_load_gs_index()
        x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
        x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing
        x86/pkeys/selftests: Fix protection keys write() warning
        x86/pkeys/selftests: Rename 'si_pkey' to 'siginfo_pkey'
        x86/mpx/selftests: Fix up weird arrays
        x86/pkeys: Update documentation about availability
        x86/umip: Print a warning into the syslog if UMIP-protected instructions are used
        x86/smpboot: Fix __max_logical_packages estimate
        x86/topology: Avoid wasting 128k for package id array
        perf/x86/intel/uncore: Cache logical pkg id in uncore driver
        x86/acpi: Reduce code duplication in mp_override_legacy_irq()
        x86/acpi: Handle SCI interrupts above legacy space gracefully
        x86/boot: Fix boot failure when SMP MP-table is based at 0
        x86/mm: Limit mmap() of /dev/mem to valid physical addresses
        x86/selftests: Add test for mapping placement for 5-level paging
        ...
      02fc87b1
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6830c8db
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
       "Misc fixes: a documentation fix, a Sparse warning fix and a debugging
        fix"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/debug: Fix task state recording/printout
        sched/deadline: Don't use dubious signed bitfields
        sched/deadline: Fix the description of runtime accounting in the documentation
      6830c8db
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 580e3d55
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Misc fixes: two PMU driver fixes and a memory leak fix"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/core: Fix memory leak triggered by perf --namespace
        perf/x86/intel/uncore: Add event constraint for BDX PCU
        perf/x86/intel: Hide TSX events when RTM is not supported
      580e3d55
    • Linus Torvalds's avatar
      Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · cd4b5d5d
      Linus Torvalds authored
      Pull static key fix from Ingo Molnar:
       "Fix a boot warning related to bad init ordering of the static keys
        self-test"
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        jump_label: Invoke jump_label_test() via early_initcall()
      cd4b5d5d
    • Linus Torvalds's avatar
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fcbc38b1
      Linus Torvalds authored
      Pull objtool fixes from Ingo Molnar:
       "A handful of objtool fixes, most of them related to making the UAPI
        header-syncing warnings easier to read and easier to act upon"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tools/headers: Sync objtool UAPI header
        objtool: Fix cross-build
        objtool: Move kernel headers/code sync check to a script
        objtool: Move synced files to their original relative locations
        objtool: Make unreachable annotation inline asms explicitly volatile
        objtool: Add a comment for the unreachable annotation macros
      fcbc38b1
    • Russell King's avatar
      ARM: BUG if jumping to usermode address in kernel mode · 8bafae20
      Russell King authored
      Detect if we are returning to usermode via the normal kernel exit paths
      but the saved PSR value indicates that we are in kernel mode.  This
      could occur due to corrupted stack state, which has been observed with
      "ftracetest".
      
      This ensures that we catch the problem case before we get to user code.
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      8bafae20
  9. 25 Nov, 2017 1 commit
    • Linus Torvalds's avatar
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 844056fd
      Linus Torvalds authored
      Pull timer updates from Thomas Gleixner:
      
       - The final conversion of timer wheel timers to timer_setup().
      
         A few manual conversions and a large coccinelle assisted sweep and
         the removal of the old initialization mechanisms and the related
         code.
      
       - Remove the now unused VSYSCALL update code
      
       - Fix permissions of /proc/timer_list. I still need to get rid of that
         file completely
      
       - Rename a misnomed clocksource function and remove a stale declaration
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
        m68k/macboing: Fix missed timer callback assignment
        treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts
        timer: Remove redundant __setup_timer*() macros
        timer: Pass function down to initialization routines
        timer: Remove unused data arguments from macros
        timer: Switch callback prototype to take struct timer_list * argument
        timer: Pass timer_list pointer to callbacks unconditionally
        Coccinelle: Remove setup_timer.cocci
        timer: Remove setup_*timer() interface
        timer: Remove init_timer() interface
        treewide: setup_timer() -> timer_setup() (2 field)
        treewide: setup_timer() -> timer_setup()
        treewide: init_timer() -> setup_timer()
        treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
        s390: cmm: Convert timers to use timer_setup()
        lightnvm: Convert timers to use timer_setup()
        drivers/net: cris: Convert timers to use timer_setup()
        drm/vc4: Convert timers to use timer_setup()
        block/laptop_mode: Convert timers to use timer_setup()
        net/atm/mpc: Avoid open-coded assignment of timer callback function
        ...
      844056fd