  1. 08 May, 2020 7 commits
    • gfs2: Change BUG_ON to an assert_withdraw in gfs2_quota_change · f9615fe3
      Bob Peterson authored
      Before this patch, gfs2_quota_change() would BUG_ON if the
      qa_ref counter was not a positive number. This patch changes it to
      be a withdraw instead. That way we can debug things more easily.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Fix problems regarding gfs2_qa_get and _put · 2297ab61
      Bob Peterson authored
      This patch fixes a couple of places in which gfs2_qa_get and gfs2_qa_put are
      not balanced: we now keep references around whenever a file is open for writing
      (see gfs2_open_common and gfs2_release), so we need to put all references we
      grab in function gfs2_create_inode.  This was broken in the successful case and
      on one error path.
      
      This also means that we don't have a reference to put in gfs2_evict_inode.
      
      In addition, gfs2_qa_put was called for the wrong inode in gfs2_link.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
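The balancing rule described in the commit above can be illustrated with a toy reference counter. This is only a sketch: `toy_inode`, `toy_qa_get`, `toy_qa_put`, and `toy_create` are hypothetical names, not the gfs2 functions, and the real create path does far more work. The point is that every reference taken inside a function is dropped on both the success path and the error path.

```c
#include <assert.h>

/* Hypothetical stand-in for per-inode quota data; not the gfs2 structure. */
struct toy_inode {
    int qa_ref; /* outstanding quota-data references */
};

static void toy_qa_get(struct toy_inode *ip)
{
    ip->qa_ref++;
}

static void toy_qa_put(struct toy_inode *ip)
{
    assert(ip->qa_ref > 0); /* an unbalanced put is a bug */
    ip->qa_ref--;
}

/*
 * Toy "create" operation: takes a reference on the directory and on the
 * new inode, then drops both through a single exit label so the success
 * path and the error path stay balanced.
 */
static int toy_create(struct toy_inode *dir, struct toy_inode *inode, int fail)
{
    int ret = 0;

    toy_qa_get(dir);
    toy_qa_get(inode);

    if (fail) {
        ret = -1;
        goto out;
    }
    /* ... the real work would happen here ... */
out:
    toy_qa_put(inode);
    toy_qa_put(dir);
    return ret;
}
```

Routing both outcomes through one exit label makes an unbalanced path harder to introduce by accident.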
    • gfs2: More gfs2_find_jhead fixes · aa83da7f
      Andreas Gruenbacher authored
      It turns out that when extending an existing bio, gfs2_find_jhead fails to
      check if the block number is consecutive, which leads to incorrect reads for
      fragmented journals.
      
      In addition, limit the maximum bio size to an arbitrary value of 2 megabytes:
      since commit 07173c3e ("block: enable multipage bvecs"), if we just keep
      adding pages until bio_add_page fails, bios will grow much larger than useful,
      which pins more memory than necessary with barely any additional performance
      gains.
      
      Fixes: f4686c26 ("gfs2: read journal in large chunks")
      Cc: stable@vger.kernel.org # v5.2+
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
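The two conditions named in the commit above (only extend a request when the next block is consecutive, and cap the request size) can be sketched without any block-layer code. The function below is a simplified model, not the gfs2_find_jhead implementation: `toy_count_requests` and `TOY_MAX_BIO_BYTES` are hypothetical names, and a "request" here is just a counter standing in for a bio.

```c
#include <stddef.h>
#include <stdint.h>

/* 2 MiB cap, mirroring the limit mentioned in the commit message. */
#define TOY_MAX_BIO_BYTES (2u * 1024 * 1024)

/*
 * Count how many read requests are needed to cover the given list of
 * block numbers. The current request is extended only when the next
 * block is consecutive AND the request would stay within the size cap;
 * otherwise a new request is started.
 */
size_t toy_count_requests(const uint64_t *blocks, size_t n, uint32_t block_size)
{
    size_t requests = 0;
    uint64_t next_expected = 0;
    uint32_t cur_bytes = 0;

    for (size_t i = 0; i < n; i++) {
        int can_extend = requests > 0 &&
                         blocks[i] == next_expected &&
                         cur_bytes + block_size <= TOY_MAX_BIO_BYTES;

        if (!can_extend) {
            requests++;      /* start a new request */
            cur_bytes = 0;
        }
        cur_bytes += block_size;
        next_expected = blocks[i] + 1;
    }
    return requests;
}
```

With a fragmented block list such as 10, 11, 12, 20, 21, the model starts a second request at block 20 instead of appending it to the first one.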
    • gfs2: Another gfs2_walk_metadata fix · 566a2ab3
      Andreas Gruenbacher authored
      Make sure we don't walk past the end of the metadata in gfs2_walk_metadata: the
      inode holds fewer pointers than indirect blocks.
      
      Slightly clean up gfs2_iomap_get.
      
      Fixes: a27a0c9b ("gfs2: gfs2_walk_metadata fix")
      Cc: stable@vger.kernel.org # v5.3+
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • gfs2: Fix use-after-free in gfs2_logd after withdraw · d22f69a0
      Bob Peterson authored
      When the gfs2_logd daemon withdrew, the withdraw sequence called
      into make_fs_ro() to make the file system read-only. That caused the
      journal descriptors to be freed. However, those journal descriptors
      were used by gfs2_logd's call to gfs2_ail_flush_reqd(). This caused
      a use-after-free and NULL pointer dereference.
      
      This patch changes function gfs2_logd() so that it stops all logd
      work until the thread is told to stop. Once a withdraw is done,
      it only does an interruptible sleep.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Fix BUG during unmount after file system withdraw · 53af80ce
      Bob Peterson authored
      Before this patch, when the logd daemon was forced to withdraw, it
      would try to request its journal be recovered by another cluster node.
      However, in single-user cases with lock_nolock, there are no other
      nodes to recover the journal. Function signal_our_withdraw() was
      recognizing the lock_nolock situation, but not until after it had
      evicted its journal inode. Since the journal descriptor that points
      to the inode was never removed from the master list, when the unmount
      occurred, it did another iput on the evicted inode, which resulted in
      a BUG_ON(inode->i_state & I_CLEAR).
      
      This patch moves the check for this situation earlier in function
      signal_our_withdraw(), which avoids the extra iput, so the unmount
      may happen normally.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Fix error exit in do_xmote · a8b7528b
      Bob Peterson authored
      Before this patch, if an error was detected from glock function go_sync
      by function do_xmote, it would return.  But the function had temporarily
      unlocked the gl_lockref spin_lock, and it never re-locked it.  When the
      caller of do_xmote tried to unlock it again, it was already unlocked,
      which resulted in a corrupted spin_lock value.
      
      This patch makes sure the gl_lockref spin_lock is re-locked after it is
      unlocked.
      
      Thanks to Wu Bo <wubo40@huawei.com> for reporting this problem.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
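The locking contract behind the do_xmote fix can be modeled with a toy lock that asserts on misuse. This is a sketch under stated assumptions: `toy_lock`, `toy_xmote`, and `toy_caller` are hypothetical, and the toy lock is a plain flag rather than a kernel spinlock. It shows a function that is entered with the lock held, drops it temporarily, and re-takes it on every exit path, including the error path.

```c
#include <assert.h>

/* Toy lock: a flag plus assertions, standing in for a real spinlock. */
struct toy_lock {
    int held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held); /* releasing an unlocked lock is the bug being modeled */
    l->held = 0;
}

/* Contract: called with the lock held, returns with the lock held. */
static int toy_xmote(struct toy_lock *l, int sync_error)
{
    int ret = 0;

    toy_lock_release(l); /* drop the lock around the slow step */

    if (sync_error) {
        ret = sync_error;
        goto out;        /* error path must still re-take the lock */
    }
    /* ... the remaining work would happen here ... */
out:
    toy_lock_acquire(l);
    return ret;
}

static int toy_caller(struct toy_lock *l, int sync_error)
{
    int ret;

    toy_lock_acquire(l);
    ret = toy_xmote(l, sync_error);
    toy_lock_release(l); /* would trip the assert if toy_xmote returned unlocked */
    return ret;
}
```

If the error branch returned directly instead of jumping to `out`, the release in `toy_caller` would hit the assertion, which is the toy equivalent of the corrupted spin_lock value described above.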
  2. 06 May, 2020 1 commit
    • gfs2: fix withdraw sequence deadlock · ac915584
      Bob Peterson authored
      After a gfs2 file system withdraw, any attempt to read metadata is
      automatically rejected by function gfs2_meta_read() except for reads
      of the journal inode. This turns out to be a problem because function
      signal_our_withdraw() repeatedly calls check_journal_clean() which reads
      the metadata (both its dinode and indirect blocks) to see if the entire
      journal is mapped. The dinode read works, but reading the indirect blocks
      returns -EIO which gets sent back up and causes a consistency error.
      This results in withdraw-from-withdraw, which becomes a deadlock.
      
      This patch changes the test in gfs2_meta_read() to allow all metadata
      reads for the journal. Instead of checking the journal block, it now
      checks for the journal inode glock which is the same for all blocks in
      the journal. This allows check_journal_clean() to properly check the
      journal without trying to withdraw recursively.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
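The changed test can be pictured as a predicate keyed on the lock object rather than on a single block. The sketch below is a deliberately reduced model, not the gfs2_meta_read code: `toy_fs`, `toy_glock`, and `toy_meta_read` are hypothetical, and the only behavior modeled is "after a withdraw, reject every metadata read except those made under the journal inode's glock".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real gfs2 structures carry far more state. */
struct toy_glock {
    int id;
};

struct toy_fs {
    bool withdrawn;                      /* file system has withdrawn */
    const struct toy_glock *journal_gl;  /* glock of the journal inode */
};

/*
 * Returns 0 if the read may proceed, -EIO if it is rejected.
 * Every block of the journal (dinode and indirect blocks alike) is read
 * under the same glock, so comparing glocks covers all of them.
 */
int toy_meta_read(const struct toy_fs *fs, const struct toy_glock *gl)
{
    if (fs->withdrawn && gl != fs->journal_gl)
        return -EIO;
    return 0;
}
```

Keying the exception on the glock means a journal's indirect blocks pass the same check as its dinode, which is the property the commit message relies on.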
  3. 12 Apr, 2020 10 commits
    • Linux 5.7-rc1 · 8f3d9f35
      Linus Torvalds authored
    • MAINTAINERS: sort field names for all entries · 3b50142d
      Linus Torvalds authored
      This sorts the actual field names too, potentially causing even more
      chaos and confusion at merge time if you have edited the MAINTAINERS
      file.  But the end result is a more consistent layout, and hopefully
      it's a one-time pain minimized by doing this just before the -rc1
      release.
      
      This was entirely scripted:
      
        ./scripts/parse-maintainers.pl --input=MAINTAINERS --output=MAINTAINERS --order
      Requested-by: Joe Perches <joe@perches.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MAINTAINERS: sort entries by entry name · 4400b7d6
      Linus Torvalds authored
      They are all supposed to be sorted, but people who add new entries don't
      always know the alphabet.  Plus sometimes the entry names get edited,
      and people don't then re-order the entry.
      
      Let's see how painful this will be for merging purposes (the MAINTAINERS
      file is often edited in various different trees), but Joe claims there's
      relatively few patches in -next that touch this, and doing it just
      before -rc1 is likely the best time.  Fingers crossed.
      
      This was scripted with
      
        ./scripts/parse-maintainers.pl --input=MAINTAINERS --output=MAINTAINERS
      
      but then I also ended up manually upper-casing a few entry names that
      stood out when looking at the end result.
      Requested-by: Joe Perches <joe@perches.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'x86-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4f8a3cc1
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A set of three patches to fix the fallout of the newly added split
        lock detection feature.
      
        It addressed the case where a KVM guest triggers a split lock #AC and
        KVM reinjects it into the guest which is not prepared to handle it.
      
        Add proper sanity checks which prevent the unconditional injection
        into the guest and handles the #AC on the host side in the same way as
        user space detections are handled. Depending on the detection mode it
        either warns and disables detection for the task or kills the task if
        the mode is set to fatal"
      
      * tag 'x86-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest
        KVM: x86: Emulate split-lock access as a write in emulator
        x86/split_lock: Provide handle_guest_split_lock()
    • Merge tag 'timers-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0785249f
      Linus Torvalds authored
      Pull time(keeping) updates from Thomas Gleixner:
      
       - Fix the time_for_children symlink in /proc/$PID/ so it properly
         reflects that it is part of the 'time' namespace
      
       - Add the missing userns limit for the allowed number of time
         namespaces, which was half defined but the actual array member was
         not added. This went unnoticed as the array has an excessive empty
         member at the end but introduced a user visible regression as the
         output was corrupted.
      
       - Prevent further silent ucount corruption by adding a BUILD_BUG_ON()
         to catch half updated data.
      
      * tag 'timers-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        ucount: Make sure ucounts in /proc/sys/user don't regress again
        time/namespace: Add max_time_namespaces ucount
        time/namespace: Fix time_for_children symlink
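The "half updated data" problem described in the pull request above (an enum gains a member but the matching table does not) is the kind of mistake a compile-time assertion catches. The sketch below uses standard C11 `_Static_assert` in place of the kernel's BUILD_BUG_ON() macro, and every `toy_` and `TOY_` identifier is hypothetical. It does not reproduce the kernel's actual table, whose layout differs.

```c
#include <stddef.h>

/* Hypothetical counter IDs; the last member is the count, not a real ID. */
enum toy_ucount {
    TOY_UCOUNT_USER_NS,
    TOY_UCOUNT_PID_NS,
    TOY_UCOUNT_TIME_NS,
    TOY_UCOUNT_COUNTS
};

/* One table entry per counter ID, in the same order as the enum. */
static const char *const toy_ucount_names[] = {
    "max_user_namespaces",
    "max_pid_namespaces",
    "max_time_namespaces",
};

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Fails the build if the enum and the table ever get out of step. */
_Static_assert(TOY_ARRAY_SIZE(toy_ucount_names) == TOY_UCOUNT_COUNTS,
               "toy_ucount_names must have one entry per enum toy_ucount member");

/* Look up a counter name; returns NULL for an out-of-range ID. */
const char *toy_ucount_name(enum toy_ucount c)
{
    return c < TOY_UCOUNT_COUNTS ? toy_ucount_names[c] : NULL;
}
```

Adding a member to `enum toy_ucount` without extending `toy_ucount_names` now stops the compile instead of silently shifting or truncating the output.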
    • Merge tag 'sched-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 590680d1
      Linus Torvalds authored
      Pull scheduler fixes/updates from Thomas Gleixner:
      
       - Deduplicate the average computations in the scheduler core and the
         fair class code.
      
       - Fix a race between runtime distribution and assignment which can
         cause exceeding the quota by up to 70%.
      
       - Prevent negative results in the imbalance calculation
      
       - Remove a stale warning in the workqueue code which can be triggered
         since the call site was moved out of preempt disabled code. It's a
         false positive.
      
       - Deduplicate the print macros for procfs
      
       - Add the uclamp values to the SCHED_DEBUG procfs output for completeness
      
      * tag 'sched-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/debug: Add task uclamp values to SCHED_DEBUG procfs
        sched/debug: Factor out printing formats into common macros
        sched/debug: Remove redundant macro define
        sched/core: Remove unused rq::last_load_update_tick
        workqueue: Remove the warning in wq_worker_sleeping()
        sched/fair: Fix negative imbalance in imbalance calculation
        sched/fair: Fix race between runtime distribution and assignment
        sched/fair: Align rq->avg_idle and rq->avg_scan_cost
    • Merge tag 'perf-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 20e2aa81
      Linus Torvalds authored
      Pull perf fixes from Thomas Gleixner:
       "Three fixes/updates for perf:
      
         - Fix the perf event cgroup tracking which tries to track the cgroup
           even for disabled events.
      
         - Add Ice Lake server support for uncore events
      
         - Disable pagefaults when retrieving the physical address in the
           sampling code"
      
      * tag 'perf-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/core: Disable page faults when getting phys address
        perf/x86/intel/uncore: Add Ice Lake server uncore support
        perf/cgroup: Correct indirection in perf_less_group_idx()
        perf/core: Fix event cgroup tracking
    • Merge tag 'locking-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 652fa53c
      Linus Torvalds authored
      Pull locking fixes from Thomas Gleixner:
       "Three small fixes/updates for the locking core code:
      
         - Plug a task struct reference leak in the percpu rwsem
           implementation.
      
         - Document the refcount interaction with PID_MAX_LIMIT
      
         - Improve the 'invalid wait context' data dump in lockdep so it
           contains all information which is required to decode the problem"
      
      * tag 'locking-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/lockdep: Improve 'invalid wait context' splat
        locking/refcount: Document interaction with PID_MAX_LIMIT
        locking/percpu-rwsem: Fix a task_struct refcount
    • Merge tag '5.7-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · 4119bf9f
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Ten cifs/smb fixes:
      
         - five RDMA (smbdirect) related fixes
      
         - add experimental support for swap over SMB3 mounts
      
         - also a fix which improves performance of signed connections"
      
      * tag '5.7-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        smb3: enable swap on SMB3 mounts
        smb3: change noisy error message to FYI
        smb3: smbdirect support can be configured by default
        cifs: smbd: Do not schedule work to send immediate packet on every receive
        cifs: smbd: Properly process errors on ib_post_send
        cifs: Allocate crypto structures on the fly for calculating signatures of incoming packets
        cifs: smbd: Update receive credits before sending and deal with credits roll back on failure before sending
        cifs: smbd: Check send queue size before posting a send
        cifs: smbd: Merge code to track pending packets
        cifs: ignore cached share root handle closing errors
    • Merge tag 'nfs-for-5.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 50bda5fa
      Linus Torvalds authored
      Pull NFS client bugfix from Trond Myklebust:
       "Fix an RCU read lock leakage in pnfs_alloc_ds_commits_list()"
      
      * tag 'nfs-for-5.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        pNFS: Fix RCU lock leakage
  4. 11 Apr, 2020 14 commits
  5. 10 Apr, 2020 8 commits