1. 15 Aug, 2020 2 commits
    • io_uring: short circuit -EAGAIN for blocking read attempt · f91daf56
      Jens Axboe authored
      One case was missed in the short IO retry handling, and that's hitting
      -EAGAIN on a blocking attempt read (eg from io-wq context). This is a
      problem on sockets that are marked as non-blocking when created, they
      don't carry any REQ_F_NOWAIT information to help us terminate them
      instead of perpetually retrying.
      
      Fixes: 227c0c96 ("io_uring: internally retry short reads")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: sanitize double poll handling · d4e7cd36
      Jens Axboe authored
      There's a bit of confusion on the matching pairs of poll vs double poll,
      depending on if the request is a pure poll (IORING_OP_POLL_ADD) or
      poll driven retry.
      
      Add io_poll_get_double() that returns the double poll waitqueue, if any,
      and io_poll_get_single() that returns the original poll waitqueue. With
      that, remove the argument to io_poll_remove_double().
      
      Finally ensure that wait->private is cleared once the double poll handler
      has run, so that remove knows it's already been seen.
      
      Cc: stable@vger.kernel.org # v5.8
      Reported-by: syzbot+7f617d4a9369028b8a2c@syzkaller.appspotmail.com
      Fixes: 18bceab1 ("io_uring: allow POLL_ADD with double poll_wait() users")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 13 Aug, 2020 3 commits
    • io_uring: internally retry short reads · 227c0c96
      Jens Axboe authored
      We've had a few application cases of not handling short reads properly,
      and it is understandable as short reads aren't really expected if the
      application isn't doing non-blocking IO.
      
      Now that we retain the iov_iter over retries, we can implement internal
      retry pretty trivially. This ensures that we don't return a short read,
      even for buffered reads on page cache conflicts.
      
      Clean up the deep nesting and hard-to-read nature of io_read() as well;
      it's much more straightforward now to read and understand. Added a
      few comments explaining the logic as well.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
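For context, the commit above mentions applications that do not handle short reads properly. A minimal userspace sketch of such handling for plain read(2) follows. The helper name read_full() is hypothetical and not part of any library, and the sketch says nothing about how io_read() itself is implemented.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * read_full() is a hypothetical helper: it keeps calling read(2) until
 * `len` bytes have arrived, end-of-file is reached, or an error occurs.
 * Returns the number of bytes read (less than `len` only at EOF),
 * or -1 with errno set by read().
 */
static ssize_t read_full(int fd, void *buf, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t ret = read(fd, (char *)buf + done, len - done);

		if (ret < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: retry the call */
			return -1;		/* real error */
		}
		if (ret == 0)
			break;			/* EOF: report what we have */
		done += (size_t)ret;		/* short read: advance and retry */
	}
	return (ssize_t)done;
}
```

A caller would treat a return value smaller than the requested length as end-of-file rather than as an error.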
    • io_uring: retain iov_iter state over io_read/io_write calls · ff6165b2
      Jens Axboe authored
      Instead of maintaining (and setting/remembering) iov_iter size and
      segment counts, just put the iov_iter in the async part of the IO
      structure.
      
      This is mostly a preparation patch for doing appropriate internal retries
      for short reads, but it also cleans up the state handling nicely and
      simplifies it quite a bit.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • task_work: only grab task signal lock when needed · ebf0d100
      Jens Axboe authored
      If JOBCTL_TASK_WORK is already set on the targeted task, then we need
      not go through {lock,unlock}_task_sighand() to set it again and queue
      a signal wakeup. This is safe as we're checking it _after_ adding the
      new task_work with cmpxchg().
      
      The ordering is as follows:
      
      task_work_add()				get_signal()
      --------------------------------------------------------------
      STORE(task->task_works, new_work);	STORE(task->jobctl);
      mb();					mb();
      LOAD(task->jobctl);			LOAD(task->task_works);
      
      This speeds up TWA_SIGNAL handling quite a bit, which is important now
      that io_uring is relying on it for all task_work deliveries.
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jann Horn <jannh@google.com>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
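The ordering table in the commit above is the classic store-buffering pattern: each side stores, issues a full barrier, then loads what the other side stored. A rough userspace sketch with C11 atomics follows. The variable and function names are hypothetical stand-ins for task->task_works and the JOBCTL_TASK_WORK bit, and it assumes the right-hand STORE is the consumer clearing that bit. This is not the kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for task->task_works and JOBCTL_TASK_WORK. */
static atomic_int work_queued;	/* 1 once new work has been published */
static atomic_int notify_flag;	/* 1 while a wakeup is already pending */

/*
 * Left column: publish the work, full barrier, then look at the flag.
 * Returns true when the flag is already set, so the slow path
 * (take the lock, set the flag, signal a wakeup) can be skipped.
 */
static bool producer_can_skip_wakeup(void)
{
	atomic_store_explicit(&work_queued, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* mb() */
	return atomic_load_explicit(&notify_flag, memory_order_relaxed) != 0;
}

/*
 * Right column: clear the flag, full barrier, then look for work.
 * Returns true when queued work is visible to the consumer.
 */
static bool consumer_sees_work(void)
{
	atomic_store_explicit(&notify_flag, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* mb() */
	return atomic_load_explicit(&work_queued, memory_order_relaxed) != 0;
}
```

With both fences in place, the outcome "producer skipped the wakeup and consumer saw no work" should not be possible: at least one side observes the other's store.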
  3. 12 Aug, 2020 2 commits
    • io_uring: enable lookup of links holding inflight files · f254ac04
      Jens Axboe authored
      When a process exits, we cancel whatever requests it has pending that
      are referencing the file table. However, if a link is holding a
      reference, then we cannot find it by simply looking at the inflight
      list.
      
      Enable checking of the poll and timeout list to find the link, and
      cancel it appropriately.
      
      Cc: stable@vger.kernel.org
      Reported-by: Josef <josef.grieb@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fail poll arm on queue proc failure · a36da65c
      Jens Axboe authored
      Check the ipt.error value; it must have been either cleared to zero or
      set to an error other than the default -EINVAL if we don't go through the
      waitqueue proc addition. Just give up on poll at that point and return
      failure; this will fall back to async work.
      
      io_poll_add() doesn't suffer from this failure case, as it returns the
      error value directly.
      
      Cc: stable@vger.kernel.org # v5.7+
      Reported-by: syzbot+a730016dc0bdce4f6ff5@syzkaller.appspotmail.com
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 11 Aug, 2020 2 commits
  5. 10 Aug, 2020 4 commits
    • io_uring: defer file table grabbing request cleanup for locked requests · 51a4cc11
      Jens Axboe authored
      If we're in the error path failing links and we have a link that has
      grabbed a reference to the fs_struct, then we cannot safely drop our
      reference to the table if we already hold the completion lock. This
      adds a hardirq dependency to the fs_struct->lock, which it currently
      doesn't have.
      
      Defer the final cleanup and free of such requests to avoid adding this
      dependency.
      
      Reported-by: syzbot+ef4b654b49ed7ff049bf@syzkaller.appspotmail.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add missing REQ_F_COMP_LOCKED for nested requests · 9b7adba9
      Jens Axboe authored
      When we traverse into failing links or timeouts, we need to ensure we
      propagate the REQ_F_COMP_LOCKED flag to ensure that we correctly signal
      to the completion side that we already hold the completion lock.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix recursive completion locking on overflow flush · 7271ef3a
      Jens Axboe authored
      syzbot reports a scenario where we recurse on the completion lock
      when flushing an overflow:
      
      1 lock held by syz-executor287/6816:
       #0: ffff888093cdb4d8 (&ctx->completion_lock){....}-{2:2}, at: io_cqring_overflow_flush+0xc6/0xab0 fs/io_uring.c:1333
      
      stack backtrace:
      CPU: 1 PID: 6816 Comm: syz-executor287 Not tainted 5.8.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1f0/0x31e lib/dump_stack.c:118
       print_deadlock_bug kernel/locking/lockdep.c:2391 [inline]
       check_deadlock kernel/locking/lockdep.c:2432 [inline]
       validate_chain+0x69a4/0x88a0 kernel/locking/lockdep.c:3202
       __lock_acquire+0x1161/0x2ab0 kernel/locking/lockdep.c:4426
       lock_acquire+0x160/0x730 kernel/locking/lockdep.c:5005
       __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
       _raw_spin_lock_irq+0x67/0x80 kernel/locking/spinlock.c:167
       spin_lock_irq include/linux/spinlock.h:379 [inline]
       io_queue_linked_timeout fs/io_uring.c:5928 [inline]
       __io_queue_async_work fs/io_uring.c:1192 [inline]
       __io_queue_deferred+0x36a/0x790 fs/io_uring.c:1237
       io_cqring_overflow_flush+0x774/0xab0 fs/io_uring.c:1359
       io_ring_ctx_wait_and_kill+0x2a1/0x570 fs/io_uring.c:7808
       io_uring_release+0x59/0x70 fs/io_uring.c:7829
       __fput+0x34f/0x7b0 fs/file_table.c:281
       task_work_run+0x137/0x1c0 kernel/task_work.c:135
       exit_task_work include/linux/task_work.h:25 [inline]
       do_exit+0x5f3/0x1f20 kernel/exit.c:806
       do_group_exit+0x161/0x2d0 kernel/exit.c:903
       __do_sys_exit_group+0x13/0x20 kernel/exit.c:914
       __se_sys_exit_group+0x10/0x10 kernel/exit.c:912
       __x64_sys_exit_group+0x37/0x40 kernel/exit.c:912
       do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fix this by passing back the link from __io_queue_async_work(), and
      then let the caller handle the queueing of the link. Take care to also
      punt the submission reference put to the caller, as we're holding the
      completion lock for the __io_queue_deferred() case. Hence we need to mark
      the io_kiocb appropriately for that case.
      
      Reported-by: syzbot+996f91b6ec3812c48042@syzkaller.appspotmail.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: use TWA_SIGNAL for task_work unconditionally · 0ba9c9ed
      Jens Axboe authored
      An earlier commit:
      
      b7db41c9 ("io_uring: fix regression with always ignoring signals in io_cqring_wait()")
      
      ensured that we didn't get stuck waiting for eventfd reads when it's
      registered with the io_uring ring for event notification, but we still
      have cases where the task can be waiting on other events in the kernel and
      need a bigger nudge to make forward progress. Or the task could be in the
      kernel and running, but on its way to blocking.
      
      This means that TWA_RESUME cannot reliably be used to ensure we make
      progress. Use TWA_SIGNAL unconditionally.
      
      Cc: stable@vger.kernel.org # v5.7+
      Reported-by: Josef <josef.grieb@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  6. 06 Aug, 2020 2 commits
  7. 05 Aug, 2020 1 commit
    • io_uring: Fix NULL pointer dereference in loop_rw_iter() · 2dd2111d
      Guoyu Huang authored
      loop_rw_iter() does not check whether the file has a read or
      write function. This can lead to a NULL pointer dereference
      when the user passes in a file descriptor whose file lacks the
      corresponding read or write function.
      
      The crash log looks like this:
      
      [   99.834071] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [   99.835364] #PF: supervisor instruction fetch in kernel mode
      [   99.836522] #PF: error_code(0x0010) - not-present page
      [   99.837771] PGD 8000000079d62067 P4D 8000000079d62067 PUD 79d8c067 PMD 0
      [   99.839649] Oops: 0010 [#2] SMP PTI
      [   99.840591] CPU: 1 PID: 333 Comm: io_wqe_worker-0 Tainted: G      D           5.8.0 #2
      [   99.842622] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
      [   99.845140] RIP: 0010:0x0
      [   99.845840] Code: Bad RIP value.
      [   99.846672] RSP: 0018:ffffa1c7c01ebc08 EFLAGS: 00010202
      [   99.848018] RAX: 0000000000000000 RBX: ffff92363bd67300 RCX: ffff92363d461208
      [   99.849854] RDX: 0000000000000010 RSI: 00007ffdbf696bb0 RDI: ffff92363bd67300
      [   99.851743] RBP: ffffa1c7c01ebc40 R08: 0000000000000000 R09: 0000000000000000
      [   99.853394] R10: ffffffff9ec692a0 R11: 0000000000000000 R12: 0000000000000010
      [   99.855148] R13: 0000000000000000 R14: ffff92363d461208 R15: ffffa1c7c01ebc68
      [   99.856914] FS:  0000000000000000(0000) GS:ffff92363dd00000(0000) knlGS:0000000000000000
      [   99.858651] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   99.860032] CR2: ffffffffffffffd6 CR3: 000000007ac66000 CR4: 00000000000006e0
      [   99.861979] Call Trace:
      [   99.862617]  loop_rw_iter.part.0+0xad/0x110
      [   99.863838]  io_write+0x2ae/0x380
      [   99.864644]  ? kvm_sched_clock_read+0x11/0x20
      [   99.865595]  ? sched_clock+0x9/0x10
      [   99.866453]  ? sched_clock_cpu+0x11/0xb0
      [   99.867326]  ? newidle_balance+0x1d4/0x3c0
      [   99.868283]  io_issue_sqe+0xd8f/0x1340
      [   99.869216]  ? __switch_to+0x7f/0x450
      [   99.870280]  ? __switch_to_asm+0x42/0x70
      [   99.871254]  ? __switch_to_asm+0x36/0x70
      [   99.872133]  ? lock_timer_base+0x72/0xa0
      [   99.873155]  ? switch_mm_irqs_off+0x1bf/0x420
      [   99.874152]  io_wq_submit_work+0x64/0x180
      [   99.875192]  ? kthread_use_mm+0x71/0x100
      [   99.876132]  io_worker_handle_work+0x267/0x440
      [   99.877233]  io_wqe_worker+0x297/0x350
      [   99.878145]  kthread+0x112/0x150
      [   99.878849]  ? __io_worker_unuse+0x100/0x100
      [   99.879935]  ? kthread_park+0x90/0x90
      [   99.880874]  ret_from_fork+0x22/0x30
      [   99.881679] Modules linked in:
      [   99.882493] CR2: 0000000000000000
      [   99.883324] ---[ end trace 4453745f4673190b ]---
      [   99.884289] RIP: 0010:0x0
      [   99.884837] Code: Bad RIP value.
      [   99.885492] RSP: 0018:ffffa1c7c01ebc08 EFLAGS: 00010202
      [   99.886851] RAX: 0000000000000000 RBX: ffff92363acd7f00 RCX: ffff92363d461608
      [   99.888561] RDX: 0000000000000010 RSI: 00007ffe040d9e10 RDI: ffff92363acd7f00
      [   99.890203] RBP: ffffa1c7c01ebc40 R08: 0000000000000000 R09: 0000000000000000
      [   99.891907] R10: ffffffff9ec692a0 R11: 0000000000000000 R12: 0000000000000010
      [   99.894106] R13: 0000000000000000 R14: ffff92363d461608 R15: ffffa1c7c01ebc68
      [   99.896079] FS:  0000000000000000(0000) GS:ffff92363dd00000(0000) knlGS:0000000000000000
      [   99.898017] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   99.899197] CR2: ffffffffffffffd6 CR3: 000000007ac66000 CR4: 00000000000006e0
      
      Fixes: 32960613 ("io_uring: correctly handle non ->{read,write}_iter() file_operations")
      Cc: stable@vger.kernel.org
      Signed-off-by: Guoyu Huang <hgy5945@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
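The class of bug described in the commit above is calling through an operations table without first checking that the needed function pointer is populated (the "RIP: 0010:0x0" in the log is consistent with a jump to a NULL pointer). A simplified userspace sketch of the guard follows. The struct and function names are hypothetical, and the -EINVAL return value is an assumption made for illustration rather than a statement of what the kernel patch returns.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, much simplified stand-in for struct file_operations. */
struct demo_file_ops {
	ssize_t (*read)(void *ctx, char *buf, size_t len);
	ssize_t (*write)(void *ctx, const char *buf, size_t len);
};

/* Sample backend for the demo: fills the buffer with 'x'. */
static ssize_t demo_fill_read(void *ctx, char *buf, size_t len)
{
	(void)ctx;
	for (size_t i = 0; i < len; i++)
		buf[i] = 'x';
	return (ssize_t)len;
}

/* Refuse the request instead of jumping through a NULL pointer. */
static ssize_t demo_read(const struct demo_file_ops *ops, void *ctx,
			 char *buf, size_t len)
{
	if (!ops || !ops->read)
		return -EINVAL;	/* assumed error code, see lead-in */
	return ops->read(ctx, buf, len);
}
```

The same check would apply to the write pointer on the write path.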
  8. 03 Aug, 2020 24 commits
    • io_uring: add comments on how the async buffered read retry works · c1dd91d1
      Jens Axboe authored
      The retry based logic here isn't easy to follow unless you're already
      familiar with how io_uring does task_work based retries. Add some
      comments explaining the flow a little better.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: io_async_buf_func() need not test page bit · cbd287c0
      Jens Axboe authored
      Since we don't do exclusive waits or wakeups, we know that the bit is
      always going to be set. Kill the test. Also see commit:
      
      2a9127fc ("mm: rewrite wait_on_page_bit_common() logic")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Merge tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e4cbce4d
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
      
       - Improve uclamp performance by using a static key for the fast path
      
       - Add the "sched_util_clamp_min_rt_default" sysctl, to optimize for
         better power efficiency of RT tasks on battery powered devices.
         (The default is to maximize performance & reduce RT latencies.)
      
       - Improve utime and stime tracking accuracy, which had a fixed boundary
         of error, which created larger and larger relative errors as the
         values become larger. This is now replaced with more precise
         arithmetics, using the new mul_u64_u64_div_u64() helper in math64.h.
      
       - Improve the deadline scheduler, such as making it capacity aware
      
       - Improve frequency-invariant scheduling
      
       - Misc cleanups in energy/power aware scheduling
      
       - Add sched_update_nr_running tracepoint to track changes to nr_running
      
       - Documentation additions and updates
      
       - Misc cleanups and smaller fixes
      
      * tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
        sched/doc: Factorize bits between sched-energy.rst & sched-capacity.rst
        sched/doc: Document capacity aware scheduling
        sched: Document arch_scale_*_capacity()
        arm, arm64: Fix selection of CONFIG_SCHED_THERMAL_PRESSURE
        Documentation/sysctl: Document uclamp sysctl knobs
        sched/uclamp: Add a new sysctl to control RT default boost value
        sched/uclamp: Fix a deadlock when enabling uclamp static key
        sched: Remove duplicated tick_nohz_full_enabled() check
        sched: Fix a typo in a comment
        sched/uclamp: Remove unnecessary mutex_init()
        arm, arm64: Select CONFIG_SCHED_THERMAL_PRESSURE
        sched: Cleanup SCHED_THERMAL_PRESSURE kconfig entry
        arch_topology, sched/core: Cleanup thermal pressure definition
        trace/events/sched.h: fix duplicated word
        linux/sched/mm.h: drop duplicated words in comments
        smp: Fix a potential usage of stale nr_cpus
        sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal
        sched: nohz: stop passing around unused "ticks" parameter.
        sched: Better document ttwu()
        sched: Add a tracepoint to track rq->nr_running
        ...
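The scheduler pull above mentions more precise utime/stime arithmetic via the mul_u64_u64_div_u64() helper. The idea of computing a * b / c without losing the high bits of the product can be sketched in userspace as follows. This sketch relies on the unsigned __int128 extension of GCC and Clang on 64-bit targets, the function name mul_div_u64() is hypothetical, and it is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute a * b / c using a 128-bit intermediate
 * so the product cannot overflow. The result is truncated to 64 bits
 * if the true quotient does not fit.
 */
static uint64_t mul_div_u64(uint64_t a, uint64_t b, uint64_t c)
{
	assert(c != 0);	/* division by zero is a caller bug */

	unsigned __int128 product = (unsigned __int128)a * b;

	return (uint64_t)(product / c);
}
```

For example, with a = 2^63, b = 4 and c = 8 the 64-bit product a * b wraps to zero, while the 128-bit intermediate yields the expected 2^62.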
    • Merge tag 'perf-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b34133fe
      Linus Torvalds authored
      Pull perf event updates from Ingo Molnar:
       "HW support updates:
      
         - Add uncore support for Intel Comet Lake
      
         - Add RAPL support for Hygon Fam18h
      
         - Add Intel "IIO stack to PMON mapping" support on Skylake-SP CPUs,
           which enumerates per device performance counters via sysfs and
           enables the perf stat --iiostat functionality
      
         - Add support for Intel "Architectural LBRs", which generalized the
           model specific LBR hardware tracing feature into a
           model-independent, architected performance monitoring feature.
      
           Usage is mostly seamless to tooling, as the pre-existing LBR
           features are kept, but there's a couple of advantages under the
           hood, such as faster context-switching, faster LBR reads, cleaner
           exposure of LBR features to guest kernels, etc.
      
           ( Since architectural LBRs are supported via XSAVE, there's related
             changes to the x86 FPU code as well. )
      
        ftrace/perf updates:
      
         - Add support to add a text poke event to record changes to kernel
           text (i.e. self-modifying code) in order to support tracers like
           Intel PT decoding through jump labels, kprobes and ftrace
           trampolines.
      
        Misc cleanups, smaller fixes..."
      
      * tag 'perf-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
        perf/x86/rapl: Add Hygon Fam18h RAPL support
        kprobes: Remove unnecessary module_mutex locking from kprobe_optimizer()
        x86/perf: Fix a typo
        perf: <linux/perf_event.h>: drop a duplicated word
        perf/x86/intel/lbr: Support XSAVES for arch LBR read
        perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch
        x86/fpu/xstate: Add helpers for LBR dynamic supervisor feature
        x86/fpu/xstate: Support dynamic supervisor feature for LBR
        x86/fpu: Use proper mask to replace full instruction mask
        perf/x86: Remove task_ctx_size
        perf/x86/intel/lbr: Create kmem_cache for the LBR context data
        perf/core: Use kmem_cache to allocate the PMU specific data
        perf/core: Factor out functions to allocate/free the task_ctx_data
        perf/x86/intel/lbr: Support Architectural LBR
        perf/x86/intel/lbr: Factor out intel_pmu_store_lbr
        perf/x86/intel/lbr: Factor out rdlbr_all() and wrlbr_all()
        perf/x86/intel/lbr: Mark the {rd,wr}lbr_{to,from} wrappers __always_inline
        perf/x86/intel/lbr: Unify the stored format of LBR information
        perf/x86/intel/lbr: Support LBR_CTL
        perf/x86: Expose CPUID enumeration bits for arch LBR
        ...
    • Merge tag 'objtool-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9dee8689
      Linus Torvalds authored
      Pull objtool updates from Ingo Molnar:
      
       - Add support for non-rela relocations, in preparation to merge
         'recordmcount' functionality into objtool
      
       - Fix assumption that broke under --ffunction-sections (LTO) builds
      
       - Misc cleanups
      
      * tag 'objtool-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        objtool: Add support for relocations without addends
        objtool: Rename rela to reloc
        objtool: Use sh_info to find the base for .rela sections
        objtool: Do not assume order of parent/child functions
    • Merge tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9ba19ccd
      Linus Torvalds authored
      Pull locking updates from Ingo Molnar:
      
       - LKMM updates: mostly documentation changes, but also some new litmus
         tests for atomic ops.
      
       - KCSAN updates: the most important change is that GCC 11 now has all
         fixes in place to support KCSAN, so GCC support can be enabled again.
         Also more annotations.
      
       - futex updates: minor cleanups and simplifications
      
       - seqlock updates: merge preparatory changes/cleanups for the
         'associated locks' facilities.
      
       - lockdep updates:
          - simplify IRQ trace event handling
          - add various new debug checks
          - simplify header dependencies, split out <linux/lockdep_types.h>,
            decouple lockdep from other low level headers some more
          - fix NMI handling
      
       - misc cleanups and smaller fixes
      
      * tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
        kcsan: Improve IRQ state trace reporting
        lockdep: Refactor IRQ trace events fields into struct
        seqlock: lockdep assert non-preemptibility on seqcount_t write
        lockdep: Add preemption enabled/disabled assertion APIs
        seqlock: Implement raw_seqcount_begin() in terms of raw_read_seqcount()
        seqlock: Add kernel-doc for seqcount_t and seqlock_t APIs
        seqlock: Reorder seqcount_t and seqlock_t API definitions
        seqlock: seqcount_t latch: End read sections with read_seqcount_retry()
        seqlock: Properly format kernel-doc code samples
        Documentation: locking: Describe seqlock design and usage
        locking/qspinlock: Do not include atomic.h from qspinlock_types.h
        locking/atomic: Move ATOMIC_INIT into linux/types.h
        lockdep: Move list.h inclusion into lockdep.h
        locking/lockdep: Fix TRACE_IRQFLAGS vs. NMIs
        futex: Remove unused or redundant includes
        futex: Consistently use fshared as boolean
        futex: Remove needless goto's
        futex: Remove put_futex_key()
        rwsem: fix commas in initialisation
        docs: locking: Replace HTTP links with HTTPS ones
        ...
    • Merge tag 'core-rcu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8f0cb666
      Linus Torvalds authored
      Pull RCU updates from Ingo Molnar:
      
       - kfree_rcu updates
      
       - RCU tasks updates
      
       - Read-side scalability tests
      
       - SRCU updates
      
       - Torture-test updates
      
       - Documentation updates
      
       - Miscellaneous fixes
      
      * tag 'core-rcu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (109 commits)
        torture: Remove obsolete "cd $KVM"
        torture: Avoid duplicate specification of qemu command
        torture: Dump ftrace at shutdown only if requested
        torture: Add kvm-tranform.sh script for qemu-cmd files
        torture: Add more tracing crib notes to kvm.sh
        torture: Improve diagnostic for KCSAN-incapable compilers
        torture: Correctly summarize build-only runs
        torture: Pass --kmake-arg to all make invocations
        rcutorture: Check for unwatched readers
        torture: Abstract out console-log error detection
        torture: Add a stop-run capability
        torture: Create qemu-cmd in --buildonly runs
        rcu/rcutorture: Replace 0 with false
        torture: Add --allcpus argument to the kvm.sh script
        torture: Remove whitespace from identify_qemu_vcpus output
        rcutorture: NULL rcu_torture_current earlier in cleanup code
        rcutorture: Handle non-statistic bang-string error messages
        torture: Set configfile variable to current scenario
        rcutorture: Add races with task-exit processing
        locktorture: Use true and false to assign to bool variables
        ...
    • Merge tag 'core-headers-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5ece0817
      Linus Torvalds authored
      Pull header cleanup from Ingo Molnar:
       "Separate out the instrumentation_begin()/end() bits from compiler.h"
      
      * tag 'core-headers-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        compiler.h: Move instrumentation_begin()/end() to new <linux/instrumentation.h> header
    • Merge tag 'core-debugobjects-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c8e69391
      Linus Torvalds authored
      Pull debugobjects cleanup from Ingo Molnar:
       "A single commit which simplifies a debugfs attribute definition"
      
      * tag 'core-debugobjects-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        debugobjects: Convert to DEFINE_SHOW_ATTRIBUTE
    • Merge tag 'irq-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3b4b84b2
      Linus Torvalds authored
      Pull irq fixes from Ingo Molnar:
       "Fix a recent IRQ affinities regression, add in a missing debugfs
        printout that helps the debugging of IRQ affinity logic bugs, and fix
        a memory leak"
      
      * tag 'irq-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/debugfs: Add missing irqchip flags
        genirq/affinity: Make affinity setting if activated opt-in
        irqdomain/treewide: Free firmware node after domain removal
    • Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 145ff1ec
      Linus Torvalds authored
      Pull arm64 and cross-arch updates from Catalin Marinas:
       "Here's a slightly wider-spread set of updates for 5.9.
      
        Going outside the usual arch/arm64/ area is the removal of
        read_barrier_depends() series from Will and the MSI/IOMMU ID
        translation series from Lorenzo.
      
        The notable arm64 updates include ARMv8.4 TLBI range operations and
        translation level hint, time namespace support, and perf.
      
        Summary:
      
         - Removal of the tremendously unpopular read_barrier_depends()
           barrier, which is a NOP on all architectures apart from Alpha, in
           favour of allowing architectures to override READ_ONCE() and do
           whatever dance they need to do to ensure address dependencies
           provide LOAD -> LOAD/STORE ordering.
      
           This work also offers a potential solution if compilers are shown
           to convert LOAD -> LOAD address dependencies into control
           dependencies (e.g. under LTO), as weakly ordered architectures will
           effectively be able to upgrade READ_ONCE() to smp_load_acquire().
           The latter case is not used yet, but will be discussed further at
           LPC.
      
         - Make the MSI/IOMMU input/output ID translation PCI agnostic,
           augment the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID
           bus-specific parameter and apply the resulting changes to the
           device ID space provided by the Freescale FSL bus.
      
         - arm64 support for TLBI range operations and translation table level
           hints (part of the ARMv8.4 architecture version).
      
         - Time namespace support for arm64.
      
         - Export the virtual and physical address sizes in vmcoreinfo for
           makedumpfile and crash utilities.
      
         - CPU feature handling cleanups and checks for programmer errors
           (overlapping bit-fields).
      
         - ACPI updates for arm64: disallow AML accesses to EFI code regions
           and kernel memory.
      
         - perf updates for arm64.
      
         - Miscellaneous fixes and cleanups, most notably PLT counting
           optimisation for module loading, recordmcount fix to ignore
           relocations other than R_AARCH64_CALL26, CMA areas reserved for
           gigantic pages on 16K and 64K configurations.
      
         - Trivial typos, duplicate words"
      
      Link: http://lkml.kernel.org/r/20200710165203.31284-1-will@kernel.org
      Link: http://lkml.kernel.org/r/20200619082013.13661-1-lorenzo.pieralisi@arm.com
      
      * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (82 commits)
        arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack
        arm64/mm: save memory access in check_and_switch_context() fast switch path
        arm64: sigcontext.h: delete duplicated word
        arm64: ptrace.h: delete duplicated word
        arm64: pgtable-hwdef.h: delete duplicated words
        bus: fsl-mc: Add ACPI support for fsl-mc
        bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver
        of/irq: Make of_msi_map_rid() PCI bus agnostic
        of/irq: make of_msi_map_get_device_domain() bus agnostic
        dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus
        of/device: Add input id to of_dma_configure()
        of/iommu: Make of_map_rid() PCI agnostic
        ACPI/IORT: Add an input ID to acpi_dma_configure()
        ACPI/IORT: Remove useless PCI bus walk
        ACPI/IORT: Make iort_msi_map_rid() PCI agnostic
        ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic
        ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC
        arm64: enable time namespace support
        arm64/vdso: Restrict splitting VVAR VMA
        arm64/vdso: Handle faults on timens page
        ...
    • Merge tag 'm68k-for-v5.9-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · 8c4e1c02
      Linus Torvalds authored
      Pull m68k updates from Geert Uytterhoeven:
      
       - several Kbuild improvements
      
       - several Mac fixes
      
       - minor cleanups and fixes
      
       - defconfig updates
      
      * tag 'm68k-for-v5.9-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k: defconfig: Update defconfigs for v5.8-rc3
        m68k: Use CLEAN_FILES to clean up files
        m68k: mac: Improve IOP debug messages
        m68k: mac: Don't send uninitialized data in IOP message reply
        m68k: mac: Fix IOP status/control register writes
        m68k: mac: Don't send IOP message until channel is idle
        m68k: atari: Annotate dummy read in ROM port IO code as __maybe_unused
        m68k: Use sizeof_field() helper
        m68k: Pass -D options to KBUILD_CPPFLAGS instead of KBUILD_{A,C}FLAGS
        m68k: Optimize cc-option calls for cpuflags-y
        m68k: sun3: Descend to prom from arch/m68k/sun3
        m68k: Add arch/m68k/Kbuild
      8c4e1c02
    • Linus Torvalds's avatar
      Merge tag 'rm-unicore32' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux · 05119217
      Linus Torvalds authored
      Pull unicore32 removal from Mike Rapoport:
       "Remove unicore32 support.
      
        The unicore32 port does not seem to have been maintained for a long
        time now: there is no upstream toolchain that can create unicore32
        binaries, and all the links to prebuilt toolchains for unicore32 are
        dead. Even the compilers that were available are no longer supported
        by the kernel.
      
        Guenter Roeck says:
          "I have stopped building unicore32 images since v4.19 since there is
           no available compiler that is still supported by the kernel. I am
           surprised that support for it has not been removed from the kernel"
      
        However, it's worth pointing out two things:
      
         - Guan Xuetao is still listed as maintainer and asked for the port to
           be kept around the last time Arnd suggested removing it two years
           ago. He promised that there would be compiler sources (presumably
           llvm), but has not made those available since.
      
         - https://github.com/gxt has patches to linux-4.9 and qemu-2.7, both
           released in 2016, with patches dated early 2019. These patches
           mainly restore a syscall ABI that was never part of mainline Linux
           but apparently used in production. qemu-2.8 removed support for
           that ABI and newer kernels (4.19+) can no longer be built with the
           old toolchain, so apparently there will not be any future updates
           to that git tree"
      
      * tag 'rm-unicore32' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux:
        MAINTAINERS: remove "PKUNITY SOC DRIVERS" entry
        rtc: remove fb-puv3  driver
        video: fbdev: remove fb-puv3  driver
        pwm: remove pwm-puv3  driver
        input: i8042: remove support for 8042-unicore32io
        i2c/buses: remove i2c-puv3  driver
        cpufreq: remove unicore32 driver
        arch: remove unicore32 port
      05119217
    • Linus Torvalds's avatar
      Merge tag 's390-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 45365a06
      Linus Torvalds authored
      Pull s390 updates from Heiko Carstens:
      
       - Add support for function error injection.
      
       - Add support for custom exception handlers, as required by
         BPF_PROBE_MEM.
      
       - Add support for BPF_PROBE_MEM.
      
       - Add trace events for idle enter / exit for the s390 specific idle
         implementation.
      
       - Remove unused zcore memmap device.
      
       - Remove unused "raw view" from s390 debug feature.
      
       - AP bus + zcrypt device driver code refactoring.
      
       - Provide cex4 cca sysfs attributes for cex3 for zcrypt device driver.
      
       - Expose only minimal interface to walk physmem for mm/memblock. This
         is a common code change and it has been agreed on with Mike Rapoport
         and Andrew Morton that this can go upstream via the s390 tree.
      
       - Rework of the s390 vmem/vmemmap code to allow for future memory hot
         remove.
      
       - Get rid of FORCE_MAX_ZONEORDER to finally allow for order-10
         allocations again, instead of only order-8 allocations.
      
       - Various small improvements and fixes.
      
      * tag 's390-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits)
        s390/vmemmap: coding style updates
        s390/vmemmap: avoid memset(PAGE_UNUSED) when adding consecutive sections
        s390/vmemmap: remember unused sub-pmd ranges
        s390/vmemmap: fallback to PTEs if mapping large PMD fails
        s390/vmem: cleanup empty page tables
        s390/vmemmap: take the vmem_mutex when populating/freeing
        s390/vmemmap: cleanup when vmemmap_populate() fails
        s390/vmemmap: extend modify_pagetable() to handle vmemmap
        s390/vmem: consolidate vmem_add_range() and vmem_remove_range()
        s390/vmem: rename vmem_add_mem() to vmem_add_range()
        s390: enable HAVE_FUNCTION_ERROR_INJECTION
        s390/pci: clarify comment in s390_mmio_read/write
        s390/time: improve comparison for tod steering
        s390/time: select CLOCKSOURCE_VALIDATE_LAST_CYCLE
        s390/time: use CLOCKSOURCE_MASK
        s390/bpf: implement BPF_PROBE_MEM
        s390/kernel: expand exception table logic to allow new handling options
        s390/kernel: unify EX_TABLE* implementations
        s390/mm: allow order 10 allocations
        s390/mm: avoid trimming to MAX_ORDER
        ...
      45365a06
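
The FORCE_MAX_ZONEORDER item in the s390 pull above is easier to judge with numbers. In the buddy allocator an order-n allocation covers 2^n contiguous pages, so moving the ceiling from order 8 to order 10 quadruples the largest contiguous allocation. The sketch below only does that arithmetic; the 4 KiB base page size and the names `SKETCH_PAGE_SIZE` and `order_to_bytes` are assumptions made for illustration and are not taken from the kernel sources.

```c
#include <stddef.h>

/* Assumption for illustration: 4 KiB base pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Bytes covered by one order-n buddy allocation: 2^n contiguous pages. */
unsigned long order_to_bytes(unsigned int order)
{
    return SKETCH_PAGE_SIZE << order;
}
```

Under that assumption, `order_to_bytes(8)` is 1 MiB and `order_to_bytes(10)` is 4 MiB.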
    • Linus Torvalds's avatar
      Merge tag 'for-5.9/io_uring-20200802' of git://git.kernel.dk/linux-block · cdc8fcb4
      Linus Torvalds authored
      Pull io_uring updates from Jens Axboe:
       "Lots of cleanups in here, hardening the code and/or making it easier
        to read and fixing bugs, but a core feature/change too adding support
        for real async buffered reads. With the latter in place, we just need
        buffered write async support and we're done relying on kthreads for
        the fast path. In detail:
      
         - Cleanup how memory accounting is done on ring setup/free (Bijan)
      
         - sq array offset calculation fixup (Dmitry)
      
         - Consistently handle blocking off O_DIRECT submission path (me)
      
         - Support proper async buffered reads, instead of relying on kthread
           offload for that. This uses the page waitqueue to drive retries
           from task_work, like we handle poll based retry. (me)
      
         - IO completion optimizations (me)
      
         - Fix race with accounting and ring fd install (me)
      
         - Support EPOLLEXCLUSIVE (Jiufei)
      
         - Get rid of the io_kiocb unionizing, made possible by shrinking
           other bits (Pavel)
      
         - Completion side cleanups (Pavel)
      
         - Cleanup REQ_F_ flags handling, and kill off many of them (Pavel)
      
         - Request environment grabbing cleanups (Pavel)
      
         - File and socket read/write cleanups (Pavel)
      
         - Improve kiocb_set_rw_flags() (Pavel)
      
         - Tons of fixes and cleanups (Pavel)
      
         - IORING_SQ_NEED_WAKEUP clear fix (Xiaoguang)"
      
      * tag 'for-5.9/io_uring-20200802' of git://git.kernel.dk/linux-block: (127 commits)
        io_uring: flip if handling after io_setup_async_rw
        fs: optimise kiocb_set_rw_flags()
        io_uring: don't touch 'ctx' after installing file descriptor
        io_uring: get rid of atomic FAA for cq_timeouts
        io_uring: consolidate *_check_overflow accounting
        io_uring: fix stalled deferred requests
        io_uring: fix racy overflow count reporting
        io_uring: deduplicate __io_complete_rw()
        io_uring: de-unionise io_kiocb
        io-wq: update hash bits
        io_uring: fix missing io_queue_linked_timeout()
        io_uring: mark ->work uninitialised after cleanup
        io_uring: deduplicate io_grab_files() calls
        io_uring: don't do opcode prep twice
        io_uring: clear IORING_SQ_NEED_WAKEUP after executing task works
        io_uring: batch put_task_struct()
        tasks: add put_task_struct_many()
        io_uring: return locked and pinned page accounting
        io_uring: don't miscount pinned memory
        io_uring: don't open-code recv kbuf managment
        ...
      cdc8fcb4
    • Linus Torvalds's avatar
      Merge tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block · 382625d0
      Linus Torvalds authored
      Pull core block updates from Jens Axboe:
       "Good amount of cleanups and tech debt removals in here, and as a
        result, the diffstat shows a nice net reduction in code.
      
         - Softirq completion cleanups (Christoph)
      
         - Stop using ->queuedata (Christoph)
      
         - Cleanup bd claiming (Christoph)
      
         - Use check_events, moving away from the legacy media change
           (Christoph)
      
         - Use inode i_blkbits consistently (Christoph)
      
         - Remove old unused writeback congestion bits (Christoph)
      
         - Cleanup/unify submission path (Christoph)
      
         - Use bio_uninit consistently, instead of bio_disassociate_blkg
           (Christoph)
      
         - sbitmap cleared bits handling (John)
      
         - Request merging blktrace event addition (Jan)
      
         - sysfs add/remove race fixes (Luis)
      
         - blk-mq tag fixes/optimizations (Ming)
      
         - Duplicate words in comments (Randy)
      
         - Flush deferral cleanup (Yufen)
      
         - IO context locking/retry fixes (John)
      
         - struct_size() usage (Gustavo)
      
         - blk-iocost fixes (Chengming)
      
         - blk-cgroup IO stats fixes (Boris)
      
         - Various little fixes"
      
      * tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block: (135 commits)
        block: blk-timeout: delete duplicated word
        block: blk-mq-sched: delete duplicated word
        block: blk-mq: delete duplicated word
        block: genhd: delete duplicated words
        block: elevator: delete duplicated word and fix typos
        block: bio: delete duplicated words
        block: bfq-iosched: fix duplicated word
        iocost_monitor: start from the oldest usage index
        iocost: Fix check condition of iocg abs_vdebt
        block: Remove callback typedefs for blk_mq_ops
        block: Use non _rcu version of list functions for tag_set_list
        blk-cgroup: show global disk stats in root cgroup io.stat
        blk-cgroup: make iostat functions visible to stat printing
        block: improve discard bio alignment in __blkdev_issue_discard()
        block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers
        block: defer flush request no matter whether we have elevator
        block: make blk_timeout_init() static
        block: remove retry loop in ioc_release_fn()
        block: remove unnecessary ioc nested locking
        block: integrate bd_start_claiming into __blkdev_get
        ...
      382625d0
    • Linus Torvalds's avatar
      Merge branch 'mtd/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux · 99f6cf61
      Linus Torvalds authored
      Pull mtd fix from Richard Weinberger.
      
      * 'mtd/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
        mtd: properly check all write ioctls for permissions
      99f6cf61
    • Linus Torvalds's avatar
      userfaultfd: simplify fault handling · f9bf3522
      Linus Torvalds authored
      Instead of waiting in a loop for the userfaultfd condition to become
      true, just wait once and return VM_FAULT_RETRY.
      
      We've already dropped the mmap lock, we know we can't really
      successfully handle the fault at this point and the caller will have to
      retry anyway.  So there's no point in making the wait any more
      complicated than it needs to be - just schedule away.
      
      And once you don't have that complexity with explicit looping, you can
      also just lose all the 'userfaultfd_signal_pending()' complexity,
      because once we've set the correct process sleeping state, and don't
      loop, the act of scheduling itself will be checking if there are any
      pending signals before going to sleep.
      
      We can also drop the VM_FAULT_MAJOR games, since we'll be treating all
      retried faults as major soon anyway (series to regularize and share more
      of fault handling across architectures in a separate series by Peter Xu,
      and in the meantime we won't worry about the possible minor - I'll be
      here all week, try the veal - accounting difference).
      
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Peter Xu <peterx@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f9bf3522
    • Linus Torvalds's avatar
      Merge tag 'filelock-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux · 3208167a
      Linus Torvalds authored
      Pull file locking fix from Jeff Layton:
       "Just a single, one-line patch to fix an inefficiency in the posix
        locking code that can lead to it doing more wakeups than necessary"
      
      * tag 'filelock-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
        locks: add locks_move_blocks in posix_lock_inode
      3208167a
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · ab5c60b7
      Linus Torvalds authored
      Pull crypto updates from Herbert Xu:
       "API:
         - Add support for allocating transforms on a specific NUMA Node
         - Introduce the flag CRYPTO_ALG_ALLOCATES_MEMORY for storage users
      
        Algorithms:
         - Drop PMULL based ghash on arm64
         - Fixes for building with clang on x86
         - Add sha256 helper that does the digest in one go
         - Add SP800-56A rev 3 validation checks to dh
      
        Drivers:
         - Permit users to specify NUMA node in hisilicon/zip
         - Add support for i.MX6 in imx-rngc
         - Add sa2ul crypto driver
         - Add BA431 hwrng driver
         - Add Ingenic JZ4780 and X1000 hwrng driver
         - Spread IRQ affinity in inside-secure and marvell/cesa"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (157 commits)
        crypto: sa2ul - Fix inconsistent IS_ERR and PTR_ERR
        hwrng: core - remove redundant initialization of variable ret
        crypto: x86/curve25519 - Remove unused carry variables
        crypto: ingenic - Add hardware RNG for Ingenic JZ4780 and X1000
        dt-bindings: RNG: Add Ingenic RNG bindings.
        crypto: caam/qi2 - add module alias
        crypto: caam - add more RNG hw error codes
        crypto: caam/jr - remove incorrect reference to caam_jr_register()
        crypto: caam - silence .setkey in case of bad key length
        crypto: caam/qi2 - create ahash shared descriptors only once
        crypto: caam/qi2 - fix error reporting for caam_hash_alloc
        crypto: caam - remove deadcode on 32-bit platforms
        crypto: ccp - use generic power management
        crypto: xts - Replace memcpy() invocation with simple assignment
        crypto: marvell/cesa - irq balance
        crypto: inside-secure - irq balance
        crypto: ecc - SP800-56A rev 3 local public key validation
        crypto: dh - SP800-56A rev 3 local public key validation
        crypto: dh - check validity of Z before export
        lib/mpi: Add mpi_sub_ui()
        ...
      ab5c60b7
    • Linus Torvalds's avatar
      Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt · 5577416c
      Linus Torvalds authored
      Pull fsverity update from Eric Biggers:
       "One fix for fs/verity/ to strengthen a memory barrier which might be
        too weak. This mirrors a similar fix in fs/crypto/"
      
      * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
        fs-verity: use smp_load_acquire() for ->i_verity_info
      5577416c
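
The fix above concerns a pointer that is set up lazily and then read without a lock. A rough userspace analogy, using C11 atomics rather than the kernel's `smp_load_acquire()`, might look like the sketch below. The writer initialises the object fully and publishes it with release semantics; the reader loads the pointer with acquire semantics, so seeing a non-NULL pointer also means seeing the initialised fields. All type and function names here (`info_sketch`, `inode_sketch`, `publish_info`, `get_info`) are hypothetical and are not the fs/verity code.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for an inode and its lazily attached info. */
struct info_sketch { int tree_params; };
struct inode_sketch { _Atomic(struct info_sketch *) info; };

/* Writer: fully initialise the object, then publish it with release
 * semantics. Returns 0 on success, -1 if allocation fails. */
int publish_info(struct inode_sketch *inode, int params)
{
    struct info_sketch *vi = malloc(sizeof(*vi));
    struct info_sketch *expected = NULL;

    if (!vi)
        return -1;
    vi->tree_params = params;
    /* Only the first publisher wins; a later one frees its own copy. */
    if (!atomic_compare_exchange_strong_explicit(&inode->info, &expected, vi,
                                                 memory_order_release,
                                                 memory_order_relaxed))
        free(vi);
    return 0;
}

/* Reader: the acquire load pairs with the release above, so a non-NULL
 * result guarantees that tree_params is visible as initialised. */
struct info_sketch *get_info(struct inode_sketch *inode)
{
    return atomic_load_explicit(&inode->info, memory_order_acquire);
}
```

A weaker (relaxed) load in `get_info` would still return the pointer, but would not by itself order the later reads of the pointed-to fields, which is the kind of gap the pull message describes as a barrier that "might be too weak".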
    • Linus Torvalds's avatar
      Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt · 690b2567
      Linus Torvalds authored
      Pull fscrypt updates from Eric Biggers:
       "This release, we add support for inline encryption via the blk-crypto
        framework which was added in 5.8.
      
        Now when an ext4 or f2fs filesystem is mounted with '-o inlinecrypt',
        the contents of encrypted files will be encrypted/decrypted via
        blk-crypto, instead of directly using the crypto API. This model
        allows taking advantage of the inline encryption hardware that is
        integrated into the UFS or eMMC host controllers on most mobile SoCs.
      
        Note that this is just an alternate implementation; the ciphertext
        written to disk stays the same.
      
        (This pull request does *not* include support for direct I/O on
        encrypted files, which blk-crypto makes possible, since that part is
        still being discussed.)
      
        Besides the above feature update, there are also a few fixes and
        cleanups, e.g. strengthening some memory barriers that may be too
        weak.
      
        All these patches have been in linux-next with no reported issues.
        I've also tested them with the fscrypt xfstests, as usual. It's also
        been tested that the inline encryption support works with the support
        for Qualcomm and Mediatek inline encryption hardware that will be in
        the scsi pull request for 5.9. Also, several SoC vendors are already
        using a previous, functionally equivalent version of these patches"
      
      * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
        fscrypt: don't load ->i_crypt_info before it's known to be valid
        fscrypt: document inline encryption support
        fscrypt: use smp_load_acquire() for ->i_crypt_info
        fscrypt: use smp_load_acquire() for ->s_master_keys
        fscrypt: use smp_load_acquire() for fscrypt_prepared_key
        fscrypt: switch fscrypt_do_sha256() to use the SHA-256 library
        fscrypt: restrict IV_INO_LBLK_* to AES-256-XTS
        fscrypt: rename FS_KEY_DERIVATION_NONCE_SIZE
        fscrypt: add comments that describe the HKDF info strings
        ext4: add inline encryption support
        f2fs: add inline encryption support
        fscrypt: add inline encryption support
        fs: introduce SB_INLINECRYPT
      690b2567
    • Linus Torvalds's avatar
      Merge tag 'for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 6dec9f40
      Linus Torvalds authored
      Pull btrfs updates from David Sterba:
       "We don't have any big feature updates this time, there are lots of
        small enhancements or fixes. A highlight perhaps is the parallel fsync
        performance improvements, numbers below.
      
        Regarding the dio/iomap that was reverted last time, the required API
        changes are likely to land in the upcoming cycle, the btrfs part will
        be updated afterwards.
      
        User visible changes:
      
         - new mount option rescue= to group all recovery-related mount
           options so we don't have many specific options, currently
           introducing only aliases for existing options, future extensions
           are in development to allow read-only mount with partially damaged
           structures:
            - usebackuproot is an alias for rescue=usebackuproot
            - nologreplay is an alias for rescue=nologreplay
      
         - start deprecation of mount option inode_cache, removal scheduled
           for v5.11
      
         - removed deprecated mount options alloc_start and subvolrootid
      
         - device stats corruption counter gets incremented when a checksum
           mismatch is found
      
         - qgroup information exported in /sys/fs/btrfs/<UUID>/qgroups/<id>
           using sysfs
      
         - add link /sys/fs/btrfs/<UUID>/bdi pointing to the associated
           backing dev info
      
         - FS_INFO ioctl enhancements:
            - add flags to request/describe newly added items
            - new item: numeric checksum type and checksum size
            - new item: generation
            - new item: metadata_uuid
      
         - seed device: with one new read-write device added, print the new
           device information in /proc/mounts
      
         - balance: detect cancellation by Ctrl-C in existing cancellation
           points
      
        Performance improvements:
      
         - optimized versions of various helpers on little-endian
           architectures, where we don't have to do LE/BE conversion from
           on-disk format
      
         - tree-log/fsync optimizations leading to lower max latency reported
           by dbench, reduced by about 12%
      
         - all chunk tree leaves are prefetched at mount time, can improve
           mount time on large (terabyte-sized) filesystems
      
         - speed up parallel fsync of files with reflinked/deduped extents,
           with jobs 16 to 1024 the throughput gets improved roughly by 50% on
           average and runtime decreased roughly by 30% on average, notable
           outlier is 128 jobs with +121.2% on throughput and -54.6% runtime
      
         - another speed up of parallel fsync, reduce number of checksum tree
           lookups and contention, the improvements start to show up with 2
           tasks with +20% throughput and -16% runtime up to 64 with +200%
           throughput and -66% runtime
      
        Core:
      
         - umount-time qgroup leak checker
      
         - qgroups
            - add a way to unreserve partial range after failure, avoiding
              some EDQUOT errors
            - improved flushing logic when EDQUOT is hit
      
         - possible EINTR interruption caused by failed reservations after
           transaction start is better handled and documented
      
         - transaction abort errors are unified to EROFS in case it's not the
           original reason for the abort or we don't have another way to
           determine the reason
      
        Fixes:
      
         - make truncate succeed on a NOCOW file even if data space is
           exhausted
      
         - fix cancelling balance on filesystem with exhausted metadata space
      
         - anon block device:
            - preallocate anon bdev when subvolume is created to report
              failure early
            - shorten time the anon bdev id is allocated
            - don't allocate anon bdev for internal roots
      
         - minor memory leak in ref-verify
      
         - refuse invalid combinations of compression and NOCOW file flags
      
         - lockdep fixes, updating the device locks
      
         - remove obsolete fallback logic for block group profile adjustments
           when switching from 1 to more devices, causing allocation of
           unwanted block groups
      
        Other cleanups, refactoring, simplifications:
      
         - conversions from struct inode to struct btrfs_inode in internal
           functions
      
         - removal of unused struct members"
      
      * tag 'for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (151 commits)
        btrfs: do not set the full sync flag on the inode during page release
        btrfs: release old extent maps during page release
        btrfs: fix race between page release and a fast fsync
        btrfs: open-code remount flag setting in btrfs_remount
        btrfs: if we're restriping, use the target restripe profile
        btrfs: don't adjust bg flags and use default allocation profiles
        btrfs: fix lockdep splat from btrfs_dump_space_info
        btrfs: move the chunk_mutex in btrfs_read_chunk_tree
        btrfs: open device without device_list_mutex
        btrfs: sysfs: use NOFS for device creation
        btrfs: return EROFS for BTRFS_FS_STATE_ERROR cases
        btrfs: document special case error codes for fs errors
        btrfs: don't WARN if we abort a transaction with EROFS
        btrfs: reduce contention on log trees when logging checksums
        btrfs: remove done label in writepage_delalloc
        btrfs: add comments for btrfs_reserve_flush_enum
        btrfs: relocation: review the call sites which can be interrupted by signal
        btrfs: avoid possible signal interruption of btrfs_drop_snapshot() on relocation tree
        btrfs: relocation: allow signal to cancel balance
        btrfs: raid56: remove out label in __raid56_parity_recover
        ...
      6dec9f40
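
The throughput and runtime percentages quoted in the btrfs performance items above can be cross-checked with a little arithmetic. If the total amount of work is the same before and after (an assumption; the pull message does not describe the benchmark setup), a throughput gain of g percent scales the runtime by 1 / (1 + g/100). The helper name `runtime_change_pct` is made up for this sketch.

```c
/* Runtime change (in percent) implied by a throughput gain (in percent),
 * assuming the same total amount of work is done before and after. */
double runtime_change_pct(double throughput_gain_pct)
{
    return (1.0 / (1.0 + throughput_gain_pct / 100.0) - 1.0) * 100.0;
}
```

Under that assumption, gains of +20%, +121.2% and +200% imply runtime changes of about -16.7%, -54.8% and -66.7%, which is close to the -16%, -54.6% and -66% quoted in the message.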
    • Linus Torvalds's avatar
      Merge tag 'tpmdd-next-v5.9' of git://git.infradead.org/users/jjs/linux-tpmdd · 92b7e492
      Linus Torvalds authored
      Pull tpm updates from Jarkko Sakkinen:
       "An issue was fixed with the TPM space buffer size. The buffer is used
        to store in-TPM objects while swapped out of the TPM for a /dev/tpmrm0
        session. The code incorrectly used PAGE_SIZE, which obviously can
        vary. With these changes the buffer has a fixed size of 16 kB.
      
        In addition, this contains support for acquiring the TPM event log
        from the TPM2 ACPI table. This method is used by QEMU in particular"
      
      * tag 'tpmdd-next-v5.9' of git://git.infradead.org/users/jjs/linux-tpmdd:
        tpm: Add support for event log pointer found in TPM2 ACPI table
        acpi: Extend TPM2 ACPI table with missing log fields
        tpm: Unify the mismatching TPM space buffer sizes
        tpm: Require that all digests are present in TCG_PCR_EVENT2 structures
      92b7e492