1. 31 Jan, 2020 35 commits
  2. 30 Jan, 2020 5 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 39bed42d
      Linus Torvalds authored
      Pull mmu_notifier updates from Jason Gunthorpe:
       "This small series revises the names in mmu_notifier to make the code
        clearer and more readable"
      
      * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        mm/mmu_notifiers: Use 'interval_sub' as the variable for mmu_interval_notifier
        mm/mmu_notifiers: Use 'subscription' as the variable name for mmu_notifier
        mm/mmu_notifier: Rename struct mmu_notifier_mm to mmu_notifier_subscriptions
      39bed42d
    • Linus Torvalds's avatar
      Merge tag 'threads-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 83fa805b
      Linus Torvalds authored
      Pull thread management updates from Christian Brauner:
       "Sargun Dhillon over the last cycle has worked on the pidfd_getfd()
        syscall.
      
        This syscall allows for the retrieval of file descriptors of a process
        based on its pidfd. A task needs to have ptrace_may_access()
        permissions with PTRACE_MODE_ATTACH_REALCREDS (suggested by Oleg and
        Andy) on the target.
      
        One of the main use-cases is in combination with seccomp's user
        notification feature. As a reminder, seccomp's user notification
        feature was made available in v5.0. It allows a task to retrieve a
        file descriptor for its seccomp filter. The file descriptor is usually
        handed of to a more privileged supervising process. The supervisor can
        then listen for syscall events caught by the seccomp filter of the
        supervisee and perform actions in lieu of the supervisee, usually
        emulating syscalls. pidfd_getfd() is needed to expand its uses.
      
        There are currently two major users that wait on pidfd_getfd() and one
        future user:
      
         - Netflix, Sargun said, is working on a service mesh where users
           should be able to connect to a dns-based VIP. When a user connects
           to e.g. 1.2.3.4:80 that runs e.g. service "foo" they will be
           redirected to an envoy process. This service mesh uses seccomp user
           notifications and pidfd to intercept all connect calls and instead
           of connecting them to 1.2.3.4:80 connects them to e.g.
           127.0.0.1:8080.
      
         - LXD uses the seccomp notifier heavily to intercept and emulate
           mknod() and mount() syscalls for unprivileged containers/processes.
           With pidfd_getfd() more uses-cases e.g. bridging socket connections
           will be possible.
      
         - The patchset has also seen some interest from the browser corner.
           Right now, Firefox is using a SECCOMP_RET_TRAP sandbox managed by a
           broker process. In the future glibc will start blocking all signals
           during dlopen() rendering this type of sandbox impossible. Hence,
           in the future Firefox will switch to a seccomp-user-nofication
           based sandbox which also makes use of file descriptor retrieval.
           The thread for this can be found at
           https://sourceware.org/ml/libc-alpha/2019-12/msg00079.html
      
        With pidfd_getfd() it is e.g. possible to bridge socket connections
        for the supervisee (binding to a privileged port) and taking actions
        on file descriptors on behalf of the supervisee in general.
      
        Sargun's first version was using an ioctl on pidfds but various people
        pushed for it to be a proper syscall which he duely implemented as
        well over various review cycles. Selftests are of course included.
        I've also added instructions how to deal with merge conflicts below.
      
        There's also a small fix coming from the kernel mentee project to
        correctly annotate struct sighand_struct with __rcu to fix various
        sparse warnings. We've received a few more such fixes and even though
        they are mostly trivial I've decided to postpone them until after -rc1
        since they came in rather late and I don't want to risk introducing
        build warnings.
      
        Finally, there's a new prctl() command PR_{G,S}ET_IO_FLUSHER which is
        needed to avoid allocation recursions triggerable by storage drivers
        that have userspace parts that run in the IO path (e.g. dm-multipath,
        iscsi, etc). These allocation recursions deadlock the device.
      
        The new prctl() allows such privileged userspace components to avoid
        allocation recursions by setting the PF_MEMALLOC_NOIO and
        PF_LESS_THROTTLE flags. The patch carries the necessary acks from the
        relevant maintainers and is routed here as part of prctl()
        thread-management."
      
      * tag 'threads-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim
        sched.h: Annotate sighand_struct with __rcu
        test: Add test for pidfd getfd
        arch: wire up pidfd_getfd syscall
        pid: Implement pidfd_getfd syscall
        vfs, fdtable: Add fget_task helper
      83fa805b
    • Linus Torvalds's avatar
      Merge tag 'for-5.6/io_uring-vfs-2020-01-29' of git://git.kernel.dk/linux-block · 896f8d23
      Linus Torvalds authored
      Pull io_uring updates from Jens Axboe:
      
       - Support for various new opcodes (fallocate, openat, close, statx,
         fadvise, madvise, openat2, non-vectored read/write, send/recv, and
         epoll_ctl)
      
       - Faster ring quiesce for fileset updates
      
       - Optimizations for overflow condition checking
      
       - Support for max-sized clamping
      
       - Support for probing what opcodes are supported
      
       - Support for io-wq backend sharing between "sibling" rings
      
       - Support for registering personalities
      
       - Lots of little fixes and improvements
      
      * tag 'for-5.6/io_uring-vfs-2020-01-29' of git://git.kernel.dk/linux-block: (64 commits)
        io_uring: add support for epoll_ctl(2)
        eventpoll: support non-blocking do_epoll_ctl() calls
        eventpoll: abstract out epoll_ctl() handler
        io_uring: fix linked command file table usage
        io_uring: support using a registered personality for commands
        io_uring: allow registering credentials
        io_uring: add io-wq workqueue sharing
        io-wq: allow grabbing existing io-wq
        io_uring/io-wq: don't use static creds/mm assignments
        io-wq: make the io_wq ref counted
        io_uring: fix refcounting with batched allocations at OOM
        io_uring: add comment for drain_next
        io_uring: don't attempt to copy iovec for READ/WRITE
        io_uring: honor IOSQE_ASYNC for linked reqs
        io_uring: prep req when do IOSQE_ASYNC
        io_uring: use labeled array init in io_op_defs
        io_uring: optimise sqe-to-req flags translation
        io_uring: remove REQ_F_IO_DRAINED
        io_uring: file switch work needs to get flushed on exit
        io_uring: hide uring_fd in ctx
        ...
      896f8d23
    • Linus Torvalds's avatar
      Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 33c84e89
      Linus Torvalds authored
      Pull SCSI updates from James Bottomley:
       "This series is slightly unusual because it includes Arnd's compat
        ioctl tree here:
      
          1c46a2cf Merge tag 'block-ioctl-cleanup-5.6' into 5.6/scsi-queue
      
        Excluding Arnd's changes, this is mostly an update of the usual
        drivers: megaraid_sas, mpt3sas, qla2xxx, ufs, lpfc, hisi_sas.
      
        There are a couple of core and base updates around error propagation
        and atomicity in the attribute container base we use for the SCSI
        transport classes.
      
        The rest is minor changes and updates"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (149 commits)
        scsi: hisi_sas: Rename hisi_sas_cq.pci_irq_mask
        scsi: hisi_sas: Add prints for v3 hw interrupt converge and automatic affinity
        scsi: hisi_sas: Modify the file permissions of trigger_dump to write only
        scsi: hisi_sas: Replace magic number when handle channel interrupt
        scsi: hisi_sas: replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock
        scsi: hisi_sas: use threaded irq to process CQ interrupts
        scsi: ufs: Use UFS device indicated maximum LU number
        scsi: ufs: Add max_lu_supported in struct ufs_dev_info
        scsi: ufs: Delete is_init_prefetch from struct ufs_hba
        scsi: ufs: Inline two functions into their callers
        scsi: ufs: Move ufshcd_get_max_pwr_mode() to ufshcd_device_params_init()
        scsi: ufs: Split ufshcd_probe_hba() based on its called flow
        scsi: ufs: Delete struct ufs_dev_desc
        scsi: ufs: Fix ufshcd_probe_hba() reture value in case ufshcd_scsi_add_wlus() fails
        scsi: ufs-mediatek: enable low-power mode for hibern8 state
        scsi: ufs: export some functions for vendor usage
        scsi: ufs-mediatek: add dbg_register_dump implementation
        scsi: qla2xxx: Fix a NULL pointer dereference in an error path
        scsi: qla1280: Make checking for 64bit support consistent
        scsi: megaraid_sas: Update driver version to 07.713.01.00-rc1
        ...
      33c84e89
    • Linus Torvalds's avatar
      Merge tag 'for-5.6/dm-changes' of... · e9f8ca0a
      Linus Torvalds authored
      Merge tag 'for-5.6/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper updates from Mike Snitzer:
      
       - Fix DM core's potential for q->make_request_fn NULL pointer in the
         unlikely case that a DM device is created without a DM table and then
         accessed due to upper-layer userspace code or user error.
      
       - Fix DM thin-provisioning's metadata_pre_commit_callback to not use
         memory after it is free'd. Also refactor code to disallow changing
         the thin-pool's data device once in use -- doing so guarantees smae
         lifetime of pool's data device relative to the pool metadata.
      
       - Fix DM space maps used by DM thinp and DM cache to avoid reuse of a
         already used block. This race was identified with extremely heavy
         snapshot use in the context of DM thin provisioning.
      
       - Fix DM raid's table status relative to an active rebuild.
      
       - Fix DM crypt to use GFP_NOIO rather than GFP_NOFS in call to
         skcipher_request_alloc(). Also fix benbi IV constructor crash if used
         in authenticated mode.
      
       - Add DM crypt support for Elephant diffuser to allow for Bitlocker
         compatibility.
      
       - Fix DM verity target to not prefetch hash blocks for data that has
         already been verified.
      
       - Fix DM writecache's incorrect flush sequence during commit when in
         SSD mode.
      
       - Improve DM writecache's sequential write performance on SSDs.
      
       - Add DM zoned target support for zone sizes smaller than 128MiB.
      
       - Add DM multipath 'queue_if_no_path_timeout_secs' module param to
         allow timeout if path isn't reinstated. This allows users a kernel
         safety-net against IO hanging indefinitely, due to no active paths,
         that has historically only been provided by multipathd userspace.
      
       - Various DM code cleanups to use true/false rather than 1/0, a
         variable rename in dm-dust, and fix for a math error in comment for
         DM thin metadata's ondisk format.
      
      * tag 'for-5.6/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (21 commits)
        dm: fix potential for q->make_request_fn NULL pointer
        dm writecache: improve performance of large linear writes on SSDs
        dm mpath: Add timeout mechanism for queue_if_no_path
        dm thin: change data device's flush_bio to be member of struct pool
        dm thin: don't allow changing data device during thin-pool reload
        dm thin: fix use-after-free in metadata_pre_commit_callback
        dm thin metadata: use pool locking at end of dm_pool_metadata_close
        dm writecache: fix incorrect flush sequence when doing SSD mode commit
        dm crypt: fix benbi IV constructor crash if used in authenticated mode
        dm crypt: Implement Elephant diffuser for Bitlocker compatibility
        dm space map common: fix to ensure new block isn't already in use
        dm verity: don't prefetch hash blocks for already-verified data
        dm crypt: fix GFP flags passed to skcipher_request_alloc()
        dm thin metadata: Fix trivial math error in on-disk format documentation
        dm thin metadata: use true/false for bool variable
        dm snapshot: use true/false for bool variable
        dm bio prison v2: use true/false for bool variable
        dm mpath: use true/false for bool variable
        dm zoned: support zone sizes smaller than 128MiB
        dm raid: table line rebuild status fixes
        ...
      e9f8ca0a