1. 14 Dec, 2020 10 commits
    • ovl: unprivileged mounts · 459c7c56
      Miklos Szeredi authored
      Enable unprivileged user namespace mounts of overlayfs.  Overlayfs's
      permission model (*) ensures that the mounter itself cannot gain additional
      privileges by the act of creating an overlayfs mount.
      
      This feature request is coming from the "rootless" container crowd.
      
      (*) Documentation/filesystems/overlayfs.txt#Permission model
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: do not get metacopy for userxattr · 87b2c60c
      Miklos Szeredi authored
      When looking up an inode on the lower layer for which the mounter lacks
      read permission, the metacopy check will fail.  This causes the lookup to
      fail as well, even though the directory is readable.
      
      So ignore EACCES for the "userxattr" case and assume no metacopy for the
      unreadable file.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: do not fail because of O_NOATIME · b6650dab
      Miklos Szeredi authored
      If the file cannot be opened with O_NOATIME because of a lack of
      capabilities, clear O_NOATIME instead of failing.
      
      Remove WARN_ON(), since it would now trigger if O_NOATIME was cleared.
      Noticed by Amir Goldstein.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
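The "clear O_NOATIME instead of failing" behaviour has a rough user-space analogue, sketched here in Python. This is only an illustration of the idea, not the kernel patch: `open_dropping_noatime` and its `opener` and `noatime` parameters are hypothetical names, and the sketch retries after a refusal, which may differ from how overlayfs decides to clear the flag.

```python
import errno
import os

# O_NOATIME is Linux-specific; fall back to 0 where the constant is absent.
O_NOATIME = getattr(os, "O_NOATIME", 0)


def open_dropping_noatime(path, flags, opener=os.open, noatime=O_NOATIME):
    """Open `path`; if O_NOATIME is refused with EPERM, retry without it.

    `opener` is injectable so the fallback can be exercised without
    depending on file ownership or capabilities.
    """
    try:
        return opener(path, flags)
    except PermissionError as exc:
        # Only fall back for the specific "flag not permitted" case.
        if exc.errno != errno.EPERM or not (flags & noatime):
            raise
        return opener(path, flags & ~noatime)
```

A caller would pass `os.O_RDONLY | O_NOATIME` and receive a usable descriptor either way; any error other than EPERM still propagates.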
    • ovl: do not fail when setting origin xattr · 6939f977
      Miklos Szeredi authored
      The comment above the call already says this, but only EOPNOTSUPP is
      ignored; other failures are not.
      
      For example setting "user.*" will fail with EPERM on symlink/special.
      
      Ignore this error as well.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
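The error handling described in this commit amounts to a best-effort write with a small allowlist of ignorable errno values. A minimal Python sketch of that pattern follows; `set_origin_best_effort`, `IGNORABLE` and the injected `set_xattr` callable are hypothetical names for illustration and do not mirror the kernel function.

```python
import errno

# Errors treated as "the origin xattr simply cannot be stored here".
IGNORABLE = frozenset({errno.EOPNOTSUPP, errno.EPERM})


def set_origin_best_effort(set_xattr, name, value):
    """Try to store an xattr; return False if it was skipped.

    `set_xattr` is any callable taking (name, value) and raising OSError
    on failure, e.g. a functools.partial around os.setxattr on Linux.
    """
    try:
        set_xattr(name, value)
    except OSError as exc:
        if exc.errno in IGNORABLE:
            return False  # ignored: copy-up can proceed without the xattr
        raise  # any other failure is still reported
    return True
```

Keeping the ignorable set explicit makes it clear that unrelated failures such as EIO are still surfaced.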
    • ovl: user xattr · 2d2f2d73
      Miklos Szeredi authored
      Optionally allow using "user.overlay." namespace instead of
      "trusted.overlay."
      
      This is necessary for overlayfs to be mountable in an unprivileged
      namespace.
      
      Make the option explicit, since it makes the filesystem format
      incompatible.
      
      Disable redirect_dir and metacopy options, because these would allow
      privilege escalation through direct manipulation of the
      "user.overlay.redirect" or "user.overlay.metacopy" xattrs.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
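The effect of the option can be modelled in a few lines of Python. This is a toy model under stated assumptions, not the kernel code: `overlay_xattr_name` and `effective_options` are hypothetical helpers, and the model simply forces redirect_dir and metacopy off, without claiming how the kernel reports a conflicting request.

```python
TRUSTED_PREFIX = "trusted.overlay."
USER_PREFIX = "user.overlay."


def overlay_xattr_name(suffix, userxattr):
    """Return the full xattr name for an overlay attribute such as 'origin'."""
    prefix = USER_PREFIX if userxattr else TRUSTED_PREFIX
    return prefix + suffix


def effective_options(userxattr, redirect_dir=False, metacopy=False):
    """Model of the option interplay described in the commit message.

    With userxattr enabled, redirect_dir and metacopy are disabled because
    an unprivileged user could otherwise forge the corresponding
    user.overlay.* xattrs directly.
    """
    if userxattr:
        redirect_dir = False
        metacopy = False
    return {
        "userxattr": userxattr,
        "redirect_dir": redirect_dir,
        "metacopy": metacopy,
    }
```

For example, `overlay_xattr_name("redirect", userxattr=True)` yields `user.overlay.redirect`, the attribute an unprivileged user could set by hand, which is why the model switches the feature off.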
    • ovl: simplify file splice · 82a763e6
      Miklos Szeredi authored
      generic_file_splice_read() and iter_file_splice_write() will call back into
      f_op->iter_read() and f_op->iter_write() respectively.  These already do
      the real file lookup and cred override.  So the code in ovl_splice_read()
      and ovl_splice_write() is redundant.
      
      In addition the ovl_file_accessed() call in ovl_splice_write() is
      incorrect, though probably harmless.
      
      Fix by calling generic_file_splice_read() and iter_file_splice_write()
      directly.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: make ioctl() safe · 89bdfaf9
      Miklos Szeredi authored
      ovl_ioctl_set_flags() does a capability check using flags, but then the
      real ioctl double-fetches flags and uses a potentially different value.
      
      The "Check the capability before cred override" comment is misleading: a
      user can skip this check by presenting benign flags first and then
      overwriting them with non-benign flags.
      
      Just remove the cred override for now, hoping this doesn't cause a
      regression.
      
      The proper solution is to create a new setxflags i_op (patches are in the
      works).
      
      Xfstests don't show a regression.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Fixes: dab5ca8f ("ovl: add lsattr/chattr support")
      Cc: <stable@vger.kernel.org> # v4.19
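The double-fetch problem in this commit message can be simulated deterministically in Python. The sketch illustrates the bug class only; it is not the fix the commit applies (the commit removes the cred override). `FlippingBuffer`, `PRIVILEGED` and both functions are hypothetical names invented for the example.

```python
PRIVILEGED = 0x10  # hypothetical flag bit that requires a capability


class FlippingBuffer:
    """Stand-in for user memory that changes between successive reads."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def fetch(self):
        # Return the next scripted value; repeat the last one afterwards.
        value = self._values[min(self.reads, len(self._values) - 1)]
        self.reads += 1
        return value


def set_flags_double_fetch(buf, capable):
    """Checks one fetch, then acts on a second fetch (the flawed pattern)."""
    if (buf.fetch() & PRIVILEGED) and not capable:
        raise PermissionError("capability required")
    return buf.fetch()  # may differ from the value that was checked


def set_flags_single_fetch(buf, capable):
    """Fetches once and both checks and uses that same snapshot."""
    flags = buf.fetch()
    if (flags & PRIVILEGED) and not capable:
        raise PermissionError("capability required")
    return flags
```

With a buffer scripted as `[0, PRIVILEGED]`, the double-fetch variant passes the check on the benign value and then applies the privileged one, while the single-fetch variant only ever sees the benign value.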
    • ovl: check privs before decoding file handle · c846af05
      Miklos Szeredi authored
      CAP_DAC_READ_SEARCH is required by open_by_handle_at(2) so check it in
      ovl_decode_real_fh() as well to prevent privilege escalation for
      unprivileged overlay mounts.
      
      [Amir] If the mounter is not capable in init ns, ovl_check_origin() and
      ovl_verify_index() will not function as expected and this will break index
      and nfs export features.  So check capability in ovl_can_decode_fh(), to
      auto disable those features.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • vfs: verify source area in vfs_dedupe_file_range_one() · 3078d85c
      Miklos Szeredi authored
      Call remap_verify_area() on the source file as well as the destination.
      
      When called from vfs_dedupe_file_range() the check has already been
      performed, but not so if called from a layered fs (overlayfs, etc.).
      
      The redundant check in vfs_dedupe_file_range() could be omitted, but leave
      it for now to get the error early (for fear of breaking backward
      compatibility).
      
      This call shouldn't be performance sensitive.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • vfs: move cap_convert_nscap() call into vfs_setxattr() · 7c03e2cd
      Miklos Szeredi authored
      cap_convert_nscap() does permission checking as well as conversion of the
      xattr value conditionally based on fs's user-ns.
      
      This is needed by overlayfs and probably other layered fs (ecryptfs) and is
      what vfs_foo() is supposed to do anyway.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Acked-by: James Morris <jamorris@linux.microsoft.com>
  2. 12 Nov, 2020 7 commits
  3. 25 Oct, 2020 17 commits
  4. 24 Oct, 2020 6 commits
    • Merge tag 'block-5.10-2020-10-24' of git://git.kernel.dk/linux-block · d7691390
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - NVMe pull request from Christoph
           - rdma error handling fixes (Chao Leng)
           - fc error handling and reconnect fixes (James Smart)
           - fix the qid displace when tracing ioctl command (Keith Busch)
           - don't use BLK_MQ_REQ_NOWAIT for passthru (Chaitanya Kulkarni)
           - fix MDTS for passthru (Logan Gunthorpe)
           - blacklist Write Same on more devices (Kai-Heng Feng)
           - fix an uninitialized work struct (zhenwei pi)
      
       - lightnvm out-of-bounds fix (Colin)
      
       - SG allocation leak fix (Doug)
      
       - rnbd fixes (Gioh, Guoqing, Jack)
      
       - zone error translation fixes (Keith)
      
       - kerneldoc markup fix (Mauro)
      
       - zram lockdep fix (Peter)
      
       - Kill unused io_context members (Yufen)
      
       - NUMA memory allocation cleanup (Xianting)
      
       - NBD config wakeup fix (Xiubo)
      
      * tag 'block-5.10-2020-10-24' of git://git.kernel.dk/linux-block: (27 commits)
        block: blk-mq: fix a kernel-doc markup
        nvme-fc: shorten reconnect delay if possible for FC
        nvme-fc: wait for queues to freeze before calling update_hr_hw_queues
        nvme-fc: fix error loop in create_hw_io_queues
        nvme-fc: fix io timeout to abort I/O
        null_blk: use zone status for max active/open
        nvmet: don't use BLK_MQ_REQ_NOWAIT for passthru
        nvmet: cleanup nvmet_passthru_map_sg()
        nvmet: limit passthru MTDS by BIO_MAX_PAGES
        nvmet: fix uninitialized work for zero kato
        nvme-pci: disable Write Zeroes on Sandisk Skyhawk
        nvme: use queuedata for nvme_req_qid
        nvme-rdma: fix crash due to incorrect cqe
        nvme-rdma: fix crash when connect rejected
        block: remove unused members for io_context
        blk-mq: remove the calling of local_memory_node()
        zram: Fix __zram_bvec_{read,write}() locking order
        skd_main: remove unused including <linux/version.h>
        sgl_alloc_order: fix memory leak
        lightnvm: fix out-of-bounds write to array devices->info[]
        ...
    • Merge tag 'io_uring-5.10-2020-10-24' of git://git.kernel.dk/linux-block · af004187
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - fsize was missed in previous unification of work flags
      
       - Few fixes cleaning up the flags unification creds cases (Pavel)
      
       - Fix NUMA affinities for completely unplugged/replugged node for io-wq
      
       - Two fallout fixes from the set_fs changes. One local to io_uring, one
         for the splice entry point that io_uring uses.
      
       - Linked timeout fixes (Pavel)
      
       - Removal of ->flush() ->files work-around that we don't need anymore
         with referenced files (Pavel)
      
       - Various cleanups (Pavel)
      
      * tag 'io_uring-5.10-2020-10-24' of git://git.kernel.dk/linux-block:
        splice: change exported internal do_splice() helper to take kernel offset
        io_uring: make loop_rw_iter() use original user supplied pointers
        io_uring: remove req cancel in ->flush()
        io-wq: re-set NUMA node affinities if CPUs come online
        io_uring: don't reuse linked_timeout
        io_uring: unify fsize with def->work_flags
        io_uring: fix racy REQ_F_LINK_TIMEOUT clearing
        io_uring: do poll's hash_node init in common code
        io_uring: inline io_poll_task_handler()
        io_uring: remove extra ->file check in poll prep
        io_uring: make cached_cq_overflow non atomic_t
        io_uring: inline io_fail_links()
        io_uring: kill ref get/drop in personality init
        io_uring: flags-based creds init in queue
    • Merge tag 'libata-5.10-2020-10-24' of git://git.kernel.dk/linux-block · cb6b2897
      Linus Torvalds authored
      Pull libata fixes from Jens Axboe:
       "Two minor libata fixes:
      
         - Fix a DMA boundary mask regression for sata_rcar (Geert)
      
         - kerneldoc markup fix (Mauro)"
      
      * tag 'libata-5.10-2020-10-24' of git://git.kernel.dk/linux-block:
        ata: fix some kernel-doc markups
        ata: sata_rcar: Fix DMA boundary mask
    • Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 0eac1102
      Linus Torvalds authored
      Pull misc vfs updates from Al Viro:
       "Assorted stuff all over the place (the largest group here is
        Christoph's stat cleanups)"
      
      * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs: remove KSTAT_QUERY_FLAGS
        fs: remove vfs_stat_set_lookup_flags
        fs: move vfs_fstatat out of line
        fs: implement vfs_stat and vfs_lstat in terms of vfs_fstatat
        fs: remove vfs_statx_fd
        fs: omfs: use kmemdup() rather than kmalloc+memcpy
        [PATCH] reduce boilerplate in fsid handling
        fs: Remove duplicated flag O_NDELAY occurring twice in VALID_OPEN_FLAGS
        selftests: mount: add nosymfollow tests
        Add a "nosymfollow" mount option.
    • Merge tag 'dma-mapping-5.10-1' of git://git.infradead.org/users/hch/dma-mapping · 1b307ac8
      Linus Torvalds authored
      Pull dma-mapping fixes from Christoph Hellwig:
      
       - document the new dma_{alloc,free}_pages() API
      
       - two fixups for the dma-mapping.h split
      
      * tag 'dma-mapping-5.10-1' of git://git.infradead.org/users/hch/dma-mapping:
        dma-mapping: document dma_{alloc,free}_pages
        dma-mapping: move more functions to dma-map-ops.h
        ARM/sa1111: add a missing include of dma-map-ops.h
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 9bf8d8bc
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "Two fixes for this merge window, and an unrelated bugfix for a host
        hang"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: ioapic: break infinite recursion on lazy EOI
        KVM: vmx: rename pi_init to avoid conflict with paride
        KVM: x86/mmu: Avoid modulo operator on 64-bit value to fix i386 build