1. 14 May, 2020 9 commits
    • statx: add mount_root · 80340fe3
      Miklos Szeredi authored
      Determining whether a path or file descriptor refers to a mountpoint (or
      more precisely a mount root) is not trivial using current tools.
      
      Add a flag to statx that indicates whether the path or fd refers to the
      root of a mount or not.
      
      Cc: linux-api@vger.kernel.org
      Cc: linux-man@vger.kernel.org
      Reported-by: Lennart Poettering <mzxreary@0pointer.de>
      Reported-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • statx: add mount ID · fa2fcf4f
      Miklos Szeredi authored
      Systemd is hacking around to get it and it's trivial to add to statx, so...
      
      Cc: linux-api@vger.kernel.org
      Cc: linux-man@vger.kernel.org
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • statx: don't clear STATX_ATIME on SB_RDONLY · 761e28fa
      Miklos Szeredi authored
      IS_NOATIME(inode) is defined as __IS_FLG(inode, SB_RDONLY|SB_NOATIME), so
      generic_fillattr() will clear STATX_ATIME from the result_mask if the super
      block is marked read only.
      
      This was probably not the intention, so fix to only clear STATX_ATIME if
      the fs doesn't support atime at all.

      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
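The flag logic described in this commit can be modelled in a few lines. This is a simplified userspace sketch, not the kernel code: the constants mirror the kernel's values for SB_RDONLY, SB_NOATIME and STATX_ATIME, while the function names `result_mask_before`, `result_mask_after` and `demo_rdonly_keeps_atime` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror the kernel definitions; the helpers are a simplified model. */
#define SB_RDONLY    1u
#define SB_NOATIME   1024u
#define STATX_ATIME  0x00000020u

/* Old behaviour: the test is true for read-only OR noatime superblocks. */
uint32_t result_mask_before(uint32_t sb_flags, uint32_t mask)
{
    if (sb_flags & (SB_RDONLY | SB_NOATIME))
        mask &= ~STATX_ATIME;
    return mask;
}

/* Fixed behaviour: only SB_NOATIME clears the bit. */
uint32_t result_mask_after(uint32_t sb_flags, uint32_t mask)
{
    if (sb_flags & SB_NOATIME)
        mask &= ~STATX_ATIME;
    return mask;
}

/* Worked example: a read-only superblock that does support atime. */
void demo_rdonly_keeps_atime(void)
{
    assert(result_mask_before(SB_RDONLY, STATX_ATIME) == 0);          /* bit lost */
    assert(result_mask_after(SB_RDONLY, STATX_ATIME) == STATX_ATIME); /* bit kept */
}
```

In this model a read-only mount still reports a valid atime after the fix, whereas a superblock flagged SB_NOATIME behaves the same before and after.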
    • uapi: deprecate STATX_ALL · 581701b7
      Miklos Szeredi authored
      Constants of the *_ALL type can be actively harmful because developers
      will usually fail to consider the possible effects of future changes to
      the definition.
      
      Deprecate STATX_ALL in the uapi, while no damage has been done yet.
      
      We could keep something like this around in the kernel, but there's
      actually no point, since all filesystems should be explicitly checking
      flags that they support and not rely on the VFS masking unknown ones out: a
      flag could be known to the VFS, yet not known to the filesystem.
      
      Cc: David Howells <dhowells@redhat.com>
      Cc: linux-api@vger.kernel.org
      Cc: linux-man@vger.kernel.org
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • utimensat: AT_EMPTY_PATH support · 44a3b874
      Miklos Szeredi authored
      This makes it possible to use utimensat on an O_PATH file (including
      symlinks).
      
      It supersedes the nonstandard utimensat(fd, NULL, ...) form.
      
      Cc: linux-api@vger.kernel.org
      Cc: linux-man@vger.kernel.org
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
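A minimal sketch of the call form this commit enables, assuming a kernel that includes the change. The helper names `touch_fd` and `touch_nofollow` are hypothetical. On kernels without AT_EMPTY_PATH support in utimensat the call is expected to fail, so callers should check the return value.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      /* open(), O_PATH, O_NOFOLLOW, AT_EMPTY_PATH */
#include <sys/stat.h>   /* utimensat() */
#include <unistd.h>     /* close() */

/*
 * Set atime and mtime of the object behind fd to the current time.
 * fd may be an O_PATH descriptor. Returns 0 on success, -1 on error.
 */
int touch_fd(int fd)
{
    /* times == NULL asks for both timestamps to be set to "now". */
    return utimensat(fd, "", NULL, AT_EMPTY_PATH);
}

/* Touch a path itself, without following a trailing symlink. */
int touch_nofollow(const char *path)
{
    int fd, ret, saved_errno;

    assert(path != NULL);

    fd = open(path, O_PATH | O_NOFOLLOW | O_CLOEXEC);
    if (fd < 0)
        return -1;

    ret = touch_fd(fd);

    /* Preserve the errno from utimensat() across close(). */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return ret;
}
```

Passing an empty string together with AT_EMPTY_PATH makes the operation apply to the descriptor itself, which is what allows it to work on an O_PATH descriptor that refers to a symlink.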
    • vfs: split out access_override_creds() · 94704515
      Miklos Szeredi authored
      Split out a helper that overrides the credentials in preparation for
      actually doing the access check.
      
      This prepares for the next patch that optionally disables the creds
      override.

      Suggested-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • proc/mounts: add cursor · 9f6c61f9
      Miklos Szeredi authored
      If mounts are deleted after a read(2) call on /proc/self/mounts (or its
      kin), the subsequent read(2) could miss a mount that comes after the
      deleted one in the list.  This is because the file position is interpreted
      as the number of mount entries from the start of the list.
      
      E.g. first read gets entries #0 to #9; the seq file index will be 10.  Then
      entry #5 is deleted, resulting in #10 becoming #9 and #11 becoming #10,
      etc...  The next read will continue from entry #10, and #9 is missed.
      
      Solve this by adding a cursor entry for each open instance.  Taking the
      global namespace_sem for write seems excessive, since we are only dealing
      with a per-namespace list.  Instead add a per-namespace spinlock and use
      that together with namespace_sem taken for read to protect against
      concurrent modification of the mount list.  This may reduce parallelism of
      is_local_mountpoint(), but it's hardly a big contention point.  We could
      also use RCU freeing of cursors to make traversal not need additional
      locks, if that turns out to be necessary.
      
      Only move the cursor once for each read (cursor is not added on open) to
      minimize cacheline invalidation.  When EOF is reached, the cursor is taken
      off the list, in order to prevent an excessive number of cursors due to
      inactive open file descriptors.

      Reported-by: Karel Zak <kzak@redhat.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
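The skipped-entry problem and the cursor idea can be reproduced with a small userspace model. This is only a sketch under simplifying assumptions: the mount list is a plain array, there is a single reader, no locking is modelled, and all names (`mount_list`, `read_by_index`, `read_by_cursor`, the `CURSOR` marker) are hypothetical rather than taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 32
#define CURSOR (-1)   /* hypothetical marker for the reader's cursor entry */

struct mount_list {
    int    ids[MAX_ENTRIES];
    size_t len;
};

/* Remove the first element equal to id (no-op if absent). */
static void list_remove(struct mount_list *l, int id)
{
    for (size_t i = 0; i < l->len; i++) {
        if (l->ids[i] == id) {
            memmove(&l->ids[i], &l->ids[i + 1],
                    (l->len - i - 1) * sizeof(int));
            l->len--;
            return;
        }
    }
}

/* Insert id so that it ends up at index pos. */
static void list_insert(struct mount_list *l, size_t pos, int id)
{
    assert(l->len < MAX_ENTRIES && pos <= l->len);
    memmove(&l->ids[pos + 1], &l->ids[pos], (l->len - pos) * sizeof(int));
    l->ids[pos] = id;
    l->len++;
}

/* Old scheme: resume at a numeric offset from the start of the list. */
size_t read_by_index(const struct mount_list *l, size_t *pos,
                     size_t count, int *out)
{
    size_t n = 0;

    while (n < count && *pos < l->len)
        out[n++] = l->ids[(*pos)++];
    return n;
}

/* New scheme: resume after a cursor entry that lives in the list itself. */
size_t read_by_cursor(struct mount_list *l, bool *at_eof,
                      size_t count, int *out)
{
    size_t start = 0, n = 0;

    if (*at_eof)
        return 0;

    /* Find and unlink the cursor left by the previous read, if any. */
    for (size_t i = 0; i < l->len; i++) {
        if (l->ids[i] == CURSOR) {
            start = i;
            list_remove(l, CURSOR);
            break;
        }
    }

    while (n < count && start < l->len)
        out[n++] = l->ids[start++];

    /* Park the cursor after the last entry returned; drop it at EOF. */
    if (start < l->len)
        list_insert(l, start, CURSOR);
    else
        *at_eof = true;
    return n;
}
```

With twelve entries numbered 0 to 11, reading ten, removing entry 5 and reading again, `read_by_index` returns only entry 11 on the second read, while `read_by_cursor` returns entries 10 and 11. The cursor moves with the list when an earlier entry is removed, which a numeric offset cannot do.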
    • aio: fix async fsync creds · 530f32fc
      Miklos Szeredi authored
      Avi Kivity reports that on fuse filesystems running in a user namespace
      asynchronous fsync fails with EOVERFLOW.
      
      The reason is that f_ops->fsync() is called with the creds of the kthread
      performing aio work instead of the creds of the process originally
      submitting IOCB_CMD_FSYNC.
      
      Fuse sends the creds of the caller in the request header and it needs to
      translate the uid and gid into the server's user namespace.  Since the
      kthread is running in init_user_ns, the translation will fail and the
      operation returns an error.
      
      It can be argued that fsync doesn't actually need any creds, but just
      zeroing out those fields in the header (as with requests that currently
      don't take creds) is a backward compatibility risk.
      
      Instead of working around this issue in fuse, solve the core of the problem
      by calling the filesystem with the proper creds.

      Reported-by: Avi Kivity <avi@scylladb.com>
      Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
      Fixes: c9582eb0 ("fuse: Fail all requests with invalid uids or gids")
      Cc: stable@vger.kernel.org  # 4.18+
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • vfs: allow unprivileged whiteout creation · a3c751a5
      Miklos Szeredi authored
      Whiteouts, unlike real device nodes, should not require privileges to create.
      
      The general concern with device nodes is that opening them can have side
      effects.  The kernel already avoids zero major (see
      Documentation/admin-guide/devices.txt).  To be on the safe side the patch
      explicitly forbids registering a char device with 0/0 number (see
      cdev_add()).
      
      This guarantees that a non-O_PATH open on a whiteout will fail with ENODEV;
      i.e. it won't have any side effect.

      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  2. 03 May, 2020 4 commits
  3. 02 May, 2020 8 commits
    • Merge tag 'pm-5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 743f0573
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
      
       - prevent the intel_pstate driver from printing excessive diagnostic
         messages in some cases (Chris Wilson)
      
       - make the hibernation restore kernel freeze kernel threads as well as
         user space tasks (Dexuan Cui)
      
       - fix the ACPI device PM diagnostic messages to include the correct
         power state name (Kai-Heng Feng).
      
      * tag 'pm-5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PM: ACPI: Output correct message on target power state
        PM: hibernate: Freeze kernel threads in software_resume()
        cpufreq: intel_pstate: Only mention the BIOS disabling turbo mode once
    • Merge branches 'pm-cpufreq' and 'pm-sleep' · a5383996
      Rafael J. Wysocki authored
      * pm-cpufreq:
        cpufreq: intel_pstate: Only mention the BIOS disabling turbo mode once
      
      * pm-sleep:
        PM: hibernate: Freeze kernel threads in software_resume()
    • Merge tag 'iomap-5.7-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · f66ed1eb
      Linus Torvalds authored
      Pull iomap fix from Darrick Wong:
       "Hoist the check for an unrepresentable FIBMAP return value into
        ioctl_fibmap.
      
        The internal kernel function can handle 64-bit values (and is needed
        to fix a regression on ext4 + jbd2). It is only the userspace ioctl
        that is so old that it cannot deal"
      
      * tag 'iomap-5.7-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        fibmap: Warn and return an error in case of block > INT_MAX
    • Merge tag 'nfs-for-5.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 29a47f45
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
       "Highlights include:
      
        Stable fixes:
         - fix handling of backchannel binding in BIND_CONN_TO_SESSION
      
        Bugfixes:
         - Fix a credential use-after-free issue in pnfs_roc()
         - Fix potential posix_acl refcnt leak in nfs3_set_acl
         - defer slow parts of rpc_free_client() to a workqueue
         - Fix an Oopsable race in __nfs_list_for_each_server()
         - Fix trace point use-after-free race
         - Regression: the RDMA client no longer responds to server disconnect
           requests
         - Fix return values of xdr_stream_encode_item_{present, absent}
         - _pnfs_return_layout() must always wait for layoutreturn completion
      
        Cleanups:
         - Remove unreachable error conditions"
      
      * tag 'nfs-for-5.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFS: Fix a race in __nfs_list_for_each_server()
        NFSv4.1: fix handling of backchannel binding in BIND_CONN_TO_SESSION
        SUNRPC: defer slow parts of rpc_free_client() to a workqueue.
        NFSv4: Remove unreachable error condition due to rpc_run_task()
        SUNRPC: Remove unreachable error condition
        xprtrdma: Fix use of xdr_stream_encode_item_{present, absent}
        xprtrdma: Fix trace point use-after-free race
        xprtrdma: Restore wake-up-all to rpcrdma_cm_event_handler()
        nfs: Fix potential posix_acl refcnt leak in nfs3_set_acl
        NFS/pnfs: Fix a credential use-after-free issue in pnfs_roc()
        NFS/pnfs: Ensure that _pnfs_return_layout() waits for layoutreturn completion
    • Merge tag 'dmaengine-fix-5.7-rc4' of git://git.infradead.org/users/vkoul/slave-dma · ed6889db
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
       "Core:
         - Documentation typo fixes
         - fix the channel indexes
         - dmatest: fixes for process hang and iterations
      
        Drivers:
         - hisilicon: build error fix without PCI_MSI
         - ti-k3: deadlock fix
         - uniphier-xdmac: fix for reg region
         - pch: fix data race
         - tegra: fix clock state"
      
      * tag 'dmaengine-fix-5.7-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: dmatest: Fix process hang when reading 'wait' parameter
        dmaengine: dmatest: Fix iteration non-stop logic
        dmaengine: tegra-apb: Ensure that clock is enabled during of DMA synchronization
        dmaengine: fix channel index enumeration
        dmaengine: mmp_tdma: Reset channel error on release
        dmaengine: mmp_tdma: Do not ignore slave config validation errors
        dmaengine: pch_dma.c: Avoid data race between probe and irq handler
        dt-bindings: dma: uniphier-xdmac: switch to single reg region
        include/linux/dmaengine: Typos fixes in API documentation
        dmaengine: xilinx_dma: Add missing check for empty list
        dmaengine: ti: k3-psil: fix deadlock on error path
        dmaengine: hisilicon: Fix build error without PCI_MSI
    • Merge tag 'vfio-v5.7-rc4' of git://github.com/awilliam/linux-vfio · 690e2aba
      Linus Torvalds authored
      Pull VFIO fixes from Alex Williamson:
      
       - copy_*_user validity check for new vfio_dma_rw interface (Yan Zhao)
      
       - Fix a potential math overflow (Yan Zhao)
      
       - Use follow_pfn() for calculating PFNMAPs (Sean Christopherson)
      
      * tag 'vfio-v5.7-rc4' of git://github.com/awilliam/linux-vfio:
        vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn()
        vfio: avoid possible overflow in vfio_iommu_type1_pin_pages
        vfio: checking of validity of user vaddr in vfio_dma_rw
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 42eb62d4
      Linus Torvalds authored
      Pull arm64 fix from Catalin Marinas:
       "Add -fasynchronous-unwind-tables to the vDSO CFLAGS"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: vdso: Add -fasynchronous-unwind-tables to cflags
    • Merge tag 'io_uring-5.7-2020-05-01' of git://git.kernel.dk/linux-block · cf018530
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Fix for statx not grabbing the file table, making AT_EMPTY_PATH fail
      
       - Cover a few cases where async poll can handle retry, eliminating the
         need for an async thread
      
       - fallback request busy/free fix (Bijan)
      
       - syzbot reported SQPOLL thread exit fix for non-preempt (Xiaoguang)
      
       - Fix extra put of req for sync_file_range (Pavel)
      
       - Always punt splice async. We'll improve this for 5.8, but wanted to
         eliminate the inode mutex lock from the non-blocking path for 5.7
         (Pavel)
      
      * tag 'io_uring-5.7-2020-05-01' of git://git.kernel.dk/linux-block:
        io_uring: punt splice async because of inode mutex
        io_uring: check non-sync defer_list carefully
        io_uring: fix extra put in sync_file_range()
        io_uring: use cond_resched() in io_ring_ctx_wait_and_kill()
        io_uring: use proper references for fallback_req locking
        io_uring: only force async punt if poll based retry can't handle it
        io_uring: enable poll retry for any file with ->read_iter / ->write_iter
        io_uring: statx must grab the file table for valid fd
  4. 01 May, 2020 19 commits