    vfs: Lazily remove mounts on unlinked files and directories. · 8ed936b5
    Eric W. Biederman authored
    With the introduction of mount namespaces and bind mounts it became
    possible to access files and directories that on some paths are mount
    points but are not mount points on other paths.  It is very confusing
    when rm -rf somedir returns -EBUSY simply because somedir is mounted
    somewhere else.  With the addition of user namespaces allowing
    unprivileged mounts, this condition has gone from being an annoyance
    to allowing a DoS attack on other users of the system.
    
    The possibility for mischief is removed by updating the vfs to support
    rename, unlink and rmdir on a dentry that is a mountpoint and by
    lazily unmounting mountpoints on deleted dentries.
    
    In particular, this change allows the rename, unlink, and rmdir system
    calls to succeed on a dentry that is not a mountpoint in the current
    mount namespace, and it allows rename, unlink, and rmdir performed on
    a distributed filesystem to update the vfs cache even when there is a
    mount in some namespace on the original dentry.
    
    There are two common patterns of maintaining mounts: mounts on trusted
    paths, where the parent directory of the mount point and all ancestor
    directories up to / are owned by root and modifiable only by root
    (e.g. /media/xxx, /dev, /dev/pts, /proc, /sys, /sys/fs/cgroup/{cpu,
    cpuacct, ...}, /usr, /usr/local), and mounts on unprivileged
    directories maintained by fusermount.
    
    In the case of mounts in trusted directories owned by root and
    modifiable only by root, the current parent directory permissions are
    sufficient to ensure a mount point on a trusted path is not removed
    or renamed by anyone other than root, even in a context where there
    are no mount points to prevent this.
    
    In the case of mounts in directories owned by less privileged users,
    races with users modifying the path of a mount point are already a
    danger.  fusermount already uses a combination of chdir,
    /proc/<pid>/fd/NNN, and UMOUNT_NOFOLLOW to prevent these races.  The
    removal of global rename, unlink, and rmdir protection really adds
    nothing new to consider, only a widening of the attack window, and
    fusermount is already safe against unprivileged users modifying the
    directory simultaneously.
    
    In principle, for perfect userspace programs, returning -EBUSY for
    unlink, rmdir, and rename of dentries that have mounts in the local
    namespace is actually unnecessary.  Unfortunately, not all userspace
    programs are perfect, so retaining -EBUSY for unlink, rmdir, and rename
    of dentries that have mounts in the current mount namespace plays an
    important role in maintaining consistency with historical behavior and
    in making imperfect userspace applications hard to exploit.
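
    From userspace the retained behavior is visible only through errno.
    A minimal sketch of a caller that tells the cases apart, assuming
    Linux semantics; try_rmdir() and the enum names are hypothetical and
    introduced purely for illustration:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical result codes, for illustration only. */
enum rmdir_result {
    RMDIR_REMOVED,      /* directory is gone */
    RMDIR_BUSY,         /* -EBUSY was returned */
    RMDIR_OTHER_ERROR   /* any other failure; errno holds the reason */
};

/*
 * Try to remove `path` and classify the outcome.  With this change,
 * EBUSY on a mount point is expected only when the mount is in the
 * caller's own mount namespace (rmdir can also report EBUSY for other
 * reasons, such as the root directory).  A mount on the same dentry in
 * some other namespace no longer blocks the removal.
 */
static enum rmdir_result try_rmdir(const char *path)
{
    if (rmdir(path) == 0)
        return RMDIR_REMOVED;
    if (errno == EBUSY)
        return RMDIR_BUSY;
    return RMDIR_OTHER_ERROR;
}
```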
    
    v2: Remove spurious old_dentry.
    v3: Optimized shrink_submounts_and_drop
        Removed unused afs label
    v4: Simplified the changes to check_submounts_and_drop
        Do not rename check_submounts_and_drop to shrink_submounts_and_drop
        Document why we need atomicity in check_submounts_and_drop
        Rely on the parent inode mutex to make d_revalidate and d_invalidate
        an atomic unit.
    v5: Refcount the mountpoint to detach in case of simultaneous
        renames.
    Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
dcache.c 88.4 KB