• Eric W. Biederman's avatar
    mnt: Tuck mounts under others instead of creating shadow/side mounts. · 1064f874
    Eric W. Biederman authored
    Ever since mount propagation was introduced in cases where a mount in
    propagated to parent mount mountpoint pair that is already in use the
    code has placed the new mount behind the old mount in the mount hash
    table.
    
    This implementation detail is problematic as it allows creating
    arbitrary length mount hash chains.
    
    Furthermore it invalidates the constraint maintained elsewhere in the
    mount code that a parent mount and a mountpoint pair will have exactly
    one mount upon them.  Making it hard to deal with and to talk about
    this special case in the mount code.
    
    Modify mount propagation to notice when there is already a mount at
    the parent mount and mountpoint where a new mount is propagating to
    and place that preexisting mount on top of the new mount.
    
    Modify unmount propagation to notice when a mount that is being
    unmounted has another mount on top of it (and no other children), and
    to replace the unmounted mount with the mount on top of it.
    
    Move the MNT_UMUONT test from __lookup_mnt_last into
    __propagate_umount as that is the only call of __lookup_mnt_last where
    MNT_UMOUNT may be set on any mount visible in the mount hash table.
    
    These modifications allow:
     - __lookup_mnt_last to be removed.
     - attach_shadows to be renamed __attach_mnt and its shadow
       handling to be removed.
     - commit_tree to be simplified
     - copy_tree to be simplified
    
    The result is an easier to understand tree of mounts that does not
    allow creation of arbitrary length hash chains in the mount hash table.
    
    The result is also a very slight userspace visible difference in semantics.
    The following two cases now behave identically, where before order
    mattered:
    
    case 1: (explicit user action)
    	B is a slave of A
    	mount something on A/a , it will propagate to B/a
    	and than mount something on B/a
    
    case 2: (tucked mount)
    	B is a slave of A
    	mount something on B/a
    	and than mount something on A/a
    
    Histroically umount A/a would fail in case 1 and succeed in case 2.
    Now umount A/a succeeds in both configurations.
    
    This very small change in semantics appears if anything to be a bug
    fix to me and my survey of userspace leads me to believe that no programs
    will notice or care of this subtle semantic change.
    
    v2: Updated to mnt_change_mountpoint to not call dput or mntput
    and instead to decrement the counts directly.  It is guaranteed
    that there will be other references when mnt_change_mountpoint is
    called so this is safe.
    
    v3: Moved put_mountpoint under mount_lock in attach_recursive_mnt
        As the locking in fs/namespace.c changed between v2 and v3.
    
    v4: Reworked the logic in propagate_mount_busy and __propagate_umount
        that detects when a mount completely covers another mount.
    
    v5: Removed unnecessary tests whose result is alwasy true in
        find_topper and attach_recursive_mnt.
    
    v6: Document the user space visible semantic difference.
    
    Cc: stable@vger.kernel.org
    Fixes: b90fa9ae ("[PATCH] shared mount handling: bind and rbind")
    Tested-by: default avatarAndrei Vagin <avagin@virtuozzo.com>
    Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
    1064f874
pnode.h 1.95 KB