Commits · 1bb27cacf4992b77556ed4487f99c76c4af3b43d · Kirill Smelkov / linux

09 Oct, 2014 40 commits

f_fs: saner API for ffs_sb_create_file() · 1bb27cac

Al Viro authored Sep 03, 2014

make it return dentry instead of inode
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1bb27cac

jfs: don't hash direct inode · 9bb8730e

Al Viro authored Sep 02, 2014

hlist_add_fake(inode->i_hash), same as for the rest of special ones...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9bb8730e

[s390] remove pointless assignment of ->f_op in vmlogrdr ->open() · 6b933de6

Al Viro authored Sep 02, 2014

The only way we can get to that function is from misc_open(), after
the latter has set file->f_op to exactly the same value we are
(re)assigning there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6b933de6

ecryptfs: ->f_op is never NULL · c2e3f5d5

Al Viro authored Sep 02, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c2e3f5d5

android: ->f_op is never NULL · 765d3682

Al Viro authored Sep 02, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

765d3682

nouveau: __iomem misannotations · 3cfb2fac

Al Viro authored Aug 31, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3cfb2fac

missing annotation in fs/file.c · e983094d

Al Viro authored Aug 31, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e983094d

fs: namespace: suppress 'may be used uninitialized' warnings · b8850d1f

Tim Gardner authored Aug 28, 2014

The gcc version 4.9.1 compiler complains even though it isn't possible for
these variables to be used before they are initialized.

fs/namespace.c: In function ‘SyS_mount’:
fs/namespace.c:2720:8: warning: ‘kernel_dev’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  ret = do_mount(kernel_dev, kernel_dir->name, kernel_type, flags,
        ^
fs/namespace.c:2699:8: note: ‘kernel_dev’ was declared here
  char *kernel_dev;
        ^
fs/namespace.c:2720:8: warning: ‘kernel_type’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  ret = do_mount(kernel_dev, kernel_dir->name, kernel_type, flags,
        ^
fs/namespace.c:2697:8: note: ‘kernel_type’ was declared here
  char *kernel_type;
        ^

Fix the warnings by simplifying copy_mount_string() as suggested by Al Viro.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b8850d1f

saner perf_atoll() · 8ba7f6c2

Al Viro authored Aug 29, 2014

That loop in there is both anti-idiomatic *and* completely pointless.
strtoll() is there for a purpose; use it and compare what's left with
acceptable suffixes.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8ba7f6c2
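
The approach described for perf_atoll() above (let strtoll() consume the digits, then compare the remainder against a set of accepted suffixes) can be sketched in plain C. This is a minimal illustration, not the perf source: the name `parse_size` and the suffix table are assumptions made for the example, and overflow handling is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the perf code: parse "<digits>[suffix]". */
long long parse_size(const char *str)
{
	/* Assumed suffix set for the example; the real list may differ. */
	static const struct { const char *suffix; int shift; } units[] = {
		{ "", 0 }, { "B", 0 }, { "KB", 10 }, { "MB", 20 }, { "GB", 30 },
	};
	char *end;
	long long value;
	size_t i;

	assert(str != NULL);
	if (!isdigit((unsigned char)str[0]))
		return -1;

	/* strtoll() parses the number; `end` is left pointing at the suffix. */
	value = strtoll(str, &end, 10);

	/* Compare what's left with the acceptable suffixes. */
	for (i = 0; i < sizeof(units) / sizeof(units[0]); i++)
		if (strcmp(end, units[i].suffix) == 0)
			return value * (1LL << units[i].shift);

	return -1;	/* unknown trailing text */
}
```

With this sketch, `parse_size("4KB")` yields 4096 and `parse_size("7xyz")` yields -1; no hand-written digit loop is needed.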

switch /dev/kmsg to ->write_iter() · 849f3127

Al Viro authored Aug 23, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

849f3127

switch logger to ->write_iter() · cd678fce

Al Viro authored Aug 23, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cd678fce

switch hci_vhci to ->write_iter() · 512b2268

Al Viro authored Aug 23, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

512b2268

switch /dev/zero and /dev/full to ->read_iter() · 13ba33e8

Al Viro authored Aug 18, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

13ba33e8

dma-buf: don't open-code atomic_long_read() · a1f6dbac

Al Viro authored Aug 20, 2014

... not to mention that even atomic_long_read() is too low-level here -
there's file_count().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a1f6dbac

rsxx debugfs inanity · 8e3fb059

Al Viro authored Aug 19, 2014

check with the author of that horror...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8e3fb059

carma-fpga: switch to simple_read_from_buffer() · d88c2426

Al Viro authored Aug 19, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d88c2426

carma-fpga: switch to fixed_size_llseek() · 1a37f5ec

Al Viro authored Aug 19, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1a37f5ec

cachefiles_write_page(): switch to __kernel_write() · 2ec3a12a

Al Viro authored Aug 19, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2ec3a12a

vme: don't open-code fixed_size_llseek() · 59482291

Al Viro authored Aug 19, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

59482291

ashmem: use vfs_llseek() · 91360b02

Al Viro authored Aug 19, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

91360b02

9p: switch to %p[dD] · 4b8e9923

Al Viro authored Aug 19, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4b8e9923

cifs: switch to use of %p[dD] · 35c265e0

Al Viro authored Aug 19, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

35c265e0

fs: make cont_expand_zero interruptible · c2ca0fcd

Mikulas Patocka authored Jul 27, 2014

This patch makes it possible to kill a process looping in
cont_expand_zero. A process may spend a lot of time in this function, so
it is desirable to be able to kill it.

It happened to me that I wanted to copy a piece of data from the disk to a
file. By mistake, I used the "seek" parameter to dd instead of "skip". Due
to the "seek" parameter, dd attempted to extend the file and became stuck
doing so - the only possibility was to reset the machine or wait many
hours until the filesystem runs out of space and cont_expand_zero fails.
We need this patch to be able to terminate the process.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c2ca0fcd
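
The idea behind making cont_expand_zero interruptible (check for a kill request on every iteration of a long zero-filling loop, and bail out early) can be modelled in userspace. The sketch below is an analogy only: `expand_zero`, `CHUNK` and the `kill_requested` flag are hypothetical names, and the kernel tests for a pending fatal signal rather than a plain flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4096	/* stand-in for one page */

/*
 * Hypothetical model: zero `len` bytes of `buf` one chunk at a time and
 * give up with -EINTR as soon as *kill_requested is nonzero.
 * `*done` reports how many bytes were zeroed before returning.
 */
int expand_zero(unsigned char *buf, size_t len,
		const volatile int *kill_requested, size_t *done)
{
	size_t pos = 0;

	assert(buf && kill_requested && done);
	while (pos < len) {
		size_t n = len - pos < CHUNK ? len - pos : CHUNK;

		if (*kill_requested) {	/* checked once per chunk */
			*done = pos;
			return -EINTR;
		}
		memset(buf + pos, 0, n);
		pos += n;
	}
	*done = pos;
	return 0;
}
```

Because the check sits inside the loop, the worst-case delay before the caller regains control is one chunk of work rather than the whole extension.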

Add copy_to_iter(), copy_from_iter() and iov_iter_zero() · c35e0248

Matthew Wilcox authored Aug 01, 2014

For DAX, we want to be able to copy between iovecs and kernel addresses
that don't necessarily have a struct page.  This is a fairly simple
rearrangement for bvec iters to kmap the pages outside and pass them in,
but for user iovecs it gets more complicated because we might try various
different ways to kmap the memory.  Duplicating the existing logic works
out best in this case.

We need to be able to write zeroes to an iovec for reads from unwritten
ranges in a file.  This is performed by the new iov_iter_zero() function,
again patterned after the existing code that handles iovec iterators.

[AV: and export the buggers...]
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c35e0248
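
As a rough userspace analogue of what iov_iter_zero() is described as doing above (writing zeroes across the segments of an iovec), the following sketch walks a `struct iovec` array and clears up to a requested number of bytes. `iovec_zero` is a hypothetical name; unlike the kernel iterator it does not advance any iterator state and it only handles ordinary memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical userspace analogue: write up to `bytes` zero bytes across
 * the segments of `iov`, in order. Returns the number of bytes zeroed.
 */
size_t iovec_zero(const struct iovec *iov, int iovcnt, size_t bytes)
{
	size_t zeroed = 0;
	int i;

	assert(iov != NULL || iovcnt == 0);
	for (i = 0; i < iovcnt && zeroed < bytes; i++) {
		size_t n = bytes - zeroed;

		if (n > iov[i].iov_len)	/* clamp to this segment */
			n = iov[i].iov_len;
		memset(iov[i].iov_base, 0, n);
		zeroed += n;
	}
	return zeroed;
}
```

A read from an unwritten file range would, in this model, call `iovec_zero()` on the caller's buffers instead of copying from any page.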

fs: Fix theoretical division by 0 in super_cache_scan(). · 475d0db7

Tetsuo Handa authored May 17, 2014

total_objects could be 0 and is used as a denominator.

While total_objects is a "long", total_objects == 0 is unlikely to happen on
3.12 and later kernels because 32-bit architectures would not be able to
hold (1 << 32) objects. However, total_objects == 0 may happen on kernels
between 3.1 and 3.11 because total_objects in prune_super() was an "int"
and (e.g.) the x86_64 architecture might be able to hold (1 << 32) objects.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable <stable@kernel.org> # 3.1+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

475d0db7
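
The hazard described above is a proportional calculation whose denominator can be zero. A minimal sketch of the usual guard follows; `proportional_share` is a hypothetical helper, not the code in super_cache_scan(), and clamping the denominator to 1 is an assumption about one reasonable way to handle it.

```c
#include <assert.h>

/*
 * Hypothetical model of a proportional split: how many of `nr_to_scan`
 * objects to take from a cache holding `count` of `total` objects.
 */
long proportional_share(long nr_to_scan, long count, long total)
{
	assert(nr_to_scan >= 0 && count >= 0 && total >= 0);
	if (total == 0)
		total = 1;	/* guard: never divide by zero */
	return nr_to_scan * count / total;
}
```

With the guard in place, a zero total degrades to a harmless result instead of a division fault.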

dcache: Fix no spaces at the start of a line in dcache.c · b8314f93

Daeseok Youn authored Aug 11, 2014

Fixed coding style in dcache.c
Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b8314f93

[jffs2] kill wbuf_queued/wbuf_dwork_lock · 99358a1c

Al Viro authored Aug 01, 2014

schedule_delayed_work() happening when the work is already pending is
a cheap no-op.  Don't bother with ->wbuf_queued logic - it's both
broken (cancelling ->wbuf_dwork leaves it set, as spotted by Jeff Harris)
and pointless.  It's cheaper to let schedule_delayed_work() handle that
case.
Reported-by: Jeff Harris <jefftharris@gmail.com>
Tested-by: Jeff Harris <jefftharris@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

99358a1c

vfs: fix typo in s_op->alloc_inode() documentation · 4e07ad64

Kirill Smelkov authored Aug 14, 2014

The function which calls s_op->alloc_inode() is not inode_alloc(), but
instead alloc_inode(), which lives in fs/inode.c.

The typo has been there from the beginning, since 5ea626aa (VFS: update
documentation, 2005) - there has never been a standalone inode_alloc() in
the whole kernel history.

Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4e07ad64

constify file_inode() · 1fa97e8b

Al Viro authored May 07, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1fa97e8b

handle suicide on late failure exits in execve() in search_binary_handler() · 19d860a1

Al Viro authored May 04, 2014

... rather than doing that in the guts of ->load_binary().
[updated to fix the bug spotted by Shentino - for SIGSEGV we really need
something stronger than send_sig_info(); again, better do that in one place]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

19d860a1

dcache.c: call ->d_prune() regardless of d_unhashed() · 29266201

Al Viro authored May 30, 2014

The only in-tree instance checks d_unhashed() anyway;
out-of-tree code can preserve the current behaviour by
adding such a check if it wants it, and we get the ability
to use it in cases where we *want* to be notified of
killing being inevitable before ->d_lock is dropped,
whether it's unhashed or not.  In particular, autofs
would benefit from that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

29266201

d_prune_alias(): just lock the parent and call __dentry_kill() · 29355c39

Al Viro authored May 30, 2014

The only reason for games with ->d_prune() was __d_drop(), which
was needed only to force dput() into killing the sucker off.

Note that lock_parent() can be called under ->i_lock and won't
drop it, so dentry is safe from somebody managing to kill it
under us - it won't happen while we are holding ->i_lock.

__dentry_kill() is called only with ->d_lockref.count being 0
(here and when picked from shrink list) or 1 (dput() and dropping
the ancestors in shrink_dentry_list()), so it will never be called
twice - the first thing it's doing is making ->d_lockref.count
negative and once that happens, nothing will increment it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

29355c39

proc: Update proc_flush_task_mnt to use d_invalidate · bbd51924

Eric W. Biederman authored Feb 13, 2014

Now that d_invalidate always succeeds and flushes mount points, use
it instead of a combination of shrink_dcache_parent and d_drop
in proc_flush_task_mnt.  This removes the danger of a mount point
under /proc/<pid>/... becoming unreachable after the d_drop.
Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

bbd51924

vfs: Remove d_drop calls from d_revalidate implementations · c143c233

Eric W. Biederman authored Feb 13, 2014

Now that d_invalidate always succeeds, it is no longer necessary or
desirable to hard-code d_drop calls into filesystem-specific
d_revalidate implementations.

Remove the unnecessary d_drop calls and rely on d_invalidate
to drop the dentries.  Using d_invalidate ensures that paths
to mount points will not be dropped.
Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c143c233

vfs: Make d_invalidate return void · 5542aa2f

Eric W. Biederman authored Feb 13, 2014

Now that d_invalidate can no longer fail, stop returning a useless
return code.  For the few callers that checked the return code,
remove the handling of d_invalidate failure.
Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

5542aa2f

vfs: Merge check_submounts_and_drop and d_invalidate · 1ffe46d1

Eric W. Biederman authored Feb 13, 2014

Now that d_invalidate is the only caller of check_submounts_and_drop,
expand check_submounts_and_drop inline in d_invalidate.
Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1ffe46d1

vfs: Remove unnecessary calls of check_submounts_and_drop · 9b053f32

Eric W. Biederman authored Feb 13, 2014

Now that check_submounts_and_drop cannot fail and is called from
d_invalidate, there is no longer a need to call check_submounts_and_drop
from filesystem d_revalidate methods, so remove it.
Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9b053f32

vfs: Lazily remove mounts on unlinked files and directories. · 8ed936b5

Eric W. Biederman authored Oct 01, 2013

With the introduction of mount namespaces and bind mounts it became
possible to access files and directories that on some paths are mount
points but are not mount points on other paths.  It is very confusing
when rm -rf somedir returns -EBUSY simply because somedir is mounted
somewhere else.  With the addition of user namespaces allowing
unprivileged mounts this condition has gone from annoying to allowing
a DOS attack on other users in the system.

The possibility for mischief is removed by updating the vfs to support
rename, unlink and rmdir on a dentry that is a mountpoint and by
lazily unmounting mountpoints on deleted dentries.

In particular this change allows rename, unlink and rmdir system calls
on a dentry without a mountpoint in the current mount namespace to
succeed, and it allows rename, unlink, and rmdir performed on a
distributed filesystem to update the vfs cache even when there is a
mount in some namespace on the original dentry.

There are two common patterns of maintaining mounts: Mounts on trusted
paths with the parent directory of the mount point and all ancestor
directories up to / owned by root and modifiable only by root
(i.e. /media/xxx, /dev, /dev/pts, /proc, /sys, /sys/fs/cgroup/{cpu,
cpuacct, ...}, /usr, /usr/local).  Mounts on unprivileged directories
maintained by fusermount.

In the case of mounts in trusted directories owned by root and
modifiable only by root the current parent directory permissions are
sufficient to ensure a mount point on a trusted path is not removed
or renamed by anyone other than root, even if there is a context
where there are no mount points to prevent this.

In the case of mounts in directories owned by less privileged users
races with users modifying the path of a mount point are already a
danger.  fusermount already uses a combination of chdir,
/proc/<pid>/fd/NNN, and UMOUNT_NOFOLLOW to prevent these races.  The
removal of global rename, unlink, and rmdir protection really adds
nothing new to consider, only a widening of the attack window, and
fusermount is already safe against unprivileged users modifying the
directory simultaneously.

In principle for perfect userspace programs returning -EBUSY for
unlink, rmdir, and rename of dentries that have mounts in the local
namespace is actually unnecessary.  Unfortunately not all userspace
programs are perfect so retaining -EBUSY for unlink, rmdir and rename
of dentries that have mounts in the current mount namespace plays an
important role of maintaining consistency with historical behavior and
making imperfect userspace applications hard to exploit.

v2: Remove spurious old_dentry.
v3: Optimized shrink_submounts_and_drop
    Removed unused afs label
v4: Simplified the changes to check_submounts_and_drop
    Do not rename check_submounts_and_drop to shrink_submounts_and_drop
    Document why we need atomicity in check_submounts_and_drop
    Rely on the parent inode mutex to make d_revalidate and d_invalidate
    an atomic unit.
v5: Refcount the mountpoint to detach in case of simultaneous
    renames.
Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8ed936b5

vfs: Add a function to lazily unmount all mounts from any dentry. · 80b5dce8

Eric W. Biederman authored Oct 03, 2013

The new function detach_mounts comes in two pieces.  The first piece
is a static inline test of d_mountpoint that returns immediately
without taking any locks if d_mountpoint is not set.  In the common
case when mountpoints are absent this allows the vfs to continue
running with its same cacheline footprint.

The second piece of detach_mounts, __detach_mounts, actually does the
work.  It assumes that a mountpoint is present, so it is slow: it
takes namespace_sem for write, and then locks the mount hash (aka
mount_lock) after a struct mountpoint has been found.

With those two locks held each entry on the list of mounts on a
mountpoint is selected and lazily unmounted until all of the mounts
have been lazily unmounted.

v7: Wrote a proper change description and removed the changelog
    documenting deleted wrong turns.
Signed-off-by: Eric W. Biederman <ebiederman@twitter.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

80b5dce8
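
The two-piece structure described for detach_mounts above (a lock-free inline test in front of an out-of-line slow path) is a common pattern and can be sketched outside the kernel. Everything here is a hypothetical model: `struct entry`, `detach_all` and `detach_all_slow` stand in for the dentry and the two functions, and the counter exists only so the example can show when the slow path runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct entry {			/* hypothetical stand-in for a dentry */
	bool mounted;		/* fast-path flag */
	int nr_mounts;		/* stand-in for the list of mounts */
};

static int slow_path_calls;	/* instrumentation for the example only */

/* Slow path: in the kernel this is where the locks would be taken. */
void detach_all_slow(struct entry *e)
{
	slow_path_calls++;
	while (e->nr_mounts > 0)
		e->nr_mounts--;	/* "lazily unmount" one mount */
	e->mounted = false;
}

/* Fast path: a cheap flag test, no locking, returns at once if unset. */
static inline void detach_all(struct entry *e)
{
	assert(e != NULL);
	if (!e->mounted)
		return;
	detach_all_slow(e);
}
```

Callers that never see a mount pay only for the inlined flag test, which is the point of splitting the function in two.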

vfs: factor out lookup_mountpoint from new_mountpoint · e2dfa935

Eric W. Biederman authored Feb 24, 2014

I am shortly going to add a new user of struct mountpoint that
needs to look up existing entries but does not want to create
a struct mountpoint if one does not exist.  Therefore to keep
the code simple and easy to read split out lookup_mountpoint
from new_mountpoint.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e2dfa935
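
The refactoring described above, splitting a lookup-only helper out of a find-or-create function, can be illustrated with a small table. This is a sketch under stated assumptions: `struct mp`, `mp_lookup` and `mp_new` are hypothetical names, the table is a single unlocked list, and the reference counting shown is only one plausible arrangement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct mp {			/* hypothetical stand-in for struct mountpoint */
	int key;
	int count;		/* reference count */
	struct mp *next;
};

static struct mp *table;	/* single bucket, enough for a sketch */

/* Find an existing entry and take a reference; never allocates. */
struct mp *mp_lookup(int key)
{
	struct mp *m;

	for (m = table; m; m = m->next)
		if (m->key == key) {
			assert(m->count > 0);	/* live entries hold a reference */
			m->count++;
			return m;
		}
	return NULL;
}

/* Find-or-create, built on top of the lookup helper. */
struct mp *mp_new(int key)
{
	struct mp *m = mp_lookup(key);

	if (m)
		return m;
	m = malloc(sizeof(*m));
	if (!m)
		return NULL;
	m->key = key;
	m->count = 1;
	m->next = table;
	table = m;
	return m;
}
```

A caller that must not create entries uses `mp_lookup()` directly and treats NULL as "nothing mounted here".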