Commits · 21c9f5ccb103868c730aec6f8548e144ec397fed · nexedi / linux

12 Apr, 2015 40 commits

p9_client_attach(): set fid->uid correctly · 21c9f5cc

Al Viro authored Apr 02, 2015

it's almost always equal to current_fsuid(), but there's an exception -
if the first writeback fid is opened by non-root *and* that happens before
root has done any lookups in /, we end up doing attach for root. The
current code leaves the resulting FID owned by root from the server POV
and by non-root from the client one. Unfortunately, it means that e.g.
massive dcache eviction will leave that user buggered - they'll end
up redoing walks from / *and* picking that FID every time. As soon as
they try to create something, things will get nasty.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9p: we are leaking glock.client_id in v9fs_file_getlock() · ce85dd58
Al Viro authored Apr 02, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
9p: switch to ->read_iter/->write_iter · e494b6b5
Al Viro authored Apr 01, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
9p: get rid of v9fs_direct_file_read() · 42b1ab97
Al Viro authored Apr 01, 2015
```
do it in ->direct_IO()...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
9p: switch p9_client_read() to passing struct iov_iter * · e1200fe6
Al Viro authored Apr 01, 2015
```
... and make it loop
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
9p: get rid of v9fs_direct_file_write() · 9565a544
Al Viro authored Apr 01, 2015
```
just handle it in ->direct_IO()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
9p: fold v9fs_file_write_internal() into the caller · c711a6b1
Al Viro authored Apr 01, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
9p: switch ->writepage() to direct use of p9_client_write() · 371098c6
Al Viro authored Apr 01, 2015
```
Don't mess with kmap() - just use ITER_BVEC.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
9p: switch p9_client_write() to passing it struct iov_iter * · 070b3656
Al Viro authored Apr 01, 2015
```
... and make it loop until it's done
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```

net/9p: switch the guts of p9_client_{read,write}() to iov_iter · 4f3b35c1

Al Viro authored Apr 01, 2015

... and have get_user_pages_fast() mapping fewer pages than requested
to generate a short read/write.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
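
The "short read/write" behaviour described above, together with the "make it loop" notes on the p9_client_read()/p9_client_write() commits, can be illustrated with a small userspace model. This is only a sketch: `xfer_once()` and `xfer_all()` are hypothetical names, and `limit` stands in for whatever made the lower layer transfer less than requested (for example, fewer pages pinned than asked for).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical transfer primitive: moves at most `limit` bytes per call
 * and returns how many it moved. A short count is not an error.
 */
static size_t xfer_once(char *dst, const char *src, size_t want, size_t limit)
{
	size_t n = want < limit ? want : limit;

	memcpy(dst, src, n);
	return n;
}

/* Caller-side loop: keep going until everything is moved or progress stops. */
static size_t xfer_all(char *dst, const char *src, size_t total, size_t limit)
{
	size_t done = 0;

	assert(dst != NULL && src != NULL);
	while (done < total) {
		size_t n = xfer_once(dst + done, src + done, total - done, limit);

		if (n == 0)
			break;	/* no progress: stop instead of spinning */
		done += n;
	}
	return done;
}
```

The point of the design is that the low-level primitive never has to retry on its own; it reports what it managed, and the loop in the caller decides whether to continue.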

nommu: use __vfs_read() · 6e242a1c

Al Viro authored Mar 31, 2015

... instead of open-coding the call of ->read()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

acct: check FMODE_CAN_WRITE · d0f88f8d

Al Viro authored Mar 31, 2015

it's not calling ->write() directly anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

aio_run_iocb(): kill dead check · 47e39362

Al Viro authored Mar 31, 2015

We check if ->ki_pos is positive.  However, by that point we have
already done rw_verify_area(), which would have rejected such
unless the file had been one of /dev/mem, /dev/kmem and /proc/kcore.
All of which do not have vectored rw methods, so we would've bailed
out even earlier.

This check had been introduced before rw_verify_area() had been added there
- in fact, it was a subset of checks done on sync paths by rw_verify_area()
(back then the /dev/mem exception didn't exist at all).  The rest of checks
(mandatory locking, etc.) hadn't been added until later.  Unfortunately,
by the time the call of rw_verify_area() got added, the /dev/mem exception
had already appeared, so it wasn't obvious that the older explicit check
downstream had become dead code.  It *is* dead code, though, since the few
files for which the exception applies do not have ->aio_{read,write}() or
->{read,write}_iter() and for them we won't reach that check anyway.

What's more, even if we ever introduce vectored methods for /dev/mem
and friends, they'll have to cope with negative positions anyway, since
readv(2) and writev(2) are using the same checks as read(2) and write(2) -
i.e. rw_verify_area().

Let's bury it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ioctx_alloc(): remove pointless check · 08397acd

Al Viro authored Mar 31, 2015

Way, way back kiocb used to be picked from arrays, so ioctx_alloc()
checked for multiplication overflow when calculating the size of
such array.  By the time fs/aio.c went into the tree (in 2002) they
were already allocated one-by-one by kmem_cache_alloc(), so that
check had already become pointless.  Let's bury it...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

lustre: kill unused members of struct vvp_thread_info · 23602adf
Al Viro authored Mar 30, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```

expand __fuse_direct_write() in both callers · 812408fb

Al Viro authored Mar 30, 2015

it's actually shorter that way *and* later we'll want iocb in scope
of generic_write_check() caller.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fuse: switch fuse_direct_io_file_operations to ->{read,write}_iter() · 15316263
Al Viro authored Mar 30, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
cuse: switch to iov_iter · cfa86a74
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
Merge branch 'for-davem' into for-next · 39c853eb
Al Viro authored Apr 11, 2015

sg_start_req(): use import_iovec() · fdc81f45
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```

sg_start_req(): make sure that there's not too many elements in iovec · 451a2886

Al Viro authored Mar 21, 2015

unfortunately, allowing an arbitrary 16bit value means a possibility of
overflow in the calculation of total number of pages in bio_map_user_iov() -
we rely on there being no more than PAGE_SIZE members of sum in the
first loop there.  If that sum wraps around, we end up allocating
too small an array of pointers to pages and it's easy to overflow it in
the second loop.

X-Coverup: TINC (and there's no lumber cartel either)
Cc: stable@vger.kernel.org # way, way back
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
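
The wraparound described in this message can be modelled in userspace. The sketch below is not the kernel code: `MAX_SEGS` and `MAX_PAGES_PER_SEG` are hypothetical bounds chosen for illustration, and a 32-bit unsigned counter stands in for the page total computed in the first loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limits for this model; neither value is taken from the patch. */
#define MAX_SEGS		1024u
#define MAX_PAGES_PER_SEG	(1u << 19)

/* Sum per-segment page counts in a 32-bit counter, with no bound on nsegs. */
static uint32_t total_pages_unchecked(const uint32_t *pages_per_seg,
				      uint32_t nsegs)
{
	uint32_t total = 0;

	for (uint32_t i = 0; i < nsegs; i++) {
		assert(pages_per_seg[i] <= MAX_PAGES_PER_SEG);
		total += pages_per_seg[i];	/* wraps modulo 2^32 */
	}
	return total;
}

/*
 * Same sum, but refuse oversized segment counts first, so that
 * MAX_SEGS * MAX_PAGES_PER_SEG (2^29 here) cannot wrap.
 * Returns 0 on success and -1 if the request is rejected.
 */
static int total_pages_checked(const uint32_t *pages_per_seg, uint32_t nsegs,
			       uint32_t *out)
{
	if (nsegs > MAX_SEGS)
		return -1;
	*out = total_pages_unchecked(pages_per_seg, nsegs);
	return 0;
}
```

With these model limits, 8192 segments of 2^19 pages each sum to exactly 2^32, which the 32-bit counter stores as 0. An array sized from that total would be far too small for the pages the second loop then walks, which is the failure mode the message describes.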

blk_rq_map_user(): use import_single_range() · 8f7e885a
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```

sg_io(): use import_iovec() · e272b89f

Al Viro authored Mar 21, 2015

... and don't skip access_ok() validation.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

process_vm_access: switch to {compat_,}import_iovec() · 17d17e72
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
switch keyctl_instantiate_key_common() to iov_iter · b353a1f7
Al Viro authored Mar 17, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
switch {compat_,}do_readv_writev() to {compat_,}import_iovec() · 0504c074
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
aio_setup_vectored_rw(): switch to {compat_,}import_iovec() · 32a56afa
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
vmsplice_to_user(): switch to import_iovec() · 345995fa
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```

kill aio_setup_single_vector() · d4fb392f

Al Viro authored Mar 21, 2015

identical to import_single_range()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge branch 'iov_iter' into for-next · 36e9f653
Al Viro authored Apr 11, 2015

aio: simplify arguments of aio_setup_..._rw() · a96114fa

Al Viro authored Mar 20, 2015

We don't need req in either of those. We don't need nr_segs in caller.
We don't really need len in caller either - iov_iter_count(&iter) will do.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

aio: lift iov_iter_init() into aio_setup_..._rw() · 4c185ce0

Al Viro authored Mar 20, 2015

the only non-trivial detail is that we do it before rw_verify_area(),
so we'd better cap the length ourselves in aio_setup_single_rw()
case (for vectored case rw_copy_check_uvector() will do that for us).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
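
A minimal sketch of what "cap the length ourselves" could look like, assuming the cap is the same kind of page-aligned limit that rw_verify_area() applies on the sync paths. `MODEL_PAGE_SIZE`, `MODEL_MAX_RW_COUNT` and `cap_len()` are names made up for this example, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed page size and a page-aligned upper bound derived from INT_MAX. */
#define MODEL_PAGE_SIZE		4096
#define MODEL_MAX_RW_COUNT	(INT_MAX & ~(MODEL_PAGE_SIZE - 1))

/* Clamp a requested transfer length to the model's upper bound. */
static size_t cap_len(size_t len)
{
	/* The bound is a whole number of pages by construction. */
	assert(MODEL_MAX_RW_COUNT % MODEL_PAGE_SIZE == 0);

	return len > (size_t)MODEL_MAX_RW_COUNT ? (size_t)MODEL_MAX_RW_COUNT
						: len;
}
```

Clamping before the iterator is initialised means that every later consumer of the iterator sees a count that already fits in a signed int.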

lift iov_iter into {compat_,}do_readv_writev() · ac15ac06

Al Viro authored Mar 20, 2015

get it closer to matching {compat_,}rw_copy_check_uvector().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge branch 'iocb' into for-next · c0fec3a9
Al Viro authored Apr 11, 2015

NFS: fix BUG() crash in notify_change() with patch to chown_common() · c1b8940b

Andrew Elble authored Feb 23, 2015

We have observed a BUG() crash in fs/attr.c:notify_change(). The crash
occurs during an rsync into a filesystem that is exported via NFS.

1.) fs/attr.c:notify_change() modifies the caller's version of attr.
2.) 6de0ec00 ("VFS: make notify_change pass ATTR_KILL_S*ID to
    setattr operations") introduced a BUG() restriction such that "no
    function will ever call notify_change() with both ATTR_MODE and
    ATTR_KILL_S*ID set". Under some circumstances though, it will have
    assisted in setting the caller's version of attr to this very
    combination.
3.) 27ac0ffe ("locks: break delegations on any attribute
    modification") introduced code to handle breaking
    delegations. This can result in notify_change() being re-called. attr
    _must_ be explicitly reset to avoid triggering the BUG() established
    in #2.
4.) The path that triggers this is via fs/open.c:chmod_common().
    The combination of attr flags set here and in the first call to
    notify_change() along with a later failed break_deleg_wait()
    results in notify_change() being called again via retry_deleg
    without resetting attr.

Solution is to move retry_deleg in chmod_common() a bit further up to
ensure attr is completely reset.

There are other places where this seemingly could occur, such as
fs/utimes.c:utimes_common(), but the attr flags are not initially
set in such a way to trigger this.

Fixes: 27ac0ffe ("locks: break delegations on any attribute modification")
Reported-by: Eric Meddaugh <etmsys@rit.edu>
Tested-by: Eric Meddaugh <etmsys@rit.edu>
Signed-off-by: Andrew Elble <aweits@rit.edu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
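
The interaction in points 1 to 4 can be reproduced with a small userspace model. Everything here is a simplification and the names are hypothetical: `F_KILL` and `F_MODE` stand in for ATTR_KILL_S*ID and ATTR_MODE, `notify_model()` stands in for notify_change(), a return value of -2 represents hitting the BUG(), and -1 represents a failed delegation break that asks for a retry.

```c
#include <assert.h>
#include <stddef.h>

#define F_KILL	0x1u	/* stands in for ATTR_KILL_S*ID */
#define F_MODE	0x2u	/* stands in for ATTR_MODE */

struct attr_model {
	unsigned int valid;
};

/* Callee that forbids KILL+MODE on entry but sets MODE in the caller's struct. */
static int notify_model(struct attr_model *a, int *failures_left)
{
	assert(a != NULL && failures_left != NULL);

	if ((a->valid & F_KILL) && (a->valid & F_MODE))
		return -2;		/* the forbidden combination */
	if (a->valid & F_KILL)
		a->valid |= F_MODE;	/* modifies the caller's copy */
	if (*failures_left > 0) {
		(*failures_left)--;
		return -1;		/* ask the caller to retry */
	}
	return 0;
}

/* Flags set once, before the retry label: a retry reuses the modified struct. */
static int caller_stale(int failures)
{
	struct attr_model a;
	int ret;

	a.valid = F_KILL;
retry:
	ret = notify_model(&a, &failures);
	if (ret == -1)
		goto retry;
	return ret;
}

/* Flags set after the retry label: every attempt starts from a clean struct. */
static int caller_reset(int failures)
{
	struct attr_model a;
	int ret;

retry:
	a.valid = F_KILL;
	ret = notify_model(&a, &failures);
	if (ret == -1)
		goto retry;
	return ret;
}
```

In this model `caller_stale(1)` reaches the forbidden combination on its second attempt, while `caller_reset(1)` succeeds. Moving the label above the initialisation is the same shape of change as the fix described in the message.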

dcache: return -ESTALE not -EBUSY on distributed fs race · 3d330dc1

J. Bruce Fields authored Feb 10, 2015

On a distributed filesystem it's possible for lookup to discover that a
directory it just found is already cached elsewhere in the directory
hierarchy.  The dcache won't let us keep the directory in both places,
so we have to move the dentry to the new location from the place we
previously had it cached.

If the parent has changed, then this requires all the same locks as we'd
need to do a cross-directory rename.  But we're already in lookup
holding one parent's i_mutex, so it's too late to acquire those locks in
the right order.

The (unreliable) solution in __d_unalias is to trylock() the required
locks and return -EBUSY if it fails.

I see no particular reason for returning -EBUSY, and -ESTALE is already
the result of some other lookup races on NFS.  I think -ESTALE is the
more helpful error return.  It also allows us to take advantage of the
logic Jeff Layton added in c6a94284 "vfs: fix renameat to retry on
ESTALE errors" and ancestors, which hopefully resolves some of these
errors before they're returned to userspace.

I can reproduce these cases using NFS with:

	ssh root@$client '
		mount -olookupcache=pos '$server':'$export' /mnt/
		mkdir /mnt/TO
		mkdir /mnt/DIR
		touch /mnt/DIR/test.txt
		while true; do
			strace -e open cat /mnt/DIR/test.txt 2>&1 | grep EBUSY
		done
	'
	ssh root@$server '
		while true; do
			mv $export/DIR $export/TO/DIR
			mv $export/TO/DIR $export/DIR
		done
	'

It also helps to add some other concurrent use of the directory on the
client (e.g., "ls /mnt/TO").  And you can replace the server-side mv's
by client-side mv's that are repeatedly killed.  (If the client is
interrupted while waiting for the RENAME response then it's left with a
dentry that has to go under one parent or the other, but it doesn't yet
know which.)
Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
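
The trylock-and-back-out pattern this message describes can be sketched in userspace as follows. This is a model, not the __d_unalias() code: `struct lock_model`, `model_trylock()`, `model_unlock()` and `move_if_lockable()` are hypothetical, and an `atomic_flag` stands in for each kernel lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

struct lock_model {
	atomic_flag held;
};

/* Returns nonzero if the lock was free and is now held by the caller. */
static int model_trylock(struct lock_model *l)
{
	return !atomic_flag_test_and_set(&l->held);
}

static void model_unlock(struct lock_model *l)
{
	atomic_flag_clear(&l->held);
}

/*
 * Take both locks without blocking. If either is unavailable, release
 * whatever was taken and report -ESTALE so the caller can retry the lookup.
 */
static int move_if_lockable(struct lock_model *rename_lock,
			    struct lock_model *old_parent)
{
	int ret = -ESTALE;

	assert(rename_lock != NULL && old_parent != NULL);

	if (!model_trylock(rename_lock))
		return ret;
	if (model_trylock(old_parent)) {
		/* ... the move to the new location would happen here ... */
		ret = 0;
		model_unlock(old_parent);
	}
	model_unlock(rename_lock);
	return ret;
}
```

Because neither acquisition can block, the lock ordering problem described above cannot turn into a deadlock; the cost is that the operation may fail spuriously, which is why the choice of error code matters.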

NTFS: Version 2.1.32 - Update file write from aio_write to write_iter. · a632f559
Anton Altaparmakov authored Mar 11, 2015
```
Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```

VFS: Add iov_iter_fault_in_multipages_readable() · 171a0203

Anton Altaparmakov authored Mar 11, 2015

similar to iov_iter_fault_in_readable() but differs in that it is
not limited to faulting in the first iovec and instead faults in
"bytes" bytes iterating over the iovecs as necessary.

Also, instead of only faulting in the first and last page of the
range, all pages are faulted in.

This function is needed by NTFS when it does multi page file
writes.
Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
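
The behaviour described above (walk as many iovecs as needed and touch every page, not just the first and last) can be sketched as a userspace model. This is not the kernel helper: `touch_all_pages()` and `MODEL_PAGE_SIZE` are hypothetical, the page size is assumed to be 4096 bytes, and a plain one-byte read stands in for faulting a page in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

/*
 * Read one byte from every page overlapped by the first `bytes` bytes
 * described by iov[0..nr_segs). Returns the number of page touches,
 * counted per segment.
 */
static size_t touch_all_pages(const struct iovec *iov, size_t nr_segs,
			      size_t bytes)
{
	size_t touched = 0;

	assert(iov != NULL || nr_segs == 0);

	for (size_t i = 0; i < nr_segs && bytes > 0; i++) {
		size_t len = iov[i].iov_len < bytes ? iov[i].iov_len : bytes;
		uintptr_t addr = (uintptr_t)iov[i].iov_base;
		uintptr_t end = addr + len;	/* one past the last byte */

		while (addr < end) {
			(void)*(const volatile unsigned char *)addr;
			touched++;
			/* advance to the start of the next page */
			addr = (addr & ~(uintptr_t)(MODEL_PAGE_SIZE - 1))
			       + MODEL_PAGE_SIZE;
		}
		bytes -= len;
	}
	return touched;
}
```

A page shared by two segments is touched once per segment in this model, which is harmless for a helper whose only job is to make sure the pages are resident.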

drop bogus check in file_open_root() · e5b811e3

Al Viro authored Mar 08, 2015

For one thing, LOOKUP_DIRECTORY will be dealt with in do_last().
For another, name can be an empty string, but not NULL - no callers
pass that and it would oops immediately if they did.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

switch security_inode_getattr() to struct path * · 3f7036a0
Al Viro authored Mar 08, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```