Commits · b28fc82267aa07c34e019a72c42292d156654ee8 · Kirill Smelkov / linux

26 May, 2018 29 commits

crypto: af_alg: convert to ->poll_mask · b28fc822
Christoph Hellwig authored Jan 11, 2018
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/rxrpc: convert to ->poll_mask · 5001c2dc
Christoph Hellwig authored Dec 31, 2017
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/iucv: convert to ->poll_mask · f87be894
Christoph Hellwig authored Dec 31, 2017
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/phonet: convert to ->poll_mask · e7a98d47
Christoph Hellwig authored Dec 31, 2017
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/nfc: convert to ->poll_mask · 4bac2bcd
Christoph Hellwig authored Dec 31, 2017
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/caif: convert to ->poll_mask · 9490e40a
Christoph Hellwig authored Dec 31, 2017
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/bluetooth: convert to ->poll_mask · 17112d80
Christoph Hellwig authored Dec 31, 2017
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/sctp: convert to ->poll_mask · 568ea88e
Christoph Hellwig authored Dec 31, 2017
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/tipc: convert to ->poll_mask · 4df7338f
Christoph Hellwig authored Dec 31, 2017
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/vmw_vsock: convert to ->poll_mask · 31f50b55
Christoph Hellwig authored Dec 31, 2017
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/atm: convert to ->poll_mask · 9f728af3
Christoph Hellwig authored Dec 31, 2017
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/dccp: convert to ->poll_mask · f4335f52
Christoph Hellwig authored Dec 31, 2017
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```

net: convert datagram_poll users to ->poll_mask · db5051ea

Christoph Hellwig authored Apr 09, 2018

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


net/unix: convert to ->poll_mask · e76cd24d
Christoph Hellwig authored Apr 09, 2018
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
net/tcp: convert to ->poll_mask · 2c7d3dac
Christoph Hellwig authored Apr 09, 2018
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```

net: remove sock_no_poll · 984652dd

Christoph Hellwig authored Apr 09, 2018

Now that sock_poll handles a NULL ->poll or ->poll_mask there is no need
for a stub.
Signed-off-by: Christoph Hellwig <hch@lst.de>


net: add support for ->poll_mask in proto_ops · 15252423

Christoph Hellwig authored Apr 09, 2018

The socket file operations still implement ->poll until all protocols are
switched over.
Signed-off-by: Christoph Hellwig <hch@lst.de>


net: refactor socket_poll · 3cafb376

Christoph Hellwig authored Jan 09, 2018

Factor out two busy poll related helpers for later reuse, and remove
a comment that isn't very helpful, especially with the __poll_t
annotations in place.
Signed-off-by: Christoph Hellwig <hch@lst.de>


aio: try to complete poll iocbs without context switch · 1962da0d

Christoph Hellwig authored May 20, 2018

If we can acquire ctx_lock without spinning we can just remove our
iocb from the active_reqs list, and thus complete the iocbs from the
wakeup context.
Signed-off-by: Christoph Hellwig <hch@lst.de>
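
The trylock idea in this message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct ctx`, `ctx_trylock()` and `wake_complete()` are hypothetical names, an `atomic_flag` stands in for ctx_lock, and a plain counter stands in for the active_reqs list. It is not the code from fs/aio.c.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of an aio context; not the kernel's struct kioctx. */
struct ctx {
    atomic_flag lock;   /* stands in for ctx_lock */
    int active;         /* stands in for the length of active_reqs */
};

/* Take the lock only if that needs no spinning. */
static bool ctx_trylock(struct ctx *c)
{
    return !atomic_flag_test_and_set(&c->lock);
}

static void ctx_unlock(struct ctx *c)
{
    atomic_flag_clear(&c->lock);
}

/*
 * Called from the wakeup path. Returns true if the request was taken off
 * the active list and completed right here, false if the lock was busy
 * and the caller has to defer completion to another context.
 */
static bool wake_complete(struct ctx *c)
{
    if (!ctx_trylock(c))
        return false;
    c->active--;
    ctx_unlock(c);
    return true;
}
```

When `wake_complete()` returns false, the model leaves the request on the list, matching the fallback of completing it later from a different context.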


aio: implement IOCB_CMD_POLL · 2c14fa83

Christoph Hellwig authored Mar 20, 2018

Simple one-shot poll through the io_submit() interface.  To poll for
a file descriptor the application should submit an iocb of type
IOCB_CMD_POLL.  It will poll the fd for the events specified in the
first 32 bits of the aio_buf field of the iocb.

Unlike poll or epoll without EPOLLONESHOT, this interface always works
in one-shot mode: once the iocb is completed, it will have to be
resubmitted.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
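
A userspace caller might look roughly like the sketch below. It uses the raw io_setup/io_submit/io_getevents syscalls. Several things here are assumptions rather than part of the commit: the helper name `aio_poll_once()` is hypothetical, the opcode value 5 is the one mainline `<linux/aio_abi.h>` assigns to IOCB_CMD_POLL, the event mask is placed in the low bits of aio_buf (which coincides with the first 32 bits on little-endian machines), and the ready mask is assumed to come back in the `res` field of the io_event. On a kernel without IOCB_CMD_POLL the helper simply returns -1.

```c
#define _GNU_SOURCE
#include <linux/aio_abi.h>
#include <poll.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed opcode value (IOCB_CMD_POLL in mainline headers). */
#define SKETCH_IOCB_CMD_POLL 5

/*
 * Submit a one-shot poll for `events` on `fd` and wait for it to complete.
 * Returns the reported event mask, or -1 if any step fails.
 */
static long aio_poll_once(int fd, short events)
{
    aio_context_t ctx = 0;
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    long ret = -1;

    if (syscall(SYS_io_setup, 1, &ctx) < 0)
        return -1;

    memset(&cb, 0, sizeof(cb));
    cb.aio_lio_opcode = SKETCH_IOCB_CMD_POLL;
    cb.aio_fildes = fd;
    cb.aio_buf = (unsigned short)events;   /* requested events */

    if (syscall(SYS_io_submit, ctx, 1, list) != 1)
        goto out;
    /* Blocks until the single submitted iocb completes. */
    if (syscall(SYS_io_getevents, ctx, 1, 1, &ev, NULL) != 1)
        goto out;
    ret = (long)ev.res;
out:
    syscall(SYS_io_destroy, ctx);
    return ret;
}
```

Because the interface is one-shot, a caller that wants to keep watching the descriptor would have to submit a fresh iocb after each completion.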


aio: simplify cancellation · 888933f8

Christoph Hellwig authored May 23, 2018

With the current aio code there is no need for the magic KIOCB_CANCELLED
value, as a cancellation just kicks the driver to queue the completion
ASAP, with all actual completion handling done in another thread. Given
that both the completion path and cancellation take the context lock there
is no need for magic cmpxchg loops either.  If we remove iocbs from the
active list after calling ->ki_cancel (but with ctx_lock still held), we
can also rely on the invariant that anything found on the list has a
->ki_cancel callback and can be cancelled, further simplifying the code.
Signed-off-by: Christoph Hellwig <hch@lst.de>


aio: simplify KIOCB_KEY handling · f3a2752a

Christoph Hellwig authored Mar 30, 2018

No need to pass the key field to lookup_iocb to compare it with KIOCB_KEY,
as we can do that right after retrieving it from userspace.  Also move the
KIOCB_KEY definition to aio.c as it is an internal value not used by any
other place in the kernel.
Signed-off-by: Christoph Hellwig <hch@lst.de>


fs: introduce new ->get_poll_head and ->poll_mask methods · 3deb642f

Christoph Hellwig authored Jan 09, 2018

->get_poll_head returns the waitqueue that the poll operation is going
to sleep on.  Note that this means we can only use a single waitqueue
for the poll, unlike some current drivers that use two waitqueues for
different events.  But now that we have keyed wakeups and heavily use
those for poll there aren't that many good reasons left to keep the
multiple waitqueues, and if there are any, ->poll is still around; the
driver just won't support aio poll.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
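
To see why keyed wakeups make a single waitqueue sufficient, consider the toy model below. It is a sketch only: `struct waiter`, `struct waitqueue`, `wq_add()` and `wq_wake_keyed()` are hypothetical userspace stand-ins, not the kernel's wait_queue_head API. Readers and writers share one queue, and a wakeup carries the events that fired so that only matching waiters are woken.

```c
#include <poll.h>
#include <stddef.h>

/* Toy model; not the kernel's wait queue implementation. */
struct waiter {
    short key;              /* events this waiter is interested in */
    int woken;              /* set once a matching wakeup arrived */
    struct waiter *next;
};

struct waitqueue {
    struct waiter *head;
};

/* Register a waiter together with the events it cares about. */
static void wq_add(struct waitqueue *q, struct waiter *w, short key)
{
    w->key = key;
    w->woken = 0;
    w->next = q->head;
    q->head = w;
}

/*
 * Wake only the waiters whose key overlaps the events that fired.
 * Returns the number of waiters newly woken.
 */
static int wq_wake_keyed(struct waitqueue *q, short fired)
{
    int n = 0;

    for (struct waiter *w = q->head; w != NULL; w = w->next) {
        if (!w->woken && (w->key & fired)) {
            w->woken = 1;
            n++;
        }
    }
    return n;
}
```

With a reader waiting on POLLIN and a writer on POLLOUT in the same queue, `wq_wake_keyed(&q, POLLIN)` wakes only the reader, which is the job the second waitqueue used to do.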


fs: add new vfs_poll and file_can_poll helpers · 9965ed17

Christoph Hellwig authored Mar 05, 2018

These abstract out calls to the poll method in preparation for changes
in how we poll.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>


fs: update documentation to mention __poll_t and match the code · 6e8b704d

Christoph Hellwig authored Jan 02, 2018

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


fs: cleanup do_pollfd · a0f8dcfc

Christoph Hellwig authored Mar 05, 2018

Use straightline code with failure handling gotos instead of a lot
of nested conditionals.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
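
The style this message describes can be shown with a small userspace analogue. This is a sketch, not the kernel's do_pollfd(): `check_fd()` is a hypothetical helper that uses fcntl(2) and poll(2) to mimic the shape of the function, setting the result before each check and jumping to a single exit label on failure.

```c
#include <fcntl.h>
#include <poll.h>

/* Hypothetical userspace analogue of the straight-line layout. */
static short check_fd(int fd, short events)
{
    struct pollfd pfd = { .fd = fd, .events = events };
    short mask = 0;

    if (fd < 0)
        goto out;               /* negative fds are ignored, as in poll(2) */

    mask = POLLNVAL;
    if (fcntl(fd, F_GETFD) < 0)
        goto out;               /* not an open descriptor */

    mask = POLLERR;
    if (poll(&pfd, 1, 0) < 0)
        goto out;               /* the readiness query itself failed */

    /* Report only what was asked for, plus error and hangup. */
    mask = pfd.revents & (events | POLLERR | POLLHUP);
out:
    return mask;
}
```

Each failure case is one `if` and one `goto`, so the success path reads top to bottom without nesting.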


fs: unexport poll_schedule_timeout · 8f546ae1

Christoph Hellwig authored Jan 11, 2018

No users outside of select.c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>


uapi: turn __poll_t sparse checks on by default · ee219b94
Christoph Hellwig authored May 23, 2018
```
Signed-off-by: Christoph Hellwig <hch@lst.de>
```
Merge branch 'fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into aio-base · ed0d523a
Christoph Hellwig authored May 26, 2018


24 May, 2018 1 commit

fix io_destroy()/aio_complete() race · 4faa9996

Al Viro authored May 23, 2018

If io_destroy() gets to cancelling everything that can be cancelled and
gets to kiocb_cancel() calling the function the driver has left in ->ki_cancel,
it becomes vulnerable to a race with IO completion. At that point req
is already taken off the list and aio_complete() does *NOT* spin until
we (in free_ioctx_users()) release ->ctx_lock. As a result, it proceeds
to kiocb_free(), freeing req just as it gets passed to ->ki_cancel().

Fix is simple - remove from the list after the call of kiocb_cancel(). All
instances of ->ki_cancel() already have to cope with being called with
the iocb still on the list - that's what happens in io_cancel(2).

Cc: stable@kernel.org
Fixes: 0460fef2 "aio: use cancellation list lazily"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


21 May, 2018 10 commits

aio: fix io_destroy(2) vs. lookup_ioctx() race · baf10564

Al Viro authored May 20, 2018

kill_ioctx() used to have an explicit RCU delay between removing the
reference from ->ioctx_table and percpu_ref_kill() dropping the refcount.
At some point that delay had been removed, on the theory that
percpu_ref_kill() itself contained an RCU delay. Unfortunately, that was
the wrong kind of RCU delay and it didn't care about rcu_read_lock() used
by lookup_ioctx(). As a result, we could get ctx freed right under
lookup_ioctx(). Tejun has fixed that in a6d7cff4 ("fs/aio: Add explicit
RCU grace period when freeing kioctx"); however, that fix is not enough.

Suppose io_destroy() from one thread races with e.g. io_setup() from another;
CPU1 removes the reference from current->mm->ioctx_table[...] just as CPU2
has picked it (under rcu_read_lock()). Then CPU1 proceeds to drop the
refcount, getting it to 0 and triggering a call of free_ioctx_users(),
which proceeds to drop the secondary refcount and once that reaches zero
calls free_ioctx_reqs(). That does
	INIT_RCU_WORK(&ctx->free_rwork, free_ioctx);
	queue_rcu_work(system_wq, &ctx->free_rwork);
and schedules freeing the whole thing after RCU delay.

In the meantime CPU2 has gotten around to percpu_ref_get(), bumping the
refcount from 0 to 1, and returned the reference to io_setup().

Tejun's fix (that queue_rcu_work() in there) guarantees that ctx won't get
freed until after percpu_ref_get(). Sure, we'd increment the counter before
ctx can be freed. Now we are out of rcu_read_lock() and there's nothing to
stop freeing of the whole thing. Unfortunately, CPU2 assumes that since it
has grabbed the reference, ctx is *NOT* going away until it gets around to
dropping that reference.

The fix is obvious - use percpu_ref_tryget_live() and treat failure as a miss.
It's not costlier than what we currently do in the normal case, it's safe to
call since freeing *is* delayed and it closes the race window - either
lookup_ioctx() comes before percpu_ref_kill() (in which case ctx->users
won't reach 0 until the caller of lookup_ioctx() drops it) or lookup_ioctx()
fails, ctx->users is unaffected and caller of lookup_ioctx() doesn't see
the object in question at all.

Cc: stable@kernel.org
Fixes: a6d7cff4 "fs/aio: Add explicit RCU grace period when freeing kioctx"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
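
The difference between an unconditional get and a tryget-live can be modelled in userspace. The sketch below is a deliberately simplified single-word counter built on C11 atomics; the kernel's percpu_ref is per-CPU and relies on RCU, so none of this is its implementation. All names (`struct ref`, `ref_get()`, `ref_tryget_live()`, `ref_kill()`, `ref_put()`, `ref_count()`) are hypothetical.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Top bit marks the ref as killed; the remaining bits hold the count. */
#define REF_DEAD (~(ULONG_MAX >> 1))

struct ref {
    atomic_ulong state;
};

static void ref_init(struct ref *r)
{
    atomic_init(&r->state, 1UL);    /* the initial reference */
}

static unsigned long ref_count(struct ref *r)
{
    return atomic_load(&r->state) & ~REF_DEAD;
}

/* Unconditional get: works even on a killed ref, which is the problem. */
static void ref_get(struct ref *r)
{
    atomic_fetch_add(&r->state, 1UL);
}

/* Take a reference only while the ref has not been killed. */
static bool ref_tryget_live(struct ref *r)
{
    unsigned long old = atomic_load(&r->state);

    do {
        if (old & REF_DEAD)
            return false;           /* treat as a lookup miss */
    } while (!atomic_compare_exchange_weak(&r->state, &old, old + 1));
    return true;
}

/* Drop one reference; returns true when the count reaches zero. */
static bool ref_put(struct ref *r)
{
    unsigned long old = atomic_fetch_sub(&r->state, 1UL);

    return (old & ~REF_DEAD) == 1;
}

/* Mark the ref dead, then drop the initial reference. */
static bool ref_kill(struct ref *r)
{
    atomic_fetch_or(&r->state, REF_DEAD);
    return ref_put(r);
}
```

In this model a lookup that wins the race keeps the count above zero until it calls `ref_put()`, while a lookup that loses gets `false` from `ref_tryget_live()` and never touches the object. `ref_get()` on a killed ref would instead move the count from 0 back to 1, the situation the message describes.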


ext2: fix a block leak · 5aa1437d

Al Viro authored May 17, 2018

Open a file, unlink it, then use ioctl(2) to make it immutable or
append-only.  Now close it and watch the blocks *not* freed...

Immutable/append-only checks belong in ->setattr().
Note: the bug is old, and a backport to anything prior to 737f2e93
("ext2: convert to use the new truncate convention") will need
these checks lifted into ext2_setattr().

Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed · 3819bb0d

Al Viro authored May 11, 2018

That can (and does, on some filesystems) happen - ->mkdir() (and thus
vfs_mkdir()) can legitimately leave its argument negative and just
unhash it, counting upon the lookup to pick the object we'd created
next time we try to look at that name.

Some vfs_mkdir() callers forget about that possibility...
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


cachefiles: vfs_mkdir() might succeed leaving dentry negative unhashed · 9c3e9025

Al Viro authored May 10, 2018

That can (and does, on some filesystems) happen - ->mkdir() (and thus
vfs_mkdir()) can legitimately leave its argument negative and just
unhash it, counting upon the lookup to pick the object we'd created
next time we try to look at that name.

Some vfs_mkdir() callers forget about that possibility...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


unfuck sysfs_mount() · 7b745a4e

Al Viro authored May 14, 2018

new_sb is left uninitialized in case of early failures in kernfs_mount_ns(),
and while IS_ERR(root) is true in all such cases, using IS_ERR(root) || !new_sb
is not a solution - IS_ERR(root) is true in some cases when new_sb is true.

Make sure new_sb is initialized (and matches the reality) in all cases and
fix the condition for dropping kobj reference - we want it done precisely
in those situations where the reference has not been transferred into a new
super_block instance.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


kernfs: deal with kernfs_fill_super() failures · 82382ace

Al Viro authored Apr 03, 2018

Make sure that info->node is initialized early, so that kernfs_kill_sb()
can list_del() it safely.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


cramfs: Fix IS_ENABLED typo · 08a8f308

Joe Perches authored May 13, 2018

There's an extra C here...

Fixes: 99c18ce5 ("cramfs: direct memory access support")
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


befs_lookup(): use d_splice_alias() · f4e4d434

Al Viro authored Apr 30, 2018

RTFS(Documentation/filesystems/nfs/Exporting) if you try to make
something exportable.

Fixes: ac632f5b "befs: add NFS export support"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


affs_lookup: switch to d_splice_alias() · 87fbd639

Al Viro authored May 06, 2018

Making something exportable takes more than providing ->s_export_ops.
In particular, ->lookup() *MUST* use d_splice_alias() instead of
d_add().

Reading Documentation/filesystems/nfs/Exporting would've been a good idea;
as it is, exporting AFFS is badly (and exploitably) broken.

Partially-Fixes: ed4433d7 "fs/affs: make affs exportable"
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


affs_lookup(): close a race with affs_remove_link() · 30da870c

Al Viro authored May 06, 2018

We unlock the directory hash too early - if we are looking at a secondary
link and the primary (in another directory) gets removed just as we unlock,
we could have the old primary moved in place of the secondary, leaving
us to look into a freed entry (and leaving our dentry with ->d_fsdata
pointing to a freed entry).

Cc: stable@vger.kernel.org # 2.4.4+
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
