Commits · 137d6acaa64afa4cf3d977417424e731ea04705a · nexedi / linux

11 Jul, 2007 40 commits

NFSv4: Make sure unlock is really an unlock when cancelling a lock · 137d6aca

Frank Filz authored Jul 09, 2007

I ran into a curious issue when a lock is being canceled. The
cancellation results in a lock request to the vfs layer instead of an
unlock request. This is particularly insidious when the process that
owns the lock is exiting. In that case, sometimes the erroneous lock is
applied AFTER the process has entered zombie state, preventing the lock
from ever being released. Eventually other processes block on the lock,
causing a slow degradation of the system. In the 2.6.16 kernel on which
this was investigated, the problem is compounded by the fact that the
cl_sem is held while blocking on the vfs lock, which results in most
processes accessing the nfs file system in question hanging.

In more detail, here is how the situation occurs:

first _nfs4_do_setlk():

static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *fl, int reclaim)
...
        ret = nfs4_wait_for_completion_rpc_task(task);
        if (ret == 0) {
...
        } else
                data->cancelled = 1;

then nfs4_lock_release():

static void nfs4_lock_release(void *calldata)
...
        if (data->cancelled != 0) {
                struct rpc_task *task;
                task = nfs4_do_unlck(&data->fl, data->ctx, data->lsp,
                                data->arg.lock_seqid);

The problem is that the same file_lock that was passed in to _nfs4_do_setlk()
gets passed to nfs4_do_unlck() from nfs4_lock_release(). So the type is
still F_RDLCK or F_WRLCK, not F_UNLCK. At some point, when cancelling the
lock, the type needs to be changed to F_UNLCK. It seemed easiest to do
that in nfs4_do_unlck(), but it could be done in nfs4_lock_release().
The concern I had with doing it there was if something still needed the
original file_lock, though it turns out the original file_lock still
needs to be modified by nfs4_do_unlck() because nfs4_do_unlck() uses the
original file_lock to pass to the vfs layer, and a copy of the original
file_lock for the RPC request.

It seems like the simplest solution is to force all situations where
nfs4_do_unlck() is being used to result in an unlock, so with that in
mind, I made the following change:
Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
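
The diff that followed this message is not reproduced in this listing. A minimal userspace sketch of the idea described above, assuming a stand-in `sketch_file_lock` structure and hypothetical `sketch_*` function names (none of this is kernel code or the actual patch), might look like this:

```c
#include <fcntl.h>   /* F_RDLCK, F_WRLCK, F_UNLCK */

/* Hypothetical stand-in for the kernel's struct file_lock. */
struct sketch_file_lock {
    short fl_type;
};

/* Stand-in for the unlock path: whatever type the caller passed in,
 * force it to F_UNLCK so the request can only ever be an unlock. */
static void sketch_do_unlck(struct sketch_file_lock *fl)
{
    fl->fl_type = F_UNLCK;
}

/* Stand-in for the release callback of a lock request: a cancelled
 * request is undone by sending the same file_lock down the unlock path. */
static void sketch_lock_release(struct sketch_file_lock *fl, int cancelled)
{
    if (cancelled != 0)
        sketch_do_unlck(fl);
}
```

Forcing the type inside the unlock helper, rather than in each caller, matches the reasoning in the message: every user of the helper then results in an unlock.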

NLM: fix source address of callback to client · c98451bd

Frank van Maarseveen authored Jul 09, 2007

Use the destination address of the original NLM request as the
source address in callbacks to the client.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC client: add interface for binding to a local address · d3bc9a1d

Frank van Maarseveen authored Jul 09, 2007

In addition to binding to a local privileged port the NFS client should
allow binding to a specific local address. This is used by the server
for callbacks. The patch adds the necessary interface.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC server: record the destination address of a request · a9747692

Frank van Maarseveen authored Jul 09, 2007

Save the destination address of an incoming request over TCP, as is
already done for UDP. It is needed later for callbacks by the server.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
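
As a userspace analogy only (the kernel's SUNRPC server code is not shown here), the destination address of a TCP request is the local address of the accepted socket, which `getsockname()` reports. The helper name `accept_and_record_dest` is hypothetical:

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Accept one connection on listen_fd and record the local address the
 * client connected to, i.e. the destination address of the request.
 * Returns the connected socket, or -1 on error. */
static int accept_and_record_dest(int listen_fd, struct sockaddr_in *dest)
{
    socklen_t len = sizeof(*dest);
    int fd = accept(listen_fd, NULL, NULL);

    if (fd < 0)
        return -1;
    memset(dest, 0, sizeof(*dest));
    if (getsockname(fd, (struct sockaddr *)dest, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server listening on a wildcard address can use the recorded value as the source address for a later callback, so the client sees replies coming from the address it originally contacted.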

SUNRPC: cleanup transport creation argument passing · 96802a09

Frank van Maarseveen authored Jul 08, 2007

Clean up argument passing to functions for creating an RPC transport.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Make the NFS state model work with the nosharedcache mount option · 6f2e64d3

Trond Myklebust authored Jul 06, 2007

Consider the case where the user has mounted the remote filesystem
server:/foo on the two local directories /bar and /baz using the
nosharedcache mount option. The files /bar/file and /baz/file are
represented by different inodes in the local namespace, but refer to the
same file /foo/file on the server.
Consider the case where a process opens both /bar/file and /baz/file, then
closes /bar/file: because the nfs4_state is not shared between /bar/file
and /baz/file, the kernel will see that the nfs4_state for /bar/file is no
longer referenced, so it will send off a CLOSE rpc call. Unless the
open_owners differ, that CLOSE call will invalidate the open state on
/baz/file too.

Conclusion: we cannot share open state owners between two different
non-shared mount instances of the same filesystem.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Error when mounting the same filesystem with different options · 275a5d24

Trond Myklebust authored May 16, 2007

Unless the user sets the NFS_MOUNT_NOSHAREDCACHE mount flag, we should
return EBUSY if the filesystem is already mounted on a superblock that
has set conflicting mount options.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Add the mount option "nosharecache" · 75180df2

Trond Myklebust authored May 16, 2007

Prior to David Howells' mount changes in 2.6.18, users who mounted
different directories which happened to be from the same filesystem on the
server would get different super blocks, and hence could choose different
mount options. As long as there were no hard linked files that crossed from
one subtree to another, this was quite safe.
After those changes, if the two directories are on the same filesystem (have
the same 'fsid'), they will share the same super block, and hence the same
mount options.

Add a flag to allow users to elect not to share the NFS super block with
another mount point, even if the fsids are the same. This will allow
users to set different mount options for the two different super blocks, as
was previously possible. It is still up to the user to ensure that there
are no cache coherency issues when doing this. The default behaviour,
however, will be to share super blocks whenever two paths result in
the same fsid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Add support for mounting NFSv4 file systems with string options · 80071225

Chuck Lever authored Jul 01, 2007

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Add final pieces to support in-kernel mount option parsing · 136d558c

Chuck Lever authored Jul 01, 2007

Hook in final components required for supporting in-kernel mount option
parsing for NFSv2 and NFSv3 mounts.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Introduce generic mount client API · 0076d7b7

Chuck Lever authored Jul 01, 2007

For NFSv2 and v3 mounts, the first step is to contact the server's MOUNTD
and request the file handle for the root of the mounted share. Add a
function to the NFS client that handles this operation.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Add enums and match tables for mount option parsing · bf0fd768

Chuck Lever authored Jul 01, 2007

This generic infrastructure works for both NFS and NFSv4 mounts.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
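
The enums and tables added by this patch are not shown in the listing. As a rough illustration of the general technique, an enum of tokens paired with a table of option strings, here is a small userspace sketch. The option names and every `sketch_*` identifier are assumptions made for the example:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical option tokens; the enum in the actual patch is not shown. */
enum sketch_opt {
    SKETCH_OPT_SOFT,
    SKETCH_OPT_HARD,
    SKETCH_OPT_INTR,
    SKETCH_OPT_ERR      /* unrecognised option */
};

/* One row of the match table: a token and the string that selects it. */
struct sketch_match {
    enum sketch_opt token;
    const char *pattern;
};

static const struct sketch_match sketch_tokens[] = {
    { SKETCH_OPT_SOFT, "soft" },
    { SKETCH_OPT_HARD, "hard" },
    { SKETCH_OPT_INTR, "intr" },
};

/* Look up one option string; returns SKETCH_OPT_ERR if nothing matches. */
static enum sketch_opt sketch_match_token(const char *opt)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_tokens) / sizeof(sketch_tokens[0]); i++)
        if (strcmp(opt, sketch_tokens[i].pattern) == 0)
            return sketch_tokens[i].token;
    return SKETCH_OPT_ERR;
}
```

Because the table is plain data, two mount paths can share the lookup code and differ only in which table they pass, which is one way such infrastructure can serve both NFS and NFSv4.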

NFS: Improve debugging output in NFS in-kernel mount client · 013a8c1a

Chuck Lever authored Jul 01, 2007

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Clean up in-kernel NFS mount · 19207231

Chuck Lever authored Jul 01, 2007

Clean up white space and coding conventions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Remake nfsroot_mount as a permanent part of NFS client · 3ea97309

Chuck Lever authored Jul 01, 2007

In preparation for supporting NFSv2 and NFSv3 mount option handling in the
kernel NFS client, convert mount_clnt.c to be a permanent part of the NFS
client, instead of built only when CONFIG_ROOT_NFS is enabled.

In addition, we also replace the "struct sockaddr_in *" argument with
something more generic, to help support IPv6 at some later point.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Add a convenient default for the hostname when calling rpc_create() · 43780b87

Chuck Lever authored Jul 01, 2007

A couple of callers just use a stringified IP address for the rpc client's
hostname.  Move the logic for constructing this into rpc_create(), so it can
be shared.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
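
A userspace sketch of the described fallback, assuming IPv4 only and a hypothetical helper named `sketch_servername` (the real logic lives inside rpc_create() and is not shown here):

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Copy the caller's server name into buf, or, if none was supplied,
 * fall back to the dotted-quad form of the server address.
 * Returns 0 on success, -1 if buf is too small or conversion fails. */
static int sketch_servername(const char *name, const struct sockaddr_in *sin,
                             char *buf, size_t buflen)
{
    if (name != NULL) {
        size_t n = strlen(name) + 1;    /* include the terminating NUL */

        if (n > buflen)
            return -1;
        memcpy(buf, name, n);
        return 0;
    }
    if (inet_ntop(AF_INET, &sin->sin_addr, buf, (socklen_t)buflen) == NULL)
        return -1;
    return 0;
}
```

Callers that have no better name can then pass NULL instead of each building the string themselves.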

SUNRPC: Rename rpcb_getport to be consistent with new rpcb_getport_sync name · 45160d62

Chuck Lever authored Jul 01, 2007

Clean up, for consistency.  Rename rpcb_getport as rpcb_getport_async, to
match the naming scheme of rpcb_getport_sync.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Rename rpcb_getport_external routine · cce63cd6

Chuck Lever authored Jul 01, 2007

In preparation for handling NFS mount option parsing in the kernel,
rename rpcb_getport_external as rpcb_getport_sync, and make it available
always (instead of only when CONFIG_ROOT_NFS is enabled).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Allow rpcbind requests to be interrupted by a signal. · f7fb558e

Chuck Lever authored Jul 01, 2007

This allows NFS mount requests and RPC re-binding to be interruptible if the
server isn't responding.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Introduce nfs4_validate_mount_options · f0768ebd

Chuck Lever authored Jul 01, 2007

Refactor NFSv4 mount processing to break out mount data validation
in the same way it's broken out in the NFSv2/v3 mount path.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Clean up nfs_validate_mount_data · 5df36e78

Chuck Lever authored Jul 01, 2007

Move error handling code out of the main code path. The switch statement
was also improperly indented, according to Documentation/CodingStyle. This
prepares nfs_validate_mount_data for the addition of option string parsing.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Add a new NFS debugging flag just for mount processing · f1828993

Chuck Lever authored Jul 01, 2007

Note to self: fix up /usr/sbin/rpcdebug too
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Clean-up: Refactor IP address sanity checks in NFS client · fc50d58f

Chuck Lever authored Jul 01, 2007

NFS and NFSv4 mounts can now share server address sanity checking.  This also
provides an easy mechanism for adding IPv6 address checking at some later
point.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Clean-up: fix a compiler warning in fs/nfs/super.c · 4d81cd16

Chuck Lever authored Jul 01, 2007

/home/cel/linux/fs/nfs/super.c: In function 'nfs_pseudoflavour_to_name':
/home/cel/linux/fs/nfs/super.c:270: warning: comparison between signed and unsigned
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Clean up error handling in nfs_get_sb · 0655960f

Chuck Lever authored Jul 01, 2007

The error return logic in nfs_get_sb now matches nfs4_get_sb, and is more maintainable.
A subsequent patch will take advantage of this simplification.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Clean-up: Replace nfs_copy_user_string with strndup_user · 29eb981a

Chuck Lever authored Jul 01, 2007

The new string utility function strndup_user can be used instead of
nfs_copy_user_string, eliminating an unnecessary duplication of function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Clean-up: Define macros for maximum host and export path name lengths · 5680d48b

Chuck Lever authored Jul 01, 2007

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Clean-up: use correct type when converting NFS blocks to local blocks · 9eaa67c6

Chuck Lever authored Jul 01, 2007

inode->i_blocks is a blkcnt_t these days, which can be a u64 or unsigned
long, depending on the setting of CONFIG_LSF.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
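
To illustrate why the width of the result type matters, here is a minimal sketch of such a conversion. It assumes the server reports space used in bytes and that local blocks are 512 bytes; `sketch_blkcnt_t` is a hypothetical stand-in for the wide (64-bit) variant of blkcnt_t:

```c
#include <stdint.h>

/* Hypothetical stand-in for blkcnt_t, assuming the 64-bit variant. */
typedef uint64_t sketch_blkcnt_t;

/* Convert a byte count into 512-byte blocks, rounding up.
 * The shift-then-adjust form avoids overflowing on bytes + 511. */
static sketch_blkcnt_t sketch_bytes_to_blocks(uint64_t bytes)
{
    return (sketch_blkcnt_t)((bytes >> 9) + ((bytes & 511) ? 1 : 0));
}
```

With a narrower result type, very large files would need an explicit clamp or the block count would silently truncate.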

NFS: Clean up nfs_size_to_loff_t() · 433c9237

Chuck Lever authored Jul 01, 2007

Use the same file size limit that lockd uses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
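
The limit itself is not spelled out in this listing. As a sketch under the assumption that the limit is the largest signed 64-bit file offset, clamping an unsigned protocol size into a signed local offset could look like the following. `SKETCH_OFFSET_MAX` and `sketch_size_to_loff` are hypothetical names:

```c
#include <stdint.h>

/* Assumed upper bound: the largest signed 64-bit file offset. */
#define SKETCH_OFFSET_MAX INT64_MAX

/* Clamp an unsigned 64-bit size from the wire into a signed offset. */
static int64_t sketch_size_to_loff(uint64_t size)
{
    if (size > (uint64_t)SKETCH_OFFSET_MAX - 1)
        return SKETCH_OFFSET_MAX - 1;
    return (int64_t)size;
}
```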

NFSv4: Fix up stateid locking... · 8bda4e4c

Trond Myklebust authored Jul 09, 2007

We really don't need to grab both the state->so_owner and the
inode->i_lock.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Clean up the callers of nfs4_open_recover_helper() · 1ac7e2fd

Trond Myklebust authored Jul 08, 2007

Rely on nfs4_try_open_cached() when appropriate.

Also fix an RCU violation in _nfs4_do_open_reclaim()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Don't call OPEN if we already have an open stateid for a file · 6ee41268

Trond Myklebust authored Jul 08, 2007

If we already have a stateid with the correct open mode for a given file,
then we can reuse that stateid instead of re-issuing an OPEN call without
violating the close-to-open caching semantics.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Check for the existence of a delegation in nfs4_open_prepare() · aac00a8d

Trond Myklebust authored Jul 05, 2007

We should not be calling open() on an inode that has a delegation unless
we're doing a reclaim.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Clean up _nfs4_proc_open() · 3e309914

Trond Myklebust authored Jul 07, 2007

Use a flag instead of the 'data->rpc_status = -ENOMEM' hack.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Allow nfs4_opendata_to_nfs4_state to return errors. · 1b370bc2

Trond Myklebust authored Jul 07, 2007

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Improve the debugging of bad sequence id errors... · 6f43ddcc

Trond Myklebust authored Jul 08, 2007

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Always use the delegation if we have one · 003707c7

Trond Myklebust authored Jul 05, 2007

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Clean up confirmation of sequence ids... · 0f9f95e0

Trond Myklebust authored Jul 08, 2007

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Defer inode revalidation when setting up a delegation · 412c77ce

Trond Myklebust authored Jul 03, 2007

Currently we force a synchronous call to __nfs_revalidate_inode() in
nfs_inode_set_delegation(). This not only means that we cannot call
nfs_inode_set_delegation() from an asynchronous context, but it also slows
down any call to open().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Use RCU to protect delegations · 8383e460

Trond Myklebust authored Jul 06, 2007

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>