- 08 Jul, 2021 19 commits
-
-
Trond Myklebust authored
Cache the layout in the arguments so we don't have to keep looking it up from the inode. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
If the layout gets invalidated, we should wait for any outstanding layoutget requests for that layout to complete, and we should resend them only after re-establishing the layout stateid. Fixes: d29b468d ("pNFS/NFSv4: Improve rejection of out-of-order layouts") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
If we have multiple outstanding layoutget requests, the current code to update the layout barrier assumes that the outstanding layout stateids are updated in order. That's not necessarily the case. Instead of using the value of lo->plh_outstanding as a guesstimate for the window of values we need to accept, just wait to update the window until we're processing the last one. The intention here is just to ensure that we don't process 2^31 seqid updates without also updating the barrier. Fixes: 1bcf34fd ("pNFS/NFSv4: Update the layout barrier when we schedule a layoutreturn") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Dave Wysochanski authored
Earlier commits refactored some NFS read code and removed nfs_readpage_async(), but neglected to properly fixup nfs_readpage_from_fscache_complete(). The code path is only hit when something unusual occurs with the cachefiles backing filesystem, such as an IO error or while a cookie is being invalidated. Mark page with PG_checked if fscache IO completes in error, unlock the page, and let the VM decide to re-issue based on PG_uptodate. When the VM reissues the readpage, PG_checked allows us to skip over fscache and read from the server. Link: https://marc.info/?l=linux-nfs&m=162498209518739 Fixes: 1e83b173 ("NFS: Add nfs_pageio_complete_read() and remove nfs_readpage_async()") Signed-off-by:
Dave Wysochanski <dwysocha@redhat.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Dave Wysochanski authored
A previous refactoring of nfs_readpage() might end up calling wait_on_page_locked_killable() even if readpage_async_filler() failed with an internal error and pg_error was non-zero (for example, if nfs_create_request() failed). In the case of an internal error, skip over wait_on_page_locked_killable() as this is only needed when the read is sent and an error occurs during completion handling. Signed-off-by:
Dave Wysochanski <dwysocha@redhat.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
In preparation of being able to change the xprt's state, add a way to show currect state of the transport. Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
Allow to query xrpt_switch attributes. Currently showing the following fields of the rpc_xprt_switch structure: xps_nxprts, xps_nactive, xps_queuelen. Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
Allow to query transport's attributes. Currently showing following fields of the rpc_xprt structure: state, last_used, cong, cwnd, max_reqs, min_reqs, num_reqs, sizes of queues binding, sending, pending, backlog. Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
Allow to query and set the destination's address of a transport. Setting of the destination address is allowed only for TCP or RDMA based connections. Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
Add individual transport directories under each transport switch group. For instance, for each nconnect=X connections there will be a transport directory. Naming conventions also identifies transport type -- xprt-<id>-<type> where type is udp, tcp, rdma, local, bc. Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
An rpc client uses a transport switch and one ore more transports associated with that switch. Since transports are shared among rpc clients, create a symlink into the xprt_switch directory instead of duplicating entries under each rpc client. Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
Add xprt_switch directory to the sysfs and create individual xprt_swith subdirectories for multipath transport group. Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
We need to keep track of the type for a given transport. Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
This is used to uniquely identify sunrpc multipath objects in /sys. Signed-off-by:
Dan Aloni <dan@kernelim.com> Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
This adds a unique identifier for a sunrpc transport in sysfs, which is similarly managed to the unique IDs of clients. Signed-off-by:
Dan Aloni <dan@kernelim.com> Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
These will eventually have files placed under them for sysfs operations. Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
For network namespace separation. Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Olga Kornievskaia authored
This is where we'll put per-rpc_client related files Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
- 29 Jun, 2021 3 commits
-
-
Trond Myklebust authored
Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
Instead of returning ENOLCK when we can't hand out a lease, we should be returning EAGAIN. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
If a file has already been closed, then it should not be selected to support further I/O. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> [Trond: Fix an invalid pointer deref reported by Colin Ian King]
-
- 28 Jun, 2021 4 commits
-
-
Zhang Xiaoxu authored
When find a task from wait queue to wake up, a non-privileged task may be found out, rather than the privileged. This maybe lead a deadlock same as commit dfe1fe75 ("NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()"): Privileged delegreturn task is queued to privileged list because all the slots are assigned. If there has no enough slot to wake up the non-privileged batch tasks(session less than 8 slot), then the privileged delegreturn task maybe lost waked up because the found out task can't get slot since the session is on draining. So we should treate the privileged task as the emergency task, and execute it as for as we can. Reported-by:
Hulk Robot <hulkci@huawei.com> Fixes: 5fcdfacc ("NFSv4: Return delegations synchronously in evict_inode") Cc: stable@vger.kernel.org Signed-off-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Zhang Xiaoxu authored
The 'queue->nr' will wraparound from 0 to 255 when only current priority queue has tasks. This maybe lead a deadlock same as commit dfe1fe75 ("NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()"): Privileged delegreturn task is queued to privileged list because all the slots are assigned. When non-privileged task complete and release the slot, a non-privileged maybe picked out. It maybe allocate slot failed when the session on draining. If the 'queue->nr' has wraparound to 255, and no enough slot to service it, then the privileged delegreturn will lost to wake up. So we should avoid the wraparound on 'queue->nr'. Reported-by:
Hulk Robot <hulkci@huawei.com> Fixes: 5fcdfacc ("NFSv4: Return delegations synchronously in evict_inode") Fixes: 1da177e4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Dave Wysochanski authored
Simplify nfs_pageio_complete_read() by using the inode pointer saved inside nfs_pageio_descriptor. Signed-off-by:
Dave Wysochanski <dwysocha@redhat.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Scott Mayhew authored
After calling security_sb_clone_mnt_opts() in nfs_get_root(), it's necessary to copy the value of has_sec_mnt_opts from the cloned super_block's nfs_server. Otherwise, calls to nfs_compare_super() using this super_block may not return the correct result, leading to mount failures. For example, mounting an nfs server with the following in /etc/exports: /export *(rw,insecure,crossmnt,no_root_squash,security_label) and having /export/scratch on a separate block device. mount -o v4.2,context=system_u:object_r:root_t:s0 server:/export/test /mnt/test mount -o v4.2,context=system_u:object_r:swapfile_t:s0 server:/export/scratch /mnt/scratch The second mount would fail with "mount.nfs: /mnt/scratch is busy or already mounted or sharecache fail" and "SELinux: mount invalid. Same superblock, different security settings for..." would appear in the syslog. Also while we're in there, replace several instances of "NFS_SB(s)" with "server", which was already declared at the top of the nfs_get_root(). Fixes: ec1ade6a ("nfs: account for selinux security context when deciding to share superblock") Signed-off-by:
Scott Mayhew <smayhew@redhat.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
- 26 Jun, 2021 3 commits
-
-
Trond Myklebust authored
We know that the attributes changed on the server if and only if the change attribute is different. Otherwise, we're just refreshing our cache with values that were already known to be stale. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
If the change attribute update is declared to be non-atomic by the server, or our cached value does not match the server's value before the operation was performed, then we should declare the inode cache invalid. On the other hand, if the change to the directory raced with a lookup or getattr which already updated the change attribute, then optimise away the revalidation. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
The inode is considered revalidated when we've checked the value of the change attribute against our cached value since that suffices to establish whether or not the other cached values are valid. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
- 21 Jun, 2021 3 commits
-
-
Gao Xiang authored
When looking into another nfs xfstests report, I found acl and default_acl in nfs3_proc_create() and nfs3_proc_mknod() error paths are possibly leaked. Fix them in advance. Fixes: 013cdf10 ("nfs: use generic posix ACL infrastructure for v3 Posix ACLs") Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Signed-off-by:
Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
NeilBrown authored
If an RPC client is created without RPC_CLNT_CREATE_REUSEPORT, it should not reuse the source port when a TCP connection is re-established. This is currently implemented by preventing the source port being recorded after a successful connection (the call to xs_set_srcport()). However the source port is also recorded after a successful bind in xs_bind(). This may not be needed at all and certainly is not wanted when RPC_CLNT_CREATE_REUSEPORT wasn't requested. So avoid that assignment when xprt.reuseport is not set. With this change, NFSv4.1 and later mounts use a different port number on each connection. This is helpful with some firewalls which don't cope well with port reuse. Signed-off-by:
NeilBrown <neilb@suse.de> Fixes: e6237b6f ("NFSv4.1: Don't rebind to the same source port when reconnecting to the server") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Colin Ian King authored
The variable status is being initialized with a value that is never read, the assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by:
Colin Ian King <colin.king@canonical.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
- 13 Jun, 2021 8 commits
-
-
Anna Schumaker authored
This seems to happen fairly easily during READ_PLUS testing on NFS v4.2. I found that we could end up accessing xdr->buf->pages[pgnr] with a pgnr greater than the number of pages in the array. So let's just return early if we're setting base to a point at the end of the page data and let xdr_set_tail_base() handle setting up the buffer pointers instead. Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Fixes: 8d86e373 ("SUNRPC: Clean up helpers xdr_set_iov() and xdr_set_page_base()") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
Fix an Oopsable condition in pnfs_mark_request_commit() when we're putting a set of writes on the commit list to reschedule them after a failed pNFS attempt. Fixes: 9c455a8c ("NFS/pNFS: Clean up pNFS commit operations") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
Set up the connection to the NFSv4 server in nfs4_alloc_client(), before we've added the struct nfs_client to the net-namespace's nfs_client_list so that a downed server won't cause other mounts to hang in the trunking detection code. Reported-by:
Michael Wakabayashi <mwakabayashi@vmware.com> Fixes: 5c6e5b60 ("NFS: Fix an Oops in the pNFS files and flexfiles connection setup to the DS") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
If the NFSv4 client already holds a delegation for a file, then we can support application leases (i.e. fcntl(fd, F_SETLEASE,...)) because the underlying delegation guarantees that the file is not being modified on the server by another client in a way that might conflict with the lease guarantees. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
When we add support for application level leases and knfsd delegations to the NFS client, we we want to have them safely underpinned by a "real" delegation to provide the caching guarantees. If that real delegation is recalled, then we need to ensure that the application leases/delegations are recalled too. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
If we're unable to immediately recover all locks because the server is unable to immediately service our reclaim calls, then we want to retry after we've finished servicing all the other asynchronous delegation returns on our queue. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Linus Torvalds authored
-
Linus Torvalds authored
Merge tag 'perf-tools-fixes-for-v5.13-2021-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Correct buffer copying when peeking events - Sync cpufeatures/disabled-features.h header with the kernel sources * tag 'perf-tools-fixes-for-v5.13-2021-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: tools headers cpufeatures: Sync with the kernel sources perf session: Correct buffer copying when peeking events
-