- 28 Feb, 2013 1 commit
-
-
J. Bruce Fields authored
It doesn't appear that anyone actually needs to connect asynchronously. Also, using a workqueue for the connect means we lose the namespace information from the original process. This is a problem since there's no way to explicitly pass in a filesystem namespace for resolution of an AF_LOCAL address. Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
- 17 Feb, 2013 3 commits
-
-
Jeff Layton authored
kbuild test robot says:

tree:   git://linux-nfs.org/~bfields/linux.git for-3.9
head:   deb4534f
commit: 01a7decf [32/44] nfsd: keep a checksum of the first 256 bytes of request
config: i386-randconfig-x088 (attached as .config)

All warnings:

fs/nfsd/nfscache.c: In function 'nfsd_cache_csum':
>> fs/nfsd/nfscache.c:266:9: warning: comparison of distinct pointer types lacks a cast [enabled by default]

vim +266 fs/nfsd/nfscache.c

   250          __wsum csum;
   251          struct xdr_buf *buf = &rqstp->rq_arg;
   252          const unsigned char *p = buf->head[0].iov_base;
   253          size_t csum_len = min_t(size_t, buf->head[0].iov_len + buf->page_len,
   254                                  RC_CSUMLEN);
   255          size_t len = min(buf->head[0].iov_len, csum_len);
   256
   257          /* rq_arg.head first */
   258          csum = csum_partial(p, len, 0);
   259          csum_len -= len;
   260
   261          /* Continue into page array */
   262          idx = buf->page_base / PAGE_SIZE;
   263          base = buf->page_base & ~PAGE_MASK;
   264          while (csum_len) {
   265                  p = page_address(buf->pages[idx]) + base;
 > 266                  len = min(PAGE_SIZE - base, csum_len);
   267                  csum = csum_partial(p, len, csum);
   268                  csum_len -= len;
   269                  base = 0;
   270                  ++idx;
   271          }
   272          return csum;
   273  }
   274

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
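For context, the warning fires because the two operands of min() at line 266 have different types: PAGE_SIZE - base is an unsigned long while csum_len is a size_t (an unsigned int on i386). A minimal sketch of the kind of fix the warning calls for — not necessarily the exact patch that was merged — is to force a common type with min_t():

    /* before: operand types differ (unsigned long vs. size_t) on 32-bit */
    len = min(PAGE_SIZE - base, csum_len);

    /* after: make both operands size_t so min()'s type check is satisfied */
    len = min_t(size_t, PAGE_SIZE - base, csum_len);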
-
J. Bruce Fields authored
Rewrite server shutdown to remove the assumption that there are no longer any threads running (no longer true, for example, when shutting down the service in one network namespace while it's still running in others). Do that by doing what we'd do in normal circumstances: just CLOSE each socket, then enqueue it. Since there may not be threads to handle the resulting queued xprts, also run a simplified version of the svc_recv() loop run by a server to clean up any closed xprts afterwards. Cc: stable@kernel.org Tested-by: Jason Tibbitts <tibbs@math.uh.edu> Tested-by: Paweł Sikora <pawel.sikora@agmk.net> Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
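A rough sketch of what such a cleanup pass could look like, written against the svc_xprt structures of this era (sv_pools, sp_sockets, xpt_ready, xpt_net) and the existing svc_delete_xprt() helper; this is an illustration of the idea, not the merged patch:

    /* would live in net/sunrpc/svc_xprt.c, next to the shutdown code */
    static void svc_clean_up_queued_xprts(struct svc_serv *serv, struct net *net)
    {
        struct svc_pool *pool;
        struct svc_xprt *xprt;
        int i;

        for (i = 0; i < serv->sv_nrpools; i++) {
            pool = &serv->sv_pools[i];
    restart:
            spin_lock_bh(&pool->sp_lock);
            list_for_each_entry(xprt, &pool->sp_sockets, xpt_ready) {
                if (xprt->xpt_net != net)
                    continue;
                /* drop the lock before tearing the transport down */
                list_del_init(&xprt->xpt_ready);
                spin_unlock_bh(&pool->sp_lock);
                svc_delete_xprt(xprt);
                goto restart;
            }
            spin_unlock_bh(&pool->sp_lock);
        }
    }

Since there may be no server threads left to service the queue, a pass like this stands in for the svc_recv() loop that would normally notice a closed transport and delete it.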
-
J. Bruce Fields authored
svc_age_temp_xprts expires xprts in a two-step process: first it takes the sv_lock and moves the xprts to expire off their server-wide list (sv_tempsocks or sv_permsocks) to a local list. Then it drops the sv_lock and enqueues and puts each one. I see no reason for this: svc_xprt_enqueue() will take sp_lock, but the sv_lock and sp_lock are not otherwise nested anywhere (and documentation at the top of this file claims it's correct to nest these with sp_lock inside.) Cc: stable@kernel.org Tested-by: Jason Tibbitts <tibbs@math.uh.edu> Tested-by: Paweł Sikora <pawel.sikora@agmk.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
- 15 Feb, 2013 12 commits
-
-
Tim Gardner authored
Even though nlmclnt_reclaim() is only one call into the stack frame, 928 bytes on the stack seems like a lot. Recode to dynamically allocate the request structure once from within the reclaimer task, then pass this pointer into nlmclnt_reclaim() for reuse on subsequent calls. smatch analysis: fs/lockd/clntproc.c:620 nlmclnt_reclaim() warn: 'reqst' puts 928 bytes on stack Also remove redundant assignment of 0 after memset. Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
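A minimal sketch of the shape of that change, reusing the existing struct nlm_rqst and moving the allocation up into the reclaimer (names, error handling and the elided loop are illustrative, not the exact patch):

    /* reclaimer task: one heap allocation, reused for every lock to reclaim */
    struct nlm_rqst *req = kmalloc(sizeof(*req), GFP_KERNEL);

    if (req == NULL)
        return;                 /* nothing we can do without memory */
    /* ... for each lock being reclaimed: nlmclnt_reclaim(host, fl, req) ... */
    kfree(req);

    /* nlmclnt_reclaim() then uses the caller's buffer instead of ~900 bytes of stack */
    int nlmclnt_reclaim(struct nlm_host *host, struct file_lock *fl,
                        struct nlm_rqst *req)
    {
        memset(req, 0, sizeof(*req));   /* fresh state on every call */
        /* ... fill in req and perform the RECLAIM call as before ... */
        return 0;
    }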
-
Stanislav Kinsbursky authored
Currently, NFSd is ready to operate in network-namespace-based containers. So let's drop the check for "init_net" and make it able to fly. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Stanislav Kinsbursky authored
This tracker uses the khelper kthread to execute binaries. Execution itself is done from kthread context - i.e. with the global root. This is not suitable for containers with their own root. So, disable this tracker for now. Note: one possible solution would be to pass an "init" callback to khelper which would swap the root to the desired one. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Stanislav Kinsbursky authored
Functuon "exports_open" is used for both "/proc/fs/nfs/exports" and "/proc/fs/nfsd/exports" files. Now NFSd filesystem is containerised, so proper net can be taken from superblock for "/proc/fs/nfsd/exports" reader. But for "/proc/fs/nfsd/exports" only current->nsproxy->net_ns can be used. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Stanislav Kinsbursky authored
This patch makes the NFSD filesystem superblock be created per net. This makes it possible to get the proper network namespace from the superblock instead of using the hard-coded "init_net". Note: the NFSd fs superblock holds the network namespace. This guarantees that the network namespace won't disappear out from underneath it. This, obviously, means that if a container's "init" is killed (where that init is the creator of the network namespace, not of a mount namespace), the network namespace won't be destroyed. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Stanislav Kinsbursky authored
The reason for moving the cache_request() callback call from sunrpc_cache_pipe_upcall() to cache_read() is that this guarantees that the cache access will be done in userspace process context (only a userspace process has the proper root context). This is required for NFSd support in a container: svc_export_request() (which is the cache_request callback) calls d_path(), which, in turn, traverses the dentry up to current->fs->root. Kernel threads always have the global root, while a container may be in a "root jail" - i.e. have its own nested root. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Stanislav Kinsbursky authored
Passing this pointer is redundant since it's stored in the cache_detail structure, which is also passed to the sunrpc_cache_pipe_upcall() function. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Stanislav Kinsbursky authored
For most SUNRPC caches (except the NFS DNS cache), cache_detail->cache_upcall is redundant, since all its implementations do is call sunrpc_cache_pipe_upcall() with the proper function address as an argument. The cache request function address is now stored in the cache_detail structure, and thus all this code can be simplified. Now, for those cache details which don't have a cache_upcall callback (the only one that still has one is nfs_dns_resolve_template), sunrpc_cache_pipe_upcall() will be called instead. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Stanislav Kinsbursky authored
This callback will make it possible to simplify upcalls in further patches in this series. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Stanislav Kinsbursky authored
This is a cleanup patch. Helpers like nfs_cache_init() and nfs_cache_destroy() are redundant, because they are just wrappers around sunrpc_init_cache_detail() and sunrpc_destroy_cache_detail() respectively. So let's remove them completely and move the corresponding logic into nfs_cache_register_net() and nfs_cache_unregister_net() respectively (since they are called together anyway). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Stanislav Kinsbursky authored
This cache was the first to be containerized and doesn't use the net-aware cache creation and destruction helpers. This is a cleanup patch which just makes the code clearer and reduces the number of lines of code. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
- 11 Feb, 2013 1 commit
-
-
Fengguang Wu authored
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
-
- 08 Feb, 2013 2 commits
-
-
Jeff Layton authored
Now that we're allowing more DRC entries, it becomes a lot easier to hit problems with XID collisions. In order to mitigate those, calculate a checksum of up to the first 256 bytes of each request coming in and store that in the cache entry, along with the total length of the request. This initially used crc32, but Chuck Lever and Jim Rees pointed out that crc32 is probably more heavyweight than we really need for generating these checksums, and recommended looking at using the same routines that are used to generate checksums for IP packets. On an x86_64 KVM guest measurements with ftrace showed ~800ns to use csum_partial vs ~1750ns for crc32. The difference probably isn't terribly significant, but for now we may as well use csum_partial. Signed-off-by: Jeff Layton <jlayton@redhat.com> Stones-thrown-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
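To make the mitigation concrete, here is a rough sketch of how such a checksum might be recorded and used during a lookup; the c_len/c_csum field names and the helper are illustrative, not necessarily the merged code:

    /* stored in each reply cache entry alongside the XID, address, etc. */
    unsigned int    c_len;      /* total length of the request */
    __wsum          c_csum;     /* csum_partial() over the first RC_CSUMLEN bytes */

    /* an incoming request only matches an entry if both length and checksum agree,
     * which keeps two different requests that happen to reuse an XID apart */
    static bool nfsd_cache_entry_matches(struct svc_cacherep *rp,
                                         unsigned int len, __wsum csum)
    {
        return rp->c_len == len && rp->c_csum == csum;
    }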
-
Jeff Layton authored
When GSSAPI integrity signatures are in use, or when we're using GSSAPI privacy with the v2 token format, there is a trailing checksum on the xdr_buf that is returned. It's checked during the authentication stage, and afterward nothing cares about it. Ordinarily, it's not a problem since the XDR code generally ignores it, but it will be when we try to compute a checksum over the buffer to help prevent XID collisions in the duplicate reply cache. Fix the code to trim off the checksums after verifying them. Note that in unwrap_integ_data, we must avoid trying to reverify the checksum if the request was deferred since it will no longer be present when it's revisited. Signed-off-by: Jeff Layton <jlayton@redhat.com>
-
- 05 Feb, 2013 5 commits
-
-
Jeff Layton authored
...these pages aren't necessarily contiguous. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
These routines are used by server and client code, so having them in a separate header would be best. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
When copying an address, we should also copy the scopeid in the event that this is a link-local address and the scope matters. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
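A minimal sketch of the IPv6 branch of such a copy, including the scope ID; the function name mirrors the existing sunrpc address helper, but treat this as an illustration rather than the exact patch:

    static bool __rpc_copy_addr6(struct sockaddr *dst, const struct sockaddr *src)
    {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)src;
        struct sockaddr_in6 *dsin6 = (struct sockaddr_in6 *)dst;

        dsin6->sin6_family = sin6->sin6_family;
        dsin6->sin6_addr = sin6->sin6_addr;
        /* without this line, a copied link-local address loses its interface scope */
        dsin6->sin6_scope_id = sin6->sin6_scope_id;
        return true;
    }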
-
J. Bruce Fields authored
We don't really need to preallocate at all; just allocate and initialize everything at once, but leave the sc_type field initially 0 to prevent finding the stateid till it's fully initialized. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
majianpeng authored
When freeing an nfs client, it must also free ->cl_stateids. Cc: stable@kernel.org Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
- 04 Feb, 2013 16 commits
-
-
Jeff Layton authored
Since we dynamically allocate them now, allow the system to call us to release them if it gets low on memory. Since these entries aren't replaceable, only free ones that are expired or that are over the cap. The seeks value is set to '1', however, to indicate that freeing these entries is low-cost. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
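A sketch of how that hookup might look with the shrinker API of this era (the .shrink callback); prune_expired_entries() and num_drc_entries are hypothetical stand-ins for the cache's own pruning helper and entry count:

    static int nfsd_reply_cache_shrink(struct shrinker *shrink,
                                       struct shrink_control *sc)
    {
        /* nr_to_scan == 0 means "just report how many entries could be freed" */
        if (sc->nr_to_scan)
            prune_expired_entries(sc->nr_to_scan);
        return num_drc_entries;
    }

    static struct shrinker nfsd_reply_cache_shrinker = {
        .shrink = nfsd_reply_cache_shrink,
        .seeks  = 1,    /* dropping an entry is cheap, hence the low seeks value */
    };

    /* at cache init */
    register_shrinker(&nfsd_reply_cache_shrinker);

    /* at cache shutdown */
    unregister_shrinker(&nfsd_reply_cache_shrinker);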
-
Jeff Layton authored
It's not sufficient to only clean the cache when requests come in. What if we have a flurry of activity and then the server goes idle? Add a workqueue job that will clean the cache every RC_EXPIRE period. Care is taken to only run this when we expect to have entries expiring. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
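A sketch of the workqueue plumbing this describes, using a standard delayed work item; prune_cache_entries() is a hypothetical stand-in for the existing pruning logic and cache_lock for the DRC's lock:

    static void cache_cleaner_func(struct work_struct *unused);
    static DECLARE_DELAYED_WORK(cache_cleaner, cache_cleaner_func);

    static void cache_cleaner_func(struct work_struct *unused)
    {
        spin_lock(&cache_lock);
        prune_cache_entries();          /* drops expired entries and re-arms the
                                         * work item if any unexpired ones remain */
        spin_unlock(&cache_lock);
    }

    /* when an entry is inserted, make sure a cleaning pass is queued */
    schedule_delayed_work(&cache_cleaner, RC_EXPIRE);

    /* on cache shutdown */
    cancel_delayed_work_sync(&cache_cleaner);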
-
Jeff Layton authored
There's no need to keep entries around that we're declaring RC_NOCACHE. Ditto if there's a problem with the entry. With this change too, there's no need to test for RC_UNUSED in the search function. If the entry's in the hash table then it's either INPROG or DONE. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
With the change to dynamically allocate entries, the cache is never disabled on the fly. Remove this flag. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
The existing code keeps a fixed-size cache of 1024 entries. This is much too small for a busy server, and wastes memory on an idle one. This patch changes the code to dynamically allocate and free these cache entries. A cap on the number of entries is retained, but it's much larger than the existing value and now scales with the amount of low memory in the machine. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
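One way such a scaling cap could be computed, purely as an illustration (the formula in the actual patch may differ): derive a limit from the amount of low memory and clamp it to a sane maximum.

    /* e.g. called once at reply cache init to size the cache */
    static unsigned int nfsd_cache_size_limit(void)
    {
        unsigned int limit;
        unsigned long low_pages = nr_free_buffer_pages();  /* pages of low memory */

        limit = (16 * int_sqrt(low_pages)) << (PAGE_SHIFT - 10);
        return min_t(unsigned int, limit, 256 * 1024);
    }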
-
Jeff Layton authored
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
...otherwise, we end up with the list ordering wrong. Currently, it's not a problem since we skip RC_INPROG entries, but keeping the ordering strict will be necessary for a later patch that adds a cache cleaner. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
commit 885c91f7 in Bruce's tree was causing oopses for me:

general protection fault: 0000 [#1] SMP
Modules linked in: nfsd(OF) nfs_acl(OF) auth_rpcgss(OF) lockd(OF) sunrpc(OF) kvm_amd kvm microcode i2c_piix4 virtio_net virtio_balloon cirrus drm_kms_helper ttm drm virtio_blk i2c_core
CPU 0
Pid: 564, comm: exportfs Tainted: GF O 3.8.0-0.rc5.git2.1.fc19.x86_64 #1 Bochs Bochs
RIP: 0010:[<ffffffff811b1509>]  [<ffffffff811b1509>] kfree+0x49/0x280
RSP: 0018:ffff88007a3d7c50  EFLAGS: 00010203
RAX: 01adaf8dadadad80 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000001
RDX: ffffffff7fffffff RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b6b
RBP: ffff88007a3d7c80 R08: 6b6b6b6b6b6b6b6b R09: 0000000000000000
R10: 0000000000000018 R11: 0000000000000000 R12: ffff88006a117b50
R13: ffffffffa01a589c R14: ffff8800631b0f50 R15: 01ad998dadadad80
FS:  00007fcaa3616740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f5d84b6fdd8 CR3: 0000000064db4000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process exportfs (pid: 564, threadinfo ffff88007a3d6000, task ffff88006af28000)
Stack:
 ffff88007a3d7c80 ffff88006a117b68 ffff88006a117b50 0000000000000000
 ffff8800631b0f50 ffff88006a117b50 ffff88007a3d7ca0 ffffffffa01a589c
 ffff880036be1148 ffff88007a3d7cf8 ffff88007a3d7e28 ffffffffa01a6a98
Call Trace:
 [<ffffffffa01a589c>] svc_export_put+0x5c/0x70 [nfsd]
 [<ffffffffa01a6a98>] svc_export_parse+0x328/0x7e0 [nfsd]
 [<ffffffffa016f1c7>] cache_do_downcall+0x57/0x70 [sunrpc]
 [<ffffffffa016f25e>] cache_downcall+0x7e/0x100 [sunrpc]
 [<ffffffffa016f338>] cache_write_procfs+0x58/0x90 [sunrpc]
 [<ffffffffa016f2e0>] ? cache_downcall+0x100/0x100 [sunrpc]
 [<ffffffff8123b0e5>] proc_reg_write+0x75/0xb0
 [<ffffffff811ccecf>] vfs_write+0x9f/0x170
 [<ffffffff811cd089>] sys_write+0x49/0xa0
 [<ffffffff816e0919>] system_call_fastpath+0x16/0x1b
Code: 66 66 66 90 48 83 fb 10 0f 86 c3 00 00 00 48 89 df 49 bf 00 00 00 00 00 ea ff ff e8 f2 12 ea ff 48 c1 e8 0c 48 c1 e0 06 49 01 c7 <49> 8b 07 f6 c4 80 0f 85 1d 02 00 00 49 8b 07 a8 80 0f 84 ee 01
RIP  [<ffffffff811b1509>] kfree+0x49/0x280
 RSP <ffff88007a3d7c50>

I think Majianpeng's patch is correct, but incomplete. In order for it to be safe to free the ex_uuid unconditionally in svc_export_put, we need to make sure it's initialized to NULL in the init routine.

Cc: majianpeng <majianpeng@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
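The follow-up fix described here boils down to one assignment, sketched below; the exact placement and surrounding function name are illustrative:

    /* in the export cache's init callback (e.g. svc_export_init()): start the
     * field out as NULL so an unconditional kfree(exp->ex_uuid) is always safe */
    new->ex_uuid = NULL;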
-
Jeff Layton authored
Later, we'll need more than one call site for this, so break it out into a new function. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
Add a preprocessor constant for the expiry time of cache entries, and move the test for an expired entry into a function. Note that the current code does not test for RC_INPROG. It just assumes that it won't take more than 2 minutes to fill out an in-progress entry. I'm not sure how valid that assumption is though, so let's just ensure that we never consider an RC_INPROG entry to be expired. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
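A minimal sketch of what that constant and helper could look like (names and the exact check are illustrative, not necessarily the merged code):

    #define RC_EXPIRE       (120 * HZ)      /* the 2-minute window mentioned above */

    static bool nfsd_cache_entry_expired(struct svc_cacherep *rp)
    {
        /* never consider an in-progress entry expired, per the note above */
        return rp->c_state != RC_INPROG &&
               time_after(jiffies, rp->c_timestamp + RC_EXPIRE);
    }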
-
Jeff Layton authored
Entries can only get a c_type of RC_REPLBUFF iff they are RC_DONE. Therefore the test for RC_DONE isn't necessary here. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
Currently we use kmalloc(), which wastes a little bit of memory on each allocation since it's a power-of-2 allocator. Since we're allocating 1024 of these now, and may need even more later, let's create a new slabcache for them. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
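The slab cache itself is only a few lines with the standard kmem_cache API; a sketch, with the cache name and variable chosen for illustration:

    static struct kmem_cache *drc_slab;

    /* at reply cache init */
    drc_slab = kmem_cache_create("nfsd_drc", sizeof(struct svc_cacherep),
                                 0, 0, NULL);
    if (drc_slab == NULL)
        return -ENOMEM;

    /* per-entry allocation and free */
    rp = kmem_cache_alloc(drc_slab, GFP_KERNEL);
    kmem_cache_free(drc_slab, rp);

    /* at reply cache shutdown */
    kmem_cache_destroy(drc_slab);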
-
Jeff Layton authored
The reply cache code never returns this status. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
The locking rules for cache entries say that locking the cache_lock isn't needed if you're just touching the current entry. Earlier in this function we set rp->c_state to RC_UNUSED without any locking, so I believe it's ok to do the same here. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Jeff Layton authored
Currently, it only stores the first 16 bytes of any address. struct sockaddr_in6 is 28 bytes however, so we're currently ignoring the last 12 bytes of the address. Expand the c_addr field to a sockaddr_in6, and cast it to a sockaddr_in as necessary. Also fix the comparator to use the existing RPC helpers for this. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
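A sketch of what using the generic RPC address helpers (rpc_copy_addr()/rpc_cmp_addr() from the sunrpc headers) looks like here; the surrounding field and loop fragment are illustrative:

    struct sockaddr_in6 c_addr;     /* big enough for both IPv4 and IPv6 */

    /* when caching a request */
    rpc_copy_addr((struct sockaddr *)&rp->c_addr, svc_addr(rqstp));

    /* when searching the cache, compare with the helper instead of memcmp() */
    if (!rpc_cmp_addr((struct sockaddr *)&rp->c_addr, svc_addr(rqstp)))
        continue;       /* different client address, not a match */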
-