Commit d4550bbe authored by Chuck Lever, committed by Anna Schumaker

xprtrdma: Check inline size before providing a Write chunk

In very rare cases, an NFS READ operation might predict that the
non-payload part of the RPC Reply is large. For instance, an
NFSv4 COMPOUND with a large GETATTR result, in combination with a
large Kerberos credential, could push the non-payload part to be
several kilobytes.

If the non-payload part is larger than the connection's inline
threshold, the client is required to provision a Reply chunk. The
current Linux client does not check for this case. There are two
obvious ways to handle it:

a. Provision a Write chunk for the payload and a Reply chunk for
   the non-payload part

b. Provision a Reply chunk for the whole RPC Reply

Some testing at a recent NFS bake-a-thon showed that servers can
mostly handle a. but there are some corner cases that do not work
yet. b. already works (it has to, to handle krb5i/p), but could be
somewhat less efficient. However, I expect this scenario to be very
rare -- no one has reported a problem yet.

So I'm going to implement b. Sometime later I will provide some
patches to help make b. a little more efficient by more carefully
choosing the Reply chunk's segment sizes to ensure the payload is
optimally aligned.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
parent ec482cc1
@@ -164,6 +164,21 @@ static bool rpcrdma_results_inline(struct rpcrdma_xprt *r_xprt,
 	return rqst->rq_rcv_buf.buflen <= ia->ri_max_inline_read;
 }
 
+/* The client is required to provide a Reply chunk if the maximum
+ * size of the non-payload part of the RPC Reply is larger than
+ * the inline threshold.
+ */
+static bool
+rpcrdma_nonpayload_inline(const struct rpcrdma_xprt *r_xprt,
+			  const struct rpc_rqst *rqst)
+{
+	const struct xdr_buf *buf = &rqst->rq_rcv_buf;
+	const struct rpcrdma_ia *ia = &r_xprt->rx_ia;
+
+	return buf->head[0].iov_len + buf->tail[0].iov_len <
+		ia->ri_max_inline_read;
+}
+
 /* Split @vec on page boundaries into SGEs. FMR registers pages, not
  * a byte range. Other modes coalesce these SGEs into a single MR
  * when they can.
@@ -762,7 +777,8 @@ rpcrdma_marshal_req(struct rpcrdma_xprt *r_xprt, struct rpc_rqst *rqst)
 	 */
 	if (rpcrdma_results_inline(r_xprt, rqst))
 		wtype = rpcrdma_noch;
-	else if (ddp_allowed && rqst->rq_rcv_buf.flags & XDRBUF_READ)
+	else if ((ddp_allowed && rqst->rq_rcv_buf.flags & XDRBUF_READ) &&
+		 rpcrdma_nonpayload_inline(r_xprt, rqst))
 		wtype = rpcrdma_writech;
 	else
 		wtype = rpcrdma_replych;
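
To make the decision concrete, below is a small user-space model of the
chunk-type selection this patch changes. Everything here is a sketch: the
reply_shape struct, the choose_wtype() helper, and the threshold and buffer
sizes in main() are invented for illustration; only the comparisons mirror
the kernel logic.

/* A minimal user-space model of the Write-chunk vs. Reply-chunk
 * decision. All names and sizes are hypothetical; only the
 * comparisons mirror the kernel logic in this patch.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

enum chunktype { rpcrdma_noch, rpcrdma_writech, rpcrdma_replych };

struct reply_shape {
	size_t buflen;           /* total size of the reply receive buffer */
	size_t head_len;         /* non-payload XDR before the data items */
	size_t tail_len;         /* non-payload XDR after the data items */
	bool   ddp_allowed;      /* direct data placement is permitted */
	bool   has_read_payload; /* XDRBUF_READ would be set on rq_rcv_buf */
};

static enum chunktype
choose_wtype(const struct reply_shape *r, size_t max_inline_read)
{
	/* Whole reply fits inline: no chunk needed at all. */
	if (r->buflen <= max_inline_read)
		return rpcrdma_noch;

	/* A Write chunk carries only the payload, so it helps only
	 * when the remaining non-payload part still fits inline.
	 * This second test is what the patch adds.
	 */
	if (r->ddp_allowed && r->has_read_payload &&
	    r->head_len + r->tail_len < max_inline_read)
		return rpcrdma_writech;

	/* Otherwise the whole reply goes into a Reply chunk (option b.). */
	return rpcrdma_replych;
}

int main(void)
{
	/* Hypothetical READ reply with a several-kilobyte non-payload
	 * part, as in the commit message's GETATTR + Kerberos example.
	 */
	struct reply_shape big_nonpayload = {
		.buflen = 8192,
		.head_len = 5000,
		.tail_len = 0,
		.ddp_allowed = true,
		.has_read_payload = true,
	};

	/* With an assumed 4KB inline threshold, the non-payload part
	 * alone exceeds the threshold, so a Write chunk cannot be
	 * used (prints 2, i.e. rpcrdma_replych).
	 */
	printf("wtype = %d\n", choose_wtype(&big_nonpayload, 4096));
	return 0;
}

Compiled as an ordinary C program, this prints "wtype = 2": the non-payload
part alone exceeds the assumed inline threshold, so the Write chunk path is
skipped and the whole reply is provisioned as a Reply chunk, which is exactly
option b. above.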