1. 20 May, 2024 8 commits
    • sunrpc: fix NFSACL RPC retry on soft mount · 0dc9f430
      Dan Aloni authored
      Until 1b63a751 ('SUNRPC: Refactor rpc_clone_client()'), in 2012,
      `cl_timeout` was copied in so that all mount parameters propagated to
      NFSACL clients. However, since that change, if the following mount
      options are given:
      
          soft,timeo=50,retrans=16,vers=3
      
      The resultant NFSACL client receives:
      
          cl_softrtry: 1
          cl_timeout: to_initval=60000, to_maxval=60000, to_increment=0, to_retries=2, to_exponential=0
      
      With these values, NFSACL operations are not retried during transient
      network outages on a soft mount. Instead, the getacl call fails after
      60 seconds with EIO.
      
      The simple fix is to pass the existing client's `cl_timeout` as the new
      client timeout.
      
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Benjamin Coddington <bcodding@redhat.com>
      Link: https://lore.kernel.org/all/20231105154857.ryakhmgaptq3hb6b@gmail.com/T/
      Fixes: 1b63a751 ('SUNRPC: Refactor rpc_clone_client()')
      Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
      Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
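The timeout values quoted in the message above can be explored with a small user-space model. This is only a sketch: the struct mirrors the field names printed in the message, but `struct timeout_model`, `major_timeout()`, and the capping rule are assumptions made for illustration, not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the timeout fields quoted above.
 * Units are whatever the caller uses consistently. */
struct timeout_model {
    unsigned long to_initval;    /* first retransmit timeout */
    unsigned long to_maxval;     /* upper bound on the timeout */
    unsigned long to_increment;  /* linear backoff step */
    unsigned int  to_retries;    /* retransmits before a major timeout */
    bool          to_exponential;
};

/* Assumed rule: time until a "major" timeout, capped at to_maxval. */
unsigned long major_timeout(const struct timeout_model *to)
{
    unsigned long t;

    assert(to != NULL);
    if (to->to_exponential)
        t = to->to_initval << to->to_retries;
    else
        t = to->to_initval + to->to_increment * to->to_retries;
    if (t == 0 || t > to->to_maxval)
        t = to->to_maxval;
    return t;
}
```

Under this model, the values from the message (to_initval=60000, to_maxval=60000, to_increment=0, to_retries=2) give a major timeout equal to the initial timeout, so the first timeout is also the last one and no retransmit fits in between.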
    • SUNRPC: fix handling expired GSS context · 9b62ef6d
      Olga Kornievskaia authored
      In the case where we have received a successful reply to an RPC request,
      but while processing the reply the client in rpc_decode_header() finds
      an expired context, the code ends up propagating the error to the caller
      instead of getting a new context and retrying the request.
      
      In more detail: rpc_decode_header() calls rpcauth_checkverf(), which
      calls into GSS and eventually reaches gss_validate(). That function
      checks whether the current context's lifetime has expired and fails if
      it has. The reason for the failure gets 'scrubbed' and translated to
      EACCES, so back in rpc_decode_header() we just go to "out_verifier",
      which for that error is converted to "out_garbage" (i.e. it is treated
      as a garbled reply), and the next action is call_encode. That step
      does not re-encode or re-send the request (and no upcall happens,
      because the fact that the context expired is not known), so the same
      decoding process fails again. After 3 retries the error is
      propagated back to the caller (i.e. nfs4_write_done_cb() in the case
      of a failing write).
      
      To fix this, consider the case where the server decides that the
      context has expired and replies with an RPC auth error. In that case,
      rpc_decode_header() goes to "out_msg_denied" and returns EKEYREJECTED,
      which call_decode() sends to "call_reserve", triggering an upcall and
      a retry of the operation.
      
      The proposed fix is, when rpc_decode_header() fails, to check whether
      the credentials were marked invalid, use that as a proxy for deciding
      that the context has expired, and then treat it the same way as
      receiving an auth error.
      Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
      Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
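The decision described in the message above can be sketched as a tiny state-selection function. Everything here is a hypothetical model: `enum decode_result`, `enum next_step`, `struct reply_model`, and `after_decode()` are illustrative names, and the real logic lives in the SUNRPC client state machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of header decoding in this model. */
enum decode_result {
    DECODE_OK,            /* reply verified */
    DECODE_AUTH_REJECTED, /* server replied with an RPC auth error */
    DECODE_BAD_VERIFIER   /* local verifier check failed */
};

enum next_step { STEP_DONE, STEP_RETRY_SAME_CRED, STEP_REFRESH_AND_RETRY };

struct reply_model {
    enum decode_result result;
    bool cred_invalid; /* credential was marked invalid during verification */
};

/* Pick the next step after decoding the reply header. */
enum next_step after_decode(const struct reply_model *r)
{
    assert(r != NULL);
    if (r->result == DECODE_OK)
        return STEP_DONE;
    if (r->result == DECODE_AUTH_REJECTED)
        return STEP_REFRESH_AND_RETRY;
    /* Proposed fix: an invalidated credential is a proxy for an expired
     * context, so handle it like a server-reported auth error. */
    if (r->cred_invalid)
        return STEP_REFRESH_AND_RETRY;
    /* Otherwise treat the reply as garbled and retry as before. */
    return STEP_RETRY_SAME_CRED;
}
```

Without the `cred_invalid` check, a locally detected expiry would fall through to the garbled-reply path, which is the failure mode the message describes.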
    • nfs: keep server info for remounts · b322bf9e
      Martin Kaiser authored
      With newer kernels that use fs_context for nfs mounts, remounts fail with
      -EINVAL.
      
      $ mount -t nfs -o nolock 10.0.0.1:/tmp/test /mnt/test/
      $ mount -t nfs -o remount /mnt/test/
      mount: mounting 10.0.0.1:/tmp/test on /mnt/test failed: Invalid argument
      
      For remounts, the nfs server address and port are populated by
      nfs_init_fs_context and later overwritten with 0x00 bytes by
      nfs23_parse_monolithic. The remount then fails as the server address is
      invalid.
      
      Fix this by not overwriting nfs server info in nfs23_parse_monolithic if
      we're doing a remount.
      
      Fixes: f2aedb71 ("NFS: Add fs_context support.")
      Signed-off-by: Martin Kaiser <martin@kaiser.cx>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
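The shape of the remount fix above can be illustrated with a simplified model. `struct ctx_model` and `parse_monolithic_model()` are hypothetical stand-ins, and the sketch only models the overwrite with 0x00 bytes that the message describes, not the actual parsing of mount data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the parts of the mount context involved. */
struct ctx_model {
    bool is_remount;
    unsigned char server_addr[16]; /* server address bytes */
    unsigned short server_port;
};

/* Models the parsing step: on a fresh mount the server info is
 * overwritten; on a remount it is left as populated at init time. */
void parse_monolithic_model(struct ctx_model *ctx)
{
    assert(ctx != NULL);
    if (ctx->is_remount)
        return; /* keep the existing address and port */
    memset(ctx->server_addr, 0, sizeof(ctx->server_addr));
    ctx->server_port = 0;
}
```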
    • NFSv4: Fixup smatch warning for ambiguous return · 37ffe065
      Benjamin Coddington authored
      Dan Carpenter reports a smatch warning for nfs4_try_migration() when a
      memory allocation failure results in a zero return value.  In this case,
      a transient allocation failure will likely be retried the next time the
      server responds with NFS4ERR_MOVED.
      
      We can fix up the smatch warning with a small refactor: attempt all three
      allocations before testing and returning on a failure.
      Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
      Fixes: c3ed2227 ("NFSv4: Fix free of uninitialized nfs4_label on referral lookup.")
      Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
      Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
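The "attempt all three allocations before testing" pattern mentioned above looks roughly like the following in plain user-space C. The names `struct bufs`, `alloc_all()`, and `free_all()` are hypothetical, and returning -ENOMEM on failure is a choice made for this sketch; the message does not state which value the kernel function returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bundle of three buffers that must all exist to proceed. */
struct bufs {
    void *a;
    void *b;
    void *c;
};

/* Attempt all three allocations first, then test once. Sizes are
 * expected to be nonzero. On failure nothing is leaked and the return
 * value is a single, unambiguous error code (a choice of this sketch). */
int alloc_all(struct bufs *out, size_t na, size_t nb, size_t nc)
{
    assert(out != NULL);
    out->a = malloc(na);
    out->b = malloc(nb);
    out->c = malloc(nc);
    if (out->a == NULL || out->b == NULL || out->c == NULL) {
        free(out->a); /* free(NULL) is a no-op */
        free(out->b);
        free(out->c);
        out->a = out->b = out->c = NULL;
        return -ENOMEM;
    }
    return 0;
}

/* Release everything and reset the pointers. */
void free_all(struct bufs *b)
{
    assert(b != NULL);
    free(b->a);
    free(b->b);
    free(b->c);
    b->a = b->b = b->c = NULL;
}
```

A single test after all allocations keeps one exit path for the failure case, which is easier for a static checker to reason about than three separate early returns.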
    • NFS: make sure lock/nolock overriding local_lock mount option · bf95f82e
      Chen Hanxiao authored
      Currently, the lock/nolock and local_lock mount options may
      override each other's NFS_MOUNT_LOCAL_FLOCK and NFS_MOUNT_LOCAL_FCNTL
      flags depending on the order in which they are passed:
      
      mount -o vers=3,local_lock=all,lock:
      	local_lock=none
      
      mount -o vers=3,lock,local_lock=all:
      	local_lock=all
      
      This patch lets lock/nolock override the local_lock option,
      as nfs(5) suggests.
      Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
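One way to make the result independent of option order, as the message above asks for, is to remember that lock/nolock was given explicitly. The sketch below is a hypothetical model: the flag values, `struct lock_ctx`, `apply_lock()`, and `apply_local_lock()` are invented for illustration, and the assumption that nolock sets both local-lock flags belongs to this model rather than to the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this model (not the kernel's values). */
#define M_NONLM        0x1u
#define M_LOCAL_FLOCK  0x2u
#define M_LOCAL_FCNTL  0x4u

struct lock_ctx {
    unsigned int flags;
    bool lock_given; /* lock or nolock was seen explicitly */
};

/* Apply "lock" (negated == false) or "nolock" (negated == true). */
void apply_lock(struct lock_ctx *ctx, bool negated)
{
    assert(ctx != NULL);
    ctx->lock_given = true;
    if (negated)
        ctx->flags |= M_NONLM | M_LOCAL_FLOCK | M_LOCAL_FCNTL;
    else
        ctx->flags &= ~(M_NONLM | M_LOCAL_FLOCK | M_LOCAL_FCNTL);
}

/* Apply "local_lock=all" (all == true) or "local_lock=none". */
void apply_local_lock(struct lock_ctx *ctx, bool all)
{
    assert(ctx != NULL);
    if (ctx->lock_given)
        return; /* an explicit lock/nolock wins regardless of order */
    if (all)
        ctx->flags |= M_LOCAL_FLOCK | M_LOCAL_FCNTL;
    else
        ctx->flags &= ~(M_LOCAL_FLOCK | M_LOCAL_FCNTL);
}
```

In this model, applying local_lock=all then lock, or lock then local_lock=all, leaves the same flags set.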
    • NFS: add atomic_open for NFSv3 to handle O_TRUNC correctly. · 7c6c5249
      NeilBrown authored
      With two clients, each with NFSv3 mounts of the same directory, the sequence:
      
         client1            client2
        ls -l afile
                            echo hello there > afile
        echo HELLO > afile
        cat afile
      
      will show
         HELLO
         there
      
      because the O_TRUNC requested in the final 'echo' doesn't take effect.
      This is because the "Negative dentry, just create a file" section in
      lookup_open() assumes that the file *does* get created since the dentry
      was negative, so it sets FMODE_CREATED, and this causes do_open() to
      clear O_TRUNC and so the file doesn't get truncated.
      
      Even mounting with -o lookupcache=none does not help as
      nfs_neg_need_reval() always returns false if LOOKUP_CREATE is set.
      
      This patch fixes the problem by providing an atomic_open inode operation
      for NFSv3 (and v2).  The code is largely the code from the branch in
      lookup_open() when atomic_open is not provided.  The significant change
      is that the O_TRUNC flag is passed to a new nfs_do_create(), which adds
      'trunc' handling to nfs_create().
      
      With this change we also optimise away an unnecessary LOOKUP before the
      file is created.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
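The mechanism behind the lost truncation described above can be modelled in a few lines. This sketch shows the bug, not the fix: `struct open_model` and `effective_flags()` are hypothetical, with `created` standing in for FMODE_CREATED.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the open state discussed above. */
struct open_model {
    int  flags;   /* open(2) flags requested by the caller */
    bool created; /* stands in for FMODE_CREATED */
};

/* Models the final open step: if the file is believed to be newly
 * created, truncation is considered unnecessary and O_TRUNC is dropped. */
int effective_flags(const struct open_model *o)
{
    assert(o != NULL);
    return o->created ? (o->flags & ~O_TRUNC) : o->flags;
}
```

When a stale negative dentry makes the client believe it created the file, `created` is true even though the file already exists on the server, so the truncation request never reaches it.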
    • pNFS/filelayout: Specify the layout segment range in LAYOUTGET · 464b424f
      Anna Schumaker authored
      Move from only requesting full file layout segments to requesting layout
      segments that match our I/O size. This means the server is still free to
      return a full file layout if it wants, but partial layouts will no
      longer cause an error.
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
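The difference between requesting a whole-file layout and one that matches the I/O, as described above, can be sketched as follows. `struct layout_range`, `range_for_io()`, `covers()`, and the `WHOLE_FILE_LENGTH` sentinel are hypothetical names for this model and are not taken from the pNFS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WHOLE_FILE_LENGTH UINT64_MAX /* "to end of file" in this model */

struct layout_range {
    uint64_t offset;
    uint64_t length;
};

/* Request a segment matching the I/O instead of the whole file. */
struct layout_range range_for_io(uint64_t io_offset, uint64_t io_len)
{
    struct layout_range r = { io_offset, io_len };
    return r;
}

/* True if segment seg covers the byte range req. A segment of length
 * WHOLE_FILE_LENGTH covers everything from its offset onward. Bounded
 * ranges are assumed not to overflow offset + length. */
bool covers(const struct layout_range *seg, const struct layout_range *req)
{
    assert(seg != NULL && req != NULL);
    if (req->offset < seg->offset)
        return false;
    if (seg->length == WHOLE_FILE_LENGTH)
        return true;
    if (req->length == WHOLE_FILE_LENGTH)
        return false; /* a bounded segment cannot cover an unbounded request */
    return req->offset + req->length <= seg->offset + seg->length;
}
```

With a range-based request, both a full-file reply and a partial reply that covers the I/O are acceptable, which matches the behaviour the message describes.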
    • pNFS/filelayout: Remove the whole file layout requirement · 9c75576e
      Anna Schumaker authored
      Layout segments have been supported in pNFS for years, so remove the
      requirement that the server always sends whole file layouts.
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
  2. 12 May, 2024 5 commits
  3. 11 May, 2024 10 commits
  4. 10 May, 2024 17 commits