1. 07 Feb, 2020 15 commits
  2. 04 Feb, 2020 4 commits
    • NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals · 7dc2993a
      Robert Milkowski authored
      Currently, each time nfs4_do_fsinfo() is called it will do an implicit
      NFS4 lease renewal, which is not compliant with the NFS4 specification.
      This can result in a lease being expired by an NFS server.
      
      Commit 83ca7f5a ("NFS: Avoid PUTROOTFH when managing leases")
      introduced implicit client lease renewal in nfs4_do_fsinfo(),
      which can cause the NFSv4.0 lease to expire on the server side,
      with servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
      
      This can easily be reproduced by frequently unmounting a sub-mount,
      then stat'ing it to get it mounted again, which will delay or even
      completely prevent the client from sending RENEW operations if no
      other NFS operations are issued. Eventually the NFS server will expire
      the client's lease and return an error on file access or the next RENEW.
      
      This can also happen when a sub-mount is automatically unmounted
      due to inactivity (after nfs_mountpoint_expiry_timeout) and is then
      mounted again via stat(). This can result in a short window during
      which the client's lease will expire on the server but not on the
      client. This specific case was observed on production systems.
      
      This patch removes the implicit lease renewal from nfs4_do_fsinfo().
      
      Fixes: 83ca7f5a ("NFS: Avoid PUTROOTFH when managing leases")
      Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
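The drift between the client's and the server's view of the lease can be modelled in userspace. The sketch below is not kernel code: `struct lease_model`, `do_fsinfo_like_op()` and the other names are hypothetical, and it assumes the server does not count the fsinfo-style call as a renewal. It only illustrates why bumping the client-side timestamp on such a call can hide an expired lease.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct lease_model {
    long lease_time;          /* lease duration in seconds */
    long client_last_renewal; /* when the client believes the lease was renewed */
    long server_last_renewal; /* when the server last renewed the lease */
};

/* A call that both sides treat as a renewal (for example RENEW). */
static void do_renewing_op(struct lease_model *m, long now)
{
    m->client_last_renewal = now;
    m->server_last_renewal = now;
}

/*
 * A call that the server is assumed NOT to treat as a renewal.
 * With implicit_renewal set, the client still bumps its own timestamp,
 * which is the behaviour the commit removes.
 */
static void do_fsinfo_like_op(struct lease_model *m, long now, bool implicit_renewal)
{
    if (implicit_renewal)
        m->client_last_renewal = now;
}

/* Repeat the remount-and-stat pattern every `step` seconds until `end`. */
static void remount_loop(struct lease_model *m, long step, long end, bool implicit_renewal)
{
    assert(step > 0);
    for (long t = step; t <= end; t += step)
        do_fsinfo_like_op(m, t, implicit_renewal);
}

static bool client_thinks_valid(const struct lease_model *m, long now)
{
    return now - m->client_last_renewal < m->lease_time;
}

static bool server_thinks_valid(const struct lease_model *m, long now)
{
    return now - m->server_last_renewal < m->lease_time;
}
```

In this model, with a 90-second lease, one real renewal at t=0 and a remount every 30 seconds, the client still considers the lease valid at t=120 when implicit renewal is enabled, while the server does not. With implicit renewal disabled, the client also sees the lease as stale and would know to renew it.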
    • NFSv4: try lease recovery on NFS4ERR_EXPIRED · 924491f2
      Robert Milkowski authored
      Currently, if an NFS server returns NFS4ERR_EXPIRED to open(),
      we return EIO to applications without even trying to recover.
      
      Fixes: 272289a3 ("NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid")
      Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
      Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • NFS: Fix memory leaks · 123c23c6
      Wenwen Wang authored
      In _nfs42_proc_copy(), 'res->commit_res.verf' is allocated through
      kzalloc() if 'args->sync' is true. In the following code, if
      'res->synchronous' is false, handle_async_copy() will be invoked. If an
      error occurs during the invocation, the following code will not be executed
      and the error will be returned. However, the allocated
      'res->commit_res.verf' is not deallocated, leading to a memory leak. This
      is also true if the invocation of process_copy_commit() returns an error.
      
      To fix the above leaks, redirect the execution to the 'out' label if an
      error is encountered.
      Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
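The "redirect to the 'out' label" fix follows the common single-exit cleanup pattern. The sketch below is a userspace illustration of that pattern, not the kernel function: `struct copy_res`, `do_copy()`, `later_step()` and the allocation counter are hypothetical stand-ins, and `calloc()`/`free()` replace `kzalloc()`/`kfree()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's structures or functions. */
struct copy_res {
    void *verf;       /* models res->commit_res.verf */
    bool synchronous; /* models res->synchronous */
};

static int live_allocations; /* outstanding buffers, tracked for demonstration */

static void *tracked_alloc(size_t n)
{
    void *p = calloc(1, n);
    if (p)
        live_allocations++;
    return p;
}

static void tracked_free(void *p)
{
    if (p) {
        assert(live_allocations > 0);
        live_allocations--;
        free(p);
    }
}

/* Stand-in for a later step (such as the async handler) that may fail. */
static int later_step(int simulated_error)
{
    return simulated_error;
}

static int do_copy(struct copy_res *res, bool sync, int simulated_error)
{
    int status = 0;

    res->verf = NULL;
    if (sync) {
        res->verf = tracked_alloc(16);
        if (!res->verf)
            return -ENOMEM;
    }

    if (!res->synchronous) {
        status = later_step(simulated_error);
        if (status)
            goto out; /* a bare "return status" here would leak res->verf */
    }

    /* ... further processing would go here ... */
out:
    tracked_free(res->verf);
    res->verf = NULL;
    return status;
}
```

Routing every error through one label keeps the release of the buffer in a single place, so a newly added failure path cannot skip it by accident.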
    • nfs: optimise readdir cache page invalidation · 227823d2
      Dai Ngo authored
      When a large directory is being modified by one client while
      another client runs 'ls -l' on it, the cache page invalidation
      in nfs_force_use_readdirplus causes the reading client to keep
      restarting READDIRPLUS from cookie 0. As a result, 'ls -l' takes
      a very long time to complete and may never complete.
      
      Currently when nfs_force_use_readdirplus is called to switch from
      READDIR to READDIRPLUS, it invalidates all the cached pages of the
      directory. This cache page invalidation causes the next nfs_readdir
      to re-read the directory content from cookie 0.
      
      This patch optimises the cache invalidation in
      nfs_force_use_readdirplus by truncating only the cached pages from
      the last page index accessed to the end of the file. It also marks
      the inode so that invalidating all the cached pages of the directory
      is delayed until the initial nfs_readdir of the next 'ls' instance.
      Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
      Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      [Anna - Fix conflicts with Trond's readdir patches]
      [Anna - Remove redundant call to nfs_zap_mapping()]
      [Anna - Replace d_inode(file_dentry(desc->file)) with file_inode(desc->file)]
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
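The difference between dropping every cached page and dropping only the tail can be shown with a toy model. The sketch below is hypothetical userspace code: `struct dir_cache`, `invalidate_tail()` and `start_new_listing()` are invented names, the cache is a fixed array of flags, and the exact starting page of the truncation is left to the caller rather than taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8 /* hypothetical cache size for this model */

struct dir_cache {
    bool page_valid[NPAGES];      /* true if the page holds cached entries */
    size_t last_index;            /* page last accessed by the reader */
    bool invalidate_on_next_open; /* deferred full invalidation */
};

/* Behaviour described as the old one: drop every cached page. */
static void invalidate_all(struct dir_cache *c)
{
    for (size_t i = 0; i < NPAGES; i++)
        c->page_valid[i] = false;
}

/*
 * Behaviour described as the new one: drop pages from `first` to the end
 * and remember that a full invalidation is still owed.
 */
static void invalidate_tail(struct dir_cache *c, size_t first)
{
    assert(first <= NPAGES);
    for (size_t i = first; i < NPAGES; i++)
        c->page_valid[i] = false;
    c->invalidate_on_next_open = true;
}

/* Called when a new listing starts from the beginning of the directory. */
static void start_new_listing(struct dir_cache *c)
{
    if (c->invalidate_on_next_open) {
        invalidate_all(c);
        c->invalidate_on_next_open = false;
    }
    c->last_index = 0;
}

static size_t count_valid(const struct dir_cache *c)
{
    size_t n = 0;
    for (size_t i = 0; i < NPAGES; i++)
        n += c->page_valid[i] ? 1 : 0;
    return n;
}
```

In this model a reader that is already at page 5 keeps pages 0 to 4 and can continue from where it was, while the next listing that starts from the beginning still sees a fully invalidated cache.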
  3. 03 Feb, 2020 15 commits
  4. 24 Jan, 2020 3 commits
  5. 15 Jan, 2020 3 commits
    • NFSv3: FIx bug when using chacl and chmod to change acl · fe1e8dbe
      Su Yanjun authored
      We found a bug when running the following test under NFSv3:
      1) chacl u::r--,g::rwx,o:rw- file1
      2) chmod u+w file1
      3) chacl -l file1
      
      We expect u::rw-, but it shows u::r--; most likely it returns the
      ACL cached in the inode.

      Digging into the code, we found that the two code paths differ.
      
      chacl->..->__nfs3_proc_setacls->nfs_zap_acl_cache
      Then nfs_zap_acl_cache clears the NFS_INO_INVALID_ACL in
      NFS_I(inode)->cache_validity.
      
      chmod->..->nfs3_proc_setattr
      Because NFS_INO_INVALID_ACL has been cleared by the chacl path,
      nfs_zap_acl_cache won't be called.

      nfs_setattr_update_inode will set NFS_INO_INVALID_ACL, so call it
      before the nfs_zap_acl_cache call.
      Signed-off-by: Su Yanjun <suyanjun218@gmail.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
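The ordering problem can be reproduced with a small flag model. The sketch below is hypothetical userspace code that follows the description in the commit message: `struct inode_model`, `INVALID_ACL` and the two `setattr_*_order()` helpers are invented names, and the conditional zap is an assumption about how the check is made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INVALID_ACL 0x1u /* models NFS_INO_INVALID_ACL */

/* Hypothetical stand-in for the relevant inode state. */
struct inode_model {
    unsigned int cache_validity;
    bool acl_cached; /* true while a (possibly stale) ACL is cached */
};

/* Models nfs_zap_acl_cache: drop the cached ACL and clear the flag. */
static void zap_acl_cache(struct inode_model *m)
{
    assert(m != NULL);
    m->acl_cached = false;
    m->cache_validity &= ~INVALID_ACL;
}

/* Models the part of nfs_setattr_update_inode that sets the flag. */
static void setattr_update(struct inode_model *m)
{
    m->cache_validity |= INVALID_ACL;
}

/* Check the flag first, then update: the flag is set too late to matter. */
static void setattr_old_order(struct inode_model *m)
{
    if (m->cache_validity & INVALID_ACL)
        zap_acl_cache(m);
    setattr_update(m);
}

/* Update first, then check the flag: the stale ACL is dropped. */
static void setattr_new_order(struct inode_model *m)
{
    setattr_update(m);
    if (m->cache_validity & INVALID_ACL)
        zap_acl_cache(m);
}
```

Starting from a state where the flag is clear and an ACL is cached (the state left behind by the chacl path), the old order leaves the cached ACL in place, while the new order removes it.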
    • NFSv4.x recover from pre-mature loss of openstateid · d826e5b8
      Olga Kornievskaia authored
      Ever since commit 0e0cb35b, it's possible to lose an open stateid
      while retrying a CLOSE due to ERR_OLD_STATEID. Once that happens,
      operations that require an open stateid fail with EAGAIN, which is
      propagated to the application. Tests like generic/446 and generic/168
      then fail with "Resource temporarily unavailable".
      
      Instead of returning this error, initiate state recovery when possible to
      recover the open stateid and then try calling nfs4_select_rw_stateid()
      again.
      
      Fixes: 0e0cb35b ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE")
      Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
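The "recover, then try again" flow can be sketched as a small retry wrapper. This is a hypothetical userspace model, not the kernel implementation: `struct state_model`, `select_stateid()`, `recover_state()` and `select_with_recovery()` are invented names that stand in for the stateid selection and state recovery steps named in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the client's open state. */
struct state_model {
    bool have_open_stateid; /* false once the open stateid has been lost */
    bool can_recover;       /* whether recovery is possible at all */
    int recoveries;         /* number of recoveries performed */
};

/* Models stateid selection: fails with -EAGAIN if the stateid is gone. */
static int select_stateid(const struct state_model *s)
{
    assert(s != NULL);
    return s->have_open_stateid ? 0 : -EAGAIN;
}

/* Models state recovery: restores the open stateid when possible. */
static int recover_state(struct state_model *s)
{
    if (!s->can_recover)
        return -EIO;
    s->recoveries++;
    s->have_open_stateid = true;
    return 0;
}

/* Instead of returning -EAGAIN straight away, recover once and retry. */
static int select_with_recovery(struct state_model *s)
{
    int err = select_stateid(s);

    if (err != -EAGAIN)
        return err;
    if (recover_state(s) != 0)
        return err; /* recovery not possible: propagate the original error */
    return select_stateid(s);
}
```

The wrapper only hides the error when recovery succeeds, so a caller still sees `-EAGAIN` in the cases where nothing can be done.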
    • NFSv4 fix acl retrieval over krb5i/krb5p mounts · 62a1573f
      Olga Kornievskaia authored
      For krb5i and krb5p mounts, it was problematic to truncate the
      received ACL to the provided buffer because an integrity check
      could not be performed.
      
      Instead, provide enough pages to accommodate the largest buffer
      bounded by the largest RPC receive buffer size.
      
      Note: I don't think it's possible for the ACL to be truncated now.
      Thus the NFS4_ACL_TRUNC flag and related code could possibly be
      removed, but since I'm unsure, I'm leaving it.
      
      v2: needs +1 page.
      Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
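The sizing rule (enough pages for the largest receive buffer, plus the extra page noted for v2) comes down to a rounded-up division. The helper below is a hypothetical sketch of that arithmetic only: `acl_pages_needed()` and its parameters are invented for illustration, and how the patch actually derives the page count is not shown in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of pages needed to hold max_recv_bytes,
 * rounded up to whole pages, plus one extra page.
 */
static size_t acl_pages_needed(size_t max_recv_bytes, size_t page_size)
{
    assert(page_size > 0);
    return (max_recv_bytes + page_size - 1) / page_size + 1;
}
```

For example, with 4096-byte pages a 1 MiB receive buffer gives 256 pages from the division and 257 in total.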