1. 20 Sep, 2024 22 commits
  2. 01 Sep, 2024 13 commits
    • nfsd: move nfsd_pool_stats_open into nfsctl.c · c9f10f81
      NeilBrown authored
      nfsd_pool_stats_open() is used in nfsctl.c, so move it there.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • SUNRPC: make various functions static, or not exported. · f2b27e1d
      NeilBrown authored
      Various functions are only used within the sunrpc module, and several
      are only used in the one file.  So clean up:
      
      These are marked static, and any EXPORT is removed.
        svc_rpcb_setup()
        svc_rqst_alloc()
        svc_rqst_free()  - also moved before first use
        svc_rpcbind_set_version()
        svc_drop() - also moved to svc.c
      
      These are now not EXPORTed, but are not static.
        svc_authenticate()
        svc_sock_update_bufs()
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • lockd: discard nlmsvc_timeout · 4ed9ef32
      NeilBrown authored
      nlmsvc_timeout always has the same value as (nlm_timeout * HZ), so use
      that in the one place that nlmsvc_timeout is used.
      
      In truth it *might* not always be the same, as nlmsvc_timeout is only set
      when lockd is started while nlm_timeout can be set at any time via
      sysctl.  I think this difference is not helpful, so removing it is good.
      
      Also remove the test for nlm_timeout being 0.  This is not possible -
      unless a module parameter is used to set the minimum timeout to 0, and
      if that happens then it probably should be honoured.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
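The difference described in the lockd commit above, between a tick value cached at start-up and one derived from the tunable at the point of use, can be sketched in plain C. This is only an illustration under stated assumptions: `TOY_HZ`, `struct toy_lockd` and the `toy_*` functions are hypothetical names and are not the lockd code.

```c
/* Hypothetical tick rate standing in for the kernel's HZ. */
#define TOY_HZ 250UL

struct toy_lockd {
    unsigned long timeout_secs;  /* plays the role of nlm_timeout (tunable at any time) */
    unsigned long cached_ticks;  /* plays the role of the removed nlmsvc_timeout */
};

/* Old scheme: the derived value is computed once, when the service starts. */
void toy_lockd_start(struct toy_lockd *l)
{
    l->cached_ticks = l->timeout_secs * TOY_HZ;
}

/* Old scheme: later readers see whatever was cached at start-up. */
unsigned long toy_timeout_cached(const struct toy_lockd *l)
{
    return l->cached_ticks;
}

/* New scheme: derive the value from the tunable each time it is needed. */
unsigned long toy_timeout_live(const struct toy_lockd *l)
{
    return l->timeout_secs * TOY_HZ;
}
```

In this sketch, changing `timeout_secs` after `toy_lockd_start()` leaves `toy_timeout_cached()` stale while `toy_timeout_live()` follows the new setting, which is the discrepancy the commit message says it removes.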
    • nfsd: don't EXPORT_SYMBOL nfsd4_ssc_init_umount_work() · 8203ab8a
      NeilBrown authored
      nfsd4_ssc_init_umount_work() is only used in the nfsd module, so there
      is no need to EXPORT it.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • NFS: trace: show TIMEDOUT instead of 0x6e · cef48236
      Chen Hanxiao authored
      __nfs_revalidate_inode may return ETIMEDOUT.
      
      Print the symbol for ETIMEDOUT in the NFS trace:
      
      before:
      cat-5191 [005] 119.331127: nfs_revalidate_inode_exit: error=-110 (0x6e)
      
      after:
      cat-1738 [004] 44.365509: nfs_revalidate_inode_exit: error=-110 (TIMEDOUT)
      Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
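The two trace lines in the commit above differ only in how the same value is rendered: 110 in decimal is 0x6e in hexadecimal, which is the Linux value of ETIMEDOUT. A userspace sketch of that kind of symbolic lookup with a hex fallback might look like the following. The table, the `TOY_*` constants and `toy_format_error()` are hypothetical and only mimic the output format shown in the commit message; kernel tracepoints use their own macros for this.

```c
#include <stdio.h>
#include <stddef.h>

/* Linux errno values, hard-coded so the sketch behaves the same on any platform. */
#define TOY_EIO        5
#define TOY_ETIMEDOUT  110  /* 0x6e */

struct toy_sym {
    int value;
    const char *name;
};

/* Hypothetical symbol table; a value missing from it falls back to hex. */
static const struct toy_sym toy_syms[] = {
    { TOY_EIO,       "IO" },
    { TOY_ETIMEDOUT, "TIMEDOUT" },
};

/*
 * Render a negative error code as "error=-110 (TIMEDOUT)" when the value is
 * in the table, or as "error=-110 (0x6e)" when it is not.
 * Returns what snprintf() returns.
 */
int toy_format_error(char *buf, size_t len, int error)
{
    int code = -error;

    for (size_t i = 0; i < sizeof(toy_syms) / sizeof(toy_syms[0]); i++) {
        if (toy_syms[i].value == code)
            return snprintf(buf, len, "error=%d (%s)", error, toy_syms[i].name);
    }
    return snprintf(buf, len, "error=%d (0x%x)", error, (unsigned int)code);
}
```

Removing the `TOY_ETIMEDOUT` row from the table reproduces the "before" rendering for -110, so in this sketch the whole change amounts to one added table entry.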
    • nfsd: use system_unbound_wq for nfsd_file_gc_worker() · 4b84551a
      Youzhong Yang authored
      After many rounds of changes in filecache.c, the fix made by commit
      ce7df055 ("NFSD: Make the file_delayed_close workqueue UNBOUND")
      is gone, and we are now getting syslog messages like these:
      
      [ 1618.186688] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 4 times, consider switching to WQ_UNBOUND
      [ 1638.661616] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 8 times, consider switching to WQ_UNBOUND
      [ 1665.284542] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 16 times, consider switching to WQ_UNBOUND
      [ 1759.491342] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 32 times, consider switching to WQ_UNBOUND
      [ 3013.012308] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 64 times, consider switching to WQ_UNBOUND
      [ 3154.172827] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 128 times, consider switching to WQ_UNBOUND
      [ 3422.461924] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 256 times, consider switching to WQ_UNBOUND
      [ 3963.152054] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 512 times, consider switching to WQ_UNBOUND
      
      Consider using system_unbound_wq instead of system_wq for
      nfsd_file_gc_worker().
      Signed-off-by: Youzhong Yang <youzhong@gmail.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • nfsd: count nfsd_file allocations · 700bb4ff
      Jeff Layton authored
      We already count the frees (via nfsd_file_releases). Count the
      allocations as well. Also switch the direct call to nfsd_file_slab_free
      in nfsd_file_do_acquire to nfsd_file_free, so that the allocs and
      releases match up.
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • nfsd: fix refcount leak when file is unhashed after being found · 8a792617
      Jeff Layton authored
      If we wait_for_construction and find that the file is no longer hashed,
      and we're going to retry the open, the old nfsd_file reference is
      currently leaked. Put the reference before retrying.
      
      Fixes: c6593366 ("nfsd: don't kill nfsd_files because of lease break error")
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
      Tested-by: Youzhong Yang <youzhong@gmail.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • nfsd: remove unneeded EEXIST error check in nfsd_do_file_acquire · 81a95c2b
      Jeff Layton authored
      Given that we do the search and insertion while holding the i_lock, I
      don't think it's possible for us to get EEXIST here. Remove this case.
      
      Fixes: c6593366 ("nfsd: don't kill nfsd_files because of lease break error")
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
      Tested-by: Youzhong Yang <youzhong@gmail.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • nfsd: add list_head nf_gc to struct nfsd_file · 8e6e2ffa
      Youzhong Yang authored
      nfsd_file_put() in one thread can race with another thread doing
      garbage collection (running nfsd_file_gc() -> list_lru_walk() ->
      nfsd_file_lru_cb()):
      
        * In nfsd_file_put(), nf->nf_ref is 1, so it tries to do nfsd_file_lru_add().
        * nfsd_file_lru_add() returns true (with NFSD_FILE_REFERENCED bit set)
        * garbage collector kicks in, nfsd_file_lru_cb() clears REFERENCED bit and
          returns LRU_ROTATE.
        * garbage collector kicks in again, nfsd_file_lru_cb() now decrements nf->nf_ref
          to 0, runs nfsd_file_unhash(), removes it from the LRU and adds to the dispose
          list [list_lru_isolate_move(lru, &nf->nf_lru, head)]
        * nfsd_file_put() detects NFSD_FILE_HASHED bit is cleared, so it tries to remove
          the 'nf' from the LRU [if (!nfsd_file_lru_remove(nf))]. The 'nf' has been added
          to the 'dispose' list by nfsd_file_lru_cb(), so nfsd_file_lru_remove(nf) simply
          treats it as part of the LRU and removes it, which leads to its removal from
          the 'dispose' list.
        * At this moment, 'nf' is unhashed with its nf_ref being 0, and not on the LRU.
          nfsd_file_put() continues its execution [if (refcount_dec_and_test(&nf->nf_ref))].
          As nf->nf_ref is already 0, it is set to REFCOUNT_SATURATED, and the 'nf'
          gets no chance of being freed.
      
      nfsd_file_put() can also race with nfsd_file_cond_queue():
        * In nfsd_file_put(), nf->nf_ref is 1, so it tries to do nfsd_file_lru_add().
        * nfsd_file_lru_add() sets REFERENCED bit and returns true.
        * Some userland application runs 'exportfs -f' or something like that, which triggers
          __nfsd_file_cache_purge() -> nfsd_file_cond_queue().
        * In nfsd_file_cond_queue(), it runs [if (!nfsd_file_unhash(nf))], unhash is done
          successfully.
        * nfsd_file_cond_queue() runs [if (!nfsd_file_get(nf))], now nf->nf_ref goes to 2.
        * nfsd_file_cond_queue() runs [if (nfsd_file_lru_remove(nf))], it succeeds.
        * nfsd_file_cond_queue() runs [if (refcount_sub_and_test(decrement, &nf->nf_ref))]
          (with "decrement" being 2), so the nf->nf_ref goes to 0, the 'nf' is added to the
          dispose list [list_add(&nf->nf_lru, dispose)]
        * nfsd_file_put() detects NFSD_FILE_HASHED bit is cleared, so it tries to remove
          the 'nf' from the LRU [if (!nfsd_file_lru_remove(nf))]. Although the 'nf' is not
          in the LRU, it is linked in the 'dispose' list, and nfsd_file_lru_remove() simply
          treats it as part of the LRU and removes it. This leads to its removal from
          the 'dispose' list!
        * Now nf->nf_ref is 0 and the 'nf' is unhashed. nfsd_file_put() continues its
          execution and sets nf->nf_ref to REFCOUNT_SATURATED.
      
      As shown in the above analysis, using nf_lru for both the LRU list and dispose list
      can cause leaks. This patch adds a new list_head nf_gc in struct nfsd_file, and uses
      it for the dispose list. This does not fix the nfsd_file leaking issue completely.
      Signed-off-by: Youzhong Yang <youzhong@gmail.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
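The core hazard in the analysis above is that unlinking a list node does not check which list the node is currently on. The following userspace sketch models that with a minimal doubly linked list. `struct toy_file`, `lru_link`, `gc_link` and the `toy_*` functions are hypothetical stand-ins (for struct nfsd_file, nf_lru and nf_gc respectively) and the list helpers are simplified re-implementations, not the kernel's own code; refcounts and locking are left out entirely.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled loosely on the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
    return h->next == h;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink and re-initialise. Note that it cannot tell which list the entry is on. */
static void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    list_init(entry);
}

/* Hypothetical stand-in for struct nfsd_file. */
struct toy_file {
    struct list_head lru_link;  /* plays the role of nf_lru */
    struct list_head gc_link;   /* plays the role of nf_gc */
};

/* "Remove from the LRU": unlinks lru_link from wherever it happens to be linked. */
static bool toy_lru_remove(struct toy_file *f)
{
    if (list_empty(&f->lru_link))
        return false;
    list_del_init(&f->lru_link);
    return true;
}

/*
 * Replays the tail end of the race: the file has already been queued on the
 * dispose list, then a late caller tries to take it off the LRU.
 * Returns true if the file is still on the dispose list afterwards.
 */
bool toy_survives_on_dispose(bool shared_link)
{
    struct list_head dispose;
    struct toy_file f;

    list_init(&dispose);
    list_init(&f.lru_link);
    list_init(&f.gc_link);

    /* Garbage collector side: queue the file for disposal. */
    list_add(shared_link ? &f.lru_link : &f.gc_link, &dispose);

    /* Racing put side: believes it is removing the file from the LRU. */
    toy_lru_remove(&f);

    return !list_empty(&dispose);
}
```

With `shared_link` set, the late removal silently empties the dispose list, so nothing would ever free the file. With a separate link, the removal is a no-op and the file stays queued. As the commit message notes, the real patch does not claim to close every leak.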
    • Linux 6.11-rc6 · 431c1646
      Linus Torvalds authored
    • Merge tag 'v6.11-rc5-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 6b9ffc45
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
      
       - copy_file_range fix
      
       - two read fixes including read past end of file rc fix and read retry
         crediting fix
      
       - falloc zero range fix
      
      * tag 'v6.11-rc5-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Fix FALLOC_FL_ZERO_RANGE to preflush buffered part of target region
        cifs: Fix copy offload to flush destination region
        netfs, cifs: Fix handling of short DIO read
        cifs: Fix lack of credit renegotiation on read retry
    • Merge tag 'bcachefs-2024-08-21' of https://github.com/koverstreet/bcachefs · a4c76312
      Linus Torvalds authored
      Pull bcachefs fixes from Kent Overstreet:
       "The data corruption in the buffered write path is troubling; inode
        lock should not have been able to cause that...
      
         - Fix a rare data corruption in the rebalance path, caught as a nonce
           inconsistency on encrypted filesystems
      
         - Revert lockless buffered write path
      
         - Mark more errors as autofix"
      
      * tag 'bcachefs-2024-08-21' of https://github.com/koverstreet/bcachefs:
        bcachefs: Mark more errors as autofix
        bcachefs: Revert lockless buffered IO path
        bcachefs: Fix bch2_extents_match() false positive
        bcachefs: Fix failure to return error in data_update_index_update()
  3. 31 Aug, 2024 5 commits