1. 16 Apr, 2013 1 commit
    • J. Bruce Fields's avatar
      nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED · 3bd64a5b
      J. Bruce Fields authored
      A 4.1 server must notify a client that has had any state revoked using
      the SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag.  The client can figure
      out exactly which state is the problem using CHECK_STATEID and then free
      it using FREE_STATEID.  The status flag will be unset once all such
      revoked stateids are freed.
      
      Our server's only recallable state is delegations.  So we keep with each
      4.1 client a list of delegations that have timed out and been recalled,
      but haven't yet been freed by FREE_STATEID.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      3bd64a5b
  2. 09 Apr, 2013 10 commits
  3. 08 Apr, 2013 1 commit
    • J. Bruce Fields's avatar
      nfsd4: cleanup handling of nfsv4.0 closed stateid's · 9411b1d4
      J. Bruce Fields authored
      Closed stateid's are kept around a little while to handle close replays
      in the 4.0 case.  So we stash them in the last-used stateid in the
      oo_last_closed_stateid field of the open owner.  We can free that in
      encode_seqid_op_tail once the seqid on the open owner is next
      incremented.  But we don't want to do that on the close itself; so we
      set NFS4_OO_PURGE_CLOSE flag set on the open owner, skip freeing it the
      first time through encode_seqid_op_tail, then when we see that flag set
      next time we free it.
      
      This is unnecessarily baroque.
      
      Instead, just move the logic that increments the seqid out of the xdr
      code and into the operation code itself.
      
      The justification given for the current placement is that we need to
      wait till the last minute to be sure we know whether the status is a
      sequence-id-mutating error or not, but examination of the code shows
      that can't actually happen.
      Reported-by: default avatarYanchuan Nian <ycnian@gmail.com>
      Tested-by: default avatarYanchuan Nian <ycnian@gmail.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      9411b1d4
  4. 04 Apr, 2013 4 commits
  5. 03 Apr, 2013 24 commits
    • Alexey Khoroshilov's avatar
      SUNRPC/cache: add module_put() on error path in cache_open() · a7823c79
      Alexey Khoroshilov authored
      If kmalloc() fails in cache_open(), module cd->owner left locked.
      The patch adds module_put(cd->owner) on this path.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: default avatarAlexey Khoroshilov <khoroshilov@ispras.ru>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      a7823c79
    • fanchaoting's avatar
      nfsd: remove /proc/fs/nfs when create /proc/fs/nfs/exports error · ff7c4b36
      fanchaoting authored
      when create /proc/fs/nfs/exports error, we should remove /proc/fs/nfs,
      if don't do it, it maybe cause Memory leak.
      Signed-off-by: default avatarfanchaoting <fanchaoting@cn.fujitsu.com>
      Reviewed-by: default avatarchendt.fnst <chendt.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      ff7c4b36
    • fanchaoting's avatar
      nfsd: don't run get_file if nfs4_preprocess_stateid_op return error · b022032e
      fanchaoting authored
      we should return error status directly when nfs4_preprocess_stateid_op
      return error.
      Signed-off-by: default avatarfanchaoting <fanchaoting@cn.fujitsu.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      b022032e
    • Jeff Layton's avatar
      nfsd: convert the file_hashtbl to a hlist · 89876f8c
      Jeff Layton authored
      We only ever traverse the hash chains in the forward direction, so a
      double pointer list head isn't really necessary.
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      89876f8c
    • J. Bruce Fields's avatar
      nfsd4: don't destroy in-use session · 66b2b9b2
      J. Bruce Fields authored
      This changes session destruction to be similar to client destruction in
      that attempts to destroy a session while in use (which should be rare
      corner cases) result in DELAY.  This simplifies things somewhat and
      helps meet a coming 4.2 requirement.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      66b2b9b2
    • J. Bruce Fields's avatar
      nfsd4: don't destroy in-use clients · 221a6876
      J. Bruce Fields authored
      When a setclientid_confirm or create_session confirms a client after a
      client reboot, it also destroys any previous state held by that client.
      
      The shutdown of that previous state must be careful not to free the
      client out from under threads processing other requests that refer to
      the client.
      
      This is a particular problem in the NFSv4.1 case when we hold a
      reference to a session (hence a client) throughout compound processing.
      
      The server attempts to handle this by unhashing the client at the time
      it's destroyed, then delaying the final free to the end.  But this still
      leaves some races in the current code.
      
      I believe it's simpler just to fail the attempt to destroy the client by
      returning NFS4ERR_DELAY.  This is a case that should never happen
      anyway.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      221a6876
    • J. Bruce Fields's avatar
      nfsd4: simplify bind_conn_to_session locking · 4f6e6c17
      J. Bruce Fields authored
      The locking here is very fiddly, and there's no reason for us to be
      setting cstate->session, since this is the only op in the compound.
      Let's just take the state lock and drop the reference counting.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      4f6e6c17
    • J. Bruce Fields's avatar
      nfsd4: fix destroy_session race · abcdff09
      J. Bruce Fields authored
      destroy_session uses the session and client without continuously holding
      any reference or locks.
      
      Put the whole thing under the state lock for now.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      abcdff09
    • J. Bruce Fields's avatar
      nfsd4: clientid lookup cleanup · bfa85e83
      J. Bruce Fields authored
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      bfa85e83
    • J. Bruce Fields's avatar
      nfsd4: destroy_clientid simplification · c0293b01
      J. Bruce Fields authored
      I'm not sure what the check for clientid expiry was meant to do here.
      
      The check for a matching session is redundant given the previous check
      for state: a client without state is, in particular, a client without
      sessions.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      c0293b01
    • J. Bruce Fields's avatar
      nfsd4: remove some dprintk's · 1ca50792
      J. Bruce Fields authored
      E.g. printk's that just report the return value from an op are
      uninteresting as we already do that in the main proc_compound loop.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      1ca50792
    • J. Bruce Fields's avatar
      nfsd4: STALE_STATEID cleanup · 0eb6f20a
      J. Bruce Fields authored
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0eb6f20a
    • J. Bruce Fields's avatar
      nfsd4: warn on odd create_session state · 78389046
      J. Bruce Fields authored
      This should never happen.
      
      (Note: the comparable case in setclientid_confirm *can* happen, since
      updating a client record can result in both confirmed and unconfirmed
      records with the same clientid.)
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      78389046
    • ycnian@gmail.com's avatar
      nfsd: fix bug on nfs4 stateid deallocation · 491402a7
      ycnian@gmail.com authored
      NFS4_OO_PURGE_CLOSE is not handled properly. To avoid memory leak, nfs4
      stateid which is pointed by oo_last_closed_stid is freed in nfsd4_close(),
      but NFS4_OO_PURGE_CLOSE isn't cleared meanwhile. So the stateid released in
      THIS close procedure may be freed immediately in the coming encoding function.
      Sorry that Signed-off-by was forgotten in last version.
      Signed-off-by: default avatarYanchuan Nian <ycnian@gmail.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      491402a7
    • Yanchuan Nian's avatar
      nfsd: remove unused macro in nfsv4 · 9c6bdbb8
      Yanchuan Nian authored
      lk_rflags is never used anywhere, and rflags is not defined in struct
      nfsd4_lock.
      Signed-off-by: default avatarYanchuan Nian <ycnian@gmail.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      9c6bdbb8
    • J. Bruce Fields's avatar
      nfsd4: fix use-after-free of 4.1 client on connection loss · 2e4b7239
      J. Bruce Fields authored
      Once we drop the lock here there's nothing keeping the client around:
      the only lock still held is the xpt_lock on this socket, but this socket
      no longer has any connection with the client so there's no way for other
      code to know we're still using the client.
      
      The solution is simple: all nfsd4_probe_callback does is set a few
      variables and queue some work, so there's no reason we can't just keep
      it under the lock.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      2e4b7239
    • J. Bruce Fields's avatar
      nfsd4: fix race on client shutdown · b0a9d3ab
      J. Bruce Fields authored
      Dropping the session's reference count after the client's means we leave
      a window where the session's se_client pointer is NULL.  An xpt_user
      callback that encounters such a session may then crash:
      
      [  303.956011] BUG: unable to handle kernel NULL pointer dereference at 0000000000000318
      [  303.959061] IP: [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
      [  303.959061] PGD 37811067 PUD 3d498067 PMD 0
      [  303.959061] Oops: 0002 [#8] PREEMPT SMP
      [  303.959061] Modules linked in: md5 nfsd auth_rpcgss nfs_acl snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc microcode psmouse snd_timer serio_raw pcspkr evdev snd soundcore i2c_piix4 i2c_core intel_agp intel_gtt processor button nfs lockd sunrpc fscache ata_generic pata_acpi ata_piix uhci_hcd libata btrfs usbcore usb_common crc32c scsi_mod libcrc32c zlib_deflate floppy virtio_balloon virtio_net virtio_pci virtio_blk virtio_ring virtio
      [  303.959061] CPU 0
      [  303.959061] Pid: 264, comm: nfsd Tainted: G      D      3.8.0-ARCH+ #156 Bochs Bochs
      [  303.959061] RIP: 0010:[<ffffffff81481a8e>]  [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
      [  303.959061] RSP: 0018:ffff880037877dd8  EFLAGS: 00010202
      [  303.959061] RAX: 0000000000000100 RBX: ffff880037a2b698 RCX: ffff88003d879278
      [  303.959061] RDX: ffff88003d879278 RSI: dead000000100100 RDI: 0000000000000318
      [  303.959061] RBP: ffff880037877dd8 R08: ffff88003c5a0f00 R09: 0000000000000002
      [  303.959061] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
      [  303.959061] R13: 0000000000000318 R14: ffff880037a2b680 R15: ffff88003c1cbe00
      [  303.959061] FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
      [  303.959061] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  303.959061] CR2: 0000000000000318 CR3: 000000003d49c000 CR4: 00000000000006f0
      [  303.959061] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  303.959061] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  303.959061] Process nfsd (pid: 264, threadinfo ffff880037876000, task ffff88003c1fd0a0)
      [  303.959061] Stack:
      [  303.959061]  ffff880037877e08 ffffffffa03772ec ffff88003d879000 ffff88003d879278
      [  303.959061]  ffff88003d879080 0000000000000000 ffff880037877e38 ffffffffa0222a1f
      [  303.959061]  0000000000107ac0 ffff88003c22e000 ffff88003d879000 ffff88003c1cbe00
      [  303.959061] Call Trace:
      [  303.959061]  [<ffffffffa03772ec>] nfsd4_conn_lost+0x3c/0xa0 [nfsd]
      [  303.959061]  [<ffffffffa0222a1f>] svc_delete_xprt+0x10f/0x180 [sunrpc]
      [  303.959061]  [<ffffffffa0223d96>] svc_recv+0xe6/0x580 [sunrpc]
      [  303.959061]  [<ffffffffa03587c5>] nfsd+0xb5/0x140 [nfsd]
      [  303.959061]  [<ffffffffa0358710>] ? nfsd_destroy+0x90/0x90 [nfsd]
      [  303.959061]  [<ffffffff8107ae00>] kthread+0xc0/0xd0
      [  303.959061]  [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pte_at+0x50/0x100
      [  303.959061]  [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
      [  303.959061]  [<ffffffff814898ec>] ret_from_fork+0x7c/0xb0
      [  303.959061]  [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
      [  303.959061] Code: ff ff 5d c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 65 48 8b 04 25 f0 c6 00 00 48 89 e5 83 80 44 e0 ff ff 01 b8 00 01 00 00 <3e> 66 0f c1 07 0f b6 d4 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f
      [  303.959061] RIP  [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
      [  303.959061]  RSP <ffff880037877dd8>
      [  303.959061] CR2: 0000000000000318
      [  304.001218] ---[ end trace 2d809cd4a7931f5a ]---
      [  304.001903] note: nfsd[264] exited with preempt_count 2
      Reported-by: default avatarBryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      b0a9d3ab
    • J. Bruce Fields's avatar
      nfsd4: handle seqid-mutating open errors from xdr decoding · 9d313b17
      J. Bruce Fields authored
      If a client sets an owner (or group_owner or acl) attribute on open for
      create, and the mapping of that owner to an id fails, then we return
      BAD_OWNER.  But BAD_OWNER is a seqid-mutating error, so we can't
      shortcut the open processing that case: we have to at least look up the
      owner so we can find the seqid to bump.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      9d313b17
    • J. Bruce Fields's avatar
      nfsd4: remove BUG_ON · b600de7a
      J. Bruce Fields authored
      This BUG_ON just crashes the thread a little earlier than it would
      otherwise--it doesn't seem useful.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      b600de7a
    • Jeff Layton's avatar
      nfsd: scale up the number of DRC hash buckets with cache size · 0733c7ba
      Jeff Layton authored
      We've now increased the size of the duplicate reply cache by quite a
      bit, but the number of hash buckets has not changed. So, we've gone from
      an average hash chain length of 16 in the old code to 4096 when the
      cache is its largest. Change the code to scale out the number of buckets
      with the max size of the cache.
      
      At the same time, we also need to fix the hash function since the
      existing one isn't really suitable when there are more than 256 buckets.
      Move instead to use the stock hash_32 function for this. Testing on a
      machine that had 2048 buckets showed that this gave a smaller
      longest:average ratio than the existing hash function:
      
      The formula here is longest hash bucket searched divided by average
      number of entries per bucket at the time that we saw that longest
      bucket:
      
          old hash: 68/(39258/2048) == 3.547404
          hash_32:  45/(33773/2048) == 2.728807
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0733c7ba
    • Jeff Layton's avatar
      nfsd: keep stats on worst hash balancing seen so far · 98d821bd
      Jeff Layton authored
      The typical case with the DRC is a cache miss, so if we keep track of
      the max number of entries that we've ever walked over in a search, then
      we should have a reasonable estimate of the longest hash chain that
      we've ever seen.
      
      With that, we'll also keep track of the total size of the cache when we
      see the longest chain. In the case of a tie, we prefer to track the
      smallest total cache size in order to properly gauge the worst-case
      ratio of max vs. avg chain length.
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      98d821bd
    • Jeff Layton's avatar
      nfsd: add new reply_cache_stats file in nfsdfs · a2f999a3
      Jeff Layton authored
      For presenting statistics relating to duplicate reply cache.
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      a2f999a3
    • Jeff Layton's avatar
      6c6910cd
    • Jeff Layton's avatar
      nfsd: break out comparator into separate function · 9dc56143
      Jeff Layton authored
      Break out the function that compares the rqstp and checksum against a
      reply cache entry. While we're at it, track the efficacy of the checksum
      over the NFS data by tracking the cases where we would have incorrectly
      matched a DRC entry if we had not tracked it or the length.
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      9dc56143