1. 09 Apr, 2013 1 commit
  2. 08 Apr, 2013 1 commit
    • J. Bruce Fields's avatar
      nfsd4: cleanup handling of nfsv4.0 closed stateid's · 9411b1d4
      J. Bruce Fields authored
      Closed stateid's are kept around a little while to handle close replays
      in the 4.0 case.  So we stash them in the last-used stateid in the
      oo_last_closed_stateid field of the open owner.  We can free that in
      encode_seqid_op_tail once the seqid on the open owner is next
      incremented.  But we don't want to do that on the close itself; so we
      set NFS4_OO_PURGE_CLOSE flag set on the open owner, skip freeing it the
      first time through encode_seqid_op_tail, then when we see that flag set
      next time we free it.
      
      This is unnecessarily baroque.
      
      Instead, just move the logic that increments the seqid out of the xdr
      code and into the operation code itself.
      
      The justification given for the current placement is that we need to
      wait till the last minute to be sure we know whether the status is a
      sequence-id-mutating error or not, but examination of the code shows
      that can't actually happen.
      Reported-by: default avatarYanchuan Nian <ycnian@gmail.com>
      Tested-by: default avatarYanchuan Nian <ycnian@gmail.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      9411b1d4
  3. 04 Apr, 2013 4 commits
  4. 03 Apr, 2013 25 commits
    • Alexey Khoroshilov's avatar
      SUNRPC/cache: add module_put() on error path in cache_open() · a7823c79
      Alexey Khoroshilov authored
      If kmalloc() fails in cache_open(), module cd->owner left locked.
      The patch adds module_put(cd->owner) on this path.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: default avatarAlexey Khoroshilov <khoroshilov@ispras.ru>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      a7823c79
    • fanchaoting's avatar
      nfsd: remove /proc/fs/nfs when create /proc/fs/nfs/exports error · ff7c4b36
      fanchaoting authored
      when create /proc/fs/nfs/exports error, we should remove /proc/fs/nfs,
      if don't do it, it maybe cause Memory leak.
      Signed-off-by: default avatarfanchaoting <fanchaoting@cn.fujitsu.com>
      Reviewed-by: default avatarchendt.fnst <chendt.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      ff7c4b36
    • fanchaoting's avatar
      nfsd: don't run get_file if nfs4_preprocess_stateid_op return error · b022032e
      fanchaoting authored
      we should return error status directly when nfs4_preprocess_stateid_op
      return error.
      Signed-off-by: default avatarfanchaoting <fanchaoting@cn.fujitsu.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      b022032e
    • Jeff Layton's avatar
      nfsd: convert the file_hashtbl to a hlist · 89876f8c
      Jeff Layton authored
      We only ever traverse the hash chains in the forward direction, so a
      double pointer list head isn't really necessary.
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      89876f8c
    • J. Bruce Fields's avatar
      nfsd4: don't destroy in-use session · 66b2b9b2
      J. Bruce Fields authored
      This changes session destruction to be similar to client destruction in
      that attempts to destroy a session while in use (which should be rare
      corner cases) result in DELAY.  This simplifies things somewhat and
      helps meet a coming 4.2 requirement.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      66b2b9b2
    • J. Bruce Fields's avatar
      nfsd4: don't destroy in-use clients · 221a6876
      J. Bruce Fields authored
      When a setclientid_confirm or create_session confirms a client after a
      client reboot, it also destroys any previous state held by that client.
      
      The shutdown of that previous state must be careful not to free the
      client out from under threads processing other requests that refer to
      the client.
      
      This is a particular problem in the NFSv4.1 case when we hold a
      reference to a session (hence a client) throughout compound processing.
      
      The server attempts to handle this by unhashing the client at the time
      it's destroyed, then delaying the final free to the end.  But this still
      leaves some races in the current code.
      
      I believe it's simpler just to fail the attempt to destroy the client by
      returning NFS4ERR_DELAY.  This is a case that should never happen
      anyway.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      221a6876
    • J. Bruce Fields's avatar
      nfsd4: simplify bind_conn_to_session locking · 4f6e6c17
      J. Bruce Fields authored
      The locking here is very fiddly, and there's no reason for us to be
      setting cstate->session, since this is the only op in the compound.
      Let's just take the state lock and drop the reference counting.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      4f6e6c17
    • J. Bruce Fields's avatar
      nfsd4: fix destroy_session race · abcdff09
      J. Bruce Fields authored
      destroy_session uses the session and client without continuously holding
      any reference or locks.
      
      Put the whole thing under the state lock for now.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      abcdff09
    • J. Bruce Fields's avatar
      nfsd4: clientid lookup cleanup · bfa85e83
      J. Bruce Fields authored
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      bfa85e83
    • J. Bruce Fields's avatar
      nfsd4: destroy_clientid simplification · c0293b01
      J. Bruce Fields authored
      I'm not sure what the check for clientid expiry was meant to do here.
      
      The check for a matching session is redundant given the previous check
      for state: a client without state is, in particular, a client without
      sessions.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      c0293b01
    • J. Bruce Fields's avatar
      nfsd4: remove some dprintk's · 1ca50792
      J. Bruce Fields authored
      E.g. printk's that just report the return value from an op are
      uninteresting as we already do that in the main proc_compound loop.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      1ca50792
    • J. Bruce Fields's avatar
      nfsd4: STALE_STATEID cleanup · 0eb6f20a
      J. Bruce Fields authored
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0eb6f20a
    • J. Bruce Fields's avatar
      nfsd4: warn on odd create_session state · 78389046
      J. Bruce Fields authored
      This should never happen.
      
      (Note: the comparable case in setclientid_confirm *can* happen, since
      updating a client record can result in both confirmed and unconfirmed
      records with the same clientid.)
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      78389046
    • ycnian@gmail.com's avatar
      nfsd: fix bug on nfs4 stateid deallocation · 491402a7
      ycnian@gmail.com authored
      NFS4_OO_PURGE_CLOSE is not handled properly. To avoid memory leak, nfs4
      stateid which is pointed by oo_last_closed_stid is freed in nfsd4_close(),
      but NFS4_OO_PURGE_CLOSE isn't cleared meanwhile. So the stateid released in
      THIS close procedure may be freed immediately in the coming encoding function.
      Sorry that Signed-off-by was forgotten in last version.
      Signed-off-by: default avatarYanchuan Nian <ycnian@gmail.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      491402a7
    • Yanchuan Nian's avatar
      nfsd: remove unused macro in nfsv4 · 9c6bdbb8
      Yanchuan Nian authored
      lk_rflags is never used anywhere, and rflags is not defined in struct
      nfsd4_lock.
      Signed-off-by: default avatarYanchuan Nian <ycnian@gmail.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      9c6bdbb8
    • J. Bruce Fields's avatar
      nfsd4: fix use-after-free of 4.1 client on connection loss · 2e4b7239
      J. Bruce Fields authored
      Once we drop the lock here there's nothing keeping the client around:
      the only lock still held is the xpt_lock on this socket, but this socket
      no longer has any connection with the client so there's no way for other
      code to know we're still using the client.
      
      The solution is simple: all nfsd4_probe_callback does is set a few
      variables and queue some work, so there's no reason we can't just keep
      it under the lock.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      2e4b7239
    • J. Bruce Fields's avatar
      nfsd4: fix race on client shutdown · b0a9d3ab
      J. Bruce Fields authored
      Dropping the session's reference count after the client's means we leave
      a window where the session's se_client pointer is NULL.  An xpt_user
      callback that encounters such a session may then crash:
      
      [  303.956011] BUG: unable to handle kernel NULL pointer dereference at 0000000000000318
      [  303.959061] IP: [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
      [  303.959061] PGD 37811067 PUD 3d498067 PMD 0
      [  303.959061] Oops: 0002 [#8] PREEMPT SMP
      [  303.959061] Modules linked in: md5 nfsd auth_rpcgss nfs_acl snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc microcode psmouse snd_timer serio_raw pcspkr evdev snd soundcore i2c_piix4 i2c_core intel_agp intel_gtt processor button nfs lockd sunrpc fscache ata_generic pata_acpi ata_piix uhci_hcd libata btrfs usbcore usb_common crc32c scsi_mod libcrc32c zlib_deflate floppy virtio_balloon virtio_net virtio_pci virtio_blk virtio_ring virtio
      [  303.959061] CPU 0
      [  303.959061] Pid: 264, comm: nfsd Tainted: G      D      3.8.0-ARCH+ #156 Bochs Bochs
      [  303.959061] RIP: 0010:[<ffffffff81481a8e>]  [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
      [  303.959061] RSP: 0018:ffff880037877dd8  EFLAGS: 00010202
      [  303.959061] RAX: 0000000000000100 RBX: ffff880037a2b698 RCX: ffff88003d879278
      [  303.959061] RDX: ffff88003d879278 RSI: dead000000100100 RDI: 0000000000000318
      [  303.959061] RBP: ffff880037877dd8 R08: ffff88003c5a0f00 R09: 0000000000000002
      [  303.959061] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
      [  303.959061] R13: 0000000000000318 R14: ffff880037a2b680 R15: ffff88003c1cbe00
      [  303.959061] FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
      [  303.959061] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  303.959061] CR2: 0000000000000318 CR3: 000000003d49c000 CR4: 00000000000006f0
      [  303.959061] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  303.959061] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  303.959061] Process nfsd (pid: 264, threadinfo ffff880037876000, task ffff88003c1fd0a0)
      [  303.959061] Stack:
      [  303.959061]  ffff880037877e08 ffffffffa03772ec ffff88003d879000 ffff88003d879278
      [  303.959061]  ffff88003d879080 0000000000000000 ffff880037877e38 ffffffffa0222a1f
      [  303.959061]  0000000000107ac0 ffff88003c22e000 ffff88003d879000 ffff88003c1cbe00
      [  303.959061] Call Trace:
      [  303.959061]  [<ffffffffa03772ec>] nfsd4_conn_lost+0x3c/0xa0 [nfsd]
      [  303.959061]  [<ffffffffa0222a1f>] svc_delete_xprt+0x10f/0x180 [sunrpc]
      [  303.959061]  [<ffffffffa0223d96>] svc_recv+0xe6/0x580 [sunrpc]
      [  303.959061]  [<ffffffffa03587c5>] nfsd+0xb5/0x140 [nfsd]
      [  303.959061]  [<ffffffffa0358710>] ? nfsd_destroy+0x90/0x90 [nfsd]
      [  303.959061]  [<ffffffff8107ae00>] kthread+0xc0/0xd0
      [  303.959061]  [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pte_at+0x50/0x100
      [  303.959061]  [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
      [  303.959061]  [<ffffffff814898ec>] ret_from_fork+0x7c/0xb0
      [  303.959061]  [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
      [  303.959061] Code: ff ff 5d c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 65 48 8b 04 25 f0 c6 00 00 48 89 e5 83 80 44 e0 ff ff 01 b8 00 01 00 00 <3e> 66 0f c1 07 0f b6 d4 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f
      [  303.959061] RIP  [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
      [  303.959061]  RSP <ffff880037877dd8>
      [  303.959061] CR2: 0000000000000318
      [  304.001218] ---[ end trace 2d809cd4a7931f5a ]---
      [  304.001903] note: nfsd[264] exited with preempt_count 2
      Reported-by: default avatarBryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      b0a9d3ab
    • J. Bruce Fields's avatar
      nfsd4: handle seqid-mutating open errors from xdr decoding · 9d313b17
      J. Bruce Fields authored
      If a client sets an owner (or group_owner or acl) attribute on open for
      create, and the mapping of that owner to an id fails, then we return
      BAD_OWNER.  But BAD_OWNER is a seqid-mutating error, so we can't
      shortcut the open processing that case: we have to at least look up the
      owner so we can find the seqid to bump.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      9d313b17
    • J. Bruce Fields's avatar
      nfsd4: remove BUG_ON · b600de7a
      J. Bruce Fields authored
      This BUG_ON just crashes the thread a little earlier than it would
      otherwise--it doesn't seem useful.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      b600de7a
    • Jeff Layton's avatar
      nfsd: scale up the number of DRC hash buckets with cache size · 0733c7ba
      Jeff Layton authored
      We've now increased the size of the duplicate reply cache by quite a
      bit, but the number of hash buckets has not changed. So, we've gone from
      an average hash chain length of 16 in the old code to 4096 when the
      cache is its largest. Change the code to scale out the number of buckets
      with the max size of the cache.
      
      At the same time, we also need to fix the hash function since the
      existing one isn't really suitable when there are more than 256 buckets.
      Move instead to use the stock hash_32 function for this. Testing on a
      machine that had 2048 buckets showed that this gave a smaller
      longest:average ratio than the existing hash function:
      
      The formula here is longest hash bucket searched divided by average
      number of entries per bucket at the time that we saw that longest
      bucket:
      
          old hash: 68/(39258/2048) == 3.547404
          hash_32:  45/(33773/2048) == 2.728807
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0733c7ba
    • Jeff Layton's avatar
      nfsd: keep stats on worst hash balancing seen so far · 98d821bd
      Jeff Layton authored
      The typical case with the DRC is a cache miss, so if we keep track of
      the max number of entries that we've ever walked over in a search, then
      we should have a reasonable estimate of the longest hash chain that
      we've ever seen.
      
      With that, we'll also keep track of the total size of the cache when we
      see the longest chain. In the case of a tie, we prefer to track the
      smallest total cache size in order to properly gauge the worst-case
      ratio of max vs. avg chain length.
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      98d821bd
    • Jeff Layton's avatar
      nfsd: add new reply_cache_stats file in nfsdfs · a2f999a3
      Jeff Layton authored
      For presenting statistics relating to duplicate reply cache.
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      a2f999a3
    • Jeff Layton's avatar
      6c6910cd
    • Jeff Layton's avatar
      nfsd: break out comparator into separate function · 9dc56143
      Jeff Layton authored
      Break out the function that compares the rqstp and checksum against a
      reply cache entry. While we're at it, track the efficacy of the checksum
      over the NFS data by tracking the cases where we would have incorrectly
      matched a DRC entry if we had not tracked it or the length.
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      9dc56143
    • Jeff Layton's avatar
      nfsd: eliminate one of the DRC cache searches · 0b9ea37f
      Jeff Layton authored
      The most common case is to do a search of the cache, followed by an
      insert. In the case where we have to allocate an entry off the slab,
      then we end up having to redo the search, which is wasteful.
      
      Better optimize the code for the common case by eliminating the initial
      search of the cache and always preallocating an entry. In the case of a
      cache hit, we'll end up just freeing that entry but that's preferable to
      an extra search.
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0b9ea37f
  5. 26 Mar, 2013 1 commit
  6. 22 Mar, 2013 1 commit
  7. 18 Mar, 2013 2 commits
  8. 17 Mar, 2013 4 commits
    • Linus Torvalds's avatar
      Linux 3.9-rc3 · a937536b
      Linus Torvalds authored
      a937536b
    • David Rientjes's avatar
      perf,x86: fix link failure for non-Intel configs · 6c4d3bc9
      David Rientjes authored
      Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after
      suspend/resume") introduces a link failure since
      perf_restore_debug_store() is only defined for CONFIG_CPU_SUP_INTEL:
      
      	arch/x86/power/built-in.o: In function `restore_processor_state':
      	(.text+0x45c): undefined reference to `perf_restore_debug_store'
      
      Fix it by defining the dummy function appropriately.
      Signed-off-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6c4d3bc9
    • Linus Torvalds's avatar
      perf,x86: fix wrmsr_on_cpu() warning on suspend/resume · 2a6e06b2
      Linus Torvalds authored
      Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after
      suspend/resume") fixed a crash when doing PEBS performance profiling
      after resuming, but in using init_debug_store_on_cpu() to restore the
      DS_AREA mtrr it also resulted in a new WARN_ON() triggering.
      
      init_debug_store_on_cpu() uses "wrmsr_on_cpu()", which in turn uses CPU
      cross-calls to do the MSR update.  Which is not really valid at the
      early resume stage, and the warning is quite reasonable.  Now, it all
      happens to _work_, for the simple reason that smp_call_function_single()
      ends up just doing the call directly on the CPU when the CPU number
      matches, but we really should just do the wrmsr() directly instead.
      
      This duplicates the wrmsr() logic, but hopefully we can just remove the
      wrmsr_on_cpu() version eventually.
      Reported-and-tested-by: default avatarParag Warudkar <parag.lkml@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2a6e06b2
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 08637024
      Linus Torvalds authored
      Pull btrfs fixes from Chris Mason:
       "Eric's rcu barrier patch fixes a long standing problem with our
        unmount code hanging on to devices in workqueue helpers.  Liu Bo
        nailed down a difficult assertion for in-memory extent mappings."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        Btrfs: fix warning of free_extent_map
        Btrfs: fix warning when creating snapshots
        Btrfs: return as soon as possible when edquot happens
        Btrfs: return EIO if we have extent tree corruption
        btrfs: use rcu_barrier() to wait for bdev puts at unmount
        Btrfs: remove btrfs_try_spin_lock
        Btrfs: get better concurrency for snapshot-aware defrag work
      08637024
  9. 16 Mar, 2013 1 commit
    • Liu Bo's avatar
      Btrfs: fix warning of free_extent_map · 3b277594
      Liu Bo authored
      Users report that an extent map's list is still linked when it's actually
      going to be freed from cache.
      
      The story is that
      
      a) when we're going to drop an extent map and may split this large one into
      smaller ems, and if this large one is flagged as EXTENT_FLAG_LOGGING which means
      that it's on the list to be logged, then the smaller ems split from it will also
      be flagged as EXTENT_FLAG_LOGGING, and this is _not_ expected.
      
      b) we'll keep ems from unlinking the list and freeing when they are flagged with
      EXTENT_FLAG_LOGGING, because the log code holds one reference.
      
      The end result is the warning, but the truth is that we set the flag
      EXTENT_FLAG_LOGGING only during fsync.
      
      So clear flag EXTENT_FLAG_LOGGING for extent maps split from a large one.
      Reported-by: default avatarJohannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
      Reported-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      3b277594