1. 29 Sep, 2015 21 commits
  2. 28 Sep, 2015 19 commits
    • Grant Likely's avatar
      drivercore: Fix unregistration path of platform devices · 9965eafe
      Grant Likely authored
      commit 7f5dcaf1 upstream.
      
      The unregister path of platform_device is broken. On registration, it
      will register all resources with either a parent already set, or
      type==IORESOURCE_{IO,MEM}. However, on unregister it will release
      everything with type==IORESOURCE_{IO,MEM}, but ignore the others. There
      are also cases where resources don't get registered in the first place,
      like with devices created by of_platform_populate()*.
      
      Fix the unregister path to be symmetrical with the register path by
      checking the parent pointer instead of the type field to decide which
      resources to unregister. This is safe because the upshot of the
      registration path algorithm is that registered resources have a parent
      pointer, and non-registered resources do not.
      
      * It can be argued that of_platform_populate() should be registering
        it's resources, and they argument has some merit. However, there are
        quite a few platforms that end up broken if we try to do that due to
        overlapping resources in the device tree. Until that is fixed, we need
        to solve the immediate problem.
      
      Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
      Cc: Wolfram Sang <wsa@the-dreams.de>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
      Signed-off-by: default avatarGrant Likely <grant.likely@linaro.org>
      Tested-by: default avatarRicardo Ribalda Delgado <ricardo.ribalda@gmail.com>
      Tested-by: default avatarWolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      9965eafe
    • David Daney's avatar
      of/address: Don't loop forever in of_find_matching_node_by_address(). · ea95c941
      David Daney authored
      commit 3a496b00 upstream.
      
      If the internal call to of_address_to_resource() fails, we end up
      looping forever in of_find_matching_node_by_address().  This can be
      caused by a defective device tree, or calling with an incorrect
      matches argument.
      
      Fix by calling of_find_matching_node() unconditionally at the end of
      the loop.
      Signed-off-by: default avatarDavid Daney <david.daney@cavium.com>
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      ea95c941
    • Adrien Schildknecht's avatar
      rtlwifi: rtl8192cu: Add new device ID · 334cf11f
      Adrien Schildknecht authored
      commit 1642d09f upstream.
      
      The v2 of NetGear WNA1000M uses a different idProduct: USB ID 0846:9043
      Signed-off-by: default avatarAdrien Schildknecht <adrien+dev@schischi.me>
      Acked-by: default avatarLarry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      334cf11f
    • Jan H. Schönherr's avatar
      sched: Fix cpu_active_mask/cpu_online_mask race · 9ecde004
      Jan H. Schönherr authored
      commit dd9d3843 upstream.
      
      There is a race condition in SMP bootup code, which may result
      in
      
          WARNING: CPU: 0 PID: 1 at kernel/workqueue.c:4418
          workqueue_cpu_up_callback()
      or
          kernel BUG at kernel/smpboot.c:135!
      
      It can be triggered with a bit of luck in Linux guests running
      on busy hosts.
      
      	CPU0                        CPUn
      	====                        ====
      
      	_cpu_up()
      	  __cpu_up()
      				    start_secondary()
      				      set_cpu_online()
      					cpumask_set_cpu(cpu,
      						   to_cpumask(cpu_online_bits));
      	  cpu_notify(CPU_ONLINE)
      	    <do stuff, see below>
      					cpumask_set_cpu(cpu,
      						   to_cpumask(cpu_active_bits));
      
      During the various CPU_ONLINE callbacks CPUn is online but not
      active. Several things can go wrong at that point, depending on
      the scheduling of tasks on CPU0.
      
      Variant 1:
      
        cpu_notify(CPU_ONLINE)
          workqueue_cpu_up_callback()
            rebind_workers()
              set_cpus_allowed_ptr()
      
        This call fails because it requires an active CPU; rebind_workers()
        ends with a warning:
      
          WARNING: CPU: 0 PID: 1 at kernel/workqueue.c:4418
          workqueue_cpu_up_callback()
      
      Variant 2:
      
        cpu_notify(CPU_ONLINE)
          smpboot_thread_call()
            smpboot_unpark_threads()
             ..
              __kthread_unpark()
                __kthread_bind()
                wake_up_state()
                 ..
                  select_task_rq()
                    select_fallback_rq()
      
        The ->wake_cpu of the unparked thread is not allowed, making a call
        to select_fallback_rq() necessary. Then, select_fallback_rq() cannot
        find an allowed, active CPU and promptly resets the allowed CPUs, so
        that the task in question ends up on CPU0.
      
        When those unparked tasks are eventually executed, they run
        immediately into a BUG:
      
          kernel BUG at kernel/smpboot.c:135!
      
      Just changing the order in which the online/active bits are set
      (and adding some memory barriers), would solve the two issues
      above. However, it would change the order of operations back to
      the one before commit 6acbfb96 ("sched: Fix hotplug vs.
      set_cpus_allowed_ptr()"), thus, reintroducing that particular
      problem.
      
      Going further back into history, we have at least the following
      commits touching this topic:
      - commit 2baab4e9 ("sched: Fix select_fallback_rq() vs cpu_active/cpu_online")
      - commit 5fbd036b ("sched: Cleanup cpu_active madness")
      
      Together, these give us the following non-working solutions:
      
        - secondary CPU sets active before online, because active is assumed to
          be a subset of online;
      
        - secondary CPU sets online before active, because the primary CPU
          assumes that an online CPU is also active;
      
        - secondary CPU sets online and waits for primary CPU to set active,
          because it might deadlock.
      
      Commit 875ebe94 ("powerpc/smp: Wait until secondaries are
      active & online") introduces an arch-specific solution to this
      arch-independent problem.
      
      Now, go for a more general solution without explicit waiting and
      simply set active twice: once on the secondary CPU after online
      was set and once on the primary CPU after online was seen.
      
      set_cpus_allowed_ptr()")
      Signed-off-by: default avatarJan H. Schönherr <jschoenh@amazon.de>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Wilson <msw@amazon.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 6acbfb96 ("sched: Fix hotplug vs. set_cpus_allowed_ptr()")
      Link: http://lkml.kernel.org/r/1439408156-18840-1-git-send-email-jschoenh@amazon.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      9ecde004
    • Jan Kara's avatar
      xfs: Fix file type directory corruption for btree directories · f3897ff3
      Jan Kara authored
      commit 03754234 upstream.
      
      Users have occasionally reported that file type for some directory
      entries is wrong. This mostly happened after updating libraries some
      libraries. After some debugging the problem was traced down to
      xfs_dir2_node_replace(). The function uses args->filetype as a file type
      to store in the replaced directory entry however it also calls
      xfs_da3_node_lookup_int() which will store file type of the current
      directory entry in args->filetype. Thus we fail to change file type of a
      directory entry to a proper type.
      
      Fix the problem by storing new file type in a local variable before
      calling xfs_da3_node_lookup_int().
      Reported-by: default avatarGiacomo Comes <comes@naic.edu>
      Signed-off-by: default avatarJan Kara <jack@suse.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      [ luis: backported to 3.16:
        - file rename: fs/xfs/libxfs/xfs_dir2_node.c -> fs/xfs/xfs_dir2_node.c ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      f3897ff3
    • Stephen Chandler Paul's avatar
      DRM - radeon: Don't link train DisplayPort on HPD until we get the dpcd · 15b134ca
      Stephen Chandler Paul authored
      commit 924f92bf upstream.
      
      Most of the time this isn't an issue since hotplugging an adaptor will
      trigger a crtc mode change which in turn, causes the driver to probe
      every DisplayPort for a dpcd. However, in cases where hotplugging
      doesn't cause a mode change (specifically when one unplugs a monitor
      from a DisplayPort connector, then plugs that same monitor back in
      seconds later on the same port without any other monitors connected), we
      never probe for the dpcd before starting the initial link training. What
      happens from there looks like this:
      
      	- GPU has only one monitor connected. It's connected via
      	  DisplayPort, and does not go through an adaptor of any sort.
      
      	- User unplugs DisplayPort connector from GPU.
      
      	- Change in HPD is detected by the driver, we probe every
      	  DisplayPort for a possible connection.
      
      	- Probe the port the user originally had the monitor connected
      	  on for it's dpcd. This fails, and we clear the first (and only
      	  the first) byte of the dpcd to indicate we no longer have a
      	  dpcd for this port.
      
      	- User plugs the previously disconnected monitor back into the
      	  same DisplayPort.
      
      	- radeon_connector_hotplug() is called before everyone else,
      	  and tries to handle the link training. Since only the first
      	  byte of the dpcd is zeroed, the driver is able to complete
      	  link training but does so against the wrong dpcd, causing it
      	  to initialize the link with the wrong settings.
      
      	- Display stays blank (usually), dpcd is probed after the
      	  initial link training, and the driver prints no obvious
      	  messages to the log.
      
      In theory, since only one byte of the dpcd is chopped off (specifically,
      the byte that contains the revision information for DisplayPort), it's
      not entirely impossible that this bug may not show on certain monitors.
      For instance, the only reason this bug was visible on my ASUS PB238
      monitor was due to the fact that this monitor using the enhanced framing
      symbol sequence, the flag for which is ignored if the radeon driver
      thinks that the DisplayPort version is below 1.1.
      Signed-off-by: default avatarStephen Chandler Paul <cpaul@redhat.com>
      Reviewed-by: default avatarJerome Glisse <jglisse@redhat.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      15b134ca
    • Benjamin Cama's avatar
      ARM: orion5x: fix legacy orion5x IRQ numbers · b5614b19
      Benjamin Cama authored
      commit 5be9fc23 upstream.
      
      Since v3.18, attempts to deliver IRQ0 are rejected, breaking orion5x.
      Fix this by increasing all interrupts by one, as did 5d6bed2a for
      dove. Also, force MULTI_IRQ_HANDLER for all orion platforms (including
      dove) as the specific handler is needed to shift back IRQ numbers by
      one.
      
      [gregory.clement@free-electrons.com]: moved the select
      MULTI_IRQ_HANDLER from PLAT_ORION_LEGACY to ARCH_ORION5X as it broke
      the build for dove.
      
      Fixes: a71b092a ("ARM: Convert handle_IRQ to use __handle_domain_irq")
      Signed-off-by: default avatarBenjamin Cama <benoar@dolka.fr>
      Signed-off-by: default avatarGregory CLEMENT <gregory.clement@free-electrons.com>
      Tested-by: default avatarDetlef Vollmann <dv@vollmann.ch>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      b5614b19
    • Filipe Manana's avatar
      Btrfs: check if previous transaction aborted to avoid fs corruption · 09f33db0
      Filipe Manana authored
      commit 1f9b8c8f upstream.
      
      While we are committing a transaction, it's possible the previous one is
      still finishing its commit and therefore we wait for it to finish first.
      However we were not checking if that previous transaction ended up getting
      aborted after we waited for it to commit, so we ended up committing the
      current transaction which can lead to fs corruption because the new
      superblock can point to trees that have had one or more nodes/leafs that
      were never durably persisted.
      The following sequence diagram exemplifies how this is possible:
      
                CPU 0                                                        CPU 1
      
        transaction N starts
      
        (...)
      
        btrfs_commit_transaction(N)
      
          cur_trans->state = TRANS_STATE_COMMIT_START;
          (...)
          cur_trans->state = TRANS_STATE_COMMIT_DOING;
          (...)
      
          cur_trans->state = TRANS_STATE_UNBLOCKED;
          root->fs_info->running_transaction = NULL;
      
                                                                    btrfs_start_transaction()
                                                                       --> starts transaction N + 1
      
          btrfs_write_and_wait_transaction(trans, root);
            --> starts writing all new or COWed ebs created
                at transaction N
      
                                                                    creates some new ebs, COWs some
                                                                    existing ebs but doesn't COW or
                                                                    deletes eb X
      
                                                                    btrfs_commit_transaction(N + 1)
                                                                      (...)
                                                                      cur_trans->state = TRANS_STATE_COMMIT_START;
                                                                      (...)
                                                                      wait_for_commit(root, prev_trans);
                                                                        --> prev_trans == transaction N
      
          btrfs_write_and_wait_transaction() continues
          writing ebs
             --> fails writing eb X, we abort transaction N
                 and set bit BTRFS_FS_STATE_ERROR on
                 fs_info->fs_state, so no new transactions
                 can start after setting that bit
      
             cleanup_transaction()
               btrfs_cleanup_one_transaction()
                 wakes up task at CPU 1
      
                                                                      continues, doesn't abort because
                                                                      cur_trans->aborted (transaction N + 1)
                                                                      is zero, and no checks for bit
                                                                      BTRFS_FS_STATE_ERROR in fs_info->fs_state
                                                                      are made
      
                                                                      btrfs_write_and_wait_transaction(trans, root);
                                                                        --> succeeds, no errors during writeback
      
                                                                      write_ctree_super(trans, root, 0);
                                                                        --> succeeds
                                                                        --> we have now a superblock that points us
                                                                            to some root that uses eb X, which was
                                                                            never written to disk
      
      In this scenario future attempts to read eb X from disk results in an
      error message like "parent transid verify failed on X wanted Y found Z".
      
      So fix this by aborting the current transaction if after waiting for the
      previous transaction we verify that it was aborted.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarJosef Bacik <jbacik@fb.com>
      Reviewed-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      09f33db0
    • Jeff Vander Stoep's avatar
      arm64: kconfig: Move LIST_POISON to a safe value · 49865041
      Jeff Vander Stoep authored
      commit bf0c4e04 upstream.
      
      Move the poison pointer offset to 0xdead000000000000, a
      recognized value that is not mappable by user-space exploits.
      Acked-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarThierry Strudel <tstrudel@google.com>
      Signed-off-by: default avatarJeff Vander Stoep <jeffv@google.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      49865041
    • Jan Kara's avatar
      xfs: Fix xfs_attr_leafblock definition · b3af1914
      Jan Kara authored
      commit ffeecc52 upstream.
      
      struct xfs_attr_leafblock contains 'entries' array which is declared
      with size 1 altough it can in fact contain much more entries. Since this
      array is followed by further struct members, gcc (at least in version
      4.8.3) thinks that the array has the fixed size of 1 element and thus
      may optimize away all accesses beyond the end of array resulting in
      non-working code. This problem was only observed with userspace code in
      xfsprogs, however it's better to be safe in kernel as well and have
      matching kernel and xfsprogs definitions.
      Signed-off-by: default avatarJan Kara <jack@suse.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      [ luis: backported to 3.16:
        - file rename: fs/xfs/libxfs/xfs_da_format.h -> fs/xfs/xfs_da_format.h ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      b3af1914
    • Darrick J. Wong's avatar
      libxfs: readahead of dir3 data blocks should use the read verifier · f7aae115
      Darrick J. Wong authored
      commit 2f123bce upstream.
      
      In the dir3 data block readahead function, use the regular read
      verifier to check the block's CRC and spot-check the block contents
      instead of directly calling only the spot-checking routine.  This
      prevents corrupted directory data blocks from being read into the
      kernel, which can lead to garbage ls output and directory loops (if
      say one of the entries contains slashes and other junk).
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      [ luis: backported to 3.16:
        - file rename: fs/xfs/libxfs/xfs_dir2_data.c -> fs/xfs/xfs_dir2_data.c
        - adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      f7aae115
    • Tyler Hicks's avatar
      eCryptfs: Invalidate dcache entries when lower i_nlink is zero · 707c89b2
      Tyler Hicks authored
      commit 5556e7e6 upstream.
      
      Consider eCryptfs dcache entries to be stale when the corresponding
      lower inode's i_nlink count is zero. This solves a problem caused by the
      lower inode being directly modified, without going through the eCryptfs
      mount, leaving stale eCryptfs dentries cached and the eCryptfs inode's
      i_nlink count not being cleared.
      Signed-off-by: default avatarTyler Hicks <tyhicks@canonical.com>
      Reported-by: default avatarRichard Weinberger <richard@nod.at>
      [ luis: backported to 3.16:
        - use dentry->d_inode instead of d_inode(dentry)
        - adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      707c89b2
    • Don Zickus's avatar
      HID: usbhid: Fix the check for HID_RESET_PENDING in hid_io_error · 11fb2166
      Don Zickus authored
      commit 3af4e5a9 upstream.
      
      It was reported that after 10-20 reboots, a usb keyboard plugged
      into a docking station would not work unless it was replugged in.
      
      Using usbmon, it turns out the interrupt URBs were streaming with
      callback errors of -71 for some reason.  The hid-core.c::hid_io_error was
      supposed to retry and then reset, but the reset wasn't really happening.
      
      The check for HID_NO_BANDWIDTH was inverted.  Fix was simple.
      
      Tested by reporter and locally by me by unplugging a keyboard halfway until I
      could recreate a stream of errors but no disconnect.
      Signed-off-by: default avatarDon Zickus <dzickus@redhat.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      11fb2166
    • Shota Suzuki's avatar
      igb: Fix oops caused by missing queue pairing · f6d39fea
      Shota Suzuki authored
      commit 72ddef05 upstream.
      
      When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is
      set if adapter->rss_queues exceeds half of max_rss_queues in
      igb_init_queue_configuration().
      On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of
      queues exceeds half of max_combined in igb_set_channels() when changing
      the number of queues by "ethtool -L".
      In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size
      of adapter->msix_entries[], an overflow can occur in
      igb_set_interrupt_capability(), which in turn leads to an oops.
      
      Fix this problem as follows:
       - When changing the number of queues by "ethtool -L", set
         IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver.
       - When increasing the size of q_vector, reallocate it appropriately.
         (With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.)
      
      Another possible way to fix this problem is to cap the queues at its
      initial number, which is the number of the initial online cpus. But this
      is not the optimal way because we cannot increase queues when another
      cpu becomes online.
      
      Note that before commit cd14ef54 ("igb: Change to use statically
      allocated array for MSIx entries"), this problem did not cause oops
      but just made the number of queues become 1 because of entering msi_only
      mode in igb_set_interrupt_capability().
      
      Fixes: 907b7835 ("igb: Add ethtool support to configure number of channels")
      Signed-off-by: default avatarShota Suzuki <suzuki_shota_t3@lab.ntt.co.jp>
      Tested-by: default avatarAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      f6d39fea
    • David Ward's avatar
      a6b0138e
    • Matthijs Kooijman's avatar
      USB: ftdi_sio: Added custom PID for CustomWare products · f989c3c5
      Matthijs Kooijman authored
      commit 1fb8dc36 upstream.
      
      CustomWare uses the FTDI VID with custom PIDs for their ShipModul MiniPlex
      products.
      Signed-off-by: default avatarMatthijs Kooijman <matthijs@stdin.nl>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      f989c3c5
    • Philipp Hachtmann's avatar
      USB: symbolserial: Use usb_get_serial_port_data · 5ced1dd4
      Philipp Hachtmann authored
      commit 951d3793 upstream.
      
      The driver used usb_get_serial_data(port->serial) which compiled but resulted
      in a NULL pointer being returned (and subsequently used). I did not go deeper
      into this but I guess this is a regression.
      Signed-off-by: default avatarPhilipp Hachtmann <hachti@hachti.de>
      Fixes: a85796ee ("USB: symbolserial: move private-data allocation to
      port_probe")
      Acked-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      5ced1dd4
    • Peter Chen's avatar
      usb: host: ehci-sys: delete useless bus_to_hcd conversion · 416b9ba1
      Peter Chen authored
      commit 0521cfd0 upstream.
      
      The ehci platform device's drvdata is the pointer of struct usb_hcd
      already, so we doesn't need to call bus_to_hcd conversion again.
      Signed-off-by: default avatarPeter Chen <peter.chen@freescale.com>
      Acked-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      416b9ba1
    • Kinglong Mee's avatar
      NFS: Fix a NULL pointer dereference of migration recovery ops for v4.2 client · 6a64d8c4
      Kinglong Mee authored
      commit 18e3b739 upstream.
      
      ---Steps to Reproduce--
      <nfs-server>
      # cat /etc/exports
      /nfs/referal  *(rw,insecure,no_subtree_check,no_root_squash,crossmnt)
      /nfs/old      *(ro,insecure,subtree_check,root_squash,crossmnt)
      
      <nfs-client>
      # mount -t nfs nfs-server:/nfs/ /mnt/
      # ll /mnt/*/
      
      <nfs-server>
      # cat /etc/exports
      /nfs/referal   *(rw,insecure,no_subtree_check,no_root_squash,crossmnt,refer=/nfs/old/@nfs-server)
      /nfs/old       *(ro,insecure,subtree_check,root_squash,crossmnt)
      # service nfs restart
      
      <nfs-client>
      # ll /mnt/*/    --->>>>> oops here
      
      [ 5123.102925] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [ 5123.103363] IP: [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4]
      [ 5123.103752] PGD 587b9067 PUD 3cbf5067 PMD 0
      [ 5123.104131] Oops: 0000 [#1]
      [ 5123.104529] Modules linked in: nfsv4(OE) nfs(OE) fscache(E) nfsd(OE) xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi coretemp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev vmw_balloon parport_pc parport i2c_piix4 shpchp auth_rpcgss nfs_acl vmw_vmci lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi serio_raw scsi_transport_spi e1000 mptscsih mptbase ata_generic pata_acpi [last unloaded: nfsd]
      [ 5123.105887] CPU: 0 PID: 15853 Comm: ::1-manager Tainted: G           OE   4.2.0-rc6+ #214
      [ 5123.106358] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
      [ 5123.106860] task: ffff88007620f300 ti: ffff88005877c000 task.ti: ffff88005877c000
      [ 5123.107363] RIP: 0010:[<ffffffffa03ed38b>]  [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4]
      [ 5123.107909] RSP: 0018:ffff88005877fdb8  EFLAGS: 00010246
      [ 5123.108435] RAX: ffff880053f3bc00 RBX: ffff88006ce6c908 RCX: ffff880053a0d240
      [ 5123.108968] RDX: ffffea0000e6d940 RSI: ffff8800399a0000 RDI: ffff88006ce6c908
      [ 5123.109503] RBP: ffff88005877fe28 R08: ffffffff81c708a0 R09: 0000000000000000
      [ 5123.110045] R10: 00000000000001a2 R11: ffff88003ba7f5c8 R12: ffff880054c55800
      [ 5123.110618] R13: 0000000000000000 R14: ffff880053a0d240 R15: ffff880053a0d240
      [ 5123.111169] FS:  0000000000000000(0000) GS:ffffffff81c27000(0000) knlGS:0000000000000000
      [ 5123.111726] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 5123.112286] CR2: 0000000000000000 CR3: 0000000054cac000 CR4: 00000000001406f0
      [ 5123.112888] Stack:
      [ 5123.113458]  ffffea0000e6d940 ffff8800399a0000 00000000000167d0 0000000000000000
      [ 5123.114049]  0000000000000000 0000000000000000 0000000000000000 00000000a7ec82c6
      [ 5123.114662]  ffff88005877fe18 ffffea0000e6d940 ffff8800399a0000 ffff880054c55800
      [ 5123.115264] Call Trace:
      [ 5123.115868]  [<ffffffffa03fb44b>] nfs4_try_migration+0xbb/0x220 [nfsv4]
      [ 5123.116487]  [<ffffffffa03fcb3b>] nfs4_run_state_manager+0x4ab/0x7b0 [nfsv4]
      [ 5123.117104]  [<ffffffffa03fc690>] ? nfs4_do_reclaim+0x510/0x510 [nfsv4]
      [ 5123.117813]  [<ffffffff810a4527>] kthread+0xd7/0xf0
      [ 5123.118456]  [<ffffffff810a4450>] ? kthread_worker_fn+0x160/0x160
      [ 5123.119108]  [<ffffffff816d9cdf>] ret_from_fork+0x3f/0x70
      [ 5123.119723]  [<ffffffff810a4450>] ? kthread_worker_fn+0x160/0x160
      [ 5123.120329] Code: 4c 8b 6a 58 74 17 eb 52 48 8d 55 a8 89 c6 4c 89 e7 e8 4a b5 ff ff 8b 45 b0 85 c0 74 1c 4c 89 f9 48 8b 55 90 48 8b 75 98 48 89 df <41> ff 55 00 3d e8 d8 ff ff 41 89 c6 74 cf 48 8b 4d c8 65 48 33
      [ 5123.121643] RIP  [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4]
      [ 5123.122308]  RSP <ffff88005877fdb8>
      [ 5123.122942] CR2: 0000000000000000
      
      Fixes: ec011fe8 ("NFS: Introduce a vector of migration recovery ops")
      Signed-off-by: default avatarKinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      6a64d8c4