1. 29 Nov, 2013 40 commits
    • usb: musb: call musb_start() only once in OTG mode · 43ac9e19
      Sebastian Andrzej Siewior authored
      commit ae44df2e upstream.
      
      In commit 001dd84a ("usb: musb: start musb on the udc side, too") it was
      ensured that the state engine is started also in OTG mode after a
      removal / insertion of the gadget.
      Unfortunately this change also introduced a bug: If the device is
      configured as OTG and it is connected to a remote host _without_ loading
      a gadget then we bug() later (because musb->otg->gadget is not
      initialized).
      Initially I assumed it might be nice to have the host part of musb in
      OTG mode working without having a gadget loaded. This bug and the fact that
      it wasn't working like this before the host/gadget split made me realize
      that this was a silly idea.
      This patch now introduces back the old behavior where in OTG mode the
      host mode is only working after the gadget has been loaded.
      
      Cc: Daniel Mack <zonque@gmail.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Felipe Balbi <balbi@ti.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: musb: cancel work on removal · bbc21afd
      Sebastian Andrzej Siewior authored
      commit c5340bd1 upstream.
      
      So I captured this:
      
      |WARNING: CPU: 0 PID: 2078 at /home/bigeasy/work/new/TI/linux/lib/debugobjects.c:260 debug_print_object+0x94/0xc4()
      |ODEBUG: free active (active state 0) object type: work_struct hint: musb_irq_work+0x0/0x38 [musb_hdrc]
      |CPU: 0 PID: 2078 Comm: rmmod Not tainted 3.12.0-rc4+ #338
      |[<c0014d38>] (unwind_backtrace+0x0/0xf4) from [<c001249c>] (show_stack+0x14/0x1c)
      |[<c001249c>] (show_stack+0x14/0x1c) from [<c0037720>] (warn_slowpath_common+0x64/0x84)
      |[<c0037720>] (warn_slowpath_common+0x64/0x84) from [<c00377d4>] (warn_slowpath_fmt+0x30/0x40)
      |[<c00377d4>] (warn_slowpath_fmt+0x30/0x40) from [<c022ae90>] (debug_print_object+0x94/0xc4)
      |[<c022ae90>] (debug_print_object+0x94/0xc4) from [<c022b7e0>] (debug_check_no_obj_freed+0x1c0/0x228)
      |[<c022b7e0>] (debug_check_no_obj_freed+0x1c0/0x228) from [<c00f1f38>] (kfree+0xf8/0x228)
      |[<c00f1f38>] (kfree+0xf8/0x228) from [<c02921c4>] (release_nodes+0x1a8/0x248)
      |[<c02921c4>] (release_nodes+0x1a8/0x248) from [<c028f70c>] (__device_release_driver+0x98/0xf0)
      |[<c028f70c>] (__device_release_driver+0x98/0xf0) from [<c028f840>] (device_release_driver+0x24/0x34)
      |[<c028f840>] (device_release_driver+0x24/0x34) from [<c028ebe8>] (bus_remove_device+0x148/0x15c)
      |[<c028ebe8>] (bus_remove_device+0x148/0x15c) from [<c028d120>] (device_del+0x104/0x1c0)
      |[<c028d120>] (device_del+0x104/0x1c0) from [<c02911e4>] (platform_device_del+0x18/0xac)
      |[<c02911e4>] (platform_device_del+0x18/0xac) from [<c029179c>] (platform_device_unregister+0xc/0x18)
      |[<c029179c>] (platform_device_unregister+0xc/0x18) from [<bf1902fc>] (dsps_remove+0x20/0x4c [musb_dsps])
      |[<bf1902fc>] (dsps_remove+0x20/0x4c [musb_dsps]) from [<c0290d7c>] (platform_drv_remove+0x1c/0x24)
      |[<c0290d7c>] (platform_drv_remove+0x1c/0x24) from [<c028f704>] (__device_release_driver+0x90/0xf0)
      |[<c028f704>] (__device_release_driver+0x90/0xf0) from [<c028f818>] (driver_detach+0xb4/0xb8)
      |[<c028f818>] (driver_detach+0xb4/0xb8) from [<c028e6e8>] (bus_remove_driver+0x98/0xec)
      |[<c028e6e8>] (bus_remove_driver+0x98/0xec) from [<c008fc70>] (SyS_delete_module+0x1e0/0x24c)
      |[<c008fc70>] (SyS_delete_module+0x1e0/0x24c) from [<c000e680>] (ret_fast_syscall+0x0/0x48)
      |---[ end trace d79045419a3e51ec ]---
      
      The workqueue is only scheduled from the ep0 and never canceled in case
      the musb is removed before the work has a chance to run.
      
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Felipe Balbi <balbi@ti.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rt2800usb: slow down TX status polling · 3cc3e73b
      Stanislaw Gruszka authored
      commit 36165fd5 upstream.
      
      Polling TX statuses too frequently has two negative effects. The first is
      random peaks in CPU usage, which delay the system as a whole.
      The second is that the device is not able to fill TX statuses in the
      H/W register on some workloads and we get a lot of timeouts like below:
      
      ieee80211 phy4: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 7 in queue 2
      ieee80211 phy4: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 7 in queue 2
      ieee80211 phy4: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping
      
      This not only causes a flood of messages in dmesg, but also bad throughput,
      since the rate scaling algorithm cannot work optimally.
      
      In the future, we should probably make the polling interval adjust
      automatically, but for now just increase the values, which makes the
      mentioned problems go away.
      
      Resolve:
      https://bugzilla.kernel.org/show_bug.cgi?id=62781
      
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: wusbcore: set the RPIPE wMaxPacketSize value correctly · 76a8bf9e
      Thomas Pugliese authored
      commit 7b6bc07a upstream.
      
      For isochronous endpoints, set the RPIPE wMaxPacketSize value using
      wOverTheAirPacketSize from the endpoint companion descriptor instead of
      wMaxPacketSize from the normal endpoint descriptor.
      
      Signed-off-by: Thomas Pugliese <thomas.pugliese@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: hub: Clear Port Reset Change during init/resume · 38fee62c
      Julius Werner authored
      commit e92aee33 upstream.
      
      This patch adds the Port Reset Change flag to the set of bits that are
      preemptively cleared on init/resume of a hub. In theory this bit should
      never be set unexpectedly... in practice it can still happen if BIOS,
      SMM or ACPI code plays around with USB devices without cleaning up
      correctly. This is especially dangerous for XHCI root hubs, which don't
      generate any more Port Status Change Events until all change bits are
      cleared, so this is a good precaution to have (similar to how it's
      already done for the Warm Port Reset Change flag).
      
      Signed-off-by: Julius Werner <jwerner@chromium.org>
      Acked-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: Disable USB 2.0 Link PM before device reset. · 692a66b4
      Sarah Sharp authored
      commit dcc01c08 upstream.
      
      Before the USB core resets a device, we need to disable the L1 timeout
      for the roothub, if USB 2.0 Link PM is enabled.  Otherwise the port may
      transition into L1 in between descriptor fetches, before we know if the
      USB device descriptors changed.  LPM will be re-enabled after the
      full device descriptors are fetched, and we can confirm the device still
      supports USB 2.0 LPM after the reset.
      
      We don't need to wait for the USB device to exit L1 before resetting the
      device, since the xHCI roothub port diagrams show a transition to the
      Reset state from any of the Ux states (see Figure 34 in the 2012-08-14
      xHCI specification update).
      
      This patch should be backported to kernels as old as 3.2, that contain
      the commit 65580b43 "xHCI: set USB2 hardware LPM".  That was the first
      commit to enable USB 2.0 hardware-driven Link Power Management.
      
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xhci: Set L1 device slot on USB2 LPM enable/disable. · 6f181102
      Sarah Sharp authored
      commit 58e21f73 upstream.
      
      To enable USB 2.0 Link Power Management (LPM), the xHCI host controller
      needs the device slot ID to generate the device address used in L1 entry
      tokens.  That information is set in the L1 device slot ID field of the
      USB 2.0 LPM registers.
      
      Currently, the L1 device slot ID is overwritten when the xHCI driver
      initiates the software test of USB 2.0 Link PM in
      xhci_usb2_software_lpm_test.  It is never cleared when USB 2.0 Link PM
      is disabled for the device.  That should be harmless, because the
      Hardware LPM Enable (HLE) bit is cleared when USB 2.0 Link PM is
      disabled, so the host should not pay attention to the slot ID.
      
      This patch should have no effect on host behavior, but since
      xhci_usb2_software_lpm_test is going away in an upcoming bug fix patch,
      we need to move that code to the function that enables and disables USB
      2.0 Link PM.
      
      This patch should be backported to kernels as old as 3.11, that contain
      the commit a558ccdc "usb: xhci: add USB2 Link power management BESL
      support".  The upcoming bug fix patch is also marked for that stable
      kernel.
      
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xhci: Enable LPM support only for hardwired or BESL devices · 2fbe9566
      Mathias Nyman authored
      commit 890dae88 upstream.
      
      Some usb3 devices falsely claim they support usb2 hardware Link PM
      when connected to a usb2 port. We only trust hardwired devices
      or devices with the later BESL LPM support to be LPM enabled as default.
      
      [Note: Sarah re-worked the original patch to move the code into the USB
      core, and updated it to check whether the USB device supports BESL,
      instead of checking if the xHCI port it's connected to supports BESL
      encoding.]
      
      This patch should be backported to kernels as old as 3.11, that
      contain the commit a558ccdc "usb: xhci: add USB2 Link power management
      BESL support".  Without this fix, some USB 3.0 devices will not
      enumerate or work properly under USB 2.0 ports on Haswell-ULT systems.
      
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: Don't enable USB 2.0 Link PM by default. · c1e847c7
      Sarah Sharp authored
      commit de68bab4 upstream.
      
      How it's supposed to work:
      --------------------------
      
      USB 2.0 Link PM is a lower power state that some newer USB 2.0 devices
      support.  USB 3.0 devices certified by the USB-IF are required to
      support it if they are plugged into a USB 2.0 only port, or a USB 2.0
      cable is used.  USB 2.0 Link PM requires both a USB device and a host
      controller that supports USB 2.0 hardware-enabled LPM.
      
      USB 2.0 Link PM is designed to be enabled once by software, and the host
      hardware handles transitions to the L1 state automatically.  The premise
      of USB 2.0 Link PM is to be able to put the device into a lower power
      link state when the bus is idle or the device NAKs USB IN transfers for
      a specified amount of time.
      
      ...but hardware is broken:
      --------------------------
      
      It turns out many USB 3.0 devices claim to support USB 2.0 Link PM (by
      setting the LPM bit in their USB 2.0 BOS descriptor), but they don't
      actually implement it correctly.  This manifests as the USB device
      refusing to respond to transfers when it is plugged into a USB 2.0 only
      port under the Haswell-ULT/Lynx Point LP xHCI host.
      
      These devices pass the xHCI driver's simple test to enable USB 2.0 Link
      PM, wait for the port to enter L1, and then bring it back into L0.  They
      only start to break when L1 entry is interleaved with transfers.
      
      Some devices then fail to respond to the next control transfer (usually
      a Set Configuration).  This results in devices never enumerating.
      
      Other mass storage devices (such as a later model Western Digital My
      Passport USB 3.0 hard drive) respond fine to going into L1 between
      control transfers.  They ACK the entry, come out of L1 when the host
      needs to send a control transfer, and respond properly to those control
      transfers.  However, when the first READ10 SCSI command is sent, the
      device NAKs the data phase while it's reading from the spinning disk.
      Eventually, the host requests to put the link into L1, and the device
      ACKs that request.  Then it never responds to the data phase of the
      READ10 command.  This results in not being able to read from the drive.
      
      Some mass storage devices (like the Corsair Survivor USB 3.0 flash
      drive) are well behaved.  They ACK the entry into L1 during control
      transfers, and when SCSI commands start coming in, they NAK the requests
      to go into L1, because they need to be at full power.
      
      Not all USB 3.0 devices advertise USB 2.0 link PM support.  My Point
      Grey USB 3.0 webcam advertises itself as a USB 2.1 device, but doesn't
      have a USB 2.0 BOS descriptor, so we don't enable USB 2.0 Link PM.  I
      suspect that means the device isn't certified.
      
      What do we do about it?
      -----------------------
      
      There's really no good way for the kernel to test these devices.
      Therefore, the kernel needs to disable USB 2.0 Link PM by default, and
      distros will have to enable it by writing 1 to the sysfs file
      /sys/bus/usb/devices/../power/usb2_hardware_lpm.  Rip out the xHCI Link
      PM test, since it's not sufficient to detect these buggy devices, and
      don't automatically enable LPM after the device is addressed.
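The opt-in described above amounts to writing the character 1 into the per-device sysfs attribute. A minimal userspace sketch, assuming the caller supplies the full attribute path (the device directory is elided in the text and is not guessed here); the function name enable_usb2_hardware_lpm() is hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write "1" to the attribute file at `path`.
 * Returns 0 on success and -1 if the file cannot be opened or written.
 * `path` is whatever .../power/usb2_hardware_lpm file the caller resolved.
 */
int enable_usb2_hardware_lpm(const char *path)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;

    ok = (fputs("1", f) >= 0);
    if (fclose(f) != 0)     /* sysfs write errors can surface at close time */
        ok = 0;
    return ok ? 0 : -1;
}
```

A distro would typically do the same thing from a udev rule or a shell one-liner; the C form is only meant to show that nothing more than a plain file write is involved.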
      
      This patch should be backported to kernels as old as 3.11, that
      contain the commit a558ccdc "usb: xhci: add USB2 Link power management
      BESL support".  Without this fix, some USB 3.0 devices will not
      enumerate or work properly under USB 2.0 ports on Haswell-ULT systems.
      
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mei: nfc: fix memory leak in error path · a58c56c0
      Tomas Winkler authored
      commit 4bff7208 upstream.
      
      The flow may reach the err label without freeing cl and cl_info.
      
      cl and cl_info weren't assigned to ndev->cl and ndev->cl_info,
      so they weren't freed in mei_nfc_free() called on the error path.
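The leak pattern can be illustrated in plain C. This is a sketch of the general shape only, not the driver's code: struct nfc_dev, dev_free(), nfc_init() and the allocation counter are hypothetical stand-ins. The point is that locals which have not yet been handed to the device struct must be freed explicitly on the error path, because the struct's own cleanup cannot see them.

```c
#include <assert.h>
#include <stdlib.h>

int live_allocs;    /* outstanding allocations, tracked so a leak is visible */

void *tracked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p)
        live_allocs++;
    return p;
}

void tracked_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

struct nfc_dev { void *cl; void *cl_info; };

/* Cleanup routine: frees only what the device struct owns. */
void dev_free(struct nfc_dev *ndev)
{
    tracked_free(ndev->cl);
    tracked_free(ndev->cl_info);
    ndev->cl = NULL;
    ndev->cl_info = NULL;
}

/* fail_before_assign != 0 simulates an error before ownership is transferred. */
int nfc_init(struct nfc_dev *ndev, int fail_before_assign)
{
    void *cl = tracked_alloc(16);
    void *cl_info = tracked_alloc(16);

    if (!cl || !cl_info || fail_before_assign)
        goto err;

    ndev->cl = cl;          /* from here on dev_free() is responsible */
    ndev->cl_info = cl_info;
    return 0;

err:
    tracked_free(cl);       /* not yet owned by ndev, so free them here */
    tracked_free(cl_info);
    dev_free(ndev);
    return -1;
}
```

Dropping the two tracked_free() calls under the err label reproduces the leak: dev_free() sees NULL members and releases nothing.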
      
      Cc: Samuel Ortiz <sameo@linux.intel.com>
      Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • SUNRPC: Avoid deep recursion in rpc_release_client · 9dae4dbe
      Trond Myklebust authored
      commit d07ba842 upstream.
      
      In cases where an rpc client has a parent hierarchy, then
      rpc_free_client may end up calling rpc_release_client() on the
      parent, thus recursing back into rpc_free_client. If the hierarchy
      is deep enough, then we can get into situations where the stack
      simply overflows.
      
      The fix is to have rpc_release_client() loop so that it can take
      care of the parent rpc client hierarchy without needing to
      recurse.
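The recursion-to-loop change can be sketched in userspace C. The types and helpers below (struct client, free_client(), release_client()) are hypothetical models and do not reproduce the SUNRPC code; the assumption is simply that freeing a client drops one reference on its parent.

```c
#include <assert.h>
#include <stddef.h>

struct client {
    int refcount;
    struct client *parent;  /* NULL for the root of the hierarchy */
    int freed;              /* set instead of calling free(), for inspection */
};

/* Tear down one client and report whose reference must be dropped next. */
struct client *free_client(struct client *c)
{
    struct client *parent = c->parent;
    c->freed = 1;
    return parent;
}

/* Iterative release: walks up the parent chain instead of recursing. */
void release_client(struct client *c)
{
    while (c != NULL) {
        assert(c->refcount > 0);
        if (--c->refcount > 0)
            break;              /* still in use, stop here */
        c = free_client(c);     /* continue with the parent, same stack frame */
    }
}
```

Stack usage stays constant no matter how deep the hierarchy is, which is the property the fix is after.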
      
      Reported-by: Jeff Layton <jlayton@redhat.com>
      Reported-by: Weston Andros Adamson <dros@netapp.com>
      Reported-by: Bruce Fields <bfields@fieldses.org>
      Link: http://lkml.kernel.org/r/2C73011F-0939-434C-9E4D-13A1EB1403D7@netapp.com
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • SUNRPC: Fix a data corruption issue when retransmitting RPC calls · 2181c6aa
      Trond Myklebust authored
      commit a6b31d18 upstream.
      
      The following scenario can cause silent data corruption when doing
      NFS writes. It has mainly been observed when doing database writes
      using O_DIRECT.
      
      1) The RPC client uses sendpage() to do zero-copy of the page data.
      2) Due to networking issues, the reply from the server is delayed,
         and so the RPC client times out.
      
      3) The client issues a second sendpage of the page data as part of
         an RPC call retransmission.
      
      4) The reply to the first transmission arrives from the server
         _before_ the client hardware has emptied the TCP socket send
         buffer.
      5) After processing the reply, the RPC state machine rules that
         the call is done, and triggers the completion callbacks.
      6) The application notices the RPC call is done, and reuses the
         pages to store something else (e.g. a new write).
      
      7) The client NIC drains the TCP socket send buffer. Since the
         page data has now changed, it reads a corrupted version of the
         initial RPC call, and puts it on the wire.
      
      This patch fixes the problem in the following manner:
      
      The ordering guarantees of TCP ensure that when the server sends a
      reply, then we know that the _first_ transmission has completed. Using
      zero-copy in that situation is therefore safe.
      If a time out occurs, we then send the retransmission using sendmsg()
      (i.e. no zero-copy). We then know that the socket contains a full copy of
      the data, and so it will retransmit a faithful reproduction even if the
      RPC call completes, and the application reuses the O_DIRECT buffer in
      the meantime.
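The difference between the two send paths can be modelled without any networking. In this sketch (all names hypothetical, no real socket calls involved) a queued transmission either keeps a reference to the caller's page, as a zero-copy send would, or holds a private copy, as a copying send would. Reusing the page before the queue drains only affects the former.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAYLOAD_MAX 32

struct queued_send {
    const char *page_ref;       /* set on the zero-copy path */
    char copy[PAYLOAD_MAX];     /* filled on the copying path */
    size_t len;
};

/* Zero-copy: the queue only remembers where the data lives. */
void queue_zero_copy(struct queued_send *q, const char *page, size_t len)
{
    q->page_ref = page;
    q->len = len;
}

/* Copying send: the queue owns a snapshot of the data. */
void queue_copy(struct queued_send *q, const char *page, size_t len)
{
    assert(len <= PAYLOAD_MAX);
    memcpy(q->copy, page, len);
    q->page_ref = NULL;
    q->len = len;
}

/* What would reach the wire when the queue is finally drained. */
const char *wire_data(const struct queued_send *q)
{
    return q->page_ref ? q->page_ref : q->copy;
}
```

If the application overwrites the page between queueing and draining, wire_data() on the zero-copy entry yields the new contents, while the copied entry still yields the original write.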
      
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • SUNRPC: gss_alloc_msg - choose _either_ a v0 message or a v1 message · 1b1207b1
      Trond Myklebust authored
      commit 5fccc5b5 upstream.
      
      Add the missing 'break' to ensure that we don't corrupt a legacy 'v0' type
      message by appending the 'v1'.
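The effect of a missing break is easy to reproduce in isolation. The message strings and function names below are made up for illustration and have nothing to do with the real upcall formats.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append the (hypothetical) v1 text form to whatever buf already holds. */
void append_v1(char *buf, size_t size)
{
    size_t used = strlen(buf);
    snprintf(buf + used, size - used, "v1:text");
}

/* Buggy variant: case 0 falls through and the v1 form is appended as well. */
void build_msg_buggy(char *buf, size_t size, int version)
{
    assert(size > 0);
    buf[0] = '\0';
    switch (version) {
    case 0:
        snprintf(buf, size, "v0:legacy");
        /* fall through */
    default:
        append_v1(buf, size);
    }
}

/* Fixed variant: exactly one format is written. */
void build_msg_fixed(char *buf, size_t size, int version)
{
    assert(size > 0);
    buf[0] = '\0';
    switch (version) {
    case 0:
        snprintf(buf, size, "v0:legacy");
        break;
    default:
        append_v1(buf, size);
    }
}
```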
      
      Cc: Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • slub: Handle NULL parameter in kmem_cache_flags · 54fc381e
      Christoph Lameter authored
      commit c6f58d9b upstream.
      
      Andreas Herrmann writes:
      
        When I've used slub_debug kernel option (e.g.
        "slub_debug=,skbuff_fclone_cache" or similar) on a debug session I've
        seen a panic like:
      
          Highbank #setenv bootargs console=ttyAMA0 root=/dev/sda2 kgdboc.kgdboc=ttyAMA0,115200 slub_debug=,kmalloc-4096 earlyprintk=ttyAMA0
          ...
          Unable to handle kernel NULL pointer dereference at virtual address 00000000
          pgd = c0004000
          [00000000] *pgd=00000000
          Internal error: Oops: 5 [#1] SMP ARM
          Modules linked in:
          CPU: 0 PID: 0 Comm: swapper Tainted: G        W    3.12.0-00048-gbe408cd3 #314
          task: c0898360 ti: c088a000 task.ti: c088a000
          PC is at strncmp+0x1c/0x84
          LR is at kmem_cache_flags.isra.46.part.47+0x44/0x60
          pc : [<c02c6da0>]    lr : [<c0110a3c>]    psr: 200001d3
          sp : c088bea8  ip : c088beb8  fp : c088beb4
          r10: 00000000  r9 : 413fc090  r8 : 00000001
          r7 : 00000000  r6 : c2984a08  r5 : c0966e78  r4 : 00000000
          r3 : 0000006b  r2 : 0000000c  r1 : 00000000  r0 : c2984a08
          Flags: nzCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
          Control: 10c5387d  Table: 0000404a  DAC: 00000015
          Process swapper (pid: 0, stack limit = 0xc088a248)
          Stack: (0xc088bea8 to 0xc088c000)
          bea0:                   c088bed4 c088beb8 c0110a3c c02c6d90 c0966e78 00000040
          bec0: ef001f00 00000040 c088bf14 c088bed8 c0112070 c0110a04 00000005 c010fac8
          bee0: c088bf5c c088bef0 c010fac8 ef001f00 00000040 00000000 00000040 00000001
          bf00: 413fc090 00000000 c088bf34 c088bf18 c0839190 c0112040 00000000 ef001f00
          bf20: 00000000 00000000 c088bf54 c088bf38 c0839200 c083914c 00000006 c0961c4c
          bf40: c0961c28 00000000 c088bf7c c088bf58 c08392ac c08391c0 c08a2ed8 c0966e78
          bf60: c086b874 c08a3f50 c0961c28 00000001 c088bfb4 c088bf80 c083b258 c0839248
          bf80: 2f800000 0f000000 c08935b4 ffffffff c08cd400 ffffffff c08cd400 c0868408
          bfa0: c29849c0 00000000 c088bff4 c088bfb8 c0824974 c083b1e4 ffffffff ffffffff
          bfc0: c08245c0 00000000 00000000 c0868408 00000000 10c5387d c0892bcc c0868404
          bfe0: c0899440 0000406a 00000000 c088bff8 00008074 c0824824 00000000 00000000
          [<c02c6da0>] (strncmp+0x1c/0x84) from [<c0110a3c>] (kmem_cache_flags.isra.46.part.47+0x44/0x60)
          [<c0110a3c>] (kmem_cache_flags.isra.46.part.47+0x44/0x60) from [<c0112070>] (__kmem_cache_create+0x3c/0x410)
          [<c0112070>] (__kmem_cache_create+0x3c/0x410) from [<c0839190>] (create_boot_cache+0x50/0x74)
          [<c0839190>] (create_boot_cache+0x50/0x74) from [<c0839200>] (create_kmalloc_cache+0x4c/0x88)
          [<c0839200>] (create_kmalloc_cache+0x4c/0x88) from [<c08392ac>] (create_kmalloc_caches+0x70/0x114)
          [<c08392ac>] (create_kmalloc_caches+0x70/0x114) from [<c083b258>] (kmem_cache_init+0x80/0xe0)
          [<c083b258>] (kmem_cache_init+0x80/0xe0) from [<c0824974>] (start_kernel+0x15c/0x318)
          [<c0824974>] (start_kernel+0x15c/0x318) from [<00008074>] (0x8074)
          Code: e3520000 01a00002 089da800 e5d03000 (e5d1c000)
          ---[ end trace 1b75b31a2719ed1d ]---
          Kernel panic - not syncing: Fatal exception
      
        Problem is that slub_debug option is not parsed before
        create_boot_cache is called. Solve this by changing slub_debug to
        early_param.
      
        Kernels 3.11, 3.10 are also affected.  I am not sure about older
        kernels.
      
      Christoph Lameter explains:
      
        kmem_cache_flags may be called with NULL parameter during early boot.
        Skip the test in that case.
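A userspace sketch of the guard, assuming the name filter is compared as a prefix with strncmp() as the backtrace suggests. DEBUG_FLAG, cache_flags() and the parameter names are hypothetical; the only point is that a NULL name must short-circuit before the string comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEBUG_FLAG 0x1u

/*
 * Add the debug flag when no filter is set, or when `name` starts with the
 * filter string. A NULL name (as can happen for early boot caches) skips
 * the comparison instead of being passed to strncmp().
 */
unsigned int cache_flags(unsigned int flags, const char *name,
                         const char *debug_filter)
{
    if (debug_filter == NULL ||
        (name != NULL &&
         strncmp(debug_filter, name, strlen(debug_filter)) == 0))
        flags |= DEBUG_FLAG;
    return flags;
}
```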
      
      Reported-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/pseries: Duplicate dtl entries sometimes sent to userspace · 057a7c69
      Anton Blanchard authored
      commit 84b07386 upstream.
      
      When reading from the dispatch trace log (dtl) userspace interface, I
      sometimes see duplicate entries. One example:
      
      # hexdump -C dtl.out
      
      00000000  07 04 00 0c 00 00 48 44  00 00 00 00 00 00 00 00
      00000010  00 0c a0 b4 16 83 6d 68  00 00 00 00 00 00 00 00
      00000020  00 00 00 00 10 00 13 50  80 00 00 00 00 00 d0 32
      
      00000030  07 04 00 0c 00 00 48 44  00 00 00 00 00 00 00 00
      00000040  00 0c a0 b4 16 83 6d 68  00 00 00 00 00 00 00 00
      00000050  00 00 00 00 10 00 13 50  80 00 00 00 00 00 d0 32
      
      The problem is in scan_dispatch_log() where we call dtl_consumer()
      but bail out before incrementing the index.
      
      To fix this I moved dtl_consumer() after the timebase comparison.
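The ordering problem can be modelled with a small array standing in for the log. This is a simplified sketch with hypothetical names, not the real scan_dispatch_log(): the buggy loop hands an entry to the consumer and then bails out without advancing the index, so the next scan hands over the same entry again.

```c
#include <assert.h>

#define NENTRIES 4

struct dtl_state {
    unsigned long timebase[NENTRIES];   /* timestamp of each log entry */
    int index;                          /* next entry to process */
    int consumed[NENTRIES];             /* how often each entry was consumed */
};

void consume(struct dtl_state *s, int i)
{
    s->consumed[i]++;
}

/* Buggy order: consume first, then possibly bail out before index++. */
void scan_buggy(struct dtl_state *s, unsigned long stop_tb)
{
    assert(s->index >= 0);
    while (s->index < NENTRIES) {
        consume(s, s->index);
        if (s->timebase[s->index] > stop_tb)
            break;
        s->index++;
    }
}

/* Fixed order: compare the timebase first, consume only entries we keep. */
void scan_fixed(struct dtl_state *s, unsigned long stop_tb)
{
    assert(s->index >= 0);
    while (s->index < NENTRIES) {
        if (s->timebase[s->index] > stop_tb)
            break;
        consume(s, s->index);
        s->index++;
    }
}
```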
      
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/eeh: Enable PCI_COMMAND_MASTER for PCI bridges · 80e6b610
      Gavin Shan authored
      commit bf898ec5 upstream.
      
      On PHB3, we will fail to fetch IODA tables without PCI_COMMAND_MASTER
      on PCI bridges. According to one experiment I did, MSI-X interrupts
      were not raised by the adapter without the bit applied to all upstream
      PCI bridges, including the root port of the adapter. The patch forces
      that bit to be enabled accordingly.
      
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/signals: Mark VSX not saved with small contexts · 3f0387f7
      Michael Neuling authored
      commit c13f20ac upstream.
      
      The VSX MSR bit in the user context indicates if the context contains VSX
      state.  Currently we set this when the process has touched VSX at any stage.
      
      Unfortunately, if the user has not provided enough space to save the VSX state,
      we can't save it but we currently still set the MSR VSX bit.
      
      This patch changes this to clear the MSR VSX bit when the user doesn't provide
      enough space.  This indicates that there is no valid VSX state in the user
      context.
      
      This is needed to support get/set/make/swapcontext for applications that use
      VSX but only provide a small context.  For example, getcontext in glibc
      provides a smaller context since the VSX registers don't need to be saved over
      the glibc function call.  But since the program calling getcontext may have
      used VSX, the kernel currently says the VSX state is valid when it's not.  If
      the returned context is then used in setcontext (ie. a small context without
      VSX but with MSR VSX set), the kernel will refuse the context.  This situation
      has been reported by the glibc community.
      
      Based on patch from Carlos O'Donell.
      
      Tested-by: Haren Myneni <haren@linux.vnet.ibm.com>
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc: Fix __get_user_pages_fast() irq handling · 9e5139b7
      Heiko Carstens authored
      commit 95f715b0 upstream.
      
      __get_user_pages_fast() may be called with interrupts disabled (see e.g.
      get_futex_key() in kernel/futex.c) and therefore should use local_irq_save()
      and local_irq_restore() instead of local_irq_disable()/enable().
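Why save/restore matters can be shown with a simulated interrupt flag. Everything below is a userspace model with hypothetical sim_* names; it does not call the kernel primitives. A disable/enable pair unconditionally leaves interrupts on, which is wrong when the caller entered with them off.

```c
#include <assert.h>

int sim_irqs_on = 1;    /* simulated CPU interrupt-enable flag */

void sim_irq_disable(void) { sim_irqs_on = 0; }
void sim_irq_enable(void)  { sim_irqs_on = 1; }

/* Remember the previous state, then disable. */
int sim_irq_save(void)
{
    int flags = sim_irqs_on;
    sim_irqs_on = 0;
    return flags;
}

/* Put back exactly the state that was saved. */
void sim_irq_restore(int flags)
{
    assert(flags == 0 || flags == 1);
    sim_irqs_on = flags;
}

/* Critical section using disable/enable: clobbers the caller's state. */
void walk_pages_disable_enable(void)
{
    sim_irq_disable();
    /* ... page table walk would go here ... */
    sim_irq_enable();
}

/* Critical section using save/restore: caller's state is preserved. */
void walk_pages_save_restore(void)
{
    int flags = sim_irq_save();
    /* ... page table walk would go here ... */
    sim_irq_restore(flags);
}
```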
      
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc: ppc64 address space capped at 32TB, mmap randomisation disabled · 3aad6072
      Anton Blanchard authored
      commit 5a049f14 upstream.
      
      Commit fba2369e (mm: use vm_unmapped_area() on powerpc architecture)
      has a bug in slice_scan_available() where we compare an unsigned long
      (high_slices) against a shifted int. As a result, comparisons against
      the top 32 bits of high_slices (representing the top 32TB) always
      return 0 and the top of our mmap region is clamped at 32TB.
      
      This also breaks mmap randomisation since the randomised address is
      always up near the top of the address space and it gets clamped down
      to 32TB.
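The width mismatch can be modelled safely in C. Shifting a 32-bit value by 32 or more is undefined behaviour, so this sketch makes the loss explicit by truncating a 64-bit mask to 32 bits; that matches the "always return 0" symptom described above but is a model, not the original expression. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mask as it would look if computed in a 32-bit type: high bits are lost. */
uint64_t mask_32bit(unsigned int slice)
{
    assert(slice < 64);
    return (uint32_t)((uint64_t)1 << slice);
}

/* Mask computed in the same 64-bit width as the bitmap it is tested against. */
uint64_t mask_64bit(unsigned int slice)
{
    assert(slice < 64);
    return (uint64_t)1 << slice;
}

int slice_set_32bit(uint64_t high_slices, unsigned int slice)
{
    return (high_slices & mask_32bit(slice)) != 0;
}

int slice_set_64bit(uint64_t high_slices, unsigned int slice)
{
    return (high_slices & mask_64bit(slice)) != 0;
}
```

With every bit of the bitmap set, slice_set_32bit() still reports slice 40 as unavailable, while slice_set_64bit() reports it as available.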
      
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Acked-by: Michel Lespinasse <walken@google.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/powernv: Add PE to its own PELTV · 9a439965
      Gavin Shan authored
      commit 631ad691 upstream.
      
      We need to add the PE to its own PELTV. Otherwise, errors originating
      from the PE might contribute to other PEs. As a result, we can't
      clear the error successfully even though we're checking and clearing
      errors during access to PCI config space.
      
      Reported-by: kalshett@in.ibm.com
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/vio: use strcpy in modalias_show · f457589d
      Prarit Bhargava authored
      commit 411cabf7 upstream.
      
      Commit e82b89a6 used strcat instead of
      strcpy which can result in an overflow of newlines on the buffer.
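The difference between the two calls shows up as soon as the output buffer is not empty. A small sketch with hypothetical helper names, assuming the function is meant to produce just a newline:

```c
#include <assert.h>
#include <string.h>

/* strcat() appends after whatever the buffer already contains. */
void emit_newline_strcat(char *buf)
{
    assert(buf != NULL);
    strcat(buf, "\n");
}

/* strcpy() overwrites, so the result is independent of earlier contents. */
void emit_newline_strcpy(char *buf)
{
    assert(buf != NULL);
    strcpy(buf, "\n");
}
```

Calling emit_newline_strcat() repeatedly on the same buffer keeps growing it by one newline per call, and on a buffer with no terminating NUL it would run off the end. emit_newline_strcpy() always leaves exactly one newline.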
      
      Signed-off-by: Prarit Bhargava
      Cc: benh@kernel.crashing.org
      Cc: ben@decadent.org.uk
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/mpc512x: silence build warning upon disabled DIU · 12fbe485
      Gerhard Sittig authored
      commit 45d20e83 upstream.
      
      a disabled Kconfig option results in a reference to a not implemented
      routine when the IS_ENABLED() macro is used for both conditional
      implementation of the routine as well as a C language source code test
      at the call site -- the "if (0) func();" construct only gets eliminated
      later by the optimizer, while the compiler already has emitted its
      warning about "func()" being undeclared
      
      provide an empty implementation for the mpc512x_setup_diu() and
      mpc512x_init_diu() routines in case of the disabled option, to avoid the
      compiler warning which is considered fatal and breaks compilation
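The stub pattern can be sketched outside the kernel. HAVE_DIU and DIU_ENABLED are hypothetical stand-ins for the Kconfig option and the IS_ENABLED() test, and the routine names are shortened. The call site is ordinary C that must compile in both configurations, so the disabled case needs empty definitions rather than no declarations at all.

```c
#include <assert.h>

int diu_calls;  /* counts real DIU routine invocations, for inspection */

#ifdef HAVE_DIU
#define DIU_ENABLED 1
void setup_diu(void) { diu_calls++; }
void init_diu(void)  { diu_calls++; }
#else
#define DIU_ENABLED 0
/* Empty stubs: keep the call site compilable when the option is off. */
static inline void setup_diu(void) { }
static inline void init_diu(void)  { }
#endif

void board_init(void)
{
    /* Dead code when disabled, but still parsed and type-checked. */
    if (DIU_ENABLED) {
        init_diu();
        setup_diu();
    }
}
```

Removing the #else branch and building without HAVE_DIU gives an implicit-declaration diagnostic at the call site, which is fatal under -Werror.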
      
      the bug appeared with commit 2abbbb63
      "powerpc/mpc512x: move common code to shared.c file", how to reproduce:
      
        make mpc512x_defconfig
        echo CONFIG_FB_FSL_DIU=n >> .config && make olddefconfig
        make
      
          CC      arch/powerpc/platforms/512x/mpc512x_shared.o
        .../arch/powerpc/platforms/512x/mpc512x_shared.c: In function 'mpc512x_init_early':
        .../arch/powerpc/platforms/512x/mpc512x_shared.c:456:3: error: implicit declaration of function 'mpc512x_init_diu' [-Werror=implicit-function-declaration]
        .../arch/powerpc/platforms/512x/mpc512x_shared.c: In function 'mpc512x_setup_arch':
        .../arch/powerpc/platforms/512x/mpc512x_shared.c:469:3: error: implicit declaration of function 'mpc512x_setup_diu' [-Werror=implicit-function-declaration]
        cc1: all warnings being treated as errors
        make[4]: *** [arch/powerpc/platforms/512x/mpc512x_shared.o] Error 1
      Signed-off-by: Gerhard Sittig <gsi@denx.de>
      Signed-off-by: Anatolij Gustschin <agust@denx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      12fbe485
    • Anatolij Gustschin's avatar
      powerpc/52xx: fix build breakage for MPC5200 LPBFIFO module · 5e2bc1c5
      Anatolij Gustschin authored
      commit 2bf75084 upstream.
      
      The MPC5200 LPBFIFO driver requires the bestcomm module to be
      enabled, otherwise building will fail. Fix it.
      Reported-by: Wolfgang Denk <wd@denx.de>
      Signed-off-by: Anatolij Gustschin <agust@denx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5e2bc1c5
    • Mike Snitzer's avatar
      block: properly stack underlying max_segment_size to DM device · fa9d73ef
      Mike Snitzer authored
      commit d82ae52e upstream.
      
      Without this patch all DM devices will default to BLK_MAX_SEGMENT_SIZE
      (65536) even if the underlying device(s) have a larger value -- this is
      due to blk_stack_limits() using min_not_zero() when stacking the
      max_segment_size limit.
      
      underlying device:
      1073741824
      
      before patch:
      65536
      
      after patch:
      1073741824
      Reported-by: Lukasz Flis <l.flis@cyfronet.pl>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fa9d73ef
    • Mikulas Patocka's avatar
      block: fix a probe argument to blk_register_region · 4ceb8127
      Mikulas Patocka authored
      commit a207f593 upstream.
      
      The probe function is supposed to return NULL on failure, as we can see in
      kobj_lookup: kobj = probe(dev, index, data); ... if (kobj) return kobj;
      
      However, in loop and brd, it returns negative error from ERR_PTR.
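
      A userspace sketch of why this matters, assuming a caller that only
      tests the probe result against NULL. The err_ptr() helper imitates
      ERR_PTR(), and all names and the error value are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct kobj { int id; };	/* hypothetical stand-in for struct kobject */

/* Userspace imitation of ERR_PTR(): encode a negative errno as a pointer. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }

/* Buggy probe: reports failure as an error pointer. */
static struct kobj *probe_err_ptr(void) { return err_ptr(-12); }

/* Fixed probe: reports failure as NULL, which is what the caller tests. */
static struct kobj *probe_null(void) { return NULL; }

/* Returns 1 if the caller would treat the probe result as a valid object. */
static int lookup_accepts(struct kobj *(*probe)(void))
{
	struct kobj *kobj = probe();

	return kobj != NULL;
}
```

      The error pointer passes the NULL test, so the caller goes on to use it
      as if it were a real object.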
      
      This causes a crash if we simulate disk allocation failure and run
      less -f /dev/loop0 because the negative number is interpreted as a pointer:
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000002b4
      IP: [<ffffffff8118b188>] __blkdev_get+0x28/0x450
      PGD 23c677067 PUD 23d6d1067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in: loop hpfs nvidia(PO) ip6table_filter ip6_tables uvesafb cfbcopyarea cfbimgblt cfbfillrect fbcon font bitblit fbcon_rotate fbcon_cw fbcon_ud fbcon_ccw softcursor fb fbdev msr ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_state ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc tun ipv6 cpufreq_stats cpufreq_ondemand cpufreq_userspace cpufreq_powersave cpufreq_conservative hid_generic spadfs usbhid hid fuse raid0 snd_usb_audio snd_pcm_oss snd_mixer_oss md_mod snd_pcm snd_timer snd_page_alloc snd_hwdep snd_usbmidi_lib dmi_sysfs snd_rawmidi nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack snd soundcore lm85 hwmon_vid ohci_hcd ehci_pci ehci_hcd serverworks sata_svw libata acpi_cpufreq freq_table mperf ide_core usbcore kvm_amd kvm tg3 i2c_piix4 libphy microcode e100 usb_common ptp skge i2c_core pcspkr k10temp evdev floppy hwmon pps_core mii rtc_cmos button processor unix [last unloaded: nvidia]
      CPU: 1 PID: 6831 Comm: less Tainted: P        W  O 3.10.15-devel #18
      Hardware name: empty empty/S3992-E, BIOS 'V1.06   ' 06/09/2009
      task: ffff880203cc6bc0 ti: ffff88023e47c000 task.ti: ffff88023e47c000
      RIP: 0010:[<ffffffff8118b188>]  [<ffffffff8118b188>] __blkdev_get+0x28/0x450
      RSP: 0018:ffff88023e47dbd8  EFLAGS: 00010286
      RAX: ffffffffffffff74 RBX: ffffffffffffff74 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
      RBP: ffff88023e47dc18 R08: 0000000000000002 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff88023f519658
      R13: ffffffff8118c300 R14: 0000000000000000 R15: ffff88023f519640
      FS:  00007f2070bf7700(0000) GS:ffff880247400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000002b4 CR3: 000000023da1d000 CR4: 00000000000007e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Stack:
       0000000000000002 0000001d00000000 000000003e47dc50 ffff88023f519640
       ffff88043d5bb668 ffffffff8118c300 ffff88023d683550 ffff88023e47de60
       ffff88023e47dc98 ffffffff8118c10d 0000001d81605698 0000000000000292
      Call Trace:
       [<ffffffff8118c300>] ? blkdev_get_by_dev+0x60/0x60
       [<ffffffff8118c10d>] blkdev_get+0x1dd/0x370
       [<ffffffff8118c300>] ? blkdev_get_by_dev+0x60/0x60
       [<ffffffff813cea6c>] ? _raw_spin_unlock+0x2c/0x50
       [<ffffffff8118c300>] ? blkdev_get_by_dev+0x60/0x60
       [<ffffffff8118c365>] blkdev_open+0x65/0x80
       [<ffffffff8114d12e>] do_dentry_open.isra.18+0x23e/0x2f0
       [<ffffffff8114d214>] finish_open+0x34/0x50
       [<ffffffff8115e122>] do_last.isra.62+0x2d2/0xc50
       [<ffffffff8115eb58>] path_openat.isra.63+0xb8/0x4d0
       [<ffffffff81115a8e>] ? might_fault+0x4e/0xa0
       [<ffffffff8115f4f0>] do_filp_open+0x40/0x90
       [<ffffffff813cea6c>] ? _raw_spin_unlock+0x2c/0x50
       [<ffffffff8116db85>] ? __alloc_fd+0xa5/0x1f0
       [<ffffffff8114e45f>] do_sys_open+0xef/0x1d0
       [<ffffffff8114e559>] SyS_open+0x19/0x20
       [<ffffffff813cff16>] system_call_fastpath+0x1a/0x1f
      Code: 44 00 00 55 48 89 e5 41 57 49 89 ff 41 56 41 89 d6 41 55 41 54 4c 8d 67 18 53 48 83 ec 18 89 75 cc e9 f2 00 00 00 0f 1f 44 00 00 <48> 8b 80 40 03 00 00 48 89 df 4c 8b 68 58 e8 d5 a4 07 00 44 89
      RIP  [<ffffffff8118b188>] __blkdev_get+0x28/0x450
       RSP <ffff88023e47dbd8>
      CR2: 00000000000002b4
      ---[ end trace bb7f32dbf02398dc ]---
      
      The brd change should be backported to stable kernels starting with 2.6.25.
      The loop change should be backported to stable kernels starting with 2.6.22.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4ceb8127
    • Jeff Moyer's avatar
      block: fix race between request completion and timeout handling · 6c8a390a
      Jeff Moyer authored
      commit 4912aa6c upstream.
      
      crocode i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma dca be2net sg ses enclosure ext4 mbcache jbd2 sd_mod crc_t10dif ahci megaraid_sas(U) dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
      
      Pid: 491, comm: scsi_eh_0 Tainted: G        W  ----------------   2.6.32-220.13.1.el6.x86_64 #1 IBM  -[8722PAX]-/00D1461
      RIP: 0010:[<ffffffff8124e424>]  [<ffffffff8124e424>] blk_requeue_request+0x94/0xa0
      RSP: 0018:ffff881057eefd60  EFLAGS: 00010012
      RAX: ffff881d99e3e8a8 RBX: ffff881d99e3e780 RCX: ffff881d99e3e8a8
      RDX: ffff881d99e3e8a8 RSI: ffff881d99e3e780 RDI: ffff881d99e3e780
      RBP: ffff881057eefd80 R08: ffff881057eefe90 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff881057f92338
      R13: 0000000000000000 R14: ffff881057f92338 R15: ffff883058188000
      FS:  0000000000000000(0000) GS:ffff880040200000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 00000000006d3ec0 CR3: 000000302cd7d000 CR4: 00000000000406b0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process scsi_eh_0 (pid: 491, threadinfo ffff881057eee000, task ffff881057e29540)
      Stack:
       0000000000001057 0000000000000286 ffff8810275efdc0 ffff881057f16000
      <0> ffff881057eefdd0 ffffffff81362323 ffff881057eefe20 ffffffff8135f393
      <0> ffff881057e29af8 ffff8810275efdc0 ffff881057eefe78 ffff881057eefe90
      Call Trace:
       [<ffffffff81362323>] __scsi_queue_insert+0xa3/0x150
       [<ffffffff8135f393>] ? scsi_eh_ready_devs+0x5e3/0x850
       [<ffffffff81362a23>] scsi_queue_insert+0x13/0x20
       [<ffffffff8135e4d4>] scsi_eh_flush_done_q+0x104/0x160
       [<ffffffff8135fb6b>] scsi_error_handler+0x35b/0x660
       [<ffffffff8135f810>] ? scsi_error_handler+0x0/0x660
       [<ffffffff810908c6>] kthread+0x96/0xa0
       [<ffffffff8100c14a>] child_rip+0xa/0x20
       [<ffffffff81090830>] ? kthread+0x0/0xa0
       [<ffffffff8100c140>] ? child_rip+0x0/0x20
      Code: 00 00 eb d1 4c 8b 2d 3c 8f 97 00 4d 85 ed 74 bf 49 8b 45 00 49 83 c5 08 48 89 de 4c 89 e7 ff d0 49 8b 45 00 48 85 c0 75 eb eb a4 <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00
      RIP  [<ffffffff8124e424>] blk_requeue_request+0x94/0xa0
       RSP <ffff881057eefd60>
      
      The RIP is this line:
              BUG_ON(blk_queued_rq(rq));
      
      After digging through the code, I think there may be a race between the
      request completion and the timer handler running.
      
      A timer is started for each request put on the device's queue (see
      blk_start_request->blk_add_timer).  If the request does not complete
      before the timer expires, the timer handler (blk_rq_timed_out_timer)
      will mark the request complete atomically:
      
      static inline int blk_mark_rq_complete(struct request *rq)
      {
              return test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
      }
      
      and then call blk_rq_timed_out.  The latter function will call
      scsi_times_out, which will return one of BLK_EH_HANDLED,
      BLK_EH_RESET_TIMER or BLK_EH_NOT_HANDLED.  If BLK_EH_RESET_TIMER is
      returned, blk_clear_rq_complete is called, and blk_add_timer is again
      called to simply wait longer for the request to complete.
      
      Now, if the request happens to complete while this is going on, what
      happens?  Given that we know the completion handler will bail if it
      finds the REQ_ATOM_COMPLETE bit set, we need to focus on the completion
      handler running after that bit is cleared.  So, from the above
      paragraph, after the call to blk_clear_rq_complete.  If the completion
      sets REQ_ATOM_COMPLETE before the BUG_ON in blk_add_timer, we go boom
      there (I haven't seen this in the cores).  Next, if we get the
      completion before the call to list_add_tail, then the timer will
      eventually fire for an old req, which may either be freed or reallocated
      (there is evidence that this might be the case).  Finally, if the
      completion comes in *after* the addition to the timeout list, I think
      it's harmless.  The request will be removed from the timeout list,
      REQ_ATOM_COMPLETE will be set, and all will be well.
      
      This will only actually explain the coredumps *IF* the request
      structure was freed, reallocated *and* queued before the error handler
      thread had a chance to process it.  That is possible, but it may make
      sense to keep digging for another race.  I think that if this is what
      was happening, we would see other instances of this problem showing up
      as null pointer or garbage pointer dereferences, for example when the
      request structure was not re-used.  It looks like we actually do run
      into that situation in other reports.
      
      This patch moves the BUG_ON(test_bit(REQ_ATOM_COMPLETE,
      &req->atomic_flags)); from blk_add_timer to the only caller that could
      trip over it (blk_start_request).  It then inverts the calls to
      blk_clear_rq_complete and blk_add_timer in blk_rq_timed_out to address
      the race.  I've boot tested this patch, but nothing more.
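
      The ordering can be sketched in userspace with a C11 atomic flag. This
      is a single-threaded model of the protocol only, with hypothetical
      demo_* names, not the block layer code: in the reset-timer case the
      request goes back on the timeout list first, and only then is the
      complete flag cleared so that a completion can claim the request.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_req {
	atomic_flag complete;	/* plays the role of REQ_ATOM_COMPLETE */
	bool on_timeout_list;
};

static void demo_init(struct demo_req *rq)
{
	atomic_flag_clear(&rq->complete);
	rq->on_timeout_list = false;
}

/* Returns true if the flag was already set, like test_and_set_bit(). */
static bool demo_mark_complete(struct demo_req *rq)
{
	return atomic_flag_test_and_set(&rq->complete);
}

/* Timeout handler, reset-timer case, in the patched order. */
static void demo_reset_timer(struct demo_req *rq)
{
	rq->on_timeout_list = true;		/* blk_add_timer() analogue */
	atomic_flag_clear(&rq->complete);	/* blk_clear_rq_complete() analogue */
}

/* Completion path: bails out if the request is already claimed. */
static bool demo_complete(struct demo_req *rq)
{
	if (demo_mark_complete(rq))
		return false;
	rq->on_timeout_list = false;	/* take it off the timeout list */
	return true;
}
```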
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Acked-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6c8a390a
    • Roger Tseng's avatar
      drivers/memstick/core/ms_block.c: fix unreachable state in h_msb_read_page() · 74dfb60d
      Roger Tseng authored
      commit a0e5a12f upstream.
      
      In h_msb_read_page() in ms_block.c, flow never reaches case
      MSB_RP_RECIVE_STATUS_REG.  This causes an error when MEMSTICK_INT_ERR is
      encountered: the status error bits are about to be examined, but the
      status is never copied back.
      
      Fix it by transitioning to MSB_RP_RECIVE_STATUS_REG right after
      MSB_RP_SEND_READ_STATUS_REG.
      Signed-off-by: Roger Tseng <rogerable@realtek.com>
      Acked-by: Maxim Levitsky <maximlevitsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      74dfb60d
    • Guenter Roeck's avatar
      hwmon: (lm90) Fix max6696 alarm handling · 425a8197
      Guenter Roeck authored
      commit e41fae2b upstream.
      
      Bit 2 of status register 2 on MAX6696 (external diode 2 open)
      sets ALERT; the bit thus has to be listed in alert_alarms.
      Also display a message in the alert handler if the condition
      is encountered.
      
      Even though not all overtemperature conditions cause ALERT
      to be set, we should not ignore them in the alert handler.
      Display messages for all out-of-range conditions.
      Reported-by: Jean Delvare <khali@linux-fr.org>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      425a8197
    • Christoffer Dall's avatar
      arm/arm64: KVM: Fix hyp mappings of vmalloc regions · 696e67db
      Christoffer Dall authored
      commit 40c2729b upstream.
      
      Using virt_to_phys on percpu mappings is horribly wrong as it may be
      backed by vmalloc.  Introduce kvm_kaddr_to_phys which translates both
      types of valid kernel addresses to the corresponding physical address.
      
      At the same time, resolve a typing issue where we were storing the
      physical address as a 32 bit unsigned long (on arm), truncating the
      physical address for addresses above the 4GB limit.  This caused
      breakage on Keystone.
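
      The truncation can be illustrated in userspace, assuming a 32-bit
      integer models unsigned long on 32-bit arm. The helper names and the
      address used are illustrative only.

```c
#include <stdint.h>

/* Storing a physical address in a 32-bit variable drops the upper bits. */
static uint32_t store_phys_32(uint64_t phys) { return (uint32_t)phys; }

/* A 64-bit type keeps addresses above 4GB intact. */
static uint64_t store_phys_64(uint64_t phys) { return phys; }
```

      For an address such as 0x800001000, the 32-bit variant yields 0x1000,
      which points at entirely different memory.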
      Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      696e67db
    • Greg Edwards's avatar
      KVM: IOMMU: hva align mapping page size · 6492d85c
      Greg Edwards authored
      commit 27ef63c7 upstream.
      
      When determining the page size we could use to map with the IOMMU, the
      page size should also be aligned with the hva, not just the gfn.  The
      gfn may not reflect the real alignment within the hugetlbfs file.
      
      Most of the time, this works fine.  However, if the hugetlbfs file is
      backed by non-contiguous huge pages, a multi-huge page memslot starts at
      an unaligned offset within the hugetlbfs file, and the gfn is aligned
      with respect to the huge page size, kvm_host_page_size() will return the
      huge page size and we will use that to map with the IOMMU.
      
      When we later unpin that same memslot, the IOMMU returns the unmap size
      as the huge page size, and we happily unpin that many pfns in
      monotonically increasing order, not realizing we are spanning
      non-contiguous huge pages and partially unpin the wrong huge page.
      
      Ensure the IOMMU mapping page size is aligned with the hva corresponding
      to the gfn, which does reflect the alignment within the hugetlbfs file.
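
      A minimal sketch of the alignment rule, assuming 4 KiB base pages and a
      power-of-two page_size. The helper name and signature are hypothetical:
      the mapping size is halved until both the guest frame address and the
      hva are aligned to it.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Shrink page_size until gfn and hva are both aligned to it. */
static uint64_t demo_map_page_size(uint64_t gfn, uint64_t hva,
				   uint64_t page_size)
{
	const uint64_t base = 1ULL << DEMO_PAGE_SHIFT;

	while (page_size > base &&
	       (((gfn << DEMO_PAGE_SHIFT) & (page_size - 1)) ||
		(hva & (page_size - 1))))
		page_size >>= 1;
	return page_size;
}
```

      With a 2 MiB aligned gfn but an hva that is only 1 MiB aligned, the
      sketch falls back from a 2 MiB mapping to a 1 MiB one.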
      Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Greg Edwards <gedwards@ddn.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6492d85c
    • Kevin Hao's avatar
      ftrace/x86: skip over the breakpoint for ftrace caller · 26146207
      Kevin Hao authored
      commit ab4ead02 upstream.
      
      In commit 8a4d0a68 "ftrace: Use breakpoint method to update ftrace
      caller", we chose to use the breakpoint method to update the ftrace
      caller. But we also need to skip over the breakpoint in function
      ftrace_int3_handler() for the ftrace caller. Otherwise weird things would happen.
      Signed-off-by: Kevin Hao <haokexin@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      26146207
    • Paolo Bonzini's avatar
      KVM: x86: fix emulation of "movzbl %bpl, %eax" · 94152e4a
      Paolo Bonzini authored
      commit daf72722 upstream.
      
      When I was looking at RHEL5.9's failure to start with
      unrestricted_guest=0/emulate_invalid_guest_state=1, I got it working with a
      slightly older tree than kvm.git.  I have now debugged the remaining failure:
      commit 660696d1 (KVM: X86 emulator: fix
      source operand decoding for 8bit mov[zs]x instructions, 2013-04-24)
      introduced a similar mis-emulation to the one in commit 8acb4207 (KVM:
      fix sil/dil/bpl/spl in the mod/rm fields, 2013-05-30).  The incorrect
      decoding occurs in 8-bit movzx/movsx instructions whose 8-bit operand
      is sil/dil/bpl/spl.
      
      Needless to say, "movzbl %bpl, %eax" does occur in RHEL5.9's decompression
      prolog, just a handful of instructions before finally giving control to
      the decompressed vmlinux and getting out of the invalid guest state.
      
      Because OpMem8 bypasses decode_modrm, the same handling of the REX prefix
      must be applied to OpMem8.
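
      The REX rule in question can be sketched as a lookup. The helper is
      hypothetical and ignores the REX.B/REX.R extension bits: without a REX
      prefix, 8-bit register encodings 4..7 select ah/ch/dh/bh, and with any
      REX prefix they select spl/bpl/sil/dil.

```c
#include <stdbool.h>

/* Name of the 8-bit register selected by a 3-bit encoding (0..7). */
static const char *demo_reg8_name(unsigned int reg, bool rex_present)
{
	static const char *const legacy[8] = {
		"al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"
	};
	static const char *const with_rex[8] = {
		"al", "cl", "dl", "bl", "spl", "bpl", "sil", "dil"
	};

	return rex_present ? with_rex[reg & 7] : legacy[reg & 7];
}
```

      Decoding encoding 5 without honouring the REX prefix yields ch instead
      of bpl, which is the kind of mis-emulation described above.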
      Reported-by: Michele Baldessari <michele@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      94152e4a
    • Thomas Renninger's avatar
      x86/microcode/amd: Tone down printk(), don't treat a missing firmware file as an error · 80b0114b
      Thomas Renninger authored
      commit 11f918d3 upstream.
      
      Do it the same way as done in microcode_intel.c: use pr_debug()
      for missing firmware files.
      
      There seem to be CPUs out there for which no microcode update
      has been submitted to kernel-firmware repo yet resulting in
      scary sounding error messages in dmesg:
      
        microcode: failed to load file amd-ucode/microcode_amd_fam16h.bin
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Acked-by: Borislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/1384274383-43510-1-git-send-email-trenn@suse.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      80b0114b
    • Fenghua Yu's avatar
      x86/apic: Disable I/O APIC before shutdown of the local APIC · 4c6df961
      Fenghua Yu authored
      commit 522e6646 upstream.
      
      In the reboot and crash paths, when we shut down the local APIC, the I/O APIC is
      still active. This may cause issues because external interrupts
      can still come in and disturb the local APIC during the shutdown process.

      To quiet external interrupts, disable the I/O APIC before shutting down the local APIC.
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Link: http://lkml.kernel.org/r/1382578212-4677-1-git-send-email-fenghua.yu@intel.com
      [ I suppose the 'issue' is a hang during shutdown. It's a fine change nevertheless. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c6df961
    • J. Bruce Fields's avatar
      nfsd4: fix xdr decoding of large non-write compounds · 5ec2a155
      J. Bruce Fields authored
      commit 365da4ad upstream.
      
      This fixes a regression from 24750082
      "nfsd4: fix decoding of compounds across page boundaries".  The previous
      code was correct: argp->pagelist is initialized in
      nfs4svc_decode_compoundargs to rqstp->rq_arg.pages, and is therefore a
      pointer to the page *after* the page we are currently decoding.
      
      The reason that patch nevertheless fixed a problem with decoding
      compounds containing write was a bug in the write decoding introduced by
      5a80a54d "nfsd4: reorganize write
      decoding", after which write decoding no longer adhered to the rule that
      argp->pagelist point to the next page.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5ec2a155
    • Christoph Hellwig's avatar
      nfsd: make sure to balance get/put_write_access · 479484f7
      Christoph Hellwig authored
      commit 987da479 upstream.
      
      Use a straight goto error label style in nfsd_setattr to make sure
      we always do the put_write_access call after we got it earlier.
      
      Note that we have been failing to do that in the case
      nfsd_break_lease() returns an error, a bug introduced into 2.6.38 with
      6a76bebe "nfsd4: break lease on nfsd
      setattr".
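
      The goto error label style can be sketched as follows. The demo_*
      helpers, the reference counter and the error values are hypothetical;
      the point is that every exit taken after the get passes through the
      matching put.

```c
static int demo_write_refs;	/* outstanding "write access" references */

static int demo_get_write_access(void) { demo_write_refs++; return 0; }
static void demo_put_write_access(void) { demo_write_refs--; }

/* fail_step: 0 = success, 1 = lease break fails, 2 = attribute change fails. */
static int demo_setattr(int fail_step)
{
	int err;

	err = demo_get_write_access();
	if (err)
		goto out;

	err = (fail_step == 1) ? -11 : 0;	/* hypothetical lease-break error */
	if (err)
		goto out_put_write_access;

	err = (fail_step == 2) ? -5 : 0;	/* hypothetical setattr error */

out_put_write_access:
	demo_put_write_access();
out:
	return err;
}
```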
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      479484f7
    • Christoph Hellwig's avatar
      nfsd: split up nfsd_setattr · fc2d834a
      Christoph Hellwig authored
      commit 818e5a22 upstream.
      
      Split out two helpers to make the code more readable and easier to verify
      for correctness.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc2d834a
    • Jeff Layton's avatar
      nfs: don't retry detect_trunking with RPC_AUTH_UNIX more than once · 64eed789
      Jeff Layton authored
      commit 6d769f1e upstream.
      
      Currently, when we try to mount and get back NFS4ERR_CLID_IN_USE or
      NFS4ERR_WRONGSEC, we create a new rpc_clnt and then try the call again.
      There is no guarantee that doing so will work however, so we can end up
      retrying the call in an infinite loop.
      
      Worse yet, we create the new client using rpc_clone_client_set_auth,
      which creates the new client as a child of the old one. Thus, we can end
      up with a *very* long lineage of rpc_clnts. When we go to put all of the
      references to them, we can end up with a long call chain that can smash
      the stack as each rpc_free_client() call can recurse back into itself.
      
      This patch fixes this by simply ensuring that the SETCLIENTID call will
      only be retried in this situation if the last attempt did not use
      RPC_AUTH_UNIX.
      
      Note too that with this change, we don't need the (i > 2) check in the
      -EACCES case since we now have a more reliable test as to whether we
      should reattempt.
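
      The retry rule can be sketched as a bounded loop. This is a userspace
      model with hypothetical names and a server stub that always rejects the
      call, not the NFS client code: a failed attempt is retried only if it
      did not already use the AUTH_UNIX flavor.

```c
enum demo_flavor { DEMO_AUTH_OTHER, DEMO_AUTH_UNIX };

static int demo_attempts;

/* Hypothetical server stub: always rejects the call. */
static int demo_setclientid(enum demo_flavor flavor)
{
	(void)flavor;
	demo_attempts++;
	return -1;
}

static int demo_detect_trunking(enum demo_flavor first)
{
	enum demo_flavor flavor = first;
	int status;

	for (;;) {
		status = demo_setclientid(flavor);
		if (status == 0)
			return 0;
		/* Give up if the failed attempt already used AUTH_UNIX. */
		if (flavor == DEMO_AUTH_UNIX)
			return status;
		flavor = DEMO_AUTH_UNIX;
	}
}
```

      Because the loop can fall back to AUTH_UNIX at most once, the number of
      attempts is bounded at two no matter what the server returns.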
      
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Tested-by/Acked-by: Weston Andros Adamson <dros@netapp.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      64eed789
    • J. Bruce Fields's avatar
      nfsd4: fix discarded security labels on setattr · e976eb62
      J. Bruce Fields authored
      commit 3378b7f4 upstream.
      
      Security labels in setattr calls are currently ignored because we forget
      to set label->len.
      Reported-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e976eb62
    • J. Bruce Fields's avatar
      nfsd: return better errors to exportfs · 35775137
      J. Bruce Fields authored
      commit 427d6c66 upstream.
      
      Someone noticed exportfs happily accepted exports that would later be
      rejected when mountd tried to give them to the kernel.  Fix this.
      
      This is a regression from 4c1e1b34
      "nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids".
      
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Reported-by: Yin.JianHong <jiyin@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      35775137