1. 25 May, 2017 40 commits
    • Johan Hovold's avatar
      USB: chaoskey: fix Alea quirk on big-endian hosts · d0897554
      Johan Hovold authored
      commit 63afd5cc upstream.
      
      Add missing endianness conversion when applying the Alea timeout quirk.
      
      Found using sparse:
      
      	warning: restricted __le16 degrades to integer
      
      Fixes: e4a886e8 ("hwrng: chaoskey - Fix URB warning due to timeout on Alea")
      Cc: Bob Ham <bob.ham@collabora.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Keith Packard <keithp@keithp.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d0897554
    • Andrey Korolyov's avatar
      USB: serial: ftdi_sio: add Olimex ARM-USB-TINY(H) PIDs · 7ed35d6c
      Andrey Korolyov authored
      commit 5f63424a upstream.
      
      This patch adds support for recognition of ARM-USB-TINY(H) devices which
      are almost identical to ARM-USB-OCD(H) but lacking separate barrel jack
      and serial console.
      
      By suggestion from Johan Hovold it is possible to replace
      ftdi_jtag_quirk with a bit more generic construction. Since all
      Olimex-ARM debuggers has exactly two ports, we could safely always use
      only second port within the debugger family.
      Signed-off-by: default avatarAndrey Korolyov <andrey@xdel.ru>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ed35d6c
    • Anthony Mallet's avatar
      USB: serial: ftdi_sio: fix setting latency for unprivileged users · 14bd189b
      Anthony Mallet authored
      commit bb246681 upstream.
      
      Commit 557aaa7f ("ft232: support the ASYNC_LOW_LATENCY
      flag") enables unprivileged users to set the FTDI latency timer,
      but there was a logic flaw that skipped sending the corresponding
      USB control message to the device.
      
      Specifically, the device latency timer would not be updated until next
      open, something which was later also inadvertently broken by commit
      c19db4c9 ("USB: ftdi_sio: set device latency timeout at port
      probe").
      
      A recent commit c6dce262 ("USB: serial: ftdi_sio: fix extreme
      low-latency setting") disabled the low-latency mode by default so we now
      need this fix to allow unprivileged users to again enable it.
      Signed-off-by: default avatarAnthony Mallet <anthony.mallet@laas.fr>
      [johan: amend commit message]
      Fixes: 557aaa7f ("ft232: support the ASYNC_LOW_LATENCY flag")
      Fixes: c19db4c9 ("USB: ftdi_sio: set device latency timeout at port probe").
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      14bd189b
    • Kirill Tkhai's avatar
      pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes() · 15e4ed2a
      Kirill Tkhai authored
      commit 3fd37226 upstream.
      
      Imagine we have a pid namespace and a task from its parent's pid_ns,
      which made setns() to the pid namespace. The task is doing fork(),
      while the pid namespace's child reaper is dying. We have the race
      between them:
      
      Task from parent pid_ns             Child reaper
      copy_process()                      ..
        alloc_pid()                       ..
        ..                                zap_pid_ns_processes()
        ..                                  disable_pid_allocation()
        ..                                  read_lock(&tasklist_lock)
        ..                                  iterate over pids in pid_ns
        ..                                    kill tasks linked to pids
        ..                                  read_unlock(&tasklist_lock)
        write_lock_irq(&tasklist_lock);   ..
        attach_pid(p, PIDTYPE_PID);       ..
        ..                                ..
      
      So, just created task p won't receive SIGKILL signal,
      and the pid namespace will be in contradictory state.
      Only manual kill will help there, but does the userspace
      care about this? I suppose, the most users just inject
      a task into a pid namespace and wait a SIGCHLD from it.
      
      The patch fixes the problem. It simply checks for
      (pid_ns->nr_hashed & PIDNS_HASH_ADDING) in copy_process().
      We do it under the tasklist_lock, and can't skip
      PIDNS_HASH_ADDING as noted by Oleg:
      
      "zap_pid_ns_processes() does disable_pid_allocation()
      and then takes tasklist_lock to kill the whole namespace.
      Given that copy_process() checks PIDNS_HASH_ADDING
      under write_lock(tasklist) they can't race;
      if copy_process() takes this lock first, the new child will
      be killed, otherwise copy_process() can't miss
      the change in ->nr_hashed."
      
      If allocation is disabled, we just return -ENOMEM
      like it's made for such cases in alloc_pid().
      
      v2: Do not move disable_pid_allocation(), do not
      introduce a new variable in copy_process() and simplify
      the patch as suggested by Oleg Nesterov.
      Account the problem with double irq enabling
      found by Eric W. Biederman.
      
      Fixes: c876ad76 ("pidns: Stop pid allocation when init dies")
      Signed-off-by: default avatarKirill Tkhai <ktkhai@virtuozzo.com>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Ingo Molnar <mingo@kernel.org>
      CC: Peter Zijlstra <peterz@infradead.org>
      CC: Oleg Nesterov <oleg@redhat.com>
      CC: Mike Rapoport <rppt@linux.vnet.ibm.com>
      CC: Michal Hocko <mhocko@suse.com>
      CC: Andy Lutomirski <luto@kernel.org>
      CC: "Eric W. Biederman" <ebiederm@xmission.com>
      CC: Andrei Vagin <avagin@openvz.org>
      CC: Cyrill Gorcunov <gorcunov@openvz.org>
      CC: Serge Hallyn <serge@hallyn.com>
      Acked-by: default avatarOleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      15e4ed2a
    • Eric W. Biederman's avatar
      pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes · 9fbdfb7a
      Eric W. Biederman authored
      commit b9a985db upstream.
      
      The code can potentially sleep for an indefinite amount of time in
      zap_pid_ns_processes triggering the hung task timeout, and increasing
      the system average.  This is undesirable.  Sleep with a task state of
      TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE to remove these
      undesirable side effects.
      
      Apparently under heavy load this has been allowing Chrome to trigger
      the hung time task timeout error and cause ChromeOS to reboot.
      Reported-by: default avatarVovo Yang <vovoy@google.com>
      Reported-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Tested-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Fixes: 6347e900 ("pidns: guarantee that the pidns init will be the last pidns process reaped")
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9fbdfb7a
    • Michael J. Ruhl's avatar
      IB/hfi1: Fix a subcontext memory leak · 4cbd5018
      Michael J. Ruhl authored
      commit 224d71f9 upstream.
      
      The only context that frees user_exp_rcv data structures is the last
      context closed (from a sub-context set).  This leaks the allocations
      from the other sub-contexts.  Separate the common frees from the
      specific frees and call them at the appropriate time.
      
      Using KEDR to check for memory leaks we get:
      
      Before test:
      
      [leak_check] Possible leaks: 25
      
      After test:
      
      [leak_check] Possible leaks: 31  (6 leaked data structures)
      
      After patch applied (before and after test have the same value)
      
      [leak_check] Possible leaks: 25
      
      Each leak is 192 + 13440 + 6720 = 20352 bytes per sub-context.
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarMichael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4cbd5018
    • Michael J. Ruhl's avatar
      IB/hfi1: Return an error on memory allocation failure · d971ab21
      Michael J. Ruhl authored
      commit 94679061 upstream.
      
      If the eager buffer allocation fails, it is necessary to return
      an error code.
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarMichael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d971ab21
    • Fabrice Gasnier's avatar
      iio: stm32 trigger: fix sampling_frequency read · c0792750
      Fabrice Gasnier authored
      commit 77a9febf upstream.
      
      When prescaler (PSC) is 0, it means div factor is 1: counter clock
      frequency is equal to input clk / (PSC + 1).
      When reload value is 8 for example, counter counts 9 cycles, from 0 to 8.
      This is handled in frequency write routine, by writing respectively:
      - prescaler - 1 to PSC
      - reload value - 1 to ARR
      This fix does the opposite when reading the frequency from PSC and ARR:
      - prescaler is PSC + 1
      - reload value is ARR + 1
      
      Thus, PSC may be 0, depending on requested sampling frequency (div 1).
      In this case, reading freq wrongly reports 0, instead of computing and
      reporting correct value.
      Remove test on !psc and !arr.
      
      Small test on stm32f4 (example on tim1_trgo), before this fix:
      $ cd /sys/bus/iio/devices/triggerX
      $ echo 10000 > sampling_frequency
      $ cat sampling_frequency
      0
      
      After this fix:
      $ echo 10000 > sampling_frequency
      $ cat sampling_frequency
      10000
      Signed-off-by: default avatarFabrice Gasnier <fabrice.gasnier@st.com>
      Signed-off-by: default avatarJonathan Cameron <jic23@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c0792750
    • Andreas Klinger's avatar
      IIO: bmp280-core.c: fix error in humidity calculation · 705fa403
      Andreas Klinger authored
      commit ed3730c4 upstream.
      
      While calculating the compensation of the humidity there are negative values
      interpreted as unsigned because of unsigned variables used.  These values as
      well as the constants need to be casted to signed as indicated by the
      documentation of the sensor.
      Signed-off-by: default avatarAndreas Klinger <ak@it-klinger.de>
      Acked-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Reviewed-by: default avatarMatt Ranostay <matt.ranostay@konsulko.com>
      Signed-off-by: default avatarJonathan Cameron <jic23@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      705fa403
    • Pavel Roskin's avatar
      iio: dac: ad7303: fix channel description · 327f5da0
      Pavel Roskin authored
      commit ce420fd4 upstream.
      
      realbits, storagebits and shift should be numbers, not ASCII characters.
      Signed-off-by: default avatarPavel Roskin <plroskin@gmail.com>
      Reviewed-by: default avatarLars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: default avatarJonathan Cameron <jic23@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      327f5da0
    • James Smart's avatar
      scsi: lpfc: Fix panic on BFS configuration · 6567967a
      James Smart authored
      commit 4492b739 upstream.
      
      To select the appropriate shost template, the driver is issuing a
      mailbox command to retrieve the wwn. Turns out the sending of the
      command precedes the reset of the function.  On SLI-4 adapters, this is
      inconsequential as the mailbox command location is specified by dma via
      the BMBX register. However, on SLI-3 adapters, the location of the
      mailbox command submission area changes. When the function is first
      powered on or reset, the cmd is submitted via PCI bar memory. Later the
      driver changes the function config to use host memory and DMA. The
      request to start a mailbox command is the same, a simple doorbell write,
      regardless of submission area.  So.. if there has not been a boot driver
      run against the adapter, the mailbox command works as defaults are
      ok. But, if the boot driver has configured the card and, and if no
      platform pci function/slot reset occurs as the os starts, the mailbox
      command will fail. The SLI-3 device will use the stale boot driver dma
      location. This can cause PCI eeh errors.
      
      Fix is to reset the sli-3 function before sending the mailbox command,
      thus synchronizing the function/driver on mailbox location.
      
      Note: The fix uses routines that are typically invoked later in the call
      flow to reset the sli-3 device. The issue in using those routines is
      that the normal (non-fix) flow does additional initialization, namely
      the allocation of the pport structure. So, rather than significantly
      reworking the initialization flow so that the pport is alloc'd first,
      pointer checks are added to work around it. Checks are limited to the
      routines invoked by a sli-3 adapter (s3 routines) as this fix/early call
      is only invoked on a sli3 adapter. Nothing changes post the
      fix. Subsequent initialization, and another adapter reset, still occur -
      both on sli-3 and sli-4 adapters.
      Signed-off-by: default avatarDick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: default avatarJames Smart <james.smart@broadcom.com>
      Fixes: 96418b5e ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
      Reviewed-by: default avatarEwan D. Milne <emilne@redhat.com>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6567967a
    • Bryant G. Ly's avatar
      ibmvscsis: Do not send aborted task response · 5d6d5f07
      Bryant G. Ly authored
      commit 25e78531 upstream.
      
      The driver is sending a response to the actual scsi op that was
      aborted by an abort task TM, while LIO is sending a response to
      the abort task TM.
      
      ibmvscsis_tgt does not send the response to the client until
      release_cmd time. The reason for this was because if we did it
      at queue_status time, then the client would be free to reuse the
      tag for that command, but we're still using the tag until the
      command is released at release_cmd time, so we chose to delay
      sending the response until then. That then caused this issue, because
      release_cmd is always called, even if queue_status is not.
      
      SCSI spec says that the initiator that sends the abort task
      TM NEVER gets a response to the aborted op and with the current
      code it will send a response. Thus this fix will remove that response
      if the CMD_T_ABORTED && !CMD_T_TAS.
      
      Another case with a small timing window is the case where if LIO sends a
      TMR_DOES_NOT_EXIST, and the release_cmd callback is called for the TMR Abort
      cmd before the release_cmd for the (attemped) aborted cmd, then we need to
      ensure that we send the response for the (attempted) abort cmd to the client
      before we send the response for the TMR Abort cmd.
      Signed-off-by: default avatarBryant G. Ly <bryantly@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Cyr <mikecyr@linux.vnet.ibm.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d6d5f07
    • Johan Hovold's avatar
      of: fdt: add missing allocation-failure check · c3f0eff4
      Johan Hovold authored
      commit 49e67dd1 upstream.
      
      The memory allocator passed to __unflatten_device_tree() (e.g. a wrapped
      kzalloc) can fail so add the missing sanity check to avoid dereferencing
      a NULL pointer.
      
      Fixes: fe140423 ("of/flattree: Refactor unflatten_device_tree and add fdt_unflatten_tree")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c3f0eff4
    • Tyrel Datwyler's avatar
      of: fix "/cpus" reference leak in of_numa_parse_cpu_nodes() · 5bc2cfba
      Tyrel Datwyler authored
      commit b8475cbe upstream.
      
      The call to of_find_node_by_path("/cpus") returns the cpus device_node
      with its reference count incremented. There is no matching of_node_put()
      call in of_numa_parse_cpu_nodes() which results in a leaked reference
      to the "/cpus" node.
      
      This patch adds an of_node_put() to release the reference.
      
      fixes: 298535c0 ("of, numa: Add NUMA of binding implementation.")
      Signed-off-by: default avatarTyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Acked-by: default avatarDavid Daney <david.daney@cavium.com>
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5bc2cfba
    • Rob Herring's avatar
      of: fix sparse warning in of_pci_range_parser_one · 854d0231
      Rob Herring authored
      commit eb310036 upstream.
      
      sparse gives the following warning for 'pci_space':
      
      ../drivers/of/address.c:266:26: warning: incorrect type in assignment (different base types)
      ../drivers/of/address.c:266:26:    expected unsigned int [unsigned] [usertype] pci_space
      ../drivers/of/address.c:266:26:    got restricted __be32 const [usertype] <noident>
      
      It appears that pci_space is only ever accessed on powerpc, so the endian
      swap is often not needed.
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      854d0231
    • Takashi Iwai's avatar
      proc: Fix unbalanced hard link numbers · d5331e61
      Takashi Iwai authored
      commit d66bb160 upstream.
      
      proc_create_mount_point() forgot to increase the parent's nlink, and
      it resulted in unbalanced hard link numbers, e.g. /proc/fs shows one
      less than expected.
      
      Fixes: eb6d38d5 ("proc: Allow creating permanently empty directories...")
      Reported-by: default avatarTristan Ye <tristan.ye@suse.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5331e61
    • Vaibhav Jain's avatar
      cxl: Route eeh events to all drivers in cxl_pci_error_detected() · 525d5ef5
      Vaibhav Jain authored
      commit 4f58f0bf upstream.
      
      Fix a boundary condition where in some cases an eeh event that results
      in card reset isn't passed on to a driver attached to the virtual PCI
      device associated with a slice. This will happen in case when a slice
      attached device driver returns a value other than
      PCI_ERS_RESULT_NEED_RESET from the eeh error_detected() callback. This
      would result in an early return from cxl_pci_error_detected() and
      other drivers attached to other AFUs on the card wont be notified.
      
      The patch fixes this by making sure that all slice attached
      device-drivers are notified and the return values from
      error_detected() callback are aggregated in a scheme where request for
      'disconnect' trumps all and 'none' trumps 'need_reset'.
      
      Fixes: 9e8df8a2 ("cxl: EEH support")
      Signed-off-by: default avatarVaibhav Jain <vaibhav@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Acked-by: default avatarFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      525d5ef5
    • Vaibhav Jain's avatar
      cxl: Force context lock during EEH flow · 4a0e8757
      Vaibhav Jain authored
      commit ea9a26d1 upstream.
      
      During an eeh event when the cxl card is fenced and card sysfs attr
      perst_reloads_same_image is set following warning message is seen in the
      kernel logs:
      
        Adapter context unlocked with 0 active contexts
        ------------[ cut here ]------------
        WARNING: CPU: 12 PID: 627 at
        ../drivers/misc/cxl/main.c:325 cxl_adapter_context_unlock+0x60/0x80 [cxl]
      
      Even though this warning is harmless, it clutters the kernel log
      during an eeh event. This warning is triggered as the EEH callback
      cxl_pci_error_detected doesn't obtain a context-lock before forcibly
      detaching all active context and when context-lock is released during
      call to cxl_configure_adapter from cxl_pci_slot_reset, a warning in
      cxl_adapter_context_unlock is triggered.
      
      To fix this warning, we acquire the adapter context-lock via
      cxl_adapter_context_lock() in the eeh callback
      cxl_pci_error_detected() once all the virtual AFU PHBs are notified
      and their contexts detached. The context-lock is released in
      cxl_pci_slot_reset() after the adapter is successfully reconfigured
      and before the we call the slot_reset callback on slice attached
      device-drivers.
      
      Fixes: 70b565bb ("cxl: Prevent adapter reset if an active context exists")
      Reported-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: default avatarVaibhav Jain <vaibhav@linux.vnet.ibm.com>
      Acked-by: default avatarFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Tested-by: default avatarUma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4a0e8757
    • Gerd Hoffmann's avatar
      ohci-pci: add qemu quirk · 6c58e144
      Gerd Hoffmann authored
      commit 21a60f6e upstream.
      
      On a loaded virtualization host (dozen guests booting at the same time)
      it may happen that the ohci controller emulation doesn't manage to do
      timely frame processing, with the result that the io watchdog fires and
      considers the controller being dead, even though it's only the emulation
      being unusual slow due to the load peak.
      
      So, add a quirk for qemu and don't use the watchdog in case we figure we
      are running on emulated ohci.  The virtual ohci controller masquerades
      as apple ohci controller, but we can identify it by subsystem id.
      Signed-off-by: default avatarGerd Hoffmann <kraxel@redhat.com>
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6c58e144
    • Tobias Herzog's avatar
      cdc-acm: fix possible invalid access when processing notification · a83f3099
      Tobias Herzog authored
      commit 1bb9914e upstream.
      
      Notifications may only be 8 bytes long. Accessing the 9th and
      10th byte of unimplemented/unknown notifications may be insecure.
      Also check the length of known notifications before accessing anything
      behind the 8th byte.
      Signed-off-by: default avatarTobias Herzog <t-herzog@gmx.de>
      Acked-by: default avatarOliver Neukum <oneukum@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a83f3099
    • David Rivshin's avatar
      gpio: omap: return error if requested debounce time is not possible · 9541621a
      David Rivshin authored
      commit 83977443 upstream.
      
      omap_gpio_debounce() does not validate that the requested debounce
      is within a range it can handle. Instead it lets the register value
      wrap silently, and always returns success.
      
      This can lead to all sorts of unexpected behavior, such as gpio_keys
      asking for a too-long debounce, but getting a very short debounce in
      practice.
      
      Fix this by returning -EINVAL if the requested value does not fit into
      the register field. If there is no debounce clock available at all,
      return -ENOTSUPP.
      
      Fixes: e85ec6c3 ("gpio: omap: fix omap2_set_gpio_debounce")
      Signed-off-by: default avatarDavid Rivshin <drivshin@allworx.com>
      Acked-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9541621a
    • Ben Skeggs's avatar
      drm/nouveau/tmr: handle races with hw when updating the next alarm time · 33f379db
      Ben Skeggs authored
      commit 1b0f8438 upstream.
      
      If the time to the next alarm is short enough, we could race with HW and
      end up with an ~4 second delay until it triggers.
      
      Fix this by checking again after we update HW.
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      33f379db
    • Ben Skeggs's avatar
      drm/nouveau/tmr: avoid processing completed alarms when adding a new one · 53b8e320
      Ben Skeggs authored
      commit 330bdf62 upstream.
      
      The idea here was to avoid having to "manually" program the HW if there's
      a new earliest alarm.  This was lazy and bad, as it leads to loads of fun
      races between inter-related callers (ie. therm).
      
      Turns out, it's not so difficult after all.  Go figure ;)
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      53b8e320
    • Ben Skeggs's avatar
      drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarm · 952030b8
      Ben Skeggs authored
      commit 9fc64667 upstream.
      
      At least therm/fantog "attempts" to work around this issue, which could
      lead to corruption of the pending alarm list.
      
      Fix it properly by not updating the timestamp without the lock held, or
      trying to add an already pending alarm to the pending alarm list....
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      952030b8
    • Ben Skeggs's avatar
      drm/nouveau/tmr: ack interrupt before processing alarms · 4c168033
      Ben Skeggs authored
      commit 3733bd8b upstream.
      
      Fixes a race where we can miss an alarm that triggers while we're already
      processing previous alarms.
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c168033
    • Ben Skeggs's avatar
      drm/nouveau/kms/nv50: skip core channel cursor update on position-only changes · a5e41c9e
      Ben Skeggs authored
      commit e6db9579 upstream.
      
      The DRM core used to only call prepare_fb/cleanup_fb() when a plane's
      framebuffer changed, which achieved the desired effect.
      
      It's apparently now up to the driver to decide on its own.
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a5e41c9e
    • Ben Skeggs's avatar
      drm/nouveau/kms/nv50: fix source-rect-only plane updates · 4aecf3a7
      Ben Skeggs authored
      commit 36601c2b upstream.
      
      This "optimisation" (which was originally meant to skip updating cursor
      settings in the core channel on position-only updates) turned out to be
      pointless in the final design of the code before it was merged.
      
      Remove it completely, as it breaks other cases.
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4aecf3a7
    • Ben Skeggs's avatar
      drm/nouveau/therm: remove ineffective workarounds for alarm bugs · 2138faf0
      Ben Skeggs authored
      commit e4311ee5 upstream.
      
      These were ineffective due to touching the list without the alarm lock,
      but should no longer be required.
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2138faf0
    • Mario Kleiner's avatar
      drm/amdgpu: Add missing lb_vblank_lead_lines setup to DCE-6 path. · 21d3947b
      Mario Kleiner authored
      commit effaf848 upstream.
      
      This apparently got lost when implementing the new DCE-6 support
      and would cause failures in pageflip scheduling and timestamping.
      Signed-off-by: default avatarMario Kleiner <mario.kleiner.de@gmail.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      21d3947b
    • Mario Kleiner's avatar
      drm/amdgpu: Avoid overflows/divide-by-zero in latency_watermark calculations. · 3f0db81c
      Mario Kleiner authored
      commit e190ed1e upstream.
      
      At dot clocks > approx. 250 Mhz, some of these calcs will overflow and
      cause miscalculation of latency watermarks, and for some overflows also
      divide-by-zero driver crash ("divide error: 0000 [#1] PREEMPT SMP" in
      "dce_v10_0_latency_watermark+0x12d/0x190").
      
      This zero-divide happened, e.g., on AMD Tonga Pro under DCE-10,
      on a Displayport panel when trying to set a video mode of 2560x1440
      at 165 Hz vrefresh with a dot clock of 635.540 Mhz.
      
      Refine calculations to avoid the overflows.
      
      Tested for DCE-10 with R9 380 Tonga + ASUS ROG PG279 panel.
      Reviewed-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarMario Kleiner <mario.kleiner.de@gmail.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f0db81c
    • Mario Kleiner's avatar
      drm/amdgpu: Make display watermark calculations more accurate · de2fa45a
      Mario Kleiner authored
      commit d63c277d upstream.
      
      Avoid big roundoff errors in scanline/hactive durations for
      high pixel clocks, especially for >= 500 Mhz, and thereby
      program more accurate display fifo watermarks.
      
      Implemented here for DCE 6,8,10,11.
      Successfully tested on DCE 10 with AMD R9 380 Tonga.
      Reviewed-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarMario Kleiner <mario.kleiner.de@gmail.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      de2fa45a
    • Johan Hovold's avatar
      ath9k_htc: fix NULL-deref at probe · fab44023
      Johan Hovold authored
      commit ebeb3667 upstream.
      
      Make sure to check the number of endpoints to avoid dereferencing a
      NULL-pointer or accessing memory beyond the endpoint array should a
      malicious device lack the expected endpoints.
      
      Fixes: 36bcce43 ("ath9k_htc: Handle storage devices")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fab44023
    • Dmitry Tunin's avatar
      ath9k_htc: Add support of AirTies 1eda:2315 AR9271 device · 43068b45
      Dmitry Tunin authored
      commit 16ff1fb0 upstream.
      
      T:  Bus=01 Lev=02 Prnt=02 Port=02 Cnt=01 Dev#=  7 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=ff(vend.) Sub=ff Prot=ff MxPS=64 #Cfgs=  1
      P:  Vendor=1eda ProdID=2315 Rev=01.08
      S:  Manufacturer=ATHEROS
      S:  Product=USB2.0 WLAN
      S:  SerialNumber=12345
      C:  #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 6 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      Signed-off-by: default avatarDmitry Tunin <hanipouspilot@gmail.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      43068b45
    • Martin Schwidefsky's avatar
      s390/cputime: fix incorrect system time · 82e31de0
      Martin Schwidefsky authored
      commit 07a63cbe upstream.
      
      git commit c5328901 "[S390] entry[64].S improvements" removed
      the update of the exit_timer lowcore field from the critical section
      cleanup of the .Lsysc_restore/.Lsysc_done and .Lio_restore/.Lio_done
      blocks. If the PSW is updated by the critical section cleanup to point to
      user space again, the interrupt entry code will do a vtime calculation
      after the cleanup completed with an exit_timer value which has *not* been
      updated. Due to this incorrect system time deltas are calculated.
      
      If an interrupt occured with an old PSW between .Lsysc_restore/.Lsysc_done
      or .Lio_restore/.Lio_done update __LC_EXIT_TIMER with the system entry
      time of the interrupt.
      Tested-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      82e31de0
    • Michael Holzheu's avatar
      s390/kdump: Add final note · 5d1adbaf
      Michael Holzheu authored
      commit dcc00b79 upstream.
      
      Since linux v3.14 with commit 38dfac84 ("vmcore: prevent PT_NOTE
      p_memsz overflow during header update") on s390 we get the following
      message in the kdump kernel:
      
        Warning: Exceeded p_memsz, dropping PT_NOTE entry n_namesz=0x6b6b6b6b,
        n_descsz=0x6b6b6b6b
      
      The reason for this is that we don't create a final zero note in
      the ELF header which the proc/vmcore code uses to find out the end
      of the notes section (see also kernel/kexec_core.c:final_note()).
      
      It still worked on s390 by chance because we (most of the time?) have the
      byte pattern 0x6b6b6b6b after the notes section which also makes the notes
      parsing code stop in update_note_header_size_elf64() because 0x6b6b6b6b is
      interpreded as note size:
      
        if ((real_sz + sz) > max_sz) {
                pr_warn("Warning: Exceeded p_memsz, dropping P ...);
                break;
        }
      
      So fix this and add the missing final note to the ELF header.
      We don't have to adjust the memory size for ELF header ("alloc_size")
      because the new ELF note still fits into the 0x1000 base memory.
      Signed-off-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d1adbaf
    • Richard Cochran's avatar
      regulator: tps65023: Fix inverted core enable logic. · d2cc3a8c
      Richard Cochran authored
      commit c90722b5 upstream.
      
      Commit 43530b69 ("regulator: Use
      regmap_read/write(), regmap_update_bits functions directly") intended
      to replace working inline helper functions with standard regmap
      calls.  However, it also inverted the set/clear logic of the "CORE ADJ
      Allowed" bit.  That patch was clearly never tested, since without that
      bit cleared, the core VDCDC1 voltage output does not react to I2C
      configuration changes.
      
      This patch fixes the issue by clearing the bit as in the original,
      correct implementation.  Note for stable back porting that, due to
      subsequent driver churn, this patch will not apply on every kernel
      version.
      
      Fixes: 43530b69 ("regulator: Use regmap_read/write(), regmap_update_bits functions directly")
      Signed-off-by: default avatarRichard Cochran <rcochran@linutronix.de>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d2cc3a8c
    • Wadim Egorov's avatar
      regulator: rk808: Fix RK818 LDO2 · fde4f790
      Wadim Egorov authored
      commit 75f88115 upstream.
      
      Set the correct voltage select register for LDO2.
      Signed-off-by: default avatarWadim Egorov <w.egorov@phytec.de>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fde4f790
    • Linus Torvalds's avatar
      x86: fix 32-bit case of __get_user_asm_u64() · efb209c8
      Linus Torvalds authored
      commit 33c9e972 upstream.
      
      The code to fetch a 64-bit value from user space was entirely buggered,
      and has been since the code was merged in early 2016 in commit
      b2f68038 ("x86/mm/32: Add support for 64-bit __get_user() on 32-bit
      kernels").
      
      Happily the buggered routine is almost certainly entirely unused, since
      the normal way to access user space memory is just with the non-inlined
      "get_user()", and the inlined version didn't even historically exist.
      
      The normal "get_user()" case is handled by external hand-written asm in
      arch/x86/lib/getuser.S that doesn't have either of these issues.
      
      There were two independent bugs in __get_user_asm_u64():
      
       - it still did the STAC/CLAC user space access marking, even though
         that is now done by the wrapper macros, see commit 11f1a4b9
         ("x86: reorganize SMAP handling in user space accesses").
      
         This didn't result in a semantic error, it just means that the
         inlined optimized version was hugely less efficient than the
         allegedly slower standard version, since the CLAC/STAC overhead is
         quite high on modern Intel CPU's.
      
       - the double register %eax/%edx was marked as an output, but the %eax
         part of it was touched early in the asm, and could thus clobber other
         inputs to the asm that gcc didn't expect it to touch.
      
         In particular, that meant that the generated code could look like
         this:
      
              mov    (%eax),%eax
              mov    0x4(%eax),%edx
      
         where the load of %edx obviously was _supposed_ to be from the 32-bit
         word that followed the source of %eax, but because %eax was
         overwritten by the first instruction, the source of %edx was
         basically random garbage.
      
      The fixes are trivial: remove the extraneous STAC/CLAC entries, and mark
      the 64-bit output as early-clobber to let gcc know that no inputs should
      alias with the output register.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      efb209c8
    • Wanpeng Li's avatar
      KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation · 5bbaf88d
      Wanpeng Li authored
      commit cbfc6c91 upstream.
      
      Huawei folks reported a read out-of-bounds vulnerability in kvm pio emulation.
      
      - "inb" instruction to access PIT Mod/Command register (ioport 0x43, write only,
        a read should be ignored) in guest can get a random number.
      - "rep insb" instruction to access PIT register port 0x43 can control memcpy()
        in emulator_pio_in_emulated() to copy max 0x400 bytes but only read 1 bytes,
        which will disclose the unimportant kernel memory in host but no crash.
      
      The similar test program below can reproduce the read out-of-bounds vulnerability:
      
      void hexdump(void *mem, unsigned int len)
      {
              unsigned int i, j;
      
              for(i = 0; i < len + ((len % HEXDUMP_COLS) ? (HEXDUMP_COLS - len % HEXDUMP_COLS) : 0); i++)
              {
                      /* print offset */
                      if(i % HEXDUMP_COLS == 0)
                      {
                              printf("0x%06x: ", i);
                      }
      
                      /* print hex data */
                      if(i < len)
                      {
                              printf("%02x ", 0xFF & ((char*)mem)[i]);
                      }
                      else /* end of block, just aligning for ASCII dump */
                      {
                              printf("   ");
                      }
      
                      /* print ASCII dump */
                      if(i % HEXDUMP_COLS == (HEXDUMP_COLS - 1))
                      {
                              for(j = i - (HEXDUMP_COLS - 1); j <= i; j++)
                              {
                                      if(j >= len) /* end of block, not really printing */
                                      {
                                              putchar(' ');
                                      }
                                      else if(isprint(((char*)mem)[j])) /* printable char */
                                      {
                                              putchar(0xFF & ((char*)mem)[j]);
                                      }
                                      else /* other char */
                                      {
                                              putchar('.');
                                      }
                              }
                              putchar('\n');
                      }
              }
      }
      
      int main(void)
      {
      	int i;
      	if (iopl(3))
      	{
      		err(1, "set iopl unsuccessfully\n");
      		return -1;
      	}
      	static char buf[0x40];
      
      	/* test ioport 0x40,0x41,0x42,0x43,0x44,0x45 */
      
      	memset(buf, 0xab, sizeof(buf));
      
      	asm volatile("push %rdi;");
      	asm volatile("mov %0, %%rdi;"::"q"(buf));
      
      	asm volatile ("mov $0x40, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x41, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x42, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x43, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x44, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x45, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("pop %rdi;");
      	hexdump(buf, 0x40);
      
      	printf("\n");
      
      	/* ins port 0x40 */
      
      	memset(buf, 0xab, sizeof(buf));
      
      	asm volatile("push %rdi;");
      	asm volatile("mov %0, %%rdi;"::"q"(buf));
      
      	asm volatile ("mov $0x20, %rcx;");
      	asm volatile ("mov $0x40, %rdx;");
      	asm volatile ("rep insb;");
      
      	asm volatile ("pop %rdi;");
      	hexdump(buf, 0x40);
      
      	printf("\n");
      
      	/* ins port 0x43 */
      
      	memset(buf, 0xab, sizeof(buf));
      
      	asm volatile("push %rdi;");
      	asm volatile("mov %0, %%rdi;"::"q"(buf));
      
      	asm volatile ("mov $0x20, %rcx;");
      	asm volatile ("mov $0x43, %rdx;");
      	asm volatile ("rep insb;");
      
      	asm volatile ("pop %rdi;");
      	hexdump(buf, 0x40);
      
      	printf("\n");
      	return 0;
      }
      
      The vcpu->arch.pio_data buffer is used by both in/out instrutions emulation
      w/o clear after using which results in some random datas are left over in
      the buffer. Guest reads port 0x43 will be ignored since it is write only,
      however, the function kernel_pio() can't distigush this ignore from successfully
      reads data from device's ioport. There is no new data fill the buffer from
      port 0x43, however, emulator_pio_in_emulated() will copy the stale data in
      the buffer to the guest unconditionally. This patch fixes it by clearing the
      buffer before in instruction emulation to avoid to grant guest the stale data
      in the buffer.
      
      In addition, string I/O is not supported for in kernel device. So there is no
      iteration to read ioport %RCX times for string I/O. The function kernel_pio()
      just reads one round, and then copy the io size * %RCX to the guest unconditionally,
      actually it copies the one round ioport data w/ other random datas which are left
      over in the vcpu->arch.pio_data buffer to the guest. This patch fixes it by
      introducing the string I/O support for in kernel device in order to grant the right
      ioport datas to the guest.
      
      Before the patch:
      
      0x000000: fe 38 93 93 ff ff ab ab .8......
      0x000008: ab ab ab ab ab ab ab ab ........
      0x000010: ab ab ab ab ab ab ab ab ........
      0x000018: ab ab ab ab ab ab ab ab ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: f6 00 00 00 00 00 00 00 ........
      0x000008: 00 00 00 00 00 00 00 00 ........
      0x000010: 00 00 00 00 4d 51 30 30 ....MQ00
      0x000018: 30 30 20 33 20 20 20 20 00 3
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: f6 00 00 00 00 00 00 00 ........
      0x000008: 00 00 00 00 00 00 00 00 ........
      0x000010: 00 00 00 00 4d 51 30 30 ....MQ00
      0x000018: 30 30 20 33 20 20 20 20 00 3
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      After the patch:
      
      0x000000: 1e 02 f8 00 ff ff ab ab ........
      0x000008: ab ab ab ab ab ab ab ab ........
      0x000010: ab ab ab ab ab ab ab ab ........
      0x000018: ab ab ab ab ab ab ab ab ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: d2 e2 d2 df d2 db d2 d7 ........
      0x000008: d2 d3 d2 cf d2 cb d2 c7 ........
      0x000010: d2 c4 d2 c0 d2 bc d2 b8 ........
      0x000018: d2 b4 d2 b0 d2 ac d2 a8 ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: 00 00 00 00 00 00 00 00 ........
      0x000008: 00 00 00 00 00 00 00 00 ........
      0x000010: 00 00 00 00 00 00 00 00 ........
      0x000018: 00 00 00 00 00 00 00 00 ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      Reported-by: default avatarMoguofang <moguofang@huawei.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Moguofang <moguofang@huawei.com>
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5bbaf88d
    • Wanpeng Li's avatar
      KVM: x86: Fix potential preemption when get the current kvmclock timestamp · b67dcf8c
      Wanpeng Li authored
      commit e2c2206a upstream.
      
       BUG: using __this_cpu_read() in preemptible [00000000] code: qemu-system-x86/2809
       caller is __this_cpu_preempt_check+0x13/0x20
       CPU: 2 PID: 2809 Comm: qemu-system-x86 Not tainted 4.11.0+ #13
       Call Trace:
        dump_stack+0x99/0xce
        check_preemption_disabled+0xf5/0x100
        __this_cpu_preempt_check+0x13/0x20
        get_kvmclock_ns+0x6f/0x110 [kvm]
        get_time_ref_counter+0x5d/0x80 [kvm]
        kvm_hv_process_stimers+0x2a1/0x8a0 [kvm]
        ? kvm_hv_process_stimers+0x2a1/0x8a0 [kvm]
        ? kvm_arch_vcpu_ioctl_run+0xac9/0x1ce0 [kvm]
        kvm_arch_vcpu_ioctl_run+0x5bf/0x1ce0 [kvm]
        kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
        ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
        ? __fget+0xf3/0x210
        do_vfs_ioctl+0xa4/0x700
        ? __fget+0x114/0x210
        SyS_ioctl+0x79/0x90
        entry_SYSCALL_64_fastpath+0x23/0xc2
       RIP: 0033:0x7f9d164ed357
        ? __this_cpu_preempt_check+0x13/0x20
      
      This can be reproduced by run kvm-unit-tests/hyperv_stimer.flat w/
      CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT enabled.
      
      Safe access to per-CPU data requires a couple of constraints, though: the
      thread working with the data cannot be preempted and it cannot be migrated
      while it manipulates per-CPU variables. If the thread is preempted, the
      thread that replaces it could try to work with the same variables; migration
      to another CPU could also cause confusion. However there is no preemption
      disable when reads host per-CPU tsc rate to calculate the current kvmclock
      timestamp.
      
      This patch fixes it by utilizing get_cpu/put_cpu pair to guarantee both
      __this_cpu_read() and rdtsc() are not preempted.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b67dcf8c