1. 15 Feb, 2019 5 commits
    • Eric W. Biederman's avatar
      signal: Better detection of synchronous signals · 284f7b1a
      Eric W. Biederman authored
      commit 7146db33 upstream.
      
      Recently syzkaller was able to create unkillablle processes by
      creating a timer that is delivered as a thread local signal on SIGHUP,
      and receiving SIGHUP SA_NODEFERER.  Ultimately causing a loop failing
      to deliver SIGHUP but always trying.
      
      When the stack overflows delivery of SIGHUP fails and force_sigsegv is
      called.  Unfortunately because SIGSEGV is numerically higher than
      SIGHUP next_signal tries again to deliver a SIGHUP.
      
      From a quality of implementation standpoint attempting to deliver the
      timer SIGHUP signal is wrong.  We should attempt to deliver the
      synchronous SIGSEGV signal we just forced.
      
      We can make that happening in a fairly straight forward manner by
      instead of just looking at the signal number we also look at the
      si_code.  In particular for exceptions (aka synchronous signals) the
      si_code is always greater than 0.
      
      That still has the potential to pick up a number of asynchronous
      signals as in a few cases the same si_codes that are used
      for synchronous signals are also used for asynchronous signals,
      and SI_KERNEL is also included in the list of possible si_codes.
      
      Still the heuristic is much better and timer signals are definitely
      excluded.  Which is enough to prevent all known ways for someone
      sending a process signals fast enough to cause unexpected and
      arguably incorrect behavior.
      
      Cc: stable@vger.kernel.org
      Fixes: a27341cd ("Prioritize synchronous signals over 'normal' signals")
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      284f7b1a
    • Eric W. Biederman's avatar
      signal: Always notice exiting tasks · 3edbf743
      Eric W. Biederman authored
      commit 35634ffa upstream.
      
      Recently syzkaller was able to create unkillablle processes by
      creating a timer that is delivered as a thread local signal on SIGHUP,
      and receiving SIGHUP SA_NODEFERER.  Ultimately causing a loop
      failing to deliver SIGHUP but always trying.
      
      Upon examination it turns out part of the problem is actually most of
      the solution.  Since 2.5 signal delivery has found all fatal signals,
      marked the signal group for death, and queued SIGKILL in every threads
      thread queue relying on signal->group_exit_code to preserve the
      information of which was the actual fatal signal.
      
      The conversion of all fatal signals to SIGKILL results in the
      synchronous signal heuristic in next_signal kicking in and preferring
      SIGHUP to SIGKILL.  Which is especially problematic as all
      fatal signals have already been transformed into SIGKILL.
      
      Instead of dequeueing signals and depending upon SIGKILL to
      be the first signal dequeued, first test if the signal group
      has already been marked for death.  This guarantees that
      nothing in the signal queue can prevent a process that needs
      to exit from exiting.
      
      Cc: stable@vger.kernel.org
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Ref: ebf5ebe3 ("[PATCH] signal-fixes-2.5.59-A4")
      History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.gitSigned-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3edbf743
    • Matt Ranostay's avatar
      iio: chemical: atlas-ph-sensor: correct IIO_TEMP values to millicelsius · 4c935000
      Matt Ranostay authored
      commit 0808831d upstream.
      
      IIO_TEMP scale value for temperature was incorrect and not in millicelsius
      as required by the ABI documentation.
      Signed-off-by: default avatarMatt Ranostay <matt.ranostay@konsulko.com>
      Fixes: 27dec00e (iio: chemical: add Atlas pH-SM sensor support)
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c935000
    • Hans de Goede's avatar
      iio: adc: axp288: Fix TS-pin handling · 4337e207
      Hans de Goede authored
      commit 9bcf15f7 upstream.
      
      Prior to this commit there were 3 issues with our handling of the TS-pin:
      
      1) There are 2 ways how the firmware can disable monitoring of the TS-pin
      for designs which do not have a temperature-sensor for the battery:
      a) Clearing bit 0 of the AXP20X_ADC_EN1 register
      b) Setting bit 2 of the AXP288_ADC_TS_PIN_CTRL monitoring
      
      Prior to this commit we were unconditionally setting both bits to the
      value used on devices with a TS. This causes the temperature protection to
      kick in on devices without a TS, such as the Jumper ezbook v2, causing
      them to not charge under Linux.
      
      This commit fixes this by using regmap_update_bits when updating these 2
      registers, leaving the 2 mentioned bits alone.
      
      The next 2 problems are related to our handling of the current-source
      for the TS-pin. The current-source used for the battery temp-sensor (TS)
      is shared with the GPADC. For proper fuel-gauge and charger operation the
      TS current-source needs to be permanently on. But to read the GPADC we
      need to temporary switch the TS current-source to ondemand, so that the
      GPADC can use it, otherwise we will always read an all 0 value.
      
      2) Problem 2 is we were writing hardcoded values to the ADC TS pin-ctrl
      register, overwriting various other unrelated bits. Specifically we were
      overwriting the current-source setting for the TS and GPIO0 pins, forcing
      it to 80ųA independent of its original setting. On a Chuwi Vi10 tablet
      this was causing us to get a too high adc value (due to a too high
      current-source) resulting in the following errors being logged:
      
      ACPI Error: AE_ERROR, Returned by Handler for [UserDefinedRegion]
      ACPI Error: Method parse/execution failed \_SB.SXP1._TMP, AE_ERROR
      
      This commit fixes this by using regmap_update_bits to change only the
      relevant bits.
      
      3) After reading the GPADC channel we were unconditionally enabling the
      TS current-source even on devices where the TS-pin is not used and the
      current-source thus was off before axp288_adc_read_raw call.
      
      This commit fixes this by making axp288_adc_set_ts a nop on devices where
      the ADC is not enabled for the TS-pin.
      
      BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1610545
      Fixes: 3091141d ("iio: adc: axp288: Fix the GPADC pin ...")
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4337e207
    • Martin Kepplinger's avatar
      mtd: rawnand: gpmi: fix MX28 bus master lockup problem · 3ec492ed
      Martin Kepplinger authored
      commit d5d27fd9 upstream.
      
      Disable BCH soft reset according to MX23 erratum #2847 ("BCH soft
      reset may cause bus master lock up") for MX28 too. It has the same
      problem.
      
      Observed problem: once per 100,000+ MX28 reboots NAND read failed on
      DMA timeout errors:
      [    1.770823] UBI: attaching mtd3 to ubi0
      [    2.768088] gpmi_nand: DMA timeout, last DMA :1
      [    3.958087] gpmi_nand: BCH timeout, last DMA :1
      [    4.156033] gpmi_nand: Error in ECC-based read: -110
      [    4.161136] UBI warning: ubi_io_read: error -110 while reading 64
      bytes from PEB 0:0, read only 0 bytes, retry
      [    4.171283] step 1 error
      [    4.173846] gpmi_nand: Chip: 0, Error -1
      
      Without BCH soft reset we successfully executed 1,000,000 MX28 reboots.
      
      I have a quote from NXP regarding this problem, from July 18th 2016:
      
      "As the i.MX23 and i.MX28 are of the same generation, they share many
      characteristics. Unfortunately, also the erratas may be shared.
      In case of the documented erratas and the workarounds, you can also
      apply the workaround solution of one device on the other one. This have
      been reported, but I’m afraid that there are not an estimated date for
      updating the Errata documents.
      Please accept our apologies for any inconveniences this may cause."
      
      Fixes: 6f2a6a52 ("mtd: nand: gpmi: reset BCH earlier, too, to avoid NAND startup problems")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarManfred Schlaegl <manfred.schlaegl@ginzinger.com>
      Signed-off-by: default avatarMartin Kepplinger <martin.kepplinger@ginzinger.com>
      Reviewed-by: default avatarMiquel Raynal <miquel.raynal@bootlin.com>
      Reviewed-by: default avatarFabio Estevam <festevam@gmail.com>
      Acked-by: default avatarHan Xu <han.xu@nxp.com>
      Signed-off-by: default avatarBoris Brezillon <bbrezillon@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3ec492ed
  2. 12 Feb, 2019 35 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.99 · 383e9b61
      Greg Kroah-Hartman authored
      383e9b61
    • Lorenzo Bianconi's avatar
      ath9k: dynack: check da->enabled first in sampling routines · aa23996b
      Lorenzo Bianconi authored
      commit 9d3d65a9 upstream.
      
      Check da->enabled flag first in ath_dynack_sample_tx_ts and
      ath_dynack_sample_ack_ts routines in order to avoid useless
      processing
      Tested-by: default avatarKoen Vandeputte <koen.vandeputte@ncentric.com>
      Signed-off-by: default avatarLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      aa23996b
    • Lorenzo Bianconi's avatar
      ath9k: dynack: make ewma estimation faster · d2e843ea
      Lorenzo Bianconi authored
      commit 0c60c490 upstream.
      
      In order to make propagation time estimation faster,
      use current sample as ewma output value during 'late ack'
      tracking
      Tested-by: default avatarKoen Vandeputte <koen.vandeputte@ncentric.com>
      Signed-off-by: default avatarLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d2e843ea
    • Peter Zijlstra's avatar
      perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu() · db85eb41
      Peter Zijlstra authored
      commit 602cae04 upstream.
      
      intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other
      members of struct cpu_hw_events. This memory is released in
      intel_pmu_cpu_dying() which is wrong. The counterpart of the
      intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu().
      
      Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE
      and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but
      allocate new memory on the next attempt to online the CPU (leaking the
      old memory).
      Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and
      CPUHP_PERF_X86_PREPARE then the CPU will go back online but never
      allocate the memory that was released in x86_pmu_dying_cpu().
      
      Make the memory allocation/free symmetrical in regard to the CPU hotplug
      notifier by moving the deallocation to intel_pmu_cpu_dead().
      
      This started in commit:
      
         a7e3ed1e ("perf: Add support for supplementary event registers").
      
      In principle the bug was introduced in v2.6.39 (!), but it will almost
      certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15...
      
      [ bigeasy: Added patch description. ]
      [ mingo: Added backporting guidance. ]
      Reported-by: default avatarHe Zhe <zhe.he@windriver.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With developer hat on
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With maintainer hat on
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: jolsa@kernel.org
      Cc: kan.liang@linux.intel.com
      Cc: namhyung@kernel.org
      Cc: <stable@vger.kernel.org>
      Fixes: a7e3ed1e ("perf: Add support for supplementary event registers").
      Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [ He Zhe: Fixes conflict caused by missing disable_counter_freeze which is
       introduced since v4.20 af3bdb99. ]
      Signed-off-by: default avatarHe Zhe <zhe.he@windriver.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db85eb41
    • Mike Marciniszyn's avatar
      IB/hfi1: Add limit test for RC/UC send via loopback · e36c4e23
      Mike Marciniszyn authored
      commit 09ce351d upstream.
      
      Fix potential memory corruption and panic in loopback for IB_WR_SEND
      variants.
      
      The code blindly assumes the posted length will fit in the fetched rwqe,
      which is not a valid assumption.
      
      Fix by adding a limit test, and triggering the appropriate send completion
      and putting the QP in an error state.  This mimics the handling for
      non-loopback QPs.
      
      Fixes: 15703461 ("IB/{hfi1, qib, rdmavt}: Move ruc_loopback to rdmavt")
      Cc: <stable@vger.kernel.org> #v4.20+
      Reviewed-by: default avatarMichael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      e36c4e23
    • J. Bruce Fields's avatar
      nfsd4: catch some false session retries · ff371bc8
      J. Bruce Fields authored
      commit 53da6a53 upstream.
      
      The spec allows us to return NFS4ERR_SEQ_FALSE_RETRY if we notice that
      the client is making a call that matches a previous (slot, seqid) pair
      but that *isn't* actually a replay, because some detail of the call
      doesn't actually match the previous one.
      
      Catching every such case is difficult, but we may as well catch a few
      easy ones.  This also handles the case described in the previous patch,
      in a different way.
      
      The spec does however require us to catch the case where the difference
      is in the rpc credentials.  This prevents somebody from snooping another
      user's replies by fabricating retries.
      
      (But the practical value of the attack is limited by the fact that the
      replies with the most sensitive data are READ replies, which are not
      normally cached.)
      Tested-by: default avatarOlga Kornievskaia <aglo@umich.edu>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarDonald Buczek <buczek@molgen.mpg.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ff371bc8
    • J. Bruce Fields's avatar
      nfsd4: fix cached replies to solo SEQUENCE compounds · e74cfcb8
      J. Bruce Fields authored
      commit 085def3a upstream.
      
      Currently our handling of 4.1+ requests without "cachethis" set is
      confusing and not quite correct.
      
      Suppose a client sends a compound consisting of only a single SEQUENCE
      op, and it matches the seqid in a session slot (so it's a retry), but
      the previous request with that seqid did not have "cachethis" set.
      
      The obvious thing to do might be to return NFS4ERR_RETRY_UNCACHED_REP,
      but the protocol only allows that to be returned on the op following the
      SEQUENCE, and there is no such op in this case.
      
      The protocol permits us to cache replies even if the client didn't ask
      us to.  And it's easy to do so in the case of solo SEQUENCE compounds.
      
      So, when we get a solo SEQUENCE, we can either return the previously
      cached reply or NFSERR_SEQ_FALSE_RETRY if we notice it differs in some
      way from the original call.
      
      Currently, we're returning a corrupt reply in the case a solo SEQUENCE
      matches a previous compound with more ops.  This actually matters
      because the Linux client recently started doing this as a way to recover
      from lost replies to idempotent operations in the case the process doing
      the original reply was killed: in that case it's difficult to keep the
      original arguments around to do a real retry, and the client no longer
      cares what the result is anyway, but it would like to make sure that the
      slot's sequence id has been incremented, and the solo SEQUENCE assures
      that: if the server never got the original reply, it will increment the
      sequence id.  If it did get the original reply, it won't increment, and
      nothing else that about the reply really matters much.  But we can at
      least attempt to return valid xdr!
      Tested-by: default avatarOlga Kornievskaia <aglo@umich.edu>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarDonald Buczek <buczek@molgen.mpg.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e74cfcb8
    • Andy Shevchenko's avatar
      serial: 8250_pci: Make PCI class test non fatal · f43699b8
      Andy Shevchenko authored
      commit 824d17c5 upstream.
      
      As has been reported the National Instruments serial cards have broken
      PCI class.
      
      The commit 7d8905d0
      
        ("serial: 8250_pci: Enable device after we check black list")
      
      made the PCI class check mandatory for the case when device is listed in
      a quirk list.
      
      Make PCI class test non fatal to allow broken card be enumerated.
      
      Fixes: 7d8905d0 ("serial: 8250_pci: Enable device after we check black list")
      Cc: stable <stable@vger.kernel.org>
      Reported-by: default avatarGuan Yung Tseng <guan.yung.tseng@ni.com>
      Tested-by: default avatarGuan Yung Tseng <guan.yung.tseng@ni.com>
      Tested-by: default avatarKHUENY.Gerhard <Gerhard.KHUENY@bachmann.info>
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f43699b8
    • Greg Kroah-Hartman's avatar
      serial: fix race between flush_to_ldisc and tty_open · cc616243
      Greg Kroah-Hartman authored
      commit fedb5760 upstream.
      
      There still is a race window after the commit b027e229
      ("tty: fix data race between tty_init_dev and flush of buf"),
      and we encountered this crash issue if receive_buf call comes
      before tty initialization completes in tty_open and
      tty->driver_data may be NULL.
      
      CPU0                                    CPU1
      ----                                    ----
                                        tty_open
                                         tty_init_dev
                                           tty_ldisc_unlock
                                             schedule
      flush_to_ldisc
       receive_buf
        tty_port_default_receive_buf
         tty_ldisc_receive_buf
          n_tty_receive_buf_common
            __receive_buf
             uart_flush_chars
              uart_start
              /*tty->driver_data is NULL*/
                                         tty->ops->open
                                         /*init tty->driver_data*/
      
      it can be fixed by extending ldisc semaphore lock in tty_init_dev
      to driver_data initialized completely after tty->ops->open(), but
      this will lead to get lock on one function and unlock in some other
      function, and hard to maintain, so fix this race only by checking
      tty->driver_data when receiving, and return if tty->driver_data
      is NULL, and n_tty_receive_buf_common maybe calls uart_unthrottle,
      so add the same check.
      
      Because the tty layer knows nothing about the driver associated with the
      device, the tty layer can not do anything here, it is up to the tty
      driver itself to check for this type of race.  Fix up the serial driver
      to correctly check to see if it is finished binding with the device when
      being called, and if not, abort the tty calls.
      
      [Description and problem report and testing from Li RongQing, I rewrote
      the patch to be in the serial layer, not in the tty core - gregkh]
      Reported-by: default avatarLi RongQing <lirongqing@baidu.com>
      Tested-by: default avatarLi RongQing <lirongqing@baidu.com>
      Signed-off-by: default avatarWang Li <wangli39@baidu.com>
      Signed-off-by: default avatarZhang Yu <zhangyu31@baidu.com>
      Signed-off-by: default avatarLi RongQing <lirongqing@baidu.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cc616243
    • Gustavo A. R. Silva's avatar
      perf tests evsel-tp-sched: Fix bitwise operator · ee5470d3
      Gustavo A. R. Silva authored
      commit 489338a7 upstream.
      
      Notice that the use of the bitwise OR operator '|' always leads to true
      in this particular case, which seems a bit suspicious due to the context
      in which this expression is being used.
      
      Fix this by using bitwise AND operator '&' instead.
      
      This bug was detected with the help of Coccinelle.
      Signed-off-by: default avatarGustavo A. R. Silva <gustavo@embeddedor.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Fixes: 6a6cd11d ("perf test: Add test for the sched tracepoint format fields")
      Link: http://lkml.kernel.org/r/20190122233439.GA5868@embeddedorSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ee5470d3
    • Mark Rutland's avatar
      perf/core: Don't WARN() for impossible ring-buffer sizes · bbf85087
      Mark Rutland authored
      commit 9dff0aa9 upstream.
      
      The perf tool uses /proc/sys/kernel/perf_event_mlock_kb to determine how
      large its ringbuffer mmap should be. This can be configured to arbitrary
      values, which can be larger than the maximum possible allocation from
      kmalloc.
      
      When this is configured to a suitably large value (e.g. thanks to the
      perf fuzzer), attempting to use perf record triggers a WARN_ON_ONCE() in
      __alloc_pages_nodemask():
      
         WARNING: CPU: 2 PID: 5666 at mm/page_alloc.c:4511 __alloc_pages_nodemask+0x3f8/0xbc8
      
      Let's avoid this by checking that the requested allocation is possible
      before calling kzalloc.
      Reported-by: default avatarJulien Thierry <julien.thierry@arm.com>
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarJulien Thierry <julien.thierry@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20190110142745.25495-1-mark.rutland@arm.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bbf85087
    • Tony Luck's avatar
      x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out() · 96ae22dc
      Tony Luck authored
      commit d28af26f upstream.
      
      Internal injection testing crashed with a console log that said:
      
        mce: [Hardware Error]: CPU 7: Machine Check Exception: f Bank 0: bd80000000100134
      
      This caused a lot of head scratching because the MCACOD (bits 15:0) of
      that status is a signature from an L1 data cache error. But Linux says
      that it found it in "Bank 0", which on this model CPU only reports L1
      instruction cache errors.
      
      The answer was that Linux doesn't initialize "m->bank" in the case that
      it finds a fatal error in the mce_no_way_out() pre-scan of banks. If
      this was a local machine check, then this partially initialized struct
      mce is being passed to mce_panic().
      
      Fix is simple: just initialize m->bank in the case of a fatal error.
      
      Fixes: 40c36e27 ("x86/mce: Fix incorrect "Machine check from unknown source" message")
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: x86-ml <x86@kernel.org>
      Cc: stable@vger.kernel.org # v4.18 Note pre-v5.0 arch/x86/kernel/cpu/mce/core.c was called arch/x86/kernel/cpu/mcheck/mce.c
      Link: https://lkml.kernel.org/r/20190201003341.10638-1-tony.luck@intel.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      96ae22dc
    • Kan Liang's avatar
      perf/x86/intel/uncore: Add Node ID mask · 5d4dbc45
      Kan Liang authored
      commit 9e63a789 upstream.
      
      Some PCI uncore PMUs cannot be registered on an 8-socket system (HPE
      Superdome Flex).
      
      To understand which Socket the PCI uncore PMUs belongs to, perf retrieves
      the local Node ID of the uncore device from CPUNODEID(0xC0) of the PCI
      configuration space, and the mapping between Socket ID and Node ID from
      GIDNIDMAP(0xD4). The Socket ID can be calculated accordingly.
      
      The local Node ID is only available at bit 2:0, but current code doesn't
      mask it. If a BIOS doesn't clear the rest of the bits, an incorrect Node ID
      will be fetched.
      
      Filter the Node ID by adding a mask.
      Reported-by: default avatarSong Liu <songliubraving@fb.com>
      Tested-by: default avatarSong Liu <songliubraving@fb.com>
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org> # v3.7+
      Fixes: 7c94ee2e ("perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support")
      Link: https://lkml.kernel.org/r/1548600794-33162-1-git-send-email-kan.liang@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d4dbc45
    • Josh Poimboeuf's avatar
      cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM · 5e1f1c1f
      Josh Poimboeuf authored
      commit b284909a upstream.
      
      With the following commit:
      
        73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
      
      ... the hotplug code attempted to detect when SMT was disabled by BIOS,
      in which case it reported SMT as permanently disabled.  However, that
      code broke a virt hotplug scenario, where the guest is booted with only
      primary CPU threads, and a sibling is brought online later.
      
      The problem is that there doesn't seem to be a way to reliably
      distinguish between the HW "SMT disabled by BIOS" case and the virt
      "sibling not yet brought online" case.  So the above-mentioned commit
      was a bit misguided, as it permanently disabled SMT for both cases,
      preventing future virt sibling hotplugs.
      
      Going back and reviewing the original problems which were attempted to
      be solved by that commit, when SMT was disabled in BIOS:
      
        1) /sys/devices/system/cpu/smt/control showed "on" instead of
           "notsupported"; and
      
        2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning.
      
      I'd propose that we instead consider #1 above to not actually be a
      problem.  Because, at least in the virt case, it's possible that SMT
      wasn't disabled by BIOS and a sibling thread could be brought online
      later.  So it makes sense to just always default the smt control to "on"
      to allow for that possibility (assuming cpuid indicates that the CPU
      supports SMT).
      
      The real problem is #2, which has a simple fix: change vmx_vm_init() to
      query the actual current SMT state -- i.e., whether any siblings are
      currently online -- instead of looking at the SMT "control" sysfs value.
      
      So fix it by:
      
        a) reverting the original "fix" and its followup fix:
      
           73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
           bc2d8d26 ("cpu/hotplug: Fix SMT supported evaluation")
      
           and
      
        b) changing vmx_vm_init() to query the actual current SMT state --
           instead of the sysfs control value -- to determine whether the L1TF
           warning is needed.  This also requires the 'sched_smt_present'
           variable to exported, instead of 'cpu_smt_control'.
      
      Fixes: 73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
      Reported-by: default avatarIgor Mammedov <imammedo@redhat.com>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kvm@vger.kernel.org
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      5e1f1c1f
    • Peter Shier's avatar
      KVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221) · 1c965b1b
      Peter Shier authored
      commit ecec7688 upstream.
      
      Bugzilla: 1671904
      
      There are multiple code paths where an hrtimer may have been started to
      emulate an L1 VMX preemption timer that can result in a call to free_nested
      without an intervening L2 exit where the hrtimer is normally
      cancelled. Unconditionally cancel in free_nested to cover all cases.
      
      Embargoed until Feb 7th 2019.
      Signed-off-by: default avatarPeter Shier <pshier@google.com>
      Reported-by: default avatarJim Mattson <jmattson@google.com>
      Reviewed-by: default avatarJim Mattson <jmattson@google.com>
      Reported-by: default avatarFelix Wilhelm <fwilhelm@google.com>
      Cc: stable@kernel.org
      Message-Id: <20181011184646.154065-1-pshier@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1c965b1b
    • Jann Horn's avatar
      kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974) · 8c1b11bc
      Jann Horn authored
      commit cfa39381 upstream.
      
      kvm_ioctl_create_device() does the following:
      
      1. creates a device that holds a reference to the VM object (with a borrowed
         reference, the VM's refcount has not been bumped yet)
      2. initializes the device
      3. transfers the reference to the device to the caller's file descriptor table
      4. calls kvm_get_kvm() to turn the borrowed reference to the VM into a real
         reference
      
      The ownership transfer in step 3 must not happen before the reference to the VM
      becomes a proper, non-borrowed reference, which only happens in step 4.
      After step 3, an attacker can close the file descriptor and drop the borrowed
      reference, which can cause the refcount of the kvm object to drop to zero.
      
      This means that we need to grab a reference for the device before
      anon_inode_getfd(), otherwise the VM can disappear from under us.
      
      Fixes: 852b6d57 ("kvm: add device control API")
      Cc: stable@kernel.org
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8c1b11bc
    • Paolo Bonzini's avatar
      KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222) · ef1b3d48
      Paolo Bonzini authored
      commit 353c0956 upstream.
      
      Bugzilla: 1671930
      
      Emulation of certain instructions (VMXON, VMCLEAR, VMPTRLD, VMWRITE with
      memory operand, INVEPT, INVVPID) can incorrectly inject a page fault
      when passed an operand that points to an MMIO address.  The page fault
      will use uninitialized kernel stack memory as the CR2 and error code.
      
      The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
      exit to userspace; however, it is not an easy fix, so for now just
      ensure that the error code and CR2 are zero.
      
      Embargoed until Feb 7th 2019.
      Reported-by: default avatarFelix Wilhelm <fwilhelm@google.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ef1b3d48
    • James Bottomley's avatar
      scsi: aic94xx: fix module loading · c39ebf3b
      James Bottomley authored
      commit 42caa0ed upstream.
      
      The aic94xx driver is currently failing to load with errors like
      
      sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:02:00.3/0000:07:02.0/revision'
      
      Because the PCI code had recently added a file named 'revision' to every
      PCI device.  Fix this by renaming the aic94xx revision file to
      aic_revision.  This is safe to do for us because as far as I can tell,
      there's nothing in userspace relying on the current aic94xx revision file
      so it can be renamed without breaking anything.
      
      Fixes: 702ed3be (PCI: Create revision file in sysfs)
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJames Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c39ebf3b
    • Vaibhav Jain's avatar
      scsi: cxlflash: Prevent deadlock when adapter probe fails · 9a3c75fb
      Vaibhav Jain authored
      commit bb61b843 upstream.
      
      Presently when an error is encountered during probe of the cxlflash
      adapter, a deadlock is seen with cpu thread stuck inside
      cxlflash_remove(). Below is the trace of the deadlock as logged by
      khungtaskd:
      
      cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16
      INFO: task kworker/80:1:890 blocked for more than 120 seconds.
             Not tainted 5.0.0-rc4-capi2-kexec+ #2
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      kworker/80:1    D    0   890      2 0x00000808
      Workqueue: events work_for_cpu_fn
      
      Call Trace:
       0x4d72136320 (unreliable)
       __switch_to+0x2cc/0x460
       __schedule+0x2bc/0xac0
       schedule+0x40/0xb0
       cxlflash_remove+0xec/0x640 [cxlflash]
       cxlflash_probe+0x370/0x8f0 [cxlflash]
       local_pci_probe+0x6c/0x140
       work_for_cpu_fn+0x38/0x60
       process_one_work+0x260/0x530
       worker_thread+0x280/0x5d0
       kthread+0x1a8/0x1b0
       ret_from_kernel_thread+0x5c/0x80
      INFO: task systemd-udevd:5160 blocked for more than 120 seconds.
      
      The deadlock occurs as cxlflash_remove() is called from cxlflash_probe()
      without setting 'cxlflash_cfg->state' to STATE_PROBED and the probe thread
      starts to wait on 'cxlflash_cfg->reset_waitq'. Since the device was never
      successfully probed the 'cxlflash_cfg->state' never changes from
      STATE_PROBING hence the deadlock occurs.
      
      We fix this deadlock by setting the variable 'cxlflash_cfg->state' to
      STATE_PROBED in case an error occurs during cxlflash_probe() and just
      before calling cxlflash_remove().
      
      Cc: stable@vger.kernel.org
      Fixes: c21e0bbf("cxlflash: Base support for IBM CXL Flash Adapter")
      Signed-off-by: default avatarVaibhav Jain <vaibhav@linux.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9a3c75fb
    • Johan Hovold's avatar
      staging: speakup: fix tty-operation NULL derefs · 04f2d3f8
      Johan Hovold authored
      commit a1960e0f upstream.
      
      The send_xchar() and tiocmset() tty operations are optional. Add the
      missing sanity checks to prevent user-space triggerable NULL-pointer
      dereferences.
      
      Fixes: 6b9ad1c7 ("staging: speakup: add send_xchar, tiocmset and input functionality for tty")
      Cc: stable <stable@vger.kernel.org>     # 4.13
      Cc: Okash Khawaja <okash.khawaja@gmail.com>
      Cc: Samuel Thibault <samuel.thibault@ens-lyon.org>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Reviewed-by: default avatarSamuel Thibault <samuel.thibault@ens-lyon.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      04f2d3f8
    • Paul Elder's avatar
      usb: gadget: musb: fix short isoc packets with inventra dma · f8b500d1
      Paul Elder authored
      commit c418fd6c upstream.
      
      Handling short packets (length < max packet size) in the Inventra DMA
      engine in the MUSB driver causes the MUSB DMA controller to hang. An
      example of a problem that is caused by this problem is when streaming
      video out of a UVC gadget, only the first video frame is transferred.
      
      For short packets (mode-0 or mode-1 DMA), MUSB_TXCSR_TXPKTRDY must be
      set manually by the driver. This was previously done in musb_g_tx
      (musb_gadget.c), but incorrectly (all csr flags were cleared, and only
      MUSB_TXCSR_MODE and MUSB_TXCSR_TXPKTRDY were set). Fixing that problem
      allows some requests to be transferred correctly, but multiple requests
      were often put together in one USB packet, and caused problems if the
      packet size was not a multiple of 4. Instead, set MUSB_TXCSR_TXPKTRDY
      in dma_controller_irq (musbhsdma.c), just like host mode transfers.
      
      This topic was originally tackled by Nicolas Boichat [0] [1] and is
      discussed further at [2] as part of his GSoC project [3].
      
      [0] https://groups.google.com/forum/?hl=en#!topic/beagleboard-gsoc/k8Azwfp75CU
      [1] https://gitorious.org/beagleboard-usbsniffer/beagleboard-usbsniffer-kernel/commit/b0be3b6cc195ba732189b04f1d43ec843c3e54c9?p=beagleboard-usbsniffer:beagleboard-usbsniffer-kernel.git;a=patch;h=b0be3b6cc195ba732189b04f1d43ec843c3e54c9
      [2] http://beagleboard-usbsniffer.blogspot.com/2010/07/musb-isochronous-transfers-fixed.html
      [3] http://elinux.org/BeagleBoard/GSoC/USBSniffer
      
      Fixes: 550a7375 ("USB: Add MUSB and TUSB support")
      Signed-off-by: default avatarPaul Elder <paul.elder@ideasonboard.com>
      Signed-off-by: default avatarBin Liu <b-liu@ti.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f8b500d1
    • Gustavo A. R. Silva's avatar
      usb: gadget: udc: net2272: Fix bitwise and boolean operations · 565e332b
      Gustavo A. R. Silva authored
      commit 07c69f11 upstream.
      
      (!x & y) strikes again.
      
      Fix bitwise and boolean operations by enclosing the expression:
      
      	intcsr & (1 << NET2272_PCI_IRQ)
      
      in parentheses, before applying the boolean operator '!'.
      
      Notice that this code has been there since 2011. So, it would
      be helpful if someone can double-check this.
      
      This issue was detected with the help of Coccinelle.
      
      Fixes: ceb80363 ("USB: net2272: driver for PLX NET2272 USB device controller")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      565e332b
    • Tejas Joglekar's avatar
      usb: dwc3: gadget: Handle 0 xfer length for OUT EP · 09145ec8
      Tejas Joglekar authored
      commit 1e19cdc8 upstream.
      
      For OUT endpoints, zero-length transfers require MaxPacketSize buffer as
      per the DWC_usb3 programming guide 3.30a section 4.2.3.3.
      
      This patch fixes this by explicitly checking zero length
      transfer to correctly pad up to MaxPacketSize.
      
      Fixes: c6267a51 ("usb: dwc3: gadget: align transfers to wMaxPacketSize")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarTejas Joglekar <joglekar@synopsys.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      09145ec8
    • Bin Liu's avatar
      usb: phy: am335x: fix race condition in _probe · fb151544
      Bin Liu authored
      commit a53469a6 upstream.
      
      power off the phy should be done before populate the phy. Otherwise,
      am335x_init() could be called by the phy owner to power on the phy first,
      then am335x_phy_probe() turns off the phy again without the caller knowing
      it.
      
      Fixes: 2fc711d7 ("usb: phy: am335x: Enable USB remote wakeup using PHY wakeup")
      Cc: stable@vger.kernel.org # v3.18+
      Signed-off-by: default avatarBin Liu <b-liu@ti.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fb151544
    • Marc Zyngier's avatar
      irqchip/gic-v3-its: Plug allocation race for devices sharing a DevID · 2f2456fe
      Marc Zyngier authored
      commit 9791ec7d upstream.
      
      On systems or VMs where multiple devices share a single DevID
      (because they sit behind a PCI bridge, or because the HW is
      broken in funky ways), we reuse the save its_device structure
      in order to reflect this.
      
      It turns out that there is a distinct lack of locking when looking
      up the its_device, and two device being probed concurrently can result
      in double allocations. That's obviously not nice.
      
      A solution for this is to have a per-ITS mutex that serializes device
      allocation.
      
      A similar issue exists on the freeing side, which can run concurrently
      with the allocation. On top of now taking the appropriate lock, we
      also make sure that a shared device is never freed, as we have no way
      to currently track the life cycle of such object.
      Reported-by: default avatarZheng Xiang <zhengxiang9@huawei.com>
      Tested-by: default avatarZheng Xiang <zhengxiang9@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2f2456fe
    • Thomas Gleixner's avatar
      futex: Handle early deadlock return correctly · 3d8343b7
      Thomas Gleixner authored
      commit 1a1fb985 upstream.
      
      commit 56222b21 ("futex: Drop hb->lock before enqueueing on the
      rtmutex") changed the locking rules in the futex code so that the hash
      bucket lock is not longer held while the waiter is enqueued into the
      rtmutex wait list. This made the lock and the unlock path symmetric, but
      unfortunately the possible early exit from __rt_mutex_proxy_start() due to
      a detected deadlock was not updated accordingly. That allows a concurrent
      unlocker to observe inconsitent state which triggers the warning in the
      unlock path.
      
      futex_lock_pi()                         futex_unlock_pi()
        lock(hb->lock)
        queue(hb_waiter)				lock(hb->lock)
        lock(rtmutex->wait_lock)
        unlock(hb->lock)
                                              // acquired hb->lock
                                              hb_waiter = futex_top_waiter()
                                              lock(rtmutex->wait_lock)
        __rt_mutex_proxy_start()
           ---> fail
                remove(rtmutex_waiter);
           ---> returns -EDEADLOCK
        unlock(rtmutex->wait_lock)
                                              // acquired wait_lock
                                              wake_futex_pi()
                                              rt_mutex_next_owner()
      					  --> returns NULL
                                                --> WARN
      
        lock(hb->lock)
        unqueue(hb_waiter)
      
      The problem is caused by the remove(rtmutex_waiter) in the failure case of
      __rt_mutex_proxy_start() as this lets the unlocker observe a waiter in the
      hash bucket but no waiter on the rtmutex, i.e. inconsistent state.
      
      The original commit handles this correctly for the other early return cases
      (timeout, signal) by delaying the removal of the rtmutex waiter until the
      returning task reacquired the hash bucket lock.
      
      Treat the failure case of __rt_mutex_proxy_start() in the same way and let
      the existing cleanup code handle the eventual handover of the rtmutex
      gracefully. The regular rt_mutex_proxy_start() gains the rtmutex waiter
      removal for the failure case, so that the other callsites are still
      operating correctly.
      
      Add proper comments to the code so all these details are fully documented.
      
      Thanks to Peter for helping with the analysis and writing the really
      valuable code comments.
      
      Fixes: 56222b21 ("futex: Drop hb->lock before enqueueing on the rtmutex")
      Reported-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Co-developed-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: linux-s390@vger.kernel.org
      Cc: Stefan Liebler <stli@linux.ibm.com>
      Cc: Sebastian Sewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1901292311410.1950@nanos.tec.linutronix.deSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3d8343b7
    • Leonid Iziumtsev's avatar
      dmaengine: imx-dma: fix wrong callback invoke · 26dd015c
      Leonid Iziumtsev authored
      commit 341198ed upstream.
      
      Once the "ld_queue" list is not empty, next descriptor will migrate
      into "ld_active" list. The "desc" variable will be overwritten
      during that transition. And later the dmaengine_desc_get_callback_invoke()
      will use it as an argument. As result we invoke wrong callback.
      
      That behaviour was in place since:
      commit fcaaba6c ("dmaengine: imx-dma: fix callback path in tasklet").
      But after commit 4cd13c21 ("softirq: Let ksoftirqd do its job")
      things got worse, since possible delay between tasklet_schedule()
      from DMA irq handler and actual tasklet function execution got bigger.
      And that gave more time for new DMA request to be submitted and
      to be put into "ld_queue" list.
      
      It has been noticed that DMA issue is causing problems for "mxc-mmc"
      driver. While stressing the system with heavy network traffic and
      writing/reading to/from sd card simultaneously the timeout may happen:
      
      10013000.sdhci: mxcmci_watchdog: read time out (status = 0x30004900)
      
      That often lead to file system corruption.
      Signed-off-by: default avatarLeonid Iziumtsev <leonid.iziumtsev@gmail.com>
      Signed-off-by: default avatarVinod Koul <vkoul@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      26dd015c
    • Lukas Wunner's avatar
      dmaengine: bcm2835: Fix abort of transactions · f9f256b1
      Lukas Wunner authored
      commit 9e528c79 upstream.
      
      There are multiple issues with bcm2835_dma_abort() (which is called on
      termination of a transaction):
      
      * The algorithm to abort the transaction first pauses the channel by
        clearing the ACTIVE flag in the CS register, then waits for the PAUSED
        flag to clear.  Page 49 of the spec documents the latter as follows:
      
        "Indicates if the DMA is currently paused and not transferring data.
         This will occur if the active bit has been cleared [...]"
         https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
      
        So the function is entering an infinite loop because it is waiting for
        PAUSED to clear which is always set due to the function having cleared
        the ACTIVE flag.  The only thing that's saving it from itself is the
        upper bound of 10000 loop iterations.
      
        The code comment says that the intention is to "wait for any current
        AXI transfer to complete", so the author probably wanted to check the
        WAITING_FOR_OUTSTANDING_WRITES flag instead.  Amend the function
        accordingly.
      
      * The CS register is only read at the beginning of the function.  It
        needs to be read again after pausing the channel and before checking
        for outstanding writes, otherwise writes which were issued between
        the register read at the beginning of the function and pausing the
        channel may not be waited for.
      
      * The function seeks to abort the transfer by writing 0 to the NEXTCONBK
        register and setting the ABORT and ACTIVE flags.  Thereby, the 0 in
        NEXTCONBK is sought to be loaded into the CONBLK_AD register.  However
        experimentation has shown this approach to not work:  The CONBLK_AD
        register remains the same as before and the CS register contains
        0x00000030 (PAUSED | DREQ_STOPS_DMA).  In other words, the control
        block is not aborted but merely paused and it will be resumed once the
        next DMA transaction is started.  That is absolutely not the desired
        behavior.
      
        A simpler approach is to set the channel's RESET flag instead.  This
        reliably zeroes the NEXTCONBK as well as the CS register.  It requires
        less code and only a single MMIO write.  This is also what popular
        user space DMA drivers do, e.g.:
        https://github.com/metachris/RPIO/blob/master/source/c_pwm/pwm.c
      
        Note that the spec is contradictory whether the NEXTCONBK register
        is writeable at all.  On the one hand, page 41 claims:
      
        "The value loaded into the NEXTCONBK register can be overwritten so
        that the linked list of Control Block data structures can be
        dynamically altered. However it is only safe to do this when the DMA
        is paused."
      
        On the other hand, page 40 specifies:
      
        "Only three registers in each channel's register set are directly
        writeable (CS, CONBLK_AD and DEBUG). The other registers (TI,
        SOURCE_AD, DEST_AD, TXFR_LEN, STRIDE & NEXTCONBK), are automatically
        loaded from a Control Block data structure held in external memory."
      
      Fixes: 96286b57 ("dmaengine: Add support for BCM2835")
      Signed-off-by: default avatarLukas Wunner <lukas@wunner.de>
      Cc: stable@vger.kernel.org # v3.14+
      Cc: Frank Pavlic <f.pavlic@kunbus.de>
      Cc: Martin Sperl <kernel@martin.sperl.org>
      Cc: Florian Meier <florian.meier@koalo.de>
      Cc: Clive Messer <clive.m.messer@gmail.com>
      Cc: Matthias Reichl <hias@horus.com>
      Tested-by: default avatarStefan Wahren <stefan.wahren@i2se.com>
      Acked-by: default avatarFlorian Kauer <florian.kauer@koalo.de>
      Signed-off-by: default avatarVinod Koul <vkoul@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f9f256b1
    • Lukas Wunner's avatar
      dmaengine: bcm2835: Fix interrupt race on RT · 63ca7858
      Lukas Wunner authored
      commit f7da7782 upstream.
      
      If IRQ handlers are threaded (either because CONFIG_PREEMPT_RT_BASE is
      enabled or "threadirqs" was passed on the command line) and if system
      load is sufficiently high that wakeup latency of IRQ threads degrades,
      SPI DMA transactions on the BCM2835 occasionally break like this:
      
      ks8851 spi0.0: SPI transfer timed out
      bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
      ks8851 spi0.0 eth2: ks8851_rdfifo: spi_sync() failed
      
      The root cause is an assumption made by the DMA driver which is
      documented in a code comment in bcm2835_dma_terminate_all():
      
      /*
       * Stop DMA activity: we assume the callback will not be called
       * after bcm_dma_abort() returns (even if it does, it will see
       * c->desc is NULL and exit.)
       */
      
      That assumption falls apart if the IRQ handler bcm2835_dma_callback() is
      threaded: A client may terminate a descriptor and issue a new one
      before the IRQ handler had a chance to run. In fact the IRQ handler may
      miss an *arbitrary* number of descriptors. The result is the following
      race condition:
      
      1. A descriptor finishes, its interrupt is deferred to the IRQ thread.
      2. A client calls dma_terminate_async() which sets channel->desc = NULL.
      3. The client issues a new descriptor. Because channel->desc is NULL,
         bcm2835_dma_issue_pending() immediately starts the descriptor.
      4. Finally the IRQ thread runs and writes BCM2835_DMA_INT to the CS
         register to acknowledge the interrupt. This clears the ACTIVE flag,
         so the newly issued descriptor is paused in the middle of the
         transaction. Because channel->desc is not NULL, the IRQ thread
         finalizes the descriptor and tries to start the next one.
      
      I see two possible solutions: The first is to call synchronize_irq()
      in bcm2835_dma_issue_pending() to wait until the IRQ thread has
      finished before issuing a new descriptor. The downside of this approach
      is unnecessary latency if clients desire rapidly terminating and
      re-issuing descriptors and don't have any use for an IRQ callback.
      (The SPI TX DMA channel is a case in point.)
      
      A better alternative is to make the IRQ thread recognize that it has
      missed descriptors and avoid finalizing the newly issued descriptor.
      So first of all, set the ACTIVE flag when acknowledging the interrupt.
      This keeps a newly issued descriptor running.
      
      If the descriptor was finished, the channel remains idle despite the
      ACTIVE flag being set. However the ACTIVE flag can then no longer be
      used to check whether the channel is idle, so instead check whether
      the register containing the current control block address is zero
      and finalize the current descriptor only if so.
      
      That way, there is no impact on latency and throughput if the client
      doesn't care for the interrupt: Only minimal additional overhead is
      introduced for non-cyclic descriptors as one further MMIO read is
      necessary per interrupt to check for idleness of the channel. Cyclic
      descriptors are sped up slightly by removing one MMIO write per
      interrupt.
      
      Fixes: 96286b57 ("dmaengine: Add support for BCM2835")
      Signed-off-by: default avatarLukas Wunner <lukas@wunner.de>
      Cc: stable@vger.kernel.org # v3.14+
      Cc: Frank Pavlic <f.pavlic@kunbus.de>
      Cc: Martin Sperl <kernel@martin.sperl.org>
      Cc: Florian Meier <florian.meier@koalo.de>
      Cc: Clive Messer <clive.m.messer@gmail.com>
      Cc: Matthias Reichl <hias@horus.com>
      Tested-by: default avatarStefan Wahren <stefan.wahren@i2se.com>
      Acked-by: default avatarFlorian Kauer <florian.kauer@koalo.de>
      Signed-off-by: default avatarVinod Koul <vkoul@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      63ca7858
    • Miklos Szeredi's avatar
      fuse: handle zero sized retrieve correctly · 266a6989
      Miklos Szeredi authored
      commit 97e1532e upstream.
      
      Dereferencing req->page_descs[0] will Oops if req->max_pages is zero.
      
      Reported-by: syzbot+c1e36d30ee3416289cc0@syzkaller.appspotmail.com
      Tested-by: syzbot+c1e36d30ee3416289cc0@syzkaller.appspotmail.com
      Fixes: b2430d75 ("fuse: add per-page descriptor <offset, length> to fuse_req")
      Cc: <stable@vger.kernel.org> # v3.9
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      266a6989
    • Miklos Szeredi's avatar
      fuse: decrement NR_WRITEBACK_TEMP on the right page · b928e93d
      Miklos Szeredi authored
      commit a2ebba82 upstream.
      
      NR_WRITEBACK_TEMP is accounted on the temporary page in the request, not
      the page cache page.
      
      Fixes: 8b284dc4 ("fuse: writepages: handle same page rewrites")
      Cc: <stable@vger.kernel.org> # v3.13
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b928e93d
    • Jann Horn's avatar
      fuse: call pipe_buf_release() under pipe lock · 65f222bb
      Jann Horn authored
      commit 9509941e upstream.
      
      Some of the pipe_buf_release() handlers seem to assume that the pipe is
      locked - in particular, anon_pipe_buf_release() accesses pipe->tmp_page
      without taking any extra locks. From a glance through the callers of
      pipe_buf_release(), it looks like FUSE is the only one that calls
      pipe_buf_release() without having the pipe locked.
      
      This bug should only lead to a memory leak, nothing terrible.
      
      Fixes: dd3bb14f ("fuse: support splice() writing to fuse device")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      65f222bb
    • Takashi Iwai's avatar
      ALSA: hda - Serialize codec registrations · c201e435
      Takashi Iwai authored
      commit 305a0ade upstream.
      
      In the current code, the codec registration may happen both at the
      codec bind time and the end of the controller probe time.  In a rare
      occasion, they race with each other, leading to Oops due to the still
      uninitialized card device.
      
      This patch introduces a simple flag to prevent the codec registration
      at the codec bind time as long as the controller probe is going on.
      The controller probe invokes snd_card_register() that does the whole
      registration task, and we don't need to register each piece
      beforehand.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c201e435
    • Charles Keepax's avatar
      ALSA: compress: Fix stop handling on compressed capture streams · bbc0621f
      Charles Keepax authored
      commit 4f2ab5e1 upstream.
      
      It is normal user behaviour to start, stop, then start a stream
      again without closing it. Currently this works for compressed
      playback streams but not capture ones.
      
      The states on a compressed capture stream go directly from OPEN to
      PREPARED, unlike a playback stream which moves to SETUP and waits
      for a write of data before moving to PREPARED. Currently however,
      when a stop is sent the state is set to SETUP for both types of
      streams. This leaves a capture stream in the situation where a new
      start can't be sent as that requires the state to be PREPARED and
      a new set_params can't be sent as that requires the state to be
      OPEN. The only option being to close the stream, and then reopen.
      
      Correct this issues by allowing snd_compr_drain_notify to set the
      state depending on the stream direction, as we already do in
      set_params.
      
      Fixes: 49bb6402 ("ALSA: compress_core: Add support for capture streams")
      Signed-off-by: default avatarCharles Keepax <ckeepax@opensource.cirrus.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bbc0621f
    • Rundong Ge's avatar
      net: dsa: slave: Don't propagate flag changes on down slave interfaces · 6aab49c5
      Rundong Ge authored
      [ Upstream commit 17ab4f61 ]
      
      The unbalance of master's promiscuity or allmulti will happen after ifdown
      and ifup a slave interface which is in a bridge.
      
      When we ifdown a slave interface , both the 'dsa_slave_close' and
      'dsa_slave_change_rx_flags' will clear the master's flags. The flags
      of master will be decrease twice.
      In the other hand, if we ifup the slave interface again, since the
      slave's flags were cleared the 'dsa_slave_open' won't set the master's
      flag, only 'dsa_slave_change_rx_flags' that triggered by 'br_add_if'
      will set the master's flags. The flags of master is increase once.
      
      Only propagating flag changes when a slave interface is up makes
      sure this does not happen. The 'vlan_dev_change_rx_flags' had the
      same problem and was fixed, and changes here follows that fix.
      
      Fixes: 91da11f8 ("net: Distributed Switch Architecture protocol support")
      Signed-off-by: default avatarRundong Ge <rdong.ge@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6aab49c5