1. 14 Jul, 2019 28 commits
  2. 10 Jul, 2019 12 commits
    • Linux 4.19.58 · 7a6bfa08
      Greg Kroah-Hartman authored
    • dmaengine: imx-sdma: remove BD_INTR for channel0 · f37de75c
      Robin Gong authored
      commit 3f93a4f2 upstream.
      
      It is possible for an irq triggered by channel0 to be received
      after clks are disabled once the firmware is loaded during sdma probe.
      If that happens, clearing it by writing to SDMA_H_INTR won't work
      and the kernel will hang processing infinite interrupts. An interrupt
      is not actually needed on channel0, since the current code polls
      SDMA_H_STATSTOP to know when channel0 is done rather than waiting for
      an interrupt. So just clear BD_INTR to disable the channel0 interrupt
      and avoid the above case.

      This issue was introduced by commit 1d069bfa ("dmaengine: imx-sdma:
      ack channel 0 IRQ in the interrupt handler"), which didn't take care
      of the above case.
      
      Fixes: 1d069bfa ("dmaengine: imx-sdma: ack channel 0 IRQ in the interrupt handler")
      Cc: stable@vger.kernel.org #5.0+
      Signed-off-by: Robin Gong <yibin.gong@nxp.com>
      Reported-by: Sven Van Asbroeck <thesven73@gmail.com>
      Tested-by: Sven Van Asbroeck <thesven73@gmail.com>
      Reviewed-by: Michael Olbrich <m.olbrich@pengutronix.de>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dmaengine: qcom: bam_dma: Fix completed descriptors count · 018c968d
      Sricharan R authored
      commit f6034225 upstream.
      
      One slot is left unused in the circular FIFO to differentiate the
      'full' and 'empty' cases, so take that into account while
      counting the completed descriptors.
      
      Fixes the issue reported here,
      	https://lkml.org/lkml/2019/6/18/669
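
As a minimal sketch of the accounting described above (not the driver's actual code), the helpers below model a power-of-two ring in which one slot is always left unused. The names `RING_SIZE`, `ring_count` and `ring_space` are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical ring size; must be a power of two for the masking below. */
#define RING_SIZE 8u

/* Entries currently queued between tail (consumer) and head (producer). */
static uint32_t ring_count(uint32_t head, uint32_t tail)
{
	return (head - tail) & (RING_SIZE - 1u);
}

/* Free entries: one slot is never used, so head == tail can only mean "empty". */
static uint32_t ring_space(uint32_t head, uint32_t tail)
{
	return (tail - head - 1u) & (RING_SIZE - 1u);
}
```

With this layout a ring of `RING_SIZE` slots holds at most `RING_SIZE - 1` entries, so any code that counts completed entries has to use the same convention as the code that computes free space.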
      
      Cc: stable@vger.kernel.org
      Reported-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
      Signed-off-by: Sricharan R <sricharan@codeaurora.org>
      Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: have "plain" make calls build dtbs for selected platforms · 870de149
      Cedric Hombourger authored
      commit 637dfa0f upstream.
      
      scripts/package/builddeb calls "make dtbs_install" after executing
      a plain make (i.e. no build targets specified). It will fail if dtbs
      were not built beforehand. Match the arm64 architecture where DTBs get
      built by the "all" target.

      Signed-off-by: Cedric Hombourger <Cedric_Hombourger@mentor.com>
      [paul.burton@mips.com: s/builddep/builddeb]
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: stable@vger.kernel.org # v4.1+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: Add missing EHB in mtc0 -> mfc0 sequence. · 8957895b
      Dmitry Korotin authored
      commit 0b24cae4 upstream.
      
      Add a missing EHB (Execution Hazard Barrier) in mtc0 -> mfc0 sequence.
      Without this execution hazard barrier it's possible for the value read
      back from the KScratch register to be the value from before the mtc0.
      
      Reproducible on P5600 & P6600.
      
      The hazard is documented in the MIPS Architecture Reference Manual Vol.
      III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev
      6.03 table 8.1 which includes:
      
         Producer | Consumer | Hazard
        ----------|----------|----------------------------
         mtc0     | mfc0     | any coprocessor 0 register

      Signed-off-by: Dmitry Korotin <dkorotin@wavecomp.com>
      [paul.burton@mips.com:
        - Commit message tweaks.
        - Add Fixes tags.
        - Mark for stable back to v3.15 where P5600 support was introduced.]
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Fixes: 3d8bfdd0 ("MIPS: Use C0_KScratch (if present) to hold PGD pointer.")
      Fixes: 829dcc0a ("MIPS: Add MIPS P5600 probe support")
      Cc: linux-mips@vger.kernel.org
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: Fix bounds check virt_addr_valid · 2b8f8a80
      Hauke Mehrtens authored
      commit d6ed083f upstream.
      
      The bounds check used the uninitialized variable vaddr; it should use
      the given parameter kaddr instead. When using the uninitialized value,
      the compiler assumed it to be 0 and optimized this function to just
      return 0 in all cases.
      
      This should make the function check the range of the given address and
      only do the page map check in case it is in the expected range of
      virtual addresses.
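
The intended shape of such a check can be sketched in userspace as follows. This is an illustration only, not the MIPS implementation: the range constants and the `demo_page_present()` stub are hypothetical stand-ins for the platform's address layout and its page map lookup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds of the directly mapped range; real values are platform-specific. */
#define DEMO_MAP_START 0x80000000UL
#define DEMO_MAP_END   0xc0000000UL

/* Stand-in for the page map lookup: pretend only the first 16 MiB are backed. */
static bool demo_page_present(unsigned long vaddr)
{
	return vaddr - DEMO_MAP_START < (16UL << 20);
}

/* Range check on the caller's address first, page map check only if it passes. */
static bool demo_virt_addr_valid(const volatile void *kaddr)
{
	/* Derived from the parameter, never from an uninitialized local. */
	unsigned long vaddr = (unsigned long)(uintptr_t)kaddr;

	if (vaddr < DEMO_MAP_START || vaddr >= DEMO_MAP_END)
		return false;

	return demo_page_present(vaddr);
}
```

The point of the sketch is the data flow: the value compared against the bounds must come from `kaddr`, otherwise the compiler is free to treat the comparison as constant.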
      
      Fixes: 074a1e11 ("MIPS: Bounds check virt_addr_valid")
      Cc: stable@vger.kernel.org # v4.12+
      Cc: Paul Burton <paul.burton@mips.com>
      Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: ralf@linux-mips.org
      Cc: jhogan@kernel.org
      Cc: f4bug@amsat.org
      Cc: linux-mips@vger.kernel.org
      Cc: ysu@wavecomp.com
      Cc: jcristau@debian.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • svcrdma: Ignore source port when computing DRC hash · 80b25628
      Chuck Lever authored
      commit 1e091c3b upstream.
      
      The DRC appears to be effectively empty after an RPC/RDMA transport
      reconnect. The problem is that each connection uses a different
      source port, which defeats the DRC hash.
      
      Clients always have to disconnect before they send retransmissions
      to reset the connection's credit accounting, thus every retransmit
      on NFS/RDMA will miss the DRC.
      
      An NFS/RDMA client's IP source port is meaningless for RDMA
      transports. The transport layer typically sets the source port value
      on the connection to a random ephemeral port. The server already
      ignores it for the "secure port" check. See commit 16e4d93f
      ("NFSD: Ignore client's source port on RDMA transports").
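
The effect of ignoring the source port can be illustrated with a toy cache key. This is a sketch under stated assumptions, not the server's DRC code: `struct demo_drc_key` and both helpers are hypothetical, and a real DRC entry holds more fields (including the checksum mentioned below).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified cache key. */
struct demo_drc_key {
	uint32_t xid;   /* RPC transaction ID */
	uint32_t addr;  /* client IPv4 address */
	uint16_t port;  /* client source port */
	bool     rdma;  /* request arrived over an RDMA transport */
};

/* Normalise a key: the source port carries no meaning on RDMA, so drop it. */
static struct demo_drc_key demo_drc_normalise(struct demo_drc_key k)
{
	if (k.rdma)
		k.port = 0;
	return k;
}

/* Two requests hit the same cache entry when their normalised keys agree. */
static bool demo_drc_match(struct demo_drc_key a, struct demo_drc_key b)
{
	a = demo_drc_normalise(a);
	b = demo_drc_normalise(b);
	return a.xid == b.xid && a.addr == b.addr && a.port == b.port;
}
```

In this model a retransmission sent after a reconnect, which arrives from a new ephemeral port, still matches the original request when the transport is RDMA.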
      
      The Linux NFS server's DRC resolves XID collisions from the same
      source IP address by using the checksum of the first 200 bytes of
      the RPC call header.

      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • nfsd: Fix overflow causing non-working mounts on 1 TB machines · 8129a10c
      Paul Menzel authored
      commit 3b2d4dcf upstream.
      
      Since commit 10a68cdf (nfsd: fix performance-limiting session
      calculation) (Linux 5.1-rc1 and 4.19.31), shares from NFS servers with
      1 TB of memory cannot be mounted anymore. The mount just hangs on the
      client.
      
      The gist of commit 10a68cdf is the change below.
      
          -avail = clamp_t(int, avail, slotsize, avail/3);
          +avail = clamp_t(int, avail, slotsize, total_avail/3);
      
      Here are the macros.
      
          #define min_t(type, x, y)       __careful_cmp((type)(x), (type)(y), <)
          #define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)
      
      `total_avail` is 8,434,659,328 on the 1 TB machine. `clamp_t()` casts
      the values to `int`, which for 32-bit integers can only hold values
      −2,147,483,648 (−2^31) through 2,147,483,647 (2^31 − 1).
      
      `avail` (in the function signature) is just 65536, so no overflow
      happened. Before the commit the assignment would result in 21845,
      and `num = 4`.

      With `total_avail`, the assignment yields
      18446744072226137429 (printed as %lu), and `num` is then 4164608182.

      My guess is that `nfsd_drc_mem_used` is then exceeded, and the
      server thinks there is no memory available any more for this client.
      
      Updating the arguments of `clamp_t()` and `min_t()` to `unsigned long`
      fixes the issue.
      
      Now, `avail = 65536` (before commit 10a68cdf `avail = 21845`), but
      `num = 4` remains the same.
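
The numbers above can be reproduced in userspace with simplified stand-ins for the macros. This is a sketch, not the kernel code: the `demo_*` names are hypothetical, the macros skip the kernel's type checking, and the result assumes a 64-bit `unsigned long` and a compiler (such as GCC or Clang) that wraps out-of-range conversions to `int`. The `slotsize` value is not given in the message, so any value below 65536 serves for the illustration.

```c
/* Simplified stand-ins for the kernel macros quoted above (no type checking). */
#define demo_min_t(type, x, y) ((type)(x) < (type)(y) ? (type)(x) : (type)(y))
#define demo_max_t(type, x, y) ((type)(x) > (type)(y) ? (type)(x) : (type)(y))
#define demo_clamp_t(type, val, lo, hi) \
	demo_min_t(type, demo_max_t(type, val, lo), hi)

/* Clamp with an int cast: total_avail / 3 no longer fits and turns negative. */
static unsigned long demo_clamp_int(unsigned long avail, unsigned long slotsize,
				    unsigned long total_avail)
{
	return demo_clamp_t(int, avail, slotsize, total_avail / 3);
}

/* Clamp with an unsigned long cast: the upper bound is preserved. */
static unsigned long demo_clamp_ulong(unsigned long avail, unsigned long slotsize,
				      unsigned long total_avail)
{
	return demo_clamp_t(unsigned long, avail, slotsize, total_avail / 3);
}
```

Called with `avail = 65536` and `total_avail = 8434659328`, the `int` variant picks the negative upper bound as the minimum, which becomes a huge value once stored in an `unsigned long`; the `unsigned long` variant leaves `avail` at 65536.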
      
      Fixes: c54f24e3 (nfsd: fix performance-limiting session calculation)
      Cc: stable@vger.kernel.org
      Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC · f25c0695
      Wanpeng Li authored
      commit bb34e690 upstream.
      
      Thomas reported that:
      
       | Background:
       |
       |    In preparation of supporting IPI shorthands I changed the CPU offline
       |    code to software disable the local APIC instead of just masking it.
       |    That's done by clearing the APIC_SPIV_APIC_ENABLED bit in the APIC_SPIV
       |    register.
       |
       | Failure:
       |
       |    When the CPU comes back online the startup code triggers occasionally
       |    the warning in apic_pending_intr_clear(). That complains that the IRRs
       |    are not empty.
       |
       |    The offending vector is the local APIC timer vector whose IRR bit is set
       |    and stays set.
       |
       | It took me quite some time to reproduce the issue locally, but now I can
       | see what happens.
       |
       | It requires apicv_enabled=0, i.e. full apic emulation. With apicv_enabled=1
       | (and hardware support) it behaves correctly.
       |
       | Here is the series of events:
       |
       |     Guest CPU
       |
       |     goes down
       |
       |       native_cpu_disable()
       |
       |         apic_soft_disable();
       |
       |     play_dead()
       |
       |     ....
       |
       |     startup()
       |
       |       if (apic_enabled())
       |         apic_pending_intr_clear()	<- Not taken
       |
       |      enable APIC
       |
       |         apic_pending_intr_clear()	<- Triggers warning because IRR is stale
       |
       | When this happens, the deadline timer or the regular APIC timer (it
       | happens with both) has fired shortly before the APIC is disabled, but the
       | interrupt was not serviced because the guest CPU was in an interrupt
       | disabled region at that point.
       |
       | The state of the timer vector ISR/IRR bits:
       |
       |                               ISR     IRR
       | before apic_soft_disable()    0       1
       | after apic_soft_disable()     0       1
       |
       | On startup                    0       1
       |
       | Now one would assume that the IRR is cleared after the INIT reset, but this
       | happens only on CPU0.
       |
       | Why?
       |
       | Because our CPU0 hotplug is just for testing to make sure nothing breaks
       | and goes through an NMI wakeup vehicle because INIT would send it through
       | the bootstrap code which is not really working if that CPU was not
       | physically unplugged.
       |
       | Now looking at a real world APIC the situation in that case is:
       |
       |                               ISR     IRR
       | before apic_soft_disable()    0       1
       | after apic_soft_disable()     0       1
       |
       | On startup                    0       0
       |
       | Why?
       |
       | Once the dying CPU reenables interrupts the pending interrupt gets
       | delivered as a spurious interrupt and then the state is clear.
       |
       | While that CPU0 hotplug test case is surely an esoteric issue, the APIC
       | emulation is still wrong. Even if the play_dead() code would not enable
       | interrupts then the pending IRR bit would turn into an ISR .. interrupt
       | when the APIC is reenabled on startup.
      
      From SDM 10.4.7.2 Local APIC State After It Has Been Software Disabled
      * Pending interrupts in the IRR and ISR registers are held and require
        masking or handling by the CPU.
      
      In Thomas's testing, a hardware CPU does not respect a software-disabled
      LAPIC when the IRR has already been set or an APICv posted interrupt is
      in flight. So skip the software-disable check when clearing IRR and
      setting ISR, and continue to respect it when attempting to set IRR.

      Reported-by: Rong Chen <rong.a.chen@intel.com>
      Reported-by: Feng Tang <feng.tang@intel.com>
      Reported-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Rong Chen <rong.a.chen@intel.com>
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
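
The behaviour the LAPIC commit above aims for can be pictured with a toy model: a software-disabled APIC refuses new interrupts, but one that is already pending in the IRR can still move to the ISR. This is only a sketch, not KVM's emulation code; `struct toy_apic` and both functions are hypothetical, and the model tracks just 32 vectors.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy local APIC: one 32-bit word each of IRR/ISR plus the software-enable flag. */
struct toy_apic {
	bool     sw_enabled;
	uint32_t irr;
	uint32_t isr;
};

/* Accepting a new interrupt still respects the software-disabled state. */
static bool toy_apic_set_irr(struct toy_apic *apic, unsigned int vec)
{
	if (!apic->sw_enabled || vec >= 32)
		return false;
	apic->irr |= UINT32_C(1) << vec;
	return true;
}

/* Delivering an already pending interrupt (IRR -> ISR) ignores that state. */
static int toy_apic_ack(struct toy_apic *apic)
{
	for (int vec = 31; vec >= 0; vec--) {
		uint32_t bit = UINT32_C(1) << vec;

		if (apic->irr & bit) {
			apic->irr &= ~bit;
			apic->isr |= bit;
			return vec;
		}
	}
	return -1; /* nothing pending */
}
```

In this model a timer interrupt that fired just before the soft disable does not stay stuck in the IRR, which is the stale state the reported warning complained about.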
    • KVM: x86: degrade WARN to pr_warn_ratelimited · f6472f50
      Paolo Bonzini authored
      commit 3f16a5c3 upstream.
      
      This warning can be triggered easily by userspace, so it should certainly not
      cause a panic if panic_on_warn is set.
      
      Reported-by: syzbot+c03f30b4f4c46bdf8575@syzkaller.appspotmail.com
      Suggested-by: Alexander Potapenko <glider@google.com>
      Acked-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: ipv6: nf_defrag: accept duplicate fragments again · ac0024ba
      Guillaume Nault authored
      [ Upstream commit 8a3dca63 ]
      
      When fixing the skb leak introduced by the conversion to rbtree, I
      forgot about the special case of duplicate fragments. The condition
      under the 'insert_error' label isn't effective anymore as
      nf_ct_frg6_gather() doesn't override the returned value anymore. So
      duplicate fragments now get NF_DROP verdict.
      
      To accept duplicate fragments again, handle them specially as soon as
      inet_frag_queue_insert() reports them. Return -EINPROGRESS which will
      translate to NF_STOLEN verdict, like any accepted fragment. However,
      such packets don't carry any new information and aren't queued, so we
      just drop them immediately.
      
      Fixes: a0d56cb9 ("netfilter: ipv6: nf_defrag: fix leakage of unqueued fragments")
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K · 54e8cf41
      Daniel Borkmann authored
      [ Upstream commit fdadd049 ]
      
      Michael and Sandipan report:
      
        Commit ede95a63 introduced a bpf_jit_limit tuneable to limit BPF
        JIT allocations. At compile time it defaults to PAGE_SIZE * 40000,
        and is adjusted again at init time if MODULES_VADDR is defined.
      
        For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with
        the compile-time default at boot-time, which is 0x9c400000 when
        using 64K page size. This overflows the signed 32-bit bpf_jit_limit
        value:
      
        root@ubuntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit
        -1673527296
      
        and can cause various unexpected failures throughout the network
        stack. In one case `strace dhclient eth0` reported:
      
        setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8},
                   16) = -1 ENOTSUPP (Unknown error 524)
      
        and similar failures can be seen with tools like tcpdump. This doesn't
        always reproduce however, and I'm not sure why. The more consistent
        failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9
        host would time out on systemd/netplan configuring a virtio-net NIC
        with no noticeable errors in the logs.
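
The overflow in the report can be checked with a few lines of C. This is a sketch of the arithmetic only: the `demo_*` helpers are hypothetical, and the signed result assumes a compiler (such as GCC or Clang) that wraps out-of-range conversions to `int32_t`.

```c
#include <stdint.h>

/* Default limit as computed in 64 bits: PAGE_SIZE * 40000. */
static uint64_t demo_jit_limit(uint64_t page_size)
{
	return page_size * 40000u;
}

/* The same value squeezed into a signed 32-bit variable. */
static int32_t demo_jit_limit_s32(uint64_t page_size)
{
	return (int32_t)(uint32_t)demo_jit_limit(page_size);
}
```

For a 64K page size the 64-bit product is 0x9c400000, which exceeds the signed 32-bit range and reads back as -1673527296, the value shown in the sysctl output above. With 4K pages the product is 163840000 and fits.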
      
      Given this and also given that in the near future some architectures like
      arm64 will have a custom area for BPF JIT image allocations, we should
      get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For
      4.21, we have an overridable bpf_jit_alloc_exec(), bpf_jit_free_exec(),
      so add another overridable bpf_jit_alloc_exec_limit() helper
      function which returns the possible size of the memory area for deriving
      the default heuristic in bpf_jit_charge_init().
      
      Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new
      bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default
      JIT memory provider, and therefore in case archs implement their custom
      module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for
      vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}.
      
      Additionally, for archs supporting large page sizes, we should change
      the sysctl to be handled as long to not run into sysctl restrictions
      in future.
      
      Fixes: ede95a63 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
      Reported-by: Sandipan Das <sandipan@linux.ibm.com>
      Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>