1. 10 Sep, 2017 40 commits
    • Weston Andros Adamson's avatar
      nfs/flexfiles: fix leak of nfs4_ff_ds_version arrays · cc4ed9e8
      Weston Andros Adamson authored
      [ Upstream commit 1feb2616 ]
      
      The client was freeing the nfs4_ff_layout_ds, but not the contained
      nfs4_ff_ds_version array.
      Signed-off-by: default avatarWeston Andros Adamson <dros@primarydata.com>
      Cc: stable@vger.kernel.org # v4.0+
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      cc4ed9e8
    • Mateusz Jurczyk's avatar
      fuse: initialize the flock flag in fuse_file on allocation · 8c088776
      Mateusz Jurczyk authored
      [ Upstream commit 68227c03 ]
      
      Before the patch, the flock flag could remain uninitialized for the
      lifespan of the fuse_file allocation. Unless set to true in
      fuse_file_flock(), it would remain in an indeterminate state until read in
      an if statement in fuse_release_common(). This could consequently lead to
      taking an unexpected branch in the code.
      
      The bug was discovered by a runtime instrumentation designed to detect use
      of uninitialized memory in the kernel.
      Signed-off-by: default avatarMateusz Jurczyk <mjurczyk@google.com>
      Fixes: 37fb3a30 ("fuse: fix flock")
      Cc: <stable@vger.kernel.org> # v3.1+
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      8c088776
    • Nicholas Bellinger's avatar
      iscsi-target: Fix iscsi_np reset hung task during parallel delete · 8c438028
      Nicholas Bellinger authored
      [ Upstream commit 978d13d6 ]
      
      This patch fixes a bug associated with iscsit_reset_np_thread()
      that can occur during parallel configfs rmdir of a single iscsi_np
      used across multiple iscsi-target instances, that would result in
      hung task(s) similar to below where configfs rmdir process context
      was blocked indefinately waiting for iscsi_np->np_restart_comp
      to finish:
      
      [ 6726.112076] INFO: task dcp_proxy_node_:15550 blocked for more than 120 seconds.
      [ 6726.119440]       Tainted: G        W  O     4.1.26-3321 #2
      [ 6726.125045] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 6726.132927] dcp_proxy_node_ D ffff8803f202bc88     0 15550      1 0x00000000
      [ 6726.140058]  ffff8803f202bc88 ffff88085c64d960 ffff88083b3b1ad0 ffff88087fffeb08
      [ 6726.147593]  ffff8803f202c000 7fffffffffffffff ffff88083f459c28 ffff88083b3b1ad0
      [ 6726.155132]  ffff88035373c100 ffff8803f202bca8 ffffffff8168ced2 ffff8803f202bcb8
      [ 6726.162667] Call Trace:
      [ 6726.165150]  [<ffffffff8168ced2>] schedule+0x32/0x80
      [ 6726.170156]  [<ffffffff8168f5b4>] schedule_timeout+0x214/0x290
      [ 6726.176030]  [<ffffffff810caef2>] ? __send_signal+0x52/0x4a0
      [ 6726.181728]  [<ffffffff8168d7d6>] wait_for_completion+0x96/0x100
      [ 6726.187774]  [<ffffffff810e7c80>] ? wake_up_state+0x10/0x10
      [ 6726.193395]  [<ffffffffa035d6e2>] iscsit_reset_np_thread+0x62/0xe0 [iscsi_target_mod]
      [ 6726.201278]  [<ffffffffa0355d86>] iscsit_tpg_disable_portal_group+0x96/0x190 [iscsi_target_mod]
      [ 6726.210033]  [<ffffffffa0363f7f>] lio_target_tpg_store_enable+0x4f/0xc0 [iscsi_target_mod]
      [ 6726.218351]  [<ffffffff81260c5a>] configfs_write_file+0xaa/0x110
      [ 6726.224392]  [<ffffffff811ea364>] vfs_write+0xa4/0x1b0
      [ 6726.229576]  [<ffffffff811eb111>] SyS_write+0x41/0xb0
      [ 6726.234659]  [<ffffffff8169042e>] system_call_fastpath+0x12/0x71
      
      It would happen because each iscsit_reset_np_thread() sets state
      to ISCSI_NP_THREAD_RESET, sends SIGINT, and then blocks waiting
      for completion on iscsi_np->np_restart_comp.
      
      However, if iscsi_np was active processing a login request and
      more than a single iscsit_reset_np_thread() caller to the same
      iscsi_np was blocked on iscsi_np->np_restart_comp, iscsi_np
      kthread process context in __iscsi_target_login_thread() would
      flush pending signals and only perform a single completion of
      np->np_restart_comp before going back to sleep within transport
      specific iscsit_transport->iscsi_accept_np code.
      
      To address this bug, add a iscsi_np->np_reset_count and update
      __iscsi_target_login_thread() to keep completing np->np_restart_comp
      until ->np_reset_count has reached zero.
      Reported-by: default avatarGary Guo <ghg@datera.io>
      Tested-by: default avatarGary Guo <ghg@datera.io>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: stable@vger.kernel.org # 3.10+
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      8c438028
    • Varun Prakash's avatar
      iscsi-target: fix memory leak in iscsit_setup_text_cmd() · 3c21f983
      Varun Prakash authored
      [ Upstream commit ea8dc5b4 ]
      
      On receiving text request iscsi-target allocates buffer for
      payload in iscsit_handle_text_cmd() and assigns buffer pointer
      to cmd->text_in_ptr, this buffer is currently freed in
      iscsit_release_cmd(), if iscsi-target sets 'C' bit in text
      response then it will receive another text request from the
      initiator with ttt != 0xffffffff in this case iscsi-target
      will find cmd using itt and call iscsit_setup_text_cmd()
      which will set cmd->text_in_ptr to NULL without freeing
      previously allocated buffer.
      
      This patch fixes this issue by calling kfree(cmd->text_in_ptr)
      in iscsit_setup_text_cmd() before assigning NULL to it.
      
      For the first text request cmd->text_in_ptr is NULL as
      cmd is memset to 0 in iscsit_allocate_cmd().
      Signed-off-by: default avatarVarun Prakash <varun@chelsio.com>
      Cc: <stable@vger.kernel.org> # 4.0+
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      3c21f983
    • Jonathan Toppins's avatar
      mm: ratelimit PFNs busy info message · ad30f9ad
      Jonathan Toppins authored
      [ Upstream commit 75dddef3 ]
      
      The RDMA subsystem can generate several thousand of these messages per
      second eventually leading to a kernel crash.  Ratelimit these messages
      to prevent this crash.
      
      Doug said:
       "I've been carrying a version of this for several kernel versions. I
        don't remember when they started, but we have one (and only one) class
        of machines: Dell PE R730xd, that generate these errors. When it
        happens, without a rate limit, we get rcu timeouts and kernel oopses.
        With the rate limit, we just get a lot of annoying kernel messages but
        the machine continues on, recovers, and eventually the memory
        operations all succeed"
      
      And:
       "> Well... why are all these EBUSY's occurring? It sounds inefficient
        > (at least) but if it is expected, normal and unavoidable then
        > perhaps we should just remove that message altogether?
      
        I don't have an answer to that question. To be honest, I haven't
        looked real hard. We never had this at all, then it started out of the
        blue, but only on our Dell 730xd machines (and it hits all of them),
        but no other classes or brands of machines. And we have our 730xd
        machines loaded up with different brands and models of cards (for
        instance one dedicated to mlx4 hardware, one for qib, one for mlx5, an
        ocrdma/cxgb4 combo, etc), so the fact that it hit all of the machines
        meant it wasn't tied to any particular brand/model of RDMA hardware.
        To me, it always smelled of a hardware oddity specific to maybe the
        CPUs or mainboard chipsets in these machines, so given that I'm not an
        mm expert anyway, I never chased it down.
      
        A few other relevant details: it showed up somewhere around 4.8/4.9 or
        thereabouts. It never happened before, but the prinkt has been there
        since the 3.18 days, so possibly the test to trigger this message was
        changed, or something else in the allocator changed such that the
        situation started happening on these machines?
      
        And, like I said, it is specific to our 730xd machines (but they are
        all identical, so that could mean it's something like their specific
        ram configuration is causing the allocator to hit this on these
        machine but not on other machines in the cluster, I don't want to say
        it's necessarily the model of chipset or CPU, there are other bits of
        identicalness between these machines)"
      
      Link: http://lkml.kernel.org/r/499c0f6cc10d6eb829a67f2a4d75b4228a9b356e.1501695897.git.jtoppins@redhat.comSigned-off-by: default avatarJonathan Toppins <jtoppins@redhat.com>
      Reviewed-by: default avatarDoug Ledford <dledford@redhat.com>
      Tested-by: default avatarDoug Ledford <dledford@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      ad30f9ad
    • Matthew Dawson's avatar
      mm/mempool: avoid KASAN marking mempool poison checks as use-after-free · db31d51c
      Matthew Dawson authored
      [ Upstream commit 76401310 ]
      
      When removing an element from the mempool, mark it as unpoisoned in KASAN
      before verifying its contents for SLUB/SLAB debugging.  Otherwise KASAN
      will flag the reads checking the element use-after-free writes as
      use-after-free reads.
      Signed-off-by: default avatarMatthew Dawson <matthew@mjdsystems.ca>
      Acked-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      db31d51c
    • Suzuki K Poulose's avatar
      KVM: arm/arm64: Handle hva aging while destroying the vm · 2413d137
      Suzuki K Poulose authored
      [ Upstream commit 7e5a6722 ]
      
      The mmu_notifier_release() callback of KVM triggers cleaning up
      the stage2 page table on kvm-arm. However there could be other
      notifier callbacks in parallel with the mmu_notifier_release(),
      which could cause the call backs ending up in an empty stage2
      page table. Make sure we check it for all the notifier callbacks.
      
      Cc: stable@vger.kernel.org
      Fixes: commit 293f2936 ("kvm-arm: Unmap shadow pagetables properly")
      Reported-by: default avatarAlex Graf <agraf@suse.de>
      Reviewed-by: default avatarChristoffer Dall <cdall@linaro.org>
      Signed-off-by: default avatarSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      2413d137
    • Rob Gardner's avatar
      sparc64: Prevent perf from running during super critical sections · 55fd9ed0
      Rob Gardner authored
      [ Upstream commit fc290a11 ]
      
      This fixes another cause of random segfaults and bus errors that may
      occur while running perf with the callgraph option.
      
      Critical sections beginning with spin_lock_irqsave() raise the interrupt
      level to PIL_NORMAL_MAX (14) and intentionally do not block performance
      counter interrupts, which arrive at PIL_NMI (15).
      
      But some sections of code are "super critical" with respect to perf
      because the perf_callchain_user() path accesses user space and may cause
      TLB activity as well as faults as it unwinds the user stack.
      
      One particular critical section occurs in switch_mm:
      
              spin_lock_irqsave(&mm->context.lock, flags);
              ...
              load_secondary_context(mm);
              tsb_context_switch(mm);
              ...
              spin_unlock_irqrestore(&mm->context.lock, flags);
      
      If a perf interrupt arrives in between load_secondary_context() and
      tsb_context_switch(), then perf_callchain_user() could execute with
      the context ID of one process, but with an active TSB for a different
      process. When the user stack is accessed, it is very likely to
      incur a TLB miss, since the h/w context ID has been changed. The TLB
      will then be reloaded with a translation from the TSB for one process,
      but using a context ID for another process. This exposes memory from
      one process to another, and since it is a mapping for stack memory,
      this usually causes the new process to crash quickly.
      
      This super critical section needs more protection than is provided
      by spin_lock_irqsave() since perf interrupts must not be allowed in.
      
      Since __tsb_context_switch already goes through the trouble of
      disabling interrupts completely, we fix this by moving the secondary
      context load down into this better protected region.
      
      Orabug: 25577560
      Signed-off-by: default avatarDave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: default avatarRob Gardner <rob.gardner@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      55fd9ed0
    • Willem de Bruijn's avatar
      packet: fix tp_reserve race in packet_set_ring · b7761b0c
      Willem de Bruijn authored
      [ Upstream commit c27927e3 ]
      
      Updates to tp_reserve can race with reads of the field in
      packet_set_ring. Avoid this by holding the socket lock during
      updates in setsockopt PACKET_RESERVE.
      
      This bug was discovered by syzkaller.
      
      Fixes: 8913336a ("packet: add PACKET_RESERVE sockopt")
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      b7761b0c
    • Willem de Bruijn's avatar
      net: avoid skb_warn_bad_offload false positives on UFO · 01deb691
      Willem de Bruijn authored
      [ Upstream commit 8d63bee6 ]
      
      skb_warn_bad_offload triggers a warning when an skb enters the GSO
      stack at __skb_gso_segment that does not have CHECKSUM_PARTIAL
      checksum offload set.
      
      Commit b2504a5d ("net: reduce skb_warn_bad_offload() noise")
      observed that SKB_GSO_DODGY producers can trigger the check and
      that passing those packets through the GSO handlers will fix it
      up. But, the software UFO handler will set ip_summed to
      CHECKSUM_NONE.
      
      When __skb_gso_segment is called from the receive path, this
      triggers the warning again.
      
      Make UFO set CHECKSUM_UNNECESSARY instead of CHECKSUM_NONE. On
      Tx these two are equivalent. On Rx, this better matches the
      skb state (checksum computed), as CHECKSUM_NONE here means no
      checksum computed.
      
      See also this thread for context:
      http://patchwork.ozlabs.org/patch/799015/
      
      Fixes: b2504a5d ("net: reduce skb_warn_bad_offload() noise")
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      01deb691
    • Daniel Borkmann's avatar
      bpf, s390: fix jit branch offset related to ldimm64 · 731160f5
      Daniel Borkmann authored
      [ Upstream commit b0a0c256 ]
      
      While testing some other work that required JIT modifications, I
      run into test_bpf causing a hang when JIT enabled on s390. The
      problematic test case was the one from ddc665a4 (bpf, arm64:
      fix jit branch offset related to ldimm64), and turns out that we
      do have a similar issue on s390 as well. In bpf_jit_prog() we
      update next instruction address after returning from bpf_jit_insn()
      with an insn_count. bpf_jit_insn() returns either -1 in case of
      error (e.g. unsupported insn), 1 or 2. The latter is only the
      case for ldimm64 due to spanning 2 insns, however, next address
      is only set to i + 1 not taking actual insn_count into account,
      thus fix is to use insn_count instead of 1. bpf_jit_enable in
      mode 2 provides also disasm on s390:
      
      Before fix:
      
        000003ff800349b6: a7f40003   brc     15,3ff800349bc                 ; target
        000003ff800349ba: 0000               unknown
        000003ff800349bc: e3b0f0700024       stg     %r11,112(%r15)
        000003ff800349c2: e3e0f0880024       stg     %r14,136(%r15)
        000003ff800349c8: 0db0               basr    %r11,%r0
        000003ff800349ca: c0ef00000000       llilf   %r14,0
        000003ff800349d0: e320b0360004       lg      %r2,54(%r11)
        000003ff800349d6: e330b03e0004       lg      %r3,62(%r11)
        000003ff800349dc: ec23ffeda065       clgrj   %r2,%r3,10,3ff800349b6 ; jmp
        000003ff800349e2: e3e0b0460004       lg      %r14,70(%r11)
        000003ff800349e8: e3e0b04e0004       lg      %r14,78(%r11)
        000003ff800349ee: b904002e   lgr     %r2,%r14
        000003ff800349f2: e3b0f0700004       lg      %r11,112(%r15)
        000003ff800349f8: e3e0f0880004       lg      %r14,136(%r15)
        000003ff800349fe: 07fe               bcr     15,%r14
      
      After fix:
      
        000003ff80ef3db4: a7f40003   brc     15,3ff80ef3dba
        000003ff80ef3db8: 0000               unknown
        000003ff80ef3dba: e3b0f0700024       stg     %r11,112(%r15)
        000003ff80ef3dc0: e3e0f0880024       stg     %r14,136(%r15)
        000003ff80ef3dc6: 0db0               basr    %r11,%r0
        000003ff80ef3dc8: c0ef00000000       llilf   %r14,0
        000003ff80ef3dce: e320b0360004       lg      %r2,54(%r11)
        000003ff80ef3dd4: e330b03e0004       lg      %r3,62(%r11)
        000003ff80ef3dda: ec230006a065       clgrj   %r2,%r3,10,3ff80ef3de6 ; jmp
        000003ff80ef3de0: e3e0b0460004       lg      %r14,70(%r11)
        000003ff80ef3de6: e3e0b04e0004       lg      %r14,78(%r11)          ; target
        000003ff80ef3dec: b904002e   lgr     %r2,%r14
        000003ff80ef3df0: e3b0f0700004       lg      %r11,112(%r15)
        000003ff80ef3df6: e3e0f0880004       lg      %r14,136(%r15)
        000003ff80ef3dfc: 07fe               bcr     15,%r14
      
      test_bpf.ko suite runs fine after the fix.
      
      Fixes: 05462310 ("s390/bpf: Add s390x eBPF JIT compiler backend")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Tested-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      731160f5
    • Yuchung Cheng's avatar
      tcp: avoid setting cwnd to invalid ssthresh after cwnd reduction states · f47841f8
      Yuchung Cheng authored
      [ Upstream commit ed254971 ]
      
      If the sender switches the congestion control during ECN-triggered
      cwnd-reduction state (CA_CWR), upon exiting recovery cwnd is set to
      the ssthresh value calculated by the previous congestion control. If
      the previous congestion control is BBR that always keep ssthresh
      to TCP_INIFINITE_SSTHRESH, cwnd ends up being infinite. The safe
      step is to avoid assigning invalid ssthresh value when recovery ends.
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      f47841f8
    • zheng li's avatar
      ipv4: Should use consistent conditional judgement for ip fragment in... · 24fd7957
      zheng li authored
      ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output
      
      [ Upstream commit 0a28cfd5 ]
      
      There is an inconsistent conditional judgement in __ip_append_data and
      ip_finish_output functions, the variable length in __ip_append_data just
      include the length of application's payload and udp header, don't include
      the length of ip header, but in ip_finish_output use
      (skb->len > ip_skb_dst_mtu(skb)) as judgement, and skb->len include the
      length of ip header.
      
      That causes some particular application's udp payload whose length is
      between (MTU - IP Header) and MTU were fragmented by ip_fragment even
      though the rst->dev support UFO feature.
      
      Add the length of ip header to length in __ip_append_data to keep
      consistent conditional judgement as ip_finish_output for ip fragment.
      Signed-off-by: default avatarZheng Li <james.z.li@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      24fd7957
    • Ard Biesheuvel's avatar
      mm: don't dereference struct page fields of invalid pages · a9f54518
      Ard Biesheuvel authored
      [ Upstream commit f073bdc5 ]
      
      The VM_BUG_ON() check in move_freepages() checks whether the node id of
      a page matches the node id of its zone.  However, it does this before
      having checked whether the struct page pointer refers to a valid struct
      page to begin with.  This is guaranteed in most cases, but may not be
      the case if CONFIG_HOLES_IN_ZONE=y.
      
      So reorder the VM_BUG_ON() with the pfn_valid_within() check.
      
      Link: http://lkml.kernel.org/r/1481706707-6211-2-git-send-email-ard.biesheuvel@linaro.orgSigned-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Yisheng Xie <xieyisheng1@huawei.com>
      Cc: Robert Richter <rrichter@cavium.com>
      Cc: James Morse <james.morse@arm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      a9f54518
    • Jamie Iles's avatar
      signal: protect SIGNAL_UNKILLABLE from unintentional clearing. · 144f240c
      Jamie Iles authored
      [ Upstream commit 2d39b3cd ]
      
      Since commit 00cd5c37 ("ptrace: permit ptracing of /sbin/init") we
      can now trace init processes.  init is initially protected with
      SIGNAL_UNKILLABLE which will prevent fatal signals such as SIGSTOP, but
      there are a number of paths during tracing where SIGNAL_UNKILLABLE can
      be implicitly cleared.
      
      This can result in init becoming stoppable/killable after tracing.  For
      example, running:
      
        while true; do kill -STOP 1; done &
        strace -p 1
      
      and then stopping strace and the kill loop will result in init being
      left in state TASK_STOPPED.  Sending SIGCONT to init will resume it, but
      init will now respond to future SIGSTOP signals rather than ignoring
      them.
      
      Make sure that when setting SIGNAL_STOP_CONTINUED/SIGNAL_STOP_STOPPED
      that we don't clear SIGNAL_UNKILLABLE.
      
      Link: http://lkml.kernel.org/r/20170104122017.25047-1-jamie.iles@oracle.comSigned-off-by: default avatarJamie Iles <jamie.iles@oracle.com>
      Acked-by: default avatarOleg Nesterov <oleg@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      144f240c
    • Sudip Mukherjee's avatar
      lib/Kconfig.debug: fix frv build failure · 232a0eda
      Sudip Mukherjee authored
      [ Upstream commit da0510c4 ]
      
      The build of frv allmodconfig was failing with the errors like:
      
        /tmp/cc0JSPc3.s: Assembler messages:
        /tmp/cc0JSPc3.s:1839: Error: symbol `.LSLT0' is already defined
        /tmp/cc0JSPc3.s:1842: Error: symbol `.LASLTP0' is already defined
        /tmp/cc0JSPc3.s:1969: Error: symbol `.LELTP0' is already defined
        /tmp/cc0JSPc3.s:1970: Error: symbol `.LELT0' is already defined
      
      Commit 866ced95 ("kbuild: Support split debug info v4") introduced
      splitting the debug info and keeping that in a separate file.  Somehow,
      the frv-linux gcc did not like that and I am guessing that instead of
      splitting it started copying.  The first report about this is at:
      
        https://lists.01.org/pipermail/kbuild-all/2015-July/010527.html.
      
      I will try and see if this can work with frv and if still fails I will
      open a bug report with gcc.  But meanwhile this is the easiest option to
      solve build failure of frv.
      
      Fixes: 866ced95 ("kbuild: Support split debug info v4")
      Link: http://lkml.kernel.org/r/1482062348-5352-1-git-send-email-sudipm.mukherjee@gmail.comSigned-off-by: default avatarSudip Mukherjee <sudip.mukherjee@codethink.co.uk>
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      232a0eda
    • Michal Hocko's avatar
      mm, slab: make sure that KMALLOC_MAX_SIZE will fit into MAX_ORDER · 50053397
      Michal Hocko authored
      [ Upstream commit bb1107f7 ]
      
      Andrey Konovalov has reported the following warning triggered by the
      syzkaller fuzzer.
      
        WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511 __alloc_pages_nodemask+0x159c/0x1e20
        Kernel panic - not syncing: panic_on_warn set ...
        CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #34
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        Call Trace:
          __alloc_pages_slowpath mm/page_alloc.c:3511
          __alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781
          alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072
          alloc_pages include/linux/gfp.h:469
          kmalloc_order+0x1f/0x70 mm/slab_common.c:1015
          kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026
          kmalloc_large include/linux/slab.h:422
          __kmalloc+0x210/0x2d0 mm/slub.c:3723
          kmalloc include/linux/slab.h:495
          ep_write_iter+0x167/0xb50 drivers/usb/gadget/legacy/inode.c:664
          new_sync_write fs/read_write.c:499
          __vfs_write+0x483/0x760 fs/read_write.c:512
          vfs_write+0x170/0x4e0 fs/read_write.c:560
          SYSC_write fs/read_write.c:607
          SyS_write+0xfb/0x230 fs/read_write.c:599
          entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      The issue is caused by a lack of size check for the request size in
      ep_write_iter which should be fixed.  It, however, points to another
      problem, that SLUB defines KMALLOC_MAX_SIZE too large because the its
      KMALLOC_SHIFT_MAX is (MAX_ORDER + PAGE_SHIFT) which means that the
      resulting page allocator request might be MAX_ORDER which is too large
      (see __alloc_pages_slowpath).
      
      The same applies to the SLOB allocator which allows even larger sizes.
      Make sure that they are capped properly and never request more than
      MAX_ORDER order.
      
      Link: http://lkml.kernel.org/r/20161220130659.16461-2-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Acked-by: default avatarChristoph Lameter <cl@linux.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      50053397
    • Rabin Vincent's avatar
      ARM: 8632/1: ftrace: fix syscall name matching · 997fb3de
      Rabin Vincent authored
      [ Upstream commit 270c8cf1 ]
      
      ARM has a few system calls (most notably mmap) for which the names of
      the functions which are referenced in the syscall table do not match the
      names of the syscall tracepoints.  As a consequence of this, these
      tracepoints are not made available.  Implement
      arch_syscall_match_sym_name to fix this and allow tracing even these
      system calls.
      Signed-off-by: default avatarRabin Vincent <rabinv@axis.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      997fb3de
    • Omar Sandoval's avatar
      virtio_blk: fix panic in initialization error path · 1ca4367b
      Omar Sandoval authored
      [ Upstream commit 6bf6b0aa ]
      
      If blk_mq_init_queue() returns an error, it gets assigned to
      vblk->disk->queue. Then, when we call put_disk(), we end up calling
      blk_put_queue() with the ERR_PTR, causing a bad dereference. Fix it by
      only assigning to vblk->disk->queue on success.
      Signed-off-by: default avatarOmar Sandoval <osandov@fb.com>
      Reviewed-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      1ca4367b
    • Milan P. Gandhi's avatar
      scsi: qla2xxx: Get mutex lock before checking optrom_state · e07b3b23
      Milan P. Gandhi authored
      [ Upstream commit c7702b8c ]
      
      There is a race condition with qla2xxx optrom functions where one thread
      might modify optrom buffer, optrom_state while other thread is still
      reading from it.
      
      In couple of crashes, it was found that we had successfully passed the
      following 'if' check where we confirm optrom_state to be
      QLA_SREADING. But by the time we acquired mutex lock to proceed with
      memory_read_from_buffer function, some other thread/process had already
      modified that option rom buffer and optrom_state from QLA_SREADING to
      QLA_SWAITING. Then we got ha->optrom_buffer 0x0 and crashed the system:
      
              if (ha->optrom_state != QLA_SREADING)
                      return 0;
      
              mutex_lock(&ha->optrom_mutex);
              rval = memory_read_from_buffer(buf, count, &off, ha->optrom_buffer,
                  ha->optrom_region_size);
              mutex_unlock(&ha->optrom_mutex);
      
      With current optrom function we get following crash due to a race
      condition:
      
      [ 1479.466679] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [ 1479.466707] IP: [<ffffffff81326756>] memcpy+0x6/0x110
      [...]
      [ 1479.473673] Call Trace:
      [ 1479.474296]  [<ffffffff81225cbc>] ? memory_read_from_buffer+0x3c/0x60
      [ 1479.474941]  [<ffffffffa01574dc>] qla2x00_sysfs_read_optrom+0x9c/0xc0 [qla2xxx]
      [ 1479.475571]  [<ffffffff8127e76b>] read+0xdb/0x1f0
      [ 1479.476206]  [<ffffffff811fdf9e>] vfs_read+0x9e/0x170
      [ 1479.476839]  [<ffffffff811feb6f>] SyS_read+0x7f/0xe0
      [ 1479.477466]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
      
      Below patch modifies qla2x00_sysfs_read_optrom,
      qla2x00_sysfs_write_optrom functions to get the mutex_lock before
      checking ha->optrom_state to avoid similar crashes.
      
      The patch was applied and tested and same crashes were no longer
      observed again.
      Tested-by: default avatarMilan P. Gandhi <mgandhi@redhat.com>
      Signed-off-by: default avatarMilan P. Gandhi <mgandhi@redhat.com>
      Reviewed-by: default avatarLaurence Oberman <loberman@redhat.com>
      Acked-by: default avatarHimanshu Madhani <himanshu.madhani@cavium.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      e07b3b23
    • Nicholas Mc Guire's avatar
      x86/boot: Add missing declaration of string functions · fc7941ca
      Nicholas Mc Guire authored
      [ Upstream commit fac69d0e ]
      
      Add the missing declarations of basic string functions to string.h to allow
      a clean build.
      
      Fixes: 5be86566 ("String-handling functions for the new x86 setup code.")
      Signed-off-by: default avatarNicholas Mc Guire <hofrat@osadl.org>
      Link: http://lkml.kernel.org/r/1483781911-21399-1-git-send-email-hofrat@osadl.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      fc7941ca
    • Michael Chan's avatar
      tg3: Fix race condition in tg3_get_stats64(). · a445f819
      Michael Chan authored
      [ Upstream commit f5992b72 ]
      
      The driver's ndo_get_stats64() method is not always called under RTNL.
      So it can race with driver close or ethtool reconfigurations.  Fix the
      race condition by taking tp->lock spinlock in tg3_free_consistent()
      when freeing the tp->hw_stats memory block.  tg3_get_stats64() is
      already taking tp->lock.
      Reported-by: default avatarWang Yufen <wangyufen@huawei.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      a445f819
    • Sergei Shtylyov's avatar
      sh_eth: R8A7740 supports packet shecksumming · 7c0dbe30
      Sergei Shtylyov authored
      [ Upstream commit 0f1f9cbc ]
      
      The R8A7740 GEther controller supports the packet checksum offloading
      but the 'hw_crc' (bad name, I'll fix it) flag isn't set in the R8A7740
      data,  thus CSMR isn't cleared...
      
      Fixes: 73a0d907 ("net: sh_eth: add support R8A7740")
      Signed-off-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      7c0dbe30
    • Arnd Bergmann's avatar
      wext: handle NULL extra data in iwe_stream_add_point better · 9965cafb
      Arnd Bergmann authored
      [ Upstream commit 93be2b74 ]
      
      gcc-7 complains that wl3501_cs passes NULL into a function that
      then uses the argument as the input for memcpy:
      
      drivers/net/wireless/wl3501_cs.c: In function 'wl3501_get_scan':
      include/net/iw_handler.h:559:3: error: argument 2 null where non-null expected [-Werror=nonnull]
         memcpy(stream + point_len, extra, iwe->u.data.length);
      
      This works fine here because iwe->u.data.length is guaranteed to be 0
      and the memcpy doesn't actually have an effect.
      
      Making the length check explicit avoids the warning and should have
      no other effect here.
      
      Also check the pointer itself, since otherwise we get warnings
      elsewhere in the code.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      9965cafb
    • Jane Chu's avatar
      sparc64: Measure receiver forward progress to avoid send mondo timeout · 97a7ea6d
      Jane Chu authored
      [ Upstream commit 9d53caec ]
      
      A large sun4v SPARC system may have moments of intensive xcall activities,
      usually caused by unmapping many pages on many CPUs concurrently. This can
      flood receivers with CPU mondo interrupts for an extended period, causing
      some unlucky senders to hit send-mondo timeout. This problem gets worse
      as cpu count increases because sometimes mappings must be invalidated on
      all CPUs, and sometimes all CPUs may gang up on a single CPU.
      
      But a busy system is not a broken system. In the above scenario, as long
      as the receiver is making forward progress processing mondo interrupts,
      the sender should continue to retry.
      
      This patch implements the receiver's forward progress meter by introducing
      a per cpu counter 'cpu_mondo_counter[cpu]' where 'cpu' is in the range
      of 0..NR_CPUS. The receiver increments its counter as soon as it receives
      a mondo and the sender tracks the receiver's counter. If the receiver has
      stopped making forward progress when the retry limit is reached, the sender
      declares send-mondo-timeout and panic; otherwise, the receiver is allowed
      to keep making forward progress.
      
      In addition, it's been observed that PCIe hotplug events generate Correctable
      Errors that are handled by hypervisor and then OS. Hypervisor 'borrows'
      a guest cpu strand briefly to provide the service. If the cpu strand is
      simultaneously the only cpu targeted by a mondo, it may not be available
      for the mondo in 20msec, causing SUN4V mondo timeout. It appears that 1 second
      is the agreed wait time between hypervisor and guest OS, this patch makes
      the adjustment.
      
      Orabug: 25476541
      Orabug: 26417466
      Signed-off-by: default avatarJane Chu <jane.chu@oracle.com>
      Reviewed-by: default avatarSteve Sistare <steven.sistare@oracle.com>
      Reviewed-by: default avatarAnthony Yznaga <anthony.yznaga@oracle.com>
      Reviewed-by: default avatarRob Gardner <rob.gardner@oracle.com>
      Reviewed-by: default avatarThomas Tai <thomas.tai@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      97a7ea6d
    • Wei Liu's avatar
      xen-netback: correctly schedule rate-limited queues · ce5ad993
      Wei Liu authored
      [ Upstream commit dfa523ae ]
      
      Add a flag to indicate if a queue is rate-limited. Test the flag in
      NAPI poll handler and avoid rescheduling the queue if true, otherwise
      we risk locking up the host. The rescheduling will be done in the
      timer callback function.
      Reported-by: default avatarJean-Louis Dupond <jean-louis@dupond.be>
      Signed-off-by: default avatarWei Liu <wei.liu2@citrix.com>
      Tested-by: default avatarJean-Louis Dupond <jean-louis@dupond.be>
      Reviewed-by: default avatarPaul Durrant <paul.durrant@citrix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      ce5ad993
    • Florian Fainelli's avatar
      net: phy: Fix PHY unbind crash · 324f2efc
      Florian Fainelli authored
      [ Upstream commit 7b9a88a3 ]
      
      The PHY library does not deal very well with bind and unbind events. The first
      thing we would see is that we were not properly canceling the PHY state machine
      workqueue, so we would be crashing while dereferencing phydev->drv since there
      is no driver attached anymore.
      Suggested-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      324f2efc
    • Florian Fainelli's avatar
      net: phy: Correctly process PHY_HALTED in phy_stop_machine() · 128f29a4
      Florian Fainelli authored
      [ Upstream commit 7ad813f2 ]
      
      Marc reported that he was not getting the PHY library adjust_link()
      callback function to run when calling phy_stop() + phy_disconnect()
      which does not indeed happen because we set the state machine to
      PHY_HALTED but we don't get to run it to process this state past that
      point.
      
      Fix this with a synchronous call to phy_state_machine() in order to have
      the state machine actually act on PHY_HALTED, set the PHY device's link
      down, turn the network device's carrier off and finally call the
      adjust_link() function.
      Reported-by: default avatarMarc Gonzalez <marc_gonzalez@sigmadesigns.com>
      Fixes: a390d1f3 ("phylib: convert state_queue work to delayed_work")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarMarc Gonzalez <marc_gonzalez@sigmadesigns.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      128f29a4
    • Xin Long's avatar
      sctp: fix the check for _sctp_walk_params and _sctp_walk_errors · 34e5d770
      Xin Long authored
      [ Upstream commit 6b84202c ]
      
      Commit b1f5bfc2 ("sctp: don't dereference ptr before leaving
      _sctp_walk_{params, errors}()") tried to fix the issue that it
      may overstep the chunk end for _sctp_walk_{params, errors} with
      'chunk_end > offset(length) + sizeof(length)'.
      
      But it introduced a side effect: When processing INIT, it verifies
      the chunks with 'param.v == chunk_end' after iterating all params
      by sctp_walk_params(). With the check 'chunk_end > offset(length)
      + sizeof(length)', it would return when the last param is not yet
      accessed. Because the last param usually is fwdtsn supported param
      whose size is 4 and 'chunk_end == offset(length) + sizeof(length)'
      
      This is a badly issue even causing sctp couldn't process 4-shakes.
      Client would always get abort when connecting to server, due to
      the failure of INIT chunk verification on server.
      
      The patch is to use 'chunk_end <= offset(length) + sizeof(length)'
      instead of 'chunk_end < offset(length) + sizeof(length)' for both
      _sctp_walk_params and _sctp_walk_errors.
      
      Fixes: b1f5bfc2 ("sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      34e5d770
    • Alexander Potapenko's avatar
      sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}() · 943eb738
      Alexander Potapenko authored
      [ Upstream commit b1f5bfc2 ]
      
      If the length field of the iterator (|pos.p| or |err|) is past the end
      of the chunk, we shouldn't access it.
      
      This bug has been detected by KMSAN. For the following pair of system
      calls:
      
        socket(PF_INET6, SOCK_STREAM, 0x84 /* IPPROTO_??? */) = 3
        sendto(3, "A", 1, MSG_OOB, {sa_family=AF_INET6, sin6_port=htons(0),
               inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0,
               sin6_scope_id=0}, 28) = 1
      
      the tool has reported a use of uninitialized memory:
      
        ==================================================================
        BUG: KMSAN: use of uninitialized memory in sctp_rcv+0x17b8/0x43b0
        CPU: 1 PID: 2940 Comm: probe Not tainted 4.11.0-rc5+ #2926
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
        01/01/2011
        Call Trace:
         <IRQ>
         __dump_stack lib/dump_stack.c:16
         dump_stack+0x172/0x1c0 lib/dump_stack.c:52
         kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:927
         __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:469
         __sctp_rcv_init_lookup net/sctp/input.c:1074
         __sctp_rcv_lookup_harder net/sctp/input.c:1233
         __sctp_rcv_lookup net/sctp/input.c:1255
         sctp_rcv+0x17b8/0x43b0 net/sctp/input.c:170
         sctp6_rcv+0x32/0x70 net/sctp/ipv6.c:984
         ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279
         NF_HOOK ./include/linux/netfilter.h:257
         ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322
         dst_input ./include/net/dst.h:492
         ip6_rcv_finish net/ipv6/ip6_input.c:69
         NF_HOOK ./include/linux/netfilter.h:257
         ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203
         __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208
         __netif_receive_skb net/core/dev.c:4246
         process_backlog+0x667/0xba0 net/core/dev.c:4866
         napi_poll net/core/dev.c:5268
         net_rx_action+0xc95/0x1590 net/core/dev.c:5333
         __do_softirq+0x485/0x942 kernel/softirq.c:284
         do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
         </IRQ>
         do_softirq kernel/softirq.c:328
         __local_bh_enable_ip+0x25b/0x290 kernel/softirq.c:181
         local_bh_enable+0x37/0x40 ./include/linux/bottom_half.h:31
         rcu_read_unlock_bh ./include/linux/rcupdate.h:931
         ip6_finish_output2+0x19b2/0x1cf0 net/ipv6/ip6_output.c:124
         ip6_finish_output+0x764/0x970 net/ipv6/ip6_output.c:149
         NF_HOOK_COND ./include/linux/netfilter.h:246
         ip6_output+0x456/0x520 net/ipv6/ip6_output.c:163
         dst_output ./include/net/dst.h:486
         NF_HOOK ./include/linux/netfilter.h:257
         ip6_xmit+0x1841/0x1c00 net/ipv6/ip6_output.c:261
         sctp_v6_xmit+0x3b7/0x470 net/sctp/ipv6.c:225
         sctp_packet_transmit+0x38cb/0x3a20 net/sctp/output.c:632
         sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885
         sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750
         sctp_side_effects net/sctp/sm_sideeffect.c:1773
         sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147
         sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88
         sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954
         inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
         sock_sendmsg_nosec net/socket.c:633
         sock_sendmsg net/socket.c:643
         SYSC_sendto+0x608/0x710 net/socket.c:1696
         SyS_sendto+0x8a/0xb0 net/socket.c:1664
         do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285
         entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246
        RIP: 0033:0x401133
        RSP: 002b:00007fff6d99cd38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
        RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000401133
        RDX: 0000000000000001 RSI: 0000000000494088 RDI: 0000000000000003
        RBP: 00007fff6d99cd90 R08: 00007fff6d99cd50 R09: 000000000000001c
        R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
        R13: 00000000004063d0 R14: 0000000000406460 R15: 0000000000000000
        origin:
         save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59
         kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302
         kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:198
         kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:211
         slab_alloc_node mm/slub.c:2743
         __kmalloc_node_track_caller+0x200/0x360 mm/slub.c:4351
         __kmalloc_reserve net/core/skbuff.c:138
         __alloc_skb+0x26b/0x840 net/core/skbuff.c:231
         alloc_skb ./include/linux/skbuff.h:933
         sctp_packet_transmit+0x31e/0x3a20 net/sctp/output.c:570
         sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885
         sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750
         sctp_side_effects net/sctp/sm_sideeffect.c:1773
         sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147
         sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88
         sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954
         inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
         sock_sendmsg_nosec net/socket.c:633
         sock_sendmsg net/socket.c:643
         SYSC_sendto+0x608/0x710 net/socket.c:1696
         SyS_sendto+0x8a/0xb0 net/socket.c:1664
         do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285
         return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246
        ==================================================================
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      943eb738
    • Xin Long's avatar
      dccp: fix a memleak for dccp_feat_init err process · e3c935a8
      Xin Long authored
      [ Upstream commit e90ce2fc ]
      
      In dccp_feat_init, when ccid_get_builtin_ccids failsto alloc
      memory for rx.val, it should free tx.val before returning an
      error.
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      e3c935a8
    • Xin Long's avatar
      dccp: fix a memleak that dccp_ipv4 doesn't put reqsk properly · 0b47202e
      Xin Long authored
      [ Upstream commit b7953d3c ]
      
      The patch "dccp: fix a memleak that dccp_ipv6 doesn't put reqsk
      properly" fixed reqsk refcnt leak for dccp_ipv6. The same issue
      exists on dccp_ipv4.
      
      This patch is to fix it for dccp_ipv4.
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      0b47202e
    • WANG Cong's avatar
      packet: fix use-after-free in prb_retire_rx_blk_timer_expired() · d108d699
      WANG Cong authored
      [ Upstream commit c800aaf8 ]
      
      There are multiple reports showing we have a use-after-free in
      the timer prb_retire_rx_blk_timer_expired(), where we use struct
      tpacket_kbdq_core::pkbdq, a pg_vec, after it gets freed by
      free_pg_vec().
      
      The interesting part is it is not freed via packet_release() but
      via packet_setsockopt(), which means we are not closing the socket.
      Looking into the big and fat function packet_set_ring(), this could
      happen if we satisfy the following conditions:
      
      1. closing == 0, not on packet_release() path
      2. req->tp_block_nr == 0, we don't allocate a new pg_vec
      3. rx_ring->pg_vec is already set as V3, which means we already called
         packet_set_ring() wtih req->tp_block_nr > 0 previously
      4. req->tp_frame_nr == 0, pass sanity check
      5. po->mapped == 0, never called mmap()
      
      In this scenario we are clearing the old rx_ring->pg_vec, so we need
      to free this pg_vec, but we don't stop the timer on this path because
      of closing==0.
      
      The timer has to be stopped as long as we need to free pg_vec, therefore
      the check on closing!=0 is wrong, we should check pg_vec!=NULL instead.
      
      Thanks to liujian for testing different fixes.
      
      Reported-by: alexander.levin@verizon.com
      Reported-by: default avatarDave Jones <davej@codemonkey.org.uk>
      Reported-by: default avatarliujian (CE) <liujian56@huawei.com>
      Tested-by: default avatarliujian (CE) <liujian56@huawei.com>
      Cc: Ding Tianhong <dingtianhong@huawei.com>
      Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      d108d699
    • Thomas Jarosch's avatar
      mcs7780: Fix initialization when CONFIG_VMAP_STACK is enabled · 95cda937
      Thomas Jarosch authored
      [ Upstream commit 9476d393 ]
      
      DMA transfers are not allowed to buffers that are on the stack.
      Therefore allocate a buffer to store the result of usb_control_message().
      
      Fixes these bugreports:
      https://bugzilla.kernel.org/show_bug.cgi?id=195217
      
      https://bugzilla.redhat.com/show_bug.cgi?id=1421387
      https://bugzilla.redhat.com/show_bug.cgi?id=1427398
      
      Shortened kernel backtrace from 4.11.9-200.fc25.x86_64:
      kernel: ------------[ cut here ]------------
      kernel: WARNING: CPU: 3 PID: 2957 at drivers/usb/core/hcd.c:1587
      kernel: transfer buffer not dma capable
      kernel: Call Trace:
      kernel: dump_stack+0x63/0x86
      kernel: __warn+0xcb/0xf0
      kernel: warn_slowpath_fmt+0x5a/0x80
      kernel: usb_hcd_map_urb_for_dma+0x37f/0x570
      kernel: ? try_to_del_timer_sync+0x53/0x80
      kernel: usb_hcd_submit_urb+0x34e/0xb90
      kernel: ? schedule_timeout+0x17e/0x300
      kernel: ? del_timer_sync+0x50/0x50
      kernel: ? __slab_free+0xa9/0x300
      kernel: usb_submit_urb+0x2f4/0x560
      kernel: ? urb_destroy+0x24/0x30
      kernel: usb_start_wait_urb+0x6e/0x170
      kernel: usb_control_msg+0xdc/0x120
      kernel: mcs_get_reg+0x36/0x40 [mcs7780]
      kernel: mcs_net_open+0xb5/0x5c0 [mcs7780]
      ...
      
      Regression goes back to 4.9, so it's a good candidate for -stable.
      Though it's the decision of the maintainer.
      
      Thanks to Dan Williams for adding the "transfer buffer not dma capable"
      warning in the first place. It instantly pointed me in the right direction.
      
      Patch has been tested with transferring data from a Polar watch.
      Signed-off-by: default avatarThomas Jarosch <thomas.jarosch@intra2net.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      95cda937
    • WANG Cong's avatar
      rtnetlink: allocate more memory for dev_set_mac_address() · 608c5794
      WANG Cong authored
      [ Upstream commit 153711f9 ]
      
      virtnet_set_mac_address() interprets mac address as struct
      sockaddr, but upper layer only allocates dev->addr_len
      which is ETH_ALEN + sizeof(sa_family_t) in this case.
      
      We lack a unified definition for mac address, so just fix
      the upper layer, this also allows drivers to interpret it
      to struct sockaddr freely.
      Reported-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      608c5794
    • Mahesh Bandewar's avatar
      ipv4: initialize fib_trie prior to register_netdev_notifier call. · a2805888
      Mahesh Bandewar authored
      [ Upstream commit 8799a221 ]
      
      Net stack initialization currently initializes fib-trie after the
      first call to netdevice_notifier() call. In fact fib_trie initialization
      needs to happen before first rtnl_register(). It does not cause any problem
      since there are no devices UP at this moment, but trying to bring 'lo'
      UP at initialization would make this assumption wrong and exposes the issue.
      
      Fixes following crash
      
       Call Trace:
        ? alternate_node_alloc+0x76/0xa0
        fib_table_insert+0x1b7/0x4b0
        fib_magic.isra.17+0xea/0x120
        fib_add_ifaddr+0x7b/0x190
        fib_netdev_event+0xc0/0x130
        register_netdevice_notifier+0x1c1/0x1d0
        ip_fib_init+0x72/0x85
        ip_rt_init+0x187/0x1e9
        ip_init+0xe/0x1a
        inet_init+0x171/0x26c
        ? ipv4_offload_init+0x66/0x66
        do_one_initcall+0x43/0x160
        kernel_init_freeable+0x191/0x219
        ? rest_init+0x80/0x80
        kernel_init+0xe/0x150
        ret_from_fork+0x22/0x30
       Code: f6 46 23 04 74 86 4c 89 f7 e8 ae 45 01 00 49 89 c7 4d 85 ff 0f 85 7b ff ff ff 31 db eb 08 4c 89 ff e8 16 47 01 00 48 8b 44 24 38 <45> 8b 6e 14 4d 63 76 74 48 89 04 24 0f 1f 44 00 00 48 83 c4 08
       RIP: kmem_cache_alloc+0xcf/0x1c0 RSP: ffff9b1500017c28
       CR2: 0000000000000014
      
      Fixes: 7b1a74fd ("[NETNS]: Refactor fib initialization so it can handle multiple namespaces.")
      Fixes: 7f9b8052 ("[IPV4]: fib hash|trie initialization")
      Signed-off-by: default avatarMahesh Bandewar <maheshb@google.com>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      a2805888
    • Sabrina Dubroca's avatar
      ipv6: avoid overflow of offset in ip6_find_1stfragopt · 0fc2cead
      Sabrina Dubroca authored
      [ Upstream commit 6399f1fa ]
      
      In some cases, offset can overflow and can cause an infinite loop in
      ip6_find_1stfragopt(). Make it unsigned int to prevent the overflow, and
      cap it at IPV6_MAXPLEN, since packets larger than that should be invalid.
      
      This problem has been here since before the beginning of git history.
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      0fc2cead
    • David S. Miller's avatar
      net: Zero terminate ifr_name in dev_ifname(). · cd701222
      David S. Miller authored
      [ Upstream commit 63679112 ]
      
      The ifr.ifr_name is passed around and assumed to be NULL terminated.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      cd701222
    • Steven Toth's avatar
      [media] saa7164: fix double fetch PCIe access condition · 0b3294aa
      Steven Toth authored
      [ Upstream commit 6fb05e0d ]
      
      Avoid a double fetch by reusing the values from the prior transfer.
      
      Originally reported via https://bugzilla.kernel.org/show_bug.cgi?id=195559
      
      Thanks to Pengfei Wang <wpengfeinudt@gmail.com> for reporting.
      Signed-off-by: default avatarSteven Toth <stoth@kernellabs.com>
      Reported-by: default avatarPengfei Wang <wpengfeinudt@gmail.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      0b3294aa
    • Jin Qian's avatar
      f2fs: sanity check checkpoint segno and blkoff · 604b43bb
      Jin Qian authored
      [ Upstream commit 15d3042a ]
      
      Make sure segno and blkoff read from raw image are valid.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJin Qian <jinqian@google.com>
      [Jaegeuk Kim: adjust minor coding style]
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      604b43bb