1. 13 Aug, 2017 2 commits
    • Eric Dumazet's avatar
      net: fix keepalive code vs TCP_FASTOPEN_CONNECT · 4e0675f4
      Eric Dumazet authored
      
      [ Upstream commit 2dda6400 ]
      
      syzkaller was able to trigger a divide by 0 in TCP stack [1]
      
      Issue here is that keepalive timer needs to be updated to not attempt
      to send a probe if the connection setup was deferred using
      TCP_FASTOPEN_CONNECT socket option added in linux-4.11
      
      [1]
       divide error: 0000 [#1] SMP
       CPU: 18 PID: 0 Comm: swapper/18 Not tainted
       task: ffff986f62f4b040 ti: ffff986f62fa2000 task.ti: ffff986f62fa2000
       RIP: 0010:[<ffffffff8409cc0d>]  [<ffffffff8409cc0d>] __tcp_select_window+0x8d/0x160
       Call Trace:
        <IRQ>
        [<ffffffff8409d951>] tcp_transmit_skb+0x11/0x20
        [<ffffffff8409da21>] tcp_xmit_probe_skb+0xc1/0xe0
        [<ffffffff840a0ee8>] tcp_write_wakeup+0x68/0x160
        [<ffffffff840a151b>] tcp_keepalive_timer+0x17b/0x230
        [<ffffffff83b3f799>] call_timer_fn+0x39/0xf0
        [<ffffffff83b40797>] run_timer_softirq+0x1d7/0x280
        [<ffffffff83a04ddb>] __do_softirq+0xcb/0x257
        [<ffffffff83ae03ac>] irq_exit+0x9c/0xb0
        [<ffffffff83a04c1a>] smp_apic_timer_interrupt+0x6a/0x80
        [<ffffffff83a03eaf>] apic_timer_interrupt+0x7f/0x90
        <EOI>
        [<ffffffff83fed2ea>] ? cpuidle_enter_state+0x13a/0x3b0
        [<ffffffff83fed2cd>] ? cpuidle_enter_state+0x11d/0x3b0
      
      Tested:
      
      Following packetdrill no longer crashes the kernel
      
      `echo 0 >/proc/sys/net/ipv4/tcp_timestamps`
      
      // Cache warmup: send a Fast Open cookie request
          0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
         +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
         +0 setsockopt(3, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
         +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation is now in progress)
         +0 > S 0:0(0) <mss 1460,nop,nop,sackOK,nop,wscale 8,FO,nop,nop>
       +.01 < S. 123:123(0) ack 1 win 14600 <mss 1460,nop,nop,sackOK,nop,wscale 6,FO abcd1234,nop,nop>
         +0 > . 1:1(0) ack 1
         +0 close(3) = 0
         +0 > F. 1:1(0) ack 1
         +0 < F. 1:1(0) ack 2 win 92
         +0 > .  2:2(0) ack 2
      
         +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
         +0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
         +0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
         +0 setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
       +.01 connect(4, ..., ...) = 0
         +0 setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [5], 4) = 0
         +10 close(4) = 0
      
      `echo 1 >/proc/sys/net/ipv4/tcp_timestamps`
      
      Fixes: 19f6d3f3 ("net/tcp-fastopen: Add new API support")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4e0675f4
    • Yuchung Cheng's avatar
      tcp: avoid setting cwnd to invalid ssthresh after cwnd reduction states · 025bb7f7
      Yuchung Cheng authored
      
      [ Upstream commit ed254971 ]
      
      If the sender switches the congestion control during ECN-triggered
      cwnd-reduction state (CA_CWR), upon exiting recovery cwnd is set to
      the ssthresh value calculated by the previous congestion control. If
      the previous congestion control is BBR that always keep ssthresh
      to TCP_INIFINITE_SSTHRESH, cwnd ends up being infinite. The safe
      step is to avoid assigning invalid ssthresh value when recovery ends.
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      025bb7f7
  2. 11 Aug, 2017 38 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.4.81 · 2ab639c7
      Greg Kroah-Hartman authored
      2ab639c7
    • Tejun Heo's avatar
      workqueue: implicit ordered attribute should be overridable · 34a08ae4
      Tejun Heo authored
      commit 0a94efb5 upstream.
      
      5c0338c6 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
      ordered") automatically enabled ordered attribute for unbound
      workqueues w/ max_active == 1.  Because ordered workqueues reject
      max_active and some attribute changes, this implicit ordered mode
      broke cases where the user creates an unbound workqueue w/ max_active
      == 1 and later explicitly changes the related attributes.
      
      This patch distinguishes explicit and implicit ordered setting and
      overrides from attribute changes if implict.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Fixes: 5c0338c6 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
      Cc: Holger Hoffstätte <holger@applied-asynchrony.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      34a08ae4
    • Michal Kubeček's avatar
      net: account for current skb length when deciding about UFO · 0c787041
      Michal Kubeček authored
      
      [ Upstream commit a5cb659b ]
      
      Our customer encountered stuck NFS writes for blocks starting at specific
      offsets w.r.t. page boundary caused by networking stack sending packets via
      UFO enabled device with wrong checksum. The problem can be reproduced by
      composing a long UDP datagram from multiple parts using MSG_MORE flag:
      
        sendto(sd, buff, 1000, MSG_MORE, ...);
        sendto(sd, buff, 1000, MSG_MORE, ...);
        sendto(sd, buff, 3000, 0, ...);
      
      Assume this packet is to be routed via a device with MTU 1500 and
      NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(),
      this condition is tested (among others) to decide whether to call
      ip_ufo_append_data():
      
        ((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))
      
      At the moment, we already have skb with 1028 bytes of data which is not
      marked for GSO so that the test is false (fragheaderlen is usually 20).
      Thus we append second 1000 bytes to this skb without invoking UFO. Third
      sendto(), however, has sufficient length to trigger the UFO path so that we
      end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
      uses udp_csum() to calculate the checksum but that assumes all fragments
      have correct checksum in skb->csum which is not true for UFO fragments.
      
      When checking against MTU, we need to add skb->len to length of new segment
      if we already have a partially filled skb and fragheaderlen only if there
      isn't one.
      
      In the IPv6 case, skb can only be null if this is the first segment so that
      we have to use headersize (length of the first IPv6 header) rather than
      fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.
      
      Fixes: e89e9cf5 ("[IPv4/IPv6]: UFO Scatter-gather approach")
      Fixes: e4c5e13a ("ipv6: Should use consistent conditional judgement for
      	ip6 fragment between __ip6_append_data and ip6_finish_output")
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Acked-by: default avatarVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c787041
    • zheng li's avatar
      ipv4: Should use consistent conditional judgement for ip fragment in... · 12b8f014
      zheng li authored
      ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output
      
      
      [ Upstream commit 0a28cfd5 ]
      
      There is an inconsistent conditional judgement in __ip_append_data and
      ip_finish_output functions, the variable length in __ip_append_data just
      include the length of application's payload and udp header, don't include
      the length of ip header, but in ip_finish_output use
      (skb->len > ip_skb_dst_mtu(skb)) as judgement, and skb->len include the
      length of ip header.
      
      That causes some particular application's udp payload whose length is
      between (MTU - IP Header) and MTU were fragmented by ip_fragment even
      though the rst->dev support UFO feature.
      
      Add the length of ip header to length in __ip_append_data to keep
      consistent conditional judgement as ip_finish_output for ip fragment.
      Signed-off-by: default avatarZheng Li <james.z.li@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      12b8f014
    • Ard Biesheuvel's avatar
      mm: don't dereference struct page fields of invalid pages · 78c04996
      Ard Biesheuvel authored
      
      [ Upstream commit f073bdc5 ]
      
      The VM_BUG_ON() check in move_freepages() checks whether the node id of
      a page matches the node id of its zone.  However, it does this before
      having checked whether the struct page pointer refers to a valid struct
      page to begin with.  This is guaranteed in most cases, but may not be
      the case if CONFIG_HOLES_IN_ZONE=y.
      
      So reorder the VM_BUG_ON() with the pfn_valid_within() check.
      
      Link: http://lkml.kernel.org/r/1481706707-6211-2-git-send-email-ard.biesheuvel@linaro.orgSigned-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Yisheng Xie <xieyisheng1@huawei.com>
      Cc: Robert Richter <rrichter@cavium.com>
      Cc: James Morse <james.morse@arm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      78c04996
    • Jamie Iles's avatar
      signal: protect SIGNAL_UNKILLABLE from unintentional clearing. · bbe660db
      Jamie Iles authored
      
      [ Upstream commit 2d39b3cd ]
      
      Since commit 00cd5c37 ("ptrace: permit ptracing of /sbin/init") we
      can now trace init processes.  init is initially protected with
      SIGNAL_UNKILLABLE which will prevent fatal signals such as SIGSTOP, but
      there are a number of paths during tracing where SIGNAL_UNKILLABLE can
      be implicitly cleared.
      
      This can result in init becoming stoppable/killable after tracing.  For
      example, running:
      
        while true; do kill -STOP 1; done &
        strace -p 1
      
      and then stopping strace and the kill loop will result in init being
      left in state TASK_STOPPED.  Sending SIGCONT to init will resume it, but
      init will now respond to future SIGSTOP signals rather than ignoring
      them.
      
      Make sure that when setting SIGNAL_STOP_CONTINUED/SIGNAL_STOP_STOPPED
      that we don't clear SIGNAL_UNKILLABLE.
      
      Link: http://lkml.kernel.org/r/20170104122017.25047-1-jamie.iles@oracle.comSigned-off-by: default avatarJamie Iles <jamie.iles@oracle.com>
      Acked-by: default avatarOleg Nesterov <oleg@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bbe660db
    • Sudip Mukherjee's avatar
      lib/Kconfig.debug: fix frv build failure · 623f4fcd
      Sudip Mukherjee authored
      
      [ Upstream commit da0510c4 ]
      
      The build of frv allmodconfig was failing with the errors like:
      
        /tmp/cc0JSPc3.s: Assembler messages:
        /tmp/cc0JSPc3.s:1839: Error: symbol `.LSLT0' is already defined
        /tmp/cc0JSPc3.s:1842: Error: symbol `.LASLTP0' is already defined
        /tmp/cc0JSPc3.s:1969: Error: symbol `.LELTP0' is already defined
        /tmp/cc0JSPc3.s:1970: Error: symbol `.LELT0' is already defined
      
      Commit 866ced95 ("kbuild: Support split debug info v4") introduced
      splitting the debug info and keeping that in a separate file.  Somehow,
      the frv-linux gcc did not like that and I am guessing that instead of
      splitting it started copying.  The first report about this is at:
      
        https://lists.01.org/pipermail/kbuild-all/2015-July/010527.html.
      
      I will try and see if this can work with frv and if still fails I will
      open a bug report with gcc.  But meanwhile this is the easiest option to
      solve build failure of frv.
      
      Fixes: 866ced95 ("kbuild: Support split debug info v4")
      Link: http://lkml.kernel.org/r/1482062348-5352-1-git-send-email-sudipm.mukherjee@gmail.comSigned-off-by: default avatarSudip Mukherjee <sudip.mukherjee@codethink.co.uk>
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      623f4fcd
    • Michal Hocko's avatar
      mm, slab: make sure that KMALLOC_MAX_SIZE will fit into MAX_ORDER · 9c83b97b
      Michal Hocko authored
      
      [ Upstream commit bb1107f7 ]
      
      Andrey Konovalov has reported the following warning triggered by the
      syzkaller fuzzer.
      
        WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511 __alloc_pages_nodemask+0x159c/0x1e20
        Kernel panic - not syncing: panic_on_warn set ...
        CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #34
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        Call Trace:
          __alloc_pages_slowpath mm/page_alloc.c:3511
          __alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781
          alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072
          alloc_pages include/linux/gfp.h:469
          kmalloc_order+0x1f/0x70 mm/slab_common.c:1015
          kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026
          kmalloc_large include/linux/slab.h:422
          __kmalloc+0x210/0x2d0 mm/slub.c:3723
          kmalloc include/linux/slab.h:495
          ep_write_iter+0x167/0xb50 drivers/usb/gadget/legacy/inode.c:664
          new_sync_write fs/read_write.c:499
          __vfs_write+0x483/0x760 fs/read_write.c:512
          vfs_write+0x170/0x4e0 fs/read_write.c:560
          SYSC_write fs/read_write.c:607
          SyS_write+0xfb/0x230 fs/read_write.c:599
          entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      The issue is caused by a lack of size check for the request size in
      ep_write_iter which should be fixed.  It, however, points to another
      problem, that SLUB defines KMALLOC_MAX_SIZE too large because the its
      KMALLOC_SHIFT_MAX is (MAX_ORDER + PAGE_SHIFT) which means that the
      resulting page allocator request might be MAX_ORDER which is too large
      (see __alloc_pages_slowpath).
      
      The same applies to the SLOB allocator which allows even larger sizes.
      Make sure that they are capped properly and never request more than
      MAX_ORDER order.
      
      Link: http://lkml.kernel.org/r/20161220130659.16461-2-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Acked-by: default avatarChristoph Lameter <cl@linux.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9c83b97b
    • Rabin Vincent's avatar
      ARM: 8632/1: ftrace: fix syscall name matching · 5205f521
      Rabin Vincent authored
      
      [ Upstream commit 270c8cf1 ]
      
      ARM has a few system calls (most notably mmap) for which the names of
      the functions which are referenced in the syscall table do not match the
      names of the syscall tracepoints.  As a consequence of this, these
      tracepoints are not made available.  Implement
      arch_syscall_match_sym_name to fix this and allow tracing even these
      system calls.
      Signed-off-by: default avatarRabin Vincent <rabinv@axis.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5205f521
    • Omar Sandoval's avatar
      virtio_blk: fix panic in initialization error path · 874f2265
      Omar Sandoval authored
      
      [ Upstream commit 6bf6b0aa ]
      
      If blk_mq_init_queue() returns an error, it gets assigned to
      vblk->disk->queue. Then, when we call put_disk(), we end up calling
      blk_put_queue() with the ERR_PTR, causing a bad dereference. Fix it by
      only assigning to vblk->disk->queue on success.
      Signed-off-by: default avatarOmar Sandoval <osandov@fb.com>
      Reviewed-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      874f2265
    • Gerd Hoffmann's avatar
      drm/virtio: fix framebuffer sparse warning · c9e4ee44
      Gerd Hoffmann authored
      
      [ Upstream commit 71d3f6ef ]
      
      virtio uses normal ram as backing storage for the framebuffer, so we
      should assign the address to new screen_buffer (added by commit
      17a7b0b4) instead of screen_base.
      Reported-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarGerd Hoffmann <kraxel@redhat.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c9e4ee44
    • Milan P. Gandhi's avatar
      scsi: qla2xxx: Get mutex lock before checking optrom_state · 1e43b2d0
      Milan P. Gandhi authored
      
      [ Upstream commit c7702b8c ]
      
      There is a race condition with qla2xxx optrom functions where one thread
      might modify optrom buffer, optrom_state while other thread is still
      reading from it.
      
      In couple of crashes, it was found that we had successfully passed the
      following 'if' check where we confirm optrom_state to be
      QLA_SREADING. But by the time we acquired mutex lock to proceed with
      memory_read_from_buffer function, some other thread/process had already
      modified that option rom buffer and optrom_state from QLA_SREADING to
      QLA_SWAITING. Then we got ha->optrom_buffer 0x0 and crashed the system:
      
              if (ha->optrom_state != QLA_SREADING)
                      return 0;
      
              mutex_lock(&ha->optrom_mutex);
              rval = memory_read_from_buffer(buf, count, &off, ha->optrom_buffer,
                  ha->optrom_region_size);
              mutex_unlock(&ha->optrom_mutex);
      
      With current optrom function we get following crash due to a race
      condition:
      
      [ 1479.466679] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [ 1479.466707] IP: [<ffffffff81326756>] memcpy+0x6/0x110
      [...]
      [ 1479.473673] Call Trace:
      [ 1479.474296]  [<ffffffff81225cbc>] ? memory_read_from_buffer+0x3c/0x60
      [ 1479.474941]  [<ffffffffa01574dc>] qla2x00_sysfs_read_optrom+0x9c/0xc0 [qla2xxx]
      [ 1479.475571]  [<ffffffff8127e76b>] read+0xdb/0x1f0
      [ 1479.476206]  [<ffffffff811fdf9e>] vfs_read+0x9e/0x170
      [ 1479.476839]  [<ffffffff811feb6f>] SyS_read+0x7f/0xe0
      [ 1479.477466]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
      
      Below patch modifies qla2x00_sysfs_read_optrom,
      qla2x00_sysfs_write_optrom functions to get the mutex_lock before
      checking ha->optrom_state to avoid similar crashes.
      
      The patch was applied and tested and same crashes were no longer
      observed again.
      Tested-by: default avatarMilan P. Gandhi <mgandhi@redhat.com>
      Signed-off-by: default avatarMilan P. Gandhi <mgandhi@redhat.com>
      Reviewed-by: default avatarLaurence Oberman <loberman@redhat.com>
      Acked-by: default avatarHimanshu Madhani <himanshu.madhani@cavium.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e43b2d0
    • Zefir Kurtisi's avatar
      phy state machine: failsafe leave invalid RUNNING state · a9873711
      Zefir Kurtisi authored
      
      [ Upstream commit 811a9191 ]
      
      While in RUNNING state, phy_state_machine() checks for link changes by
      comparing phydev->link before and after calling phy_read_status().
      This works as long as it is guaranteed that phydev->link is never
      changed outside the phy_state_machine().
      
      If in some setups this happens, it causes the state machine to miss
      a link loss and remain RUNNING despite phydev->link being 0.
      
      This has been observed running a dsa setup with a process continuously
      polling the link states over ethtool each second (SNMPD RFC-1213
      agent). Disconnecting the link on a phy followed by a ETHTOOL_GSET
      causes dsa_slave_get_settings() / dsa_slave_get_link_ksettings() to
      call phy_read_status() and with that modify the link status - and
      with that bricking the phy state machine.
      
      This patch adds a fail-safe check while in RUNNING, which causes to
      move to CHANGELINK when the link is gone and we are still RUNNING.
      Signed-off-by: default avatarZefir Kurtisi <zefir.kurtisi@neratec.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a9873711
    • Nicholas Mc Guire's avatar
      x86/boot: Add missing declaration of string functions · db01878c
      Nicholas Mc Guire authored
      
      [ Upstream commit fac69d0e ]
      
      Add the missing declarations of basic string functions to string.h to allow
      a clean build.
      
      Fixes: 5be86566 ("String-handling functions for the new x86 setup code.")
      Signed-off-by: default avatarNicholas Mc Guire <hofrat@osadl.org>
      Link: http://lkml.kernel.org/r/1483781911-21399-1-git-send-email-hofrat@osadl.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db01878c
    • Michael Chan's avatar
      tg3: Fix race condition in tg3_get_stats64(). · 032422cc
      Michael Chan authored
      
      [ Upstream commit f5992b72 ]
      
      The driver's ndo_get_stats64() method is not always called under RTNL.
      So it can race with driver close or ethtool reconfigurations.  Fix the
      race condition by taking tp->lock spinlock in tg3_free_consistent()
      when freeing the tp->hw_stats memory block.  tg3_get_stats64() is
      already taking tp->lock.
      Reported-by: default avatarWang Yufen <wangyufen@huawei.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      032422cc
    • Grygorii Strashko's avatar
      net: phy: dp83867: fix irq generation · 93585e81
      Grygorii Strashko authored
      
      [ Upstream commit 5ca7d1ca ]
      
      For proper IRQ generation by DP83867 phy the INT/PWDN pin has to be
      programmed as an interrupt output instead of a Powerdown input in
      Configuration Register 3 (CFG3), Address 0x001E, bit 7 INT_OE = 1. The
      current driver doesn't do this and as result IRQs will not be generated by
      DP83867 phy even if they are properly configured in DT.
      
      Hence, fix IRQ generation by properly configuring CFG3.INT_OE bit and
      ensure that Link Status Change (LINK_STATUS_CHNG_INT) and Auto-Negotiation
      Complete (AUTONEG_COMP_INT) interrupt are enabled. After this the DP83867
      driver will work properly in interrupt enabled mode.
      Signed-off-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      93585e81
    • Sergei Shtylyov's avatar
      sh_eth: R8A7740 supports packet shecksumming · 41433e31
      Sergei Shtylyov authored
      
      [ Upstream commit 0f1f9cbc ]
      
      The R8A7740 GEther controller supports the packet checksum offloading
      but the 'hw_crc' (bad name, I'll fix it) flag isn't set in the R8A7740
      data,  thus CSMR isn't cleared...
      
      Fixes: 73a0d907 ("net: sh_eth: add support R8A7740")
      Signed-off-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      41433e31
    • Arnd Bergmann's avatar
      wext: handle NULL extra data in iwe_stream_add_point better · 50231cef
      Arnd Bergmann authored
      commit 93be2b74 upstream.
      
      gcc-7 complains that wl3501_cs passes NULL into a function that
      then uses the argument as the input for memcpy:
      
      drivers/net/wireless/wl3501_cs.c: In function 'wl3501_get_scan':
      include/net/iw_handler.h:559:3: error: argument 2 null where non-null expected [-Werror=nonnull]
         memcpy(stream + point_len, extra, iwe->u.data.length);
      
      This works fine here because iwe->u.data.length is guaranteed to be 0
      and the memcpy doesn't actually have an effect.
      
      Making the length check explicit avoids the warning and should have
      no other effect here.
      
      Also check the pointer itself, since otherwise we get warnings
      elsewhere in the code.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      50231cef
    • Jane Chu's avatar
      sparc64: Measure receiver forward progress to avoid send mondo timeout · cada8caa
      Jane Chu authored
      
      [ Upstream commit 9d53caec ]
      
      A large sun4v SPARC system may have moments of intensive xcall activities,
      usually caused by unmapping many pages on many CPUs concurrently. This can
      flood receivers with CPU mondo interrupts for an extended period, causing
      some unlucky senders to hit send-mondo timeout. This problem gets worse
      as cpu count increases because sometimes mappings must be invalidated on
      all CPUs, and sometimes all CPUs may gang up on a single CPU.
      
      But a busy system is not a broken system. In the above scenario, as long
      as the receiver is making forward progress processing mondo interrupts,
      the sender should continue to retry.
      
      This patch implements the receiver's forward progress meter by introducing
      a per cpu counter 'cpu_mondo_counter[cpu]' where 'cpu' is in the range
      of 0..NR_CPUS. The receiver increments its counter as soon as it receives
      a mondo and the sender tracks the receiver's counter. If the receiver has
      stopped making forward progress when the retry limit is reached, the sender
      declares send-mondo-timeout and panic; otherwise, the receiver is allowed
      to keep making forward progress.
      
      In addition, it's been observed that PCIe hotplug events generate Correctable
      Errors that are handled by hypervisor and then OS. Hypervisor 'borrows'
      a guest cpu strand briefly to provide the service. If the cpu strand is
      simultaneously the only cpu targeted by a mondo, it may not be available
      for the mondo in 20msec, causing SUN4V mondo timeout. It appears that 1 second
      is the agreed wait time between hypervisor and guest OS, this patch makes
      the adjustment.
      
      Orabug: 25476541
      Orabug: 26417466
      Signed-off-by: default avatarJane Chu <jane.chu@oracle.com>
      Reviewed-by: default avatarSteve Sistare <steven.sistare@oracle.com>
      Reviewed-by: default avatarAnthony Yznaga <anthony.yznaga@oracle.com>
      Reviewed-by: default avatarRob Gardner <rob.gardner@oracle.com>
      Reviewed-by: default avatarThomas Tai <thomas.tai@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cada8caa
    • Wei Liu's avatar
      xen-netback: correctly schedule rate-limited queues · 7c37101c
      Wei Liu authored
      
      [ Upstream commit dfa523ae ]
      
      Add a flag to indicate if a queue is rate-limited. Test the flag in
      NAPI poll handler and avoid rescheduling the queue if true, otherwise
      we risk locking up the host. The rescheduling will be done in the
      timer callback function.
      Reported-by: default avatarJean-Louis Dupond <jean-louis@dupond.be>
      Signed-off-by: default avatarWei Liu <wei.liu2@citrix.com>
      Tested-by: default avatarJean-Louis Dupond <jean-louis@dupond.be>
      Reviewed-by: default avatarPaul Durrant <paul.durrant@citrix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7c37101c
    • Florian Fainelli's avatar
      net: phy: Fix PHY unbind crash · 2933fb22
      Florian Fainelli authored
      commit 7b9a88a3 upstream.
      
      The PHY library does not deal very well with bind and unbind events. The first
      thing we would see is that we were not properly canceling the PHY state machine
      workqueue, so we would be crashing while dereferencing phydev->drv since there
      is no driver attached anymore.
      Suggested-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2933fb22
    • Florian Fainelli's avatar
      net: phy: Correctly process PHY_HALTED in phy_stop_machine() · a8f1b40b
      Florian Fainelli authored
      
      [ Upstream commit 7ad813f2 ]
      
      Marc reported that he was not getting the PHY library adjust_link()
      callback function to run when calling phy_stop() + phy_disconnect()
      which does not indeed happen because we set the state machine to
      PHY_HALTED but we don't get to run it to process this state past that
      point.
      
      Fix this with a synchronous call to phy_state_machine() in order to have
      the state machine actually act on PHY_HALTED, set the PHY device's link
      down, turn the network device's carrier off and finally call the
      adjust_link() function.
      Reported-by: default avatarMarc Gonzalez <marc_gonzalez@sigmadesigns.com>
      Fixes: a390d1f3 ("phylib: convert state_queue work to delayed_work")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarMarc Gonzalez <marc_gonzalez@sigmadesigns.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a8f1b40b
    • Moshe Shemesh's avatar
      net/mlx5: Fix command bad flow on command entry allocation failure · dc413279
      Moshe Shemesh authored
      
      [ Upstream commit 219c81f7 ]
      
      When driver fail to allocate an entry to send command to FW, it must
      notify the calling function and release the memory allocated for
      this command.
      
      Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters')
      Signed-off-by: default avatarMoshe Shemesh <moshe@mellanox.com>
      Cc: kernel-team@fb.com
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc413279
    • Xin Long's avatar
      sctp: fix the check for _sctp_walk_params and _sctp_walk_errors · de666960
      Xin Long authored
      
      [ Upstream commit 6b84202c ]
      
      Commit b1f5bfc2 ("sctp: don't dereference ptr before leaving
      _sctp_walk_{params, errors}()") tried to fix the issue that it
      may overstep the chunk end for _sctp_walk_{params, errors} with
      'chunk_end > offset(length) + sizeof(length)'.
      
      But it introduced a side effect: When processing INIT, it verifies
      the chunks with 'param.v == chunk_end' after iterating all params
      by sctp_walk_params(). With the check 'chunk_end > offset(length)
      + sizeof(length)', it would return when the last param is not yet
      accessed. Because the last param usually is fwdtsn supported param
      whose size is 4 and 'chunk_end == offset(length) + sizeof(length)'
      
      This is a badly issue even causing sctp couldn't process 4-shakes.
      Client would always get abort when connecting to server, due to
      the failure of INIT chunk verification on server.
      
      The patch is to use 'chunk_end <= offset(length) + sizeof(length)'
      instead of 'chunk_end < offset(length) + sizeof(length)' for both
      _sctp_walk_params and _sctp_walk_errors.
      
      Fixes: b1f5bfc2 ("sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      de666960
    • Alexander Potapenko's avatar
      sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}() · 2bac20a4
      Alexander Potapenko authored
      
      [ Upstream commit b1f5bfc2 ]
      
      If the length field of the iterator (|pos.p| or |err|) is past the end
      of the chunk, we shouldn't access it.
      
      This bug has been detected by KMSAN. For the following pair of system
      calls:
      
        socket(PF_INET6, SOCK_STREAM, 0x84 /* IPPROTO_??? */) = 3
        sendto(3, "A", 1, MSG_OOB, {sa_family=AF_INET6, sin6_port=htons(0),
               inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0,
               sin6_scope_id=0}, 28) = 1
      
      the tool has reported a use of uninitialized memory:
      
        ==================================================================
        BUG: KMSAN: use of uninitialized memory in sctp_rcv+0x17b8/0x43b0
        CPU: 1 PID: 2940 Comm: probe Not tainted 4.11.0-rc5+ #2926
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
        01/01/2011
        Call Trace:
         <IRQ>
         __dump_stack lib/dump_stack.c:16
         dump_stack+0x172/0x1c0 lib/dump_stack.c:52
         kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:927
         __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:469
         __sctp_rcv_init_lookup net/sctp/input.c:1074
         __sctp_rcv_lookup_harder net/sctp/input.c:1233
         __sctp_rcv_lookup net/sctp/input.c:1255
         sctp_rcv+0x17b8/0x43b0 net/sctp/input.c:170
         sctp6_rcv+0x32/0x70 net/sctp/ipv6.c:984
         ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279
         NF_HOOK ./include/linux/netfilter.h:257
         ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322
         dst_input ./include/net/dst.h:492
         ip6_rcv_finish net/ipv6/ip6_input.c:69
         NF_HOOK ./include/linux/netfilter.h:257
         ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203
         __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208
         __netif_receive_skb net/core/dev.c:4246
         process_backlog+0x667/0xba0 net/core/dev.c:4866
         napi_poll net/core/dev.c:5268
         net_rx_action+0xc95/0x1590 net/core/dev.c:5333
         __do_softirq+0x485/0x942 kernel/softirq.c:284
         do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
         </IRQ>
         do_softirq kernel/softirq.c:328
         __local_bh_enable_ip+0x25b/0x290 kernel/softirq.c:181
         local_bh_enable+0x37/0x40 ./include/linux/bottom_half.h:31
         rcu_read_unlock_bh ./include/linux/rcupdate.h:931
         ip6_finish_output2+0x19b2/0x1cf0 net/ipv6/ip6_output.c:124
         ip6_finish_output+0x764/0x970 net/ipv6/ip6_output.c:149
         NF_HOOK_COND ./include/linux/netfilter.h:246
         ip6_output+0x456/0x520 net/ipv6/ip6_output.c:163
         dst_output ./include/net/dst.h:486
         NF_HOOK ./include/linux/netfilter.h:257
         ip6_xmit+0x1841/0x1c00 net/ipv6/ip6_output.c:261
         sctp_v6_xmit+0x3b7/0x470 net/sctp/ipv6.c:225
         sctp_packet_transmit+0x38cb/0x3a20 net/sctp/output.c:632
         sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885
         sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750
         sctp_side_effects net/sctp/sm_sideeffect.c:1773
         sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147
         sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88
         sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954
         inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
         sock_sendmsg_nosec net/socket.c:633
         sock_sendmsg net/socket.c:643
         SYSC_sendto+0x608/0x710 net/socket.c:1696
         SyS_sendto+0x8a/0xb0 net/socket.c:1664
         do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285
         entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246
        RIP: 0033:0x401133
        RSP: 002b:00007fff6d99cd38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
        RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000401133
        RDX: 0000000000000001 RSI: 0000000000494088 RDI: 0000000000000003
        RBP: 00007fff6d99cd90 R08: 00007fff6d99cd50 R09: 000000000000001c
        R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
        R13: 00000000004063d0 R14: 0000000000406460 R15: 0000000000000000
        origin:
         save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59
         kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302
         kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:198
         kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:211
         slab_alloc_node mm/slub.c:2743
         __kmalloc_node_track_caller+0x200/0x360 mm/slub.c:4351
         __kmalloc_reserve net/core/skbuff.c:138
         __alloc_skb+0x26b/0x840 net/core/skbuff.c:231
         alloc_skb ./include/linux/skbuff.h:933
         sctp_packet_transmit+0x31e/0x3a20 net/sctp/output.c:570
         sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885
         sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750
         sctp_side_effects net/sctp/sm_sideeffect.c:1773
         sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147
         sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88
         sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954
         inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
         sock_sendmsg_nosec net/socket.c:633
         sock_sendmsg net/socket.c:643
         SYSC_sendto+0x608/0x710 net/socket.c:1696
         SyS_sendto+0x8a/0xb0 net/socket.c:1664
         do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285
         return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246
        ==================================================================
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2bac20a4
    • Xin Long's avatar
      dccp: fix a memleak for dccp_feat_init err process · dd4edbcb
      Xin Long authored
      
      [ Upstream commit e90ce2fc ]
      
      In dccp_feat_init, when ccid_get_builtin_ccids failsto alloc
      memory for rx.val, it should free tx.val before returning an
      error.
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dd4edbcb
    • Xin Long's avatar
      dccp: fix a memleak that dccp_ipv4 doesn't put reqsk properly · adcc8785
      Xin Long authored
      
      [ Upstream commit b7953d3c ]
      
      The patch "dccp: fix a memleak that dccp_ipv6 doesn't put reqsk
      properly" fixed reqsk refcnt leak for dccp_ipv6. The same issue
      exists on dccp_ipv4.
      
      This patch is to fix it for dccp_ipv4.
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      adcc8785
    • Xin Long's avatar
      dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properly · c3278ed3
      Xin Long authored
      
      [ Upstream commit 0c2232b0 ]
      
      In dccp_v6_conn_request, after reqsk gets alloced and hashed into
      ehash table, reqsk's refcnt is set 3. one is for req->rsk_timer,
      one is for hlist, and the other one is for current using.
      
      The problem is when dccp_v6_conn_request returns and finishes using
      reqsk, it doesn't put reqsk. This will cause reqsk refcnt leaks and
      reqsk obj never gets freed.
      
      Jianlin found this issue when running dccp_memleak.c in a loop, the
      system memory would run out.
      
      dccp_memleak.c:
        int s1 = socket(PF_INET6, 6, IPPROTO_IP);
        bind(s1, &sa1, 0x20);
        listen(s1, 0x9);
        int s2 = socket(PF_INET6, 6, IPPROTO_IP);
        connect(s2, &sa1, 0x20);
        close(s1);
        close(s2);
      
      This patch is to put the reqsk before dccp_v6_conn_request returns,
      just as what tcp_conn_request does.
      Reported-by: default avatarJianlin Shi <jishi@redhat.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c3278ed3
    • Marc Gonzalez's avatar
      net: ethernet: nb8800: Handle all 4 RGMII modes identically · 91c5aa7e
      Marc Gonzalez authored
      
      [ Upstream commit 4813497b ]
      
      Before commit bf8f6952 ("Add blurb about RGMII") it was unclear
      whose responsibility it was to insert the required clock skew, and
      in hindsight, some PHY drivers got it wrong. The solution forward
      is to introduce a new property, explicitly requiring skew from the
      node to which it is attached. In the interim, this driver will handle
      all 4 RGMII modes identically (no skew).
      
      Fixes: 52dfc830 ("net: ethernet: add driver for Aurora VLSI NB8800 Ethernet controller")
      Signed-off-by: default avatarMarc Gonzalez <marc_gonzalez@sigmadesigns.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      91c5aa7e
    • Stefano Brivio's avatar
      ipv6: Don't increase IPSTATS_MIB_FRAGFAILS twice in ip6_fragment() · d1ed1f8a
      Stefano Brivio authored
      
      [ Upstream commit afce615a ]
      
      RFC 2465 defines ipv6IfStatsOutFragFails as:
      
      	"The number of IPv6 datagrams that have been discarded
      	 because they needed to be fragmented at this output
      	 interface but could not be."
      
      The existing implementation, instead, would increase the counter
      twice in case we fail to allocate room for single fragments:
      once for the fragment, once for the datagram.
      
      This didn't look intentional though. In one of the two affected
      affected failure paths, the double increase was simply a result
      of a new 'goto fail' statement, introduced to avoid a skb leak.
      The other path appears to be affected since at least 2.6.12-rc2.
      Reported-by: default avatarSabrina Dubroca <sdubroca@redhat.com>
      Fixes: 1d325d21 ("ipv6: ip6_fragment: fix headroom tests and skb leak")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d1ed1f8a
    • WANG Cong's avatar
      packet: fix use-after-free in prb_retire_rx_blk_timer_expired() · 49933896
      WANG Cong authored
      
      [ Upstream commit c800aaf8 ]
      
      There are multiple reports showing we have a use-after-free in
      the timer prb_retire_rx_blk_timer_expired(), where we use struct
      tpacket_kbdq_core::pkbdq, a pg_vec, after it gets freed by
      free_pg_vec().
      
      The interesting part is it is not freed via packet_release() but
      via packet_setsockopt(), which means we are not closing the socket.
      Looking into the big and fat function packet_set_ring(), this could
      happen if we satisfy the following conditions:
      
      1. closing == 0, not on packet_release() path
      2. req->tp_block_nr == 0, we don't allocate a new pg_vec
      3. rx_ring->pg_vec is already set as V3, which means we already called
         packet_set_ring() wtih req->tp_block_nr > 0 previously
      4. req->tp_frame_nr == 0, pass sanity check
      5. po->mapped == 0, never called mmap()
      
      In this scenario we are clearing the old rx_ring->pg_vec, so we need
      to free this pg_vec, but we don't stop the timer on this path because
      of closing==0.
      
      The timer has to be stopped as long as we need to free pg_vec, therefore
      the check on closing!=0 is wrong, we should check pg_vec!=NULL instead.
      
      Thanks to liujian for testing different fixes.
      
      Reported-by: alexander.levin@verizon.com
      Reported-by: default avatarDave Jones <davej@codemonkey.org.uk>
      Reported-by: default avatarliujian (CE) <liujian56@huawei.com>
      Tested-by: default avatarliujian (CE) <liujian56@huawei.com>
      Cc: Ding Tianhong <dingtianhong@huawei.com>
      Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      49933896
    • Liping Zhang's avatar
      openvswitch: fix potential out of bound access in parse_ct · 23f787ce
      Liping Zhang authored
      
      [ Upstream commit 69ec932e ]
      
      Before the 'type' is validated, we shouldn't use it to fetch the
      ovs_ct_attr_lens's minlen and maxlen, else, out of bound access
      may happen.
      
      Fixes: 7f8a436e ("openvswitch: Add conntrack action")
      Signed-off-by: default avatarLiping Zhang <zlpnobody@gmail.com>
      Acked-by: default avatarPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      23f787ce
    • Thomas Jarosch's avatar
      mcs7780: Fix initialization when CONFIG_VMAP_STACK is enabled · 6d1e34ee
      Thomas Jarosch authored
      
      [ Upstream commit 9476d393 ]
      
      DMA transfers are not allowed to buffers that are on the stack.
      Therefore allocate a buffer to store the result of usb_control_message().
      
      Fixes these bugreports:
      https://bugzilla.kernel.org/show_bug.cgi?id=195217
      
      https://bugzilla.redhat.com/show_bug.cgi?id=1421387
      https://bugzilla.redhat.com/show_bug.cgi?id=1427398
      
      Shortened kernel backtrace from 4.11.9-200.fc25.x86_64:
      kernel: ------------[ cut here ]------------
      kernel: WARNING: CPU: 3 PID: 2957 at drivers/usb/core/hcd.c:1587
      kernel: transfer buffer not dma capable
      kernel: Call Trace:
      kernel: dump_stack+0x63/0x86
      kernel: __warn+0xcb/0xf0
      kernel: warn_slowpath_fmt+0x5a/0x80
      kernel: usb_hcd_map_urb_for_dma+0x37f/0x570
      kernel: ? try_to_del_timer_sync+0x53/0x80
      kernel: usb_hcd_submit_urb+0x34e/0xb90
      kernel: ? schedule_timeout+0x17e/0x300
      kernel: ? del_timer_sync+0x50/0x50
      kernel: ? __slab_free+0xa9/0x300
      kernel: usb_submit_urb+0x2f4/0x560
      kernel: ? urb_destroy+0x24/0x30
      kernel: usb_start_wait_urb+0x6e/0x170
      kernel: usb_control_msg+0xdc/0x120
      kernel: mcs_get_reg+0x36/0x40 [mcs7780]
      kernel: mcs_net_open+0xb5/0x5c0 [mcs7780]
      ...
      
      Regression goes back to 4.9, so it's a good candidate for -stable.
      Though it's the decision of the maintainer.
      
      Thanks to Dan Williams for adding the "transfer buffer not dma capable"
      warning in the first place. It instantly pointed me in the right direction.
      
      Patch has been tested with transferring data from a Polar watch.
      Signed-off-by: default avatarThomas Jarosch <thomas.jarosch@intra2net.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6d1e34ee
    • WANG Cong's avatar
      rtnetlink: allocate more memory for dev_set_mac_address() · d0594690
      WANG Cong authored
      
      [ Upstream commit 153711f9 ]
      
      virtnet_set_mac_address() interprets mac address as struct
      sockaddr, but upper layer only allocates dev->addr_len
      which is ETH_ALEN + sizeof(sa_family_t) in this case.
      
      We lack a unified definition for mac address, so just fix
      the upper layer, this also allows drivers to interpret it
      to struct sockaddr freely.
      Reported-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d0594690
    • Mahesh Bandewar's avatar
      ipv4: initialize fib_trie prior to register_netdev_notifier call. · 31afa8b5
      Mahesh Bandewar authored
      
      [ Upstream commit 8799a221 ]
      
      Net stack initialization currently initializes fib-trie after the
      first call to netdevice_notifier() call. In fact fib_trie initialization
      needs to happen before first rtnl_register(). It does not cause any problem
      since there are no devices UP at this moment, but trying to bring 'lo'
      UP at initialization would make this assumption wrong and exposes the issue.
      
      Fixes following crash
      
       Call Trace:
        ? alternate_node_alloc+0x76/0xa0
        fib_table_insert+0x1b7/0x4b0
        fib_magic.isra.17+0xea/0x120
        fib_add_ifaddr+0x7b/0x190
        fib_netdev_event+0xc0/0x130
        register_netdevice_notifier+0x1c1/0x1d0
        ip_fib_init+0x72/0x85
        ip_rt_init+0x187/0x1e9
        ip_init+0xe/0x1a
        inet_init+0x171/0x26c
        ? ipv4_offload_init+0x66/0x66
        do_one_initcall+0x43/0x160
        kernel_init_freeable+0x191/0x219
        ? rest_init+0x80/0x80
        kernel_init+0xe/0x150
        ret_from_fork+0x22/0x30
       Code: f6 46 23 04 74 86 4c 89 f7 e8 ae 45 01 00 49 89 c7 4d 85 ff 0f 85 7b ff ff ff 31 db eb 08 4c 89 ff e8 16 47 01 00 48 8b 44 24 38 <45> 8b 6e 14 4d 63 76 74 48 89 04 24 0f 1f 44 00 00 48 83 c4 08
       RIP: kmem_cache_alloc+0xcf/0x1c0 RSP: ffff9b1500017c28
       CR2: 0000000000000014
      
      Fixes: 7b1a74fd ("[NETNS]: Refactor fib initialization so it can handle multiple namespaces.")
      Fixes: 7f9b8052 ("[IPV4]: fib hash|trie initialization")
      Signed-off-by: default avatarMahesh Bandewar <maheshb@google.com>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      31afa8b5
    • Sabrina Dubroca's avatar
      ipv6: avoid overflow of offset in ip6_find_1stfragopt · f09db755
      Sabrina Dubroca authored
      
      [ Upstream commit 6399f1fa ]
      
      In some cases, offset can overflow and can cause an infinite loop in
      ip6_find_1stfragopt(). Make it unsigned int to prevent the overflow, and
      cap it at IPV6_MAXPLEN, since packets larger than that should be invalid.
      
      This problem has been here since before the beginning of git history.
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f09db755
    • David S. Miller's avatar
      net: Zero terminate ifr_name in dev_ifname(). · e9b2f461
      David S. Miller authored
      
      [ Upstream commit 63679112 ]
      
      The ifr.ifr_name is passed around and assumed to be NULL terminated.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e9b2f461
    • Alexander Potapenko's avatar
      ipv4: ipv6: initialize treq->txhash in cookie_v[46]_check() · c10e874b
      Alexander Potapenko authored
      
      [ Upstream commit 18bcf290 ]
      
      KMSAN reported use of uninitialized memory in skb_set_hash_from_sk(),
      which originated from the TCP request socket created in
      cookie_v6_check():
      
       ==================================================================
       BUG: KMSAN: use of uninitialized memory in tcp_transmit_skb+0xf77/0x3ec0
       CPU: 1 PID: 2949 Comm: syz-execprog Not tainted 4.11.0-rc5+ #2931
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       TCP: request_sock_TCPv6: Possible SYN flooding on port 20028. Sending cookies.  Check SNMP counters.
       Call Trace:
        <IRQ>
        __dump_stack lib/dump_stack.c:16
        dump_stack+0x172/0x1c0 lib/dump_stack.c:52
        kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:927
        __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:469
        skb_set_hash_from_sk ./include/net/sock.h:2011
        tcp_transmit_skb+0xf77/0x3ec0 net/ipv4/tcp_output.c:983
        tcp_send_ack+0x75b/0x830 net/ipv4/tcp_output.c:3493
        tcp_delack_timer_handler+0x9a6/0xb90 net/ipv4/tcp_timer.c:284
        tcp_delack_timer+0x1b0/0x310 net/ipv4/tcp_timer.c:309
        call_timer_fn+0x240/0x520 kernel/time/timer.c:1268
        expire_timers kernel/time/timer.c:1307
        __run_timers+0xc13/0xf10 kernel/time/timer.c:1601
        run_timer_softirq+0x36/0xa0 kernel/time/timer.c:1614
        __do_softirq+0x485/0x942 kernel/softirq.c:284
        invoke_softirq kernel/softirq.c:364
        irq_exit+0x1fa/0x230 kernel/softirq.c:405
        exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:657
        smp_apic_timer_interrupt+0x5a/0x80 arch/x86/kernel/apic/apic.c:966
        apic_timer_interrupt+0x86/0x90 arch/x86/entry/entry_64.S:489
       RIP: 0010:native_restore_fl ./arch/x86/include/asm/irqflags.h:36
       RIP: 0010:arch_local_irq_restore ./arch/x86/include/asm/irqflags.h:77
       RIP: 0010:__msan_poison_alloca+0xed/0x120 mm/kmsan/kmsan_instr.c:440
       RSP: 0018:ffff880024917cd8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
       RAX: 0000000000000246 RBX: ffff8800224c0000 RCX: 0000000000000005
       RDX: 0000000000000004 RSI: ffff880000000000 RDI: ffffea0000b6d770
       RBP: ffff880024917d58 R08: 0000000000000dd8 R09: 0000000000000004
       R10: 0000160000000000 R11: 0000000000000000 R12: ffffffff85abf810
       R13: ffff880024917dd8 R14: 0000000000000010 R15: ffffffff81cabde4
        </IRQ>
        poll_select_copy_remaining+0xac/0x6b0 fs/select.c:293
        SYSC_select+0x4b4/0x4e0 fs/select.c:653
        SyS_select+0x76/0xa0 fs/select.c:634
        entry_SYSCALL_64_fastpath+0x13/0x94 arch/x86/entry/entry_64.S:204
       RIP: 0033:0x4597e7
       RSP: 002b:000000c420037ee0 EFLAGS: 00000246 ORIG_RAX: 0000000000000017
       RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004597e7
       RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
       RBP: 000000c420037ef0 R08: 000000c420037ee0 R09: 0000000000000059
       R10: 0000000000000000 R11: 0000000000000246 R12: 000000000042dc20
       R13: 00000000000000f3 R14: 0000000000000030 R15: 0000000000000003
       chained origin:
        save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59
        kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302
        kmsan_save_stack mm/kmsan/kmsan.c:317
        kmsan_internal_chain_origin+0x12a/0x1f0 mm/kmsan/kmsan.c:547
        __msan_store_shadow_origin_4+0xac/0x110 mm/kmsan/kmsan_instr.c:259
        tcp_create_openreq_child+0x709/0x1ae0 net/ipv4/tcp_minisocks.c:472
        tcp_v6_syn_recv_sock+0x7eb/0x2a30 net/ipv6/tcp_ipv6.c:1103
        tcp_get_cookie_sock+0x136/0x5f0 net/ipv4/syncookies.c:212
        cookie_v6_check+0x17a9/0x1b50 net/ipv6/syncookies.c:245
        tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:989
        tcp_v6_do_rcv+0xdd8/0x1c60 net/ipv6/tcp_ipv6.c:1298
        tcp_v6_rcv+0x41a3/0x4f00 net/ipv6/tcp_ipv6.c:1487
        ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279
        NF_HOOK ./include/linux/netfilter.h:257
        ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322
        dst_input ./include/net/dst.h:492
        ip6_rcv_finish net/ipv6/ip6_input.c:69
        NF_HOOK ./include/linux/netfilter.h:257
        ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203
        __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208
        __netif_receive_skb net/core/dev.c:4246
        process_backlog+0x667/0xba0 net/core/dev.c:4866
        napi_poll net/core/dev.c:5268
        net_rx_action+0xc95/0x1590 net/core/dev.c:5333
        __do_softirq+0x485/0x942 kernel/softirq.c:284
       origin:
        save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59
        kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302
        kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:198
        kmsan_kmalloc+0x7f/0xe0 mm/kmsan/kmsan.c:337
        kmem_cache_alloc+0x1c2/0x1e0 mm/slub.c:2766
        reqsk_alloc ./include/net/request_sock.h:87
        inet_reqsk_alloc+0xa4/0x5b0 net/ipv4/tcp_input.c:6200
        cookie_v6_check+0x4f4/0x1b50 net/ipv6/syncookies.c:169
        tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:989
        tcp_v6_do_rcv+0xdd8/0x1c60 net/ipv6/tcp_ipv6.c:1298
        tcp_v6_rcv+0x41a3/0x4f00 net/ipv6/tcp_ipv6.c:1487
        ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279
        NF_HOOK ./include/linux/netfilter.h:257
        ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322
        dst_input ./include/net/dst.h:492
        ip6_rcv_finish net/ipv6/ip6_input.c:69
        NF_HOOK ./include/linux/netfilter.h:257
        ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203
        __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208
        __netif_receive_skb net/core/dev.c:4246
        process_backlog+0x667/0xba0 net/core/dev.c:4866
        napi_poll net/core/dev.c:5268
        net_rx_action+0xc95/0x1590 net/core/dev.c:5333
        __do_softirq+0x485/0x942 kernel/softirq.c:284
       ==================================================================
      
      Similar error is reported for cookie_v4_check().
      
      Fixes: 58d607d3 ("tcp: provide skb->hash to synack packets")
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c10e874b