1. 18 Feb, 2017 28 commits
    • Eric Dumazet's avatar
      l2tp: do not use udp_ioctl() · 5b1bb4cb
      Eric Dumazet authored
      [ Upstream commit 72fb96e7 ]
      
      udp_ioctl(), as its name suggests, is used by UDP protocols,
      but is also used by L2TP :(
      
      L2TP should use its own handler, because it really does not
      look the same.
      
      SIOCINQ for instance should not assume UDP checksum or headers.
      
      Thanks to Andrey and syzkaller team for providing the report
      and a nice reproducer.
      
      While crashes only happen on recent kernels (after commit
      7c13f97f ("udp: do fwd memory scheduling on dequeue")), this
      probably needs to be backported to older kernels.
      
      Fixes: 7c13f97f ("udp: do fwd memory scheduling on dequeue")
      Fixes: 85584672 ("udp: Fix udp_poll() and ioctl()")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Acked-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b1bb4cb
    • Florian Fainelli's avatar
      net: dsa: Do not destroy invalid network devices · 12758a28
      Florian Fainelli authored
      [ Upstream commit 382e1eea ]
      
      dsa_slave_create() can fail, and dsa_user_port_unapply() will properly check
      for the network device not being NULL before attempting to destroy it. We were
      not setting the slave network device as NULL if dsa_slave_create() failed, so
      we would later on be calling dsa_slave_destroy() on a now free'd and
      unitialized network device, causing crashes in dsa_slave_destroy().
      
      Fixes: 83c0afae ("net: dsa: Add new binding implementation")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      12758a28
    • WANG Cong's avatar
      ping: fix a null pointer dereference · a700cf26
      WANG Cong authored
      [ Upstream commit 73d2c667 ]
      
      Andrey reported a kernel crash:
      
        general protection fault: 0000 [#1] SMP KASAN
        Dumping ftrace buffer:
           (ftrace buffer empty)
        Modules linked in:
        CPU: 2 PID: 3880 Comm: syz-executor1 Not tainted 4.10.0-rc6+ #124
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        task: ffff880060048040 task.stack: ffff880069be8000
        RIP: 0010:ping_v4_push_pending_frames net/ipv4/ping.c:647 [inline]
        RIP: 0010:ping_v4_sendmsg+0x1acd/0x23f0 net/ipv4/ping.c:837
        RSP: 0018:ffff880069bef8b8 EFLAGS: 00010206
        RAX: dffffc0000000000 RBX: ffff880069befb90 RCX: 0000000000000000
        RDX: 0000000000000018 RSI: ffff880069befa30 RDI: 00000000000000c2
        RBP: ffff880069befbb8 R08: 0000000000000008 R09: 0000000000000000
        R10: 0000000000000002 R11: 0000000000000000 R12: ffff880069befab0
        R13: ffff88006c624a80 R14: ffff880069befa70 R15: 0000000000000000
        FS:  00007f6f7c716700(0000) GS:ffff88006de00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00000000004a6f28 CR3: 000000003a134000 CR4: 00000000000006e0
        Call Trace:
         inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
         sock_sendmsg_nosec net/socket.c:635 [inline]
         sock_sendmsg+0xca/0x110 net/socket.c:645
         SYSC_sendto+0x660/0x810 net/socket.c:1687
         SyS_sendto+0x40/0x50 net/socket.c:1655
         entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      This is because we miss a check for NULL pointer for skb_peek() when
      the queue is empty. Other places already have the same check.
      
      Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind")
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a700cf26
    • Willem de Bruijn's avatar
      packet: round up linear to header len · 82849541
      Willem de Bruijn authored
      [ Upstream commit 57031eb7 ]
      
      Link layer protocols may unconditionally pull headers, as Ethernet
      does in eth_type_trans. Ensure that the entire link layer header
      always lies in the skb linear segment. tpacket_snd has such a check.
      Extend this to packet_snd.
      
      Variable length link layer headers complicate the computation
      somewhat. Here skb->len may be smaller than dev->hard_header_len.
      
      Round up the linear length to be at least as long as the smallest of
      the two.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      82849541
    • Willem de Bruijn's avatar
      net: introduce device min_header_len · 6ebde312
      Willem de Bruijn authored
      [ Upstream commit 217e6fa2 ]
      
      The stack must not pass packets to device drivers that are shorter
      than the minimum link layer header length.
      
      Previously, packet sockets would drop packets smaller than or equal
      to dev->hard_header_len, but this has false positives. Zero length
      payload is used over Ethernet. Other link layer protocols support
      variable length headers. Support for validation of these protocols
      removed the min length check for all protocols.
      
      Introduce an explicit dev->min_header_len parameter and drop all
      packets below this value. Initially, set it to non-zero only for
      Ethernet and loopback. Other protocols can follow in a patch to
      net-next.
      
      Fixes: 9ed988cd ("packet: validate variable length ll headers")
      Reported-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6ebde312
    • WANG Cong's avatar
      sit: fix a double free on error path · 4cd03621
      WANG Cong authored
      [ Upstream commit d7426c69 ]
      
      Dmitry reported a double free in sit_init_net():
      
        kernel BUG at mm/percpu.c:689!
        invalid opcode: 0000 [#1] SMP KASAN
        Dumping ftrace buffer:
           (ftrace buffer empty)
        Modules linked in:
        CPU: 0 PID: 15692 Comm: syz-executor1 Not tainted 4.10.0-rc6-next-20170206 #1
        Hardware name: Google Google Compute Engine/Google Compute Engine,
        BIOS Google 01/01/2011
        task: ffff8801c9cc27c0 task.stack: ffff88017d1d8000
        RIP: 0010:pcpu_free_area+0x68b/0x810 mm/percpu.c:689
        RSP: 0018:ffff88017d1df488 EFLAGS: 00010046
        RAX: 0000000000010000 RBX: 00000000000007c0 RCX: ffffc90002829000
        RDX: 0000000000010000 RSI: ffffffff81940efb RDI: ffff8801db841d94
        RBP: ffff88017d1df590 R08: dffffc0000000000 R09: 1ffffffff0bb3bdd
        R10: dffffc0000000000 R11: 00000000000135dd R12: ffff8801db841d80
        R13: 0000000000038e40 R14: 00000000000007c0 R15: 00000000000007c0
        FS:  00007f6ea608f700(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 000000002000aff8 CR3: 00000001c8d44000 CR4: 00000000001426f0
        DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
        Call Trace:
         free_percpu+0x212/0x520 mm/percpu.c:1264
         ipip6_dev_free+0x43/0x60 net/ipv6/sit.c:1335
         sit_init_net+0x3cb/0xa10 net/ipv6/sit.c:1831
         ops_init+0x10a/0x530 net/core/net_namespace.c:115
         setup_net+0x2ed/0x690 net/core/net_namespace.c:291
         copy_net_ns+0x26c/0x530 net/core/net_namespace.c:396
         create_new_namespaces+0x409/0x860 kernel/nsproxy.c:106
         unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:205
         SYSC_unshare kernel/fork.c:2281 [inline]
         SyS_unshare+0x64e/0xfc0 kernel/fork.c:2231
         entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      This is because when tunnel->dst_cache init fails, we free dev->tstats
      once in ipip6_tunnel_init() and twice in sit_init_net(). This looks
      redundant but its ndo_uinit() does not seem enough to clean up everything
      here. So avoid this by setting dev->tstats to NULL after the first free,
      at least for -net.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4cd03621
    • David Ahern's avatar
      lwtunnel: valid encap attr check should return 0 when lwtunnel is disabled · 2b7f50d6
      David Ahern authored
      [ Upstream commit 2bd137de ]
      
      An error was reported upgrading to 4.9.8:
          root@Typhoon:~# ip route add default table 210 nexthop dev eth0 via 10.68.64.1
          weight 1 nexthop dev eth0 via 10.68.64.2 weight 1
          RTNETLINK answers: Operation not supported
      
      The problem occurs when CONFIG_LWTUNNEL is not enabled and a multipath
      route is submitted.
      
      The point of lwtunnel_valid_encap_type_attr is catch modules that
      need to be loaded before any references are taken with rntl held. With
      CONFIG_LWTUNNEL disabled, there will be no modules to load so the
      lwtunnel_valid_encap_type_attr stub should just return 0.
      
      Fixes: 9ed59592 ("lwtunnel: fix autoload of lwt modules")
      Reported-by: pupilla@libero.it
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2b7f50d6
    • Marcelo Ricardo Leitner's avatar
      sctp: avoid BUG_ON on sctp_wait_for_sndbuf · 00eff2eb
      Marcelo Ricardo Leitner authored
      [ Upstream commit 2dcab598 ]
      
      Alexander Popov reported that an application may trigger a BUG_ON in
      sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is
      waiting on it to queue more data and meanwhile another thread peels off
      the association being used by the first thread.
      
      This patch replaces the BUG_ON call with a proper error handling. It
      will return -EPIPE to the original sendmsg call, similarly to what would
      have been done if the association wasn't found in the first place.
      Acked-by: default avatarAlexander Popov <alex.popov@linux.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Reviewed-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      00eff2eb
    • Benjamin Poirier's avatar
      mlx4: Invoke softirqs after napi_reschedule · 4400acce
      Benjamin Poirier authored
      [ Upstream commit bd4ce941 ]
      
      mlx4 may schedule napi from a workqueue. Afterwards, softirqs are not run
      in a deterministic time frame and the following message may be logged:
      NOHZ: local_softirq_pending 08
      
      The problem is the same as what was described in commit ec13ee80
      ("virtio_net: invoke softirqs after __napi_schedule") and this patch
      applies the same fix to mlx4.
      
      Fixes: 07841f9d ("net/mlx4_en: Schedule napi when RX buffers allocation fails")
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarBenjamin Poirier <bpoirier@suse.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4400acce
    • Ben Hutchings's avatar
      catc: Use heap buffer for memory size test · 970390fd
      Ben Hutchings authored
      [ Upstream commit 2d6a0e9d ]
      
      Allocating USB buffers on the stack is not portable, and no longer
      works on x86_64 (with VMAP_STACK enabled as per default).
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      970390fd
    • Ben Hutchings's avatar
      61bf9f38
    • Ben Hutchings's avatar
      rtl8150: Use heap buffers for all register access · e898f6f0
      Ben Hutchings authored
      [ Upstream commit 7926aff5 ]
      
      Allocating USB buffers on the stack is not portable, and no longer
      works on x86_64 (with VMAP_STACK enabled as per default).
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e898f6f0
    • Ben Hutchings's avatar
      pegasus: Use heap buffers for all register access · 878b015b
      Ben Hutchings authored
      [ Upstream commit 5593523f ]
      
      Allocating USB buffers on the stack is not portable, and no longer
      works on x86_64 (with VMAP_STACK enabled as per default).
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      References: https://bugs.debian.org/852556Reported-by: default avatarLisandro Damián Nicanor Pérez Meyer <lisandro@debian.org>
      Tested-by: default avatarLisandro Damián Nicanor Pérez Meyer <lisandro@debian.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      878b015b
    • Willem de Bruijn's avatar
      macvtap: read vnet_hdr_size once · b90cb484
      Willem de Bruijn authored
      [ Upstream commit 837585a5 ]
      
      When IFF_VNET_HDR is enabled, a virtio_net header must precede data.
      Data length is verified to be greater than or equal to expected header
      length tun->vnet_hdr_sz before copying.
      
      Macvtap functions read the value once, but unless READ_ONCE is used,
      the compiler may ignore this and read multiple times. Enforce a single
      read and locally cached value to avoid updates between test and use.
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Suggested-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b90cb484
    • Willem de Bruijn's avatar
      tun: read vnet_hdr_sz once · 26989c9d
      Willem de Bruijn authored
      [ Upstream commit e1edab87 ]
      
      When IFF_VNET_HDR is enabled, a virtio_net header must precede data.
      Data length is verified to be greater than or equal to expected header
      length tun->vnet_hdr_sz before copying.
      
      Read this value once and cache locally, as it can be updated between
      the test and use (TOCTOU).
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      CC: Eric Dumazet <edumazet@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      26989c9d
    • Eric Dumazet's avatar
      tcp: avoid infinite loop in tcp_splice_read() · 0f895f51
      Eric Dumazet authored
      [ Upstream commit ccf7abb9 ]
      
      Splicing from TCP socket is vulnerable when a packet with URG flag is
      received and stored into receive queue.
      
      __tcp_splice_read() returns 0, and sk_wait_data() immediately
      returns since there is the problematic skb in queue.
      
      This is a nice way to burn cpu (aka infinite loop) and trigger
      soft lockups.
      
      Again, this gem was found by syzkaller tool.
      
      Fixes: 9c55e01c ("[TCP]: Splice receive support.")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov  <dvyukov@google.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0f895f51
    • Eric Dumazet's avatar
      ipv6: tcp: add a missing tcp_v6_restore_cb() · 1e340bb2
      Eric Dumazet authored
      [ Upstream commit ebf6c9cb ]
      
      Dmitry reported use-after-free in ip6_datagram_recv_specific_ctl()
      
      A similar bug was fixed in commit 8ce48623 ("ipv6: tcp: restore
      IP6CB for pktoptions skbs"), but I missed another spot.
      
      tcp_v6_syn_recv_sock() can indeed set np->pktoptions from ireq->pktopts
      
      Fixes: 971f10ec ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e340bb2
    • Eric Dumazet's avatar
      ip6_gre: fix ip6gre_err() invalid reads · ae1768bb
      Eric Dumazet authored
      [ Upstream commit 7892032c ]
      
      Andrey Konovalov reported out of bound accesses in ip6gre_err()
      
      If GRE flags contains GRE_KEY, the following expression
      *(((__be32 *)p) + (grehlen / 4) - 1)
      
      accesses data ~40 bytes after the expected point, since
      grehlen includes the size of IPv6 headers.
      
      Let's use a "struct gre_base_hdr *greh" pointer to make this
      code more readable.
      
      p[1] becomes greh->protocol.
      grhlen is the GRE header length.
      
      Fixes: c12b395a ("gre: Support GRE over IPv6")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ae1768bb
    • Eric Dumazet's avatar
      netlabel: out of bound access in cipso_v4_validate() · 66cdd434
      Eric Dumazet authored
      [ Upstream commit d71b7896 ]
      
      syzkaller found another out of bound access in ip_options_compile(),
      or more exactly in cipso_v4_validate()
      
      Fixes: 20e2a864 ("cipso: handle CIPSO options correctly when NetLabel is disabled")
      Fixes: 446fda4f ("[NetLabel]: CIPSOv4 engine")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov  <dvyukov@google.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Acked-by: default avatarPaul Moore <paul@paul-moore.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      66cdd434
    • Eric Dumazet's avatar
      ipv4: keep skb->dst around in presence of IP options · f5b54446
      Eric Dumazet authored
      [ Upstream commit 34b2cef2 ]
      
      Andrey Konovalov got crashes in __ip_options_echo() when a NULL skb->dst
      is accessed.
      
      ipv4_pktinfo_prepare() should not drop the dst if (evil) IP options
      are present.
      
      We could refine the test to the presence of ts_needtime or srr,
      but IP options are not often used, so let's be conservative.
      
      Thanks to syzkaller team for finding this bug.
      
      Fixes: d826eb14 ("ipv4: PKTINFO doesnt need dst reference")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5b54446
    • Eric Dumazet's avatar
      net: use a work queue to defer net_disable_timestamp() work · d5b6fd77
      Eric Dumazet authored
      [ Upstream commit 5fa8bbda ]
      
      Dmitry reported a warning [1] showing that we were calling
      net_disable_timestamp() -> static_key_slow_dec() from a non
      process context.
      
      Grabbing a mutex while holding a spinlock or rcu_read_lock()
      is not allowed.
      
      As Cong suggested, we now use a work queue.
      
      It is possible netstamp_clear() exits while netstamp_needed_deferred
      is not zero, but it is probably not worth trying to do better than that.
      
      netstamp_needed_deferred atomic tracks the exact number of deferred
      decrements.
      
      [1]
      [ INFO: suspicious RCU usage. ]
      4.10.0-rc5+ #192 Not tainted
      -------------------------------
      ./include/linux/rcupdate.h:561 Illegal context switch in RCU read-side
      critical section!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 0
      2 locks held by syz-executor14/23111:
       #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83a35c35>] lock_sock
      include/net/sock.h:1454 [inline]
       #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83a35c35>]
      rawv6_sendmsg+0x1e65/0x3ec0 net/ipv6/raw.c:919
       #1:  (rcu_read_lock){......}, at: [<ffffffff83ae2678>] nf_hook
      include/linux/netfilter.h:201 [inline]
       #1:  (rcu_read_lock){......}, at: [<ffffffff83ae2678>]
      __ip6_local_out+0x258/0x840 net/ipv6/output_core.c:160
      
      stack backtrace:
      CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:15 [inline]
       dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
       lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452
       rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline]
       ___might_sleep+0x560/0x650 kernel/sched/core.c:7748
       __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
       mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
       atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
       __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
       static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
       net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
       sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
       __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
       sk_destruct+0x47/0x80 net/core/sock.c:1460
       __sk_free+0x57/0x230 net/core/sock.c:1468
       sock_wfree+0xae/0x120 net/core/sock.c:1645
       skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
       skb_release_all+0x15/0x60 net/core/skbuff.c:668
       __kfree_skb+0x15/0x20 net/core/skbuff.c:684
       kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
       inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
       inet_frag_put include/net/inet_frag.h:133 [inline]
       nf_ct_frag6_gather+0x1106/0x3840
      net/ipv6/netfilter/nf_conntrack_reasm.c:617
       ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
       nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
       nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
       nf_hook include/linux/netfilter.h:212 [inline]
       __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
       ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
       ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
       ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
       rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
       rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
       inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
       sock_sendmsg_nosec net/socket.c:635 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:645
       sock_write_iter+0x326/0x600 net/socket.c:848
       do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
       do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
       vfs_writev+0x87/0xc0 fs/read_write.c:911
       do_writev+0x110/0x2c0 fs/read_write.c:944
       SYSC_writev fs/read_write.c:1017 [inline]
       SyS_writev+0x27/0x30 fs/read_write.c:1014
       entry_SYSCALL_64_fastpath+0x1f/0xc2
      RIP: 0033:0x445559
      RSP: 002b:00007f6f46fceb58 EFLAGS: 00000292 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 0000000000445559
      RDX: 0000000000000001 RSI: 0000000020f1eff0 RDI: 0000000000000005
      RBP: 00000000006e19c0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000700000
      R13: 0000000020f59000 R14: 0000000000000015 R15: 0000000000020400
      BUG: sleeping function called from invalid context at
      kernel/locking/mutex.c:752
      in_atomic(): 1, irqs_disabled(): 0, pid: 23111, name: syz-executor14
      INFO: lockdep is turned off.
      CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:15 [inline]
       dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
       ___might_sleep+0x47e/0x650 kernel/sched/core.c:7780
       __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
       mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
       atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
       __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
       static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
       net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
       sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
       __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
       sk_destruct+0x47/0x80 net/core/sock.c:1460
       __sk_free+0x57/0x230 net/core/sock.c:1468
       sock_wfree+0xae/0x120 net/core/sock.c:1645
       skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
       skb_release_all+0x15/0x60 net/core/skbuff.c:668
       __kfree_skb+0x15/0x20 net/core/skbuff.c:684
       kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
       inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
       inet_frag_put include/net/inet_frag.h:133 [inline]
       nf_ct_frag6_gather+0x1106/0x3840
      net/ipv6/netfilter/nf_conntrack_reasm.c:617
       ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
       nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
       nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
       nf_hook include/linux/netfilter.h:212 [inline]
       __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
       ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
       ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
       ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
       rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
       rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
       inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
       sock_sendmsg_nosec net/socket.c:635 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:645
       sock_write_iter+0x326/0x600 net/socket.c:848
       do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
       do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
       vfs_writev+0x87/0xc0 fs/read_write.c:911
       do_writev+0x110/0x2c0 fs/read_write.c:944
       SYSC_writev fs/read_write.c:1017 [inline]
       SyS_writev+0x27/0x30 fs/read_write.c:1014
       entry_SYSCALL_64_fastpath+0x1f/0xc2
      RIP: 0033:0x445559
      
      Fixes: b90e5794 ("net: dont call jump_label_dec from irq context")
      Suggested-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5b6fd77
    • Alexey Brodkin's avatar
      stmmac: Discard masked flags in interrupt status register · 455a4577
      Alexey Brodkin authored
      [ Upstream commit 0a764db1 ]
      
      DW GMAC databook says the following about bits in "Register 15 (Interrupt
      Mask Register)":
      --------------------------->8-------------------------
      When set, this bit __disables_the_assertion_of_the_interrupt_signal__
      because of the setting of XXX bit in Register 14 (Interrupt
      Status Register).
      --------------------------->8-------------------------
      
      In fact even if we mask one bit in the mask register it doesn't prevent
      corresponding bit to appear in the status register, it only disables
      interrupt generation for corresponding event.
      
      But currently we expect a bit different behavior: status bits to be in
      sync with their masks, i.e. if mask for bit A is set in the mask
      register then bit A won't appear in the interrupt status register.
      
      This was proven to be incorrect assumption, see discussion here [1].
      That misunderstanding causes unexpected behaviour of the GMAC, for
      example we were happy enough to just see bogus messages about link
      state changes.
      
      So from now on we'll be only checking bits that really may trigger an
      interrupt.
      
      [1] https://lkml.org/lkml/2016/11/3/413Signed-off-by: default avatarAlexey Brodkin <abrodkin@synopsys.com>
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Cc: Fabrice Gasnier <fabrice.gasnier@st.com>
      Cc: Joachim Eastwood <manabian@gmail.com>
      Cc: Phil Reid <preid@electromag.com.au>
      Cc: David Miller <davem@davemloft.net>
      Cc: Alexandre Torgue <alexandre.torgue@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      455a4577
    • Eric Dumazet's avatar
      tcp: fix 0 divide in __tcp_select_window() · ca876dff
      Eric Dumazet authored
      [ Upstream commit 06425c30 ]
      
      syszkaller fuzzer was able to trigger a divide by zero, when
      TCP window scaling is not enabled.
      
      SO_RCVBUF can be used not only to increase sk_rcvbuf, also
      to decrease it below current receive buffers utilization.
      
      If mss is negative or 0, just return a zero TCP window.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov  <dvyukov@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca876dff
    • Dan Carpenter's avatar
      ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim() · e6fbace8
      Dan Carpenter authored
      [ Upstream commit 63117f09 ]
      
      Casting is a high precedence operation but "off" and "i" are in terms of
      bytes so we need to have some parenthesis here.
      
      Fixes: fbfa743a ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e6fbace8
    • Eric Dumazet's avatar
      ipv6: fix ip6_tnl_parse_tlv_enc_lim() · a7fe4e5d
      Eric Dumazet authored
      [ Upstream commit fbfa743a ]
      
      This function suffers from multiple issues.
      
      First one is that pskb_may_pull() may reallocate skb->head,
      so the 'raw' pointer needs either to be reloaded or not used at all.
      
      Second issue is that NEXTHDR_DEST handling does not validate
      that the options are present in skb->data, so we might read
      garbage or access non existent memory.
      
      With help from Willem de Bruijn.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov  <dvyukov@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a7fe4e5d
    • Yotam Gigi's avatar
      net/sched: matchall: Fix configuration race · 6c8556f6
      Yotam Gigi authored
      [ Upstream commit fd62d9f5 ]
      
      In the current version, the matchall internal state is split into two
      structs: cls_matchall_head and cls_matchall_filter. This makes little
      sense, as matchall instance supports only one filter, and there is no
      situation where one exists and the other does not. In addition, that led
      to some races when filter was deleted while packet was processed.
      
      Unify that two structs into one, thus simplifying the process of matchall
      creation and deletion. As a result, the new, delete and get callbacks have
      a dummy implementation where all the work is done in destroy and change
      callbacks, as was done in cls_cgroup.
      
      Fixes: bf3994d2 ("net/sched: introduce Match-all classifier")
      Reported-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarYotam Gigi <yotamg@mellanox.com>
      Acked-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6c8556f6
    • Gal Pressman's avatar
      net/mlx5e: Fix update of hash function/key via ethtool · 64cc7ef5
      Gal Pressman authored
      [ Upstream commit a100ff3e ]
      
      Modifying TIR hash should change selected fields bitmask in addition to
      the function and key.
      
      Formerly, Only on ethool mlx5e_set_rxfh "ethtoo -X" we would not set this
      field resulting in zeroing of its value, which means no packet fields are
      used for RX RSS hash calculation thus causing all traffic to arrive in
      RQ[0].
      
      On driver load out of the box we don't have this issue, since the TIR
      hash is fully created from scratch.
      
      Tested:
      ethtool -X ethX hkey  <new key>
      ethtool -X ethX hfunc <new func>
      ethtool -X ethX equal <new indirection table>
      
      All cases are verified with TCP Multi-Stream traffic over IPv4 & IPv6.
      
      Fixes: bdfc028d ("net/mlx5e: Fix ethtool RX hash func configuration change")
      Signed-off-by: default avatarGal Pressman <galp@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      64cc7ef5
    • Eric Dumazet's avatar
      can: Fix kernel panic at security_sock_rcv_skb · adf86d59
      Eric Dumazet authored
      [ Upstream commit f1712c73 ]
      
      Zhang Yanmin reported crashes [1] and provided a patch adding a
      synchronize_rcu() call in can_rx_unregister()
      
      The main problem seems that the sockets themselves are not RCU
      protected.
      
      If CAN uses RCU for delivery, then sockets should be freed only after
      one RCU grace period.
      
      Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
      ease stable backports with the following fix instead.
      
      [1]
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<ffffffff81495e25>] selinux_socket_sock_rcv_skb+0x65/0x2a0
      
      Call Trace:
       <IRQ>
       [<ffffffff81485d8c>] security_sock_rcv_skb+0x4c/0x60
       [<ffffffff81d55771>] sk_filter+0x41/0x210
       [<ffffffff81d12913>] sock_queue_rcv_skb+0x53/0x3a0
       [<ffffffff81f0a2b3>] raw_rcv+0x2a3/0x3c0
       [<ffffffff81f06eab>] can_rcv_filter+0x12b/0x370
       [<ffffffff81f07af9>] can_receive+0xd9/0x120
       [<ffffffff81f07beb>] can_rcv+0xab/0x100
       [<ffffffff81d362ac>] __netif_receive_skb_core+0xd8c/0x11f0
       [<ffffffff81d36734>] __netif_receive_skb+0x24/0xb0
       [<ffffffff81d37f67>] process_backlog+0x127/0x280
       [<ffffffff81d36f7b>] net_rx_action+0x33b/0x4f0
       [<ffffffff810c88d4>] __do_softirq+0x184/0x440
       [<ffffffff81f9e86c>] do_softirq_own_stack+0x1c/0x30
       <EOI>
       [<ffffffff810c76fb>] do_softirq.part.18+0x3b/0x40
       [<ffffffff810c8bed>] do_softirq+0x1d/0x20
       [<ffffffff81d30085>] netif_rx_ni+0xe5/0x110
       [<ffffffff8199cc87>] slcan_receive_buf+0x507/0x520
       [<ffffffff8167ef7c>] flush_to_ldisc+0x21c/0x230
       [<ffffffff810e3baf>] process_one_work+0x24f/0x670
       [<ffffffff810e44ed>] worker_thread+0x9d/0x6f0
       [<ffffffff810e4450>] ? rescuer_thread+0x480/0x480
       [<ffffffff810ebafc>] kthread+0x12c/0x150
       [<ffffffff81f9ccef>] ret_from_fork+0x3f/0x70
      Reported-by: default avatarZhang Yanmin <yanmin.zhang@intel.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      adf86d59
  2. 14 Feb, 2017 12 commits