1. 01 Mar, 2017 28 commits
    • David S. Miller's avatar
      irda: Fix lockdep annotations in hashbin_delete(). · 7132afee
      David S. Miller authored
      [ Upstream commit 4c03b862 ]
      
      A nested lock depth was added to the hasbin_delete() code but it
      doesn't actually work some well and results in tons of lockdep splats.
      
      Fix the code instead to properly drop the lock around the operation
      and just keep peeking the head of the hashbin queue.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7132afee
    • Andrey Konovalov's avatar
      dccp: fix freeing skb too early for IPV6_RECVPKTINFO · 336d459d
      Andrey Konovalov authored
      [ Upstream commit 5edabca9 ]
      
      In the current DCCP implementation an skb for a DCCP_PKT_REQUEST packet
      is forcibly freed via __kfree_skb in dccp_rcv_state_process if
      dccp_v6_conn_request successfully returns.
      
      However, if IPV6_RECVPKTINFO is set on a socket, the address of the skb
      is saved to ireq->pktopts and the ref count for skb is incremented in
      dccp_v6_conn_request, so skb is still in use. Nevertheless, it gets freed
      in dccp_rcv_state_process.
      
      Fix by calling consume_skb instead of doing goto discard and therefore
      calling __kfree_skb.
      
      Similar fixes for TCP:
      
      fb7e2399 [TCP]: skb is unexpectedly freed.
      0aea76d3 tcp: SYN packets are now
      simply consumed
      Signed-off-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      336d459d
    • Anoob Soman's avatar
      packet: Do not call fanout_release from atomic contexts · fbc87738
      Anoob Soman authored
      [ Upstream commit 2bd624b4 ]
      
      Commit 66644982 ("packet: call fanout_release, while UNREGISTERING a
      netdev"), unfortunately, introduced the following issues.
      
      1. calling mutex_lock(&fanout_mutex) (fanout_release()) from inside
      rcu_read-side critical section. rcu_read_lock disables preemption, most often,
      which prohibits calling sleeping functions.
      
      [  ] include/linux/rcupdate.h:560 Illegal context switch in RCU read-side critical section!
      [  ]
      [  ] rcu_scheduler_active = 1, debug_locks = 0
      [  ] 4 locks held by ovs-vswitchd/1969:
      [  ]  #0:  (cb_lock){++++++}, at: [<ffffffff8158a6c9>] genl_rcv+0x19/0x40
      [  ]  #1:  (ovs_mutex){+.+.+.}, at: [<ffffffffa04878ca>] ovs_vport_cmd_del+0x4a/0x100 [openvswitch]
      [  ]  #2:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81564157>] rtnl_lock+0x17/0x20
      [  ]  #3:  (rcu_read_lock){......}, at: [<ffffffff81614165>] packet_notifier+0x5/0x3f0
      [  ]
      [  ] Call Trace:
      [  ]  [<ffffffff813770c1>] dump_stack+0x85/0xc4
      [  ]  [<ffffffff810c9077>] lockdep_rcu_suspicious+0x107/0x110
      [  ]  [<ffffffff810a2da7>] ___might_sleep+0x57/0x210
      [  ]  [<ffffffff810a2fd0>] __might_sleep+0x70/0x90
      [  ]  [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0
      [  ]  [<ffffffff810de93f>] ? vprintk_default+0x1f/0x30
      [  ]  [<ffffffff81186e88>] ? printk+0x4d/0x4f
      [  ]  [<ffffffff816106dd>] fanout_release+0x1d/0xe0
      [  ]  [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
      
      2. calling mutex_lock(&fanout_mutex) inside spin_lock(&po->bind_lock).
      "sleeping function called from invalid context"
      
      [  ] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
      [  ] in_atomic(): 1, irqs_disabled(): 0, pid: 1969, name: ovs-vswitchd
      [  ] INFO: lockdep is turned off.
      [  ] Call Trace:
      [  ]  [<ffffffff813770c1>] dump_stack+0x85/0xc4
      [  ]  [<ffffffff810a2f52>] ___might_sleep+0x202/0x210
      [  ]  [<ffffffff810a2fd0>] __might_sleep+0x70/0x90
      [  ]  [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0
      [  ]  [<ffffffff816106dd>] fanout_release+0x1d/0xe0
      [  ]  [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
      
      3. calling dev_remove_pack(&fanout->prot_hook), from inside
      spin_lock(&po->bind_lock) or rcu_read-side critical-section. dev_remove_pack()
      -> synchronize_net(), which might sleep.
      
      [  ] BUG: scheduling while atomic: ovs-vswitchd/1969/0x00000002
      [  ] INFO: lockdep is turned off.
      [  ] Call Trace:
      [  ]  [<ffffffff813770c1>] dump_stack+0x85/0xc4
      [  ]  [<ffffffff81186274>] __schedule_bug+0x64/0x73
      [  ]  [<ffffffff8162b8cb>] __schedule+0x6b/0xd10
      [  ]  [<ffffffff8162c5db>] schedule+0x6b/0x80
      [  ]  [<ffffffff81630b1d>] schedule_timeout+0x38d/0x410
      [  ]  [<ffffffff810ea3fd>] synchronize_sched_expedited+0x53d/0x810
      [  ]  [<ffffffff810ea6de>] synchronize_rcu_expedited+0xe/0x10
      [  ]  [<ffffffff8154eab5>] synchronize_net+0x35/0x50
      [  ]  [<ffffffff8154eae3>] dev_remove_pack+0x13/0x20
      [  ]  [<ffffffff8161077e>] fanout_release+0xbe/0xe0
      [  ]  [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
      
      4. fanout_release() races with calls from different CPU.
      
      To fix the above problems, remove the call to fanout_release() under
      rcu_read_lock(). Instead, call __dev_remove_pack(&fanout->prot_hook) and
      netdev_run_todo will be happy that &dev->ptype_specific list is empty. In order
      to achieve this, I moved dev_{add,remove}_pack() out of fanout_{add,release} to
      __fanout_{link,unlink}. So, call to {,__}unregister_prot_hook() will make sure
      fanout->prot_hook is removed as well.
      
      [js] no rollover in 3.12
      
      Fixes: 66644982 ("packet: call fanout_release, while UNREGISTERING a netdev")
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarAnoob Soman <anoob.soman@citrix.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      fbc87738
    • Eric Dumazet's avatar
      packet: fix races in fanout_add() · 6d46193d
      Eric Dumazet authored
      [ Upstream commit d199fab6 ]
      
      Multiple threads can call fanout_add() at the same time.
      
      We need to grab fanout_mutex earlier to avoid races that could
      lead to one thread freeing po->rollover that was set by another thread.
      
      Do the same in fanout_release(), for peace of mind, and to help us
      finding lockdep issues earlier.
      
      [js] no rollover in 3.12
      
      Fixes: dc99f600 ("packet: Add fanout support.")
      Fixes: 0648ab70 ("packet: rollover prepare: per-socket state")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6d46193d
    • Eric Dumazet's avatar
      net/llc: avoid BUG_ON() in skb_orphan() · c112a93a
      Eric Dumazet authored
      [ Upstream commit 8b74d439 ]
      
      It seems nobody used LLC since linux-3.12.
      
      Fortunately fuzzers like syzkaller still know how to run this code,
      otherwise it would be no fun.
      
      Setting skb->sk without skb->destructor leads to all kinds of
      bugs, we now prefer to be very strict about it.
      
      Ideally here we would use skb_set_owner() but this helper does not exist yet,
      only CAN seems to have a private helper for that.
      
      [js] take sock_efree from 62bccb8c
      
      Fixes: 376c7311 ("net: add a temporary sanity check in skb_orphan()")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c112a93a
    • Colin Ian King's avatar
      rtc: interface: ignore expired timers when enqueuing new timers · 8cf69a63
      Colin Ian King authored
      commit 2b2f5ff0 upstream.
      
      This patch fixes a RTC wakealarm issue, namely, the event fires during
      hibernate and is not cleared from the list, causing hwclock to block.
      
      The current enqueuing does not trigger an alarm if any expired timers
      already exist on the timerqueue. This can occur when a RTC wake alarm
      is used to wake a machine out of hibernate and the resumed state has
      old expired timers that have not been removed from the timer queue.
      This fix skips over any expired timers and triggers an alarm if there
      are no pending timers on the timerqueue. Note that the skipped expired
      timer will get reaped later on, so there is no need to clean it up
      immediately.
      
      The issue can be reproduced by putting a machine into hibernate and
      waking it with the RTC wakealarm.  Running the example RTC test program
      from tools/testing/selftests/timers/rtctest.c after the hibernate will
      block indefinitely.  With the fix, it no longer blocks after the
      hibernate resume.
      
      BugLink: http://bugs.launchpad.net/bugs/1333569Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarAlexandre Belloni <alexandre.belloni@free-electrons.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      8cf69a63
    • Sergey Senozhatsky's avatar
      printk: use rcuidle console tracepoint · 035a1e6f
      Sergey Senozhatsky authored
      commit fc98c3c8 upstream.
      
      Use rcuidle console tracepoint because, apparently, it may be issued
      from an idle CPU:
      
        hw-breakpoint: Failed to enable monitor mode on CPU 0.
        hw-breakpoint: CPU 0 failed to disable vector catch
      
        ===============================
        [ ERR: suspicious RCU usage.  ]
        4.10.0-rc8-next-20170215+ #119 Not tainted
        -------------------------------
        ./include/trace/events/printk.h:32 suspicious rcu_dereference_check() usage!
      
        other info that might help us debug this:
      
        RCU used illegally from idle CPU!
        rcu_scheduler_active = 2, debug_locks = 0
        RCU used illegally from extended quiescent state!
        2 locks held by swapper/0/0:
         #0:  (cpu_pm_notifier_lock){......}, at: [<c0237e2c>] cpu_pm_exit+0x10/0x54
         #1:  (console_lock){+.+.+.}, at: [<c01ab350>] vprintk_emit+0x264/0x474
      
        stack backtrace:
        CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170215+ #119
        Hardware name: Generic OMAP4 (Flattened Device Tree)
          console_unlock
          vprintk_emit
          vprintk_default
          printk
          reset_ctrl_regs
          dbg_cpu_pm_notify
          notifier_call_chain
          cpu_pm_exit
          omap_enter_idle_coupled
          cpuidle_enter_state
          cpuidle_enter_state_coupled
          do_idle
          cpu_startup_entry
          start_kernel
      
      This RCU warning, however, is suppressed by lockdep_off() in printk().
      lockdep_off() increments the ->lockdep_recursion counter and thus
      disables RCU_LOCKDEP_WARN() and debug_lockdep_rcu_enabled(), which want
      lockdep to be enabled "current->lockdep_recursion == 0".
      
      Link: http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.comSigned-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reported-by: default avatarTony Lindgren <tony@atomide.com>
      Tested-by: default avatarTony Lindgren <tony@atomide.com>
      Acked-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Russell King <rmk@armlinux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      035a1e6f
    • Yang Yang's avatar
      futex: Move futex_init() to core_initcall · 66074989
      Yang Yang authored
      commit 25f71d1c upstream.
      
      The UEVENT user mode helper is enabled before the initcalls are executed
      and is available when the root filesystem has been mounted.
      
      The user mode helper is triggered by device init calls and the executable
      might use the futex syscall.
      
      futex_init() is marked __initcall which maps to device_initcall, but there
      is no guarantee that futex_init() is invoked _before_ the first device init
      call which triggers the UEVENT user mode helper.
      
      If the user mode helper uses the futex syscall before futex_init() then the
      syscall crashes with a NULL pointer dereference because the futex subsystem
      has not been initialized yet.
      
      Move futex_init() to core_initcall so futexes are initialized before the
      root filesystem is mounted and the usermode helper becomes available.
      
      [ tglx: Rewrote changelog ]
      Signed-off-by: default avatarYang Yang <yang.yang29@zte.com.cn>
      Cc: jiang.biao2@zte.com.cn
      Cc: jiang.zhengxiong@zte.com.cn
      Cc: zhong.weidong@zte.com.cn
      Cc: deng.huali@zte.com.cn
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1483085875-6130-1-git-send-email-yang.yang29@zte.com.cnSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      66074989
    • Johannes Thumshirn's avatar
      scsi: don't BUG_ON() empty DMA transfers · a1c59352
      Johannes Thumshirn authored
      commit fd3fc0b4 upstream.
      
      Don't crash the machine just because of an empty transfer. Use WARN_ON()
      combined with returning an error.
      
      Found by Dmitry Vyukov and syzkaller.
      
      [ Changed to "WARN_ON_ONCE()". Al has a patch that should fix the root
        cause, but a BUG_ON() is not acceptable in any case, and a WARN_ON()
        might still be a cause of excessive log spamming.
      
        NOTE! If this warning ever triggers, we may end up leaking resources,
        since this doesn't bother to try to clean the command up. So this
        WARN_ON_ONCE() triggering does imply real problems. But BUG_ON() is
        much worse.
      
        People really need to stop using BUG_ON() for "this shouldn't ever
        happen". It makes pretty much any bug worse.     - Linus ]
      Signed-off-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: James Bottomley <jejb@linux.vnet.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a1c59352
    • Mauro Carvalho Chehab's avatar
      siano: make it work again with CONFIG_VMAP_STACK · 405a5ed0
      Mauro Carvalho Chehab authored
      commit f9c85ee6 upstream.
      
      Reported as a Kaffeine bug:
      	https://bugs.kde.org/show_bug.cgi?id=375811
      
      The USB control messages require DMA to work. We cannot pass
      a stack-allocated buffer, as it is not warranted that the
      stack would be into a DMA enabled area.
      
      On Kernel 4.9, the default is to not accept DMA on stack anymore
      on x86 architecture. On other architectures, this has been a
      requirement since Kernel 2.2. So, after this patch, this driver
      should likely work fine on all archs.
      
      Tested with USB ID 2040:5510: Hauppauge Windham
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      405a5ed0
    • Miklos Szeredi's avatar
      vfs: fix uninitialized flags in splice_to_pipe() · 278f2fd4
      Miklos Szeredi authored
      commit 5a81e6a1 upstream.
      
      Flags (PIPE_BUF_FLAG_PACKET, PIPE_BUF_FLAG_GIFT) could remain on the
      unused part of the pipe ring buffer.  Previously splice_to_pipe() left
      the flags value alone, which could result in incorrect behavior.
      
      Uninitialized flags appears to have been there from the introduction of
      the splice syscall.
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      278f2fd4
    • Christoph Hellwig's avatar
      scsi: move the nr_phys_segments assert into scsi_init_io · b035da65
      Christoph Hellwig authored
      commit 635d98b1 upstream.
      
      scsi_init_io should only be called for requests that transfer data,
      so move the assert that a request has segments from the callers into
      scsi_init_io.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: default avatarHannes Reinecke <hare@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b035da65
    • Eric Dumazet's avatar
      l2tp: do not use udp_ioctl() · f7c55cea
      Eric Dumazet authored
      [ Upstream commit 72fb96e7 ]
      
      udp_ioctl(), as its name suggests, is used by UDP protocols,
      but is also used by L2TP :(
      
      L2TP should use its own handler, because it really does not
      look the same.
      
      SIOCINQ for instance should not assume UDP checksum or headers.
      
      Thanks to Andrey and syzkaller team for providing the report
      and a nice reproducer.
      
      While crashes only happen on recent kernels (after commit
      7c13f97f ("udp: do fwd memory scheduling on dequeue")), this
      probably needs to be backported to older kernels.
      
      Fixes: 7c13f97f ("udp: do fwd memory scheduling on dequeue")
      Fixes: 85584672 ("udp: Fix udp_poll() and ioctl()")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Acked-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f7c55cea
    • WANG Cong's avatar
      ping: fix a null pointer dereference · 25c09419
      WANG Cong authored
      [ Upstream commit 73d2c667 ]
      
      Andrey reported a kernel crash:
      
        general protection fault: 0000 [#1] SMP KASAN
        Dumping ftrace buffer:
           (ftrace buffer empty)
        Modules linked in:
        CPU: 2 PID: 3880 Comm: syz-executor1 Not tainted 4.10.0-rc6+ #124
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        task: ffff880060048040 task.stack: ffff880069be8000
        RIP: 0010:ping_v4_push_pending_frames net/ipv4/ping.c:647 [inline]
        RIP: 0010:ping_v4_sendmsg+0x1acd/0x23f0 net/ipv4/ping.c:837
        RSP: 0018:ffff880069bef8b8 EFLAGS: 00010206
        RAX: dffffc0000000000 RBX: ffff880069befb90 RCX: 0000000000000000
        RDX: 0000000000000018 RSI: ffff880069befa30 RDI: 00000000000000c2
        RBP: ffff880069befbb8 R08: 0000000000000008 R09: 0000000000000000
        R10: 0000000000000002 R11: 0000000000000000 R12: ffff880069befab0
        R13: ffff88006c624a80 R14: ffff880069befa70 R15: 0000000000000000
        FS:  00007f6f7c716700(0000) GS:ffff88006de00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00000000004a6f28 CR3: 000000003a134000 CR4: 00000000000006e0
        Call Trace:
         inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
         sock_sendmsg_nosec net/socket.c:635 [inline]
         sock_sendmsg+0xca/0x110 net/socket.c:645
         SYSC_sendto+0x660/0x810 net/socket.c:1687
         SyS_sendto+0x40/0x50 net/socket.c:1655
         entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      This is because we miss a check for NULL pointer for skb_peek() when
      the queue is empty. Other places already have the same check.
      
      Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind")
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      25c09419
    • Willem de Bruijn's avatar
      packet: round up linear to header len · da00bddd
      Willem de Bruijn authored
      [ Upstream commit 57031eb7 ]
      
      Link layer protocols may unconditionally pull headers, as Ethernet
      does in eth_type_trans. Ensure that the entire link layer header
      always lies in the skb linear segment. tpacket_snd has such a check.
      Extend this to packet_snd.
      
      Variable length link layer headers complicate the computation
      somewhat. Here skb->len may be smaller than dev->hard_header_len.
      
      Round up the linear length to be at least as long as the smallest of
      the two.
      
      [js] no virtio helpers in 3.12
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      da00bddd
    • Marcelo Ricardo Leitner's avatar
      sctp: avoid BUG_ON on sctp_wait_for_sndbuf · 7a814bf5
      Marcelo Ricardo Leitner authored
      [ Upstream commit 2dcab598 ]
      
      Alexander Popov reported that an application may trigger a BUG_ON in
      sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is
      waiting on it to queue more data and meanwhile another thread peels off
      the association being used by the first thread.
      
      This patch replaces the BUG_ON call with a proper error handling. It
      will return -EPIPE to the original sendmsg call, similarly to what would
      have been done if the association wasn't found in the first place.
      Acked-by: default avatarAlexander Popov <alex.popov@linux.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Reviewed-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7a814bf5
    • Willem de Bruijn's avatar
      macvtap: read vnet_hdr_size once · e1413a0e
      Willem de Bruijn authored
      [ Upstream commit 837585a5 ]
      
      When IFF_VNET_HDR is enabled, a virtio_net header must precede data.
      Data length is verified to be greater than or equal to expected header
      length tun->vnet_hdr_sz before copying.
      
      Macvtap functions read the value once, but unless READ_ONCE is used,
      the compiler may ignore this and read multiple times. Enforce a single
      read and locally cached value to avoid updates between test and use.
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Suggested-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e1413a0e
    • Willem de Bruijn's avatar
      tun: read vnet_hdr_sz once · 5cc7e001
      Willem de Bruijn authored
      [ Upstream commit e1edab87 ]
      
      When IFF_VNET_HDR is enabled, a virtio_net header must precede data.
      Data length is verified to be greater than or equal to expected header
      length tun->vnet_hdr_sz before copying.
      
      Read this value once and cache locally, as it can be updated between
      the test and use (TOCTOU).
      
      [js] we have TUN_VNET_HDR in 3.12
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      CC: Eric Dumazet <edumazet@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      5cc7e001
    • Eric Dumazet's avatar
      tcp: avoid infinite loop in tcp_splice_read() · 917c6663
      Eric Dumazet authored
      [ Upstream commit ccf7abb9 ]
      
      Splicing from TCP socket is vulnerable when a packet with URG flag is
      received and stored into receive queue.
      
      __tcp_splice_read() returns 0, and sk_wait_data() immediately
      returns since there is the problematic skb in queue.
      
      This is a nice way to burn cpu (aka infinite loop) and trigger
      soft lockups.
      
      Again, this gem was found by syzkaller tool.
      
      Fixes: 9c55e01c ("[TCP]: Splice receive support.")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov  <dvyukov@google.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      917c6663
    • Eric Dumazet's avatar
      ip6_gre: fix ip6gre_err() invalid reads · 1ae4b12d
      Eric Dumazet authored
      [ Upstream commit 7892032c ]
      
      Andrey Konovalov reported out of bound accesses in ip6gre_err()
      
      If GRE flags contains GRE_KEY, the following expression
      *(((__be32 *)p) + (grehlen / 4) - 1)
      
      accesses data ~40 bytes after the expected point, since
      grehlen includes the size of IPv6 headers.
      
      Let's use a "struct gre_base_hdr *greh" pointer to make this
      code more readable.
      
      p[1] becomes greh->protocol.
      grhlen is the GRE header length.
      
      Fixes: c12b395a ("gre: Support GRE over IPv6")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1ae4b12d
    • Eric Dumazet's avatar
      netlabel: out of bound access in cipso_v4_validate() · 0bccd6e1
      Eric Dumazet authored
      [ Upstream commit d71b7896 ]
      
      syzkaller found another out of bound access in ip_options_compile(),
      or more exactly in cipso_v4_validate()
      
      Fixes: 20e2a864 ("cipso: handle CIPSO options correctly when NetLabel is disabled")
      Fixes: 446fda4f ("[NetLabel]: CIPSOv4 engine")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov  <dvyukov@google.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Acked-by: default avatarPaul Moore <paul@paul-moore.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0bccd6e1
    • Eric Dumazet's avatar
      ipv4: keep skb->dst around in presence of IP options · e5a82d66
      Eric Dumazet authored
      [ Upstream commit 34b2cef2 ]
      
      Andrey Konovalov got crashes in __ip_options_echo() when a NULL skb->dst
      is accessed.
      
      ipv4_pktinfo_prepare() should not drop the dst if (evil) IP options
      are present.
      
      We could refine the test to the presence of ts_needtime or srr,
      but IP options are not often used, so let's be conservative.
      
      Thanks to syzkaller team for finding this bug.
      
      Fixes: d826eb14 ("ipv4: PKTINFO doesnt need dst reference")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e5a82d66
    • Eric Dumazet's avatar
      net: use a work queue to defer net_disable_timestamp() work · 33b07484
      Eric Dumazet authored
      [ Upstream commit 5fa8bbda ]
      
      Dmitry reported a warning [1] showing that we were calling
      net_disable_timestamp() -> static_key_slow_dec() from a non
      process context.
      
      Grabbing a mutex while holding a spinlock or rcu_read_lock()
      is not allowed.
      
      As Cong suggested, we now use a work queue.
      
      It is possible netstamp_clear() exits while netstamp_needed_deferred
      is not zero, but it is probably not worth trying to do better than that.
      
      netstamp_needed_deferred atomic tracks the exact number of deferred
      decrements.
      
      [1]
      [ INFO: suspicious RCU usage. ]
      4.10.0-rc5+ #192 Not tainted
      -------------------------------
      ./include/linux/rcupdate.h:561 Illegal context switch in RCU read-side
      critical section!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 0
      2 locks held by syz-executor14/23111:
       #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83a35c35>] lock_sock
      include/net/sock.h:1454 [inline]
       #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83a35c35>]
      rawv6_sendmsg+0x1e65/0x3ec0 net/ipv6/raw.c:919
       #1:  (rcu_read_lock){......}, at: [<ffffffff83ae2678>] nf_hook
      include/linux/netfilter.h:201 [inline]
       #1:  (rcu_read_lock){......}, at: [<ffffffff83ae2678>]
      __ip6_local_out+0x258/0x840 net/ipv6/output_core.c:160
      
      stack backtrace:
      CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:15 [inline]
       dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
       lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452
       rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline]
       ___might_sleep+0x560/0x650 kernel/sched/core.c:7748
       __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
       mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
       atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
       __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
       static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
       net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
       sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
       __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
       sk_destruct+0x47/0x80 net/core/sock.c:1460
       __sk_free+0x57/0x230 net/core/sock.c:1468
       sock_wfree+0xae/0x120 net/core/sock.c:1645
       skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
       skb_release_all+0x15/0x60 net/core/skbuff.c:668
       __kfree_skb+0x15/0x20 net/core/skbuff.c:684
       kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
       inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
       inet_frag_put include/net/inet_frag.h:133 [inline]
       nf_ct_frag6_gather+0x1106/0x3840
      net/ipv6/netfilter/nf_conntrack_reasm.c:617
       ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
       nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
       nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
       nf_hook include/linux/netfilter.h:212 [inline]
       __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
       ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
       ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
       ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
       rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
       rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
       inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
       sock_sendmsg_nosec net/socket.c:635 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:645
       sock_write_iter+0x326/0x600 net/socket.c:848
       do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
       do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
       vfs_writev+0x87/0xc0 fs/read_write.c:911
       do_writev+0x110/0x2c0 fs/read_write.c:944
       SYSC_writev fs/read_write.c:1017 [inline]
       SyS_writev+0x27/0x30 fs/read_write.c:1014
       entry_SYSCALL_64_fastpath+0x1f/0xc2
      RIP: 0033:0x445559
      RSP: 002b:00007f6f46fceb58 EFLAGS: 00000292 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 0000000000445559
      RDX: 0000000000000001 RSI: 0000000020f1eff0 RDI: 0000000000000005
      RBP: 00000000006e19c0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000700000
      R13: 0000000020f59000 R14: 0000000000000015 R15: 0000000000020400
      BUG: sleeping function called from invalid context at
      kernel/locking/mutex.c:752
      in_atomic(): 1, irqs_disabled(): 0, pid: 23111, name: syz-executor14
      INFO: lockdep is turned off.
      CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:15 [inline]
       dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
       ___might_sleep+0x47e/0x650 kernel/sched/core.c:7780
       __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
       mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
       atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
       __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
       static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
       net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
       sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
       __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
       sk_destruct+0x47/0x80 net/core/sock.c:1460
       __sk_free+0x57/0x230 net/core/sock.c:1468
       sock_wfree+0xae/0x120 net/core/sock.c:1645
       skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
       skb_release_all+0x15/0x60 net/core/skbuff.c:668
       __kfree_skb+0x15/0x20 net/core/skbuff.c:684
       kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
       inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
       inet_frag_put include/net/inet_frag.h:133 [inline]
       nf_ct_frag6_gather+0x1106/0x3840
      net/ipv6/netfilter/nf_conntrack_reasm.c:617
       ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
       nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
       nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
       nf_hook include/linux/netfilter.h:212 [inline]
       __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
       ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
       ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
       ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
       rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
       rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
       inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
       sock_sendmsg_nosec net/socket.c:635 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:645
       sock_write_iter+0x326/0x600 net/socket.c:848
       do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
       do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
       vfs_writev+0x87/0xc0 fs/read_write.c:911
       do_writev+0x110/0x2c0 fs/read_write.c:944
       SYSC_writev fs/read_write.c:1017 [inline]
       SyS_writev+0x27/0x30 fs/read_write.c:1014
       entry_SYSCALL_64_fastpath+0x1f/0xc2
      RIP: 0033:0x445559
      
      Fixes: b90e5794 ("net: dont call jump_label_dec from irq context")
      Suggested-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      33b07484
    • Eric Dumazet's avatar
      tcp: fix 0 divide in __tcp_select_window() · c48ae3c2
      Eric Dumazet authored
      [ Upstream commit 06425c30 ]
      
      syszkaller fuzzer was able to trigger a divide by zero, when
      TCP window scaling is not enabled.
      
      SO_RCVBUF can be used not only to increase sk_rcvbuf, also
      to decrease it below current receive buffers utilization.
      
      If mss is negative or 0, just return a zero TCP window.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov  <dvyukov@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c48ae3c2
    • Dan Carpenter's avatar
      ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim() · 47ba15dc
      Dan Carpenter authored
      [ Upstream commit 63117f09 ]
      
      Casting is a high precedence operation but "off" and "i" are in terms of
      bytes so we need to have some parenthesis here.
      
      Fixes: fbfa743a ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      47ba15dc
    • Eric Dumazet's avatar
      ipv6: fix ip6_tnl_parse_tlv_enc_lim() · decccc92
      Eric Dumazet authored
      [ Upstream commit fbfa743a ]
      
      This function suffers from multiple issues.
      
      First one is that pskb_may_pull() may reallocate skb->head,
      so the 'raw' pointer needs either to be reloaded or not used at all.
      
      Second issue is that NEXTHDR_DEST handling does not validate
      that the options are present in skb->data, so we might read
      garbage or access non existent memory.
      
      With help from Willem de Bruijn.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov  <dvyukov@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      decccc92
    • Eric Dumazet's avatar
      can: Fix kernel panic at security_sock_rcv_skb · d900adb4
      Eric Dumazet authored
      [ Upstream commit f1712c73 ]
      
      Zhang Yanmin reported crashes [1] and provided a patch adding a
      synchronize_rcu() call in can_rx_unregister()
      
      The main problem seems that the sockets themselves are not RCU
      protected.
      
      If CAN uses RCU for delivery, then sockets should be freed only after
      one RCU grace period.
      
      Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
      ease stable backports with the following fix instead.
      
      [1]
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<ffffffff81495e25>] selinux_socket_sock_rcv_skb+0x65/0x2a0
      
      Call Trace:
       <IRQ>
       [<ffffffff81485d8c>] security_sock_rcv_skb+0x4c/0x60
       [<ffffffff81d55771>] sk_filter+0x41/0x210
       [<ffffffff81d12913>] sock_queue_rcv_skb+0x53/0x3a0
       [<ffffffff81f0a2b3>] raw_rcv+0x2a3/0x3c0
       [<ffffffff81f06eab>] can_rcv_filter+0x12b/0x370
       [<ffffffff81f07af9>] can_receive+0xd9/0x120
       [<ffffffff81f07beb>] can_rcv+0xab/0x100
       [<ffffffff81d362ac>] __netif_receive_skb_core+0xd8c/0x11f0
       [<ffffffff81d36734>] __netif_receive_skb+0x24/0xb0
       [<ffffffff81d37f67>] process_backlog+0x127/0x280
       [<ffffffff81d36f7b>] net_rx_action+0x33b/0x4f0
       [<ffffffff810c88d4>] __do_softirq+0x184/0x440
       [<ffffffff81f9e86c>] do_softirq_own_stack+0x1c/0x30
       <EOI>
       [<ffffffff810c76fb>] do_softirq.part.18+0x3b/0x40
       [<ffffffff810c8bed>] do_softirq+0x1d/0x20
       [<ffffffff81d30085>] netif_rx_ni+0xe5/0x110
       [<ffffffff8199cc87>] slcan_receive_buf+0x507/0x520
       [<ffffffff8167ef7c>] flush_to_ldisc+0x21c/0x230
       [<ffffffff810e3baf>] process_one_work+0x24f/0x670
       [<ffffffff810e44ed>] worker_thread+0x9d/0x6f0
       [<ffffffff810e4450>] ? rescuer_thread+0x480/0x480
       [<ffffffff810ebafc>] kthread+0x12c/0x150
       [<ffffffff81f9ccef>] ret_from_fork+0x3f/0x70
      Reported-by: default avatarZhang Yanmin <yanmin.zhang@intel.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d900adb4
    • Herbert Xu's avatar
      tun: Fix TUN_PKT_STRIP setting · 935eb896
      Herbert Xu authored
      commit 2eb783c4 upstream.
      
      We set the flag TUN_PKT_STRIP if the user buffer provided is too
      small to contain the entire packet plus meta-data.  However, this
      has been broken ever since we added GSO meta-data.  VLAN acceleration
      also has the same problem.
      
      This patch fixes this by taking both into account when setting the
      TUN_PKT_STRIP flag.
      
      The fact that this has been broken for six years without anyone
      realising means that nobody actually uses this flag.
      
      Fixes: f43798c2 ("tun: Allow GSO using virtio_net_hdr")
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      935eb896
  2. 16 Feb, 2017 12 commits