1. 10 Feb, 2017 40 commits
    • Eric Dumazet's avatar
      tcp: take care of truncations done by sk_filter() · 56325d9f
      Eric Dumazet authored
      commit ac6e7800 upstream.
      
      With syzkaller help, Marco Grassi found a bug in TCP stack,
      crashing in tcp_collapse()
      
      Root cause is that sk_filter() can truncate the incoming skb,
      but TCP stack was not really expecting this to happen.
      It probably was expecting a simple DROP or ACCEPT behavior.
      
      We first need to make sure no part of TCP header could be removed.
      Then we need to adjust TCP_SKB_CB(skb)->end_seq
      
      Many thanks to syzkaller team and Marco for giving us a reproducer.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarMarco Grassi <marco.gra@gmail.com>
      Reported-by: default avatarVladis Dronov <vdronov@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      56325d9f
    • Douglas Caetano dos Santos's avatar
      tcp: fix wrong checksum calculation on MTU probing · d318f82f
      Douglas Caetano dos Santos authored
      commit 2fe664f1 upstream.
      
      With TCP MTU probing enabled and offload TX checksumming disabled,
      tcp_mtu_probe() calculated the wrong checksum when a fragment being copied
      into the probe's SKB had an odd length. This was caused by the direct use
      of skb_copy_and_csum_bits() to calculate the checksum, as it pads the
      fragment being copied, if needed. When this fragment was not the last, a
      subsequent call used the previous checksum without considering this
      padding.
      
      The effect was a stale connection in one way, as even retransmissions
      wouldn't solve the problem, because the checksum was never recalculated for
      the full SKB length.
      Signed-off-by: default avatarDouglas Caetano dos Santos <douglascs@taghos.com.br>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d318f82f
    • Eric Dumazet's avatar
      tcp: fix overflow in __tcp_retransmit_skb() · 93522d31
      Eric Dumazet authored
      commit ffb4d6c8 upstream.
      
      If a TCP socket gets a large write queue, an overflow can happen
      in a test in __tcp_retransmit_skb() preventing all retransmits.
      
      The flow then stalls and resets after timeouts.
      
      Tested:
      
      sysctl -w net.core.wmem_max=1000000000
      netperf -H dest -- -s 1000000000
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      93522d31
    • Eric Dumazet's avatar
      tcp: properly scale window in tcp_v[46]_reqsk_send_ack() · 1c50d3ae
      Eric Dumazet authored
      commit 20a2b49f upstream.
      
      When sending an ack in SYN_RECV state, we must scale the offered
      window if wscale option was negotiated and accepted.
      
      Tested:
       Following packetdrill test demonstrates the issue :
      
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      
      +0 bind(3, ..., ...) = 0
      +0 listen(3, 1) = 0
      
      // Establish a connection.
      +0 < S 0:0(0) win 20000 <mss 1000,sackOK,wscale 7, nop, TS val 100 ecr 0>
      +0 > S. 0:0(0) ack 1 win 28960 <mss 1460,sackOK, TS val 100 ecr 100, nop, wscale 7>
      
      +0 < . 1:11(10) ack 1 win 156 <nop,nop,TS val 99 ecr 100>
      // check that window is properly scaled !
      +0 > . 1:1(0) ack 1 win 226 <nop,nop,TS val 200 ecr 100>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1c50d3ae
    • Eric Dumazet's avatar
      tcp: fix use after free in tcp_xmit_retransmit_queue() · 13403121
      Eric Dumazet authored
      commit bb1fceca upstream.
      
      When tcp_sendmsg() allocates a fresh and empty skb, it puts it at the
      tail of the write queue using tcp_add_write_queue_tail()
      
      Then it attempts to copy user data into this fresh skb.
      
      If the copy fails, we undo the work and remove the fresh skb.
      
      Unfortunately, this undo lacks the change done to tp->highest_sack and
      we can leave a dangling pointer (to a freed skb)
      
      Later, tcp_xmit_retransmit_queue() can dereference this pointer and
      access freed memory. For regular kernels where memory is not unmapped,
      this might cause SACK bugs because tcp_highest_sack_seq() is buggy,
      returning garbage instead of tp->snd_nxt, but with various debug
      features like CONFIG_DEBUG_PAGEALLOC, this can crash the kernel.
      
      This bug was found by Marco Grassi thanks to syzkaller.
      
      Fixes: 6859d494 ("[TCP]: Abstract tp->highest_sack accessing & point to next skb")
      Reported-by: default avatarMarco Grassi <marco.gra@gmail.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Reviewed-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      13403121
    • Vegard Nossum's avatar
      net/irda: handle iriap_register_lsap() allocation failure · 942878ee
      Vegard Nossum authored
      commit 5ba092ef upstream.
      
      If iriap_register_lsap() fails to allocate memory, self->lsap is
      set to NULL. However, none of the callers handle the failure and
      irlmp_connect_request() will happily dereference it:
      
          iriap_register_lsap: Unable to allocated LSAP!
          ================================================================================
          UBSAN: Undefined behaviour in net/irda/irlmp.c:378:2
          member access within null pointer of type 'struct lsap_cb'
          CPU: 1 PID: 15403 Comm: trinity-c0 Not tainted 4.8.0-rc1+ #81
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org
          04/01/2014
           0000000000000000 ffff88010c7e78a8 ffffffff82344f40 0000000041b58ab3
           ffffffff84f98000 ffffffff82344e94 ffff88010c7e78d0 ffff88010c7e7880
           ffff88010630ad00 ffffffff84a5fae0 ffffffff84d3f5c0 000000000000017a
          Call Trace:
           [<ffffffff82344f40>] dump_stack+0xac/0xfc
           [<ffffffff8242f5a8>] ubsan_epilogue+0xd/0x8a
           [<ffffffff824302bf>] __ubsan_handle_type_mismatch+0x157/0x411
           [<ffffffff83b7bdbc>] irlmp_connect_request+0x7ac/0x970
           [<ffffffff83b77cc0>] iriap_connect_request+0xa0/0x160
           [<ffffffff83b77f48>] state_s_disconnect+0x88/0xd0
           [<ffffffff83b78904>] iriap_do_client_event+0x94/0x120
           [<ffffffff83b77710>] iriap_getvaluebyclass_request+0x3e0/0x6d0
           [<ffffffff83ba6ebb>] irda_find_lsap_sel+0x1eb/0x630
           [<ffffffff83ba90c8>] irda_connect+0x828/0x12d0
           [<ffffffff833c0dfb>] SYSC_connect+0x22b/0x340
           [<ffffffff833c7e09>] SyS_connect+0x9/0x10
           [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
           [<ffffffff845f946a>] entry_SYSCALL64_slow_path+0x25/0x25
          ================================================================================
      
      The bug seems to have been around since forever.
      
      There's more problems with missing error checks in iriap_init() (and
      indeed all of irda_init()), but that's a bigger problem that needs
      very careful review and testing. This patch will fix the most serious
      bug (as it's easily reached from unprivileged userspace).
      
      I have tested my patch with a reproducer.
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      942878ee
    • Paolo Abeni's avatar
      ip6_tunnel: disable caching when the traffic class is inherited · 68836e49
      Paolo Abeni authored
      commit b5c2d495 upstream.
      
      If an ip6 tunnel is configured to inherit the traffic class from
      the inner header, the dst_cache must be disabled or it will foul
      the policy routing.
      
      The issue is apprently there since at leat Linux-2.6.12-rc2.
      Reported-by: default avatarLiam McBirnie <liam.mcbirnie@boeing.com>
      Cc: Liam McBirnie <liam.mcbirnie@boeing.com>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      68836e49
    • Eli Cooper's avatar
      ip6_tunnel: Clear IP6CB in ip6tunnel_xmit() · b030cd1a
      Eli Cooper authored
      commit 23f4ffed upstream.
      
      skb->cb may contain data from previous layers. In the observed scenario,
      the garbage data were misinterpreted as IP6CB(skb)->frag_max_size, so
      that small packets sent through the tunnel are mistakenly fragmented.
      
      This patch unconditionally clears the control buffer in ip6tunnel_xmit(),
      which affects ip6_tunnel, ip6_udp_tunnel and ip6_gre. Currently none of
      these tunnels set IP6CB(skb)->flags, otherwise it needs to be done earlier.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarEli Cooper <elicooper@gmx.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      b030cd1a
    • Eric Dumazet's avatar
      ipv6: dccp: add missing bind_conflict to dccp_ipv6_mapped · c3a924e1
      Eric Dumazet authored
      commit 990ff4d8 upstream.
      
      While fuzzing kernel with syzkaller, Andrey reported a nasty crash
      in inet6_bind() caused by DCCP lacking a required method.
      
      Fixes: ab1e0a13 ("[SOCK] proto: Add hashinfo member to struct proto")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      c3a924e1
    • Eric Dumazet's avatar
      ipv6: dccp: fix out of bound access in dccp_v6_err() · bd380617
      Eric Dumazet authored
      commit 1aa9d1a0 upstream.
      
      dccp_v6_err() does not use pskb_may_pull() and might access garbage.
      
      We only need 4 bytes at the beginning of the DCCP header, like TCP,
      so the 8 bytes pulled in icmpv6_notify() are more than enough.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      bd380617
    • Nicolas Dichtel's avatar
      ipv6: correctly add local routes when lo goes up · 14ba02f9
      Nicolas Dichtel authored
      commit a220445f upstream.
      
      The goal of the patch is to fix this scenario:
       ip link add dummy1 type dummy
       ip link set dummy1 up
       ip link set lo down ; ip link set lo up
      
      After that sequence, the local route to the link layer address of dummy1 is
      not there anymore.
      
      When the loopback is set down, all local routes are deleted by
      addrconf_ifdown()/rt6_ifdown(). At this time, the rt6_info entry still
      exists, because the corresponding idev has a reference on it. After the rcu
      grace period, dst_rcu_free() is called, and thus ___dst_free(), which will
      set obsolete to DST_OBSOLETE_DEAD.
      
      In this case, init_loopback() is called before dst_rcu_free(), thus
      obsolete is still sets to something <= 0. So, the function doesn't add the
      route again. To avoid that race, let's check the rt6 refcnt instead.
      
      Fixes: 25fb6ca4 ("net IPv6 : Fix broken IPv6 routing table after loopback down-up")
      Fixes: a881ae1f ("ipv6: don't call addrconf_dst_alloc again when enable lo")
      Fixes: 33d99113 ("ipv6: reallocate addrconf router for ipv6 address when lo device up")
      Reported-by: default avatarFrancesco Santoro <francesco.santoro@6wind.com>
      Reported-by: default avatarSamuel Gauthier <samuel.gauthier@6wind.com>
      CC: Balakumaran Kannan <Balakumaran.Kannan@ap.sony.com>
      CC: Maruthi Thotad <Maruthi.Thotad@ap.sony.com>
      CC: Sabrina Dubroca <sd@queasysnail.net>
      CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
      CC: Weilong Chen <chenweilong@huawei.com>
      CC: Gao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      14ba02f9
    • Lance Richardson's avatar
      ip6_gre: fix flowi6_proto value in ip6gre_xmit_other() · 1735aaba
      Lance Richardson authored
      commit db32e4e4 upstream.
      
      Similar to commit 3be07244 ("ip6_gre: fix flowi6_proto value in
      xmit path"), set flowi6_proto to IPPROTO_GRE for output route lookup.
      
      Up until now, ip6gre_xmit_other() has set flowi6_proto to a bogus value.
      This affected output route lookup for packets sent on an ip6gretap device
      in cases where routing was dependent on the value of flowi6_proto.
      
      Since the correct proto is already set in the tunnel flowi6 template via
      commit 252f3f5a ("ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit
      path."), simply delete the line setting the incorrect flowi6_proto value.
      Suggested-by: default avatarJiri Benc <jbenc@redhat.com>
      Fixes: c12b395a ("gre: Support GRE over IPv6")
      Reviewed-by: default avatarShmulik Ladkani <shmulik.ladkani@gmail.com>
      Signed-off-by: default avatarLance Richardson <lrichard@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1735aaba
    • Sabrina Dubroca's avatar
      ipv6: fix rtnl locking in setsockopt for anycast and multicast · b64222a9
      Sabrina Dubroca authored
      commit a9ed4a29 upstream.
      
      Calling setsockopt with IPV6_JOIN_ANYCAST or IPV6_LEAVE_ANYCAST
      triggers the assertion in addrconf_join_solict()/addrconf_leave_solict()
      
      ipv6_sock_ac_join(), ipv6_sock_ac_drop(), ipv6_sock_ac_close() need to
      take RTNL before calling ipv6_dev_ac_inc/dec. Same thing with
      ipv6_sock_mc_join(), ipv6_sock_mc_drop(), ipv6_sock_mc_close() before
      calling ipv6_dev_mc_inc/dec.
      
      This patch moves ASSERT_RTNL() up a level in the call stack.
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Reported-by: default avatarTommi Rantala <tt.rantala@gmail.com>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: <stable@vger.kernel.org> # 3.10.y: b7b1bfce: ipv6: split dad and rs timers
      Cc: <stable@vger.kernel.org> # 3.10.y: c15b1cca: ipv6: move dad to workqueue
      Cc: <stable@vger.kernel.org> # 3.10.y
      [Mike Manning <mmanning@brocade.com>: resolved minor conflicts in addrconf.c]
      Signed-off-by: default avatarMike Manning <mmanning@brocade.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      b64222a9
    • Wei Yongjun's avatar
      ipv6: addrconf: fix dev refcont leak when DAD failed · 04355d67
      Wei Yongjun authored
      commit 751eb6b6 upstream.
      
      In general, when DAD detected IPv6 duplicate address, ifp->state
      will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
      delayed work, the call tree should be like this:
      
      ndisc_recv_ns
        -> addrconf_dad_failure        <- missing ifp put
           -> addrconf_mod_dad_work
             -> schedule addrconf_dad_work()
               -> addrconf_dad_stop()  <- missing ifp hold before call it
      
      addrconf_dad_failure() called with ifp refcont holding but not put.
      addrconf_dad_work() call addrconf_dad_stop() without extra holding
      refcount. This will not cause any issue normally.
      
      But the race between addrconf_dad_failure() and addrconf_dad_work()
      may cause ifp refcount leak and netdevice can not be unregister,
      dmesg show the following messages:
      
      IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
      ...
      unregister_netdevice: waiting for eth0 to become free. Usage count = 1
      
      Cc: stable@vger.kernel.org
      Fixes: c15b1cca ("ipv6: move DAD and addrconf_verify processing
      to workqueue")
      Signed-off-by: default avatarWei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: <stable@vger.kernel.org> # 3.10.y
      Signed-off-by: default avatarMike Manning <mmanning@brocade.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      04355d67
    • Hannes Frederic Sowa's avatar
      ipv6: move DAD and addrconf_verify processing to workqueue · 835b474b
      Hannes Frederic Sowa authored
      commit c15b1cca upstream.
      
      addrconf_join_solict and addrconf_join_anycast may cause actions which
      need rtnl locked, especially on first address creation.
      
      A new DAD state is introduced which defers processing of the initial
      DAD processing into a workqueue.
      
      To get rtnl lock we need to push the code paths which depend on those
      calls up to workqueues, specifically addrconf_verify and the DAD
      processing.
      
      (v2)
      addrconf_dad_failure needs to be queued up to the workqueue, too. This
      patch introduces a new DAD state and stop the DAD processing in the
      workqueue (this is because of the possible ipv6_del_addr processing
      which removes the solicited multicast address from the device).
      
      addrconf_verify_lock is removed, too. After the transition it is not
      needed any more.
      
      As we are not processing in bottom half anymore we need to be a bit more
      careful about disabling bottom half out when we lock spin_locks which are also
      used in bh.
      
      Relevant backtrace:
      [  541.030090] RTNL: assertion failed at net/core/dev.c (4496)
      [  541.031143] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 3.10.33-1-amd64-vyatta #1
      [  541.031145] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [  541.031146]  ffffffff8148a9f0 000000000000002f ffffffff813c98c1 ffff88007c4451f8
      [  541.031148]  0000000000000000 0000000000000000 ffffffff813d3540 ffff88007fc03d18
      [  541.031150]  0000880000000006 ffff88007c445000 ffffffffa0194160 0000000000000000
      [  541.031152] Call Trace:
      [  541.031153]  <IRQ>  [<ffffffff8148a9f0>] ? dump_stack+0xd/0x17
      [  541.031180]  [<ffffffff813c98c1>] ? __dev_set_promiscuity+0x101/0x180
      [  541.031183]  [<ffffffff813d3540>] ? __hw_addr_create_ex+0x60/0xc0
      [  541.031185]  [<ffffffff813cfe1a>] ? __dev_set_rx_mode+0xaa/0xc0
      [  541.031189]  [<ffffffff813d3a81>] ? __dev_mc_add+0x61/0x90
      [  541.031198]  [<ffffffffa01dcf9c>] ? igmp6_group_added+0xfc/0x1a0 [ipv6]
      [  541.031208]  [<ffffffff8111237b>] ? kmem_cache_alloc+0xcb/0xd0
      [  541.031212]  [<ffffffffa01ddcd7>] ? ipv6_dev_mc_inc+0x267/0x300 [ipv6]
      [  541.031216]  [<ffffffffa01c2fae>] ? addrconf_join_solict+0x2e/0x40 [ipv6]
      [  541.031219]  [<ffffffffa01ba2e9>] ? ipv6_dev_ac_inc+0x159/0x1f0 [ipv6]
      [  541.031223]  [<ffffffffa01c0772>] ? addrconf_join_anycast+0x92/0xa0 [ipv6]
      [  541.031226]  [<ffffffffa01c311e>] ? __ipv6_ifa_notify+0x11e/0x1e0 [ipv6]
      [  541.031229]  [<ffffffffa01c3213>] ? ipv6_ifa_notify+0x33/0x50 [ipv6]
      [  541.031233]  [<ffffffffa01c36c8>] ? addrconf_dad_completed+0x28/0x100 [ipv6]
      [  541.031241]  [<ffffffff81075c1d>] ? task_cputime+0x2d/0x50
      [  541.031244]  [<ffffffffa01c38d6>] ? addrconf_dad_timer+0x136/0x150 [ipv6]
      [  541.031247]  [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6]
      [  541.031255]  [<ffffffff8105313a>] ? call_timer_fn.isra.22+0x2a/0x90
      [  541.031258]  [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6]
      
      Hunks and backtrace stolen from a patch by Stephen Hemminger.
      Reported-by: default avatarStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: default avatarStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: <stable@vger.kernel.org> # 3.10.y: b7b1bfce: ipv6: split dad and rs timers
      Cc: <stable@vger.kernel.org> # 3.10.y
      [Mike Manning <mmanning@brocade.com>: resolved minor conflicts in addrconf.c]
      Signed-off-by: default avatarMike Manning <mmanning@brocade.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      835b474b
    • Hannes Frederic Sowa's avatar
      ipv6: split duplicate address detection and router solicitation timer · 973d5956
      Hannes Frederic Sowa authored
      commit b7b1bfce upstream.
      
      This patch splits the timers for duplicate address detection and router
      solicitations apart. The router solicitations timer goes into inet6_dev
      and the dad timer stays in inet6_ifaddr.
      
      The reason behind this patch is to reduce the number of unneeded router
      solicitations send out by the host if additional link-local addresses
      are created. Currently we send out RS for every link-local address on
      an interface.
      
      If the RS timer fires we pick a source address with ipv6_get_lladdr. This
      change could hurt people adding additional link-local addresses and
      specifying these addresses in the radvd clients section because we
      no longer guarantee that we use every ll address as source address in
      router solicitations.
      
      Cc: Flavio Leitner <fleitner@redhat.com>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: David Stevens <dlstevens@us.ibm.com>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Reviewed-by: default avatarFlavio Leitner <fbl@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: <stable@vger.kernel.org> # 3.10.y
      [Mike Manning <mmanning@brocade.com>: resolved conflicts with 36bddb]
      Signed-off-by: default avatarMike Manning <mmanning@brocade.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      973d5956
    • Michal Kubeček's avatar
      ipv6: don't call fib6_run_gc() until routing is ready · af80b973
      Michal Kube�ek authored
      commit 2c861cc6 upstream.
      
      When loading the ipv6 module, ndisc_init() is called before
      ip6_route_init(). As the former registers a handler calling
      fib6_run_gc(), this opens a window to run the garbage collector
      before necessary data structures are initialized. If a network
      device is initialized in this window, adding MAC address to it
      triggers a NETDEV_CHANGEADDR event, leading to a crash in
      fib6_clean_all().
      
      Take the event handler registration out of ndisc_init() into a
      separate function ndisc_late_init() and move it after
      ip6_route_init().
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: <stable@vger.kernel.org> # 3.10.y
      Signed-off-by: default avatarMike Manning <mmanning@brocade.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      af80b973
    • Joe Perches's avatar
      stddef.h: move offsetofend inside #ifndef/#endif guard, neaten · 349759be
      Joe Perches authored
      commit 8c7fbe57 upstream.
      
      Commit 38764884 ("include/stddef.h: Move offsetofend() from vfio.h
      to a generic kernel header") added offsetofend outside the normal
      include #ifndef/#endif guard.  Move it inside.
      
      Miscellanea:
      
      o remove unnecessary blank line
      o standardize offsetof macros whitespace style
      Signed-off-by: default avatarJoe Perches <joe@perches.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [wt: backported only for ipv6 out-of-bounds fix]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      349759be
    • Denys Vlasenko's avatar
      include/stddef.h: Move offsetofend() from vfio.h to a generic kernel header · 1ddb7944
      Denys Vlasenko authored
      commit 38764884 upstream.
      
      Suggested by Andy.
      Suggested-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarDenys Vlasenko <dvlasenk@redhat.com>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1425912738-559-1-git-send-email-dvlasenk@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [wt: backported only for ipv6 out-of-bounds fix]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1ddb7944
    • Gavin Shan's avatar
      drivers/vfio: Rework offsetofend() · 6cc73a1c
      Gavin Shan authored
      commit b13460b9 upstream.
      
      The macro offsetofend() introduces unnecessary temporary variable
      "tmp". The patch avoids that and saves a bit memory in stack.
      Signed-off-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: default avatarAlex Williamson <alex.williamson@redhat.com>
      [wt: backported only for ipv6 out-of-bounds fix]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      6cc73a1c
    • Scot Doyle's avatar
      vt: clear selection before resizing · 5812a9bc
      Scot Doyle authored
      commit 009e39ae upstream.
      
      When resizing a vt its selection may exceed the new size, resulting in
      an invalid memory access [1]. Clear the selection before resizing.
      
      [1] http://lkml.kernel.org/r/CACT4Y+acDTwy4umEvf5ROBGiRJNrxHN4Cn5szCXE5Jw-d1B=Xw@mail.gmail.comReported-and-tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarScot Doyle <lkml14@scotdoyle.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      5812a9bc
    • Jiri Slaby's avatar
      tty: vt, fix bogus division in csi_J · 902fd8d5
      Jiri Slaby authored
      commit 42acfc66 upstream.
      
      In csi_J(3), the third parameter of scr_memsetw (vc_screenbuf_size) is
      divided by 2 inappropriatelly. But scr_memsetw expects size, not
      count, because it divides the size by 2 on its own before doing actual
      memset-by-words.
      
      So remove the bogus division.
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Cc: Petr Písař <ppisar@redhat.com>
      Fixes: f8df13e0 (tty: Clean console safely)
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      902fd8d5
    • Dmitry Vyukov's avatar
      tty: limit terminal size to 4M chars · 35059401
      Dmitry Vyukov authored
      commit 32b2921e upstream.
      
      Size of kmalloc() in vc_do_resize() is controlled by user.
      Too large kmalloc() size triggers WARNING message on console.
      Put a reasonable upper bound on terminal size to prevent WARNINGs.
      Signed-off-by: default avatarDmitry Vyukov <dvyukov@google.com>
      CC: David Rientjes <rientjes@google.com>
      Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: syzkaller@googlegroups.com
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      35059401
    • Peter Hurley's avatar
      tty: Prevent ldisc drivers from re-using stale tty fields · 67de5f0a
      Peter Hurley authored
      commit dd42bf11 upstream.
      
      Line discipline drivers may mistakenly misuse ldisc-related fields
      when initializing. For example, a failure to initialize tty->receive_room
      in the N_GIGASET_M101 line discipline was recently found and fixed [1].
      Now, the N_X25 line discipline has been discovered accessing the previous
      line discipline's already-freed private data [2].
      
      Harden the ldisc interface against misuse by initializing revelant
      tty fields before instancing the new line discipline.
      
      [1]
          commit fd98e941
          Author: Tilman Schmidt <tilman@imap.cc>
          Date:   Tue Jul 14 00:37:13 2015 +0200
      
          isdn/gigaset: reset tty->receive_room when attaching ser_gigaset
      
      [2] Report from Sasha Levin <sasha.levin@oracle.com>
          [  634.336761] ==================================================================
          [  634.338226] BUG: KASAN: use-after-free in x25_asy_open_tty+0x13d/0x490 at addr ffff8800a743efd0
          [  634.339558] Read of size 4 by task syzkaller_execu/8981
          [  634.340359] =============================================================================
          [  634.341598] BUG kmalloc-512 (Not tainted): kasan: bad access detected
          ...
          [  634.405018] Call Trace:
          [  634.405277] dump_stack (lib/dump_stack.c:52)
          [  634.405775] print_trailer (mm/slub.c:655)
          [  634.406361] object_err (mm/slub.c:662)
          [  634.406824] kasan_report_error (mm/kasan/report.c:138 mm/kasan/report.c:236)
          [  634.409581] __asan_report_load4_noabort (mm/kasan/report.c:279)
          [  634.411355] x25_asy_open_tty (drivers/net/wan/x25_asy.c:559 (discriminator 1))
          [  634.413997] tty_ldisc_open.isra.2 (drivers/tty/tty_ldisc.c:447)
          [  634.414549] tty_set_ldisc (drivers/tty/tty_ldisc.c:567)
          [  634.415057] tty_ioctl (drivers/tty/tty_io.c:2646 drivers/tty/tty_io.c:2879)
          [  634.423524] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
          [  634.427491] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
          [  634.427945] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:188)
      
      Cc: Tilman Schmidt <tilman@imap.cc>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [wt: adjust context]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      67de5f0a
    • Peter Zijlstra's avatar
      perf: Tighten (and fix) the grouping condition · ac74acf2
      Peter Zijlstra authored
      commit c3c87e77 upstream.
      
      The fix from 9fc81d87 ("perf: Fix events installation during
      moving group") was incomplete in that it failed to recognise that
      creating a group with events for different CPUs is semantically
      broken -- they cannot be co-scheduled.
      
      Furthermore, it leads to real breakage where, when we create an event
      for CPU Y and then migrate it to form a group on CPU X, the code gets
      confused where the counter is programmed -- triggered in practice
      as well by me via the perf fuzzer.
      
      Fix this by tightening the rules for creating groups. Only allow
      grouping of counters that can be co-scheduled in the same context.
      This means for the same task and/or the same cpu.
      
      Fixes: 9fc81d87 ("perf: Fix events installation during moving group")
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20150123125834.090683288@infradead.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      ac74acf2
    • Arnaldo Carvalho de Melo's avatar
      perf symbols: Fixup symbol sizes before picking best ones · d636f64a
      Arnaldo Carvalho de Melo authored
      commit 432746f8 upstream.
      
      When we call symbol__fixup_duplicate() we use algorithms to pick the
      "best" symbols for cases where there are various functions/aliases to an
      address, and those check zero size symbols, which, before calling
      symbol__fixup_end() are _all_ symbols in a just parsed kallsyms file.
      
      So first fixup the end, then fixup the duplicates.
      
      Found while trying to figure out why 'perf test vmlinux' failed, see the
      output of 'perf test -v vmlinux' to see cases where the symbols picked
      as best for vmlinux don't match the ones picked for kallsyms.
      
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 694bf407 ("perf symbols: Add some heuristics for choosing the best duplicate symbol")
      Link: http://lkml.kernel.org/n/tip-rxqvdgr0mqjdxee0kf8i2ufn@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d636f64a
    • Karl Beldan's avatar
      mtd: nand: davinci: Reinitialize the HW ECC engine in 4bit hwctl · 44e3b0c7
      Karl Beldan authored
      commit f6d7c1b5 upstream.
      
      This fixes subpage writes when using 4-bit HW ECC.
      
      There has been numerous reports about ECC errors with devices using this
      driver for a while.  Also the 4-bit ECC has been reported as broken with
      subpages in [1] and with 16 bits NANDs in the driver and in mach* board
      files both in mainline and in the vendor BSPs.
      
      What I saw with 4-bit ECC on a 16bits NAND (on an LCDK) which got me to
      try reinitializing the ECC engine:
      - R/W on whole pages properly generates/checks RS code
      - try writing the 1st subpage only of a blank page, the subpage is well
        written and the RS code properly generated, re-reading the same page
        the HW detects some ECC error, reading the same page again no ECC
        error is detected
      
      Note that the ECC engine is already reinitialized in the 1-bit case.
      
      Tested on my LCDK with UBI+UBIFS using subpages.
      This could potentially get rid of the issue workarounded in [1].
      
      [1] 28c015a9 ("mtd: davinci-nand: disable subpage write for keystone-nand")
      
      Fixes: 6a4123e5 ("mtd: nand: davinci_nand, 4-bit ECC for smallpage")
      Signed-off-by: default avatarKarl Beldan <kbeldan@baylibre.com>
      Acked-by: default avatarBoris Brezillon <boris.brezillon@free-electrons.com>
      Signed-off-by: default avatarBrian Norris <computersforpeace@gmail.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      44e3b0c7
    • Dan Carpenter's avatar
      mtd: pmcmsp-flash: Allocating too much in init_msp_flash() · 1b47a57f
      Dan Carpenter authored
      commit 79ad07d4 upstream.
      
      There is a cut and paste issue here.  The bug is that we are allocating
      more memory than necessary for msp_maps.  We should be allocating enough
      space for a map_info struct (144 bytes) but we instead allocate enough
      for an mtd_info struct (1840 bytes).  It's a small waste.
      
      The other part of this is not harmful but when we allocated msp_flash
      then we allocated enough space fro a map_info pointer instead of an
      mtd_info pointer.  But since pointers are the same size it works out
      fine.
      
      Anyway, I decided to clean up all three allocations a bit to make them
      a bit more consistent and clear.
      
      Fixes: 68aa0fa8 ('[MTD] PMC MSP71xx flash/rootfs mappings')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarBrian Norris <computersforpeace@gmail.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1b47a57f
    • Brian Norris's avatar
      mtd: blkdevs: fix potential deadlock + lockdep warnings · bdd7043b
      Brian Norris authored
      commit f3c63795 upstream.
      
      Commit 073db4a5 ("mtd: fix: avoid race condition when accessing
      mtd->usecount") fixed a race condition but due to poor ordering of the
      mutex acquisition, introduced a potential deadlock.
      
      The deadlock can occur, for example, when rmmod'ing the m25p80 module, which
      will delete one or more MTDs, along with any corresponding mtdblock
      devices. This could potentially race with an acquisition of the block
      device as follows.
      
       -> blktrans_open()
          ->  mutex_lock(&dev->lock);
          ->  mutex_lock(&mtd_table_mutex);
      
       -> del_mtd_device()
          ->  mutex_lock(&mtd_table_mutex);
          ->  blktrans_notify_remove() -> del_mtd_blktrans_dev()
             ->  mutex_lock(&dev->lock);
      
      This is a classic (potential) ABBA deadlock, which can be fixed by
      making the A->B ordering consistent everywhere. There was no real
      purpose to the ordering in the original patch, AFAIR, so this shouldn't
      be a problem. This ordering was actually already present in
      del_mtd_blktrans_dev(), for one, where the function tried to ensure that
      its caller already held mtd_table_mutex before it acquired &dev->lock:
      
              if (mutex_trylock(&mtd_table_mutex)) {
                      mutex_unlock(&mtd_table_mutex);
                      BUG();
              }
      
      So, reverse the ordering of acquisition of &dev->lock and &mtd_table_mutex so
      we always acquire mtd_table_mutex first.
      
      Snippets of the lockdep output follow:
      
        # modprobe -r m25p80
        [   53.419251]
        [   53.420838] ======================================================
        [   53.427300] [ INFO: possible circular locking dependency detected ]
        [   53.433865] 4.3.0-rc6 #96 Not tainted
        [   53.437686] -------------------------------------------------------
        [   53.444220] modprobe/372 is trying to acquire lock:
        [   53.449320]  (&new->lock){+.+...}, at: [<c043fe4c>] del_mtd_blktrans_dev+0x80/0xdc
        [   53.457271]
        [   53.457271] but task is already holding lock:
        [   53.463372]  (mtd_table_mutex){+.+.+.}, at: [<c0439994>] del_mtd_device+0x18/0x100
        [   53.471321]
        [   53.471321] which lock already depends on the new lock.
        [   53.471321]
        [   53.479856]
        [   53.479856] the existing dependency chain (in reverse order) is:
        [   53.487660]
        -> #1 (mtd_table_mutex){+.+.+.}:
        [   53.492331]        [<c043fc5c>] blktrans_open+0x34/0x1a4
        [   53.497879]        [<c01afce0>] __blkdev_get+0xc4/0x3b0
        [   53.503364]        [<c01b0bb8>] blkdev_get+0x108/0x320
        [   53.508743]        [<c01713c0>] do_dentry_open+0x218/0x314
        [   53.514496]        [<c0180454>] path_openat+0x4c0/0xf9c
        [   53.519959]        [<c0182044>] do_filp_open+0x5c/0xc0
        [   53.525336]        [<c0172758>] do_sys_open+0xfc/0x1cc
        [   53.530716]        [<c000f740>] ret_fast_syscall+0x0/0x1c
        [   53.536375]
        -> #0 (&new->lock){+.+...}:
        [   53.540587]        [<c063f124>] mutex_lock_nested+0x38/0x3cc
        [   53.546504]        [<c043fe4c>] del_mtd_blktrans_dev+0x80/0xdc
        [   53.552606]        [<c043f164>] blktrans_notify_remove+0x7c/0x84
        [   53.558891]        [<c04399f0>] del_mtd_device+0x74/0x100
        [   53.564544]        [<c043c670>] del_mtd_partitions+0x80/0xc8
        [   53.570451]        [<c0439aa0>] mtd_device_unregister+0x24/0x48
        [   53.576637]        [<c046ce6c>] spi_drv_remove+0x1c/0x34
        [   53.582207]        [<c03de0f0>] __device_release_driver+0x88/0x114
        [   53.588663]        [<c03de19c>] device_release_driver+0x20/0x2c
        [   53.594843]        [<c03dd9e8>] bus_remove_device+0xd8/0x108
        [   53.600748]        [<c03dacc0>] device_del+0x10c/0x210
        [   53.606127]        [<c03dadd0>] device_unregister+0xc/0x20
        [   53.611849]        [<c046d878>] __unregister+0x10/0x20
        [   53.617211]        [<c03da868>] device_for_each_child+0x50/0x7c
        [   53.623387]        [<c046eae8>] spi_unregister_master+0x58/0x8c
        [   53.629578]        [<c03e12f0>] release_nodes+0x15c/0x1c8
        [   53.635223]        [<c03de0f8>] __device_release_driver+0x90/0x114
        [   53.641689]        [<c03de900>] driver_detach+0xb4/0xb8
        [   53.647147]        [<c03ddc78>] bus_remove_driver+0x4c/0xa0
        [   53.652970]        [<c00cab50>] SyS_delete_module+0x11c/0x1e4
        [   53.658976]        [<c000f740>] ret_fast_syscall+0x0/0x1c
        [   53.664621]
        [   53.664621] other info that might help us debug this:
        [   53.664621]
        [   53.672979]  Possible unsafe locking scenario:
        [   53.672979]
        [   53.679169]        CPU0                    CPU1
        [   53.683900]        ----                    ----
        [   53.688633]   lock(mtd_table_mutex);
        [   53.692383]                                lock(&new->lock);
        [   53.698306]                                lock(mtd_table_mutex);
        [   53.704658]   lock(&new->lock);
        [   53.707946]
        [   53.707946]  *** DEADLOCK ***
      
      Fixes: 073db4a5 ("mtd: fix: avoid race condition when accessing mtd->usecount")
      Reported-by: default avatarFelipe Balbi <balbi@ti.com>
      Tested-by: default avatarFelipe Balbi <balbi@ti.com>
      Signed-off-by: default avatarBrian Norris <computersforpeace@gmail.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      bdd7043b
    • Mark Bloch's avatar
      IB/cm: Mark stale CM id's whenever the mad agent was unregistered · aa19a889
      Mark Bloch authored
      commit 9db0ff53 upstream.
      
      When there is a CM id object that has port assigned to it, it means that
      the cm-id asked for the specific port that it should go by it, but if
      that port was removed (hot-unplug event) the cm-id was not updated.
      In order to fix that the port keeps a list of all the cm-id's that are
      planning to go by it, whenever the port is removed it marks all of them
      as invalid.
      
      This commit fixes a kernel panic which happens when running traffic between
      guests and we force reboot a guest mid traffic, it triggers a kernel panic:
      
       Call Trace:
        [<ffffffff815271fa>] ? panic+0xa7/0x16f
        [<ffffffff8152b534>] ? oops_end+0xe4/0x100
        [<ffffffff8104a00b>] ? no_context+0xfb/0x260
        [<ffffffff81084db2>] ? del_timer_sync+0x22/0x30
        [<ffffffff8104a295>] ? __bad_area_nosemaphore+0x125/0x1e0
        [<ffffffff81084240>] ? process_timeout+0x0/0x10
        [<ffffffff8104a363>] ? bad_area_nosemaphore+0x13/0x20
        [<ffffffff8104aabf>] ? __do_page_fault+0x31f/0x480
        [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
        [<ffffffffa0752675>] ? free_msg+0x55/0x70 [mlx5_core]
        [<ffffffffa0753434>] ? cmd_exec+0x124/0x840 [mlx5_core]
        [<ffffffff8105a924>] ? find_busiest_group+0x244/0x9f0
        [<ffffffff8152d45e>] ? do_page_fault+0x3e/0xa0
        [<ffffffff8152a815>] ? page_fault+0x25/0x30
        [<ffffffffa024da25>] ? cm_alloc_msg+0x35/0xc0 [ib_cm]
        [<ffffffffa024e821>] ? ib_send_cm_dreq+0xb1/0x1e0 [ib_cm]
        [<ffffffffa024f836>] ? cm_destroy_id+0x176/0x320 [ib_cm]
        [<ffffffffa024fb00>] ? ib_destroy_cm_id+0x10/0x20 [ib_cm]
        [<ffffffffa034f527>] ? ipoib_cm_free_rx_reap_list+0xa7/0x110 [ib_ipoib]
        [<ffffffffa034f590>] ? ipoib_cm_rx_reap+0x0/0x20 [ib_ipoib]
        [<ffffffffa034f5a5>] ? ipoib_cm_rx_reap+0x15/0x20 [ib_ipoib]
        [<ffffffff81094d20>] ? worker_thread+0x170/0x2a0
        [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
        [<ffffffff81094bb0>] ? worker_thread+0x0/0x2a0
        [<ffffffff8109aef6>] ? kthread+0x96/0xa0
        [<ffffffff8100c20a>] ? child_rip+0xa/0x20
        [<ffffffff8109ae60>] ? kthread+0x0/0xa0
        [<ffffffff8100c200>] ? child_rip+0x0/0x20
      
      Fixes: a977049d ("[PATCH] IB: Add the kernel CM implementation")
      Signed-off-by: default avatarMark Bloch <markb@mellanox.com>
      Signed-off-by: default avatarErez Shitrit <erezsh@mellanox.com>
      Reviewed-by: default avatarMaor Gottlieb <maorg@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      aa19a889
    • Tariq Toukan's avatar
      IB/uverbs: Fix leak of XRC target QPs · 52aac91d
      Tariq Toukan authored
      commit 5b810a24 upstream.
      
      The real QP is destroyed in case of the ref count reaches zero, but
      for XRC target QPs this call was missed and caused to QP leaks.
      
      Let's call to destroy for all flows.
      
      Fixes: 0e0ec7e0 ('RDMA/core: Export ib_open_qp() to share XRC...')
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarNoa Osherovich <noaos@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      52aac91d
    • Matan Barak's avatar
      IB/mlx4: Fix create CQ error flow · 1aecb8e4
      Matan Barak authored
      commit 593ff73b upstream.
      
      Currently, if ib_copy_to_udata fails, the CQ
      won't be deleted from the radix tree and the HW (HW2SW).
      
      Fixes: 225c7b1f ('IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters')
      Signed-off-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarDaniel Jurgens <danielj@mellanox.com>
      Reviewed-by: default avatarMark Bloch <markb@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1aecb8e4
    • Alex Vesker's avatar
      IB/mlx4: Fix incorrect MC join state bit-masking on SR-IOV · 95bc51b5
      Alex Vesker authored
      commit e5ac40cd upstream.
      
      Because of an incorrect bit-masking done on the join state bits, when
      handling a join request we failed to detect a difference between the
      group join state and the request join state when joining as send only
      full member (0x8). This caused the MC join request not to be sent.
      This issue is relevant only when SRIOV is enabled and SM supports
      send only full member.
      
      This fix separates scope bits and join states bits a nibble each.
      
      Fixes: b9c5d6a6 ('IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV')
      Signed-off-by: default avatarAlex Vesker <valex@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      95bc51b5
    • Alex Vesker's avatar
      IB/ipoib: Don't allow MC joins during light MC flush · b81459c7
      Alex Vesker authored
      commit 344bacca upstream.
      
      This fix solves a race between light flush and on the fly joins.
      Light flush doesn't set the device to down and unset IPOIB_OPER_UP
      flag, this means that if while flushing we have a MC join in progress
      and the QP was attached to BC MGID we can have a mismatches when
      re-attaching a QP to the BC MGID.
      
      The light flush would set the broadcast group to NULL causing an on
      the fly join to rejoin and reattach to the BC MCG as well as adding
      the BC MGID to the multicast list. The flush process would later on
      remove the BC MGID and detach it from the QP. On the next flush
      the BC MGID is present in the multicast list but not found when trying
      to detach it because of the previous double attach and single detach.
      
      [18332.714265] ------------[ cut here ]------------
      [18332.717775] WARNING: CPU: 6 PID: 3767 at drivers/infiniband/core/verbs.c:280 ib_dealloc_pd+0xff/0x120 [ib_core]
      ...
      [18332.775198] Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
      [18332.779411]  0000000000000000 ffff8800b50dfbb0 ffffffff813fed47 0000000000000000
      [18332.784960]  0000000000000000 ffff8800b50dfbf0 ffffffff8109add1 0000011832f58300
      [18332.790547]  ffff880226a596c0 ffff880032482000 ffff880032482830 ffff880226a59280
      [18332.796199] Call Trace:
      [18332.798015]  [<ffffffff813fed47>] dump_stack+0x63/0x8c
      [18332.801831]  [<ffffffff8109add1>] __warn+0xd1/0xf0
      [18332.805403]  [<ffffffff8109aebd>] warn_slowpath_null+0x1d/0x20
      [18332.809706]  [<ffffffffa025d90f>] ib_dealloc_pd+0xff/0x120 [ib_core]
      [18332.814384]  [<ffffffffa04f3d7c>] ipoib_transport_dev_cleanup+0xfc/0x1d0 [ib_ipoib]
      [18332.820031]  [<ffffffffa04ed648>] ipoib_ib_dev_cleanup+0x98/0x110 [ib_ipoib]
      [18332.825220]  [<ffffffffa04e62c8>] ipoib_dev_cleanup+0x2d8/0x550 [ib_ipoib]
      [18332.830290]  [<ffffffffa04e656f>] ipoib_uninit+0x2f/0x40 [ib_ipoib]
      [18332.834911]  [<ffffffff81772a8a>] rollback_registered_many+0x1aa/0x2c0
      [18332.839741]  [<ffffffff81772bd1>] rollback_registered+0x31/0x40
      [18332.844091]  [<ffffffff81773b18>] unregister_netdevice_queue+0x48/0x80
      [18332.848880]  [<ffffffffa04f489b>] ipoib_vlan_delete+0x1fb/0x290 [ib_ipoib]
      [18332.853848]  [<ffffffffa04df1cd>] delete_child+0x7d/0xf0 [ib_ipoib]
      [18332.858474]  [<ffffffff81520c08>] dev_attr_store+0x18/0x30
      [18332.862510]  [<ffffffff8127fe4a>] sysfs_kf_write+0x3a/0x50
      [18332.866349]  [<ffffffff8127f4e0>] kernfs_fop_write+0x120/0x170
      [18332.870471]  [<ffffffff81207198>] __vfs_write+0x28/0xe0
      [18332.874152]  [<ffffffff810e09bf>] ? percpu_down_read+0x1f/0x50
      [18332.878274]  [<ffffffff81208062>] vfs_write+0xa2/0x1a0
      [18332.881896]  [<ffffffff812093a6>] SyS_write+0x46/0xa0
      [18332.885632]  [<ffffffff810039b7>] do_syscall_64+0x57/0xb0
      [18332.889709]  [<ffffffff81883321>] entry_SYSCALL64_slow_path+0x25/0x25
      [18332.894727] ---[ end trace 09ebbe31f831ef17 ]---
      
      Fixes: ee1e2c82 ("IPoIB: Refresh paths instead of flushing them on SM change events")
      Signed-off-by: default avatarAlex Vesker <valex@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      b81459c7
    • Erez Shitrit's avatar
      IB/core: Fix use after free in send_leave function · 4f0992c3
      Erez Shitrit authored
      commit 68c6bcdd upstream.
      
      The function send_leave sets the member: group->query_id
      (group->query_id = ret) after calling the sa_query, but leave_handler
      can be executed before the setting and it might delete the group object,
      and will get a memory corruption.
      
      Additionally, this patch gets rid of group->query_id variable which is
      not used.
      
      Fixes: faec2f7b ('IB/sa: Track multicast join/leave requests')
      Signed-off-by: default avatarErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      4f0992c3
    • Erez Shitrit's avatar
      IB/ipoib: Fix memory corruption in ipoib cm mode connect flow · 27816fef
      Erez Shitrit authored
      commit 546481c2 upstream.
      
      When a new CM connection is being requested, ipoib driver copies data
      from the path pointer in the CM/tx object, the path object might be
      invalid at the point and memory corruption will happened later when now
      the CM driver will try using that data.
      
      The next scenario demonstrates it:
      	neigh_add_path --> ipoib_cm_create_tx -->
      	queue_work (pointer to path is in the cm/tx struct)
      	#while the work is still in the queue,
      	#the port goes down and causes the ipoib_flush_paths:
      	ipoib_flush_paths --> path_free --> kfree(path)
      	#at this point the work scheduled starts.
      	ipoib_cm_tx_start --> copy from the (invalid)path pointer:
      	(memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);)
      	 -> memory corruption.
      
      To fix that the driver now starts the CM/tx connection only if that
      specific path exists in the general paths database.
      This check is protected with the relevant locks, and uses the gid from
      the neigh member in the CM/tx object which is valid according to the ref
      count that was taken by the CM/tx.
      
      Fixes: 839fcaba ('IPoIB: Connected mode experimental support')
      Signed-off-by: default avatarErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      27816fef
    • Emmanouil Maroudas's avatar
      EDAC: Increment correct counter in edac_inc_ue_error() · c6983c1f
      Emmanouil Maroudas authored
      commit 993f88f1 upstream.
      
      Fix typo in edac_inc_ue_error() to increment ue_noinfo_count instead of
      ce_noinfo_count.
      Signed-off-by: default avatarEmmanouil Maroudas <emmanouil.maroudas@gmail.com>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Fixes: 4275be63 ("edac: Change internal representation to work with layers")
      Link: http://lkml.kernel.org/r/1461425580-5898-1-git-send-email-emmanouil.maroudas@gmail.comSigned-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      c6983c1f
    • Tejun Heo's avatar
      timers: Use proper base migration in add_timer_on() · de3e6236
      Tejun Heo authored
      commit 22b886dd upstream.
      
      Regardless of the previous CPU a timer was on, add_timer_on()
      currently simply sets timer->flags to the new CPU.  As the caller must
      be seeing the timer as idle, this is locally fine, but the timer
      leaving the old base while unlocked can lead to race conditions as
      follows.
      
      Let's say timer was on cpu 0.
      
        cpu 0					cpu 1
        -----------------------------------------------------------------------------
        del_timer(timer) succeeds
      					del_timer(timer)
      					  lock_timer_base(timer) locks cpu_0_base
        add_timer_on(timer, 1)
          spin_lock(&cpu_1_base->lock)
          timer->flags set to cpu_1_base
          operates on @timer			  operates on @timer
      
      This triggered with mod_delayed_work_on() which contains
      "if (del_timer()) add_timer_on()" sequence eventually leading to the
      following oops.
      
        BUG: unable to handle kernel NULL pointer dereference at           (null)
        IP: [<ffffffff810ca6e9>] detach_if_pending+0x69/0x1a0
        ...
        Workqueue: wqthrash wqthrash_workfunc [wqthrash]
        task: ffff8800172ca680 ti: ffff8800172d0000 task.ti: ffff8800172d0000
        RIP: 0010:[<ffffffff810ca6e9>]  [<ffffffff810ca6e9>] detach_if_pending+0x69/0x1a0
        ...
        Call Trace:
         [<ffffffff810cb0b4>] del_timer+0x44/0x60
         [<ffffffff8106e836>] try_to_grab_pending+0xb6/0x160
         [<ffffffff8106e913>] mod_delayed_work_on+0x33/0x80
         [<ffffffffa0000081>] wqthrash_workfunc+0x61/0x90 [wqthrash]
         [<ffffffff8106dba8>] process_one_work+0x1e8/0x650
         [<ffffffff8106e05e>] worker_thread+0x4e/0x450
         [<ffffffff810746af>] kthread+0xef/0x110
         [<ffffffff8185980f>] ret_from_fork+0x3f/0x70
      
      Fix it by updating add_timer_on() to perform proper migration as
      __mod_timer() does.
      
      Mike: apply tglx backport
      Reported-and-tested-by: default avatarJeff Layton <jlayton@poochiereds.net>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Chris Worley <chris.worley@primarydata.com>
      Cc: bfields@fieldses.org
      Cc: Michael Skralivetsky <michael.skralivetsky@primarydata.com>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Jeff Layton <jlayton@poochiereds.net>
      Cc: kernel-team@fb.com
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20151029103113.2f893924@tlielax.poochiereds.net
      Link: http://lkml.kernel.org/r/20151104171533.GI5749@mtj.duckdns.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarMike Galbraith <mgalbraith@suse.de>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      de3e6236
    • Gavin Li's avatar
      cdc-acm: fix wrong pipe type on rx interrupt xfers · d4898081
      Gavin Li authored
      commit add12505 upstream.
      
      This fixes the "BOGUS urb xfer" warning logged by usb_submit_urb().
      Signed-off-by: default avatarGavin Li <git@thegavinli.com>
      Acked-by: default avatarOliver Neukum <oneukum@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d4898081
    • Krzysztof Kozlowski's avatar
      thermal: hwmon: Properly report critical temperature in sysfs · 995a3876
      Krzysztof Kozlowski authored
      commit f37fabb8 upstream.
      
      In the critical sysfs entry the thermal hwmon was returning wrong
      temperature to the user-space.  It was reporting the temperature of the
      first trip point instead of the temperature of critical trip point.
      
      For example:
      	/sys/class/hwmon/hwmon0/temp1_crit:50000
      	/sys/class/thermal/thermal_zone0/trip_point_0_temp:50000
      	/sys/class/thermal/thermal_zone0/trip_point_0_type:active
      	/sys/class/thermal/thermal_zone0/trip_point_3_temp:120000
      	/sys/class/thermal/thermal_zone0/trip_point_3_type:critical
      
      Since commit e68b16ab ("thermal: add hwmon sysfs I/F") the driver
      have been registering a sysfs entry if get_crit_temp() callback was
      provided.  However when accessed, it was calling get_trip_temp() instead
      of the get_crit_temp().
      
      Fixes: e68b16ab ("thermal: add hwmon sysfs I/F")
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: default avatarZhang Rui <rui.zhang@intel.com>
      [wt: s/thermal_hwmon.c/thermal_core.c in 3.10]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      995a3876