1. 19 Aug, 2012 8 commits
    • Hangbin Liu's avatar
      tcp: Add TCP_USER_TIMEOUT negative value check · df7ba2e6
      Hangbin Liu authored
      [ Upstream commit 42493570 ]
      
      TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int. But
      patch "tcp: Add TCP_USER_TIMEOUT socket option"(dca43c75) didn't check the negative
      values. If a user assign -1 to it, the socket will set successfully and wait
      for 4294967295 miliseconds. This patch add a negative value check to avoid
      this issue.
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      df7ba2e6
    • Alan Cox's avatar
      wanmain: comparing array with NULL · 761bfbd3
      Alan Cox authored
      [ Upstream commit 8b72ff64 ]
      
      gcc really should warn about these !
      Signed-off-by: default avatarAlan Cox <alan@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      761bfbd3
    • Alan Cox's avatar
      caif: fix NULL pointer check · c4726fb2
      Alan Cox authored
      [ Upstream commit c66b9b7d ]
      
      Reported-by: <rucsoftsec@gmail.com>
      Resolves-bug: http://bugzilla.kernel.org/show_bug?44441Signed-off-by: default avatarAlan Cox <alan@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c4726fb2
    • Paul Moore's avatar
      cipso: don't follow a NULL pointer when setsockopt() is called · 7f6453d8
      Paul Moore authored
      [ Upstream commit 89d7ae34 ]
      
      As reported by Alan Cox, and verified by Lin Ming, when a user
      attempts to add a CIPSO option to a socket using the CIPSO_V4_TAG_LOCAL
      tag the kernel dies a terrible death when it attempts to follow a NULL
      pointer (the skb argument to cipso_v4_validate() is NULL when called via
      the setsockopt() syscall).
      
      This patch fixes this by first checking to ensure that the skb is
      non-NULL before using it to find the incoming network interface.  In
      the unlikely case where the skb is NULL and the user attempts to add
      a CIPSO option with the _TAG_LOCAL tag we return an error as this is
      not something we want to allow.
      
      A simple reproducer, kindly supplied by Lin Ming, although you must
      have the CIPSO DOI #3 configure on the system first or you will be
      caught early in cipso_v4_validate():
      
      	#include <sys/types.h>
      	#include <sys/socket.h>
      	#include <linux/ip.h>
      	#include <linux/in.h>
      	#include <string.h>
      
      	struct local_tag {
      		char type;
      		char length;
      		char info[4];
      	};
      
      	struct cipso {
      		char type;
      		char length;
      		char doi[4];
      		struct local_tag local;
      	};
      
      	int main(int argc, char **argv)
      	{
      		int sockfd;
      		struct cipso cipso = {
      			.type = IPOPT_CIPSO,
      			.length = sizeof(struct cipso),
      			.local = {
      				.type = 128,
      				.length = sizeof(struct local_tag),
      			},
      		};
      
      		memset(cipso.doi, 0, 4);
      		cipso.doi[3] = 3;
      
      		sockfd = socket(AF_INET, SOCK_DGRAM, 0);
      		#define SOL_IP 0
      		setsockopt(sockfd, SOL_IP, IP_OPTIONS,
      			&cipso, sizeof(struct cipso));
      
      		return 0;
      	}
      
      CC: Lin Ming <mlin@ss.pku.edu.cn>
      Reported-by: default avatarAlan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      7f6453d8
    • Sjur Brændeland's avatar
      caif: Fix access to freed pernet memory · 9fbc36b8
      Sjur Brændeland authored
      [ Upstream commit 96f80d12 ]
      
      unregister_netdevice_notifier() must be called before
      unregister_pernet_subsys() to avoid accessing already freed
      pernet memory. This fixes the following oops when doing rmmod:
      
      Call Trace:
       [<ffffffffa0f802bd>] caif_device_notify+0x4d/0x5a0 [caif]
       [<ffffffff81552ba9>] unregister_netdevice_notifier+0xb9/0x100
       [<ffffffffa0f86dcc>] caif_device_exit+0x1c/0x250 [caif]
       [<ffffffff810e7734>] sys_delete_module+0x1a4/0x300
       [<ffffffff810da82d>] ? trace_hardirqs_on_caller+0x15d/0x1e0
       [<ffffffff813517de>] ? trace_hardirqs_on_thunk+0x3a/0x3
       [<ffffffff81696bad>] system_call_fastpath+0x1a/0x1f
      
      RIP
       [<ffffffffa0f7f561>] caif_get+0x51/0xb0 [caif]
      Signed-off-by: default avatarSjur Brændeland <sjur.brandeland@stericsson.com>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9fbc36b8
    • Neil Horman's avatar
      sctp: Fix list corruption resulting from freeing an association on a list · e44b8b47
      Neil Horman authored
      [ Upstream commit 2eebc1e1 ]
      
      A few days ago Dave Jones reported this oops:
      
      [22766.294255] general protection fault: 0000 [#1] PREEMPT SMP
      [22766.295376] CPU 0
      [22766.295384] Modules linked in:
      [22766.387137]  ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90
      ffff880147c03a74
      [22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000
      [22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000,
      [22766.387137] Stack:
      [22766.387140]  ffff880147c03a10
      [22766.387140]  ffffffffa169f2b6
      [22766.387140]  ffff88013ed95728
      [22766.387143]  0000000000000002
      [22766.387143]  0000000000000000
      [22766.387143]  ffff880003fad062
      [22766.387144]  ffff88013c120000
      [22766.387144]
      [22766.387145] Call Trace:
      [22766.387145]  <IRQ>
      [22766.387150]  [<ffffffffa169f292>] ? __sctp_lookup_association+0x62/0xd0
      [sctp]
      [22766.387154]  [<ffffffffa169f2b6>] __sctp_lookup_association+0x86/0xd0 [sctp]
      [22766.387157]  [<ffffffffa169f597>] sctp_rcv+0x207/0xbb0 [sctp]
      [22766.387161]  [<ffffffff810d4da8>] ? trace_hardirqs_off_caller+0x28/0xd0
      [22766.387163]  [<ffffffff815827e3>] ? nf_hook_slow+0x133/0x210
      [22766.387166]  [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
      [22766.387168]  [<ffffffff8159043d>] ip_local_deliver_finish+0x18d/0x4c0
      [22766.387169]  [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
      [22766.387171]  [<ffffffff81590a07>] ip_local_deliver+0x47/0x80
      [22766.387172]  [<ffffffff8158fd80>] ip_rcv_finish+0x150/0x680
      [22766.387174]  [<ffffffff81590c54>] ip_rcv+0x214/0x320
      [22766.387176]  [<ffffffff81558c07>] __netif_receive_skb+0x7b7/0x910
      [22766.387178]  [<ffffffff8155856c>] ? __netif_receive_skb+0x11c/0x910
      [22766.387180]  [<ffffffff810d423e>] ? put_lock_stats.isra.25+0xe/0x40
      [22766.387182]  [<ffffffff81558f83>] netif_receive_skb+0x23/0x1f0
      [22766.387183]  [<ffffffff815596a9>] ? dev_gro_receive+0x139/0x440
      [22766.387185]  [<ffffffff81559280>] napi_skb_finish+0x70/0xa0
      [22766.387187]  [<ffffffff81559cb5>] napi_gro_receive+0xf5/0x130
      [22766.387218]  [<ffffffffa01c4679>] e1000_receive_skb+0x59/0x70 [e1000e]
      [22766.387242]  [<ffffffffa01c5aab>] e1000_clean_rx_irq+0x28b/0x460 [e1000e]
      [22766.387266]  [<ffffffffa01c9c18>] e1000e_poll+0x78/0x430 [e1000e]
      [22766.387268]  [<ffffffff81559fea>] net_rx_action+0x1aa/0x3d0
      [22766.387270]  [<ffffffff810a495f>] ? account_system_vtime+0x10f/0x130
      [22766.387273]  [<ffffffff810734d0>] __do_softirq+0xe0/0x420
      [22766.387275]  [<ffffffff8169826c>] call_softirq+0x1c/0x30
      [22766.387278]  [<ffffffff8101db15>] do_softirq+0xd5/0x110
      [22766.387279]  [<ffffffff81073bc5>] irq_exit+0xd5/0xe0
      [22766.387281]  [<ffffffff81698b03>] do_IRQ+0x63/0xd0
      [22766.387283]  [<ffffffff8168ee2f>] common_interrupt+0x6f/0x6f
      [22766.387283]  <EOI>
      [22766.387284]
      [22766.387285]  [<ffffffff8168eed9>] ? retint_swapgs+0x13/0x1b
      [22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48
      89 e5 48 83
      ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 <0f> b7 87 98 00 00 00
      48 89 fb
      49 89 f5 66 c1 c0 08 66 39 46 02
      [22766.387307]
      [22766.387307] RIP
      [22766.387311]  [<ffffffffa168a2c9>] sctp_assoc_is_match+0x19/0x90 [sctp]
      [22766.387311]  RSP <ffff880147c039b0>
      [22766.387142]  ffffffffa16ab120
      [22766.599537] ---[ end trace 3f6dae82e37b17f5 ]---
      [22766.601221] Kernel panic - not syncing: Fatal exception in interrupt
      
      It appears from his analysis and some staring at the code that this is likely
      occuring because an association is getting freed while still on the
      sctp_assoc_hashtable.  As a result, we get a gpf when traversing the hashtable
      while a freed node corrupts part of the list.
      
      Nominally I would think that an mibalanced refcount was responsible for this,
      but I can't seem to find any obvious imbalance.  What I did note however was
      that the two places where we create an association using
      sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths
      which free a newly created association after calling sctp_primitive_ASSOCIATE.
      sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which
      issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to
      the aforementioned hash table.  the sctp command interpreter that process side
      effects has not way to unwind previously processed commands, so freeing the
      association from the __sctp_connect or sctp_sendmsg error path would lead to a
      freed association remaining on this hash table.
      
      I've fixed this but modifying sctp_[un]hash_established to use hlist_del_init,
      which allows us to proerly use hlist_unhashed to check if the node is on a
      hashlist safely during a delete.  That in turn alows us to safely call
      sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths
      before freeing them, regardles of what the associations state is on the hash
      list.
      
      I noted, while I was doing this, that the __sctp_unhash_endpoint was using
      hlist_unhsashed in a simmilar fashion, but never nullified any removed nodes
      pointers to make that function work properly, so I fixed that up in a simmilar
      fashion.
      
      I attempted to test this using a virtual guest running the SCTP_RR test from
      netperf in a loop while running the trinity fuzzer, both in a loop.  I wasn't
      able to recreate the problem prior to this fix, nor was I able to trigger the
      failure after (neither of which I suppose is suprising).  Given the trace above
      however, I think its likely that this is what we hit.
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: davej@redhat.com
      CC: davej@redhat.com
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Vlad Yasevich <vyasevich@gmail.com>
      CC: Sridhar Samudrala <sri@us.ibm.com>
      CC: linux-sctp@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e44b8b47
    • Alan Cox's avatar
      ab12cbb5
    • Michael Chan's avatar
      bnx2: Fix bug in bnx2_free_tx_skbs(). · 6ccc9598
      Michael Chan authored
      [ Upstream commit c1f5163d ]
      
      In rare cases, bnx2x_free_tx_skbs() can unmap the wrong DMA address
      when it gets to the last entry of the tx ring.  We were not using
      the proper macro to skip the last entry when advancing the tx index.
      Reported-by: default avatarZongyun Lai <zlai@vmware.com>
      Reviewed-by: default avatarJeffrey Huang <huangjw@broadcom.com>
      Signed-off-by: default avatarMichael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      6ccc9598
  2. 09 Aug, 2012 32 commits
    • Ben Hutchings's avatar
      Linux 3.2.27 · 8524c787
      Ben Hutchings authored
      8524c787
    • Tomoya MORINAGA's avatar
      pch_uart: Fix parity setting issue · dbf6df1f
      Tomoya MORINAGA authored
      commit 38bd2a1a upstream.
      
      Parity Setting value is reverse.
      E.G. In case of setting ODD parity, EVEN value is set.
      This patch inverts "if" condition.
      Signed-off-by: default avatarTomoya MORINAGA <tomoya.rohm@gmail.com>
      Acked-by: default avatarAlan Cox <alan@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      dbf6df1f
    • Tomoya MORINAGA's avatar
      pch_uart: Fix rx error interrupt setting issue · 0eaadc03
      Tomoya MORINAGA authored
      commit 9539dfb7 upstream.
      
      Rx Error interrupt(E.G. parity error) is not enabled.
      So, when parity error occurs, error interrupt is not occurred.
      As a result, the received data is not dropped.
      
      This patch adds enable/disable rx error interrupt code.
      Signed-off-by: default avatarTomoya MORINAGA <tomoya.rohm@gmail.com>
      Acked-by: default avatarAlan Cox <alan@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [Backported by Tomoya MORINGA: adjusted context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      0eaadc03
    • Alan Cox's avatar
      pch_uart: Fix missing break for 16 byte fifo · 882cd0e8
      Alan Cox authored
      commit 9bc03743 upstream.
      
      Otherwise we fall back to the wrong value.
      
      Reported-by: <dcb314@hotmail.com>
      Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=44091Signed-off-by: default avatarAlan Cox <alan@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      882cd0e8
    • Eric Dumazet's avatar
      drop_monitor: dont sleep in atomic context · d4a6acbb
      Eric Dumazet authored
      commit bec4596b upstream.
      
      drop_monitor calls several sleeping functions while in atomic context.
      
       BUG: sleeping function called from invalid context at mm/slub.c:943
       in_atomic(): 1, irqs_disabled(): 0, pid: 2103, name: kworker/0:2
       Pid: 2103, comm: kworker/0:2 Not tainted 3.5.0-rc1+ #55
       Call Trace:
        [<ffffffff810697ca>] __might_sleep+0xca/0xf0
        [<ffffffff811345a3>] kmem_cache_alloc_node+0x1b3/0x1c0
        [<ffffffff8105578c>] ? queue_delayed_work_on+0x11c/0x130
        [<ffffffff815343fb>] __alloc_skb+0x4b/0x230
        [<ffffffffa00b0360>] ? reset_per_cpu_data+0x160/0x160 [drop_monitor]
        [<ffffffffa00b022f>] reset_per_cpu_data+0x2f/0x160 [drop_monitor]
        [<ffffffffa00b03ab>] send_dm_alert+0x4b/0xb0 [drop_monitor]
        [<ffffffff810568e0>] process_one_work+0x130/0x4c0
        [<ffffffff81058249>] worker_thread+0x159/0x360
        [<ffffffff810580f0>] ? manage_workers.isra.27+0x240/0x240
        [<ffffffff8105d403>] kthread+0x93/0xa0
        [<ffffffff816be6d4>] kernel_thread_helper+0x4/0x10
        [<ffffffff8105d370>] ? kthread_freezable_should_stop+0x80/0x80
        [<ffffffff816be6d0>] ? gs_change+0xb/0xb
      
      Rework the logic to call the sleeping functions in right context.
      
      Use standard timer/workqueue api to let system chose any cpu to perform
      the allocation and netlink send.
      
      Also avoid a loop if reset_per_cpu_data() cannot allocate memory :
      use mod_timer() to wait 1/10 second before next try.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d4a6acbb
    • Neil Horman's avatar
      drop_monitor: prevent init path from scheduling on the wrong cpu · 1b51d69a
      Neil Horman authored
      commit 4fdcfa12 upstream.
      
      I just noticed after some recent updates, that the init path for the drop
      monitor protocol has a minor error.  drop monitor maintains a per cpu structure,
      that gets initalized from a single cpu.  Normally this is fine, as the protocol
      isn't in use yet, but I recently made a change that causes a failed skb
      allocation to reschedule itself .  Given the current code, the implication is
      that this workqueue reschedule will take place on the wrong cpu.  If drop
      monitor is used early during the boot process, its possible that two cpus will
      access a single per-cpu structure in parallel, possibly leading to data
      corruption.
      
      This patch fixes the situation, by storing the cpu number that a given instance
      of this per-cpu data should be accessed from.  In the case of a need for a
      reschedule, the cpu stored in the struct is assigned the rescheule, rather than
      the currently executing cpu
      
      Tested successfully by myself.
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      CC: David Miller <davem@davemloft.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1b51d69a
    • Neil Horman's avatar
      drop_monitor: Make updating data->skb smp safe · ea39e338
      Neil Horman authored
      commit 3885ca78 upstream.
      
      Eric Dumazet pointed out to me that the drop_monitor protocol has some holes in
      its smp protections.  Specifically, its possible to replace data->skb while its
      being written.  This patch corrects that by making data->skb an rcu protected
      variable.  That will prevent it from being overwritten while a tracepoint is
      modifying it.
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      CC: David Miller <davem@davemloft.net>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ea39e338
    • Neil Horman's avatar
      drop_monitor: fix sleeping in invalid context warning · caaf10b6
      Neil Horman authored
      commit cde2e9a6 upstream.
      
      Eric Dumazet pointed out this warning in the drop_monitor protocol to me:
      
      [   38.352571] BUG: sleeping function called from invalid context at kernel/mutex.c:85
      [   38.352576] in_atomic(): 1, irqs_disabled(): 0, pid: 4415, name: dropwatch
      [   38.352580] Pid: 4415, comm: dropwatch Not tainted 3.4.0-rc2+ #71
      [   38.352582] Call Trace:
      [   38.352592]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
      [   38.352599]  [<ffffffff81063f2a>] __might_sleep+0xca/0xf0
      [   38.352606]  [<ffffffff81655b16>] mutex_lock+0x26/0x50
      [   38.352610]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
      [   38.352616]  [<ffffffff810b72d9>] tracepoint_probe_register+0x29/0x90
      [   38.352621]  [<ffffffff8153a585>] set_all_monitor_traces+0x105/0x170
      [   38.352625]  [<ffffffff8153a8ca>] net_dm_cmd_trace+0x2a/0x40
      [   38.352630]  [<ffffffff8154a81a>] genl_rcv_msg+0x21a/0x2b0
      [   38.352636]  [<ffffffff810f8029>] ? zone_statistics+0x99/0xc0
      [   38.352640]  [<ffffffff8154a600>] ? genl_rcv+0x30/0x30
      [   38.352645]  [<ffffffff8154a059>] netlink_rcv_skb+0xa9/0xd0
      [   38.352649]  [<ffffffff8154a5f0>] genl_rcv+0x20/0x30
      [   38.352653]  [<ffffffff81549a7e>] netlink_unicast+0x1ae/0x1f0
      [   38.352658]  [<ffffffff81549d76>] netlink_sendmsg+0x2b6/0x310
      [   38.352663]  [<ffffffff8150824f>] sock_sendmsg+0x10f/0x130
      [   38.352668]  [<ffffffff8150abe0>] ? move_addr_to_kernel+0x60/0xb0
      [   38.352673]  [<ffffffff81515f04>] ? verify_iovec+0x64/0xe0
      [   38.352677]  [<ffffffff81509c46>] __sys_sendmsg+0x386/0x390
      [   38.352682]  [<ffffffff810ffaf9>] ? handle_mm_fault+0x139/0x210
      [   38.352687]  [<ffffffff8165b5bc>] ? do_page_fault+0x1ec/0x4f0
      [   38.352693]  [<ffffffff8106ba4d>] ? set_next_entity+0x9d/0xb0
      [   38.352699]  [<ffffffff81310b49>] ? tty_ldisc_deref+0x9/0x10
      [   38.352703]  [<ffffffff8106d363>] ? pick_next_task_fair+0x63/0x140
      [   38.352708]  [<ffffffff8150b8d4>] sys_sendmsg+0x44/0x80
      [   38.352713]  [<ffffffff8165f8e2>] system_call_fastpath+0x16/0x1b
      
      It stems from holding a spinlock (trace_state_lock) while attempting to register
      or unregister tracepoint hooks, making in_atomic() true in this context, leading
      to the warning when the tracepoint calls might_sleep() while its taking a mutex.
      Since we only use the trace_state_lock to prevent trace protocol state races, as
      well as hardware stat list updates on an rcu write side, we can just convert the
      spinlock to a mutex to avoid this problem.
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      CC: David Miller <davem@davemloft.net>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      caaf10b6
    • Jeongdo Son's avatar
      rt2x00: Add support for BUFFALO WLI-UC-GNM2 to rt2800usb. · 6cf299fa
      Jeongdo Son authored
      commit a769f957 upstream.
      
      This is a RT3070 based device.
      Signed-off-by: default avatarJeongdo Son <sohn9086@gmail.com>
      Signed-off-by: default avatarJohn W. Linville <linville@tuxdriver.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      6cf299fa
    • Jesse Barnes's avatar
      drm/i915: prefer wide & slow to fast & narrow in DP configs · 38aa8510
      Jesse Barnes authored
      commit 2514bc51 upstream.
      
      High frequency link configurations have the potential to cause trouble
      with long and/or cheap cables, so prefer slow and wide configurations
      instead.  This patch has the potential to cause trouble for eDP
      configurations that lie about available lanes, so if we run into that we
      can make it conditional on eDP.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45801
      Tested-by: peter@colberg.org
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      38aa8510
    • Andreas Schwab's avatar
      m68k: Make sys_atomic_cmpxchg_32 work on classic m68k · 56c5dc34
      Andreas Schwab authored
      commit 9e2760d1 upstream.
      
      User space access must always go through uaccess accessors, since on
      classic m68k user space and kernel space are completely separate.
      Signed-off-by: default avatarAndreas Schwab <schwab@linux-m68k.org>
      Tested-by: default avatarThorsten Glaser <tg@debian.org>
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      56c5dc34
    • Boaz Harrosh's avatar
      ore: Fix out-of-bounds access in _ios_obj() · 922c9791
      Boaz Harrosh authored
      commit 9e62bb44 upstream.
      
      _ios_obj() is accessed by group_index not device_table index.
      
      The oc->comps array is only a group_full of devices at a time
      it is not like ore_comp_dev() which is indexed by a global
      device_table index.
      
      This did not BUG until now because exofs only uses a single
      COMP for all devices. But with other FSs like PanFS this is
      not true.
      
      This bug was only in the write_path, all other users were
      using it correctly
      
      [This is a bug since 3.2 Kernel]
      Signed-off-by: default avatarBoaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      922c9791
    • Takashi Iwai's avatar
      ALSA: hda - Support dock on Lenovo Thinkpad T530 with ALC269VC · 18fbe5a7
      Takashi Iwai authored
      commit 707fba3f upstream.
      
      Lenovo Thinkpad T530 with ALC269VC codec has a dock port but BIOS
      doesn't set up the pins properly.  Enable the pins as well as on
      Thinkpad X230 Tablet.
      Reported-and-tested-by: default avatarMario <anyc@hadiko.de>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      18fbe5a7
    • Daniel Mack's avatar
      ALSA: snd-usb: fix clock source validity index · 768049e3
      Daniel Mack authored
      commit aff252a8 upstream.
      
      uac_clock_source_is_valid() uses the control selector value to access
      the bmControls bitmap of the clock source unit. This is wrong, as
      control selector values start from 1, while the bitmap uses all
      available bits.
      
      In other words, "Clock Validity Control" is stored in D3..2, not D5..4
      of the clock selector unit's bmControls.
      Signed-off-by: default avatarDaniel Mack <zonque@gmail.com>
      Reported-by: default avatarAndreas Koch <andreas@akdesigninc.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      768049e3
    • Mel Gorman's avatar
      mm: hugetlbfs: close race during teardown of hugetlbfs shared page tables · 6f72a41f
      Mel Gorman authored
      commit d833352a upstream.
      
      If a process creates a large hugetlbfs mapping that is eligible for page
      table sharing and forks heavily with children some of whom fault and
      others which destroy the mapping then it is possible for page tables to
      get corrupted.  Some teardowns of the mapping encounter a "bad pmd" and
      output a message to the kernel log.  The final teardown will trigger a
      BUG_ON in mm/filemap.c.
      
      This was reproduced in 3.4 but is known to have existed for a long time
      and goes back at least as far as 2.6.37.  It was probably was introduced
      in 2.6.20 by [39dde65c: shared page table for hugetlb page].  The messages
      look like this;
      
      [  ..........] Lots of bad pmd messages followed by this
      [  127.164256] mm/memory.c:391: bad pmd ffff880412e04fe8(80000003de4000e7).
      [  127.164257] mm/memory.c:391: bad pmd ffff880412e04ff0(80000003de6000e7).
      [  127.164258] mm/memory.c:391: bad pmd ffff880412e04ff8(80000003de0000e7).
      [  127.186778] ------------[ cut here ]------------
      [  127.186781] kernel BUG at mm/filemap.c:134!
      [  127.186782] invalid opcode: 0000 [#1] SMP
      [  127.186783] CPU 7
      [  127.186784] Modules linked in: af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf ext3 jbd dm_mod coretemp crc32c_intel usb_storage ghash_clmulni_intel aesni_intel i2c_i801 r8169 mii uas sr_mod cdrom sg iTCO_wdt iTCO_vendor_support shpchp serio_raw cryptd aes_x86_64 e1000e pci_hotplug dcdbas aes_generic container microcode ext4 mbcache jbd2 crc16 sd_mod crc_t10dif i915 drm_kms_helper drm i2c_algo_bit ehci_hcd ahci libahci usbcore rtc_cmos usb_common button i2c_core intel_agp video intel_gtt fan processor thermal thermal_sys hwmon ata_generic pata_atiixp libata scsi_mod
      [  127.186801]
      [  127.186802] Pid: 9017, comm: hugetlbfs-test Not tainted 3.4.0-autobuild #53 Dell Inc. OptiPlex 990/06D7TR
      [  127.186804] RIP: 0010:[<ffffffff810ed6ce>]  [<ffffffff810ed6ce>] __delete_from_page_cache+0x15e/0x160
      [  127.186809] RSP: 0000:ffff8804144b5c08  EFLAGS: 00010002
      [  127.186810] RAX: 0000000000000001 RBX: ffffea000a5c9000 RCX: 00000000ffffffc0
      [  127.186811] RDX: 0000000000000000 RSI: 0000000000000009 RDI: ffff88042dfdad00
      [  127.186812] RBP: ffff8804144b5c18 R08: 0000000000000009 R09: 0000000000000003
      [  127.186813] R10: 0000000000000000 R11: 000000000000002d R12: ffff880412ff83d8
      [  127.186814] R13: ffff880412ff83d8 R14: 0000000000000000 R15: ffff880412ff83d8
      [  127.186815] FS:  00007fe18ed2c700(0000) GS:ffff88042dce0000(0000) knlGS:0000000000000000
      [  127.186816] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  127.186817] CR2: 00007fe340000503 CR3: 0000000417a14000 CR4: 00000000000407e0
      [  127.186818] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  127.186819] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  127.186820] Process hugetlbfs-test (pid: 9017, threadinfo ffff8804144b4000, task ffff880417f803c0)
      [  127.186821] Stack:
      [  127.186822]  ffffea000a5c9000 0000000000000000 ffff8804144b5c48 ffffffff810ed83b
      [  127.186824]  ffff8804144b5c48 000000000000138a 0000000000001387 ffff8804144b5c98
      [  127.186825]  ffff8804144b5d48 ffffffff811bc925 ffff8804144b5cb8 0000000000000000
      [  127.186827] Call Trace:
      [  127.186829]  [<ffffffff810ed83b>] delete_from_page_cache+0x3b/0x80
      [  127.186832]  [<ffffffff811bc925>] truncate_hugepages+0x115/0x220
      [  127.186834]  [<ffffffff811bca43>] hugetlbfs_evict_inode+0x13/0x30
      [  127.186837]  [<ffffffff811655c7>] evict+0xa7/0x1b0
      [  127.186839]  [<ffffffff811657a3>] iput_final+0xd3/0x1f0
      [  127.186840]  [<ffffffff811658f9>] iput+0x39/0x50
      [  127.186842]  [<ffffffff81162708>] d_kill+0xf8/0x130
      [  127.186843]  [<ffffffff81162812>] dput+0xd2/0x1a0
      [  127.186845]  [<ffffffff8114e2d0>] __fput+0x170/0x230
      [  127.186848]  [<ffffffff81236e0e>] ? rb_erase+0xce/0x150
      [  127.186849]  [<ffffffff8114e3ad>] fput+0x1d/0x30
      [  127.186851]  [<ffffffff81117db7>] remove_vma+0x37/0x80
      [  127.186853]  [<ffffffff81119182>] do_munmap+0x2d2/0x360
      [  127.186855]  [<ffffffff811cc639>] sys_shmdt+0xc9/0x170
      [  127.186857]  [<ffffffff81410a39>] system_call_fastpath+0x16/0x1b
      [  127.186858] Code: 0f 1f 44 00 00 48 8b 43 08 48 8b 00 48 8b 40 28 8b b0 40 03 00 00 85 f6 0f 88 df fe ff ff 48 89 df e8 e7 cb 05 00 e9 d2 fe ff ff <0f> 0b 55 83 e2 fd 48 89 e5 48 83 ec 30 48 89 5d d8 4c 89 65 e0
      [  127.186868] RIP  [<ffffffff810ed6ce>] __delete_from_page_cache+0x15e/0x160
      [  127.186870]  RSP <ffff8804144b5c08>
      [  127.186871] ---[ end trace 7cbac5d1db69f426 ]---
      
      The bug is a race and not always easy to reproduce.  To reproduce it I was
      doing the following on a single socket I7-based machine with 16G of RAM.
      
      $ hugeadm --pool-pages-max DEFAULT:13G
      $ echo $((18*1048576*1024)) > /proc/sys/kernel/shmmax
      $ echo $((18*1048576*1024)) > /proc/sys/kernel/shmall
      $ for i in `seq 1 9000`; do ./hugetlbfs-test; done
      
      On my particular machine, it usually triggers within 10 minutes but
      enabling debug options can change the timing such that it never hits.
      Once the bug is triggered, the machine is in trouble and needs to be
      rebooted.  The machine will respond but processes accessing proc like "ps
      aux" will hang due to the BUG_ON.  shutdown will also hang and needs a
      hard reset or a sysrq-b.
      
      The basic problem is a race between page table sharing and teardown.  For
      the most part page table sharing depends on i_mmap_mutex.  In some cases,
      it is also taking the mm->page_table_lock for the PTE updates but with
      shared page tables, it is the i_mmap_mutex that is more important.
      
      Unfortunately it appears to be also insufficient. Consider the following
      situation
      
      Process A					Process B
      ---------					---------
      hugetlb_fault					shmdt
        						LockWrite(mmap_sem)
          						  do_munmap
      						    unmap_region
      						      unmap_vmas
      						        unmap_single_vma
      						          unmap_hugepage_range
            						            Lock(i_mmap_mutex)
      							    Lock(mm->page_table_lock)
      							    huge_pmd_unshare/unmap tables <--- (1)
      							    Unlock(mm->page_table_lock)
            						            Unlock(i_mmap_mutex)
        huge_pte_alloc				      ...
          Lock(i_mmap_mutex)				      ...
          vma_prio_walk, find svma, spte		      ...
          Lock(mm->page_table_lock)			      ...
          share spte					      ...
          Unlock(mm->page_table_lock)			      ...
          Unlock(i_mmap_mutex)			      ...
        hugetlb_no_page									  <--- (2)
      						      free_pgtables
      						        unlink_file_vma
      							hugetlb_free_pgd_range
      						    remove_vma_list
      
      In this scenario, it is possible for Process A to share page tables with
      Process B that is trying to tear them down.  The i_mmap_mutex on its own
      does not prevent Process A walking Process B's page tables.  At (1) above,
      the page tables are not shared yet so it unmaps the PMDs.  Process A sets
      up page table sharing and at (2) faults a new entry.  Process B then trips
      up on it in free_pgtables.
      
      This patch fixes the problem by adding a new function
      __unmap_hugepage_range_final that is only called when the VMA is about to
      be destroyed.  This function clears VM_MAYSHARE during
      unmap_hugepage_range() under the i_mmap_mutex.  This makes the VMA
      ineligible for sharing and avoids the race.  Superficially this looks like
      it would then be vunerable to truncate and madvise issues but hugetlbfs
      has its own truncate handlers so does not use unmap_mapping_range() and
      does not support madvise(DONTNEED).
      
      This should be treated as a -stable candidate if it is merged.
      
      Test program is as follows. The test case was mostly written by Michal
      Hocko with a few minor changes to reproduce this bug.
      
      ==== CUT HERE ====
      
      static size_t huge_page_size = (2UL << 20);
      static size_t nr_huge_page_A = 512;
      static size_t nr_huge_page_B = 5632;
      
      unsigned int get_random(unsigned int max)
      {
      	struct timeval tv;
      
      	gettimeofday(&tv, NULL);
      	srandom(tv.tv_usec);
      	return random() % max;
      }
      
      static void play(void *addr, size_t size)
      {
      	unsigned char *start = addr,
      		      *end = start + size,
      		      *a;
      	start += get_random(size/2);
      
      	/* we could itterate on huge pages but let's give it more time. */
      	for (a = start; a < end; a += 4096)
      		*a = 0;
      }
      
      int main(int argc, char **argv)
      {
      	key_t key = IPC_PRIVATE;
      	size_t sizeA = nr_huge_page_A * huge_page_size;
      	size_t sizeB = nr_huge_page_B * huge_page_size;
      	int shmidA, shmidB;
      	void *addrA = NULL, *addrB = NULL;
      	int nr_children = 300, n = 0;
      
      	if ((shmidA = shmget(key, sizeA, IPC_CREAT|SHM_HUGETLB|0660)) == -1) {
      		perror("shmget:");
      		return 1;
      	}
      
      	if ((addrA = shmat(shmidA, addrA, SHM_R|SHM_W)) == (void *)-1UL) {
      		perror("shmat");
      		return 1;
      	}
      	if ((shmidB = shmget(key, sizeB, IPC_CREAT|SHM_HUGETLB|0660)) == -1) {
      		perror("shmget:");
      		return 1;
      	}
      
      	if ((addrB = shmat(shmidB, addrB, SHM_R|SHM_W)) == (void *)-1UL) {
      		perror("shmat");
      		return 1;
      	}
      
      fork_child:
      	switch(fork()) {
      		case 0:
      			switch (n%3) {
      			case 0:
      				play(addrA, sizeA);
      				break;
      			case 1:
      				play(addrB, sizeB);
      				break;
      			case 2:
      				break;
      			}
      			break;
      		case -1:
      			perror("fork:");
      			break;
      		default:
      			if (++n < nr_children)
      				goto fork_child;
      			play(addrA, sizeA);
      			break;
      	}
      	shmdt(addrA);
      	shmdt(addrB);
      	do {
      		wait(NULL);
      	} while (--n > 0);
      	shmctl(shmidA, IPC_RMID, NULL);
      	shmctl(shmidB, IPC_RMID, NULL);
      	return 0;
      }
      
      [akpm@linux-foundation.org: name the declaration's args, fix CONFIG_HUGETLBFS=n build]
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.cz>
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2:
       - Adjust context
       - Drop the mmu_gather * parameters]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      6f72a41f
    • Xiao Guangrong's avatar
      mm: mmu_notifier: fix freed page still mapped in secondary MMU · 5d7a6068
      Xiao Guangrong authored
      commit 3ad3d901 upstream.
      
      mmu_notifier_release() is called when the process is exiting.  It will
      delete all the mmu notifiers.  But at this time the page belonging to the
      process is still present in page tables and is present on the LRU list, so
      this race will happen:
      
            CPU 0                 CPU 1
      mmu_notifier_release:    try_to_unmap:
         hlist_del_init_rcu(&mn->hlist);
                                  ptep_clear_flush_notify:
                                        mmu nofifler not found
                                  free page  !!!!!!
                                  /*
                                   * At the point, the page has been
                                   * freed, but it is still mapped in
                                   * the secondary MMU.
                                   */
      
        mn->ops->release(mn, mm);
      
      Then the box is not stable and sometimes we can get this bug:
      
      [  738.075923] BUG: Bad page state in process migrate-perf  pfn:03bec
      [  738.075931] page:ffffea00000efb00 count:0 mapcount:0 mapping:          (null) index:0x8076
      [  738.075936] page flags: 0x20000000000014(referenced|dirty)
      
      The same issue is present in mmu_notifier_unregister().
      
      We can call ->release before deleting the notifier to ensure the page has
      been unmapped from the secondary MMU before it is freed.
      Signed-off-by: default avatarXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5d7a6068
    • Xishi Qiu's avatar
      mm: setup pageblock_order before it's used by sparsemem · 3927df7e
      Xishi Qiu authored
      commit ca57df79 upstream.
      
      On architectures with CONFIG_HUGETLB_PAGE_SIZE_VARIABLE set, such as
      Itanium, pageblock_order is a variable with default value of 0.  It's set
      to the right value by set_pageblock_order() in function
      free_area_init_core().
      
      But pageblock_order may be used by sparse_init() before free_area_init_core()
      is called along path:
      sparse_init()
          ->sparse_early_usemaps_alloc_node()
      	->usemap_size()
      	    ->SECTION_BLOCKFLAGS_BITS
      		->((1UL << (PFN_SECTION_SHIFT - pageblock_order)) *
      NR_PAGEBLOCK_BITS)
      
      The uninitialized pageblock_size will cause memory wasting because
      usemap_size() returns a much bigger value then it's really needed.
      
      For example, on an Itanium platform,
      sparse_init() pageblock_order=0 usemap_size=24576
      free_area_init_core() before pageblock_order=0, usemap_size=24576
      free_area_init_core() after pageblock_order=12, usemap_size=8
      
      That means 24K memory has been wasted for each section, so fix it by calling
      set_pageblock_order() from sparse_init().
      Signed-off-by: default avatarXishi Qiu <qiuxishi@huawei.com>
      Signed-off-by: default avatarJiang Liu <liuj97@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Keping Chen <chenkeping@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      3927df7e
    • Andrew Morton's avatar
      mm/page_alloc.c: remove pageblock_default_order() · 27c4b68b
      Andrew Morton authored
      commit 955c1cd7 upstream.
      
      This has always been broken: one version takes an unsigned int and the
      other version takes no arguments.  This bug was hidden because one
      version of set_pageblock_order() was a macro which doesn't evaluate its
      argument.
      
      Simplify it all and remove pageblock_default_order() altogether.
      Reported-by: default avatarrajman mekaco <rajman.mekaco@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      27c4b68b
    • Mark Brown's avatar
      ASoC: wm8962: Allow VMID time to fully ramp · 2145a2c9
      Mark Brown authored
      commit 9d40e558 upstream.
      
      Required for reliable power up from cold.
      Signed-off-by: default avatarMark Brown <broonie@opensource.wolfsonmicro.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2145a2c9
    • Colin Ian King's avatar
      USB: echi-dbgp: increase the controller wait time to come out of halt. · 0776e23d
      Colin Ian King authored
      commit f96a4216 upstream.
      
      The default 10 microsecond delay for the controller to come out of
      halt in dbgp_ehci_startup is too short, so increase it to 1 millisecond.
      
      This is based on emperical testing on various USB debug ports on
      modern machines such as a Lenovo X220i and an Ivybridge development
      platform that needed to wait ~450-950 microseconds.
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarJason Wessel <jason.wessel@windriver.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      0776e23d
    • Russell King's avatar
      ARM: Fix undefined instruction exception handling · 83358b9b
      Russell King authored
      commit 15ac49b6 upstream.
      
      While trying to get a v3.5 kernel booted on the cubox, I noticed that
      VFP does not work correctly with VFP bounce handling.  This is because
      of the confusion over 16-bit vs 32-bit instructions, and where PC is
      supposed to point to.
      
      The rule is that FP handlers are entered with regs->ARM_pc pointing at
      the _next_ instruction to be executed.  However, if the exception is
      not handled, regs->ARM_pc points at the faulting instruction.
      
      This is easy for ARM mode, because we know that the next instruction and
      previous instructions are separated by four bytes.  This is not true of
      Thumb2 though.
      
      Since all FP instructions are 32-bit in Thumb2, it makes things easy.
      We just need to select the appropriate adjustment.  Do this by moving
      the adjustment out of do_undefinstr() into the assembly code, as only
      the assembly code knows whether it's dealing with a 32-bit or 16-bit
      instruction.
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      83358b9b
    • Will Deacon's avatar
      ARM: 7478/1: errata: extend workaround for erratum #720789 · 1b481958
      Will Deacon authored
      commit 5a783cbc upstream.
      
      Commit cdf357f1 ("ARM: 6299/1: errata: TLBIASIDIS and TLBIMVAIS
      operations can broadcast a faulty ASID") replaced by-ASID TLB flushing
      operations with all-ASID variants to workaround A9 erratum #720789.
      
      This patch extends the workaround to include the tlb_range operations,
      which were overlooked by the original patch.
      Tested-by: default avatarSteve Capper <steve.capper@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1b481958
    • Colin Cross's avatar
      ARM: 7477/1: vfp: Always save VFP state in vfp_pm_suspend on UP · cdab6eed
      Colin Cross authored
      commit 24b35521 upstream.
      
      vfp_pm_suspend should save the VFP state in suspend after
      any lazy context switch.  If it only saves when the VFP is enabled,
      the state can get lost when, on a UP system:
        Thread 1 uses the VFP
        Context switch occurs to thread 2, VFP is disabled but the
           VFP context is not saved
        Thread 2 initiates suspend
        vfp_pm_suspend is called with the VFP disabled, and the unsaved
           VFP context of Thread 1 in the registers
      
      Modify vfp_pm_suspend to save the VFP context whenever
      vfp_current_hw_state is not NULL.
      
      Includes a fix from Ido Yariv <ido@wizery.com>, who pointed out that on
      SMP systems, the state pointer can be pointing to a freed task struct if
      a task exited on another cpu, fixed by using #ifndef CONFIG_SMP in the
      new if clause.
      
      Cc: Barry Song <bs14@csr.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ido Yariv <ido@wizery.com>
      Cc: Daniel Drake <dsd@laptop.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarColin Cross <ccross@android.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      cdab6eed
    • Colin Cross's avatar
      ARM: 7476/1: vfp: only clear vfp state for current cpu in vfp_pm_suspend · fc257bc5
      Colin Cross authored
      commit a84b895a upstream.
      
      vfp_pm_suspend runs on each cpu, only clear the hardware state
      pointer for the current cpu.  Prevents a possible crash if one
      cpu clears the hw state pointer when another cpu has already
      checked if it is valid.
      Signed-off-by: default avatarColin Cross <ccross@android.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      fc257bc5
    • Will Deacon's avatar
      ARM: 7467/1: mutex: use generic xchg-based implementation for ARMv6+ · c51cf242
      Will Deacon authored
      commit a76d7bd9 upstream.
      
      The open-coded mutex implementation for ARMv6+ cores suffers from a
      severe lack of barriers, so in the uncontended case we don't actually
      protect any accesses performed during the critical section.
      
      Furthermore, the code is largely a duplication of the ARMv6+ atomic_dec
      code but optimised to remove a branch instruction, as the mutex fastpath
      was previously inlined. Now that this is executed out-of-line, we can
      reuse the atomic access code for the locking (in fact, we use the xchg
      code as this produces shorter critical sections).
      
      This patch uses the generic xchg based implementation for mutexes on
      ARMv6+, which introduces barriers to the lock/unlock operations and also
      has the benefit of removing a fair amount of inline assembly code.
      Acked-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarNicolas Pitre <nico@linaro.org>
      Reported-by: default avatarShan Kang <kangshan0910@gmail.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c51cf242
    • Shawn Guo's avatar
      ARM: 7466/1: disable interrupt before spinning endlessly · 2bd8f381
      Shawn Guo authored
      commit 98bd8b96 upstream.
      
      The CPU will endlessly spin at the end of machine_halt and
      machine_restart calls.  However, this will lead to a soft lockup
      warning after about 20 seconds, if CONFIG_LOCKUP_DETECTOR is enabled,
      as system timer is still alive.
      
      Disable interrupt before going to spin endlessly, so that the lockup
      warning will never be seen.
      Reported-by: default avatarMarek Vasut <marex@denx.de>
      Signed-off-by: default avatarShawn Guo <shawn.guo@linaro.org>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2bd8f381
    • Stanislav Kinsbursky's avatar
      SUNRPC: return negative value in case rpcbind client creation error · baea1631
      Stanislav Kinsbursky authored
      commit caea33da upstream.
      
      Without this patch kernel will panic on LockD start, because lockd_up() checks
      lockd_up_net() result for negative value.
      From my pow it's better to return negative value from rpcbind routines instead
      of replacing all such checks like in lockd_up().
      Signed-off-by: default avatarStanislav Kinsbursky <skinsbursky@parallels.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      baea1631
    • Ryusuke Konishi's avatar
      nilfs2: fix deadlock issue between chcp and thaw ioctls · 58546c72
      Ryusuke Konishi authored
      commit 572d8b39 upstream.
      
      An fs-thaw ioctl causes deadlock with a chcp or mkcp -s command:
      
       chcp            D ffff88013870f3d0     0  1325   1324 0x00000004
       ...
       Call Trace:
         nilfs_transaction_begin+0x11c/0x1a0 [nilfs2]
         wake_up_bit+0x20/0x20
         copy_from_user+0x18/0x30 [nilfs2]
         nilfs_ioctl_change_cpmode+0x7d/0xcf [nilfs2]
         nilfs_ioctl+0x252/0x61a [nilfs2]
         do_page_fault+0x311/0x34c
         get_unmapped_area+0x132/0x14e
         do_vfs_ioctl+0x44b/0x490
         __set_task_blocked+0x5a/0x61
         vm_mmap_pgoff+0x76/0x87
         __set_current_blocked+0x30/0x4a
         sys_ioctl+0x4b/0x6f
         system_call_fastpath+0x16/0x1b
       thaw            D ffff88013870d890     0  1352   1351 0x00000004
       ...
       Call Trace:
         rwsem_down_failed_common+0xdb/0x10f
         call_rwsem_down_write_failed+0x13/0x20
         down_write+0x25/0x27
         thaw_super+0x13/0x9e
         do_vfs_ioctl+0x1f5/0x490
         vm_mmap_pgoff+0x76/0x87
         sys_ioctl+0x4b/0x6f
         filp_close+0x64/0x6c
         system_call_fastpath+0x16/0x1b
      
      where the thaw ioctl deadlocked at thaw_super() when called while chcp was
      waiting at nilfs_transaction_begin() called from
      nilfs_ioctl_change_cpmode().  This deadlock is 100% reproducible.
      
      This is because nilfs_ioctl_change_cpmode() first locks sb->s_umount in
      read mode and then waits for unfreezing in nilfs_transaction_begin(),
      whereas thaw_super() locks sb->s_umount in write mode.  The locking of
      sb->s_umount here was intended to make snapshot mounts and the downgrade
      of snapshots to checkpoints exclusive.
      
      This fixes the deadlock issue by replacing the sb->s_umount usage in
      nilfs_ioctl_change_cpmode() with a dedicated mutex which protects snapshot
      mounts.
      Signed-off-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Tested-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      58546c72
    • Dan Rosenberg's avatar
      lib/vsprintf.c: kptr_restrict: fix pK-error in SysRq show-all-timers(Q) · 1b772a14
      Dan Rosenberg authored
      commit 3715c530 upstream.
      
      When using ALT+SysRq+Q all the pointers are replaced with "pK-error" like
      this:
      
      	[23153.208033]   .base:               pK-error
      
      with echo h > /proc/sysrq-trigger it works:
      
      	[23107.776363]   .base:       ffff88023e60d540
      
      The intent behind this behavior was to return "pK-error" in cases where
      the %pK format specifier was used in interrupt context, because the
      CAP_SYSLOG check wouldn't be meaningful.  Clearly this should only apply
      when kptr_restrict is actually enabled though.
      Reported-by: default avatarStevie Trujillo <stevie.trujillo@gmail.com>
      Signed-off-by: default avatarDan Rosenberg <dan.j.rosenberg@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1b772a14
    • Greg Pearson's avatar
      pcdp: use early_ioremap/early_iounmap to access pcdp table · ffb709e6
      Greg Pearson authored
      commit 6c4088ac upstream.
      
      efi_setup_pcdp_console() is called during boot to parse the HCDP/PCDP
      EFI system table and setup an early console for printk output.  The
      routine uses ioremap/iounmap to setup access to the HCDP/PCDP table
      information.
      
      The call to ioremap is happening early in the boot process which leads
      to a panic on x86_64 systems:
      
          panic+0x01ca
          do_exit+0x043c
          oops_end+0x00a7
          no_context+0x0119
          __bad_area_nosemaphore+0x0138
          bad_area_nosemaphore+0x000e
          do_page_fault+0x0321
          page_fault+0x0020
          reserve_memtype+0x02a1
          __ioremap_caller+0x0123
          ioremap_nocache+0x0012
          efi_setup_pcdp_console+0x002b
          setup_arch+0x03a9
          start_kernel+0x00d4
          x86_64_start_reservations+0x012c
          x86_64_start_kernel+0x00fe
      
      This replaces the calls to ioremap/iounmap in efi_setup_pcdp_console()
      with calls to early_ioremap/early_iounmap which can be called during
      early boot.
      
      This patch was tested on an x86_64 prototype system which uses the
      HCDP/PCDP table for early console setup.
      Signed-off-by: default avatarGreg Pearson <greg.pearson@hp.com>
      Acked-by: default avatarKhalid Aziz <khalid.aziz@hp.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ffb709e6
    • NeilBrown's avatar
      md/raid1: don't abort a resync on the first badblock. · 898ece8e
      NeilBrown authored
      commit b7219ccb upstream.
      
      If a resync of a RAID1 array with 2 devices finds a known bad block
      one device it will neither read from, or write to, that device for
      this block offset.
      So there will be one read_target (The other device) and zero write
      targets.
      This condition causes md/raid1 to abort the resync assuming that it
      has finished - without known bad blocks this would be true.
      
      When there are no write targets because of the presence of bad blocks
      we should only skip over the area covered by the bad block.
      RAID10 already gets this right, raid1 doesn't.  Or didn't.
      
      As this can cause a 'sync' to abort early and appear to have succeeded
      it could lead to some data corruption, so it suitable for -stable.
      Reported-by: default avatarAlexander Lyakas <alex.bolshoy@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      898ece8e
    • Jeff Layton's avatar
      nfs: skip commit in releasepage if we're freeing memory for fs-related reasons · 1c88c581
      Jeff Layton authored
      commit 5cf02d09 upstream.
      
      We've had some reports of a deadlock where rpciod ends up with a stack
      trace like this:
      
          PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
           #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
           #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
           #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
           #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
           #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
           #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
           #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
           #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
           #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
           #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
          #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
          #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
          #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
          #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
          #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
          #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
          #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
          #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
          #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
          #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
          #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
          #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
          #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
          #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
          #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
          #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca
      
      rpciod is trying to allocate memory for a new socket to talk to the
      server. The VM ends up calling ->releasepage to get more memory, and it
      tries to do a blocking commit. That commit can't succeed however without
      a connected socket, so we deadlock.
      
      Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
      socket allocation, and having nfs_release_page check for that flag when
      deciding whether to do a commit call. Also, set PF_FSTRANS
      unconditionally in rpc_async_schedule since that function can also do
      allocations sometimes.
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1c88c581