Commits · 58a07b8e5a1eae96eef6e14d544a71698afc52b2 · Kirill Smelkov / linux

27 Feb, 2015 12 commits

bridge: dont send notification when skb->len == 0 in rtnl_bridge_notify · 58a07b8e

Roopa Prabhu authored Jan 28, 2015

[ Upstream commit 59ccaaaa ]

Reported in: https://bugzilla.kernel.org/show_bug.cgi?id=92081

This patch avoids calling rtnl_notify if the device ndo_bridge_getlink
handler does not return any bytes in the skb.

Alternately, the skb->len check can be moved inside rtnl_notify.

For the bridge vlan case described in 92081, there is also a fix needed
in bridge driver to generate a proper notification. Will fix that in
subsequent patch.

v2: rebase patch on net tree
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

58a07b8e

net: don't OOPS on socket aio · 49871587

Christoph Hellwig authored Jan 27, 2015

[ Upstream commit 06539d30 ]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

49871587

bnx2x: fix napi poll return value for repoll · db99ad88

Govindarajulu Varadarajan authored Jan 25, 2015

[ Upstream commit 24e579c8 ]

With the commit d75b1ade ("net: less interrupt masking in NAPI") napi
repoll is done only when work_done == budget. When in busy_poll is we return 0
in napi_poll. We should return budget.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

db99ad88

ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too · c4dfc9f3

Hannes Frederic Sowa authored Jan 26, 2015

[ Upstream commit 6e9e16e6 ]

Lubomir Rintel reported that during replacing a route the interface
reference counter isn't correctly decremented.

To quote bug <https://bugzilla.kernel.org/show_bug.cgi?id=91941>:
| [root@rhel7-5 lkundrak]# sh -x lal
| + ip link add dev0 type dummy
| + ip link set dev0 up
| + ip link add dev1 type dummy
| + ip link set dev1 up
| + ip addr add 2001:db8:8086::2/64 dev dev0
| + ip route add 2001:db8:8086::/48 dev dev0 proto static metric 20
| + ip route add 2001:db8:8088::/48 dev dev1 proto static metric 10
| + ip route replace 2001:db8:8086::/48 dev dev1 proto static metric 20
| + ip link del dev0 type dummy
| Message from syslogd@rhel7-5 at Jan 23 10:54:41 ...
|  kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2
|
| Message from syslogd@rhel7-5 at Jan 23 10:54:51 ...
|  kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2

During replacement of a rt6_info we must walk all parent nodes and check
if the to be replaced rt6_info got propagated. If so, replace it with
an alive one.

Fixes: 4a287eba ("IPv6 routing, NLM_F_* flag support: REPLACE and EXCL flags support, warn about missing CREATE flag")
Reported-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Tested-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c4dfc9f3

ping: Fix race in free in receive path · d0db2610

subashab@codeaurora.org authored Jan 23, 2015

[ Upstream commit fc752f1f ]

An exception is seen in ICMP ping receive path where the skb
destructor sock_rfree() tries to access a freed socket. This happens
because ping_rcv() releases socket reference with sock_put() and this
internally frees up the socket. Later icmp_rcv() will try to free the
skb and as part of this, skb destructor is called and which leads
to a kernel panic as the socket is freed already in ping_rcv().

-->|exception
-007|sk_mem_uncharge
-007|sock_rfree
-008|skb_release_head_state
-009|skb_release_all
-009|__kfree_skb
-010|kfree_skb
-011|icmp_rcv
-012|ip_local_deliver_finish

Fix this incorrect free by cloning this skb and processing this cloned
skb instead.

This patch was suggested by Eric Dumazet
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

d0db2610

udp_diag: Fix socket skipping within chain · b55e8b26

Herbert Xu authored Jan 24, 2015

[ Upstream commit 86f3cddb ]

While working on rhashtable walking I noticed that the UDP diag
dumping code is buggy.  In particular, the socket skipping within
a chain never happens, even though we record the number of sockets
that should be skipped.

As this code was supposedly copied from TCP, this patch does what
TCP does and resets num before we walk a chain.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b55e8b26

ipv4: try to cache dst_entries which would cause a redirect · ee6db0ad

Hannes Frederic Sowa authored Jan 23, 2015

[ Upstream commit df4d9254 ]

Not caching dst_entries which cause redirects could be exploited by hosts
on the same subnet, causing a severe DoS attack. This effect aggravated
since commit f8864972 ("ipv4: fix dst race in sk_dst_get()").

Lookups causing redirects will be allocated with DST_NOCACHE set which
will force dst_release to free them via RCU.  Unfortunately waiting for
RCU grace period just takes too long, we can end up with >1M dst_entries
waiting to be released and the system will run OOM. rcuos threads cannot
catch up under high softirq load.

Attaching the flag to emit a redirect later on to the specific skb allows
us to cache those dst_entries thus reducing the pressure on allocation
and deallocation.

This issue was discovered by Marcelo Leitner.

Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Marcelo Leitner <mleitner@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ee6db0ad

net: sctp: fix slab corruption from use after free on INIT collisions · faf1368d

Daniel Borkmann authored Jan 22, 2015

[ Upstream commit 600ddd68 ]

When hitting an INIT collision case during the 4WHS with AUTH enabled, as
already described in detail in commit 1be9a950 ("net: sctp: inherit
auth_capable on INIT collisions"), it can happen that we occasionally
still remotely trigger the following panic on server side which seems to
have been uncovered after the fix from commit 1be9a950 ...

[  533.876389] BUG: unable to handle kernel paging request at 00000000ffffffff
[  533.913657] IP: [<ffffffff811ac385>] __kmalloc+0x95/0x230
[  533.940559] PGD 5030f2067 PUD 0
[  533.957104] Oops: 0000 [#1] SMP
[  533.974283] Modules linked in: sctp mlx4_en [...]
[  534.939704] Call Trace:
[  534.951833]  [<ffffffff81294e30>] ? crypto_init_shash_ops+0x60/0xf0
[  534.984213]  [<ffffffff81294e30>] crypto_init_shash_ops+0x60/0xf0
[  535.015025]  [<ffffffff8128c8ed>] __crypto_alloc_tfm+0x6d/0x170
[  535.045661]  [<ffffffff8128d12c>] crypto_alloc_base+0x4c/0xb0
[  535.074593]  [<ffffffff8160bd42>] ? _raw_spin_lock_bh+0x12/0x50
[  535.105239]  [<ffffffffa0418c11>] sctp_inet_listen+0x161/0x1e0 [sctp]
[  535.138606]  [<ffffffff814e43bd>] SyS_listen+0x9d/0xb0
[  535.166848]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b

... or depending on the the application, for example this one:

[ 1370.026490] BUG: unable to handle kernel paging request at 00000000ffffffff
[ 1370.026506] IP: [<ffffffff811ab455>] kmem_cache_alloc+0x75/0x1d0
[ 1370.054568] PGD 633c94067 PUD 0
[ 1370.070446] Oops: 0000 [#1] SMP
[ 1370.085010] Modules linked in: sctp kvm_amd kvm [...]
[ 1370.963431] Call Trace:
[ 1370.974632]  [<ffffffff8120f7cf>] ? SyS_epoll_ctl+0x53f/0x960
[ 1371.000863]  [<ffffffff8120f7cf>] SyS_epoll_ctl+0x53f/0x960
[ 1371.027154]  [<ffffffff812100d3>] ? anon_inode_getfile+0xd3/0x170
[ 1371.054679]  [<ffffffff811e3d67>] ? __alloc_fd+0xa7/0x130
[ 1371.080183]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b

With slab debugging enabled, we can see that the poison has been overwritten:

[  669.826368] BUG kmalloc-128 (Tainted: G        W     ): Poison overwritten
[  669.826385] INFO: 0xffff880228b32e50-0xffff880228b32e50. First byte 0x6a instead of 0x6b
[  669.826414] INFO: Allocated in sctp_auth_create_key+0x23/0x50 [sctp] age=3 cpu=0 pid=18494
[  669.826424]  __slab_alloc+0x4bf/0x566
[  669.826433]  __kmalloc+0x280/0x310
[  669.826453]  sctp_auth_create_key+0x23/0x50 [sctp]
[  669.826471]  sctp_auth_asoc_create_secret+0xcb/0x1e0 [sctp]
[  669.826488]  sctp_auth_asoc_init_active_key+0x68/0xa0 [sctp]
[  669.826505]  sctp_do_sm+0x29d/0x17c0 [sctp] [...]
[  669.826629] INFO: Freed in kzfree+0x31/0x40 age=1 cpu=0 pid=18494
[  669.826635]  __slab_free+0x39/0x2a8
[  669.826643]  kfree+0x1d6/0x230
[  669.826650]  kzfree+0x31/0x40
[  669.826666]  sctp_auth_key_put+0x19/0x20 [sctp]
[  669.826681]  sctp_assoc_update+0x1ee/0x2d0 [sctp]
[  669.826695]  sctp_do_sm+0x674/0x17c0 [sctp]

Since this only triggers in some collision-cases with AUTH, the problem at
heart is that sctp_auth_key_put() on asoc->asoc_shared_key is called twice
when having refcnt 1, once directly in sctp_assoc_update() and yet again
from within sctp_auth_asoc_init_active_key() via sctp_assoc_update() on
the already kzfree'd memory, which is also consistent with the observation
of the poison decrease from 0x6b to 0x6a (note: the overwrite is detected
at a later point in time when poison is checked on new allocation).

Reference counting of auth keys revisited:

Shared keys for AUTH chunks are being stored in endpoints and associations
in endpoint_shared_keys list. On endpoint creation, a null key is being
added; on association creation, all endpoint shared keys are being cached
and thus cloned over to the association. struct sctp_shared_key only holds
a pointer to the actual key bytes, that is, struct sctp_auth_bytes which
keeps track of users internally through refcounting. Naturally, on assoc
or enpoint destruction, sctp_shared_key are being destroyed directly and
the reference on sctp_auth_bytes dropped.

User space can add keys to either list via setsockopt(2) through struct
sctp_authkey and by passing that to sctp_auth_set_key() which replaces or
adds a new auth key. There, sctp_auth_create_key() creates a new sctp_auth_bytes
with refcount 1 and in case of replacement drops the reference on the old
sctp_auth_bytes. A key can be set active from user space through setsockopt()
on the id via sctp_auth_set_active_key(), which iterates through either
endpoint_shared_keys and in case of an assoc, invokes (one of various places)
sctp_auth_asoc_init_active_key().

sctp_auth_asoc_init_active_key() computes the actual secret from local's
and peer's random, hmac and shared key parameters and returns a new key
directly as sctp_auth_bytes, that is asoc->asoc_shared_key, plus drops
the reference if there was a previous one. The secret, which where we
eventually double drop the ref comes from sctp_auth_asoc_set_secret() with
intitial refcount of 1, which also stays unchanged eventually in
sctp_assoc_update(). This key is later being used for crypto layer to
set the key for the hash in crypto_hash_setkey() from sctp_auth_calculate_hmac().

To close the loop: asoc->asoc_shared_key is freshly allocated secret
material and independant of the sctp_shared_key management keeping track
of only shared keys in endpoints and assocs. Hence, also commit 4184b2a7
("net: sctp: fix memory leak in auth key management") is independant of
this bug here since it concerns a different layer (though same structures
being used eventually). asoc->asoc_shared_key is reference dropped correctly
on assoc destruction in sctp_association_free() and when active keys are
being replaced in sctp_auth_asoc_init_active_key(), it always has a refcount
of 1. Hence, it's freed prematurely in sctp_assoc_update(). Simple fix is
to remove that sctp_auth_key_put() from there which fixes these panics.

Fixes: 730fc3d0 ("[SCTP]: Implete SCTP-AUTH parameter processing")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

faf1368d

netxen: fix netxen_nic_poll() logic · 93feca3d

Eric Dumazet authored Jan 22, 2015

[ Upstream commit 6088beef ]

NAPI poll logic now enforces that a poller returns exactly the budget
when it wants to be called again.

If a driver limits TX completion, it has to return budget as well when
the limit is hit, not the number of received packets.
Reported-and-tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: d75b1ade ("net: less interrupt masking in NAPI")
Cc: Manish Chopra <manish.chopra@qlogic.com>
Acked-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

93feca3d

ipv6: stop sending PTB packets for MTU < 1280 · 7aa58c22

Hagen Paul Pfeifer authored Jan 15, 2015

[ Upstream commit 9d289715 ]

Reduce the attack vector and stop generating IPv6 Fragment Header for
paths with an MTU smaller than the minimum required IPv6 MTU
size (1280 byte) - called atomic fragments.

See IETF I-D "Deprecating the Generation of IPv6 Atomic Fragments" [1]
for more information and how this "feature" can be misused.

[1] https://tools.ietf.org/html/draft-ietf-6man-deprecate-atomfrag-generation-00Signed-off-by: Fernando Gont <fgont@si6networks.com>
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

7aa58c22

net: rps: fix cpu unplug · 64d76f56

Eric Dumazet authored Jan 15, 2015

[ Upstream commit ac64da0b ]

softnet_data.input_pkt_queue is protected by a spinlock that
we must hold when transferring packets from victim queue to an active
one. This is because other cpus could still be trying to enqueue packets
into victim queue.

A second problem is that when we transfert the NAPI poll_list from
victim to current cpu, we absolutely need to special case the percpu
backlog, because we do not want to add complex locking to protect
process_queue : Only owner cpu is allowed to manipulate it, unless cpu
is offline.

Based on initial patch from Prasad Sodagudi & Subash Abhinov
Kasiviswanathan.

This version is better because we do not slow down packet processing,
only make migration safer.
Reported-by: Prasad Sodagudi <psodagud@codeaurora.org>
Reported-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

64d76f56

ip: zero sockaddr returned on error queue · ada98740

Willem de Bruijn authored Jan 15, 2015

[ Upstream commit f812116b ]

The sockaddr is returned in IP(V6)_RECVERR as part of errhdr. That
structure is defined and allocated on the stack as

    struct {
            struct sock_extended_err ee;
            struct sockaddr_in(6)    offender;
    } errhdr;

The second part is only initialized for certain SO_EE_ORIGIN values.
Always initialize it completely.

An MTU exceeded error on a SOCK_RAW/IPPROTO_RAW is one example that
would return uninitialized bytes.
Signed-off-by: Willem de Bruijn <willemb@google.com>

----

Also verified that there is no padding between errhdr.ee and
errhdr.offender that could leak additional kernel data.
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ada98740

11 Feb, 2015 22 commits

Linux 3.14.33 · a74f1d12
Greg Kroah-Hartman authored Feb 11, 2015

a74f1d12

crypto: crc32c - add missing crypto module alias · 28e24c6d

Mathias Krause authored Feb 10, 2015

The backport of commit 5d26a105 ("crypto: prefix module autoloading
with "crypto-"") lost the MODULE_ALIAS_CRYPTO() annotation of crc32c.c.
Add it to fix the reported filesystem related regressions.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Reported-by: Philip Müller <philm@manjaro.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Rob McCathie <rob@manjaro.org>
Cc: Luis Henriques <luis.henriques@canonical.com>
Cc: Kamal Mostafa <kamal@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

28e24c6d

x86,kvm,vmx: Preserve CR4 across VM entry · 5fb88e88

Andy Lutomirski authored Oct 08, 2014

commit d974baa3 upstream.

CR4 isn't constant; at least the TSD and PCE bits can vary.

TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks
like it's correct.

This adds a branch and a read from cr4 to each vm entry.  Because it is
extremely likely that consecutive entries into the same vcpu will have
the same host cr4 value, this fixes up the vmcs instead of restoring cr4
after the fact.  A subsequent patch will add a kernel-wide cr4 shadow,
reducing the overhead in the common case to just two memory reads and a
branch.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[wangkai: Backport to 3.10: adjust context]
Signed-off-by: Wang Kai <morgan.wang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

5fb88e88

smpboot: Add missing get_online_cpus() in smpboot_register_percpu_thread() · 4748a404

Lai Jiangshan authored Jul 31, 2014

commit 4bee9686 upstream.

The following race exists in the smpboot percpu threads management:

CPU0	      	   	     CPU1
cpu_up(2)
  get_online_cpus();
  smpboot_create_threads(2);
			     smpboot_register_percpu_thread();
			     for_each_online_cpu();
			       __smpboot_create_thread();
  __cpu_up(2);

This results in a missing per cpu thread for the newly onlined cpu2 and
in a NULL pointer dereference on a consecutive offline of that cpu.

Proctect smpboot_register_percpu_thread() with get_online_cpus() to
prevent that.

[ tglx: Massaged changelog and removed the change in
        smpboot_unregister_percpu_thread() because that's an
        optimization and therefor not stable material. ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Link: http://lkml.kernel.org/r/1406777421-12830-1-git-send-email-laijs@cn.fujitsu.comSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

4748a404

ALSA: ak411x: Fix stall in work callback · 2451dcef

Takashi Iwai authored Jan 13, 2015

commit 4161b450 upstream.

When ak4114 work calls its callback and the callback invokes
ak4114_reinit(), it stalls due to flush_delayed_work().  For avoiding
this, control the reentrance by introducing a refcount.  Also
flush_delayed_work() is replaced with cancel_delayed_work_sync().

The exactly same bug is present in ak4113.c and fixed as well.
Reported-by: Pavel Hofman <pavel.hofman@ivitera.com>
Acked-by: Jaroslav Kysela <perex@perex.cz>
Tested-by: Pavel Hofman <pavel.hofman@ivitera.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2451dcef

ASoC: sgtl5000: add delay before first I2C access · 48e81ac2

Eric Nelson authored Jan 30, 2015

commit 58cc9c9a upstream.

To quote from section 1.3.1 of the data sheet:
	The SGTL5000 has an internal reset that is deasserted
	8 SYS_MCLK cycles after all power rails have been brought
	up. After this time, communication can start

	...
	1.0us represents 8 SYS_MCLK cycles at the minimum 8.0 MHz SYS_MCLK.
Signed-off-by: Eric Nelson <eric.nelson@boundarydevices.com>
Reviewed-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

48e81ac2

ASoC: atmel_ssc_dai: fix start event for I2S mode · e23daf1c

Bo Shen authored Jan 20, 2015

commit a43bd7e1 upstream.

According to the I2S specification information as following:
  - WS = 0, channel 1 (left)
  - WS = 1, channel 2 (right)
So, the start event should be TF/RF falling edge.
Reported-by: Songjun Wu <songjun.wu@atmel.com>
Signed-off-by: Bo Shen <voice.shen@atmel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e23daf1c

lib/checksum.c: fix build for generic csum_tcpudp_nofold · 52e81bd5

karl beldan authored Jan 29, 2015

commit 9ce35779 upstream.

Fixed commit added from64to32 under _#ifndef do_csum_ but used it
under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's
robot reported TILEGX's). Move from64to32 under the latter.

Fixes: 150ae0e9 ("lib/checksum.c: fix carry in csum_tcpudp_nofold")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

52e81bd5

ext4: prevent bugon on race between write/fcntl · 07110343

Dmitry Monakhov authored Oct 30, 2014

commit a41537e6 upstream.

O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
twice inside ext4_file_write_iter() and __generic_file_write() which
result in BUG_ON inside ext4_direct_IO.

Let's initialize iocb->private unconditionally.

TESTCASE: xfstest:generic/036  https://patchwork.ozlabs.org/patch/402445/

#TYPICAL STACK TRACE:
kernel BUG at fs/ext4/inode.c:2960!
invalid opcode: 0000 [#1] SMP
Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161
Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000
RIP: 0010:[<ffffffff811fabf2>]  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
RSP: 0018:ffff88080f90bb58  EFLAGS: 00010246
RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818
RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001
RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80
R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28
FS:  00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0
Stack:
 ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30
 0000000000000200 0000000000000200 0000000000000001 0000000000000200
 ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200
Call Trace:
 [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180
 [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370
 [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400
 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
 [<ffffffff810990e5>] ? local_clock+0x25/0x30
 [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700
 [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0
 [<ffffffff811bd756>] aio_run_iocb+0x286/0x410
 [<ffffffff810990e5>] ? local_clock+0x25/0x30
 [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
 [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0
 [<ffffffff811bde3b>] do_io_submit+0x55b/0x740
 [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740
 [<ffffffff811be030>] SyS_io_submit+0x10/0x20
 [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b
Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c
24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89
RIP  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
 RSP <ffff88080f90bb58>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
[hujianyang: Backported to 3.10
 - Move initialization of iocb->private to ext4_file_write() as we don't
   have ext4_file_write_iter(), which is introduced by commit 9b884164.
 - Adjust context to make 'overwrite' changes apply to ext4_file_dio_write()
   as ext4_file_dio_write() is not move into ext4_file_write()]
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

07110343

arm64: Fix up /proc/cpuinfo · 42b34c73

Mark Rutland authored Oct 24, 2014

commit 44b82b77 upstream.

Commit d7a49086 (arm64: cpuinfo: print info for all CPUs)
attempted to clean up /proc/cpuinfo, but due to concerns regarding
further changes was reverted in commit 5e39977e (Revert "arm64:
cpuinfo: print info for all CPUs").

There are two major issues with the arm64 /proc/cpuinfo format
currently:

* The "Features" line describes (only) the 64-bit hwcaps, which is
  problematic for some 32-bit applications which attempt to parse it. As
  the same names are used for analogous ISA features (e.g. aes) despite
  these generally being architecturally unrelated, it is not possible to
  simply append the 64-bit and 32-bit hwcaps in a manner that might not
  be misleading to some applications.

  Various potential solutions have appeared in vendor kernels. Typically
  the format of the Features line varies depending on whether the task
  is 32-bit.

* Information is only printed regarding a single CPU. This does not
  match the ARM format, and does not provide sufficient information in
  big.LITTLE systems where CPUs are heterogeneous. The CPU information
  printed is queried from the current CPU's registers, which is racy
  w.r.t. cross-cpu migration.

This patch attempts to solve these issues. The following changes are
made:

* When a task with a LINUX32 personality attempts to read /proc/cpuinfo,
  the "Features" line contains the decoded 32-bit hwcaps, as with the
  arm port. Otherwise, the decoded 64-bit hwcaps are shown. This aligns
  with the behaviour of COMPAT_UTS_MACHINE and COMPAT_ELF_PLATFORM. In
  the absense of compat support, the Features line is empty.

  The set of hwcaps injected into a task's auxval are unaffected.

* Properties are printed per-cpu, as with the ARM port. The per-cpu
  information is queried from pre-recorded cpu information (as used by
  the sanity checks).

* As with the previous attempt at fixing up /proc/cpuinfo, the hardware
  field is removed. The only users so far are 32-bit applications tied
  to particular boards, so no portable applications should be affected,
  and this should prevent future tying to particular boards.

The following differences remain:

* No model_name is printed, as this cannot be queried from the hardware
  and cannot be provided in a stable fashion. Use of the CPU
  {implementor,variant,part,revision} fields is sufficient to identify a
  CPU and is portable across arm and arm64.

* The following system-wide properties are not provided, as they are not
  possible to provide generally. Programs relying on these are already
  tied to particular (32-bit only) boards:
  - Hardware
  - Revision
  - Serial

No software has yet been identified for which these remaining
differences are problematic.

Cc: Greg Hackmann <ghackmann@google.com>
Cc: Ian Campbell <ijc@hellion.org.uk>
Cc: Serban Constantinescu <serban.constantinescu@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: cross-distro@lists.linaro.org
Cc: linux-api@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

42b34c73

kconfig: Fix warning "‘jump’ may be used uninitialized" · 7b823e82

Peter Kümmel authored Nov 04, 2014

commit 2d560306 upstream.

Warning:
In file included from scripts/kconfig/zconf.tab.c:2537:0:
scripts/kconfig/menu.c: In function ‘get_symbol_str’:
scripts/kconfig/menu.c:590:18: warning: ‘jump’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     jump->offset = strlen(r->s);

Simplifies the test logic because (head && local) means (jump != 0)
and makes GCC happy when checking if the jump pointer was initialized.
Signed-off-by: Peter Kümmel <syntheticpp@gmx.net>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

7b823e82

nilfs2: fix deadlock of segment constructor over I_SYNC flag · 52e87609

Ryusuke Konishi authored Feb 05, 2015

commit 7ef3ff2f upstream.

Nilfs2 eventually hangs in a stress test with fsstress program.  This
issue was caused by the following deadlock over I_SYNC flag between
nilfs_segctor_thread() and writeback_sb_inodes():

  nilfs_segctor_thread()
    nilfs_segctor_thread_construct()
      nilfs_segctor_unlock()
        nilfs_dispose_list()
          iput()
            iput_final()
              evict()
                inode_wait_for_writeback()  * wait for I_SYNC flag

  writeback_sb_inodes()
     * set I_SYNC flag on inode->i_state
    __writeback_single_inode()
      do_writepages()
        nilfs_writepages()
          nilfs_construct_dsync_segment()
            nilfs_segctor_sync()
               * wait for completion of segment constructor
    inode_sync_complete()
       * clear I_SYNC flag after __writeback_single_inode() completed

writeback_sb_inodes() calls do_writepages() for dirty inodes after
setting I_SYNC flag on inode->i_state.  do_writepages() in turn calls
nilfs_writepages(), which can run segment constructor and wait for its
completion.  On the other hand, segment constructor calls iput(), which
can call evict() and wait for the I_SYNC flag on
inode_wait_for_writeback().

Since segment constructor doesn't know when I_SYNC will be set, it
cannot know whether iput() will block or not unless inode->i_nlink has a
non-zero count.  We can prevent evict() from being called in iput() by
implementing sop->drop_inode(), but it's not preferable to leave inodes
with i_nlink == 0 for long periods because it even defers file
truncation and inode deallocation.  So, this instead resolves the
deadlock by calling iput() asynchronously with a workqueue for inodes
with i_nlink == 0.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

52e87609

lib/checksum.c: fix carry in csum_tcpudp_nofold · 1ff733ae

karl beldan authored Jan 28, 2015

commit 150ae0e9 upstream.

The carry from the 64->32bits folding was dropped, e.g with:
saddr=0xFFFFFFFF daddr=0xFF0000FF len=0xFFFF proto=0 sum=1,
csum_tcpudp_nofold returned 0 instead of 1.
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

1ff733ae

mm: pagewalk: call pte_hole() for VM_PFNMAP during walk_page_range · 75a94c27

Shiraz Hashim authored Feb 05, 2015

commit 23aaed66 upstream.

walk_page_range() silently skips vma having VM_PFNMAP set, which leads
to undesirable behaviour at client end (who called walk_page_range).
Userspace applications get the wrong data, so the effect is like just
confusing users (if the applications just display the data) or sometimes
killing the processes (if the applications do something with
misunderstanding virtual addresses due to the wrong data.)

For example for pagemap_read, when no callbacks are called against
VM_PFNMAP vma, pagemap_read may prepare pagemap data for next virtual
address range at wrong index.

Eventually userspace may get wrong pagemap data for a task.
Corresponding to a VM_PFNMAP marked vma region, kernel may report
mappings from subsequent vma regions.  User space in turn may account
more pages (than really are) to the task.

In my case I was using procmem, procrack (Android utility) which uses
pagemap interface to account RSS pages of a task.  Due to this bug it
was giving a wrong picture for vmas (with VM_PFNMAP set).

Fixes: a9ff785e ("mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas")
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

75a94c27

Complete oplock break jobs before closing file handle · 7af7e9a4

Sachin Prabhu authored Jan 15, 2015

commit ca7df8e0 upstream.

Commit
c11f1df5
requires writers to wait for any pending oplock break handler to
complete before proceeding to write. This is done by waiting on bit
CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
cleared by the oplock break handler job queued on the workqueue once it
has completed handling the oplock break allowing writers to proceed with
writing to the file.

While testing, it was noticed that the filehandle could be closed while
there is a pending oplock break which results in the oplock break
handler on the cifsiod workqueue being cancelled before it has had a
chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
Any subsequent attempt to write to this file hangs waiting for the
CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.

We fix this by ensuring that we also clear the bit
CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
from the workqueue.

The bug was found by Red Hat QA while testing using ltp's fsstress
command.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Acked-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Steve French <steve.french@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

7af7e9a4

ARM: 8299/1: mm: ensure local active ASID is marked as allocated on rollover · 95f61468

Will Deacon authored Jan 29, 2015

commit 8e648066 upstream.

Commit e1a5848e ("ARM: 7924/1: mm: don't bother with reserved ttbr0
when running with LPAE") removed the use of the reserved TTBR0 value
for LPAE systems, since the ASID is held in the TTBR and can be updated
atomicly with the pgd of the next mm.

Unfortunately, this patch forgot to update flush_context, which
deliberately avoids marking the local active ASID as allocated, since we
used to switch via ASID zero and didn't need to allocate the ASID of
the previous mm. The side-effect of this is that we can allocate the
same ASID to the next mm and, between flushing the local TLB and updating
TTBR0, we can perform speculative TLB fills for userspace nG mappings
using the page table of the previous mm.

The consequence of this is that the next mm can erroneously hit some
mappings of the previous mm. Note that this was made significantly
harder to hit by a391263c ("ARM: 8203/1: mm: try to re-use old ASID
assignments following a rollover") but is still theoretically possible.

This patch fixes the problem by removing the code from flush_context
that forces the allocated ASID to zero for the local CPU. Many thanks
to the Broadcom guys for tracking this one down.

Fixes: e1a5848e ("ARM: 7924/1: mm: don't bother with reserved ttbr0 when running with LPAE")
Reported-by: Raymond Ngun <rngun@broadcom.com>
Tested-by: Raymond Ngun <rngun@broadcom.com>
Reviewed-by: Gregory Fong <gregory.0xf0@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

95f61468

MIPS: Fix kernel lockup or crash after CPU offline/online · d7eb804c

Hemmo Nieminen authored Jan 15, 2015

commit c7754e75 upstream.

As printk() invocation can cause e.g. a TLB miss, printk() cannot be
called before the exception handlers have been properly initialized.
This can happen e.g. when netconsole has been loaded as a kernel module
and the TLB table has been cleared when a CPU was offline.

Call cpu_report() in start_secondary() only after the exception handlers
have been initialized to fix this.

Without the patch the kernel will randomly either lockup or crash
after a CPU is onlined and the console driver is a module.
Signed-off-by: Hemmo Nieminen <hemmo.nieminen@iki.fi>
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/8953/Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

d7eb804c

MIPS: OCTEON: fix kernel crash when offlining a CPU · 1d594605

Aaro Koskinen authored Jan 15, 2015

commit 63a87fe0 upstream.

octeon_cpu_disable() will unconditionally enable interrupts when called.
We can assume that the routine is always called with interrupts disabled,
so just delete the incorrect local_irq_disable/enable().

The patch fixes the following crash when offlining a CPU:

[   93.818785] ------------[ cut here ]------------
[   93.823421] WARNING: CPU: 1 PID: 10 at kernel/smp.c:231 flush_smp_call_function_queue+0x1c4/0x1d0()
[   93.836215] Modules linked in:
[   93.839287] CPU: 1 PID: 10 Comm: migration/1 Not tainted 3.19.0-rc4-octeon-los_b5f0 #1
[   93.847212] Stack : 0000000000000001 ffffffff81b2cf90 0000000000000004 ffffffff81630000
	  0000000000000000 0000000000000000 0000000000000000 000000000000004a
	  0000000000000006 ffffffff8117e550 0000000000000000 0000000000000000
	  ffffffff81b30000 ffffffff81b26808 8000000032c77748 ffffffff81627e07
	  ffffffff81595ec8 ffffffff81b26808 000000000000000a 0000000000000001
	  0000000000000001 0000000000000003 0000000010008ce1 ffffffff815030c8
	  8000000032cbbb38 ffffffff8113d42c 0000000010008ce1 ffffffff8117f36c
	  8000000032c77300 8000000032cbba50 0000000000000001 ffffffff81503984
	  0000000000000000 0000000000000000 0000000000000000 0000000000000000
	  0000000000000000 ffffffff81121668 0000000000000000 0000000000000000
	  ...
[   93.912819] Call Trace:
[   93.915273] [<ffffffff81121668>] show_stack+0x68/0x80
[   93.920335] [<ffffffff81503984>] dump_stack+0x6c/0x90
[   93.925395] [<ffffffff8113d58c>] warn_slowpath_common+0x94/0xd8
[   93.931324] [<ffffffff811a402c>] flush_smp_call_function_queue+0x1c4/0x1d0
[   93.938208] [<ffffffff811a4128>] hotplug_cfd+0xf0/0x108
[   93.943444] [<ffffffff8115bacc>] notifier_call_chain+0x5c/0xb8
[   93.949286] [<ffffffff8113d704>] cpu_notify+0x24/0x60
[   93.954348] [<ffffffff81501738>] take_cpu_down+0x38/0x58
[   93.959670] [<ffffffff811b343c>] multi_cpu_stop+0x154/0x180
[   93.965250] [<ffffffff811b3768>] cpu_stopper_thread+0xd8/0x160
[   93.971093] [<ffffffff8115ea4c>] smpboot_thread_fn+0x1ec/0x1f8
[   93.976936] [<ffffffff8115ab04>] kthread+0xd4/0xf0
[   93.981735] [<ffffffff8111c4f0>] ret_from_kernel_thread+0x14/0x1c
[   93.987835]
[   93.989326] ---[ end trace c9e3815ee655bda9 ]---
[   93.993951] Kernel bug detected[#1]:
[   93.997533] CPU: 1 PID: 10 Comm: migration/1 Tainted: G        W      3.19.0-rc4-octeon-los_b5f0 #1
[   94.006591] task: 8000000032c77300 ti: 8000000032cb8000 task.ti: 8000000032cb8000
[   94.014081] $ 0   : 0000000000000000 0000000010000ce1 0000000000000001 ffffffff81620000
[   94.022146] $ 4   : 8000000002c72ac0 0000000000000000 00000000000001a7 ffffffff813b06f0
[   94.030210] $ 8   : ffffffff813b20d8 0000000000000000 0000000000000000 ffffffff81630000
[   94.038275] $12   : 0000000000000087 0000000000000000 0000000000000086 0000000000000000
[   94.046339] $16   : ffffffff81623168 0000000000000001 0000000000000000 0000000000000008
[   94.054405] $20   : 0000000000000001 0000000000000001 0000000000000001 0000000000000003
[   94.062470] $24   : 0000000000000038 ffffffff813b7f10
[   94.070536] $28   : 8000000032cb8000 8000000032cbbc20 0000000010008ce1 ffffffff811bcaf4
[   94.078601] Hi    : 0000000000f188e8
[   94.082179] Lo    : d4fdf3b646c09d55
[   94.085760] epc   : ffffffff811bc9d0 irq_work_run_list+0x8/0xf8
[   94.091686]     Tainted: G        W
[   94.095613] ra    : ffffffff811bcaf4 irq_work_run+0x34/0x60
[   94.101192] Status: 10000ce3	KX SX UX KERNEL EXL IE
[   94.106235] Cause : 40808034
[   94.109119] PrId  : 000d9301 (Cavium Octeon II)
[   94.113653] Modules linked in:
[   94.116721] Process migration/1 (pid: 10, threadinfo=8000000032cb8000, task=8000000032c77300, tls=0000000000000000)
[   94.127168] Stack : 8000000002c74c80 ffffffff811a4128 0000000000000001 ffffffff81635720
	  fffffffffffffff2 ffffffff8115bacc 80000000320fbce0 80000000320fbca4
	  80000000320fbc80 0000000000000002 0000000000000004 ffffffff8113d704
	  80000000320fbce0 ffffffff81501738 0000000000000003 ffffffff811b343c
	  8000000002c72aa0 8000000002c72aa8 ffffffff8159cae8 ffffffff8159caa0
	  ffffffff81650000 80000000320fbbf0 80000000320fbc80 ffffffff811b32e8
	  0000000000000000 ffffffff811b3768 ffffffff81622b80 ffffffff815148a8
	  8000000032c77300 8000000002c73e80 ffffffff815148a8 8000000032c77300
	  ffffffff81622b80 ffffffff815148a8 8000000032c77300 ffffffff81503f48
	  ffffffff8115ea0c ffffffff81620000 0000000000000000 ffffffff81174d64
	  ...
[   94.192771] Call Trace:
[   94.195222] [<ffffffff811bc9d0>] irq_work_run_list+0x8/0xf8
[   94.200802] [<ffffffff811bcaf4>] irq_work_run+0x34/0x60
[   94.206036] [<ffffffff811a4128>] hotplug_cfd+0xf0/0x108
[   94.211269] [<ffffffff8115bacc>] notifier_call_chain+0x5c/0xb8
[   94.217111] [<ffffffff8113d704>] cpu_notify+0x24/0x60
[   94.222171] [<ffffffff81501738>] take_cpu_down+0x38/0x58
[   94.227491] [<ffffffff811b343c>] multi_cpu_stop+0x154/0x180
[   94.233072] [<ffffffff811b3768>] cpu_stopper_thread+0xd8/0x160
[   94.238914] [<ffffffff8115ea4c>] smpboot_thread_fn+0x1ec/0x1f8
[   94.244757] [<ffffffff8115ab04>] kthread+0xd4/0xf0
[   94.249555] [<ffffffff8111c4f0>] ret_from_kernel_thread+0x14/0x1c
[   94.255654]
[   94.257146]
Code: a2423c40  40026000  30420001 <00020336> dc820000  10400037  00000000  0000010f  0000010f
[   94.267183] ---[ end trace c9e3815ee655bdaa ]---
[   94.271804] Fatal exception: panic in 5 seconds
Reported-by: Hemmo Nieminen <hemmo.nieminen@iki.fi>
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/8952/Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

1d594605

MIPS: IRQ: Fix disable_irq on CPU IRQs · f939ec03

Felix Fietkau authored Jan 15, 2015

commit a3e6c1ef upstream.

If the irq_chip does not define .irq_disable, any call to disable_irq
will defer disabling the IRQ until it fires while marked as disabled.
This assumes that the handler function checks for this condition, which
handle_percpu_irq does not. In this case, calling disable_irq leads to
an IRQ storm, if the interrupt fires while disabled.

This optimization is only useful when disabling the IRQ is slow, which
is not true for the MIPS CPU IRQ.

Disable this optimization by implementing .irq_disable and .irq_enable
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/8949/Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

f939ec03

PCI: Add NEC variants to Stratus ftServer PCIe DMI check · 65456b7b

Charlotte Richardson authored Feb 02, 2015

commit 51ac3d2f upstream.

NEC OEMs the same platforms as Stratus does, which have multiple devices on
some PCIe buses under downstream ports.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=51331
Fixes: 1278998f ("PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)")
Signed-off-by: Charlotte Richardson <charlotte.richardson@stratus.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

65456b7b

gpio: sysfs: fix memory leak in gpiod_sysfs_set_active_low · 2739bb7a

Johan Hovold authored Jan 26, 2015

commit 49d2ca84 upstream.

Fix memory leak in the gpio sysfs interface due to failure to drop
reference to device returned by class_find_device when setting the
gpio-line polarity.

Fixes: 07697461 ("gpiolib: add support for changing value polarity in sysfs")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2739bb7a

gpio: sysfs: fix memory leak in gpiod_export_link · 3adca859

Johan Hovold authored Jan 26, 2015

commit 0f303db0 upstream.

Fix memory leak in the gpio sysfs interface due to failure to drop
reference to device returned by class_find_device when creating a link.

Fixes: a4177ee7 ("gpiolib: allow exported GPIO nodes to be named using sysfs links")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3adca859

06 Feb, 2015 6 commits

Linux 3.14.32 · 4ccf212f
Greg Kroah-Hartman authored Feb 06, 2015

4ccf212f

target: Drop arbitrary maximum I/O size limit · 4a8860d8

Nicholas Bellinger authored Jan 06, 2015

commit 046ba642 upstream.

This patch drops the arbitrary maximum I/O size limit in sbc_parse_cdb(),
which currently for fabric_max_sectors is hardcoded to 8192 (4 MB for 512
byte sector devices), and for hw_max_sectors is a backend driver dependent
value.

This limit is problematic because Linux initiators have only recently
started to honor block limits MAXIMUM TRANSFER LENGTH, and other non-Linux
based initiators (eg: MSFT Fibre Channel) can also generate I/Os larger
than 4 MB in size.

Currently when this happens, the following message will appear on the
target resulting in I/Os being returned with non recoverable status:

  SCSI OP 28h with too big sectors 16384 exceeds fabric_max_sectors: 8192

Instead, drop both [fabric,hw]_max_sector checks in sbc_parse_cdb(),
and convert the existing hw_max_sectors into a purely informational
attribute used to represent the granuality that backend driver and/or
subsystem code is splitting I/Os upon.

Also, update FILEIO with an explicit FD_MAX_BYTES check in fd_execute_rw()
to deal with the one special iovec limitiation case.

v2 changes:
  - Drop hw_max_sectors check in sbc_parse_cdb()
Reported-by: Lance Gropper <lance.gropper@qosserver.com>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

4a8860d8

workqueue: fix subtle pool management issue which can stall whole worker_pool · 336d5519

Tejun Heo authored Jan 16, 2015

commit 29187a9e upstream.

A worker_pool's forward progress is guaranteed by the fact that the
last idle worker assumes the manager role to create more workers and
summon the rescuers if creating workers doesn't succeed in timely
manner before proceeding to execute work items.

This manager role is implemented in manage_workers(), which indicates
whether the worker may proceed to work item execution with its return
value.  This is necessary because multiple workers may contend for the
manager role, and, if there already is a manager, others should
proceed to work item execution.

Unfortunately, the function also indicates that the worker may proceed
to work item execution if need_to_create_worker() is false at the head
of the function.  need_to_create_worker() tests the following
conditions.

	pending work items && !nr_running && !nr_idle

The first and third conditions are protected by pool->lock and thus
won't change while holding pool->lock; however, nr_running can change
asynchronously as other workers block and resume and while it's likely
to be zero, as someone woke this worker up in the first place, some
other workers could have become runnable inbetween making it non-zero.

If this happens, manage_worker() could return false even with zero
nr_idle making the worker, the last idle one, proceed to execute work
items.  If then all workers of the pool end up blocking on a resource
which can only be released by a work item which is pending on that
pool, the whole pool can deadlock as there's no one to create more
workers or summon the rescuers.

This patch fixes the problem by removing the early exit condition from
maybe_create_worker() and making manage_workers() return false iff
there's already another manager, which ensures that the last worker
doesn't start executing work items.

We can leave the early exit condition alone and just ignore the return
value but the only reason it was put there is because the
manage_workers() used to perform both creations and destructions of
workers and thus the function may be invoked while the pool is trying
to reduce the number of workers.  Now that manage_workers() is called
only when more workers are needed, the only case this early exit
condition is triggered is rare race conditions rendering it pointless.

Tested with simulated workload and modified workqueue code which
trigger the pool deadlock reliably without this patch.

tj: Updated to v3.14 where manage_workers() is responsible not only
    for creating more workers but also destroying surplus ones.
    maybe_create_worker() needs to keep its early exit condition to
    avoid creating a new worker when manage_workers() is called to
    destroy surplus ones.  Other than that, the adaptabion is
    straight-forward.  Both maybe_{create|destroy}_worker() functions
    are converted to return void and manage_workers() returns %false
    iff it lost manager arbitration.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Eric Sandeen <sandeen@sandeen.net>
Link: http://lkml.kernel.org/g/54B019F4.8030009@sandeen.net
Cc: Dave Chinner <david@fromorbit.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

336d5519

rbd: fix rbd_dev_parent_get() when parent_overlap == 0 · b49e4a87

Ilya Dryomov authored Jan 19, 2015

commit ae43e9d0 upstream.

The comment for rbd_dev_parent_get() said

    * We must get the reference before checking for the overlap to
    * coordinate properly with zeroing the parent overlap in
    * rbd_dev_v2_parent_info() when an image gets flattened.  We
    * drop it again if there is no overlap.

but the "drop it again if there is no overlap" part was missing from
the implementation.  This lead to absurd parent_ref values for images
with parent_overlap == 0, as parent_ref was incremented for each
img_request and virtually never decremented.

Fix this by leveraging the fact that refresh path calls
rbd_dev_v2_parent_info() under header_rwsem and use it for read in
rbd_dev_parent_get(), instead of messing around with atomics.  Get rid
of barriers in rbd_dev_v2_parent_info() while at it - I don't see what
they'd pair with now and I suspect we are in a pretty miserable
situation as far as proper locking goes regardless.
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>
[idryomov@redhat.com: backport to 3.14: context]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b49e4a87

pstore: Fix NULL pointer fault if get NULL prz in ramoops_get_next_prz · c1f42138

Liu ShuoX authored Mar 17, 2014

commit b0aa931f upstream.

ramoops_get_next_prz get the prz according the paramters. If it get a
uninitialized prz, access its members by following persistent_ram_old_size(prz)
will cause a NULL pointer crash.
Ex: if ftrace_size is 0, fprz will be NULL.

Fix it by return NULL in advance.
Signed-off-by: Liu ShuoX <shuox.liu@intel.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: HuKeping <hukeping@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c1f42138

pstore: skip zero size persistent ram buffer in traverse · b310479c

Liu ShuoX authored Mar 17, 2014

commit aa9a4a1e upstream.

In ramoops_pstore_read, a valid prz pointer with zero size buffer will
break traverse of all persistent ram buffers.  The latter buffer might be
lost.
Signed-off-by: Liu ShuoX <shuox.liu@intel.com>
Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Cc: Colin Cross <ccross@android.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: HuKeping <hukeping@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b310479c