Commits · 0aa0408c2f8d21e88a4e61120e3b94bf84aef809 · Kirill Smelkov / linux

06 Dec, 2014 9 commits

crypto: caam - remove duplicated sg copy functions · 0aa0408c

Cristian Stoica authored Aug 14, 2014

Replace equivalent (and partially incorrect) scatter-gather functions
with ones from crypto-API.

The replacement is motivated by page-faults in sg_copy_part triggered
by successive calls to crypto_hash_update. The following fault appears
after calling crypto_ahash_update twice, first with 13 and then
with 285 bytes:

Unable to handle kernel paging request for data at address 0x00000008
Faulting instruction address: 0xf9bf9a8c
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=8 CoreNet Generic
Modules linked in: tcrypt(+) caamhash caam_jr caam tls
CPU: 6 PID: 1497 Comm: cryptomgr_test Not tainted
3.12.19-rt30-QorIQ-SDK-V1.6+g9fda9f2 #75
task: e9308530 ti: e700e000 task.ti: e700e000
NIP: f9bf9a8c LR: f9bfcf28 CTR: c0019ea0
REGS: e700fb80 TRAP: 0300   Not tainted
(3.12.19-rt30-QorIQ-SDK-V1.6+g9fda9f2)
MSR: 00029002 <CE,EE,ME>  CR: 44f92024  XER: 20000000
DEAR: 00000008, ESR: 00000000

GPR00: f9bfcf28 e700fc30 e9308530 e70b1e55 00000000 ffffffdd e70b1e54 0bebf888
GPR08: 902c7ef5 c0e771e2 00000002 00000888 c0019ea0 00000000 00000000 c07a4154
GPR16: c08d0000 e91a8f9c 00000001 e98fb400 00000100 e9c83028 e70b1e08 e70b1d48
GPR24: e992ce10 e70b1dc8 f9bfe4f4 e70b1e55 ffffffdd e70b1ce0 00000000 00000000
NIP [f9bf9a8c] sg_copy+0x1c/0x100 [caamhash]
LR [f9bfcf28] ahash_update_no_ctx+0x628/0x660 [caamhash]
Call Trace:
[e700fc30] [f9bf9c50] sg_copy_part+0xe0/0x160 [caamhash] (unreliable)
[e700fc50] [f9bfcf28] ahash_update_no_ctx+0x628/0x660 [caamhash]
[e700fcb0] [f954e19c] crypto_tls_genicv+0x13c/0x300 [tls]
[e700fd10] [f954e65c] crypto_tls_encrypt+0x5c/0x260 [tls]
[e700fd40] [c02250ec] __test_aead.constprop.9+0x2bc/0xb70
[e700fe40] [c02259f0] alg_test_aead+0x50/0xc0
[e700fe60] [c02241e4] alg_test+0x114/0x2e0
[e700fee0] [c022276c] cryptomgr_test+0x4c/0x60
[e700fef0] [c004f658] kthread+0x98/0xa0
[e700ff40] [c000fd04] ret_from_kernel_thread+0x5c/0x64
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

(cherry picked from commit 307fd543)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

0aa0408c

hwrng: pseries - Return errors to upper levels in pseries-rng.c · 6a1fb9d2

Michael Ellerman authored Sep 25, 2013

We don't expect to get errors from the hypervisor when reading the rng,
but if we do we should pass the error up to the hwrng driver. Otherwise
the hwrng driver will continue calling us forever.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

(cherry picked from commit d319fe2a)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

6a1fb9d2

freezer: Do not freeze tasks killed by OOM killer · d0f72bea

Cong Wang authored Oct 21, 2014

Since f660daac (oom: thaw threads if oom killed thread is frozen
before deferring) OOM killer relies on being able to thaw a frozen task
to handle OOM situation but a3201227 (freezer: make freezing() test
freeze conditions in effect instead of TIF_FREEZE) has reorganized the
code and stopped clearing freeze flag in __thaw_task. This means that
the target task only wakes up and goes into the fridge again because the
freezing condition hasn't changed for it. This reintroduces the bug
fixed by f660daac.

Fix the issue by checking for TIF_MEMDIE thread flag in
freezing_slow_path and exclude the task from freezing completely. If a
task was already frozen it would get woken by __thaw_task from OOM killer
and get out of freezer after rechecking freezing().

Changes since v1
- put TIF_MEMDIE check into freezing_slowpath rather than in __refrigerator
  as per Oleg
- return __thaw_task into oom_scan_process_thread because
  oom_kill_process will not wake task in the fridge because it is
  sleeping uninterruptible

[mhocko@suse.cz: rewrote the changelog]
Fixes: a3201227 (freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE)
Cc: 3.3+ <stable@vger.kernel.org> # 3.3+
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

(cherry picked from commit 51fae6da)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

d0f72bea

regmap: fix kernel hang on regmap_bulk_write with zero val_count. · 77fc2877

Quentin Casasnovas authored Nov 12, 2014

Fixes commit 2f06fa04 which was an
incorrect backported version of commit
d6b41cb0 upstream.

If val_count is zero we return -EINVAL with map->lock_arg locked, which
will deadlock the kernel next time we try to acquire this lock.

This was introduced by f5942dd ("regmap: fix possible ZERO_SIZE_PTR pointer
dereferencing error.") which improperly back-ported d6b41cb0.

This issue was found during review of Ubuntu Trusty 3.13.0-40.68 kernel to
prepare Ksplice rebootless updates.

Fixes: f5942dd ("regmap: fix possible ZERO_SIZE_PTR pointer dereferencing error.")
Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 197b3975)

(cherry picked from commit HEAD)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

77fc2877

sparc64: sparse irq · 0819bf79

bob picco authored Sep 25, 2014

This patch attempts to do a few things. The highlights are: 1) enable
SPARSE_IRQ unconditionally, 2) kills off !SPARSE_IRQ code 3) allocates
ivector_table at boot time and 4) default to cookie only VIRQ mechanism
for supported firmware. The first firmware with cookie only support for
me appears on T5. You can optionally force the HV firmware to not cookie
only mode which is the sysino support.

The sysino is a deprecated HV mechanism according to the most recent
SPARC Virtual Machine Specification. HV_GRP_INTR is what controls the
cookie/sysino firmware versioning.

The history of this interface is:

1) Major version 1.0 only supported sysino based interrupt interfaces.

2) Major version 2.0 added cookie based VIRQs, however due to the fact
   that OSs were using the VIRQs without negoatiating major version
   2.0 (Linux and Solaris are both guilty), the VIRQs calls were
   allowed even with major version 1.0

   To complicate things even further, the VIRQ interfaces were only
   actually hooked up in the hypervisor for LDC interrupt sources.
   VIRQ calls on other device types would result in HV_EINVAL errors.

   So effectively, major version 2.0 is unusable.

3) Major version 3.0 was created to signal use of VIRQs and the fact
   that the hypervisor has these calls hooked up for all interrupt
   sources, not just those for LDC devices.

A new boot option is provided should cookie only HV support have issues.
hvirq - this is the version for HV_GRP_INTR. This is related to HV API
versioning.  The code attempts major=3 first by default. The option can
be used to override this default.

I've tested with SPARSE_IRQ on T5-8, M7-4 and T4-X and Jalap?no.
Signed-off-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit ee6a9333)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

0819bf79

sparc64: T5 PMU · 8e5413eb

bob picco authored Sep 16, 2014

The T5 (niagara5) has different PCR related HV fast trap values and a new
HV API Group. This patch utilizes these and shares when possible with niagara4.

We use the same sparc_pmu niagara4_pmu. Should there be new effort to
obtain the MCU perf statistics then this would have to be changed.

Cc: sparclinux@vger.kernel.org
Signed-off-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit 05aa1651)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

8e5413eb

regmap: fix possible ZERO_SIZE_PTR pointer dereferencing error. · 2a93d594

Xiubo Li authored Sep 28, 2014

Since we cannot make sure the 'val_count' will always be none zero
here, and then if it equals to zero, the kmemdup() will return
ZERO_SIZE_PTR, which equals to ((void *)16).

So this patch fix this with just doing the zero check before calling
kmemdup().
Signed-off-by: Xiubo Li <Li.Xiubo@freescale.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org

(cherry picked from commit d6b41cb0)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

2a93d594

usb: pch_udc: usb gadget device support for Intel Quark X1000 · df13d997

Bryan O'Donoghue authored Aug 04, 2014

This patch is to enable the USB gadget device for Intel Quark X1000
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@intel.com>
Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Alvin (Weike) Chen <alvin.chen@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>

(cherry picked from commit a68df706)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

df13d997

asus-nb-wmi: Add wapf4 quirk for the X550VB · 8fc366ad

Stanislaw Gruszka authored Oct 26, 2014

X550VB as many others Asus laptops need wapf4 quirk to make RFKILL
switch be functional. Otherwise system boots with wireless card
disabled and is only possible to enable it by suspend/resume.

Bug report:
http://bugzilla.redhat.com/show_bug.cgi?id=1089731#c23Reported-and-tested-by: Vratislav Podzimek <vpodzime@redhat.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>

(cherry picked from commit 4ec7a45b)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

8fc366ad

22 Nov, 2014 2 commits

Linux 3.8.13.34 · 2772c92e
Sasha Levin authored Nov 22, 2014
```
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
```
2772c92e

net: fix regression introduced in 2.6.32.62 by sysctl fixes · 036611b2

Willy Tarreau authored Jun 14, 2014

Commits b7c9e4ee ("sysctl net: Keep tcp_syn_retries inside the boundary")
and eedcafdc ("net: check net.core.somaxconn sysctl values") were missing
a .strategy entry which is still required in 2.6.32. Because of this, the
Ubuntu kernel team has faced kernel dumps during their testing.

Tyler Hicks and Luis Henriques proposed this patch to fix the issue,
which properly sets .strategy as needed in 2.6.32.
Reported-by: Luis Henriques <luis.henriques@canonical.com>
Cc: tyler.hicks@canonical.com
Cc: Michal Tesar <mtesar@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit b90422de)

(cherry picked from commit HEAD)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

036611b2

21 Nov, 2014 29 commits

net: socket: error on a negative msg_namelen · 4e073683

Matthew Leach authored Mar 11, 2014

When copying in a struct msghdr from the user, if the user has set the
msg_namelen parameter to a negative value it gets clamped to a valid
size due to a comparison between signed and unsigned values.

Ensure the syscall errors when the user passes in a negative value.
Signed-off-by: Matthew Leach <matthew.leach@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit dbb490b9)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

4e073683

net: core: Always propagate flag changes to interfaces · 66e1bc6b

Vlad Yasevich authored Nov 19, 2013

The following commit:
    b6c40d68
    net: only invoke dev->change_rx_flags when device is UP

tried to fix a problem with VLAN devices and promiscuouse flag setting.
The issue was that VLAN device was setting a flag on an interface that
was down, thus resulting in bad promiscuity count.
This commit blocked flag propagation to any device that is currently
down.

A later commit:
    deede2fa
    vlan: Don't propagate flag changes on down interfaces

fixed VLAN code to only propagate flags when the VLAN interface is up,
thus fixing the same issue as above, only localized to VLAN.

The problem we have now is that if we have create a complex stack
involving multiple software devices like bridges, bonds, and vlans,
then it is possible that the flags would not propagate properly to
the physical devices.  A simple examle of the scenario is the
following:

  eth0----> bond0 ----> bridge0 ---> vlan50

If bond0 or eth0 happen to be down at the time bond0 is added to
the bridge, then eth0 will never have promisc mode set which is
currently required for operation as part of the bridge.  As a
result, packets with vlan50 will be dropped by the interface.

The only 2 devices that implement the special flag handling are
VLAN and DSA and they both have required code to prevent incorrect
flag propagation.  As a result we can remove the generic solution
introduced in b6c40d68 and leave
it to the individual devices to decide whether they will block
flag propagation or not.
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Suggested-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit d2615bf4)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

66e1bc6b

net: clamp ->msg_namelen instead of returning an error · a6ac08ab

Dan Carpenter authored Nov 27, 2013

If kmsg->msg_namelen > sizeof(struct sockaddr_storage) then in the
original code that would lead to memory corruption in the kernel if you
had audit configured.  If you didn't have audit configured it was
harmless.

There are some programs such as beta versions of Ruby which use too
large of a buffer and returning an error code breaks them.  We should
clamp the ->msg_namelen value instead.

Fixes: 1661bf36 ("net: heap overflow in __audit_sockaddr()")
Reported-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Eric Wong <normalperson@yhbt.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit db31c55a)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

a6ac08ab

net: check net.core.somaxconn sysctl values · 3f5d7e1d

Roman Gushchin authored Aug 02, 2013

It's possible to assign an invalid value to the net.core.somaxconn
sysctl variable, because there is no checks at all.

The sk_max_ack_backlog field of the sock structure is defined as
unsigned short. Therefore, the backlog argument in inet_listen()
shouldn't exceed USHRT_MAX. The backlog argument in the listen() syscall
is truncated to the somaxconn value. So, the somaxconn value shouldn't
exceed 65535 (USHRT_MAX).
Also, negative values of somaxconn are meaningless.

before:
$ sysctl -w net.core.somaxconn=256
net.core.somaxconn = 256
$ sysctl -w net.core.somaxconn=65536
net.core.somaxconn = 65536
$ sysctl -w net.core.somaxconn=-100
net.core.somaxconn = -100

after:
$ sysctl -w net.core.somaxconn=256
net.core.somaxconn = 256
$ sysctl -w net.core.somaxconn=65536
error: "Invalid argument" setting key "net.core.somaxconn"
$ sysctl -w net.core.somaxconn=-100
error: "Invalid argument" setting key "net.core.somaxconn"

Based on a prior patch from Changli Gao.
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Reported-by: Changli Gao <xiaosuo@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit 5f671d6b)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

3f5d7e1d

sysctl net: Keep tcp_syn_retries inside the boundary · f71a8f4c

Michal Tesar authored Jul 19, 2013

Limit the min/max value passed to the
/proc/sys/net/ipv4/tcp_syn_retries.
Signed-off-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit 651e9271)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

f71a8f4c

crypto: api - Fix race condition in larval lookup · e323f565

Nikola Pajkovsky authored Oct 11, 2013

crypto_larval_lookup should only return a larval if it created one.
Any larval created by another entity must be processed through
crypto_larval_wait before being returned.

Otherwise this will lead to a larval being killed twice, which
will most likely lead to a crash.

Cc: stable@vger.kernel.org
Reported-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

(cherry picked from commit 77dbd7a9)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

e323f565

ipvs: fix CHECKSUM_PARTIAL for TCP, UDP · 9ad4a0fa

Julian Anastasov authored Oct 17, 2010

 	Fix CHECKSUM_PARTIAL handling. Tested for IPv4 TCP,
UDP not tested because it needs network card with HW CSUM support.
May be fixes problem where IPVS can not be used in virtual boxes.
Problem appears with DNAT to local address when the local stack
sends reply in CHECKSUM_PARTIAL mode.

 	Fix tcp_dnat_handler and udp_dnat_handler to provide
vaddr and daddr in right order (old and new IP) when calling
tcp_partial_csum_update/udp_partial_csum_update (CHECKSUM_PARTIAL).
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

(cherry picked from commit 5bc9068e)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

9ad4a0fa

x86, ptrace: fix build breakage with gcc 4.7 (second try) · 084375ba

Willy Tarreau authored Jun 13, 2013

syscall_trace_enter() and syscall_trace_leave() are only called from
within asm code and do not need to be declared in the .c at all.
Removing their reference fixes the build issue that was happening
with gcc 4.7.

Both Sven-Haegar Koch and Christoph Biedl confirmed this patch
addresses their respective build issues.

Cc: Sven-Haegar Koch <haegar@sdinet.de>
Cc: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 40c74e0d)

(cherry picked from commit HEAD)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

084375ba

rds: set correct msg_namelen · 94cc7d0e

Weiping Pan authored Jul 23, 2012

Jay Fenlason (fenlason@redhat.com) found a bug,
that recvfrom() on an RDS socket can return the contents of random kernel
memory to userspace if it was called with a address length larger than
sizeof(struct sockaddr_in).
rds_recvmsg() also fails to set the addr_len paramater properly before
returning, but that's just a bug.
There are also a number of cases wher recvfrom() can return an entirely bogus
address. Anything in rds_recvmsg() that returns a non-negative value but does
not go through the "sin = (struct sockaddr_in *)msg->msg_name;" code path
at the end of the while(1) loop will return up to 128 bytes of kernel memory
to userspace.

And I write two test programs to reproduce this bug, you will see that in
rds_server, fromAddr will be overwritten and the following sock_fd will be
destroyed.
Yes, it is the programmer's fault to set msg_namelen incorrectly, but it is
better to make the kernel copy the real length of address to user space in
such case.

How to run the test programs ?
I test them on 32bit x86 system, 3.5.0-rc7.

1 compile
gcc -o rds_client rds_client.c
gcc -o rds_server rds_server.c

2 run ./rds_server on one console

3 run ./rds_client on another console

4 you will see something like:
server is waiting to receive data...
old socket fd=3
server received data from client:data from client
msg.msg_namelen=32
new socket fd=-1067277685
sendmsg()
: Bad file descriptor

/***************** rds_client.c ********************/

int main(void)
{
	int sock_fd;
	struct sockaddr_in serverAddr;
	struct sockaddr_in toAddr;
	char recvBuffer[128] = "data from client";
	struct msghdr msg;
	struct iovec iov;

	sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
	if (sock_fd < 0) {
		perror("create socket error\n");
		exit(1);
	}

	memset(&serverAddr, 0, sizeof(serverAddr));
	serverAddr.sin_family = AF_INET;
	serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
	serverAddr.sin_port = htons(4001);

	if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) {
		perror("bind() error\n");
		close(sock_fd);
		exit(1);
	}

	memset(&toAddr, 0, sizeof(toAddr));
	toAddr.sin_family = AF_INET;
	toAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
	toAddr.sin_port = htons(4000);
	msg.msg_name = &toAddr;
	msg.msg_namelen = sizeof(toAddr);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_iov->iov_base = recvBuffer;
	msg.msg_iov->iov_len = strlen(recvBuffer) + 1;
	msg.msg_control = 0;
	msg.msg_controllen = 0;
	msg.msg_flags = 0;

	if (sendmsg(sock_fd, &msg, 0) == -1) {
		perror("sendto() error\n");
		close(sock_fd);
		exit(1);
	}

	printf("client send data:%s\n", recvBuffer);

	memset(recvBuffer, '\0', 128);

	msg.msg_name = &toAddr;
	msg.msg_namelen = sizeof(toAddr);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_iov->iov_base = recvBuffer;
	msg.msg_iov->iov_len = 128;
	msg.msg_control = 0;
	msg.msg_controllen = 0;
	msg.msg_flags = 0;
	if (recvmsg(sock_fd, &msg, 0) == -1) {
		perror("recvmsg() error\n");
		close(sock_fd);
		exit(1);
	}

	printf("receive data from server:%s\n", recvBuffer);

	close(sock_fd);

	return 0;
}

/***************** rds_server.c ********************/

int main(void)
{
	struct sockaddr_in fromAddr;
	int sock_fd;
	struct sockaddr_in serverAddr;
	unsigned int addrLen;
	char recvBuffer[128];
	struct msghdr msg;
	struct iovec iov;

	sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
	if(sock_fd < 0) {
		perror("create socket error\n");
		exit(0);
	}

	memset(&serverAddr, 0, sizeof(serverAddr));
	serverAddr.sin_family = AF_INET;
	serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
	serverAddr.sin_port = htons(4000);
	if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) {
		perror("bind error\n");
		close(sock_fd);
		exit(1);
	}

	printf("server is waiting to receive data...\n");
	msg.msg_name = &fromAddr;

	/*
	 * I add 16 to sizeof(fromAddr), ie 32,
	 * and pay attention to the definition of fromAddr,
	 * recvmsg() will overwrite sock_fd,
	 * since kernel will copy 32 bytes to userspace.
	 *
	 * If you just use sizeof(fromAddr), it works fine.
	 * */
	msg.msg_namelen = sizeof(fromAddr) + 16;
	/* msg.msg_namelen = sizeof(fromAddr); */
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_iov->iov_base = recvBuffer;
	msg.msg_iov->iov_len = 128;
	msg.msg_control = 0;
	msg.msg_controllen = 0;
	msg.msg_flags = 0;

	while (1) {
		printf("old socket fd=%d\n", sock_fd);
		if (recvmsg(sock_fd, &msg, 0) == -1) {
			perror("recvmsg() error\n");
			close(sock_fd);
			exit(1);
		}
		printf("server received data from client:%s\n", recvBuffer);
		printf("msg.msg_namelen=%d\n", msg.msg_namelen);
		printf("new socket fd=%d\n", sock_fd);
		strcat(recvBuffer, "--data from server");
		if (sendmsg(sock_fd, &msg, 0) == -1) {
			perror("sendmsg()\n");
			close(sock_fd);
			exit(1);
		}
	}

	close(sock_fd);
	return 0;
}
Signed-off-by: Weiping Pan <wpan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit 06b6a1cf)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

94cc7d0e

llc: Fix missing msg_namelen update in llc_ui_recvmsg() · d66e6f36

Mathias Krause authored Apr 07, 2013

For stream sockets the code misses to update the msg_namelen member
to 0 and therefore makes net/socket.c leak the local, uninitialized
sockaddr_storage variable to userland -- 128 bytes of kernel stack
memory. The msg_namelen update is also missing for datagram sockets
in case the socket is shutting down during receive.

Fix both issues by setting msg_namelen to 0 early. It will be
updated later if we're going to fill the msg_name member.

Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit c77a4b9c)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

d66e6f36

atm: update msg_namelen in vcc_recvmsg() · acca86d7

Mathias Krause authored Apr 07, 2013

The current code does not fill the msg_name member in case it is set.
It also does not set the msg_namelen member to 0 and therefore makes
net/socket.c leak the local, uninitialized sockaddr_storage variable
to userland -- 128 bytes of kernel stack memory.

Fix that by simply setting msg_namelen to 0 as obviously nobody cared
about vcc_recvmsg() not filling the msg_name in case it was set.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit 9b3e617f)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

acca86d7

softirq: reduce latencies · 04288af7

Eric Dumazet authored Jan 10, 2013

In various network workloads, __do_softirq() latencies can be up
to 20 ms if HZ=1000, and 200 ms if HZ=100.

This is because we iterate 10 times in the softirq dispatcher,
and some actions can consume a lot of cycles.

This patch changes the fallback to ksoftirqd condition to :

- A time limit of 2 ms.
- need_resched() being set on current task

When one of this condition is met, we wakeup ksoftirqd for further
softirq processing if we still have pending softirqs.

Using need_resched() as the only condition can trigger RCU stalls,
as we can keep BH disabled for too long.

I ran several benchmarks and got no significant difference in
throughput, but a very significant reduction of latencies (one order
of magnitude) :

In following bench, 200 antagonist "netperf -t TCP_RR" are started in
background, using all available cpus.

Then we start one "netperf -t TCP_RR", bound to the cpu handling the NIC
IRQ (hard+soft)

Before patch :

# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=550110.424
MIN_LATENCY=146858
MAX_LATENCY=997109
P50_LATENCY=305000
P90_LATENCY=550000
P99_LATENCY=710000
MEAN_LATENCY=376989.12
STDDEV_LATENCY=184046.92

After patch :

# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=40545.492
MIN_LATENCY=9834
MAX_LATENCY=78366
P50_LATENCY=33583
P90_LATENCY=59000
P99_LATENCY=69000
MEAN_LATENCY=38364.67
STDDEV_LATENCY=12865.26
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit c10d7367)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

04288af7

net: reduce net_rx_action() latency to 2 HZ · 1626cfc6

Eric Dumazet authored Mar 05, 2013

We should use time_after_eq() to get maximum latency of two ticks,
instead of three.

Bug added in commit 24f8b238 (net: increase receive packet quantum)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit d1f41b67)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

1626cfc6

Bluetooth: RFCOMM - Fix missing msg_namelen update in rfcomm_sock_recvmsg() · 7ecb3f41

Mathias Krause authored Apr 07, 2013

If RFCOMM_DEFER_SETUP is set in the flags, rfcomm_sock_recvmsg() returns
early with 0 without updating the possibly set msg_namelen member. This,
in turn, leads to a 128 byte kernel stack leak in net/socket.c.

Fix this by updating msg_namelen in this case. For all other cases it
will be handled in bt_sock_stream_recvmsg().

Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit e11e0455)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

7ecb3f41

[SCSI] fix crash in scsi_dispatch_cmd() · 6764a7df

James Bottomley authored Jul 07, 2011

USB surprise removal of sr is triggering an oops in
scsi_dispatch_command().  What seems to be happening is that USB is
hanging on to a queue reference until the last close of the upper
device, so the crash is caused by surprise remove of a mounted CD
followed by attempted unmount.

The problem is that USB doesn't issue its final commands as part of
the SCSI teardown path, but on last close when the block queue is long
gone.  The long term fix is probably to make sr do the teardown in the
same way as sd (so remove all the lower bits on ejection, but keep the
upper disk alive until last close of user space).  However, the
current oops can be simply fixed by not allowing any commands to be
sent to a dead queue.

Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

(cherry picked from commit bfe159a5)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

6764a7df

staging: comedi: s626: don't dereference insn->data · bed03fb3

Ian Abbott authored Sep 24, 2012

`s626_enc_insn_config()` is incorrectly dereferencing `insn->data` which
is a pointer to user memory.  It should be dereferencing the separate
`data` parameter that points to a copy of the data in kernel memory.
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Reviewed-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

(cherry picked from commit b655c2c4)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

bed03fb3

x86, random: make ARCH_RANDOM prompt if EMBEDDED, not EXPERT · 3ff32c2e

Romain Francoise authored Jan 23, 2013

Before v2.6.38 CONFIG_EXPERT was known as CONFIG_EMBEDDED but the
Kconfig entry was not changed to match when upstream commit
628c6246 ("x86, random: Architectural
inlines to get random integers with RDRAND") was backported.
Signed-off-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 119274d6)

(cherry picked from commit HEAD)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

3ff32c2e

mm: fix invalidate_complete_page2() lock ordering · cf379a4c

Hugh Dickins authored Oct 08, 2012

In fuzzing with trinity, lockdep protested "possible irq lock inversion
dependency detected" when isolate_lru_page() reenabled interrupts while
still holding the supposedly irq-safe tree_lock:

invalidate_inode_pages2
  invalidate_complete_page2
    spin_lock_irq(&mapping->tree_lock)
    clear_page_mlock
      isolate_lru_page
        spin_unlock_irq(&zone->lru_lock)

isolate_lru_page() is correct to enable interrupts unconditionally:
invalidate_complete_page2() is incorrect to call clear_page_mlock() while
holding tree_lock, which is supposed to nest inside lru_lock.

Both truncate_complete_page() and invalidate_complete_page() call
clear_page_mlock() before taking tree_lock to remove page from radix_tree.
 I guess invalidate_complete_page2() preferred to test PageDirty (again)
under tree_lock before committing to the munlock; but since the page has
already been unmapped, its state is already somewhat inconsistent, and no
worse if clear_page_mlock() moved up.
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Deciphered-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit ec4d9f62)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

cf379a4c

usermodehelper: ____call_usermodehelper() doesn't need do_exit() · caebb40a

Oleg Nesterov authored Mar 23, 2012

Minor cleanup.  ____call_usermodehelper() can simply return, no need to
call do_exit() explicitely.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 5b9bd473)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

caebb40a

gen_init_cpio: avoid stack overflow when expanding · 2ea98a84

Kees Cook authored Oct 25, 2012

Fix possible overflow of the buffer used for expanding environment
variables when building file list.

In the extremely unlikely case of an attacker having control over the
environment variables visible to gen_init_cpio, control over the
contents of the file gen_init_cpio parses, and gen_init_cpio was built
without compiler hardening, the attacker can gain arbitrary execution
control via a stack buffer overflow.

  $ cat usr/crash.list
  file foo ${BIG}${BIG}${BIG}${BIG}${BIG}${BIG} 0755 0 0
  $ BIG=$(perl -e 'print "A" x 4096;') ./usr/gen_init_cpio usr/crash.list
  *** buffer overflow detected ***: ./usr/gen_init_cpio terminated

This also replaces the space-indenting with tabs.

Patch based on existing fix extracted from grsecurity.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: PaX Team <pageexec@freemail.hu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 20f1de65)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

2ea98a84

Revert "pcdp: use early_ioremap/early_iounmap to access pcdp table" · 07cba5f8

Ben Hutchings authored Jan 12, 2013

This reverts commit 2af3af56, which was
commit 6c4088ac upstream.

This broke compilation of the driver in 2.6.32.y as the
early_io{remap,unmap}() functions are not defined for ia64.  The driver
can *only* be built for ia64 (even in current mainline), so a fix for
x86_64 is pointless.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 01ab25d5)

(cherry picked from commit HEAD)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

07cba5f8

random: create add_device_randomness() interface · 4b269d02

Linus Torvalds authored Jul 04, 2012

Add a new interface, add_device_randomness() for adding data to the
random pool that is likely to differ between two devices (or possibly
even per boot).  This would be things like MAC addresses or serial
numbers, or the read-out of the RTC. This does *not* add any actual
entropy to the pool, but it initializes the pool to different values
for devices that might otherwise be identical and have very little
entropy available to them (particularly common in the embedded world).

[ Modified by tytso to mix in a timestamp, since there may be some
  variability caused by the time needed to detect/configure the hardware
  in question. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

(cherry picked from commit a2080a67)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

4b269d02

NFS: Alias the nfs module to nfs4 · 5e3d9855

bjschuma@gmail.com authored Aug 08, 2012

This allows distros to remove the line from their modprobe
configuration.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

(cherry picked from commit 425e776d)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

5e3d9855

PCI: Add quirk for still enabled interrupts on Intel Sandy Bridge GPUs · 38e46d7d

Thomas Jarosch authored Dec 07, 2011

Some BIOS implementations leave the Intel GPU interrupts enabled,
even though no one is handling them (f.e. i915 driver is never loaded).
Additionally the interrupt destination is not set up properly
and the interrupt ends up -somewhere-.

These spurious interrupts are "sticky" and the kernel disables
the (shared) interrupt line after 100.000+ generated interrupts.

Fix it by disabling the still enabled interrupts.
This resolves crashes often seen on monitor unplug.

Tested on the following boards:
- Intel DH61CR: Affected
- Intel DH67BL: Affected
- Intel S1200KP server board: Affected
- Asus P8H61-M LE: Affected, but system does not crash.
  Probably the IRQ ends up somewhere unnoticed.

According to reports on the net, the Intel DH61WW board is also affected.

Many thanks to Jesse Barnes from Intel for helping
with the register configuration and to Intel in general
for providing public hardware documentation.
Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Tested-by: Charlie Suffin <charlie.suffin@stratus.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

(cherry picked from commit f67fd55f)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

38e46d7d

sched/x86: Fix overflow in cyc2ns_offset · 148cf6d3

Salman Qazi authored Mar 09, 2012

When a machine boots up, the TSC generally gets reset.  However,
when kexec is used to boot into a kernel, the TSC value would be
carried over from the previous kernel.  The computation of
cycns_offset in set_cyc2ns_scale is prone to an overflow, if the
machine has been up more than 208 days prior to the kexec.  The
overflow happens when we multiply *scale, even though there is
enough room to store the final answer.

We fix this issue by decomposing tsc_now into the quotient and
remainder of division by CYC2NS_SCALE_FACTOR and then performing
the multiplication separately on the two components.

Refactor code to share the calculation with the previous
fix in __cycles_2_ns().
Signed-off-by: Salman Qazi <sqazi@google.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Turner <pjt@google.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20120310004027.19291.88460.stgit@dungbeetle.mtv.corp.google.comSigned-off-by: Ingo Molnar <mingo@elte.hu>

(cherry picked from commit 9993bc63)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

148cf6d3

ntp: Fix integer overflow when setting time · 0b72be85

Sasha Levin authored Mar 15, 2012

'long secs' is passed as divisor to div_s64, which accepts a 32bit
divisor. On 64bit machines that value is trimmed back from 8 bytes
back to 4, causing a divide by zero when the number is bigger than
(1 << 32) - 1 and all 32 lower bits are 0.

Use div64_long() instead.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Cc: johnstul@us.ibm.com
Link: http://lkml.kernel.org/r/1331829374-31543-2-git-send-email-levinsasha928@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

(cherry picked from commit a078c6d0)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

0b72be85

PNP: fix "work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB" · 79b69d20

Willy Tarreau authored Sep 30, 2012

Initial stable commit : 2215d910

This patch backported into 2.6.32.55 is enabled when CONFIG_AMD_NB is set,
but this config option does not exist in 2.6.32, it was called CONFIG_K8_NB,
so the fix was never applied. Some other changes were needed to make it work.
first, the correct include file name was asm/k8.h and not asm/amd_nb.h, and
second, amd_get_mmconfig_range() is needed and was merged by previous patch.

Thanks to Jiri Slabi who reported the issue and diagnosed all the dependencies.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 46e8a56a)

(cherry picked from commit HEAD)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

79b69d20

IA64: Remove COMPAT_IA32 support · af1e4375

Ben Hutchings authored Mar 09, 2012

commit 32974ad4 upstream

This just changes Kconfig rather than touching all the other files the
original commit did.

Patch description from the original commit :

  |  [IA64] Remove COMPAT_IA32 support
  |
  |  This has been broken since May 2008 when Al Viro killed altroot support.
  |  Since nobody has complained, it would appear that there are no users of
  |  this code (A plausible theory since the main OSVs that support ia64 prefer
  |  to use the IA32-EL software emulation).
  |
  |  Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit d9a25c03)

(cherry picked from commit HEAD)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

af1e4375

compat: Re-add missing asm/compat.h include to fix compile breakage on s390 · bde7f6e2

Heiko Carstens authored Mar 05, 2012

For kernels < 3.0 the backport of 048cd4e5
"compat: fix compile breakage on s390" will break compilation...

Re-add a single #include <asm/compat.h> in order to fix this.

This patch is _not_ necessary for upstream, only for stable kernels
which include the "build fix" mentioned above.
Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit ee116431)

(cherry picked from commit HEAD)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

bde7f6e2