- 26 Oct, 2013 33 commits
-
-
Daniel Mack authored
commit f375fc52 upstream. Commit 7e8d5cd9 ("USB: Add EHCI support for MX27 and MX31 based boards") introduced code that could potentially lead to a NULL pointer dereference on driver removal. Fix this by checking for the value of pdata before dereferencing it. Signed-off-by:
Daniel Mack <zonque@gmail.com> Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.2: adjust context] Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Dan Carpenter authored
commit 2c4283ca upstream. In dt282x_ai_insn_read() we call this macro like: wait_for(!mux_busy(), comedi_error(dev, "timeout\n"); return -ETIME;); Because the if statement doesn't have curly braces it means we always return -ETIME and the function never succeeds. Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Acked-by:
Ian Abbott <abbotti@mev.co.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Johan Hovold authored
commit 3b716caf upstream. Fix endianess bugs in parallel-port code which caused corrupt control-requests to be issued on big-endian machines. Reported-by:
kbuild test robot <fengguang.wu@intel.com> Signed-off-by:
Johan Hovold <jhovold@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Dan Carpenter authored
commit d0bd9a41 upstream. The write_parport_reg_nonblock() function shouldn't sleep because it's called with spinlocks held. Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Acked-by:
Johan Hovold <jhovold@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Tejun Heo authored
commit c34ac00c upstream. list_first_or_null() should test whether the list is empty and return pointer to the first entry if not in a RCU safe manner. It's broken in several ways. * It compares __kernel @__ptr with __rcu @__next triggering the following sparse warning. net/core/dev.c:4331:17: error: incompatible types in comparison expression (different address spaces) * It doesn't perform rcu_dereference*() and computes the entry address using container_of() directly from the __rcu pointer which is inconsitent with other rculist interface. As a result, all three in-kernel users - net/core/dev.c, macvlan, cgroup - are buggy. They dereference the pointer w/o going through read barrier. * While ->next dereference passes through list_next_rcu(), the compiler is still free to fetch ->next more than once and thus nullify the "__ptr != __next" condition check. Fix it by making list_first_or_null_rcu() dereference ->next directly using ACCESS_ONCE() and then use list_entry_rcu() on it like other rculist accessors. v2: Paul pointed out that the compiler may fetch the pointer more than once nullifying the condition check. ACCESS_ONCE() added on ->next dereference. v3: Restored () around macro param which was accidentally removed. Spotted by Paul. Signed-off-by:
Tejun Heo <tj@kernel.org> Reported-by:
Fengguang Wu <fengguang.wu@intel.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Li Zefan <lizefan@huawei.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Mike Dyer authored
commit 85fa532b upstream. Bit 9 of PLL2,3 and 4 is reserved as '0'. The 24bit fractional part should be split across each register in 8bit chunks. Signed-off-by:
Mike Dyer <mike.dyer@md-soft.co.uk> Signed-off-by:
Mark Brown <broonie@linaro.org> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Felix Fietkau authored
commit a1c781bb upstream. They are not implemented, and accessing them might trigger errors Signed-off-by:
Felix Fietkau <nbd@openwrt.org> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Felix Fietkau authored
commit e96542e5 upstream. Similar to a race condition that exists in the tx path, the hardware might re-read the 'next' pointer of a descriptor of the last completed frame. This only affects non-EDMA (pre-AR93xx) devices. To deal with this race, defer clearing and re-linking a completed rx descriptor until the next one has been processed. Signed-off-by:
Felix Fietkau <nbd@openwrt.org> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Alex Williamson authored
commit 3269ee0b upstream. At best the current code only seems to free the leaf pagetables and the root. If you're unlucky enough to have a large gap (like any QEMU guest with more than 3G of memory), only the first chunk of leaf pagetables are freed (plus the root). This is a massive memory leak. This patch re-writes the pagetable freeing function to use a recursive algorithm and manages to not only free all the pagetables, but does it without any apparent performance loss versus the current broken version. Signed-off-by:
Alex Williamson <alex.williamson@redhat.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Joerg Roedel <joro@8bytes.org> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Anton Blanchard authored
commit 230aef7a upstream. Normally when we haven't implemented an alignment handler for a load or store instruction the process will be terminated. The alignment handler uses the DSISR (or a pseudo one) to locate the right handler. Unfortunately ldbrx and stdbrx overlap lfs and stfs so we incorrectly think ldbrx is an lfs and stdbrx is an stfs. This bug is particularly nasty - instead of terminating the process we apply an incorrect fixup and continue on. With more and more overlapping instructions we should stop creating a pseudo DSISR and index using the instruction directly, but for now add a special case to catch ldbrx/stdbrx. Signed-off-by:
Anton Blanchard <anton@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Oliver Neukum authored
commit 6dd433e6 upstream. Both could want to submit the same URB. Some checks of the flag intended to prevent that were missing. Signed-off-by:
Oliver Neukum <oneukum@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Hans de Goede authored
commit b4f17a48 upstream. While reading the config parsing code I noticed this check is missing, without this check config->desc.wTotalLength can end up with a value larger then the dev->rawdescriptors length for the config, and when userspace then tries to get the rawdescriptors bad things may happen. Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
majianpeng authored
commit 73d9f7ee upstream. For nofail == false request, if __map_request failed, the caller does cleanup work, like releasing the relative pages. It doesn't make any sense to retry this request. Signed-off-by:
Jianpeng Ma <majianpeng@gmail.com> Reviewed-by:
Sage Weil <sage@inktank.com> [bwh: Backported to 3.2: adjust indentation] Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Felix Fietkau authored
commit 026d5b07 upstream. Otherwise in some cases, EAPOL frames might be filtered during the initial handshake, causing delays and assoc failures. Signed-off-by:
Felix Fietkau <nbd@openwrt.org> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Roger Pau Monne authored
commit 5f338d90 upstream. With the current implementation, the callback in the tail of the list can be added twice, because the check done in gnttab_request_free_callback is bogus, callback->next can be NULL if it is the last callback in the list. If we add the same callback twice we end up with an infinite loop, were callback == callback->next. Replace this check with a proper one that iterates over the list to see if the callback has already been added. Signed-off-by:
Roger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by:
Matt Wilson <msw@amazon.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Manoj Chourasia authored
commit 212a871a upstream. This changes puts the commit 4fe9f8e2 back in place with the fixes for slab corruption because of the commit. When a device is unplugged, wait for all processes that have opened the device to close before deallocating the device. This commit was solving kernel crash because of the corruption in rb tree of vmalloc. The rootcause was the device data pointer was geting excessed after the memory associated with hidraw was freed. The commit 4fe9f8e2 was buggy as it was also freeing the hidraw first and then calling delete operation on the list associated with that hidraw leading to slab corruption. Signed-off-by:
Manoj Chourasia <mchourasia@nvidia.com> Tested-by:
Peter Wu <lekensteyn@gmail.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Jiri Kosina authored
commit df0cfd69 upstream. This basically reverts commit 4fe9f8e2. It causes multiple problems, namely: - after rmmod/modprobe cycle of bus driver, the input is not claimed any more. This is likely because of misplaced hid_hw_close() - it causes memory corruption on hidraw_list As original patch author is not responding to requests to fix his patch, and the original deallocation mechanism is not exposing any problems, I am reverting back to it. Signed-off-by:
Jiri Kosina <jkosina@suse.cz> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Eric Dumazet authored
[ Upstream commit 55432d2b ] commit 5faa5df1 (inetpeer: Invalidate the inetpeer tree along with the routing cache) added a race : Before freeing an inetpeer, we must respect a RCU grace period, and make sure no user will attempt to increase refcnt. inetpeer_invalidate_tree() waits for a RCU grace period before inserting inetpeer tree into gc_list and waking the worker. At that time, no concurrent lookup can find a inetpeer in this tree. Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Acked-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Steffen Klassert authored
[ Upstream commit 5faa5df1 ] We initialize the routing metrics with the values cached on the inetpeer in rt_init_metrics(). So if we have the metrics cached on the inetpeer, we ignore the user configured fib_metrics. To fix this issue, we replace the old tree with a fresh initialized inet_peer_base. The old tree is removed later with a delayed work queue. Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Ying Xue authored
[ Upstream commit 4225a398 ] When the lockdep validator is enabled, it will report the below warning when we enable a TIPC bearer: [ INFO: possible irq lock inversion dependency detected ] --------------------------------------------------------- Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(ptype_lock); local_irq_disable(); lock(tipc_net_lock); lock(ptype_lock); <Interrupt> lock(tipc_net_lock); *** DEADLOCK *** the shortest dependencies between 2nd lock and 1st lock: -> (ptype_lock){+.+...} ops: 10 { [...] SOFTIRQ-ON-W at: [<c1089418>] __lock_acquire+0x528/0x13e0 [<c108a360>] lock_acquire+0x90/0x100 [<c1553c38>] _raw_spin_lock+0x38/0x50 [<c14651ca>] dev_add_pack+0x3a/0x60 [<c182da75>] arp_init+0x1a/0x48 [<c182dce5>] inet_init+0x181/0x27e [<c1001114>] do_one_initcall+0x34/0x170 [<c17f7329>] kernel_init+0x110/0x1b2 [<c155b6a2>] kernel_thread_helper+0x6/0x10 [...] ... key at: [<c17e4b10>] ptype_lock+0x10/0x20 ... acquired at: [<c108a360>] lock_acquire+0x90/0x100 [<c1553c38>] _raw_spin_lock+0x38/0x50 [<c14651ca>] dev_add_pack+0x3a/0x60 [<c8bc18d2>] enable_bearer+0xf2/0x140 [tipc] [<c8bb283a>] tipc_enable_bearer+0x1ba/0x450 [tipc] [<c8bb3a04>] tipc_cfg_do_cmd+0x5c4/0x830 [tipc] [<c8bbc032>] handle_cmd+0x42/0xd0 [tipc] [<c148e802>] genl_rcv_msg+0x232/0x280 [<c148d3f6>] netlink_rcv_skb+0x86/0xb0 [<c148e5bc>] genl_rcv+0x1c/0x30 [<c148d144>] netlink_unicast+0x174/0x1f0 [<c148ddab>] netlink_sendmsg+0x1eb/0x2d0 [<c1456bc1>] sock_aio_write+0x161/0x170 [<c1135a7c>] do_sync_write+0xac/0xf0 [<c11360f6>] vfs_write+0x156/0x170 [<c11361e2>] sys_write+0x42/0x70 [<c155b0df>] sysenter_do_call+0x12/0x38 [...] } -> (tipc_net_lock){+..-..} ops: 4 { [...] IN-SOFTIRQ-R at: [<c108953a>] __lock_acquire+0x64a/0x13e0 [<c108a360>] lock_acquire+0x90/0x100 [<c15541cd>] _raw_read_lock_bh+0x3d/0x50 [<c8bb874d>] tipc_recv_msg+0x1d/0x830 [tipc] [<c8bc195f>] recv_msg+0x3f/0x50 [tipc] [<c146a5fa>] __netif_receive_skb+0x22a/0x590 [<c146ab0b>] netif_receive_skb+0x2b/0xf0 [<c13c43d2>] pcnet32_poll+0x292/0x780 [<c146b00a>] net_rx_action+0xfa/0x1e0 [<c103a4be>] __do_softirq+0xae/0x1e0 [...] } >From the log, we can see three different call chains between CPU0 and CPU1: Time 0 on CPU0: kernel_init()->inet_init()->dev_add_pack() At time 0, the ptype_lock is held by CPU0 in dev_add_pack(); Time 1 on CPU1: tipc_enable_bearer()->enable_bearer()->dev_add_pack() At time 1, tipc_enable_bearer() first holds tipc_net_lock, and then wants to take ptype_lock to register TIPC protocol handler into the networking stack. But the ptype_lock has been taken by dev_add_pack() on CPU0, so at this time the dev_add_pack() running on CPU1 has to be busy looping. Time 2 on CPU0: netif_receive_skb()->recv_msg()->tipc_recv_msg() At time 2, an incoming TIPC packet arrives at CPU0, hence tipc_recv_msg() will be invoked. In tipc_recv_msg(), it first wants to hold tipc_net_lock. At the moment, below scenario happens: On CPU0, below is our sequence of taking locks: lock(ptype_lock)->lock(tipc_net_lock) On CPU1, our sequence of taking locks looks like: lock(tipc_net_lock)->lock(ptype_lock) Obviously deadlock may happen in this case. But please note the deadlock possibly doesn't occur at all when the first TIPC bearer is enabled. Before enable_bearer() -- running on CPU1 does not hold ptype_lock, so the TIPC receive handler (i.e. recv_msg()) is not registered successfully via dev_add_pack(), so the tipc_recv_msg() cannot be called by recv_msg() even if a TIPC message comes to CPU0. But when the second TIPC bearer is registered, the deadlock can perhaps really happen. To fix it, we will push the work of registering TIPC protocol handler into workqueue context. After the change, both paths taking ptype_lock are always in process contexts, thus, the deadlock should never occur. Signed-off-by:
Ying Xue <ying.xue@windriver.com> Signed-off-by:
Jon Maloy <jon.maloy@ericsson.com> Signed-off-by:
Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Jiri Bohac authored
[ Upstream commit 61e76b17 ] RFC 4443 has defined two additional codes for ICMPv6 type 1 (destination unreachable) messages: 5 - Source address failed ingress/egress policy 6 - Reject route to destination Now they are treated as protocol error and icmpv6_err_convert() converts them to EPROTO. RFC 4443 says: "Codes 5 and 6 are more informative subsets of code 1." Treat codes 5 and 6 as code 1 (EACCES) Btw, connect() returning -EPROTO confuses firefox, so that fallback to other/IPv4 addresses does not work: https://bugzilla.mozilla.org/show_bug.cgi?id=910773Signed-off-by:
Jiri Bohac <jbohac@suse.cz> Acked-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Daniel Borkmann authored
[ Upstream commit 2d98c29b ] While looking into MLDv1/v2 code, I noticed that bridging code does not convert it's max delay into jiffies for MLDv2 messages as we do in core IPv6' multicast code. RFC3810, 5.1.3. Maximum Response Code says: The Maximum Response Code field specifies the maximum time allowed before sending a responding Report. The actual time allowed, called the Maximum Response Delay, is represented in units of milliseconds, and is derived from the Maximum Response Code as follows: [...] As we update timers that work with jiffies, we need to convert it. Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Cc: Linus Lüssing <linus.luessing@web.de> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Thomas Graf authored
[ Upstream commit 25a6e6b8 ] Allocating skbs when sending out neighbour discovery messages currently uses sock_alloc_send_skb() based on a per net namespace socket and thus share a socket wmem buffer space. If a netdevice is temporarily unable to transmit due to carrier loss or for other reasons, the queued up ndisc messages will cosnume all of the wmem space and will thus prevent from any more skbs to be allocated even for netdevices that are able to transmit packets. The number of neighbour discovery messages sent is very limited, use of alloc_skb() bypasses the socket wmem buffer size enforcement while the manual call to skb_set_owner_w() maintains the socket reference needed for the IPv6 output path. This patch has orginally been posted by Eric Dumazet in a modified form. Signed-off-by:
Thomas Graf <tgraf@suug.ch> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Fabio Estevam <festevam@gmail.com> Tested-by:
Fabio Estevam <fabio.estevam@freescale.com> Tested-by:
Stephen Warren <swarren@nvidia.com> Acked-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Hannes Frederic Sowa authored
[ Upstream commit f46078cf ] It is not allowed for an ipv6 packet to contain multiple fragmentation headers. So discard packets which were already reassembled by fragmentation logic and send back a parameter problem icmp. The updates for RFC 6980 will come in later, I have to do a bit more research here. Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Hannes Frederic Sowa authored
[ Upstream commit 4b08a8f1 ] Because of the max_addresses check attackers were able to disable privacy extensions on an interface by creating enough autoconfigured addresses: <http://seclists.org/oss-sec/2012/q4/292> But the check is not actually needed: max_addresses protects the kernel to install too many ipv6 addresses on an interface and guards addrconf_prefix_rcv to install further addresses as soon as this limit is reached. We only generate temporary addresses in direct response of a new address showing up. As soon as we filled up the maximum number of addresses of an interface, we stop installing more addresses and thus also stop generating more temp addresses. Even if the attacker tries to generate a lot of temporary addresses by announcing a prefix and removing it again (lifetime == 0) we won't install more temp addresses, because the temporary addresses do count to the maximum number of addresses, thus we would stop installing new autoconfigured addresses when the limit is reached. This patch fixes CVE-2013-0343 (but other layer-2 attacks are still possible). Thanks to Ding Tianhong to bring this topic up again. Cc: Ding Tianhong <dingtianhong@huawei.com> Cc: George Kargiotakis <kargig@void.gr> Cc: P J P <ppandit@redhat.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Dan Carpenter authored
[ Upstream commit 15718ea0 ] The recent fix d9bf5f13 "tun: compare with 0 instead of total_len" is not totally correct. Because "len" and "sizeof()" are size_t type, that means they are never less than zero. Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Neil Horman authored
[ Upstream commits cf3c4c03 and d06f5187 (the latter is a fixup from Dave Jones) ] Self explanitory dma_mapping_error addition to the 8139 driver, based on this: https://bugzilla.redhat.com/show_bug.cgi?id=947250 It showed several backtraces arising for dma_map_* usage without checking the return code on the mapping. Add the check and abort the rx/tx operation if its failed. Untested as I have no hardware and the reporter has wandered off, but seems pretty straightforward. Signed-off-by:
Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> CC: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Hannes Frederic Sowa authored
[ Upstream commit 3e3be275 ] In case a subtree did not match we currently stop backtracking and return NULL (root table from fib_lookup). This could yield in invalid routing table lookups when using subtrees. Instead continue to backtrack until a valid subtree or node is found and return this match. Also remove unneeded NULL check. Reported-by:
Teco Boot <teco@inf-net.nl> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: David Lamparter <equinox@diac24.net> Cc: <boutier@pps.univ-paris-diderot.fr> Signed-off-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Eric Dumazet authored
[ Upstream commit cd6b423a ] While investigating about strange increase of retransmit rates on hosts ~24 days after boot, Van found hystart was disabled if ca->epoch_start was 0, as following condition is true when tcp_time_stamp high order bit is set. (s32)(tcp_time_stamp - ca->epoch_start) < HZ Quoting Van : At initialization & after every loss ca->epoch_start is set to zero so I believe that the above line will turn off hystart as soon as the 2^31 bit is set in tcp_time_stamp & hystart will stay off for 24 days. I think we've observed that cubic's restart is too aggressive without hystart so this might account for the higher drop rate we observe. Diagnosed-by:
Van Jacobson <vanj@google.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Eric Dumazet authored
[ Upstream commit 2ed0edf9 ] commit 17a6e9f1 ("tcp_cubic: fix clock dependency") added an overflow error in bictcp_update() in following code : /* change the unit from HZ to bictcp_HZ */ t = ((tcp_time_stamp + msecs_to_jiffies(ca->delay_min>>3) - ca->epoch_start) << BICTCP_HZ) / HZ; Because msecs_to_jiffies() being unsigned long, compiler does implicit type promotion. We really want to constrain (tcp_time_stamp - ca->epoch_start) to a signed 32bit value, or else 't' has unexpected high values. This bugs triggers an increase of retransmit rates ~24 days after boot [1], as the high order bit of tcp_time_stamp flips. [1] for hosts with HZ=1000 Big thanks to Van Jacobson for spotting this problem. Diagnosed-by:
Van Jacobson <vanj@google.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Acked-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Eric Dumazet authored
[ Upstream commit aab515d7 ] AddressSanitizer [1] dynamic checker pointed a potential out of bound access in leaf_walk_rcu() We could allocate one more slot in tnode_new() to leave the prefetch() in-place but it looks not worth the pain. Bug added in commit 82cfbb00 ("[IPV4] fib_trie: iterator recode") [1] : https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernelReported-by:
Andrey Konovalov <andreyknvl@google.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Roman Gushchin authored
[ Upstream commit 5f671d6b ] It's possible to assign an invalid value to the net.core.somaxconn sysctl variable, because there is no checks at all. The sk_max_ack_backlog field of the sock structure is defined as unsigned short. Therefore, the backlog argument in inet_listen() shouldn't exceed USHRT_MAX. The backlog argument in the listen() syscall is truncated to the somaxconn value. So, the somaxconn value shouldn't exceed 65535 (USHRT_MAX). Also, negative values of somaxconn are meaningless. before: $ sysctl -w net.core.somaxconn=256 net.core.somaxconn = 256 $ sysctl -w net.core.somaxconn=65536 net.core.somaxconn = 65536 $ sysctl -w net.core.somaxconn=-100 net.core.somaxconn = -100 after: $ sysctl -w net.core.somaxconn=256 net.core.somaxconn = 256 $ sysctl -w net.core.somaxconn=65536 error: "Invalid argument" setting key "net.core.somaxconn" $ sysctl -w net.core.somaxconn=-100 error: "Invalid argument" setting key "net.core.somaxconn" Based on a prior patch from Changli Gao. Signed-off-by:
Roman Gushchin <klamm@yandex-team.ru> Reported-by:
Changli Gao <xiaosuo@gmail.com> Suggested-by:
Eric Dumazet <edumazet@google.com> Acked-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
stephen hemminger authored
[ Upstream commit cbd37556 ] When userspace passes a large priority value the assignment of the unsigned value hopt->prio to signed int cl->prio causes cl->prio to become negative and the comparison is with TC_HTB_NUMPRIO is always false. The result is that HTB crashes by referencing outside the array when processing packets. With this patch the large value wraps around like other values outside the normal range. See: https://bugzilla.kernel.org/show_bug.cgi?id=60669Signed-off-by:
Stephen Hemminger <stephen@networkplumber.org> Acked-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
- 10 Sep, 2013 7 commits
-
-
Ben Hutchings authored
-
David Vrabel authored
commit 3bc38cbc upstream. If there are UNUSABLE regions in the machine memory map, dom0 will attempt to map them 1:1 which is not permitted by Xen and the kernel will crash. There isn't anything interesting in the UNUSABLE region that the dom0 kernel needs access to so we can avoid making the 1:1 mapping and treat it as RAM. We only do this for dom0, as that is where tboot case shows up. A PV domU could have an UNUSABLE region in its pseudo-physical map and would need to be handled in another patch. This fixes a boot failure on hosts with tboot. tboot marks a region in the e820 map as unusable and the dom0 kernel would attempt to map this region and Xen does not permit unusable regions to be mapped by guests. (XEN) 0000000000000000 - 0000000000060000 (usable) (XEN) 0000000000060000 - 0000000000068000 (reserved) (XEN) 0000000000068000 - 000000000009e000 (usable) (XEN) 0000000000100000 - 0000000000800000 (usable) (XEN) 0000000000800000 - 0000000000972000 (unusable) tboot marked this region as unusable. (XEN) 0000000000972000 - 00000000cf200000 (usable) (XEN) 00000000cf200000 - 00000000cf38f000 (reserved) (XEN) 00000000cf38f000 - 00000000cf3ce000 (ACPI data) (XEN) 00000000cf3ce000 - 00000000d0000000 (reserved) (XEN) 00000000e0000000 - 00000000f0000000 (reserved) (XEN) 00000000fe000000 - 0000000100000000 (reserved) (XEN) 0000000100000000 - 0000000630000000 (usable) Signed-off-by:
David Vrabel <david.vrabel@citrix.com> [v1: Altered the patch and description with domU's with UNUSABLE regions] Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Dominik Dingel authored
commit 2b29a9fd upstream. Any uaccess between guest_enter and guest_exit could trigger a page fault, the page fault handler would handle it as a guest fault and translate a user address as guest address. Signed-off-by:
Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> [bwh: Backported to 3.2: adjust context and add the rc variable] Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Nicholas Bellinger authored
commit ee60bddb upstream. This patch fixes spc_emulate_inquiry_std() to add trailing ASCII spaces for INQUIRY vendor + model fields following SPC-4 text: "ASCII data fields described as being left-aligned shall have any unused bytes at the end of the field (i.e., highest offset) and the unused bytes shall be filled with ASCII space characters (20h)." This addresses a problem with Falconstor NSS multipathing. Reported-by:
Tomas Molota <tomas.molota@lightstorm.sk> Signed-off-by:
Nicholas Bellinger <nab@linux-iscsi.org> [bwh: Backported to 3.2, based on Nicholas's versions for 3.0 and 3.4] Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Takashi Iwai authored
commit fb615499 upstream. The recent commit to delay the release of kobject triggered NULL dereferences of opti9xx drivers. The cause is that all snd-opti92x-ad1848, snd-opti92x-cs4231 and snd-opti93x drivers register the PnP card driver with the very same name, and also snd-opti92x-ad1848 and -cs4231 drivers register the ISA driver with the same name, too. When these drivers are built in, quick "register-release-and-re-register" actions occur, and this results in Oops because of the same name is assigned to the kobject. The fix is simply to assign individual names. As a bonus, by using KBUILD_MODNAME, the patch reduces more lines than it adds. The fix is based on the suggestion by Russell King. Reported-and-tested-by:
Fengguang Wu <fengguang.wu@intel.com> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
David S. Miller authored
commit 74c7b289 upstream. Otherwise if no references exist in the static kernel image, we won't export the symbol properly to modules. Signed-off-by:
David S. Miller <davem@davemloft.net> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-
Sam Ravnborg authored
commit de36e66d upstream. Based on copy from microblaze add ucmpdi2 implementation. This fixes build of niu driver which failed with: drivers/built-in.o: In function `niu_get_nfc': niu.c:(.text+0x91494): undefined reference to `__ucmpdi2' This driver will never be used on a sparc32 system, but patch added to fix build breakage with all*config builds. Signed-off-by:
Sam Ravnborg <sam@ravnborg.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk>
-