- 01 Mar, 2012 35 commits
-
-
Bruno Thomsen authored
commit c6c1e449 upstream. Signed-off-by:
Bruno Thomsen <bruno.thomsen@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Neal Cardwell authored
[ Upstream commit 0af2a0d0 ] This commit ensures that lost_cnt_hint is correctly updated in tcp_shifted_skb() for FACK TCP senders. The lost_cnt_hint adjustment in tcp_sacktag_one() only applies to non-FACK senders, so FACK senders need their own adjustment. This applies the spirit of 1e5289e1 - except now that the sequence range passed into tcp_sacktag_one() is correct we need only have a special case adjustment for FACK. Signed-off-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Neal Cardwell authored
[ Upstream commit daef52ba ] Fix the newly-SACKed range to be the range of newly-shifted bytes. Previously - since 832d11c5 - tcp_shifted_skb() incorrectly called tcp_sacktag_one() with the start and end sequence numbers of the skb it passes in set to the range just beyond the range that is newly-SACKed. This commit also removes a special-case adjustment to lost_cnt_hint in tcp_shifted_skb() since the pre-existing adjustment of lost_cnt_hint in tcp_sacktag_one() now properly handles this things now that the correct start sequence number is passed in. Signed-off-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Neal Cardwell authored
[ Upstream commit cc9a672e ] This commit allows callers of tcp_sacktag_one() to pass in sequence ranges that do not align with skb boundaries, as tcp_shifted_skb() needs to do in an upcoming fix in this patch series. In fact, now tcp_sacktag_one() does not need to depend on an input skb at all, which makes its semantics and dependencies more clear. Signed-off-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
[ Upstream commit 5ca3b72c ] Shlomo Pongratz reported GRO L2 header check was suited for Ethernet only, and failed on IB/ipoib traffic. He provided a patch faking a zeroed header to let GRO aggregates frames. Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header check to be more generic, ie not assuming L2 header is 14 bytes, but taking into account hard_header_len. __napi_gro_receive() has special handling for the common case (Ethernet) to avoid a memcmp() call and use an inline optimized function instead. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Reported-by:
Shlomo Pongratz <shlomop@mellanox.com> Cc: Roland Dreier <roland@kernel.org> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Tested-by:
Sean Hefty <sean.hefty@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Roland Dreier authored
[ Upstream commit 936d7de3 ] Commit a0417fa3 ("net: Make qdisc_skb_cb upper size bound explicit.") made it possible for a netdev driver to use skb->cb between its header_ops.create method and its .ndo_start_xmit method. Use this in ipoib_hard_header() to stash away the LL address (GID + QPN), instead of the "ipoib_pseudoheader" hack. This allows IPoIB to stop lying about its hard_header_len, which will let us fix the L2 check for GRO. Signed-off-by:
Roland Dreier <roland@purestorage.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
David S. Miller authored
[ Upstream commit 16bda13d ] Just like skb->cb[], so that qdisc_skb_cb can be encapsulated inside of other data structures. This is intended to be used by IPoIB so that it can remember addressing information stored at hard_header_ops->create() time that it can fetch when the packet gets to the transmit routine. Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Li Wei authored
[ Upstream commit 5dc7883f ] This patch fix a bug which introduced by commit ac8a4810 (ipv4: Save nexthop address of LSRR/SSRR option to IPCB.).In that patch, we saved the nexthop of SRR in ip_option->nexthop and update iph->daddr until we get to ip_forward_options(), but we need to update it before ip_rt_get_source(), otherwise we may get a wrong src. Signed-off-by:
Li Wei <lw@cn.fujitsu.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shawn Lu authored
[ Upstream commit e2446eaa ] Binding RST packet outgoing interface to incoming interface for tcp v4 when there is no socket associate with it. when sk is not NULL, using sk->sk_bound_dev_if instead. (suggested by Eric Dumazet). This has few benefits: 1. tcp_v6_send_reset already did that. 2. This helps tcp connect with SO_BINDTODEVICE set. When connection is lost, we still able to sending out RST using same interface. 3. we are sending reply, it is most likely to be succeed if iif is used Signed-off-by:
Shawn Lu <shawn.lu@ericsson.com> Acked-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Julian Anastasov authored
[ Upstream commit e6b45241 ] Eric Dumazet found that commit 813b3b5d (ipv4: Use caller's on-stack flowi as-is in output route lookups.) that comes in 3.0 added a regression. The problem appears to be that resulting flowi4_oif is used incorrectly as input parameter to some routing lookups. The result is that when connecting to local port without listener if the IP address that is used is not on a loopback interface we incorrectly assign RTN_UNICAST to the output route because no route is matched by oif=lo. The RST packet can not be sent immediately by tcp_v4_send_reset because it expects RTN_LOCAL. So, change ip_route_connect and ip_route_newports to update the flowi4 fields that are input parameters because we do not want unnecessary binding to oif. To make it clear what are the input parameters that can be modified during lookup and to show which fields of floiw4 are reused add a new function to update the flowi4 structure: flowi4_update_output. Thanks to Yurij M. Plotnikov for providing a bug report including a program to reproduce the problem. Thanks to Eric Dumazet for tracking the problem down to tcp_v4_send_reset and providing initial fix. Reported-by:
Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru> Signed-off-by:
Julian Anastasov <ja@ssi.bg> Acked-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
David Lv authored
[ Upstream commit b530b193 ] Initially diagnosed on Ubuntu 11.04 with kernel 2.6.38. velocity_close is not called during a suspend / resume cycle in this driver and it has no business playing directly with power states. Signed-off-by:
David Lv <DavidLv@viatech.com.cn> Acked-by:
Francois Romieu <romieu@fr.zoreil.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hagen Paul Pfeifer authored
[ Upstream commit 23711438 ] VETH_INFO_PEER carries struct ifinfomsg plus optional IFLA attributes. A minimal size of sizeof(struct ifinfomsg) must be enforced or we may risk accessing that struct beyond the limits of the netlink message. Signed-off-by:
Thomas Graf <tgraf@suug.ch> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hagen Paul Pfeifer authored
[ Upstream commit eb101924 ] Not now, but it looks you are correct. q->qdisc is NULL until another additional qdisc is attached (beside tfifo). See 50612537. The following patch should work. From: Hagen Paul Pfeifer <hagen@jauu.net> netem: catch NULL pointer by updating the real qdisc statistic Reported-by:
Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by:
Hagen Paul Pfeifer <hagen@jauu.net> Acked-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
[ Upstream commit 58e05f35 ] commit 5a698af5 (bond: service netpoll arp queue on master device) tested IFF_SLAVE flag against dev->priv_flags instead of dev->flags Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Cc: WANG Cong <amwang@redhat.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Graf authored
[ Upstream commit 70620c46 ] Commit 653241 (net: RFC3069, private VLAN proxy arp support) changed the behavior of arp proxy to send arp replies back out on the interface the request came in even if the private VLAN feature is disabled. Previously we checked rt->dst.dev != skb->dev for in scenarios, when proxy arp is enabled on for the netdevice and also when individual proxy neighbour entries have been added. This patch adds the check back for the pneigh_lookup() scenario. Signed-off-by:
Thomas Graf <tgraf@suug.ch> Acked-by:
Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
[ Upstream commit 3013dc0c ] Jean Delvare reported bonding on top of 3c59x adapters was not detecting network cable removal fast enough. 3c59x indeed uses a 60 seconds timer to check link status if carrier is on, and 5 seconds if carrier is off. This patch reduces timer period to 5 seconds if device is a bonding slave. Reported-by:
Jean Delvare <jdelvare@suse.de> Acked-by:
Jean Delvare <jdelvare@suse.de> Acked-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rabin Vincent authored
commit 8e43a905 upstream. Bootup with lockdep enabled has been broken on v7 since b46c0f74 ("ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR"). This is because v7_setup (which is called very early during boot) calls v7_flush_dcache_all, and the save_and_disable_irqs added by that patch ends up attempting to call into lockdep C code (trace_hardirqs_off()) when we are in no position to execute it (no stack, MMU off). Fix this by using a notrace variant of save_and_disable_irqs. The code already uses the notrace variant of restore_irqs. Reviewed-by:
Nicolas Pitre <nico@linaro.org> Acked-by:
Stephen Boyd <sboyd@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Rabin Vincent <rabin@rab.in> Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stephen Boyd authored
commit b46c0f74 upstream. armv7's flush_cache_all() flushes caches via set/way. To determine the cache attributes (line size, number of sets, etc.) the assembly first writes the CSSELR register to select a cache level and then reads the CCSIDR register. The CSSELR register is banked per-cpu and is used to determine which cache level CCSIDR reads. If the task is migrated between when the CSSELR is written and the CCSIDR is read the CCSIDR value may be for an unexpected cache level (for example L1 instead of L2) and incorrect cache flushing could occur. Disable interrupts across the write and read so that the correct cache attributes are read and used for the cache flushing routine. We disable interrupts instead of disabling preemption because the critical section is only 3 instructions and we want to call v7_dcache_flush_all from __v7_setup which doesn't have a full kernel stack with a struct thread_info. This fixes a problem we see in scm_call() when flush_cache_all() is called from preemptible context and sometimes the L2 cache is not properly flushed out. Signed-off-by:
Stephen Boyd <sboyd@codeaurora.org> Acked-by:
Catalin Marinas <catalin.marinas@arm.com> Reviewed-by:
Nicolas Pitre <nico@linaro.org> Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Weston Andros Adamson authored
commit abe9a6d5 upstream. server_scope would never be freed if nfs4_check_cl_exchange_flags() returned non-zero Signed-off-by:
Weston Andros Adamson <dros@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Trond Myklebust authored
commit b9f9a031 upstream. To ensure that we don't just reuse the bad delegation when we attempt to recover the nfs4_state that received the bad stateid error. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Trond Myklebust authored
commit 331818f1 upstream. Commit bf118a34 (NFSv4: include bitmap in nfsv4 get acl data) introduces the 'acl_scratch' page for the case where we may need to decode multi-page data. However it fails to take into account the fact that the variable may be NULL (for the case where we're not doing multi-page decode), and it also attaches it to the encoding xdr_stream rather than the decoding one. The immediate result is an Oops in nfs4_xdr_enc_getacl due to the call to page_address() with a NULL page pointer. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Andy Adamson <andros@netapp.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Johan Rudholm authored
commit 4d6144de upstream. If the read or write buffer size associated with the command sent through the mmc_blk_ioctl is zero, do not prepare data buffer. This enables a ioctl(2) call to for instance send a MMC_SWITCH to set a byte in the ext_csd. Signed-off-by:
Johan Rudholm <johan.rudholm@stericsson.com> Signed-off-by:
Chris Ball <cjb@laptop.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takashi Iwai authored
[Note that since the patch isn't applicable (and unnecessary) to 3.3-rc, there is no corresponding upstream fix.] The cx5051 parser calls snd_hda_input_jack_add() in the init callback to create and initialize the jack detection instances. Since the init callback is called at each time when the device gets woken up after suspend or power-saving mode, the duplicated instances are accumulated at each call. This ends up with the kernel warnings with the too large array size. The fix is simply to move the calls of snd_hda_input_jack_add() into the parser section instead of the init callback. The fix is needed only up to 3.2 kernel, since the HD-audio jack layer was redesigned in the 3.3 kernel. Reported-by:
Russell King <rmk+kernel@arm.linux.org.uk> Tested-by:
Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Javi Merino authored
commit 46e33c60 upstream. This fixes the thrd->req_running field being accessed before thrd is checked for null. The error was introduced in abb959f8: ARM: 7237/1: PL330: Fix driver freeze Reference: <1326458191-23492-1-git-send-email-mans.rullgard@linaro.org> Signed-off-by:
Mans Rullgard <mans.rullgard@linaro.org> Acked-by:
Javi Merino <javi.merino@arm.com> Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Miklos Szeredi authored
commit e188dc02 upstream. d_inode_lookup() leaks a dentry reference on IS_DEADDIR(). Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Martin Schwidefsky authored
commit cf1eb40f upstream. The conversion of the ktime to a value suitable for the clock comparator does not take changes to wall_to_monotonic into account. In fact the conversion just needs the boot clock (sched_clock_base_cc) and the total_sleep_time. This is applicable to 3.2+ kernels. Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tyler Hicks authored
commit 545d6809 upstream. After passing through a ->setxattr() call, eCryptfs needs to copy the inode attributes from the lower inode to the eCryptfs inode, as they may have changed in the lower filesystem's ->setxattr() path. One example is if an extended attribute containing a POSIX Access Control List is being set. The new ACL may cause the lower filesystem to modify the mode of the lower inode and the eCryptfs inode would need to be updated to reflect the new mode. https://launchpad.net/bugs/926292Signed-off-by:
Tyler Hicks <tyhicks@canonical.com> Reported-by:
Sebastien Bacher <seb128@ubuntu.com> Cc: John Johansen <john.johansen@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lars-Peter Clausen authored
commit 61cddc57 upstream. Currently registers with a value of 0 are ignored when initializing the register defaults from raw defaults. This worked in the past, because registers without a explicit default were assumed to have a default value of 0. This was changed in commit b03622a8 ("regmap: Ensure rbtree syncs registers set to zero properly"). As a result registers, which have a raw default value of 0 are now assumed to have no default. This again can result in unnecessary writes when syncing the cache. It will also result in unnecessary reads for e.g. the first update operation. In the case where readback is not possible this will even let the update operation fail, if the register has not been written to before. So this patch removes the check. Instead it adds a check to ignore raw defaults for registers which are volatile, since those registers are not cached. Signed-off-by:
Lars-Peter Clausen <lars@metafoo.de> Signed-off-by:
Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tim Gardner authored
commit 72ba009b upstream. BugLink: http://bugs.launchpad.net/bugs/900802Signed-off-by:
Till Kamppeter <till.kamppeter@gmail.com> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mohammed Shafi Shajakhan authored
commit b57e6b56 upstream. read_lock(&tpt_trig->trig.leddev_list_lock) is accessed via the path ieee80211_open (->) ieee80211_do_open (->) ieee80211_mod_tpt_led_trig (->) ieee80211_start_tpt_led_trig (->) tpt_trig_timer before initializing it. the intilization of this read/write lock happens via the path ieee80211_led_init (->) led_trigger_register, but we are doing 'ieee80211_led_init' after 'ieeee80211_if_add' where we register netdev_ops. so we access leddev_list_lock before initializing it and causes the following bug in chrome laptops with AR928X cards with the following script while true do sudo modprobe -v ath9k sleep 3 sudo modprobe -r ath9k sleep 3 done BUG: rwlock bad magic on CPU#1, wpa_supplicant/358, f5b9eccc Pid: 358, comm: wpa_supplicant Not tainted 3.0.13 #1 Call Trace: [<8137b9df>] rwlock_bug+0x3d/0x47 [<81179830>] do_raw_read_lock+0x19/0x29 [<8137f063>] _raw_read_lock+0xd/0xf [<f9081957>] tpt_trig_timer+0xc3/0x145 [mac80211] [<f9081f3a>] ieee80211_mod_tpt_led_trig+0x152/0x174 [mac80211] [<f9076a3f>] ieee80211_do_open+0x11e/0x42e [mac80211] [<f9075390>] ? ieee80211_check_concurrent_iface+0x26/0x13c [mac80211] [<f9076d97>] ieee80211_open+0x48/0x4c [mac80211] [<812dbed8>] __dev_open+0x82/0xab [<812dc0c9>] __dev_change_flags+0x9c/0x113 [<812dc1ae>] dev_change_flags+0x18/0x44 [<8132144f>] devinet_ioctl+0x243/0x51a [<81321ba9>] inet_ioctl+0x93/0xac [<812cc951>] sock_ioctl+0x1c6/0x1ea [<812cc78b>] ? might_fault+0x20/0x20 [<810b1ebb>] do_vfs_ioctl+0x46e/0x4a2 [<810a6ebb>] ? fget_light+0x2f/0x70 [<812ce549>] ? sys_recvmsg+0x3e/0x48 [<810b1f35>] sys_ioctl+0x46/0x69 [<8137fa77>] sysenter_do_call+0x12/0x2 Cc: Gary Morain <gmorain@google.com> Cc: Paul Stewart <pstew@google.com> Cc: Abhijit Pradhan <abhijit@qca.qualcomm.com> Cc: Vasanthakumar Thiagarajan <vthiagar@qca.qualcomm.com> Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com> Acked-by:
Johannes Berg <johannes.berg@intel.com> Tested-by:
Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com> Signed-off-by:
Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yinghai Lu authored
commit 71f6bd4a upstream. Fixes PCI device detection on IBM xSeries IBM 3850 M2 / x3950 M2 when using ACPI resources (_CRS). This is default, a manual workaround (without this patch) would be pci=nocrs boot param. V2: Add dev_warn if the workaround is hit. This should reveal how common such setups are (via google) and point to possible problems if things are still not working as expected. -> Suggested by Jan Beulich. Tested-by: garyhade@us.ibm.com Signed-off-by:
Yinghai Lu <yinghai.lu@oracle.com> Signed-off-by:
Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alex Deucher authored
commit b7f5b7de upstream. MSI_REARM_EN register is a write only trigger register. There is no need RMW when re-arming. May fix: https://bugs.freedesktop.org/show_bug.cgi?id=41668Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Dave Airlie <airlied@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nicolas Ferre authored
commit e8c9dc93 upstream. Registration of at91_udc as a module will enable SoC related code. Fix following an idea from Karel Znamenacek. Signed-off-by:
Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by:
Karel Znamenacek <karel@ryston.cz> Acked-by:
Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Anton Blanchard authored
commit 9a45a940 upstream. perf on POWER stopped working after commit e050e3f0 (perf: Fix broken interrupt rate throttling). That patch exposed a bug in the POWER perf_events code. Since the PMCs count upwards and take an exception when the top bit is set, we want to write 0x80000000 - left in power_pmu_start. We were instead programming in left which effectively disables the counter until we eventually hit 0x80000000. This could take seconds or longer. With the patch applied I get the expected number of samples: SAMPLE events: 9948 Signed-off-by:
Anton Blanchard <anton@samba.org> Acked-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Greg Kroah-Hartman authored
commit 735e93c7 upstream. This adds the .gitignore file for the autogenerated TOMOYO files to keep git from complaining after building things. Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: James Morris <jmorris@namei.org> Acked-by:
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by:
James Morris <jmorris@namei.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 27 Feb, 2012 5 commits
-
-
Greg Kroah-Hartman authored
-
Linus Torvalds authored
commit 34ddc81a upstream. After all the FPU state cleanups and finally finding the problem that caused all our FPU save/restore problems, this re-introduces the preloading of FPU state that was removed in commit b3b0870e ("i387: do not preload FPU state at task switch time"). However, instead of simply reverting the removal, this reimplements preloading with several fixes, most notably - properly abstracted as a true FPU state switch, rather than as open-coded save and restore with various hacks. In particular, implementing it as a proper FPU state switch allows us to optimize the CR0.TS flag accesses: there is no reason to set the TS bit only to then almost immediately clear it again. CR0 accesses are quite slow and expensive, don't flip the bit back and forth for no good reason. - Make sure that the same model works for both x86-32 and x86-64, so that there are no gratuitous differences between the two due to the way they save and restore segment state differently due to architectural differences that really don't matter to the FPU state. - Avoid exposing the "preload" state to the context switch routines, and in particular allow the concept of lazy state restore: if nothing else has used the FPU in the meantime, and the process is still on the same CPU, we can avoid restoring state from memory entirely, just re-expose the state that is still in the FPU unit. That optimized lazy restore isn't actually implemented here, but the infrastructure is set up for it. Of course, older CPU's that use 'fnsave' to save the state cannot take advantage of this, since the state saving also trashes the state. In other words, there is now an actual _design_ to the FPU state saving, rather than just random historical baggage. Hopefully it's easier to follow as a result. Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Linus Torvalds authored
commit f94edacf upstream. This moves the bit that indicates whether a thread has ownership of the FPU from the TS_USEDFPU bit in thread_info->status to a word of its own (called 'has_fpu') in task_struct->thread.has_fpu. This fixes two independent bugs at the same time: - changing 'thread_info->status' from the scheduler causes nasty problems for the other users of that variable, since it is defined to be thread-synchronous (that's what the "TS_" part of the naming was supposed to indicate). So perfectly valid code could (and did) do ti->status |= TS_RESTORE_SIGMASK; and the compiler was free to do that as separate load, or and store instructions. Which can cause problems with preemption, since a task switch could happen in between, and change the TS_USEDFPU bit. The change to TS_USEDFPU would be overwritten by the final store. In practice, this seldom happened, though, because the 'status' field was seldom used more than once, so gcc would generally tend to generate code that used a read-modify-write instruction and thus happened to avoid this problem - RMW instructions are naturally low fat and preemption-safe. - On x86-32, the current_thread_info() pointer would, during interrupts and softirqs, point to a *copy* of the real thread_info, because x86-32 uses %esp to calculate the thread_info address, and thus the separate irq (and softirq) stacks would cause these kinds of odd thread_info copy aliases. This is normally not a problem, since interrupts aren't supposed to look at thread information anyway (what thread is running at interrupt time really isn't very well-defined), but it confused the heck out of irq_fpu_usable() and the code that tried to squirrel away the FPU state. (It also caused untold confusion for us poor kernel developers). It also turns out that using 'task_struct' is actually much more natural for most of the call sites that care about the FPU state, since they tend to work with the task struct for other reasons anyway (ie scheduling). And the FPU data that we are going to save/restore is found there too. Thanks to Arjan Van De Ven <arjan@linux.intel.com> for pointing us to the %esp issue. Cc: Arjan van de Ven <arjan@linux.intel.com> Reported-and-tested-by:
Raphael Prevost <raphael@buro.asia> Acked-and-tested-by:
Suresh Siddha <suresh.b.siddha@intel.com> Tested-by:
Peter Anvin <hpa@zytor.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Linus Torvalds authored
commit 4903062b upstream. The AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is pending. In order to not leak FIP state from one process to another, we need to do a floating point load after the fxsave of the old process, and before the fxrstor of the new FPU state. That resets the state to the (uninteresting) kernel load, rather than some potentially sensitive user information. We used to do this directly after the FPU state save, but that is actually very inconvenient, since it (a) corrupts what is potentially perfectly good FPU state that we might want to lazy avoid restoring later and (b) on x86-64 it resulted in a very annoying ordering constraint, where "__unlazy_fpu()" in the task switch needs to be delayed until after the DS segment has been reloaded just to get the new DS value. Coupling it to the fxrstor instead of the fxsave automatically avoids both of these issues, and also ensures that we only do it when actually necessary (the FP state after a save may never actually get used). It's simply a much more natural place for the leaked state cleanup. Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Linus Torvalds authored
commit b3b0870e upstream. Yes, taking the trap to re-load the FPU/MMX state is expensive, but so is spending several days looking for a bug in the state save/restore code. And the preload code has some rather subtle interactions with both paravirtualization support and segment state restore, so it's not nearly as simple as it should be. Also, now that we no longer necessarily depend on a single bit (ie TS_USEDFPU) for keeping track of the state of the FPU, we migth be able to do better. If we are really switching between two processes that keep touching the FP state, save/restore is inevitable, but in the case of having one process that does most of the FPU usage, we may actually be able to do much better than the preloading. In particular, we may be able to keep track of which CPU the process ran on last, and also per CPU keep track of which process' FP state that CPU has. For modern CPU's that don't destroy the FPU contents on save time, that would allow us to do a lazy restore by just re-enabling the existing FPU state - with no restore cost at all! Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-