- 03 Oct, 2005 10 commits
-
-
Eric Dumazet authored
Arnaldo and I agreed it could be applied now, because I have other pending patches depending on this one (Thank you Arnaldo) (The other important patch moves skc_refcnt in a separate cache line, so that the SMP/NUMA performance doesnt suffer from cache line ping pongs) 1) First some performance data : -------------------------------- tcp_v4_rcv() wastes a *lot* of time in __inet_lookup_established() The most time critical code is : sk_for_each(sk, node, &head->chain) { if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif)) goto hit; /* You sunk my battleship! */ } The sk_for_each() does use prefetch() hints but only the begining of "struct sock" is prefetched. As INET_MATCH first comparison uses inet_sk(__sk)->daddr, wich is far away from the begining of "struct sock", it has to bring into CPU cache cold cache line. Each iteration has to use at least 2 cache lines. This can be problematic if some chains are very long. 2) The goal ----------- The idea I had is to change things so that INET_MATCH() may return FALSE in 99% of cases only using the data already in the CPU cache, using one cache line per iteration. 3) Description of the patch --------------------------- Adds a new 'unsigned int skc_hash' field in 'struct sock_common', filling a 32 bits hole on 64 bits platform. struct sock_common { unsigned short skc_family; volatile unsigned char skc_state; unsigned char skc_reuse; int skc_bound_dev_if; struct hlist_node skc_node; struct hlist_node skc_bind_node; atomic_t skc_refcnt; + unsigned int skc_hash; struct proto *skc_prot; }; Store in this 32 bits field the full hash, not masked by (ehash_size - 1) Using this full hash as the first comparison done in INET_MATCH permits us immediatly skip the element without touching a second cache line in case of a miss. Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to sk_hash and tw_hash) already contains the slot number if we mask with (ehash_size - 1) File include/net/inet_hashtables.h 64 bits platforms : #define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\ (((__sk)->sk_hash == (__hash)) ((*((__u64 *)&(inet_sk(__sk)->daddr)))== (__cookie)) && \ ((*((__u32 *)&(inet_sk(__sk)->dport))) == (__ports)) && \ (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif)))) 32bits platforms: #define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\ (((__sk)->sk_hash == (__hash)) && \ (inet_sk(__sk)->daddr == (__saddr)) && \ (inet_sk(__sk)->rcv_saddr == (__daddr)) && \ (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif)))) - Adds a prefetch(head->chain.first) in __inet_lookup_established()/__tcp_v4_check_established() and __inet6_lookup_established()/__tcp_v6_check_established() and __dccp_v4_check_established() to bring into cache the first element of the list, before the {read|write}_lock(&head->lock); Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Test for VIA K8T800 north bridge instead of AMD K8 HyperTransport bridge based on new information from Andi Kleen. The AMD HyperTransport interface is not responsible for PCI transactions and so the re-ordering is more likely done by the VIA north bridge. This code is subject to change if we get more information from AMD or VIA. PCI Express devices are excluded from doing the read flush since all chipsets in the write_reorder list are PCI chipsets. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Al Viro authored
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
I've found the problem in general. It affects any 64-bit architecture. The problem occurs when you change the system time. Suppose that when you boot your system clock is forward by a day. This gets recorded down in skb_tv_base. You then wind the clock back by a day. From that point onwards the offset will be negative which essentially overflows the 32-bit variables they're stored in. In fact, why don't we just store the real time stamp in those 32-bit variables? After all, we're not going to overflow for quite a while yet. When we do overflow, we'll need a better solution of course. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ravikiran G Thirumalai authored
2.6.14-rc2 does not assign cpus to proper nodeids on our em64t numa boxen. Our boxes use acpi srat for parsing the numa information. srat_detect_node() used phys_proc_id[] to get to the cpu's local apic id, but phys_proc_id[] represents the cpu<->initial_apic_id mapping. The following patch fixes this problem. Now apicid_to_node[] is properly indexed with the local apic id. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Linus Torvalds authored
-
Paul Jackson authored
Improve explanation of the Subject line fields in Documentation/SubmittingPatches Canonical Patch Format. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
James Bottomley authored
Some Legacy megaraid cards can't actually cope with the scatter/gather version of the READ CAPACITY command (which is what we now send them since altering all SCSI internal I/O to go via the block layer). Fix this (and a few other broken megaraid driver assumptions) by sending the non-sg version of the command if the sg list only has a single element. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Linus Torvalds authored
-
Paul Jackson authored
Document more details of patch format such as the "from" line and the "---" marker line, and provide more references for patch guidelines. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 02 Oct, 2005 9 commits
-
-
Catalin Marinas authored
Patch from Catalin Marinas Data abort caused by ldrex/strex can leave the exclusive monitor in an unpredictable state. It is recommended that a clrex/strex is performed to clear this state. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Richard Henderson authored
Pass in the pointer to the on-stack registers rather than using them directly as the arguments. Ivan noticed that I missed a spot when purging the registers as first stack parameter idiom. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
James Bottomley authored
In these drivers, scsi_remove_host() is called too late, at the point it is called, the driver has already shut down too far to accept any I/O that the shutdown might generate. Any generated I/O actually triggers a panic. Fix this by calling scsi_remove_host() as early as possible and not calling scsi_host_put() until just before we kfree the ahc_softc. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
James Bottomley authored
There's a problem in our host release in that it calls scsi_proc_hostdir_rm(). However, if you hold a reference to the host as you remove the module, the host template (which proc uses) will be freed and the system will panic when the host device is finally released. Fix this by moving scsi_proc_hostdir_rm() to where it should be: in scsi_remove_host(). Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Russell King authored
Arrange for the initialisation printks to happen after we've registered the network interface, so we know what name the device is. Also, check the link every 500ms (and use msecs_to_jiffies.) Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Russell King authored
EBSA110 link detection didn't read the register - it wrote it. Oops. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Linus Torvalds authored
-
Sven Henkel authored
This adds the new iBook G4 (manufactured after July 2005) to the PowerMac models table. The model name (PowerBook6,7) is taken from a 12" iBook, I don't know if it also matches the 14" version. The patch applies to a vanilla 2.6.13.2 kernel. Signed-off-by: Sven Henkel <shenkel@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Sven Henkel authored
This adds suspend support for the Radeon M11 chip in 12" iBooks manufactured after July 2005. I don't know if the new 14" iBooks also have that chip, so they might also be supported. The chip identifies itself as "RV350 NV" (pci id 0x4e56), revision 0x80. Apple calls it "Snowy", xfree86 names it "ATI FireGL Mobility T2 (M11) NV (AGP)". So, we seem to be lucky here: The suspend-code for the M10 (which also is a "RV350 NV") works flawless for that chip. Signed-off-by: Sven Henkel <shenkel@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 01 Oct, 2005 9 commits
-
-
Vincent Sanders authored
Patch from Vincent Sanders When building the fortunet ARM platform it fails to compile because of missing include. Signed-off-by: Vincent Sanders <vince@arm.linux.org.uk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Vincent Sanders authored
Patch from Vincent Sanders When building the mx1ads ARM platforms the serial driver fails to compile with GCC 4.01 due to extern/static ambiguity. Signed-off-by: Vincent Sanders <vince@arm.linux.org.uk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Linus Torvalds authored
We should always use bitmask ops, rather than depend on some ordering of the different states. With the TASK_NONINTERACTIVE flag, the inequality doesn't really work. Oleg Nesterov argues (likely correctly) that this test is unnecessary in the first place. However, the minimal fix for now is to at least make it work in the presense of TASK_NONINTERACTIVE. Waiting for consensus from Roland & co on potential bigger cleanups. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Diego Calleja authored
Use '#ifdef' consistently on __KERNEL__. This was reported as bug #5340 (isn't easier to send a fix than report the bug?!) Signed-off-by: Diego Calleja <diegocg@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Ananth N Mavinakayanahalli authored
The incorrect kprobe_mutex usage on x86_64 had percolated to ppc64 too. First noticed by Yanmin Zhang. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Deepak Saxena authored
Serial port only needs 32 bytes of resource space but we are currently asking for 64K. Signed-off-by: Deepak Saxena <dsaxena@plexity.net> [ diff went missing first time due to corrupted patch ] Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Francois Romieu authored
Tone down the r8169 driver As an alternative, people can use the boot time 'debug' option and/or use 'ethtool -s ethX msglvl xyz'. The different messages are listed at: http://www.zoreil.com/~romieu/r8169/doc/msglvl.txtSigned-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Benjamin Herrenschmidt authored
The old 550Mhz titanium powerbook can switch to a lower frequency (500Mhz). A user has been repeately reporting overtemp conditions on his machine at high speed so this simple patch adds support to PowerMac cpufreq for this machine. The difference in frequency isn't big but seem enough to fix that user's problems. The patch has been around for some time now and doesn't seem to cause any problem, so I suppose it could go in now. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Alain RICHARD <alain.richard@equation.fr> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Deepak Saxena authored
Serial port only needs 32 bytes of resource space but we are currently asking for 64K. Signed-off-by: Deepak Saxena <dsaxena@plexity.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 30 Sep, 2005 12 commits
-
-
Linus Torvalds authored
-
Ravikiran G Thirumalai authored
The tests Alok carried out on Petr's box confirmed that cpu_to_node[BP] is not setup early enough by numa_init_array due to the x86_64 changes in 2.6.14-rc*, and unfortunately set wrongly by the work around code in numa_init_array(). cpu_to_node[0] gets set with 1 early and later gets set properly to 0 during identify_cpu() when all cpus are brought up, but confusing the numa slab in the process. Here is a quick fix for this. The right fix obviously is to have cpu_to_node[bsp] setup early for numa_init_array(). The following patch will fix the problem now, and the code can stay on even when cpu_to_node{BP] gets fixed early correctly. Thanks to Petr for access to his box. Signed off by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Ravikiran G Thirumalai authored
Fix the BP node_to_cpumask. 2.6.14-rc* broke the boot cpu bit as the cpu_to_node(0) is now not setup early enough for numa_init_array. cpu_to_node[] is setup much later at srat_detect_node on acpi srat based em64t machines. This seems like a problem on amd machines too, Tested on em64t though. /sys/devices/system/node/node0/cpumap shows up sanely after this patch. Signed off by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Zhang, Yanmin authored
The up()/down() orders are incorrect in arch/x86_64/kprobes.c file. kprobe_mutext is used to protect the free kprobe instruction slot list. arch_prepare_kprobe applies for a slot from the free list, and arch_remove_kprobe returns a slot to the free list. The incorrect up()/down() orders to operate on kprobe_mutex fail to protect the free list. If 2 threads try to get/return kprobe instruction slot at the same time, the free slot list might be broken, or a free slot might be applied by 2 threads. Signed-off-by: Zhang Yanmin <Yanmin.zhang@intel.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jody McIntyre authored
less noise in dmesg Signed-off-by: Ben Collins <bcollins@debian.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jody McIntyre authored
amdtp, dv1394, raw1394, video1394: Delete legacy module aliases. The macros did not work and the aliases are not needed nowadays. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Ben Collins <bcollins@debian.org> Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jody McIntyre authored
Work around limitation in rawiso routines. Required with 1394b cards on architectures where PAGE_SIZE is 4096. Based on a previous patch by Ben Collins. Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Cc: Ben Collins <bcollins@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jody McIntyre authored
Remove superfluous include. Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Cc: Ben Collins <bcollins@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jody McIntyre authored
trivial edits of a few comments Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Ben Collins <bcollins@debian.org> Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jody McIntyre authored
Use of time_before() macro, defined at linux/jiffies.h, which deal with wrapping correctly and are nicer to read. Signed-off-by: Marcelo Feitoza Parisi <marcelo@feitoza.com.br> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Ben Collins <bcollins@debian.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jody McIntyre authored
Fix debug code so it prints the correct speed (was defaulting to 100, so anything > 400 showed only 100). Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Ben Collins <bcollins@debian.org> Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jody McIntyre authored
Skip a superfluous pause that occured when the config ROM of a node was scanned unsuccessfully. This also occurs if a node without link wrongly enables its "link active" self ID flag. A GWCTech 6-port hub does this. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Cc: Ben Collins <bcollins@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-