- 14 Jan, 2010 2 commits
-
-
Linus Torvalds authored
This one is much faster than the spinlock based fallback rwsem code, with certain artifical benchmarks having shown 300%+ improvement on threaded page faults etc. Again, note the 32767-thread limit here. So this really does need that whole "make rwsem_count_t be 64-bit and fix the BIAS values to match" extension on top of it, but that is conceptually a totally independent issue. NOT TESTED! The original patch that this all was based on were tested by KAMEZAWA Hiroyuki, but maybe I screwed up something when I created the cleaned-up series, so caveat emptor.. Also note that it _may_ be a good idea to mark some more registers clobbered on x86-64 in the inline asms instead of saving/restoring them. They are inline functions, but they are only used in places where there are not a lot of live registers _anyway_, so doing for example the clobbers of %r8-%r11 in the asm wouldn't make the fast-path code any worse, and would make the slow-path code smaller. (Not that the slow-path really matters to that degree. Saving a few unnecessary registers is the _least_ of our problems when we hit the slow path. The instruction/cycle counting really only matters in the fast path). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <alpine.LFD.2.00.1001121810410.17145@localhost.localdomain> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Linus Torvalds authored
The fast version of the rwsems (the code that uses xadd) has traditionally only worked on x86-32, and as a result it mixes different kinds of types wildly - they just all happen to be 32-bit. We have "long", we have "__s32", and we have "int". To make it work on x86-64, the types suddenly matter a lot more. It can be either a 32-bit or 64-bit signed type, and both work (with the caveat that a 32-bit counter will only have 15 bits of effective write counters, so it's limited to 32767 users). But whatever type you choose, it needs to be used consistently. This makes a new 'rwsem_counter_t', that is a 32-bit signed type. For a 64-bit type, you'd need to also update the BIAS values. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <alpine.LFD.2.00.1001121755220.17145@localhost.localdomain> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
- 13 Jan, 2010 3 commits
-
-
Brian Gerst authored
Using kernel_stack_pointer() allows 32-bit and 64-bit versions to be merged. This is more correct for 64-bit, since the old %rsp is always saved on the stack. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1263397555-27695-1-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Dave Jones authored
Use a macro to define the cache sizes when cachesize > 1 MB. This is less typing, and less prone to introducing bugs like we saw in e02e0e1a, and means we don't have to do maths when adding new non-power-of-2 updates like those seen recently. Signed-off-by: Dave Jones <davej@redhat.com> LKML-Reference: <20100104144735.GA18390@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Linus Torvalds authored
This makes gcc use the right register names and instruction operand sizes automatically for the rwsem inline asm statements. So instead of using "(%%eax)" to specify the memory address that is the semaphore, we use "(%1)" or similar. And instead of forcing the operation to always be 32-bit, we use "%z0", taking the size from the actual semaphore data structure itself. This doesn't actually matter on x86-32, but if we want to use the same inline asm for x86-64, we'll need to have the compiler generate the proper 64-bit names for the registers (%rax instead of %eax), and if we want to use a 64-bit counter too (in order to avoid the 15-bit limit on the write counter that limits concurrent users to 32767 threads), we'll need to be able to generate instructions with "q" accesses rather than "l". Since this header currently isn't enabled on x86-64, none of that matters, but we do want to use the xadd version of the semaphores rather than have to take spinlocks to do a rwsem. The mm->mmap_sem can be heavily contended when you have lots of threads all taking page faults, and the fallback rwsem code that uses a spinlock performs abysmally badly in that case. [ hpa: modified the patch to skip size suffixes entirely when they are redundant due to register operands. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <alpine.LFD.2.00.1001121613560.17145@localhost.localdomain> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
- 07 Jan, 2010 3 commits
-
-
Brian Gerst authored
Merge the now identical code from asm/atomic_32.h and asm/atomic_64.h into asm/atomic.h. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1262883215-4034-4-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Brian Gerst authored
Prepare for merging into asm/atomic.h. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1262883215-4034-3-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Brian Gerst authored
Split atomic64_t functions out into separate headers, since they will not be practical to merge between 32 and 64 bits. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1262883215-4034-2-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
- 30 Dec, 2009 3 commits
-
-
Jan Beulich authored
In order to avoid unnecessary chains of branches, rather than implementing memcpy()/memset()'s access to their alternative implementations via a jump, patch the (larger) original function directly. The memcpy() part of this is slightly subtle: while alternative instruction patching does itself use memcpy(), with the replacement block being less than 64-bytes in size the main loop of the original function doesn't get used for copying memcpy_c() over memcpy(), and hence we can safely write over its beginning. Also note that the CFI annotations are fine for both variants of each of the functions. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4B2BB8D30200007800026AF2@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Jan Beulich authored
In order to avoid unnecessary chains of branches, rather than implementing copy_user_generic() as a function consisting of just a single (possibly patched) branch, instead properly deal with patching call instructions in the alternative instructions framework, and move the patching into the callers. As a follow-on, one could also introduce something like __EXPORT_SYMBOL_ALT() to avoid patching call sites in modules. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4B2BB8180200007800026AE7@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Jan Beulich authored
The early ioremap fixmap entries cover half (or for 32-bit non-PAE, a quarter) of a page table, yet they got uncondtitionally aligned so far to a 256-entry boundary. This is not necessary if the range of page table entries anyway falls into a single page table. This buys back, for (theoretically) 50% of all configurations (25% of all non-PAE ones), at least some of the lowmem necessarily lost with commit e621bd18. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4B2BB66F0200007800026AD6@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 28 Dec, 2009 1 commit
-
-
Akinobu Mita authored
Optimize hweight32 by using the same technique in hweight64. The proof of this technique can be found in the commit log for f9b41929 ("bitops: hweight() speedup"). The userspace benchmark on x86_32 showed 20% speedup with bitmap_weight() which uses hweight32 to count bits for each unsigned long on 32bit architectures. int main(void) { #define SZ (1024 * 1024 * 512) static DECLARE_BITMAP(bitmap, SZ) = { [0 ... 100] = 1, }; return bitmap_weight(bitmap, SZ); } Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1258603932-4590-1-git-send-email-akinobu.mita@gmail.com> [ only x86 sets ARCH_HAS_FAST_MULTIPLIER so we do this via the x86 tree] Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 24 Dec, 2009 28 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc-2.6Linus Torvalds authored
* 'sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc-2.6: SYSCTL: Add a mutex to the page_alloc zone order sysctl SYSCTL: Print binary sysctl warnings (nearly) only once
-
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6Linus Torvalds authored
* 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: HWPOISON: Add PROC_FS dependency to hwpoison injector v2
-
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6Linus Torvalds authored
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (34 commits) classmate-laptop: add support for Classmate PC ACPI devices hp-wmi: Fix two memleaks acer-wmi, msi-wmi: Remove needless DMI MODULE_ALIAS dell-wmi: do not keep driver loaded on unsupported boxes wmi: Free the allocated acpi objects through wmi_get_event_data drivers/platform/x86/acerhdf.c: check BIOS information whether it begins with string of table acerhdf: add new BIOS versions acerhdf: limit modalias matching to supported toshiba_acpi: convert to seq_file asus_acpi: convert to seq_file ACPI: do not select ACPI_DOCK from ATA_ACPI sony-laptop: enumerate rfkill devices using SN06 sony-laptop: rfkill support for newer models ACPI: fix OSC regression that caused aer and pciehp not to load MAINTAINERS: add maintainer for msi-wmi driver fujitu-laptop: fix tests of acpi_evaluate_integer() return value arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c: avoid cross-CPU interrupts by using smp_call_function_any() ACPI: processor: remove _PDC object list from struct acpi_processor ACPI: processor: change acpi_processor_set_pdc() interface ACPI: processor: open code acpi_processor_cleanup_pdc ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2Linus Torvalds authored
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2/trivial: Use le16_to_cpu for a disk value in xattr.c ocfs2/trivial: Use proper mask for 2 places in hearbeat.c Ocfs2: Let ocfs2 support fiemap for symlink and fast symlink. Ocfs2: Should ocfs2 support fiemap for S_IFDIR inode? ocfs2: Use FIEMAP_EXTENT_SHARED fiemap: Add new extent flag FIEMAP_EXTENT_SHARED ocfs2: replace u8 by __u8 in ocfs2_fs.h ocfs2: explicit declare uninitialized var in user_cluster_connect() ocfs2-devel: remove redundant OCFS2_MOUNT_POSIX_ACL check in ocfs2_get_acl_nolock() ocfs2: return -EAGAIN instead of EAGAIN in dlm ocfs2/cluster: Make fence method configurable - v2 ocfs2: Set MS_POSIXACL on remount ocfs2: Make acl use the default ocfs2: Always include ACL support
-
Linus Torvalds authored
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: VIDEO: cyberpro: pci_request_regions needs a persistent name ARM: dma-isa: request cascade channel after registering it ARM: footbridge: trim down old ISA rtc setup ARM: fix PAGE_KERNEL ARM: Fix wrong shared bit for CPU write buffer bug test ARM: 5857/1: ARM: dmabounce: fix build ARM: 5856/1: Fix bug of uart0 platfrom data for nuc900 ARM: 5855/1: putc support for nuc900 ARM: 5854/1: fix compiling error for NUC900 ARM: 5849/1: ARMv7: fix Oprofile events count ARM: add missing include to nwflash.c ARM: Kill CONFIG_CPU_32 ARM: Convert VFP/Crunch/XscaleCP thread_release() to exit_thread() ARM: 5853/1: ARM: Fix build break on ARM v6 and v7
-
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: edac, pci: remove pesky debug printk amd64_edac: restrict PCI config space access amd64_edac: fix forcing module load/unload amd64_edac: make driver loading more robust amd64_edac: fix driver instance freeing amd64_edac: fix K8 chip select reporting
-
git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6Linus Torvalds authored
* 'sh/for-2.6.33' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: sh: Ensure all PG_dcache_dirty pages are written back. sh: mach-ecovec24: setup.c detailed correction serial: sh-sci: Convert tremaining ctrl_xxx I/O routines to __raw_xxx. serial: sh-sci: earlyprintk zero uartclk fix sh: Only use bl bit toggling for sleeping idle. sh: Restore bl bit toggling in idle loop. sh: Fix up MAX_DMA_CHANNELS definition when DMA is disabled. sh: dmaengine support for SH7785 sh: dmaengine support for sh7724.
-
Russell King authored
Don't pass a name pointer from the kernel stack, it will not survive and will result in corrupted /proc/iomem output. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Russell King authored
We can't request the cascade channel before it's been registered, so move it afterwards. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Russell King authored
This fixes a "start_kernel(): bug: interrupts were enabled early". rtc_cmos now takes care of initializing the ISA RTC and reading the current time and date from it; there's no need to repeat that here, thereby causing interrupts to be enabled too early. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Russell King authored
PAGE_KERNEL should not be executable; any area marked executable can be prefetched into the instruction cache. We don't want vmalloc areas to be read in this way. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Borislav Petkov authored
Do not spam the logs needlessly with the sole info that edac_pci_dev_parity_clear is being called. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Borislav Petkov authored
Do not access F2x19[0,4] on K8 since they're undefined there. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Borislav Petkov authored
Clear the override flag after force-loading the module. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Borislav Petkov authored
Currently, the module does not initialize fully when the DIMMs aren't ECC but remains still loaded. Propagate the error when no instance of the driver is properly initialized and prevent further loading. Reorganize and polish error handling in amd64_edac_init() while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Borislav Petkov authored
Fix use-after-free errors by pushing all memory-freeing calls to the end of amd64_remove_one_instance(). Reported-by: Darren Jenkins <darrenrjenkins@gmail.com> LKML-Reference: <1261370306.11354.52.camel@ICE-BOX> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Borislav Petkov authored
Fix the case when amd64_debug_display_dimm_sizes() reports only half the amount of DRAM on it because it doesn't account for when the single DCT operates in 128-bit mode and merges chip selects from different DIMMs. Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de> LKML-Reference: <200912112202.48173.johannes.hirte@fem.tu-ilmenau.de> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Len Brown authored
-
Len Brown authored
-
Len Brown authored
-
Len Brown authored
-
Len Brown authored
-
Len Brown authored
-
Len Brown authored
-
Thadeu Lima de Souza Cascardo authored
This add supports for devices like keyboard, backlight, tablet and accelerometer. This work is supported by International Syst S/A. [randy.dunlap@oracle.com: cmpc_acpi: depends on ACPI] [akpm@linux-foundation.org: readability tweaks] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
-
Paul Mundt authored
-
Markus Pietrek authored
With some of the cache rework an address aliasing optimization was added, but this managed to fail on certain mappings resulting in pages with PG_dcache_dirty set never writing back their dcache lines. This patch reverts to the earlier behaviour of simply always writing back when the dirty bit is set. Signed-off-by: Markus Pietrek <Markus.Pietrek@emtrion.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
-