Commits · dce6670aaa7efece0558010b48d5ef9d421154be · nexedi / linux

20 Aug, 2009 32 commits

powerpc: Add PACA fields specific to 64-bit Book3E processors · dce6670a

Benjamin Herrenschmidt authored Jul 23, 2009

This adds various fields in the PACA that are for use specifically
by Book3E processors, such as exception save areas, current pgd
pointer, special exceptions kernel stacks etc...
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

dce6670a

powerpc: Add definitions used by exception handling on 64-bit Book3E · 13363ab9

Benjamin Herrenschmidt authored Jul 23, 2009

This adds various definitions and macros used by the exception and TLB
miss handling on 64-bit BookE

It also adds the definitions of the SPRGs used for various exception types
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

13363ab9

powerpc: Add memory management headers for new 64-bit BookE · 57e2a99f

Benjamin Herrenschmidt authored Jul 28, 2009

This adds the PTE and pgtable format definitions, along with changes
to the kernel memory map and other definitions related to implementing
support for 64-bit Book3E. This also shields some asm-offset bits that
are currently only relevant on 32-bit

We also move the definition of the "linux" page size constants to
the common mmu.h file and add a few sizes that are relevant to
embedded processors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

57e2a99f

powerpc: Add SPR definitions for new 64-bit BookE · 0257c99c

Benjamin Herrenschmidt authored Jul 23, 2009

This adds various SPRs defined on 64-bit BookE, along with changes
to the definition of the base MSR values to add the values needed
for 64-bit Book3E.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

0257c99c

powerpc/mm: Rework & cleanup page table freeing code path · c7cc58a1

Benjamin Herrenschmidt authored Jul 23, 2009

That patch used to just add a hook to page table flushing but
pulling that string brought out a whole bunch of issues, so it
now does that and more:

 - We now make the RCU batching of page freeing SMP only, as I
believe it was intended initially. We make a few more things compile
to nothing on !CONFIG_SMP

 - Some macros are turned into functions, though that forced me to
out of line a few stuffs due to unsolvable include depenencies,
however it's probably better that way anyway, it's not -that-
critical code path.

 - 32-bit didn't call pte_free_finish() on tlb_flush() which means
that it wouldn't push out the batch to RCU for delayed freeing when
a bunch of page tables have been freed, they would just stay in there
until the batch gets full.

64-bit BookE will use that hook to maintain the virtually linear
page tables or the indirect entries in the TLB when using the
HW loader.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

c7cc58a1

powerpc: Move definitions of secondary CPU spinloop to header file · cf54dc7c

Benjamin Herrenschmidt authored Jul 23, 2009

Those definitions are currently declared extern in the .c file where
they are used, move them to a header file instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

cf54dc7c

powerpc: Clean ifdef usage in copy_thread() · 747bea91

Benjamin Herrenschmidt authored Jul 23, 2009

Currently, a single ifdef covers SLB related bits and more generic ppc64
related bits, split this in two separate ifdef's since 64-bit BookE will
need one but not the other.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

747bea91

powerpc/mm: Call mmu_context_init() from ppc64 · 6f0ef0f5

Benjamin Herrenschmidt authored Jul 23, 2009

Our 64-bit hash context handling has no init function, but 64-bit Book3E
will use the common mmu_context_nohash.c code which does, so define an
empty inline mmu_context_init() for 64-bit server and call it from
our 64-bit setup_arch()
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>

6f0ef0f5

powerpc/mm: Make low level TLB flush ops on BookE take additional args · d4e167da

Benjamin Herrenschmidt authored Jul 23, 2009

We need to pass down whether the page is direct or indirect and we'll
need to pass the page size to _tlbil_va and _tlbivax_bcast

We also add a new low level _tlbil_pid_noind() which does a TLB flush
by PID but avoids flushing indirect entries if possible

This implements those new prototypes but defines them with inlines
or macros so that no additional arguments are actually passed on current
processors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

d4e167da

powerpc: Modify some ppc_asm.h macros to accomodate 64-bits Book3E · 44c58ccc

Benjamin Herrenschmidt authored Jul 23, 2009

The way I intend to use tophys/tovirt on 64-bit BookE is different
from the "trick" that we currently play for 32-bit BookE so change
the condition of definition of these macros to make it so.

Also, make sure we only use rfid and mtmsrd instead of rfi and mtmsr
for 64-bit server processors, not all 64-bit processors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>

44c58ccc

powerpc/mm: Add support for early ioremap on non-hash 64-bit processors · a245067e

Benjamin Herrenschmidt authored Jul 23, 2009

This adds some code to do early ioremap's using page tables instead of
bolting entries in the hash table. This will be used by the upcoming
64-bits BookE port.

The patch also changes the test for early vs. late ioremap to use
slab_is_available() instead of our old hackish mem_init_done.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

a245067e

powerpc/mm: Add more bit definitions for Book3E MMU registers · 1fe1a210

Benjamin Herrenschmidt authored Jul 23, 2009

This adds various additional bit definitions for various MMU related
SPRs used on Book3E.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

1fe1a210

powerpc/mm: Add opcode definitions for tlbivax and tlbsrx. · 29c09e8f

Benjamin Herrenschmidt authored Jul 23, 2009

This adds the opcode definitions to ppc-opcode.h for the two instructions
tlbivax and tlbsrx. as defined by Book3E 2.06
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

29c09e8f

powerpc/mm: Add HW threads support to no_hash TLB management · fcce8109

Benjamin Herrenschmidt authored Jul 23, 2009

The current "no hash" MMU context management code is written with
the assumption that one CPU == one TLB. This is not the case on
implementations that support HW multithreading, where several
linux CPUs can share the same TLB.

This adds some basic support for this to our context management
and our TLB flushing code.

It also cleans up the optional debugging output a bit
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

fcce8109

powerpc/of: Remove useless register save/restore when calling OF back · 6c171994

Benjamin Herrenschmidt authored Jul 23, 2009

enter_prom() used to save and restore registers such as CTR, XER etc..
which are volatile, or SRR0,1... which we don't care about. This
removes a bunch of useless code and while at it turns an mtmsrd into
an MTMSRD macro which will be useful to Book3E.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

6c171994

powerpc/mm: Fix misplaced #endif in pgtable-ppc64-64k.h · 7d60b02c

Benjamin Herrenschmidt authored Jul 23, 2009

A misplaced #endif causes more definitions than intended to be
protected by #ifndef __ASSEMBLY__. This breaks upcoming 64-bit
BookE support patch when using 64k pages.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

7d60b02c

powerpc: Add compat_sys_truncate · dd90bbd5

Benjamin Herrenschmidt authored Jul 28, 2009

The truncate syscall has a signed long parameter, so when using a 32-
bit userspace with a 64-bit kernel the argument is zero-extended
instead of sign-extended. Adding the compat_sys_truncate function
fixes the issue.

This was noticed during an LSB truncate test failure. The test was
checking for the correct error number set when truncate is called with
a length of -1. The test can be found at:

http://bzr.linuxfoundation.org/lsb/devel/runtime-test?cmd=inventory;rev=stewb%40linux-foundation.org-20090626205411-sfb23cc0tjj7jzgm;path=modules/vsx-pcts/tset/POSIX.os/files/truncate/

BenH: Added compat_sys_ftruncate() as well, same issue.
Signed-off-by: Chase Douglas <cndougla@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

dd90bbd5

powerpc: Update boot wrapper script with the new location of dtc · c79b2973

Lucian Adrian Grijincu authored Jul 23, 2009

dtc was moved in 9fffb55f from
arch/powerpc/boot/ to scripts/dtc/

This patch updates the wrapper script to point to the new location of dtc.
Signed-off-by: Lucian Adrian Grijincu <lgrijincu@ixiacom.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

c79b2973

powerpc: Makefile simplification through use of cc-ifversion · f7d4f68d

Frans Pop authored Jul 23, 2009

Signed-off-by: Frans Pop <elendil@planet.nl>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

f7d4f68d

powerpc: Change PACA from SPRG3 to SPRG1 · 063517be

Benjamin Herrenschmidt authored Jul 14, 2009

This change the SPRG used to store the PACA on ppc64 from
SPRG3 to SPRG1. SPRG3 is user readable on most processors
and we want to use it for other things. We change the scratch
SPRG used by exception vectors from SRPG1 to SPRG2.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

063517be

powerpc/pmac: Fix PowerSurge SMP IPI allocation · 527b3639

Benjamin Herrenschmidt authored Jul 14, 2009

The code for setting up the IPIs for SMP PowerSurge marchines bitrot,
it needs to properly map the HW interrupt number
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

527b3639

powerpc/mm: Fix definitions of FORCE_MAX_ZONEORDER in Kconfig · 066c4b87

Benjamin Herrenschmidt authored Jul 21, 2009

The current definitions set ranges and defaults for 32 and 64-bit
only using "PPC_STD_MMU" which means hash based MMU. This uselessly
restrict the usefulness for the upcoming 64-bit BookE port, but more
than that, it's broken on 32-bit since the only 32-bit platform
supporting multiple page sizes currently is 44x which does -not-
have PPC_STD_MMU_32 set.

This fixes it by using PPC64 and PPC32 instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

066c4b87

powerpc/cell: Replace strncpy by strlcpy · 2e2ddb24

roel kluin authored Jul 21, 2009

Replace strncpy() and explicit null-termination by strlcpy()
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2e2ddb24

powerpc: Remove use of a second scratch SPRG in STAB code · c5a8c0c9

Benjamin Herrenschmidt authored Jul 16, 2009

The STAB code used on Power3 and RS/64 uses a second scratch SPRG to
save a GPR in order to decide whether to go to do_stab_bolted_* or
to handle a normal data access exception.

This prevents our scheme of freeing SPRG3 which is user visible for
user uses since we cannot use SPRG0 which, on RS/64, seems to be
read-only for supervisor mode (like POWER4).

This reworks the STAB exception entry to use the PACA as temporary
storage instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

c5a8c0c9

powerpc: Use names rather than numbers for SPRGs (v2) · ee43eb78

Benjamin Herrenschmidt authored Jul 14, 2009

The kernel uses SPRG registers for various purposes, typically in
low level assembly code as scratch registers or to hold per-cpu
global infos such as the PACA or the current thread_info pointer.

We want to be able to easily shuffle the usage of those registers
as some implementations have specific constraints realted to some
of them, for example, some have userspace readable aliases, etc..
and the current choice isn't always the best.

This patch should not change any code generation, and replaces the
usage of SPRN_SPRGn everywhere in the kernel with a named replacement
and adds documentation next to the definition of the names as to
what those are used for on each processor family.

The only parts that still use the original numbers are bits of KVM
or suspend/resume code that just blindly needs to save/restore all
the SPRGs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

ee43eb78

powerpc: Rename exception.h to exception-64s.h · 8aa34ab8

Benjamin Herrenschmidt authored Jul 14, 2009

The file include/asm/exception.h contains definitions
that are specific to exception handling on 64-bit server
type processors.

This renames the file to exception-64s.h to reflect that
fact and avoid confusion.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8aa34ab8

powerpc: Preload application text segment instead of TASK_UNMAPPED_BASE · de4376c2

Anton Blanchard authored Jul 13, 2009

TASK_UNMAPPED_BASE is not used with the new top down mmap layout. We can
reuse this preload slot by loading in the segment at 0x10000000, where almost
all PowerPC binaries are linked at.

On a microbenchmark that bounces a token between two 64bit processes over pipes
and calls gettimeofday each iteration (to access the VDSO), both the 32bit and
64bit context switch rate improves (tested on a 4GHz POWER6):

32bit: 273k/sec -> 283k/sec
64bit: 277k/sec -> 284k/sec
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

de4376c2

powerpc: Rearrange SLB preload code · 5eb9bac0

Anton Blanchard authored Jul 13, 2009

With the new top down layout it is likely that the pc and stack will be in the
same segment, because the pc is most likely in a library allocated via a top
down mmap. Right now we bail out early if these segments match.

Rearrange the SLB preload code to sanity check all SLB preload addresses
are not in the kernel, then check all addresses for conflicts.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

5eb9bac0

powerpc: Move 64bit VDSO to improve context switch performance · 30d0b368

Anton Blanchard authored Jul 13, 2009

On 64bit applications the VDSO is the only thing in segment 0. Since the VDSO
is position independent we can remove the hint and let get_unmapped_area pick
an area. This will mean the vdso will be near other mmaps and will share
an SLB entry:

10000000-10001000 r-xp 00000000 08:06 5778459 /root/context_switch_64
10010000-10011000 r--p 00000000 08:06 5778459 /root/context_switch_64
10011000-10012000 rw-p 00001000 08:06 5778459 /root/context_switch_64
fffa92ae000-fffa92b0000 rw-p 00000000 00:00 0
fffa92b0000-fffa9453000 r-xp 00000000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa9453000-fffa9462000 ---p 001a3000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa9462000-fffa9466000 r--p 001a2000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa9466000-fffa947c000 rw-p 001a6000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa947c000-fffa9480000 rw-p 00000000 00:00 0
fffa9480000-fffa94a8000 r-xp 00000000 08:06 4333852 /lib64/ld-2.9.so
fffa94b3000-fffa94b4000 rw-p 00000000 00:00 0

fffa94b4000-fffa94b7000 r-xp 00000000 00:00 0 [vdso] <----- here I am

fffa94b7000-fffa94b8000 r--p 00027000 08:06 4333852 /lib64/ld-2.9.so
fffa94b8000-fffa94bb000 rw-p 00028000 08:06 4333852 /lib64/ld-2.9.so
fffa94bb000-fffa94bc000 rw-p 00000000 00:00 0
fffe4c10000-fffe4c25000 rw-p 00000000 00:00 0 [stack]

On a microbenchmark that bounces a token between two 64bit processes over pipes
and calls gettimeofday each iteration (to access the VDSO), our context switch
rate goes from 268k to 277k ctx switches/sec (tested on a 4GHz POWER6).
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

30d0b368

powerpc: expose the multi-bit ops that underlie single-bit ops. · 0d2d3e38

Geoff Thorpe authored Jul 07, 2009

The bitops.h functions that operate on a single bit in a bitfield are
implemented by operating on the corresponding word location. In all
cases the inner logic is valid if the mask being applied has more than
one bit set, so this patch exposes those inner operations. Indeed,
set_bits() was already available, but it duplicated code from
set_bit() (rather than making the latter a wrapper) - it was also
missing the PPC405_ERR77() workaround and the "volatile" address
qualifier present in other APIs. This corrects that, and exposes the
other multi-bit equivalents.

One advantage of these multi-bit forms is that they allow word-sized
variables to essentially be their own spinlocks, eg. very useful for
state machines where an atomic "flags" variable can obviate the need
for any additional locking.
Signed-off-by: Geoff Thorpe <geoff@geoffthorpe.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

0d2d3e38

powerpc/mpic: Fix MPIC_BROKEN_REGREAD on non broken MPICs · 11a6b292

Michael Ellerman authored Jul 05, 2009

The workaround enabled by CONFIG_MPIC_BROKEN_REGREAD does not work
on non-broken MPICs. The symptom is no interrupts being received.

The fix is twofold. Firstly the code was broken for multiple isus,
we need to index into the shadow array with the src_no, not the idx.
Secondly, we always do the read, but only use the VECPRI_MASK and
VECPRI_ACTIVITY bits from the hardware, the rest of "val" comes
from the shadow.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

11a6b292

powerpc/amigaone: Convert amigaone_init() to a machine_device_initcall() · 66dc3304

Gerhard Pircher authored Jun 19, 2009

This allows to remove the ppc_md.init() hook in the setup code.
Signed-off-by: Gerhard Pircher <gerhard_pircher@gmx.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

66dc3304

19 Aug, 2009 2 commits

Merge branch 'for-linus' of... · c124891f

Linus Torvalds authored Aug 18, 2009

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  security: Fix prompt for LSM_MMAP_MIN_ADDR
  security: Make LSM_MMAP_MIN_ADDR default match its help text.

c124891f

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu · 77f312a9

Linus Torvalds authored Aug 18, 2009

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: use the right flag for get_vm_area()
  percpu, sparc64: fix sparse possible cpu map handling
  init: set nr_cpu_ids before setup_per_cpu_areas()

77f312a9

18 Aug, 2009 6 commits

Merge branch 'x86-fixes-for-linus' of... · dc8ed71e

Linus Torvalds authored Aug 18, 2009

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mce: Don't initialize MCEs on unknown CPUs
  x86, mce: don't log boot MCEs on Pentium M (model == 13) CPUs
  x86: Annotate section mismatch warnings in kernel/apic/x2apic_uv_x.c
  x86, mce: therm_throt: Don't log redundant normality
  x86: Fix UV BAU destination subnode id

dc8ed71e

mm: build_zonelists(): move clear node_load[] to __build_all_zonelists() · 7f9cfb31

Bo Liu authored Aug 18, 2009

If node_load[] is cleared everytime build_zonelists() is
called,node_load[] will have no help to find the next node that should
appear in the given node's fallback list.

Because of the bug, zonelist's node_order is not calculated as expected.
This bug affects on big machine, which has asynmetric node distance.

[synmetric NUMA's node distance]
     0    1    2
0   10   12   12
1   12   10   12
2   12   12   10

[asynmetric NUMA's node distance]
     0    1    2
0   10   12   20
1   12   10   14
2   20   14   10

This (my bug) is very old but no one has reported this for a long time.
Maybe because the number of asynmetric NUMA is very small and they use
cpuset for customizing node memory allocation fallback.

[akpm@linux-foundation.org: fix CONFIG_NUMA=n build]
Signed-off-by: Bo Liu <bo-liu@hotmail.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7f9cfb31

REPORTING-BUGS: add get_maintainer.pl blurb · 503f7944

Joe Perches authored Aug 18, 2009

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

503f7944

nommu: check fd read permission in validate_mmap_request() · 28d7a6ae

Graff Yang authored Aug 18, 2009

According to the POSIX (1003.1-2008), the file descriptor shall have been
opened with read permission, regardless of the protection options specified to
mmap().  The ltp test cases mmap06/07 need this.
Signed-off-by: Graff Yang <graff.yang@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

28d7a6ae

spi_s3c24xx: fix transfer setup code · 19152975

Ben Dooks authored Aug 18, 2009

Since the changes to the bitbang driver, there is the possibility we will
be called with either the speed_hz or bpw values zero.  We take these to
mean that the default values (8 bits per word, or maximum bus speed).
Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

19152975

spi_s3c24xx: fix clock rate calculation · b8978784

Ben Dooks authored Aug 18, 2009

Currently the clock rate calculation may round as pleased, which means
that it is possible that we will round down and end up with a faster clock
rate than intended.

Change the calculation to use DIV_ROUND_UP() to ensure that we end up with
a clock rate either the same as or lower than the user requested one.
Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b8978784