- 31 Jul, 2010 2 commits
-
-
Matthew McClintock authored
Fix sizes of variables so correct values are exported via /proc. Cast variable in comparison to avoid compiler error. Signed-off-by: Matthew McClintock <msm@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Matt Evans authored
With dynamic PACAs, the kexecing CPU's PACA won't lie within the kernel static data and there is a chance that something may stomp it when preparing to kexec. This patch switches this final CPU to a static PACA just before we pull the switch. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 30 Jul, 2010 1 commit
-
-
Benjamin Herrenschmidt authored
-
- 26 Jul, 2010 3 commits
-
-
Lee Nipper authored
The recent AMCC 405EX Rev D without Security uses a PVR value that matches the old 405EXr Rev A/B with Security. The 405EX Rev D without Security would be shown incorrectly as an 405EXr. The pvr_mask of 0xffff0004 is no longer sufficient to distinguish the 405EX from 405EXr. This patch replaces 2 entries in the cpu_specs table and adds 8 more, each using pvr_mask of 0xffff000f and appropriate pvr_value to distinguish the AMCC PowerPC 405EX and 405EXr instances. The cpu_name for these entries now includes the Rev, in similar fashion to the 440GX. Signed-off-by: Lee Nipper <lee.nipper@gmail.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
-
Stefan Roese authored
UART2 and UART3 on 460EX/GT have incorrect interrupt mappings right now. UART2 should be 28 (0x1c) and UART3 29 (0x1d). This patch fixes this and switches to using decimal number instead of hex, since the AppliedMicro (AMCC) users manuals describe their inerrupt numbers in decimal. Thanks to Fabien Proriol for pointing this out. Signed-off-by: Stefan Roese <sr@denx.de> Cc: Fabien Proriol <Fabien.Proriol@jdsu.com> Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
-
Christian Dietrich authored
The config options for REDWOOD_[456] were commented out in the powerpc Kconfig. The ifdefs referencing this options therefore are dead and all references to this can be removed (Also dependencies in other KConfig files). Signed-off-by: Christian Dietrich <qy03fugy@stud.informatik.uni-erlangen.de> Signed-off-by: Christoph Egger <siccegge@cs.fau.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
-
- 14 Jul, 2010 5 commits
-
-
Benjamin Herrenschmidt authored
They will fail to build due to the lack of mtmsrd, and wouldn't be useful anyways Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Use the MMU config registers to scan for available direct and indirect page sizes and print out the result. Will be needed for future hugetlbfs implementation. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
We patch the TLB miss exception vectors to point to alternate functions when using HW page table on BookE. However, we were patching in a new branch in the first instruction of the exception handler instead of the second one, thus overriding the nop that is in the first instruction. This cause problems when single stepping as we rely on that nop for the single step to stop properly within the exception vector range rather than on the target of the branch. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
We use a similar technique to ppc32: We set a thread local flag to indicate that we are about to enter or have entered the stop state, and have fixup code in the async interrupt entry code that reacts to this flag to make us return to a different location (sets NIP to LINK in our case). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> -- v2. Fix lockdep bug Re-mask interrupts when coming back from idle
-
- 09 Jul, 2010 27 commits
-
-
Michael Ellerman authored
If we are soft disabled and receive a doorbell exception we don't process it immediately. This means we need to check on the way out of irq restore if there are any doorbell exceptions to process. The problem is at that point we don't know what our regs are, and that in turn makes xmon unhappy. To workaround the problem, instead of checking for and processing doorbells, we check for any doorbells and if there were any we send ourselves another. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
David Gibson authored
include/asm-generic/irq_regs.h declares per-cpu irq_regs variables and get_irq_regs() and set_irq_regs() helper functions to maintain them. These can be used to access the proper pt_regs structure related to the current interrupt entry (if any). In the powerpc arch code, this is used to maintain irq regs on decrementer and external interrupt exceptions. However, for the doorbell exceptions used by the msgsnd/msgrcv IPI mechanism of newer BookE CPUs, the irq_regs are not kept up to date. In particular this means that xmon will not work properly on SMP, because the secondary xmon instances started by IPI will blow up when they cannot retrieve the irq regs. This patch fixes the problem by adding calls to maintain the irq regs across doorbell exceptions. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Note that critical doorbells are an unimplemented stub just like other critical or machine check handlers, since we haven't done support for "levelled" exceptions yet. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The decrementer on BookE acts as a level interrupt and doesn't need to be re-triggered when going negative. It doesn't go negative anyways (unless programmed to auto-reload with a negative value) as it stops when reaching 0. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The doorbells use the content of the PIR register to match messages from other CPUs. This may or may not be the same as our linux CPU number, so using that as the "target" is no right. Instead, we sample the PIR register at boot on every processor and use that value subsequently when sending IPIs. We also use a per-cpu message mask rather than a global array which should limit cache line contention. Note: We could use the CPU number in the device-tree instead of the PIR register, as they are supposed to be equivalent. This might prove useful if doorbells are to be used to kick CPUs out of FW at boot time, thus before we can sample the PIR. This is however not the case now and using the PIR just works. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
... where it belongs Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Our handling of debug interrupts on Book3E 64-bit is not quite the way it should be just yet. This is a workaround to let gdb work at least for now. We ensure that when context switching, we set the appropriate DBCR0 value for the new task. We also make sure that we turn off MSR[DE] within the kernel, and set it as part of the bits that get set when going back to userspace. In the long run, we will probably set the userspace DBCR0 on the exception exit code path and ensure we have some proper kernel value to set on the way into the kernel, a bit like ppc32 does, but that will take more work. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Christoph Egger authored
CONFIG_SMP_750 doesn't exist in Kconfig, therefore removing all references for it from the source code. Signed-off-by: Christoph Egger <siccegge@cs.fau.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Julia Lawall authored
Use kstrdup when the goal of an allocation is copy a string into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to; expression flag,E1,E2; statement S; @@ - to = kmalloc(strlen(from) + 1,flag); + to = kstrdup(from, flag); ... when != \(from = E1 \| to = E1 \) if (to==NULL || ...) S ... when != \(from = E2 \| to = E2 \) - strcpy(to, from); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Julia Lawall authored
Use kstrdup when the goal of an allocation is copy a string into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to; expression flag,E1,E2; statement S; @@ - to = kmalloc(strlen(from) + 1,flag); + to = kstrdup(from, flag); ... when != \(from = E1 \| to = E1 \) if (to==NULL || ...) S ... when != \(from = E2 \| to = E2 \) - strcpy(to, from); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
Form 1 affinity allows multiple entries in ibm,associativity-reference-points which represent affinity domains in decreasing order of importance. The Linux concept of a node is always the first entry, but using the other values as an input to node_distance() allows the memory allocator to make better decisions on which node to go first when local memory has been exhausted. We keep things simple and create an array indexed by NUMA node, capped at 4 entries. Each time we lookup an associativity property we initialise the array which is overkill, but since we should only hit this path during boot it didn't seem worth adding a per node valid bit. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Paul E. McKenney authored
Remove all rcu head inits. We don't care about the RCU head state before passing it to call_rcu() anyway. Only leave the "on_stack" variants so debugobjects can keep track of objects on stack. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Martyn Welch authored
Currently the irqs for the i8042, which historically provides keyboard and mouse (aux) support, is hardwired in the driver rather than parsing the dts. This patch modifies the powerpc legacy IO code to attempt to parse the device tree for this information, failing back to the hardcoded values if it fails. Signed-off-by: Martyn Welch <martyn.welch@ge.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mark Nelson authored
At the moment if request_event_sources_irqs() can't allocate or request the interrupt, it just does a KERN_ERR printk. This may be fine for the existing RAS code where if we miss an EPOW event it just means that the event won't be logged and if we miss one of the RAS errors then we could miss an event that we perhaps should take action on. But, for the upcoming IO events code that will use event-sources if we can't allocate or request the interrupt it means we'd potentially miss an interrupt from the device. So, let's add a WARN_ON() in this error case so that we're a bit more vocal when something's amiss. While we're at it, also use pr_err() to neaten the code up a bit. Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mark Nelson authored
The RAS code has a #define, RAS_VECTOR_OFFSET, that's used in the check-exception RTAS call for the vector offset of the exception. We'll be using this same vector offset for the upcoming IO Event interrupts code (0x500) so let's move it to include/asm/rtas.h and call it RTAS_VECTOR_EXTERNAL_INTERRUPT. Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
Now we dynamically allocate the paca array, it takes an extra load whenever we want to access another cpu's paca. One place we do that a lot is per cpu variables. A simple example: DEFINE_PER_CPU(unsigned long, vara); unsigned long test4(int cpu) { return per_cpu(vara, cpu); } This takes 4 loads, 5 if you include the actual load of the per cpu variable: ld r11,-32760(r30) # load address of paca pointer ld r9,-32768(r30) # load link address of percpu variable sldi r3,r29,9 # get offset into paca (each entry is 512 bytes) ld r0,0(r11) # load paca pointer add r3,r0,r3 # paca + offset ld r11,64(r3) # load paca[cpu].data_offset ldx r3,r9,r11 # load per cpu variable If we remove the ppc64 specific per_cpu_offset(), we get the generic one which indexes into a statically allocated array. This removes one load and one add: ld r11,-32760(r30) # load address of __per_cpu_offset ld r9,-32768(r30) # load link address of percpu variable sldi r3,r29,3 # get offset into __per_cpu_offset (each entry 8 bytes) ldx r11,r11,r3 # load __per_cpu_offset[cpu] ldx r3,r9,r11 # load per cpu variable Having all the offsets in one array also helps when iterating over a per cpu variable across a number of cpus, such as in the scheduler. Before we would need to load one paca cacheline when calculating each per cpu offset. Now we have 16 (128 / sizeof(long)) per cpu offsets in each cacheline. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Denis Kirjanov authored
Fix smatch warning: constant 0x8000000000000000 is so big it is unsigned long Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Matthew McClintock authored
We need the ability to reset cores for use with kexec/kdump for SMP systems. Calling this function with the specific core you want to reset will cause the CPU to spin in reset. Signed-off-by: Matthew McClintock <msm@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Becky Bruce authored
There are no BATS on BookE - we have the TLBCAM instead. Also correct the page size information to included extended sizes. We don't actually allow a 4G page size to be used, so comment on that as well. Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Kulikov Vasiliy authored
Use for_each_pci_dev() to simplify the code. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Chris Metcalf authored
The use of "hvc_con_driver" as the name for a file-static "struct console" with a ".setup" field pointing to an __init function causes a modpost warning, since a non-initdata structure points to init code. Using "hvc_console" as the name triggers the hacky "*_console" workaround in modpost to silence the warning, and is the same thing that most of the other console drivers already do. I made the same change in hvsi.c since I happened to notice it was likely to suffer from the same problem. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Brian King authored
Enables support for HMC initiated partition hibernation. This is a firmware assisted hibernation, since the firmware handles writing the memory out to disk, along with other partition information, so we just mimic suspend to ram. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Brian King authored
Partition hibernation will use some of the same code as is currently used for Live Partition Migration. This function further abstracts this code such that code outside of rtas.c can utilize it. It also changes the error field in the suspend me data structure to be an atomic type, since it is set and checked on different cpus without any barriers or locking. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Paul Mackerras authored
Since the decrementer and timekeeping code was moved over to using the generic clockevents and timekeeping infrastructure, several variables and functions have been obsolete and effectively unused. This deletes them. In particular, wakeup_decrementer() is no longer needed since the generic code reprograms the decrementer as part of the process of resuming the timekeeping code, which happens during sysdev resume. Thus the wakeup_decrementer calls in the suspend_enter methods for 52xx platforms have been removed. The call in the powermac cpu frequency change code has been replaced by set_dec(1), which will cause a timer interrupt as soon as interrupts are enabled, and the generic code will then reprogram the decrementer with the correct value. This also simplifies the generic_suspend_en/disable_irqs functions and makes them static since they are not referenced outside time.c. The preempt_enable/disable calls are removed because the generic code has disabled all but the boot cpu at the point where these functions are called, so we can't be moved to another cpu. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Paul Mackerras authored
Currently it is possible for userspace to see the result of gettimeofday() going backwards by 1 microsecond, assuming that userspace is using the gettimeofday() in the VDSO. The VDSO gettimeofday() algorithm computes the time in "xsecs", which are units of 2^-20 seconds, or approximately 0.954 microseconds, using the algorithm now = (timebase - tb_orig_stamp) * tb_to_xs + stamp_xsec and then converts the time in xsecs to seconds and microseconds. The kernel updates the tb_orig_stamp and stamp_xsec values every tick in update_vsyscall(). If the length of the tick is not an integer number of xsecs, then some precision is lost in converting the current time to xsecs. For example, with CONFIG_HZ=1000, the tick is 1ms long, which is 1048.576 xsecs. That means that stamp_xsec will advance by either 1048 or 1049 on each tick. With the right conditions, it is possible for userspace to get (timebase - tb_orig_stamp) * tb_to_xs being 1049 if the kernel is slightly late in updating the vdso_datapage, and then for stamp_xsec to advance by 1048 when the kernel does update it, and for userspace to then see (timebase - tb_orig_stamp) * tb_to_xs being zero due to integer truncation. The result is that time appears to go backwards by 1 microsecond. To fix this we change the VDSO gettimeofday to use a new field in the VDSO datapage which stores the nanoseconds part of the time as a fractional number of seconds in a 0.32 binary fraction format. (Or put another way, as a 32-bit number in units of 0.23283 ns.) This is convenient because we can use the mulhwu instruction to convert it to either microseconds or nanoseconds. Since it turns out that computing the time of day using this new field is simpler than either using stamp_xsec (as gettimeofday does) or stamp_xtime.tv_nsec (as clock_gettime does), this converts both gettimeofday and clock_gettime to use the new field. The existing __do_get_tspec function is converted to use the new field and take a parameter in r7 that indicates the desired resolution, 1,000,000 for microseconds or 1,000,000,000 for nanoseconds. The __do_get_xsec function is then unused and is deleted. The new algorithm is now = ((timebase - tb_orig_stamp) << 12) * tb_to_xs + (stamp_xtime_seconds << 32) + stamp_sec_fraction with 'now' in units of 2^-32 seconds. That is then converted to seconds and either microseconds or nanoseconds with seconds = now >> 32 partseconds = ((now & 0xffffffff) * resolution) >> 32 The 32-bit VDSO code also makes a further simplification: it ignores the bottom 32 bits of the tb_to_xs value, which is a 0.64 format binary fraction. Doing so gets rid of 4 multiply instructions. Assuming a timebase frequency of 1GHz or less and an update interval of no more than 10ms, the upper 32 bits of tb_to_xs will be at least 4503599, so the error from ignoring the low 32 bits will be at most 2.2ns, which is more than an order of magnitude less than the time taken to do gettimeofday or clock_gettime on our fastest processors, so there is no possibility of seeing inconsistent values due to this. This also moves update_gtod() down next to its only caller, and makes update_vsyscall use the time passed in via the wall_time argument rather than accessing xtime directly. At present, wall_time always points to xtime, but that could change in future. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
-
- 08 Jul, 2010 2 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infinibandLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IPoIB: Fix world-writable child interface control sysfs attributes IB/qib: Clean up properly if qib_init() fails IB/qib: Completion queue callback needs to be single threaded IB/qib: Update 7322 serdes tables IB/qib: Clear 6120 hardware error register IB/qib: Clear eager buffer memory for each new process IB/qib: Mask hardware error during link reset IB/qib: Don't mark VL15 bufs as WC to avoid a rare 7322 chip problem RDMA/cxgb4: Derive smac_idx from port viid RDMA/cxgb4: Avoid false GTS CIDX_INC overflows RDMA/cxgb4: Don't call abort_connection() for active connect failures RDMA/cxgb4: Use the DMA state API instead of the pci equivalents
-
Roland Dreier authored
-