Commit 8c2000be authored by David Mosberger

ia64: Merge with 2.5.59.

parents 6a3354a9 887b478a
-*-Mode: outline-*-
Light-weight System Calls for IA-64
-----------------------------------
Started: 13-Jan-2002
Last update: 15-Jan-2002
David Mosberger-Tang
<davidm@hpl.hp.com>
Using the "epc" instruction effectively introduces a new mode of
execution to the ia64 linux kernel. We call this mode the
"fsys-mode". To recap, the normal states of execution are:
- kernel mode:
Both the register stack and the memory stack have been
switched over to kernel memory. The user-level state is saved
in a pt-regs structure at the top of the kernel memory stack.
- user mode:
Both the register stack and the memory stack are in
user memory. The user-level state is contained in the
CPU registers.
- bank 0 interruption-handling mode:
This is the non-interruptible state in which all
interruption-handlers start execution. The user-level
state remains in the CPU registers and some kernel state may
be stored in bank 0 of registers r16-r31.
In contrast, fsys-mode has the following special properties:
- execution is at privilege level 0 (most-privileged)
- CPU registers may contain a mixture of user-level and kernel-level
state (it is the responsibility of the kernel to ensure that no
security-sensitive kernel-level state is leaked back to
user-level)
- execution is interruptible and preemptible (an fsys-mode handler
can disable interrupts and avoid all other interruption-sources
to avoid preemption)
- neither the memory nor the register stack can be trusted while
in fsys-mode (they point to the user-level stacks, which may
be invalid)
In summary, fsys-mode is much more similar to running in user-mode
than it is to running in kernel-mode. Of course, given that the
privilege level is 0, fsys-mode requires some care (see below).
* How to tell fsys-mode
Linux operates in fsys-mode when (a) the privilege level is 0 (most
privileged) and (b) the stacks have NOT been switched to kernel memory
yet. For convenience, the header file <asm-ia64/ptrace.h> provides
three macros:
user_mode(regs)
user_stack(task,regs)
fsys_mode(task,regs)
The "regs" argument is a pointer to a pt_regs structure. The "task"
argument is a pointer to the task structure to which the "regs"
pointer belongs to. user_mode() returns TRUE if the CPU state pointed
to by "regs" was executing in user mode (privilege level 3).
user_stack() returns TRUE if the state pointed to by "regs" was
executing on the user-level stack(s). Finally, fsys_mode() returns
TRUE if the CPU state pointed to by "regs" was executing in fsys-mode.
The fsys_mode() macro is equivalent to the expression:
!user_mode(regs) && user_stack(task,regs)
* How to write an fsyscall handler
The file arch/ia64/kernel/fsys.S contains a table of fsyscall-handlers
(fsyscall_table). This table contains one entry for each system call.
By default, a system call is handled by fsys_fallback_syscall(). This
routine takes care of entering (full) kernel mode and calling the
normal Linux system call handler. For performance-critical system
calls, it is possible to write a hand-tuned fsyscall_handler. For
example, fsys.S contains fsys_getpid(), which is a hand-tuned version
of the getpid() system call.
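Wiring a hand-tuned handler into the table simply means replacing the
corresponding fsys_fallback_syscall slot with the handler's address.
For illustration only (fsys_getppid is a hypothetical handler name,
not part of this commit), the relevant excerpt of fsyscall_table
would change from

	data8 fsys_getpid			// getpid (hand-tuned, see fsys.S)
	data8 fsys_fallback_syscall		// getppid

to

	data8 fsys_getpid			// getpid
	data8 fsys_getppid			// getppid (hypothetical hand-tuned handler)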
The entry and exit states of an fsyscall handler are as follows:
** Machine state on entry to fsyscall handler:
- r11 = saved ar.pfs (a user-level value)
- r15 = system call number
- r16 = "current" task pointer (in normal kernel-mode, this is in r13)
- r32-r39 = system call arguments
- b6 = return address (a user-level value)
- ar.pfs = previous frame-state (a user-level value)
- PSR.be = cleared to zero (i.e., little-endian byte order is in effect)
- all other registers may contain values passed in from user-mode
** Required machine state on exit from an fsyscall handler:
- r11 = saved ar.pfs (as passed into the fsyscall handler)
- r15 = system call number (as passed into the fsyscall handler)
- r32-r39 = system call arguments (as passed into the fsyscall handler)
- b6 = return address (as passed into the fsyscall handler)
- ar.pfs = previous frame-state (as passed into the fsyscall handler)
Fsyscall handlers can execute with very little overhead, but with that
speed comes a set of restrictions:
o Fsyscall-handlers MUST check for any pending work in the flags
member of the thread-info structure and, if any of the
TIF_ALLWORK_MASK flags are set, the handler needs to fall back on
doing a full system call (by calling fsys_fallback_syscall); see
the sketch at the end of this list.
o Fsyscall-handlers MUST preserve incoming arguments (r32-r39, r11,
r15, b6, and ar.pfs) because they will be needed in case of a
system call restart. Of course, all "preserved" registers also
must be preserved, in accordance with the normal calling conventions.
o Fsyscall-handlers MUST check argument registers for NaT values
before using them in any way that could trigger a
NaT-consumption fault. If a system call argument is found to
contain a NaT value, an fsyscall-handler may return immediately
with r8=EINVAL, r10=-1.
o Fsyscall-handlers MUST NOT use the "alloc" instruction or perform
any other operation that would trigger mandatory RSE
(register-stack engine) traffic.
o Fsyscall-handlers MUST NOT write to any stacked registers because
it is not safe to assume that user-level called a handler with the
proper number of arguments.
o Fsyscall-handlers need to be careful when accessing per-CPU variables:
unless proper safeguards are taken (e.g., interruptions are avoided),
execution may be pre-empted and resumed on another CPU at any given
time.
o Fsyscall-handlers must be careful not to leak sensitive kernel
information back to user-level. In particular, before returning to
user-level, care needs to be taken to clear any scratch registers
that could contain sensitive information (note that the current
task pointer is not considered sensitive: it's already exposed
through ar.k6).
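To make the work-pending and NaT-argument rules concrete, here is a
minimal sketch of a handler, modeled on fsys_getpid (in fsys.S) and
on the tnat.nz usage in ivt.S. The name fsys_example and the constant
result value are purely illustrative; this is not code from this
commit:

ENTRY(fsys_example)
	add r9=TI_FLAGS+IA64_TASK_SIZE,r16	// r9 = &current_thread_info()->flags
	tnat.nz p8,p0=r32			// is the first syscall argument a NaT?
	;;
	ld4 r9=[r9]
(p8)	br.cond.spnt.many 1f			// NaT argument: fail with EINVAL
	;;
	and r9=TIF_ALLWORK_MASK,r9
	mov r8=0				// (example) successful result value
	mov r10=0				// indicate success
	;;
	cmp.ne p6,p0=0,r9			// any pending-work flags set?
(p6)	br.cond.spnt.many fsys_fallback_syscall	// yes: take the slow path
	MCKINLEY_E9_WORKAROUND
	br.ret.sptk.many b6
1:	mov r8=EINVAL				// per the rule above: r8=EINVAL,
	mov r10=-1				// r10=-1 for a NaT argument
	MCKINLEY_E9_WORKAROUND
	br.ret.sptk.many b6
END(fsys_example)

Note that the incoming registers r11, r15, r32-r39, b6, and ar.pfs
are never written, as required by the restart rule.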
The above restrictions may seem draconian, but remember that it's
possible to trade off some of the restrictions by paying a slightly
higher overhead. For example, if an fsyscall-handler could benefit
from the shadow register bank, it could temporarily disable PSR.i and
PSR.ic, switch to bank 0 (bsw.0) and then use the shadow registers as
needed. In other words, following the above rules yields extremely
fast system call execution (while fully preserving system call
semantics), but there is also a lot of flexibility in handling more
complicated cases.
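As a sketch of that bank-switching idea (patterned after the
srlz.i/bsw.0 sequence in ia64_leave_kernel; the exact serialization
requirements should be double-checked against the architecture manual
before relying on this):

	rsm psr.i | psr.ic	// block interrupts & interruption collection
	;;
	srlz.i			// ensure interruption collection is really off
	;;
	bsw.0			// switch to register bank 0 (last insn in group)
	;;
	mov r16=r8		// (example) r16-r31 now name the shadow bank
	;;
	bsw.1			// switch back to bank 1 (last insn in group)
	;;
	ssm psr.ic | psr.i	// re-enable collection & interrupts
	;;
	srlz.d			// make the psr.ic change take effect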
* Signal handling
The delivery of (asynchronous) signals must be delayed until fsys-mode
is exited. This is accomplished with the help of the lower-privilege
transfer trap: arch/ia64/kernel/process.c:do_notify_resume_user()
checks whether the interrupted task was in fsys-mode and, if so, sets
PSR.lp and returns immediately. When fsys-mode is exited via the
"br.ret" instruction that lowers the privilege level, a trap will
occur. The trap handler clears PSR.lp again and returns immediately.
The kernel exit path then checks for and delivers any pending signals.
* PSR Handling
The "epc" instruction doesn't change the contents of PSR at all. This
is in contrast to a regular interruption, which clears almost all
bits. Because of that, some care needs to be taken to ensure things
work as expected. The following discussion describes how each PSR bit
is handled.
PSR.be Cleared when entering fsys-mode. A srlz.d instruction is used
to ensure the CPU is in little-endian mode before the first
load/store instruction is executed. PSR.be is normally NOT
restored upon return from an fsys-mode handler. In other
words, user-level code must not rely on PSR.be being preserved
across a system call.
PSR.up Unchanged.
PSR.ac Unchanged.
PSR.mfl Unchanged. Note: fsys-mode handlers must not write fp-registers!
PSR.mfh Unchanged. Note: fsys-mode handlers must not write fp-registers!
PSR.ic Unchanged. Note: fsys-mode handlers can clear the bit, if needed.
PSR.i Unchanged. Note: fsys-mode handlers can clear the bit, if needed.
PSR.pk Unchanged.
PSR.dt Unchanged.
PSR.dfl Unchanged. Note: fsys-mode handlers must not touch fp-registers!
PSR.dfh Unchanged. Note: fsys-mode handlers must not touch fp-registers!
PSR.sp Unchanged.
PSR.pp Unchanged.
PSR.di Unchanged.
PSR.si Unchanged.
PSR.db Unchanged. The kernel prevents user-level from setting a hardware
breakpoint that triggers at any privilege level other than 3 (user-mode).
PSR.lp Unchanged.
PSR.tb Lazy redirect. If a taken-branch trap occurs while in
fsys-mode, the trap-handler modifies the saved machine state
such that execution resumes in the gate page at
syscall_via_break(), with privilege level 3. Note: the
taken branch would occur on the branch invoking the
fsyscall-handler, at which point, by definition, a syscall
restart is still safe. If the system call number is invalid,
the fsys-mode handler will return directly to user-level. This
return will trigger a taken-branch trap, but since the trap is
taken _after_ restoring the privilege level, the CPU has already
left fsys-mode, so no special treatment is needed.
PSR.rt Unchanged.
PSR.cpl Cleared to 0.
PSR.is Unchanged (guaranteed to be 0 on entry to the gate page).
PSR.mc Unchanged.
PSR.it Unchanged (guaranteed to be 1).
PSR.id Unchanged. Note: the ia64 linux kernel never sets this bit.
PSR.da Unchanged. Note: the ia64 linux kernel never sets this bit.
PSR.dd Unchanged. Note: the ia64 linux kernel never sets this bit.
PSR.ss Lazy redirect. If set, "epc" will cause a Single Step Trap to
be taken. The trap handler then modifies the saved machine
state such that execution resumes in the gate page at
syscall_via_break(), with privilege level 3.
PSR.ri Unchanged.
PSR.ed Unchanged. Note: This bit could only have an effect if an fsys-mode
handler performed a speculative load that gets NaTted. If so, this
would be the normal & expected behavior, so no special treatment is
needed.
PSR.bn Unchanged. Note: fsys-mode handlers may clear the bit, if needed.
Doing so requires clearing PSR.i and PSR.ic as well.
PSR.ia Unchanged. Note: the ia64 linux kernel never sets this bit.
......@@ -768,6 +768,9 @@ source "arch/ia64/hp/sim/Kconfig"
menu "Kernel hacking"
config FSYS
bool "Light-weight system-call support (via epc)"
choice
prompt "Physical memory granularity"
default IA64_GRANULE_64MB
......
......@@ -58,9 +58,13 @@ all compressed: vmlinux.gz
vmlinux.gz: vmlinux
$(call makeboot,vmlinux.gz)
check: vmlinux
arch/ia64/scripts/unwcheck.sh vmlinux
archmrproper:
archclean:
$(Q)$(MAKE) -f scripts/Makefile.clean obj=arch/ia64/boot
$(Q)$(MAKE) -f scripts/Makefile.clean obj=arch/ia64/tools
CLEAN_FILES += include/asm-ia64/offsets.h vmlinux.gz bootloader
......
......@@ -95,12 +95,19 @@ END(sys32_sigsuspend)
GLOBAL_ENTRY(ia32_ret_from_clone)
PT_REGS_UNWIND_INFO(0)
#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
{ /*
* Some versions of gas generate bad unwind info if the first instruction of a
* procedure doesn't go into the first slot of a bundle. This is a workaround.
*/
nop.m 0
nop.i 0
/*
* We need to call schedule_tail() to complete the scheduling process.
* Called by ia64_switch_to after do_fork()->copy_thread(). r8 contains the
* address of the previously executing task.
*/
br.call.sptk.many rp=ia64_invoke_schedule_tail
}
.ret1:
#endif
adds r2=TI_FLAGS+IA64_TASK_SIZE,r13
......
......@@ -95,8 +95,6 @@ ia32_load_state (struct task_struct *t)
struct pt_regs *regs = ia64_task_regs(t);
int nr = smp_processor_id(); /* LDT and TSS depend on CPU number: */
nr = smp_processor_id();
eflag = t->thread.eflag;
fsr = t->thread.fsr;
fcr = t->thread.fcr;
......
......@@ -2011,6 +2011,10 @@ semctl32 (int first, int second, int third, void *uptr)
else
fourth.__pad = (void *)A(pad);
switch (third) {
default:
err = -EINVAL;
break;
case IPC_INFO:
case IPC_RMID:
case IPC_SET:
......
......@@ -12,6 +12,7 @@ obj-y := acpi.o entry.o gate.o efi.o efi_stub.o ia64_ksyms.o \
semaphore.o setup.o \
signal.o sys_ia64.o traps.o time.o unaligned.o unwind.o
obj-$(CONFIG_FSYS) += fsys.o
obj-$(CONFIG_IOSAPIC) += iosapic.o
obj-$(CONFIG_IA64_PALINFO) += palinfo.o
obj-$(CONFIG_EFI_VARS) += efivars.o
......
......@@ -888,4 +888,26 @@ acpi_irq_to_vector (u32 irq)
return gsi_to_vector(irq);
}
int __init
acpi_register_irq (u32 gsi, u32 polarity, u32 trigger)
{
int vector = 0;
u32 irq_base;
char *iosapic_address;
if (acpi_madt->flags.pcat_compat && (gsi < 16))
return isa_irq_to_vector(gsi);
if (!iosapic_register_intr)
return 0;
/* Find the IOSAPIC */
if (!acpi_find_iosapic(gsi, &irq_base, &iosapic_address)) {
/* Turn it on */
vector = iosapic_register_intr (gsi, polarity, trigger,
irq_base, iosapic_address);
}
return vector;
}
#endif /* CONFIG_ACPI_BOOT */
......@@ -33,15 +33,6 @@
#define EFI_DEBUG 0
#ifdef CONFIG_HUGETLB_PAGE
/* By default at total of 512MB is reserved huge pages. */
#define HTLBZONE_SIZE_DEFAULT 0x20000000
unsigned long htlbzone_pages = (HTLBZONE_SIZE_DEFAULT >> HPAGE_SHIFT);
#endif
extern efi_status_t efi_call_phys (void *, ...);
struct efi efi;
......@@ -497,25 +488,6 @@ efi_init (void)
++cp;
}
}
#ifdef CONFIG_HUGETLB_PAGE
/* Just duplicating the above algo for lpzone start */
for (cp = saved_command_line; *cp; ) {
if (memcmp(cp, "lpmem=", 6) == 0) {
cp += 6;
htlbzone_pages = memparse(cp, &end);
htlbzone_pages = (htlbzone_pages >> HPAGE_SHIFT);
if (end != cp)
break;
cp = end;
} else {
while (*cp != ' ' && *cp)
++cp;
while (*cp == ' ')
++cp;
}
}
printk("Total HugeTLB_Page memory pages requested 0x%lx \n", htlbzone_pages);
#endif
if (mem_limit != ~0UL)
printk("Ignoring memory above %luMB\n", mem_limit >> 20);
......
......@@ -3,7 +3,7 @@
*
* Kernel entry points.
*
* Copyright (C) 1998-2002 Hewlett-Packard Co
* Copyright (C) 1998-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
* Copyright (C) 1999 VA Linux Systems
* Copyright (C) 1999 Walt Drummond <drummond@valinux.com>
......@@ -22,8 +22,8 @@
/*
* Global (preserved) predicate usage on syscall entry/exit path:
*
* pKern: See entry.h.
* pUser: See entry.h.
* pKStk: See entry.h.
* pUStk: See entry.h.
* pSys: See entry.h.
* pNonSys: !pSys
*/
......@@ -63,7 +63,7 @@ ENTRY(ia64_execve)
sxt4 r8=r8 // return 64-bit result
;;
stf.spill [sp]=f0
(p6) cmp.ne pKern,pUser=r0,r0 // a successful execve() lands us in user-mode...
(p6) cmp.ne pKStk,pUStk=r0,r0 // a successful execve() lands us in user-mode...
mov rp=loc0
(p6) mov ar.pfs=r0 // clear ar.pfs on success
(p7) br.ret.sptk.many rp
......@@ -193,7 +193,7 @@ GLOBAL_ENTRY(ia64_switch_to)
;;
(p6) srlz.d
ld8 sp=[r21] // load kernel stack pointer of new task
mov IA64_KR(CURRENT)=r20 // update "current" application register
mov IA64_KR(CURRENT)=in0 // update "current" application register
mov r8=r13 // return pointer to previously running task
mov r13=in0 // set "current" pointer
;;
......@@ -507,7 +507,14 @@ END(invoke_syscall_trace)
GLOBAL_ENTRY(ia64_trace_syscall)
PT_REGS_UNWIND_INFO(0)
{ /*
* Some versions of gas generate bad unwind info if the first instruction of a
* procedure doesn't go into the first slot of a bundle. This is a workaround.
*/
nop.m 0
nop.i 0
br.call.sptk.many rp=invoke_syscall_trace // give parent a chance to catch syscall args
}
.ret6: br.call.sptk.many rp=b6 // do the syscall
strace_check_retval:
cmp.lt p6,p0=r8,r0 // syscall failed?
......@@ -537,12 +544,19 @@ END(ia64_trace_syscall)
GLOBAL_ENTRY(ia64_ret_from_clone)
PT_REGS_UNWIND_INFO(0)
{ /*
* Some versions of gas generate bad unwind info if the first instruction of a
* procedure doesn't go into the first slot of a bundle. This is a workaround.
*/
nop.m 0
nop.i 0
/*
* We need to call schedule_tail() to complete the scheduling process.
* Called by ia64_switch_to() after do_fork()->copy_thread(). r8 contains the
* address of the previously executing task.
*/
br.call.sptk.many rp=ia64_invoke_schedule_tail
}
.ret8:
adds r2=TI_FLAGS+IA64_TASK_SIZE,r13
;;
......@@ -569,11 +583,12 @@ END(ia64_ret_from_syscall)
// fall through
GLOBAL_ENTRY(ia64_leave_kernel)
PT_REGS_UNWIND_INFO(0)
// work.need_resched etc. mustn't get changed by this CPU before it returns to userspace:
(pUser) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUser
(pUser) rsm psr.i
// work.need_resched etc. mustn't get changed by this CPU before it returns to
// user- or fsys-mode:
(pUStk) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUStk
(pUStk) rsm psr.i
;;
(pUser) adds r17=TI_FLAGS+IA64_TASK_SIZE,r13
(pUStk) adds r17=TI_FLAGS+IA64_TASK_SIZE,r13
;;
.work_processed:
(p6) ld4 r18=[r17] // load current_thread_info()->flags
......@@ -635,9 +650,9 @@ GLOBAL_ENTRY(ia64_leave_kernel)
;;
srlz.i // ensure interruption collection is off
mov b7=r15
bsw.0 // switch back to bank 0 (no stop bit required beforehand...)
;;
bsw.0 // switch back to bank 0
;;
(pUStk) mov r18=IA64_KR(CURRENT) // Itanium 2: 12 cycle read latency
adds r16=16,r12
adds r17=24,r12
;;
......@@ -665,16 +680,21 @@ GLOBAL_ENTRY(ia64_leave_kernel)
;;
ld8.fill r12=[r16],16
ld8.fill r13=[r17],16
(pUStk) adds r18=IA64_TASK_THREAD_ON_USTACK_OFFSET,r18
;;
ld8.fill r14=[r16]
ld8.fill r15=[r17]
(pUStk) mov r17=1
;;
(pUStk) st1 [r18]=r17 // restore current->thread.on_ustack
shr.u r18=r19,16 // get byte size of existing "dirty" partition
;;
mov r16=ar.bsp // get existing backing store pointer
movl r17=THIS_CPU(ia64_phys_stacked_size_p8)
;;
ld4 r17=[r17] // r17 = cpu_data->phys_stacked_size_p8
(pKern) br.cond.dpnt skip_rbs_switch
(pKStk) br.cond.dpnt skip_rbs_switch
/*
* Restore user backing store.
*
......@@ -710,21 +730,9 @@ dont_preserve_current_frame:
shr.u loc1=r18,9 // RNaTslots <= dirtySize / (64*8) + 1
sub r17=r17,r18 // r17 = (physStackedSize + 8) - dirtySize
;;
#if 1
.align 32 // see comment below about gas bug...
#endif
mov ar.rsc=r19 // load ar.rsc to be used for "loadrs"
shladd in0=loc1,3,r17
mov in1=0
#if 0
// gas-2.12.90 is unable to generate a stop bit after .align, which is bad,
// because alloc must be at the beginning of an insn-group.
.align 32
#else
nop 0
nop 0
nop 0
#endif
;;
rse_clear_invalid:
#ifdef CONFIG_ITANIUM
......@@ -788,12 +796,12 @@ rse_clear_invalid:
skip_rbs_switch:
mov b6=rB6
mov ar.pfs=rARPFS
(pUser) mov ar.bspstore=rARBSPSTORE
(pUStk) mov ar.bspstore=rARBSPSTORE
(p9) mov cr.ifs=rCRIFS
mov cr.ipsr=rCRIPSR
mov cr.iip=rCRIIP
;;
(pUser) mov ar.rnat=rARRNAT // must happen with RSE in lazy mode
(pUStk) mov ar.rnat=rARRNAT // must happen with RSE in lazy mode
mov ar.rsc=rARRSC
mov ar.unat=rARUNAT
mov pr=rARPR,-1
......@@ -963,12 +971,11 @@ ENTRY(sys_rt_sigreturn)
END(sys_rt_sigreturn)
GLOBAL_ENTRY(ia64_prepare_handle_unaligned)
//
// r16 = fake ar.pfs, we simply need to make sure
// privilege is still 0
//
mov r16=r0
.prologue
/*
* r16 = fake ar.pfs, we simply need to make sure privilege is still 0
*/
mov r16=r0
DO_SAVE_SWITCH_STACK
br.call.sptk.many rp=ia64_handle_unaligned // stack frame setup in ivt
.ret21: .body
......@@ -1235,8 +1242,8 @@ sys_call_table:
data8 sys_sched_setaffinity
data8 sys_sched_getaffinity
data8 sys_set_tid_address
data8 ia64_ni_syscall // available. (was sys_alloc_hugepages)
data8 ia64_ni_syscall // available (was sys_free_hugepages)
data8 ia64_ni_syscall
data8 ia64_ni_syscall // 1235
data8 sys_exit_group
data8 sys_lookup_dcookie
data8 sys_io_setup
......
......@@ -4,8 +4,8 @@
* Preserved registers that are shared between code in ivt.S and entry.S. Be
* careful not to step on these!
*/
#define pKern p2 /* will leave_kernel return to kernel-mode? */
#define pUser p3 /* will leave_kernel return to user-mode? */
#define pKStk p2 /* will leave_kernel return to kernel-stacks? */
#define pUStk p3 /* will leave_kernel return to user-stacks? */
#define pSys p4 /* are we processing a (synchronous) system call? */
#define pNonSys p5 /* complement of pSys */
......
/*
* This file contains the light-weight system call handlers (fsyscall-handlers).
*
* Copyright (C) 2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
*/
#include <asm/asmmacro.h>
#include <asm/errno.h>
#include <asm/offsets.h>
#include <asm/thread_info.h>
ENTRY(fsys_ni_syscall)
mov r8=ENOSYS
mov r10=-1
MCKINLEY_E9_WORKAROUND
br.ret.sptk.many b6
END(fsys_ni_syscall)
ENTRY(fsys_getpid)
add r9=TI_FLAGS+IA64_TASK_SIZE,r16
;;
ld4 r9=[r9]
add r8=IA64_TASK_TGID_OFFSET,r16
;;
and r9=TIF_ALLWORK_MASK,r9
ld4 r8=[r8]
;;
cmp.ne p8,p0=0,r9
(p8) br.spnt.many fsys_fallback_syscall
MCKINLEY_E9_WORKAROUND
br.ret.sptk.many b6
END(fsys_getpid)
.rodata
.align 8
.globl fsyscall_table
fsyscall_table:
data8 fsys_ni_syscall
data8 fsys_fallback_syscall // exit // 1025
data8 fsys_fallback_syscall // read
data8 fsys_fallback_syscall // write
data8 fsys_fallback_syscall // open
data8 fsys_fallback_syscall // close
data8 fsys_fallback_syscall // creat // 1030
data8 fsys_fallback_syscall // link
data8 fsys_fallback_syscall // unlink
data8 fsys_fallback_syscall // execve
data8 fsys_fallback_syscall // chdir
data8 fsys_fallback_syscall // fchdir // 1035
data8 fsys_fallback_syscall // utimes
data8 fsys_fallback_syscall // mknod
data8 fsys_fallback_syscall // chmod
data8 fsys_fallback_syscall // chown
data8 fsys_fallback_syscall // lseek // 1040
data8 fsys_getpid
data8 fsys_fallback_syscall // getppid
data8 fsys_fallback_syscall // mount
data8 fsys_fallback_syscall // umount
data8 fsys_fallback_syscall // setuid // 1045
data8 fsys_fallback_syscall // getuid
data8 fsys_fallback_syscall // geteuid
data8 fsys_fallback_syscall // ptrace
data8 fsys_fallback_syscall // access
data8 fsys_fallback_syscall // sync // 1050
data8 fsys_fallback_syscall // fsync
data8 fsys_fallback_syscall // fdatasync
data8 fsys_fallback_syscall // kill
data8 fsys_fallback_syscall // rename
data8 fsys_fallback_syscall // mkdir // 1055
data8 fsys_fallback_syscall // rmdir
data8 fsys_fallback_syscall // dup
data8 fsys_fallback_syscall // pipe
data8 fsys_fallback_syscall // times
data8 fsys_fallback_syscall // brk // 1060
data8 fsys_fallback_syscall // setgid
data8 fsys_fallback_syscall // getgid
data8 fsys_fallback_syscall // getegid
data8 fsys_fallback_syscall // acct
data8 fsys_fallback_syscall // ioctl // 1065
data8 fsys_fallback_syscall // fcntl
data8 fsys_fallback_syscall // umask
data8 fsys_fallback_syscall // chroot
data8 fsys_fallback_syscall // ustat
data8 fsys_fallback_syscall // dup2 // 1070
data8 fsys_fallback_syscall // setreuid
data8 fsys_fallback_syscall // setregid
data8 fsys_fallback_syscall // getresuid
data8 fsys_fallback_syscall // setresuid
data8 fsys_fallback_syscall // getresgid // 1075
data8 fsys_fallback_syscall // setresgid
data8 fsys_fallback_syscall // getgroups
data8 fsys_fallback_syscall // setgroups
data8 fsys_fallback_syscall // getpgid
data8 fsys_fallback_syscall // setpgid // 1080
data8 fsys_fallback_syscall // setsid
data8 fsys_fallback_syscall // getsid
data8 fsys_fallback_syscall // sethostname
data8 fsys_fallback_syscall // setrlimit
data8 fsys_fallback_syscall // getrlimit // 1085
data8 fsys_fallback_syscall // getrusage
data8 fsys_fallback_syscall // gettimeofday
data8 fsys_fallback_syscall // settimeofday
data8 fsys_fallback_syscall // select
data8 fsys_fallback_syscall // poll // 1090
data8 fsys_fallback_syscall // symlink
data8 fsys_fallback_syscall // readlink
data8 fsys_fallback_syscall // uselib
data8 fsys_fallback_syscall // swapon
data8 fsys_fallback_syscall // swapoff // 1095
data8 fsys_fallback_syscall // reboot
data8 fsys_fallback_syscall // truncate
data8 fsys_fallback_syscall // ftruncate
data8 fsys_fallback_syscall // fchmod
data8 fsys_fallback_syscall // fchown // 1100
data8 fsys_fallback_syscall // getpriority
data8 fsys_fallback_syscall // setpriority
data8 fsys_fallback_syscall // statfs
data8 fsys_fallback_syscall // fstatfs
data8 fsys_fallback_syscall // gettid // 1105
data8 fsys_fallback_syscall // semget
data8 fsys_fallback_syscall // semop
data8 fsys_fallback_syscall // semctl
data8 fsys_fallback_syscall // msgget
data8 fsys_fallback_syscall // msgsnd // 1110
data8 fsys_fallback_syscall // msgrcv
data8 fsys_fallback_syscall // msgctl
data8 fsys_fallback_syscall // shmget
data8 fsys_fallback_syscall // shmat
data8 fsys_fallback_syscall // shmdt // 1115
data8 fsys_fallback_syscall // shmctl
data8 fsys_fallback_syscall // syslog
data8 fsys_fallback_syscall // setitimer
data8 fsys_fallback_syscall // getitimer
data8 fsys_fallback_syscall // 1120
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall // vhangup
data8 fsys_fallback_syscall // lchown
data8 fsys_fallback_syscall // remap_file_pages // 1125
data8 fsys_fallback_syscall // wait4
data8 fsys_fallback_syscall // sysinfo
data8 fsys_fallback_syscall // clone
data8 fsys_fallback_syscall // setdomainname
data8 fsys_fallback_syscall // newuname // 1130
data8 fsys_fallback_syscall // adjtimex
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall // init_module
data8 fsys_fallback_syscall // delete_module
data8 fsys_fallback_syscall // 1135
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall // quotactl
data8 fsys_fallback_syscall // bdflush
data8 fsys_fallback_syscall // sysfs
data8 fsys_fallback_syscall // personality // 1140
data8 fsys_fallback_syscall // afs_syscall
data8 fsys_fallback_syscall // setfsuid
data8 fsys_fallback_syscall // setfsgid
data8 fsys_fallback_syscall // getdents
data8 fsys_fallback_syscall // flock // 1145
data8 fsys_fallback_syscall // readv
data8 fsys_fallback_syscall // writev
data8 fsys_fallback_syscall // pread64
data8 fsys_fallback_syscall // pwrite64
data8 fsys_fallback_syscall // sysctl // 1150
data8 fsys_fallback_syscall // mmap
data8 fsys_fallback_syscall // munmap
data8 fsys_fallback_syscall // mlock
data8 fsys_fallback_syscall // mlockall
data8 fsys_fallback_syscall // mprotect // 1155
data8 fsys_fallback_syscall // mremap
data8 fsys_fallback_syscall // msync
data8 fsys_fallback_syscall // munlock
data8 fsys_fallback_syscall // munlockall
data8 fsys_fallback_syscall // sched_getparam // 1160
data8 fsys_fallback_syscall // sched_setparam
data8 fsys_fallback_syscall // sched_getscheduler
data8 fsys_fallback_syscall // sched_setscheduler
data8 fsys_fallback_syscall // sched_yield
data8 fsys_fallback_syscall // sched_get_priority_max // 1165
data8 fsys_fallback_syscall // sched_get_priority_min
data8 fsys_fallback_syscall // sched_rr_get_interval
data8 fsys_fallback_syscall // nanosleep
data8 fsys_fallback_syscall // nfsservctl
data8 fsys_fallback_syscall // prctl // 1170
data8 fsys_fallback_syscall // getpagesize
data8 fsys_fallback_syscall // mmap2
data8 fsys_fallback_syscall // pciconfig_read
data8 fsys_fallback_syscall // pciconfig_write
data8 fsys_fallback_syscall // perfmonctl // 1175
data8 fsys_fallback_syscall // sigaltstack
data8 fsys_fallback_syscall // rt_sigaction
data8 fsys_fallback_syscall // rt_sigpending
data8 fsys_fallback_syscall // rt_sigprocmask
data8 fsys_fallback_syscall // rt_sigqueueinfo // 1180
data8 fsys_fallback_syscall // rt_sigreturn
data8 fsys_fallback_syscall // rt_sigsuspend
data8 fsys_fallback_syscall // rt_sigtimedwait
data8 fsys_fallback_syscall // getcwd
data8 fsys_fallback_syscall // capget // 1185
data8 fsys_fallback_syscall // capset
data8 fsys_fallback_syscall // sendfile
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall // socket // 1190
data8 fsys_fallback_syscall // bind
data8 fsys_fallback_syscall // connect
data8 fsys_fallback_syscall // listen
data8 fsys_fallback_syscall // accept
data8 fsys_fallback_syscall // getsockname // 1195
data8 fsys_fallback_syscall // getpeername
data8 fsys_fallback_syscall // socketpair
data8 fsys_fallback_syscall // send
data8 fsys_fallback_syscall // sendto
data8 fsys_fallback_syscall // recv // 1200
data8 fsys_fallback_syscall // recvfrom
data8 fsys_fallback_syscall // shutdown
data8 fsys_fallback_syscall // setsockopt
data8 fsys_fallback_syscall // getsockopt
data8 fsys_fallback_syscall // sendmsg // 1205
data8 fsys_fallback_syscall // recvmsg
data8 fsys_fallback_syscall // pivot_root
data8 fsys_fallback_syscall // mincore
data8 fsys_fallback_syscall // madvise
data8 fsys_fallback_syscall // newstat // 1210
data8 fsys_fallback_syscall // newlstat
data8 fsys_fallback_syscall // newfstat
data8 fsys_fallback_syscall // clone2
data8 fsys_fallback_syscall // getdents64
data8 fsys_fallback_syscall // getunwind // 1215
data8 fsys_fallback_syscall // readahead
data8 fsys_fallback_syscall // setxattr
data8 fsys_fallback_syscall // lsetxattr
data8 fsys_fallback_syscall // fsetxattr
data8 fsys_fallback_syscall // getxattr // 1220
data8 fsys_fallback_syscall // lgetxattr
data8 fsys_fallback_syscall // fgetxattr
data8 fsys_fallback_syscall // listxattr
data8 fsys_fallback_syscall // llistxattr
data8 fsys_fallback_syscall // flistxattr // 1225
data8 fsys_fallback_syscall // removexattr
data8 fsys_fallback_syscall // lremovexattr
data8 fsys_fallback_syscall // fremovexattr
data8 fsys_fallback_syscall // tkill
data8 fsys_fallback_syscall // futex // 1230
data8 fsys_fallback_syscall // sched_setaffinity
data8 fsys_fallback_syscall // sched_getaffinity
data8 fsys_fallback_syscall // set_tid_address
data8 fsys_fallback_syscall // alloc_hugepages
data8 fsys_fallback_syscall // free_hugepages // 1235
data8 fsys_fallback_syscall // exit_group
data8 fsys_fallback_syscall // lookup_dcookie
data8 fsys_fallback_syscall // io_setup
data8 fsys_fallback_syscall // io_destroy
data8 fsys_fallback_syscall // io_getevents // 1240
data8 fsys_fallback_syscall // io_submit
data8 fsys_fallback_syscall // io_cancel
data8 fsys_fallback_syscall // epoll_create
data8 fsys_fallback_syscall // epoll_ctl
data8 fsys_fallback_syscall // epoll_wait // 1245
data8 fsys_fallback_syscall // restart_syscall
data8 fsys_fallback_syscall // semtimedop
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall // 1250
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall // 1255
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall // 1260
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall // 1265
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall // 1270
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall // 1275
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
data8 fsys_fallback_syscall
......@@ -2,7 +2,7 @@
* This file contains the code that gets mapped at the upper end of each task's text
* region. For now, it contains the signal trampoline code only.
*
* Copyright (C) 1999-2002 Hewlett-Packard Co
* Copyright (C) 1999-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
*/
......@@ -14,6 +14,87 @@
#include <asm/page.h>
.section .text.gate, "ax"
.start_gate:
#if CONFIG_FSYS
#include <asm/errno.h>
/*
* On entry:
* r11 = saved ar.pfs
* r15 = system call #
* b0 = saved return address
* b6 = return address
* On exit:
* r11 = saved ar.pfs
* r15 = system call #
* b0 = saved return address
* all other "scratch" registers: undefined
* all "preserved" registers: same as on entry
*/
GLOBAL_ENTRY(syscall_via_epc)
.prologue
.altrp b6
.body
{
/*
* Note: the kernel cannot assume that the first two instructions in this
* bundle get executed. The remaining code must be safe even if
* they do not get executed.
*/
adds r17=-1024,r15
mov r10=0 // default to successful syscall execution
epc
}
;;
rsm psr.be
movl r18=fsyscall_table
mov r16=IA64_KR(CURRENT)
mov r19=255
;;
shladd r18=r17,3,r18
cmp.geu p6,p0=r19,r17 // (syscall > 0 && syscall <= 1024+255)?
;;
srlz.d // ensure little-endian byteorder is in effect
(p6) ld8 r18=[r18]
;;
(p6) mov b7=r18
(p6) br.sptk.many b7
mov r10=-1
mov r8=ENOSYS
MCKINLEY_E9_WORKAROUND
br.ret.sptk.many b6
END(syscall_via_epc)
GLOBAL_ENTRY(syscall_via_break)
.prologue
.altrp b6
.body
break 0x100000
br.ret.sptk.many b6
END(syscall_via_break)
GLOBAL_ENTRY(fsys_fallback_syscall)
/*
* It would be better/faster to do the SAVE_MIN magic directly here, but for now
* we simply fall back on doing a system-call via break. Good enough
* to get started. (Note: we have to do this through the gate page again, since
* the br.ret will switch us back to user-level privilege.)
*
* XXX Move this back to fsys.S after changing it over to avoid break 0x100000.
*/
movl r2=(syscall_via_break - .start_gate) + GATE_ADDR
;;
MCKINLEY_E9_WORKAROUND
mov b7=r2
br.ret.sptk.many b7
END(fsys_fallback_syscall)
#endif /* CONFIG_FSYS */
# define ARG0_OFF (16 + IA64_SIGFRAME_ARG0_OFFSET)
# define ARG1_OFF (16 + IA64_SIGFRAME_ARG1_OFFSET)
......@@ -63,15 +144,18 @@
* call stack.
*/
#define SIGTRAMP_SAVES \
.unwabi @svr4, 's' // mark this as a sigtramp handler (saves scratch regs) \
.savesp ar.unat, UNAT_OFF+SIGCONTEXT_OFF \
.savesp ar.fpsr, FPSR_OFF+SIGCONTEXT_OFF \
.savesp pr, PR_OFF+SIGCONTEXT_OFF \
.savesp rp, RP_OFF+SIGCONTEXT_OFF \
.vframesp SP_OFF+SIGCONTEXT_OFF
GLOBAL_ENTRY(ia64_sigtramp)
// describe the state that is active when we get here:
.prologue
.unwabi @svr4, 's' // mark this as a sigtramp handler (saves scratch regs)
.savesp ar.unat, UNAT_OFF+SIGCONTEXT_OFF
.savesp ar.fpsr, FPSR_OFF+SIGCONTEXT_OFF
.savesp pr, PR_OFF+SIGCONTEXT_OFF
.savesp rp, RP_OFF+SIGCONTEXT_OFF
.vframesp SP_OFF+SIGCONTEXT_OFF
SIGTRAMP_SAVES
.body
.label_state 1
......@@ -156,10 +240,11 @@ back_from_restore_rbs:
ldf.fill f14=[base0],32
ldf.fill f15=[base1],32
mov r15=__NR_rt_sigreturn
.restore sp // pop .prologue
break __BREAK_SYSCALL
.body
.copy_state 1
.prologue
SIGTRAMP_SAVES
setup_rbs:
mov ar.rsc=0 // put RSE into enforced lazy mode
;;
......@@ -171,6 +256,7 @@ setup_rbs:
;;
.spillsp ar.rnat, RNAT_OFF+SIGCONTEXT_OFF
st8 [r14]=r16 // save sc_ar_rnat
.body
adds r14=(LOADRS_OFF+SIGCONTEXT_OFF),sp
mov.m r16=ar.bsp // sc_loadrs <- (new bsp - new bspstore) << 16
......@@ -182,10 +268,11 @@ setup_rbs:
;;
st8 [r14]=r15 // save sc_loadrs
mov ar.rsc=0xf // set RSE into eager mode, pl 3
.restore sp // pop .prologue
br.cond.sptk back_from_setup_rbs
.prologue
.copy_state 1
SIGTRAMP_SAVES
.spillsp ar.rnat, RNAT_OFF+SIGCONTEXT_OFF
.body
restore_rbs:
......
......@@ -5,7 +5,7 @@
* to set up the kernel's global pointer and jump to the kernel
* entry point.
*
* Copyright (C) 1998-2001 Hewlett-Packard Co
* Copyright (C) 1998-2001, 2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
* Stephane Eranian <eranian@hpl.hp.com>
* Copyright (C) 1999 VA Linux Systems
......@@ -143,17 +143,14 @@ start_ap:
movl r2=init_thread_union
cmp.eq isBP,isAP=r0,r0
#endif
;;
extr r3=r2,0,61 // r3 == phys addr of task struct
mov r16=KERNEL_TR_PAGE_NUM
;;
// load the "current" pointer (r13) and ar.k6 with the current task
mov r13=r2
mov IA64_KR(CURRENT)=r3 // Physical address
mov IA64_KR(CURRENT)=r2 // virtual address
// initialize k4 to a safe value (64-128MB is mapped by TR_KERNEL)
mov IA64_KR(CURRENT_STACK)=r16
mov r13=r2
/*
* Reserve space at the top of the stack for "struct pt_regs". Kernel threads
* don't store interesting values in that structure, but the space still needs
......
......@@ -142,4 +142,8 @@ EXPORT_SYMBOL(efi_dir);
EXPORT_SYMBOL(ia64_mv);
#endif
EXPORT_SYMBOL(machvec_noop);
#ifdef CONFIG_PERFMON
#include <asm/perfmon.h>
EXPORT_SYMBOL(pfm_install_alternate_syswide_subsystem);
EXPORT_SYMBOL(pfm_remove_alternate_syswide_subsystem);
#endif
......@@ -752,7 +752,7 @@ iosapic_parse_prt (void)
if (index < 0) {
printk(KERN_WARNING"IOSAPIC: GSI 0x%x has no IOSAPIC!\n", gsi);
return;
continue;
}
addr = iosapic_lists[index].addr;
gsi_base = iosapic_lists[index].gsi_base;
......
......@@ -178,7 +178,7 @@ init_IRQ (void)
register_percpu_irq(IA64_IPI_VECTOR, &ipi_irqaction);
#endif
#ifdef CONFIG_PERFMON
perfmon_init_percpu();
pfm_init_percpu();
#endif
platform_irq_init();
}
......
......@@ -192,7 +192,7 @@ ENTRY(vhpt_miss)
rfi
END(vhpt_miss)
.align 1024
.org ia64_ivt+0x400
/////////////////////////////////////////////////////////////////////////////////////////
// 0x0400 Entry 1 (size 64 bundles) ITLB (21)
ENTRY(itlb_miss)
......@@ -206,7 +206,7 @@ ENTRY(itlb_miss)
mov r16=cr.ifa // get virtual address
mov r29=b0 // save b0
mov r31=pr // save predicates
itlb_fault:
.itlb_fault:
mov r17=cr.iha // get virtual address of L3 PTE
movl r30=1f // load nested fault continuation point
;;
......@@ -230,7 +230,7 @@ itlb_fault:
rfi
END(itlb_miss)
.align 1024
.org ia64_ivt+0x0800
/////////////////////////////////////////////////////////////////////////////////////////
// 0x0800 Entry 2 (size 64 bundles) DTLB (9,48)
ENTRY(dtlb_miss)
......@@ -268,7 +268,7 @@ dtlb_fault:
rfi
END(dtlb_miss)
.align 1024
.org ia64_ivt+0x0c00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x0c00 Entry 3 (size 64 bundles) Alt ITLB (19)
ENTRY(alt_itlb_miss)
......@@ -288,7 +288,7 @@ ENTRY(alt_itlb_miss)
;;
(p8) mov cr.iha=r17
(p8) mov r29=b0 // save b0
(p8) br.cond.dptk itlb_fault
(p8) br.cond.dptk .itlb_fault
#endif
extr.u r23=r21,IA64_PSR_CPL0_BIT,2 // extract psr.cpl
and r19=r19,r16 // clear ed, reserved bits, and PTE control bits
......@@ -306,7 +306,7 @@ ENTRY(alt_itlb_miss)
rfi
END(alt_itlb_miss)
.align 1024
.org ia64_ivt+0x1000
/////////////////////////////////////////////////////////////////////////////////////////
// 0x1000 Entry 4 (size 64 bundles) Alt DTLB (7,46)
ENTRY(alt_dtlb_miss)
......@@ -379,7 +379,7 @@ ENTRY(page_fault)
br.call.sptk.many b6=ia64_do_page_fault // ignore return address
END(page_fault)
.align 1024
.org ia64_ivt+0x1400
/////////////////////////////////////////////////////////////////////////////////////////
// 0x1400 Entry 5 (size 64 bundles) Data nested TLB (6,45)
ENTRY(nested_dtlb_miss)
......@@ -440,7 +440,7 @@ ENTRY(nested_dtlb_miss)
br.sptk.many b0 // return to continuation point
END(nested_dtlb_miss)
.align 1024
.org ia64_ivt+0x1800
/////////////////////////////////////////////////////////////////////////////////////////
// 0x1800 Entry 6 (size 64 bundles) Instruction Key Miss (24)
ENTRY(ikey_miss)
......@@ -448,7 +448,7 @@ ENTRY(ikey_miss)
FAULT(6)
END(ikey_miss)
.align 1024
.org ia64_ivt+0x1c00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x1c00 Entry 7 (size 64 bundles) Data Key Miss (12,51)
ENTRY(dkey_miss)
......@@ -456,7 +456,7 @@ ENTRY(dkey_miss)
FAULT(7)
END(dkey_miss)
.align 1024
.org ia64_ivt+0x2000
/////////////////////////////////////////////////////////////////////////////////////////
// 0x2000 Entry 8 (size 64 bundles) Dirty-bit (54)
ENTRY(dirty_bit)
......@@ -512,7 +512,7 @@ ENTRY(dirty_bit)
rfi
END(idirty_bit)
.align 1024
.org ia64_ivt+0x2400
/////////////////////////////////////////////////////////////////////////////////////////
// 0x2400 Entry 9 (size 64 bundles) Instruction Access-bit (27)
ENTRY(iaccess_bit)
......@@ -571,7 +571,7 @@ ENTRY(iaccess_bit)
rfi
END(iaccess_bit)
.align 1024
.org ia64_ivt+0x2800
/////////////////////////////////////////////////////////////////////////////////////////
// 0x2800 Entry 10 (size 64 bundles) Data Access-bit (15,55)
ENTRY(daccess_bit)
......@@ -618,7 +618,7 @@ ENTRY(daccess_bit)
rfi
END(daccess_bit)
.align 1024
.org ia64_ivt+0x2c00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x2c00 Entry 11 (size 64 bundles) Break instruction (33)
ENTRY(break_fault)
......@@ -690,7 +690,7 @@ ENTRY(break_fault)
// NOT REACHED
END(break_fault)
ENTRY(demine_args)
ENTRY_MIN_ALIGN(demine_args)
alloc r2=ar.pfs,8,0,0,0
tnat.nz p8,p0=in0
tnat.nz p9,p0=in1
......@@ -719,7 +719,7 @@ ENTRY(demine_args)
br.ret.sptk.many rp
END(demine_args)
.align 1024
.org ia64_ivt+0x3000
/////////////////////////////////////////////////////////////////////////////////////////
// 0x3000 Entry 12 (size 64 bundles) External Interrupt (4)
ENTRY(interrupt)
......@@ -746,19 +746,19 @@ ENTRY(interrupt)
br.call.sptk.many b6=ia64_handle_irq
END(interrupt)
.align 1024
.org ia64_ivt+0x3400
/////////////////////////////////////////////////////////////////////////////////////////
// 0x3400 Entry 13 (size 64 bundles) Reserved
DBG_FAULT(13)
FAULT(13)
.align 1024
.org ia64_ivt+0x3800
/////////////////////////////////////////////////////////////////////////////////////////
// 0x3800 Entry 14 (size 64 bundles) Reserved
DBG_FAULT(14)
FAULT(14)
.align 1024
.org ia64_ivt+0x3c00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x3c00 Entry 15 (size 64 bundles) Reserved
DBG_FAULT(15)
......@@ -803,7 +803,7 @@ ENTRY(dispatch_illegal_op_fault)
br.sptk.many ia64_leave_kernel
END(dispatch_illegal_op_fault)
.align 1024
.org ia64_ivt+0x4000
/////////////////////////////////////////////////////////////////////////////////////////
// 0x4000 Entry 16 (size 64 bundles) Reserved
DBG_FAULT(16)
......@@ -893,7 +893,7 @@ END(dispatch_to_ia32_handler)
#endif /* CONFIG_IA32_SUPPORT */
.align 1024
.org ia64_ivt+0x4400
/////////////////////////////////////////////////////////////////////////////////////////
// 0x4400 Entry 17 (size 64 bundles) Reserved
DBG_FAULT(17)
......@@ -925,7 +925,7 @@ ENTRY(non_syscall)
br.call.sptk.many b6=ia64_bad_break // avoid WAW on CFM and ignore return addr
END(non_syscall)
.align 1024
.org ia64_ivt+0x4800
/////////////////////////////////////////////////////////////////////////////////////////
// 0x4800 Entry 18 (size 64 bundles) Reserved
DBG_FAULT(18)
......@@ -959,7 +959,7 @@ ENTRY(dispatch_unaligned_handler)
br.sptk.many ia64_prepare_handle_unaligned
END(dispatch_unaligned_handler)
.align 1024
.org ia64_ivt+0x4c00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x4c00 Entry 19 (size 64 bundles) Reserved
DBG_FAULT(19)
......@@ -1005,7 +1005,7 @@ END(dispatch_to_fault_handler)
// --- End of long entries, Beginning of short entries
//
.align 1024
.org ia64_ivt+0x5000
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5000 Entry 20 (size 16 bundles) Page Not Present (10,22,49)
ENTRY(page_not_present)
......@@ -1025,7 +1025,7 @@ ENTRY(page_not_present)
br.sptk.many page_fault
END(page_not_present)
.align 256
.org ia64_ivt+0x5100
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5100 Entry 21 (size 16 bundles) Key Permission (13,25,52)
ENTRY(key_permission)
......@@ -1038,7 +1038,7 @@ ENTRY(key_permission)
br.sptk.many page_fault
END(key_permission)
.align 256
.org ia64_ivt+0x5200
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5200 Entry 22 (size 16 bundles) Instruction Access Rights (26)
ENTRY(iaccess_rights)
......@@ -1051,7 +1051,7 @@ ENTRY(iaccess_rights)
br.sptk.many page_fault
END(iaccess_rights)
.align 256
.org ia64_ivt+0x5300
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5300 Entry 23 (size 16 bundles) Data Access Rights (14,53)
ENTRY(daccess_rights)
......@@ -1064,7 +1064,7 @@ ENTRY(daccess_rights)
br.sptk.many page_fault
END(daccess_rights)
.align 256
.org ia64_ivt+0x5400
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5400 Entry 24 (size 16 bundles) General Exception (5,32,34,36,38,39)
ENTRY(general_exception)
......@@ -1079,7 +1079,7 @@ ENTRY(general_exception)
br.sptk.many dispatch_to_fault_handler
END(general_exception)
.align 256
.org ia64_ivt+0x5500
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5500 Entry 25 (size 16 bundles) Disabled FP-Register (35)
ENTRY(disabled_fp_reg)
......@@ -1092,7 +1092,7 @@ ENTRY(disabled_fp_reg)
br.sptk.many dispatch_to_fault_handler
END(disabled_fp_reg)
.align 256
.org ia64_ivt+0x5600
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5600 Entry 26 (size 16 bundles) Nat Consumption (11,23,37,50)
ENTRY(nat_consumption)
......@@ -1100,7 +1100,7 @@ ENTRY(nat_consumption)
FAULT(26)
END(nat_consumption)
.align 256
.org ia64_ivt+0x5700
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5700 Entry 27 (size 16 bundles) Speculation (40)
ENTRY(speculation_vector)
......@@ -1137,13 +1137,13 @@ ENTRY(speculation_vector)
rfi // and go back
END(speculation_vector)
.align 256
.org ia64_ivt+0x5800
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5800 Entry 28 (size 16 bundles) Reserved
DBG_FAULT(28)
FAULT(28)
.align 256
.org ia64_ivt+0x5900
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5900 Entry 29 (size 16 bundles) Debug (16,28,56)
ENTRY(debug_vector)
......@@ -1151,7 +1151,7 @@ ENTRY(debug_vector)
FAULT(29)
END(debug_vector)
.align 256
.org ia64_ivt+0x5a00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5a00 Entry 30 (size 16 bundles) Unaligned Reference (57)
ENTRY(unaligned_access)
......@@ -1162,91 +1162,103 @@ ENTRY(unaligned_access)
br.sptk.many dispatch_unaligned_handler
END(unaligned_access)
.align 256
.org ia64_ivt+0x5b00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5b00 Entry 31 (size 16 bundles) Unsupported Data Reference (57)
ENTRY(unsupported_data_reference)
DBG_FAULT(31)
FAULT(31)
END(unsupported_data_reference)
.align 256
.org ia64_ivt+0x5c00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5c00 Entry 32 (size 16 bundles) Floating-Point Fault (64)
ENTRY(floating_point_fault)
DBG_FAULT(32)
FAULT(32)
END(floating_point_fault)
.align 256
.org ia64_ivt+0x5d00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5d00 Entry 33 (size 16 bundles) Floating Point Trap (66)
ENTRY(floating_point_trap)
DBG_FAULT(33)
FAULT(33)
END(floating_point_trap)
.align 256
.org ia64_ivt+0x5e00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5e00 Entry 34 (size 16 bundles) Lower Privilege Tranfer Trap (66)
// 0x5e00 Entry 34 (size 16 bundles) Lower Privilege Transfer Trap (66)
ENTRY(lower_privilege_trap)
DBG_FAULT(34)
FAULT(34)
END(lower_privilege_trap)
.align 256
.org ia64_ivt+0x5f00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x5f00 Entry 35 (size 16 bundles) Taken Branch Trap (68)
ENTRY(taken_branch_trap)
DBG_FAULT(35)
FAULT(35)
END(taken_branch_trap)
.align 256
.org ia64_ivt+0x6000
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6000 Entry 36 (size 16 bundles) Single Step Trap (69)
ENTRY(single_step_trap)
DBG_FAULT(36)
FAULT(36)
END(single_step_trap)
.align 256
.org ia64_ivt+0x6100
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6100 Entry 37 (size 16 bundles) Reserved
DBG_FAULT(37)
FAULT(37)
.align 256
.org ia64_ivt+0x6200
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6200 Entry 38 (size 16 bundles) Reserved
DBG_FAULT(38)
FAULT(38)
.align 256
.org ia64_ivt+0x6300
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6300 Entry 39 (size 16 bundles) Reserved
DBG_FAULT(39)
FAULT(39)
.align 256
.org ia64_ivt+0x6400
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6400 Entry 40 (size 16 bundles) Reserved
DBG_FAULT(40)
FAULT(40)
.align 256
.org ia64_ivt+0x6500
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6500 Entry 41 (size 16 bundles) Reserved
DBG_FAULT(41)
FAULT(41)
.align 256
.org ia64_ivt+0x6600
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6600 Entry 42 (size 16 bundles) Reserved
DBG_FAULT(42)
FAULT(42)
.align 256
.org ia64_ivt+0x6700
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6700 Entry 43 (size 16 bundles) Reserved
DBG_FAULT(43)
FAULT(43)
.align 256
.org ia64_ivt+0x6800
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6800 Entry 44 (size 16 bundles) Reserved
DBG_FAULT(44)
FAULT(44)
.align 256
.org ia64_ivt+0x6900
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6900 Entry 45 (size 16 bundles) IA-32 Exception (17,18,29,41,42,43,44,58,60,61,62,72,73,75,76,77)
ENTRY(ia32_exception)
......@@ -1254,7 +1266,7 @@ ENTRY(ia32_exception)
FAULT(45)
END(ia32_exception)
.align 256
.org ia64_ivt+0x6a00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6a00 Entry 46 (size 16 bundles) IA-32 Intercept (30,31,59,70,71)
ENTRY(ia32_intercept)
......@@ -1284,7 +1296,7 @@ ENTRY(ia32_intercept)
FAULT(46)
END(ia32_intercept)
.align 256
.org ia64_ivt+0x6b00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6b00 Entry 47 (size 16 bundles) IA-32 Interrupt (74)
ENTRY(ia32_interrupt)
......@@ -1297,121 +1309,121 @@ ENTRY(ia32_interrupt)
#endif
END(ia32_interrupt)
.align 256
.org ia64_ivt+0x6c00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6c00 Entry 48 (size 16 bundles) Reserved
DBG_FAULT(48)
FAULT(48)
.align 256
.org ia64_ivt+0x6d00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6d00 Entry 49 (size 16 bundles) Reserved
DBG_FAULT(49)
FAULT(49)
.align 256
.org ia64_ivt+0x6e00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6e00 Entry 50 (size 16 bundles) Reserved
DBG_FAULT(50)
FAULT(50)
.align 256
.org ia64_ivt+0x6f00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x6f00 Entry 51 (size 16 bundles) Reserved
DBG_FAULT(51)
FAULT(51)
.align 256
.org ia64_ivt+0x7000
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7000 Entry 52 (size 16 bundles) Reserved
DBG_FAULT(52)
FAULT(52)
.align 256
.org ia64_ivt+0x7100
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7100 Entry 53 (size 16 bundles) Reserved
DBG_FAULT(53)
FAULT(53)
.align 256
.org ia64_ivt+0x7200
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7200 Entry 54 (size 16 bundles) Reserved
DBG_FAULT(54)
FAULT(54)
.align 256
.org ia64_ivt+0x7300
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7300 Entry 55 (size 16 bundles) Reserved
DBG_FAULT(55)
FAULT(55)
.align 256
.org ia64_ivt+0x7400
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7400 Entry 56 (size 16 bundles) Reserved
DBG_FAULT(56)
FAULT(56)
.align 256
.org ia64_ivt+0x7500
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7500 Entry 57 (size 16 bundles) Reserved
DBG_FAULT(57)
FAULT(57)
.align 256
.org ia64_ivt+0x7600
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7600 Entry 58 (size 16 bundles) Reserved
DBG_FAULT(58)
FAULT(58)
.align 256
.org ia64_ivt+0x7700
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7700 Entry 59 (size 16 bundles) Reserved
DBG_FAULT(59)
FAULT(59)
.align 256
.org ia64_ivt+0x7800
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7800 Entry 60 (size 16 bundles) Reserved
DBG_FAULT(60)
FAULT(60)
.align 256
.org ia64_ivt+0x7900
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7900 Entry 61 (size 16 bundles) Reserved
DBG_FAULT(61)
FAULT(61)
.align 256
.org ia64_ivt+0x7a00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7a00 Entry 62 (size 16 bundles) Reserved
DBG_FAULT(62)
FAULT(62)
.align 256
.org ia64_ivt+0x7b00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7b00 Entry 63 (size 16 bundles) Reserved
DBG_FAULT(63)
FAULT(63)
.align 256
.org ia64_ivt+0x7c00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7c00 Entry 64 (size 16 bundles) Reserved
DBG_FAULT(64)
FAULT(64)
.align 256
.org ia64_ivt+0x7d00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7d00 Entry 65 (size 16 bundles) Reserved
DBG_FAULT(65)
FAULT(65)
.align 256
.org ia64_ivt+0x7e00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7e00 Entry 66 (size 16 bundles) Reserved
DBG_FAULT(66)
FAULT(66)
.align 256
.org ia64_ivt+0x7f00
/////////////////////////////////////////////////////////////////////////////////////////
// 0x7f00 Entry 67 (size 16 bundles) Reserved
DBG_FAULT(67)
......
......@@ -30,25 +30,23 @@
* on interrupts.
*/
#define MINSTATE_START_SAVE_MIN_VIRT \
(pUser) mov ar.rsc=0; /* set enforced lazy mode, pl 0, little-endian, loadrs=0 */ \
dep r1=-1,r1,61,3; /* r1 = current (virtual) */ \
(pUStk) mov ar.rsc=0; /* set enforced lazy mode, pl 0, little-endian, loadrs=0 */ \
;; \
(pUser) mov.m rARRNAT=ar.rnat; \
(pUser) addl rKRBS=IA64_RBS_OFFSET,r1; /* compute base of RBS */ \
(pKern) mov r1=sp; /* get sp */ \
(pUStk) mov.m rARRNAT=ar.rnat; \
(pUStk) addl rKRBS=IA64_RBS_OFFSET,r1; /* compute base of RBS */ \
(pKStk) mov r1=sp; /* get sp */ \
;; \
(pUser) lfetch.fault.excl.nt1 [rKRBS]; \
(pUser) mov rARBSPSTORE=ar.bspstore; /* save ar.bspstore */ \
(pUser) addl r1=IA64_STK_OFFSET-IA64_PT_REGS_SIZE,r1; /* compute base of memory stack */ \
(pUStk) lfetch.fault.excl.nt1 [rKRBS]; \
(pUStk) addl r1=IA64_STK_OFFSET-IA64_PT_REGS_SIZE,r1; /* compute base of memory stack */ \
(pUStk) mov rARBSPSTORE=ar.bspstore; /* save ar.bspstore */ \
;; \
(pUser) mov ar.bspstore=rKRBS; /* switch to kernel RBS */ \
(pKern) addl r1=-IA64_PT_REGS_SIZE,r1; /* if in kernel mode, use sp (r12) */ \
(pUStk) mov ar.bspstore=rKRBS; /* switch to kernel RBS */ \
(pKStk) addl r1=-IA64_PT_REGS_SIZE,r1; /* if in kernel mode, use sp (r12) */ \
;; \
(pUser) mov r18=ar.bsp; \
(pUser) mov ar.rsc=0x3; /* set eager mode, pl 0, little-endian, loadrs=0 */ \
(pUStk) mov r18=ar.bsp; \
(pUStk) mov ar.rsc=0x3; /* set eager mode, pl 0, little-endian, loadrs=0 */ \
#define MINSTATE_END_SAVE_MIN_VIRT \
or r13=r13,r14; /* make `current' a kernel virtual address */ \
bsw.1; /* switch back to bank 1 (must be last in insn group) */ \
;;
......@@ -57,21 +55,21 @@
* go virtual and dont want to destroy the iip or ipsr.
*/
#define MINSTATE_START_SAVE_MIN_PHYS \
(pKern) movl sp=ia64_init_stack+IA64_STK_OFFSET-IA64_PT_REGS_SIZE; \
(pUser) mov ar.rsc=0; /* set enforced lazy mode, pl 0, little-endian, loadrs=0 */ \
(pUser) addl rKRBS=IA64_RBS_OFFSET,r1; /* compute base of register backing store */ \
(pKStk) movl sp=ia64_init_stack+IA64_STK_OFFSET-IA64_PT_REGS_SIZE; \
(pUStk) mov ar.rsc=0; /* set enforced lazy mode, pl 0, little-endian, loadrs=0 */ \
(pUStk) addl rKRBS=IA64_RBS_OFFSET,r1; /* compute base of register backing store */ \
;; \
(pUser) mov rARRNAT=ar.rnat; \
(pKern) dep r1=0,sp,61,3; /* compute physical addr of sp */ \
(pUser) addl r1=IA64_STK_OFFSET-IA64_PT_REGS_SIZE,r1; /* compute base of memory stack */ \
(pUser) mov rARBSPSTORE=ar.bspstore; /* save ar.bspstore */ \
(pUser) dep rKRBS=-1,rKRBS,61,3; /* compute kernel virtual addr of RBS */\
(pUStk) mov rARRNAT=ar.rnat; \
(pKStk) dep r1=0,sp,61,3; /* compute physical addr of sp */ \
(pUStk) addl r1=IA64_STK_OFFSET-IA64_PT_REGS_SIZE,r1; /* compute base of memory stack */ \
(pUStk) mov rARBSPSTORE=ar.bspstore; /* save ar.bspstore */ \
(pUStk) dep rKRBS=-1,rKRBS,61,3; /* compute kernel virtual addr of RBS */\
;; \
(pKern) addl r1=-IA64_PT_REGS_SIZE,r1; /* if in kernel mode, use sp (r12) */ \
(pUser) mov ar.bspstore=rKRBS; /* switch to kernel RBS */ \
(pKStk) addl r1=-IA64_PT_REGS_SIZE,r1; /* if in kernel mode, use sp (r12) */ \
(pUStk) mov ar.bspstore=rKRBS; /* switch to kernel RBS */ \
;; \
(pUser) mov r18=ar.bsp; \
(pUser) mov ar.rsc=0x3; /* set eager mode, pl 0, little-endian, loadrs=0 */ \
(pUStk) mov r18=ar.bsp; \
(pUStk) mov ar.rsc=0x3; /* set eager mode, pl 0, little-endian, loadrs=0 */ \
#define MINSTATE_END_SAVE_MIN_PHYS \
or r12=r12,r14; /* make sp a kernel virtual address */ \
......@@ -79,11 +77,13 @@
;;
#ifdef MINSTATE_VIRT
# define MINSTATE_GET_CURRENT(reg) mov reg=IA64_KR(CURRENT)
# define MINSTATE_START_SAVE_MIN MINSTATE_START_SAVE_MIN_VIRT
# define MINSTATE_END_SAVE_MIN MINSTATE_END_SAVE_MIN_VIRT
#endif
#ifdef MINSTATE_PHYS
# define MINSTATE_GET_CURRENT(reg) mov reg=IA64_KR(CURRENT);; dep reg=0,reg,61,3
# define MINSTATE_START_SAVE_MIN MINSTATE_START_SAVE_MIN_PHYS
# define MINSTATE_END_SAVE_MIN MINSTATE_END_SAVE_MIN_PHYS
#endif
......@@ -110,23 +110,26 @@
* we can pass interruption state as arguments to a handler.
*/
#define DO_SAVE_MIN(COVER,SAVE_IFS,EXTRA) \
mov rARRSC=ar.rsc; \
mov rARPFS=ar.pfs; \
mov rR1=r1; \
mov rARUNAT=ar.unat; \
mov rCRIPSR=cr.ipsr; \
mov rB6=b6; /* rB6 = branch reg 6 */ \
mov rCRIIP=cr.iip; \
mov r1=IA64_KR(CURRENT); /* r1 = current (physical) */ \
COVER; \
;; \
invala; \
extr.u r16=rCRIPSR,32,2; /* extract psr.cpl */ \
;; \
cmp.eq pKern,pUser=r0,r16; /* are we in kernel mode already? (psr.cpl==0) */ \
mov rARRSC=ar.rsc; /* M */ \
mov rARUNAT=ar.unat; /* M */ \
mov rR1=r1; /* A */ \
MINSTATE_GET_CURRENT(r1); /* M (or M;;I) */ \
mov rCRIPSR=cr.ipsr; /* M */ \
mov rARPFS=ar.pfs; /* I */ \
mov rCRIIP=cr.iip; /* M */ \
mov rB6=b6; /* I */ /* rB6 = branch reg 6 */ \
COVER; /* B;; (or nothing) */ \
;; \
adds r16=IA64_TASK_THREAD_ON_USTACK_OFFSET,r1; \
;; \
ld1 r17=[r16]; /* load current->thread.on_ustack flag */ \
st1 [r16]=r0; /* clear current->thread.on_ustack flag */ \
/* switch from user to kernel RBS: */ \
;; \
invala; /* M */ \
SAVE_IFS; \
cmp.eq pKStk,pUStk=r0,r17; /* are we on the kernel stacks already? */ \
;; \
MINSTATE_START_SAVE_MIN \
add r17=L1_CACHE_BYTES,r1 /* really: biggest cache-line size */ \
;; \
......@@ -138,23 +141,23 @@
;; \
lfetch.fault.excl.nt1 [r17]; \
adds r17=8,r1; /* initialize second base pointer */ \
(pKern) mov r18=r0; /* make sure r18 isn't NaT */ \
(pKStk) mov r18=r0; /* make sure r18 isn't NaT */ \
;; \
st8 [r17]=rCRIIP,16; /* save cr.iip */ \
st8 [r16]=rCRIFS,16; /* save cr.ifs */ \
(pUser) sub r18=r18,rKRBS; /* r18=RSE.ndirty*8 */ \
(pUStk) sub r18=r18,rKRBS; /* r18=RSE.ndirty*8 */ \
;; \
st8 [r17]=rARUNAT,16; /* save ar.unat */ \
st8 [r16]=rARPFS,16; /* save ar.pfs */ \
shl r18=r18,16; /* compute ar.rsc to be used for "loadrs" */ \
;; \
st8 [r17]=rARRSC,16; /* save ar.rsc */ \
(pUser) st8 [r16]=rARRNAT,16; /* save ar.rnat */ \
(pKern) adds r16=16,r16; /* skip over ar_rnat field */ \
(pUStk) st8 [r16]=rARRNAT,16; /* save ar.rnat */ \
(pKStk) adds r16=16,r16; /* skip over ar_rnat field */ \
;; /* avoid RAW on r16 & r17 */ \
(pUser) st8 [r17]=rARBSPSTORE,16; /* save ar.bspstore */ \
(pUStk) st8 [r17]=rARBSPSTORE,16; /* save ar.bspstore */ \
st8 [r16]=rARPR,16; /* save predicates */ \
(pKern) adds r17=16,r17; /* skip over ar_bspstore field */ \
(pKStk) adds r17=16,r17; /* skip over ar_bspstore field */ \
;; \
st8 [r17]=rB6,16; /* save b6 */ \
st8 [r16]=r18,16; /* save ar.rsc value for "loadrs" */ \
......
......@@ -4,7 +4,7 @@
*
* Copyright (C) 1999 Don Dugger <don.dugger@intel.com>
* Copyright (C) 1999 Walt Drummond <drummond@valinux.com>
* Copyright (C) 1999-2001 Hewlett-Packard Co
* Copyright (C) 1999-2001, 2003 Hewlett-Packard Co
* David Mosberger <davidm@hpl.hp.com>
* Stephane Eranian <eranian@hpl.hp.com>
*
......@@ -131,11 +131,11 @@ END(ia64_pal_call_stacked)
* in0 Index of PAL service
* in2 - in3 Remaining PAL arguments
*
* PSR_DB, PSR_LP, PSR_TB, PSR_ID, PSR_DA are never set by the kernel.
* PSR_LP, PSR_TB, PSR_ID, PSR_DA are never set by the kernel.
* So we don't need to clear them.
*/
#define PAL_PSR_BITS_TO_CLEAR \
(IA64_PSR_I | IA64_PSR_IT | IA64_PSR_DT | IA64_PSR_RT | \
(IA64_PSR_I | IA64_PSR_IT | IA64_PSR_DT | IA64_PSR_DB | IA64_PSR_RT | \
IA64_PSR_DD | IA64_PSR_SS | IA64_PSR_RI | IA64_PSR_ED | \
IA64_PSR_DFL | IA64_PSR_DFH)
......@@ -275,7 +275,6 @@ END(ia64_save_scratch_fpregs)
* Inputs:
* in0 Address of stack storage for fp regs
*/
GLOBAL_ENTRY(ia64_load_scratch_fpregs)
alloc r3=ar.pfs,1,0,0,0
add r2=16,in0
......
......@@ -28,7 +28,6 @@
#include <asm/bitops.h>
#include <asm/errno.h>
#include <asm/page.h>
#include <asm/pal.h>
#include <asm/perfmon.h>
#include <asm/processor.h>
#include <asm/signal.h>
......@@ -56,8 +55,8 @@
/*
* Reset register flags
*/
#define PFM_RELOAD_LONG_RESET 1
#define PFM_RELOAD_SHORT_RESET 2
#define PFM_PMD_LONG_RESET 1
#define PFM_PMD_SHORT_RESET 2
/*
* Misc macros and definitions
......@@ -83,8 +82,10 @@
#define PFM_REG_CONFIG (0x4<<4|PFM_REG_IMPL) /* refine configuration */
#define PFM_REG_BUFFER (0x5<<4|PFM_REG_IMPL) /* PMD used as buffer */
#define PMC_IS_LAST(i) (pmu_conf.pmc_desc[i].type & PFM_REG_END)
#define PMD_IS_LAST(i) (pmu_conf.pmd_desc[i].type & PFM_REG_END)
#define PFM_IS_DISABLED() pmu_conf.pfm_is_disabled
#define PFM_IS_DISABLED() pmu_conf.disabled
#define PMC_OVFL_NOTIFY(ctx, i) ((ctx)->ctx_soft_pmds[i].flags & PFM_REGFL_OVFL_NOTIFY)
#define PFM_FL_INHERIT_MASK (PFM_FL_INHERIT_NONE|PFM_FL_INHERIT_ONCE|PFM_FL_INHERIT_ALL)
......@@ -102,7 +103,6 @@
#define PMD_PMD_DEP(i) pmu_conf.pmd_desc[i].dep_pmd[0]
#define PMC_PMD_DEP(i) pmu_conf.pmc_desc[i].dep_pmd[0]
/* k assume unsigned */
#define IBR_IS_IMPL(k) (k<pmu_conf.num_ibrs)
#define DBR_IS_IMPL(k) (k<pmu_conf.num_dbrs)
......@@ -131,6 +131,9 @@
#define PFM_REG_RETFLAG_SET(flags, val) do { flags &= ~PFM_REG_RETFL_MASK; flags |= (val); } while(0)
#define PFM_CPUINFO_CLEAR(v) __get_cpu_var(pfm_syst_info) &= ~(v)
#define PFM_CPUINFO_SET(v) __get_cpu_var(pfm_syst_info) |= (v)
#ifdef CONFIG_SMP
#define cpu_is_online(i) (cpu_online_map & (1UL << i))
#else
......@@ -211,7 +214,7 @@ typedef struct {
u64 reset_pmds[4]; /* which other pmds to reset when this counter overflows */
u64 seed; /* seed for random-number generator */
u64 mask; /* mask for random-number generator */
int flags; /* notify/do not notify */
unsigned int flags; /* notify/do not notify */
} pfm_counter_t;
/*
......@@ -226,7 +229,8 @@ typedef struct {
unsigned int frozen:1; /* pmu must be kept frozen on ctxsw in */
unsigned int protected:1; /* allow access to creator of context only */
unsigned int using_dbreg:1; /* using range restrictions (debug registers) */
unsigned int reserved:24;
unsigned int excl_idle:1; /* exclude idle task in system wide session */
unsigned int reserved:23;
} pfm_context_flags_t;
/*
......@@ -261,7 +265,7 @@ typedef struct pfm_context {
u64 ctx_saved_psr; /* copy of psr used for lazy ctxsw */
unsigned long ctx_saved_cpus_allowed; /* copy of the task cpus_allowed (system wide) */
unsigned long ctx_cpu; /* cpu to which perfmon is applied (system wide) */
unsigned int ctx_cpu; /* CPU used by system wide session */
atomic_t ctx_saving_in_progress; /* flag indicating actual save in progress */
atomic_t ctx_is_busy; /* context accessed by overflow handler */
......@@ -274,6 +278,7 @@ typedef struct pfm_context {
#define ctx_fl_frozen ctx_flags.frozen
#define ctx_fl_protected ctx_flags.protected
#define ctx_fl_using_dbreg ctx_flags.using_dbreg
#define ctx_fl_excl_idle ctx_flags.excl_idle
/*
* global information about all sessions
......@@ -282,10 +287,10 @@ typedef struct pfm_context {
typedef struct {
spinlock_t pfs_lock; /* lock the structure */
unsigned long pfs_task_sessions; /* number of per task sessions */
unsigned long pfs_sys_sessions; /* number of per system wide sessions */
unsigned long pfs_sys_use_dbregs; /* incremented when a system wide session uses debug regs */
unsigned long pfs_ptrace_use_dbregs; /* incremented when a process uses debug regs */
unsigned int pfs_task_sessions; /* number of per task sessions */
unsigned int pfs_sys_sessions; /* number of per system wide sessions */
unsigned int pfs_sys_use_dbregs; /* incremented when a system wide session uses debug regs */
unsigned int pfs_ptrace_use_dbregs; /* incremented when a process uses debug regs */
struct task_struct *pfs_sys_session[NR_CPUS]; /* point to task owning a system-wide session */
} pfm_session_t;
......@@ -313,23 +318,22 @@ typedef struct {
/*
* This structure is initialized at boot time and contains
* a description of the PMU main characteristic as indicated
* by PAL along with a list of inter-registers dependencies and configurations.
* a description of the PMU main characteristics.
*/
typedef struct {
unsigned long pfm_is_disabled; /* indicates if perfmon is working properly */
unsigned long perf_ovfl_val; /* overflow value for generic counters */
unsigned long max_counters; /* upper limit on counter pair (PMC/PMD) */
unsigned long num_pmcs ; /* highest PMC implemented (may have holes) */
unsigned long num_pmds; /* highest PMD implemented (may have holes) */
unsigned long impl_regs[16]; /* buffer used to hold implemented PMC/PMD mask */
unsigned long num_ibrs; /* number of instruction debug registers */
unsigned long num_dbrs; /* number of data debug registers */
pfm_reg_desc_t *pmc_desc; /* detailed PMC register descriptions */
pfm_reg_desc_t *pmd_desc; /* detailed PMD register descriptions */
unsigned int disabled; /* indicates if perfmon is working properly */
unsigned long ovfl_val; /* overflow value for generic counters */
unsigned long impl_pmcs[4]; /* bitmask of implemented PMCS */
unsigned long impl_pmds[4]; /* bitmask of implemented PMDS */
unsigned int num_pmcs; /* number of implemented PMCS */
unsigned int num_pmds; /* number of implemented PMDS */
unsigned int num_ibrs; /* number of implemented IBRS */
unsigned int num_dbrs; /* number of implemented DBRS */
unsigned int num_counters; /* number of PMD/PMC counters */
pfm_reg_desc_t *pmc_desc; /* detailed PMC register dependencies descriptions */
pfm_reg_desc_t *pmd_desc; /* detailed PMD register dependencies descriptions */
} pmu_config_t;
/*
* structure used to pass argument to/from remote CPU
* using IPI to check and possibly save the PMU context on SMP systems.
......@@ -389,13 +393,12 @@ typedef struct {
/*
* perfmon internal variables
*/
static pmu_config_t pmu_conf; /* PMU configuration */
static pfm_session_t pfm_sessions; /* global sessions information */
static struct proc_dir_entry *perfmon_dir; /* for debug only */
static pfm_stats_t pfm_stats[NR_CPUS];
static pfm_intr_handler_desc_t *pfm_alternate_intr_handler;
DEFINE_PER_CPU(int, pfm_syst_wide);
static DEFINE_PER_CPU(int, pfm_dcr_pp);
DEFINE_PER_CPU(unsigned long, pfm_syst_info);
/* sysctl() controls */
static pfm_sysctl_t pfm_sysctl;
......@@ -449,42 +452,62 @@ static void pfm_lazy_save_regs (struct task_struct *ta);
#include "perfmon_generic.h"
#endif
static inline void
pfm_clear_psr_pp(void)
{
__asm__ __volatile__ ("rsm psr.pp;; srlz.i;;"::: "memory");
}
static inline void
pfm_set_psr_pp(void)
{
__asm__ __volatile__ ("ssm psr.pp;; srlz.i;;"::: "memory");
}
static inline void
pfm_clear_psr_up(void)
{
__asm__ __volatile__ ("rum psr.up;; srlz.i;;"::: "memory");
}
static inline void
pfm_set_psr_up(void)
{
__asm__ __volatile__ ("sum psr.up;; srlz.i;;"::: "memory");
}
static inline unsigned long
pfm_get_psr(void)
{
unsigned long tmp;
__asm__ __volatile__ ("mov %0=psr;;": "=r"(tmp) :: "memory");
return tmp;
}
static inline void
pfm_set_psr_l(unsigned long val)
{
__asm__ __volatile__ ("mov psr.l=%0;; srlz.i;;"::"r"(val): "memory");
}
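
For reference, a sketch (assumed context, not part of this patch) of how these helpers are meant to combine; pfm_save_regs() further down follows this pattern:

	unsigned long psr;

	psr = pfm_get_psr();	/* snapshot psr before modifying it */
	pfm_clear_psr_up();	/* stop user-level monitoring (psr.up=0) */
	/* ... save or modify PMC/PMD state here ... */
	pfm_set_psr_l(psr);	/* restore psr.l; the srlz.i in the helper serializes */
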
static inline unsigned long
pfm_read_soft_counter(pfm_context_t *ctx, int i)
{
return ctx->ctx_soft_pmds[i].val + (ia64_get_pmd(i) & pmu_conf.perf_ovfl_val);
return ctx->ctx_soft_pmds[i].val + (ia64_get_pmd(i) & pmu_conf.ovfl_val);
}
static inline void
pfm_write_soft_counter(pfm_context_t *ctx, int i, unsigned long val)
{
ctx->ctx_soft_pmds[i].val = val & ~pmu_conf.perf_ovfl_val;
ctx->ctx_soft_pmds[i].val = val & ~pmu_conf.ovfl_val;
/*
* writing to the unimplemented part is ignored, so we do not need to
* mask off the top part
*/
ia64_set_pmd(i, val & pmu_conf.perf_ovfl_val);
}
/*
* finds the number of PM(C|D) registers given
* the bitvector returned by PAL
*/
static unsigned long __init
find_num_pm_regs(long *buffer)
{
int i=3; /* 4 words/per bitvector */
/* start from the most significant word */
while (i>=0 && buffer[i] == 0 ) i--;
if (i< 0) {
printk(KERN_ERR "perfmon: No bit set in pm_buffer\n");
return 0;
}
return 1+ ia64_fls(buffer[i]) + 64 * i;
ia64_set_pmd(i, val & pmu_conf.ovfl_val);
}
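
The counter virtualization implemented by pfm_read_soft_counter()/pfm_write_soft_counter() splits a 64-bit value between hardware and software: the PMD holds only the low bits (ovfl_val = 2^width - 1), ctx_soft_pmds[].val holds the rest, and every hardware overflow adds 1 + ovfl_val (i.e., 2^width) to the soft part. A stand-alone user-level sketch, with an assumed 32-bit PMD width:

#include <stdio.h>

int main(void)
{
	unsigned long ovfl_val = (1UL << 32) - 1;	/* assumed 32-bit PMD */
	unsigned long value = 0x123456789abUL;
	unsigned long soft, hw;

	/* write: split the 64-bit value (cf. pfm_write_soft_counter) */
	soft = value & ~ovfl_val;
	hw = value & ovfl_val;

	/* a hardware overflow wraps hw to 0; account for it in soft */
	soft += 1 + ovfl_val;
	hw = 0;

	/* read: recombine (cf. pfm_read_soft_counter) */
	printf("virtualized counter=0x%lx\n", soft + (hw & ovfl_val));
	return 0;
}
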
/*
* Generates a unique (per CPU) timestamp
*/
......@@ -875,6 +898,120 @@ pfm_smpl_buffer_alloc(pfm_context_t *ctx, unsigned long *which_pmds, unsigned lo
return -ENOMEM;
}
static int
pfm_reserve_session(struct task_struct *task, int is_syswide, unsigned long cpu_mask)
{
unsigned long m, undo_mask;
unsigned int n, i;
/*
* validity checks on cpu_mask have been done upstream
*/
LOCK_PFS();
if (is_syswide) {
/*
* cannot mix system wide and per-task sessions
*/
if (pfm_sessions.pfs_task_sessions > 0UL) {
DBprintk(("system wide not possible, %u conflicting task_sessions\n",
pfm_sessions.pfs_task_sessions));
goto abort;
}
m = cpu_mask; undo_mask = 0UL; n = 0;
DBprintk(("cpu_mask=0x%lx\n", cpu_mask));
for(i=0; m; i++, m>>=1) {
if ((m & 0x1) == 0UL) continue;
if (pfm_sessions.pfs_sys_session[i]) goto undo;
DBprintk(("reserving CPU%d currently on CPU%d\n", i, smp_processor_id()));
pfm_sessions.pfs_sys_session[i] = task;
undo_mask |= 1UL << i;
n++;
}
pfm_sessions.pfs_sys_sessions += n;
} else {
if (pfm_sessions.pfs_sys_sessions) goto abort;
pfm_sessions.pfs_task_sessions++;
}
DBprintk(("task_sessions=%u sys_session[%d]=%d",
pfm_sessions.pfs_task_sessions,
smp_processor_id(), pfm_sessions.pfs_sys_session[smp_processor_id()] ? 1 : 0));
UNLOCK_PFS();
return 0;
undo:
DBprintk(("system wide not possible, conflicting session [%d] on CPU%d\n",
pfm_sessions.pfs_sys_session[i]->pid, i));
for(i=0; undo_mask; i++, undo_mask >>=1) {
	if (undo_mask & 0x1) pfm_sessions.pfs_sys_session[i] = NULL;
}
abort:
UNLOCK_PFS();
return -EBUSY;
}
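
The claim-then-rollback pattern above, shown stand-alone (toy names, not kernel code): undo_mask records exactly the slots claimed so far, so a conflict releases those and nothing else.

#define NSLOTS 64

static void *slot[NSLOTS];	/* stand-in for pfs_sys_session[] */

static int reserve(unsigned long cpu_mask, void *owner)
{
	unsigned long m = cpu_mask, undo_mask = 0UL;
	unsigned int i;

	for (i = 0; m; i++, m >>= 1) {
		if ((m & 0x1) == 0) continue;
		if (slot[i]) goto undo;		/* conflict: roll back */
		slot[i] = owner;
		undo_mask |= 1UL << i;		/* remember what we claimed */
	}
	return 0;
undo:
	for (i = 0; undo_mask; i++, undo_mask >>= 1)
		if (undo_mask & 0x1) slot[i] = NULL;
	return -1;
}

int main(void)
{
	int a, b;

	/* first call claims slots 0 and 2; second call conflicts on
	 * slot 2 and rolls back the slot 1 it had already claimed */
	return reserve(0x5UL, &a) || !reserve(0x6UL, &b);
}
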
static int
pfm_unreserve_session(struct task_struct *task, int is_syswide, unsigned long cpu_mask)
{
pfm_context_t *ctx;
unsigned long m;
unsigned int n, i;
ctx = task ? task->thread.pfm_context : NULL;
/*
* validity checks on cpu_mask have been done upstream
*/
LOCK_PFS();
DBprintk(("[%d] sys_sessions=%u task_sessions=%u dbregs=%u syswide=%d cpu_mask=0x%lx\n",
task->pid,
pfm_sessions.pfs_sys_sessions,
pfm_sessions.pfs_task_sessions,
pfm_sessions.pfs_sys_use_dbregs,
is_syswide,
cpu_mask));
if (is_syswide) {
m = cpu_mask; n = 0;
for(i=0; m; i++, m>>=1) {
if ((m & 0x1) == 0UL) continue;
pfm_sessions.pfs_sys_session[i] = NULL;
n++;
}
/*
* would not work with perfmon+more than one bit in cpu_mask
*/
if (ctx && ctx->ctx_fl_using_dbreg) {
if (pfm_sessions.pfs_sys_use_dbregs == 0) {
printk("perfmon: invalid release for [%d] sys_use_dbregs=0\n", task->pid);
} else {
pfm_sessions.pfs_sys_use_dbregs--;
}
}
pfm_sessions.pfs_sys_sessions -= n;
DBprintk(("CPU%d sys_sessions=%u\n",
smp_processor_id(), pfm_sessions.pfs_sys_sessions));
} else {
pfm_sessions.pfs_task_sessions--;
DBprintk(("[%d] task_sessions=%u\n",
task->pid, pfm_sessions.pfs_task_sessions));
}
UNLOCK_PFS();
return 0;
}
/*
* XXX: do something better here
*/
......@@ -891,6 +1028,7 @@ pfm_bad_permissions(struct task_struct *task)
static int
pfx_is_sane(struct task_struct *task, pfarg_context_t *pfx)
{
unsigned long smpl_pmds = pfx->ctx_smpl_regs[0];
int ctx_flags;
int cpu;
......@@ -957,6 +1095,11 @@ pfx_is_sane(struct task_struct *task, pfarg_context_t *pfx)
}
#endif
}
/* verify validity of smpl_regs */
if ((smpl_pmds & pmu_conf.impl_pmds[0]) != smpl_pmds) {
DBprintk(("invalid smpl_regs 0x%lx\n", smpl_pmds));
return -EINVAL;
}
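
The subset test used for smpl_regs here (and for reset_pmds in pfm_write_pmcs() further down) is the usual mask idiom; a stand-alone check with toy values:

#include <assert.h>

int main(void)
{
	unsigned long impl_pmds = 0xffUL;	/* pretend only pmd0-7 exist */
	unsigned long smpl_pmds = 0x103UL;	/* asks for pmd8 as well */

	/* (x & impl) != x exactly when x has a bit outside impl */
	assert((smpl_pmds & impl_pmds) != smpl_pmds);	/* rejected */
	smpl_pmds = 0x3UL;
	assert((smpl_pmds & impl_pmds) == smpl_pmds);	/* accepted */
	return 0;
}
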
/* probably more to add here */
return 0;
......@@ -968,7 +1111,7 @@ pfm_context_create(struct task_struct *task, pfm_context_t *ctx, void *req, int
{
pfarg_context_t tmp;
void *uaddr = NULL;
int ret, cpu = 0;
int ret;
int ctx_flags;
pid_t notify_pid;
......@@ -987,40 +1130,8 @@ pfm_context_create(struct task_struct *task, pfm_context_t *ctx, void *req, int
ctx_flags = tmp.ctx_flags;
ret = -EBUSY;
LOCK_PFS();
if (ctx_flags & PFM_FL_SYSTEM_WIDE) {
/* at this point, we know there is at least one bit set */
cpu = ffz(~tmp.ctx_cpu_mask);
DBprintk(("requesting CPU%d currently on CPU%d\n",cpu, smp_processor_id()));
if (pfm_sessions.pfs_task_sessions > 0) {
DBprintk(("system wide not possible, task_sessions=%ld\n", pfm_sessions.pfs_task_sessions));
goto abort;
}
if (pfm_sessions.pfs_sys_session[cpu]) {
DBprintk(("system wide not possible, conflicting session [%d] on CPU%d\n",pfm_sessions.pfs_sys_session[cpu]->pid, cpu));
goto abort;
}
pfm_sessions.pfs_sys_session[cpu] = task;
/*
* count the number of system wide sessions
*/
pfm_sessions.pfs_sys_sessions++;
} else if (pfm_sessions.pfs_sys_sessions == 0) {
pfm_sessions.pfs_task_sessions++;
} else {
/* no per-process monitoring while there is a system wide session */
goto abort;
}
UNLOCK_PFS();
ret = pfm_reserve_session(task, ctx_flags & PFM_FL_SYSTEM_WIDE, tmp.ctx_cpu_mask);
if (ret) goto abort;
ret = -ENOMEM;
......@@ -1103,6 +1214,7 @@ pfm_context_create(struct task_struct *task, pfm_context_t *ctx, void *req, int
ctx->ctx_fl_inherit = ctx_flags & PFM_FL_INHERIT_MASK;
ctx->ctx_fl_block = (ctx_flags & PFM_FL_NOTIFY_BLOCK) ? 1 : 0;
ctx->ctx_fl_system = (ctx_flags & PFM_FL_SYSTEM_WIDE) ? 1: 0;
ctx->ctx_fl_excl_idle = (ctx_flags & PFM_FL_EXCL_IDLE) ? 1: 0;
ctx->ctx_fl_frozen = 0;
/*
* setting this flag to 0 here means, that the creator or the task that the
......@@ -1113,7 +1225,7 @@ pfm_context_create(struct task_struct *task, pfm_context_t *ctx, void *req, int
ctx->ctx_fl_protected = 0;
/* for system wide mode only (only 1 bit set) */
ctx->ctx_cpu = cpu;
ctx->ctx_cpu = ffz(~tmp.ctx_cpu_mask);
atomic_set(&ctx->ctx_last_cpu,-1); /* SMP only, means no CPU */
......@@ -1131,9 +1243,9 @@ pfm_context_create(struct task_struct *task, pfm_context_t *ctx, void *req, int
DBprintk(("context=%p, pid=%d notify_task=%p\n",
(void *)ctx, task->pid, ctx->ctx_notify_task));
DBprintk(("context=%p, pid=%d flags=0x%x inherit=%d block=%d system=%d\n",
DBprintk(("context=%p, pid=%d flags=0x%x inherit=%d block=%d system=%d excl_idle=%d\n",
(void *)ctx, task->pid, ctx_flags, ctx->ctx_fl_inherit,
ctx->ctx_fl_block, ctx->ctx_fl_system));
ctx->ctx_fl_block, ctx->ctx_fl_system, ctx->ctx_fl_excl_idle));
/*
* when no notification is required, we can make this visible at the last moment
......@@ -1146,8 +1258,8 @@ pfm_context_create(struct task_struct *task, pfm_context_t *ctx, void *req, int
*/
if (ctx->ctx_fl_system) {
ctx->ctx_saved_cpus_allowed = task->cpus_allowed;
set_cpus_allowed(task, 1UL << cpu);
DBprintk(("[%d] rescheduled allowed=0x%lx\n", task->pid,task->cpus_allowed));
set_cpus_allowed(task, tmp.ctx_cpu_mask);
DBprintk(("[%d] rescheduled allowed=0x%lx\n", task->pid, task->cpus_allowed));
}
return 0;
......@@ -1155,20 +1267,8 @@ pfm_context_create(struct task_struct *task, pfm_context_t *ctx, void *req, int
buffer_error:
pfm_context_free(ctx);
error:
/*
* undo session reservation
*/
LOCK_PFS();
if (ctx_flags & PFM_FL_SYSTEM_WIDE) {
pfm_sessions.pfs_sys_session[cpu] = NULL;
pfm_sessions.pfs_sys_sessions--;
} else {
pfm_sessions.pfs_task_sessions--;
}
pfm_unreserve_session(task, ctx_flags & PFM_FL_SYSTEM_WIDE , tmp.ctx_cpu_mask);
abort:
UNLOCK_PFS();
/* make sure we don't leave anything behind */
task->thread.pfm_context = NULL;
......@@ -1200,9 +1300,7 @@ pfm_reset_regs(pfm_context_t *ctx, unsigned long *ovfl_regs, int flag)
unsigned long mask = ovfl_regs[0];
unsigned long reset_others = 0UL;
unsigned long val;
int i, is_long_reset = (flag & PFM_RELOAD_LONG_RESET);
DBprintk(("masks=0x%lx\n", mask));
int i, is_long_reset = (flag == PFM_PMD_LONG_RESET);
/*
* now restore reset value on sampling overflowed counters
......@@ -1213,7 +1311,7 @@ pfm_reset_regs(pfm_context_t *ctx, unsigned long *ovfl_regs, int flag)
val = pfm_new_counter_value(ctx->ctx_soft_pmds + i, is_long_reset);
reset_others |= ctx->ctx_soft_pmds[i].reset_pmds[0];
DBprintk(("[%d] %s reset soft_pmd[%d]=%lx\n", current->pid,
DBprintk_ovfl(("[%d] %s reset soft_pmd[%d]=%lx\n", current->pid,
is_long_reset ? "long" : "short", i, val));
/* upper part is ignored on rval */
......@@ -1235,7 +1333,7 @@ pfm_reset_regs(pfm_context_t *ctx, unsigned long *ovfl_regs, int flag)
} else {
ia64_set_pmd(i, val);
}
DBprintk(("[%d] %s reset_others pmd[%d]=%lx\n", current->pid,
DBprintk_ovfl(("[%d] %s reset_others pmd[%d]=%lx\n", current->pid,
is_long_reset ? "long" : "short", i, val));
}
ia64_srlz_d();
......@@ -1246,7 +1344,7 @@ pfm_write_pmcs(struct task_struct *task, pfm_context_t *ctx, void *arg, int coun
{
struct thread_struct *th = &task->thread;
pfarg_reg_t tmp, *req = (pfarg_reg_t *)arg;
unsigned long value;
unsigned long value, reset_pmds;
unsigned int cnum, reg_flags, flags;
int i;
int ret = -EINVAL;
......@@ -1265,6 +1363,7 @@ pfm_write_pmcs(struct task_struct *task, pfm_context_t *ctx, void *arg, int coun
cnum = tmp.reg_num;
reg_flags = tmp.reg_flags;
value = tmp.reg_value;
reset_pmds = tmp.reg_reset_pmds[0];
flags = 0;
/*
......@@ -1283,6 +1382,8 @@ pfm_write_pmcs(struct task_struct *task, pfm_context_t *ctx, void *arg, int coun
* any other configuration is rejected.
*/
if (PMC_IS_MONITOR(cnum) || PMC_IS_COUNTING(cnum)) {
DBprintk(("pmc[%u].pm=%ld\n", cnum, PMC_PM(cnum, value)));
if (ctx->ctx_fl_system ^ PMC_PM(cnum, value)) {
DBprintk(("pmc_pm=%ld fl_system=%d\n", PMC_PM(cnum, value), ctx->ctx_fl_system));
goto error;
......@@ -1310,6 +1411,11 @@ pfm_write_pmcs(struct task_struct *task, pfm_context_t *ctx, void *arg, int coun
if (reg_flags & PFM_REGFL_RANDOM) flags |= PFM_REGFL_RANDOM;
/* verify validity of reset_pmds */
if ((reset_pmds & pmu_conf.impl_pmds[0]) != reset_pmds) {
DBprintk(("invalid reset_pmds 0x%lx for pmc%u\n", reset_pmds, cnum));
goto error;
}
} else if (reg_flags & (PFM_REGFL_OVFL_NOTIFY|PFM_REGFL_RANDOM)) {
DBprintk(("cannot set ovfl_notify or random on pmc%u\n", cnum));
goto error;
......@@ -1348,13 +1454,10 @@ pfm_write_pmcs(struct task_struct *task, pfm_context_t *ctx, void *arg, int coun
ctx->ctx_soft_pmds[cnum].flags = flags;
if (PMC_IS_COUNTING(cnum)) {
/*
* copy reset vector
*/
ctx->ctx_soft_pmds[cnum].reset_pmds[0] = tmp.reg_reset_pmds[0];
ctx->ctx_soft_pmds[cnum].reset_pmds[1] = tmp.reg_reset_pmds[1];
ctx->ctx_soft_pmds[cnum].reset_pmds[2] = tmp.reg_reset_pmds[2];
ctx->ctx_soft_pmds[cnum].reset_pmds[3] = tmp.reg_reset_pmds[3];
ctx->ctx_soft_pmds[cnum].reset_pmds[0] = reset_pmds;
/* mark all PMDS to be accessed as used */
CTX_USED_PMD(ctx, reset_pmds);
}
/*
......@@ -1397,7 +1500,7 @@ pfm_write_pmds(struct task_struct *task, pfm_context_t *ctx, void *arg, int coun
unsigned long value, hw_value;
unsigned int cnum;
int i;
int ret;
int ret = 0;
/* we don't quite support this right now */
if (task != current) return -EINVAL;
......@@ -1448,9 +1551,9 @@ pfm_write_pmds(struct task_struct *task, pfm_context_t *ctx, void *arg, int coun
/* update virtualized (64bits) counter */
if (PMD_IS_COUNTING(cnum)) {
ctx->ctx_soft_pmds[cnum].lval = value;
ctx->ctx_soft_pmds[cnum].val = value & ~pmu_conf.perf_ovfl_val;
ctx->ctx_soft_pmds[cnum].val = value & ~pmu_conf.ovfl_val;
hw_value = value & pmu_conf.perf_ovfl_val;
hw_value = value & pmu_conf.ovfl_val;
ctx->ctx_soft_pmds[cnum].long_reset = tmp.reg_long_reset;
ctx->ctx_soft_pmds[cnum].short_reset = tmp.reg_short_reset;
......@@ -1478,7 +1581,7 @@ pfm_write_pmds(struct task_struct *task, pfm_context_t *ctx, void *arg, int coun
ctx->ctx_soft_pmds[cnum].val,
ctx->ctx_soft_pmds[cnum].short_reset,
ctx->ctx_soft_pmds[cnum].long_reset,
ia64_get_pmd(cnum) & pmu_conf.perf_ovfl_val,
ia64_get_pmd(cnum) & pmu_conf.ovfl_val,
PMC_OVFL_NOTIFY(ctx, cnum) ? 'Y':'N',
ctx->ctx_used_pmds[0],
ctx->ctx_soft_pmds[cnum].reset_pmds[0]));
......@@ -1504,15 +1607,18 @@ pfm_write_pmds(struct task_struct *task, pfm_context_t *ctx, void *arg, int coun
return ret;
}
static int
pfm_read_pmds(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, struct pt_regs *regs)
{
struct thread_struct *th = &task->thread;
unsigned long val = 0UL;
unsigned long val, lval;
pfarg_reg_t *req = (pfarg_reg_t *)arg;
unsigned int cnum, reg_flags = 0;
int i, ret = -EINVAL;
int i, ret = 0;
#if __GNUC__ < 3
int foo;
#endif
if (!CTX_IS_ENABLED(ctx)) return -EINVAL;
......@@ -1528,9 +1634,16 @@ pfm_read_pmds(struct task_struct *task, pfm_context_t *ctx, void *arg, int count
DBprintk(("ctx_last_cpu=%d for [%d]\n", atomic_read(&ctx->ctx_last_cpu), task->pid));
for (i = 0; i < count; i++, req++) {
#if __GNUC__ < 3
foo = __get_user(cnum, &req->reg_num);
if (foo) return -EFAULT;
foo = __get_user(reg_flags, &req->reg_flags);
if (foo) return -EFAULT;
#else
if (__get_user(cnum, &req->reg_num)) return -EFAULT;
if (__get_user(reg_flags, &req->reg_flags)) return -EFAULT;
#endif
lval = 0UL;
if (!PMD_IS_IMPL(cnum)) goto abort_mission;
/*
......@@ -1578,9 +1691,10 @@ pfm_read_pmds(struct task_struct *task, pfm_context_t *ctx, void *arg, int count
/*
* XXX: need to check for overflow
*/
val &= pmu_conf.perf_ovfl_val;
val &= pmu_conf.ovfl_val;
val += ctx->ctx_soft_pmds[cnum].val;
lval = ctx->ctx_soft_pmds[cnum].lval;
}
/*
......@@ -1592,10 +1706,11 @@ pfm_read_pmds(struct task_struct *task, pfm_context_t *ctx, void *arg, int count
val = v;
}
PFM_REG_RETFLAG_SET(reg_flags, 0);
PFM_REG_RETFLAG_SET(reg_flags, ret);
DBprintk(("read pmd[%u] ret=%d value=0x%lx pmc=0x%lx\n",
cnum, ret, val, ia64_get_pmc(cnum)));
/*
* update register return value, abort all if problem during copy.
* we only modify the reg_flags field. no check mode is fine because
......@@ -1604,16 +1719,19 @@ pfm_read_pmds(struct task_struct *task, pfm_context_t *ctx, void *arg, int count
if (__put_user(cnum, &req->reg_num)) return -EFAULT;
if (__put_user(val, &req->reg_value)) return -EFAULT;
if (__put_user(reg_flags, &req->reg_flags)) return -EFAULT;
if (__put_user(lval, &req->reg_last_reset_value)) return -EFAULT;
}
return 0;
abort_mission:
PFM_REG_RETFLAG_SET(reg_flags, PFM_REG_RETFL_EINVAL);
/*
* XXX: if this fails, we stick with the original failure, flag not updated!
*/
__put_user(reg_flags, &req->reg_flags);
if (__put_user(reg_flags, &req->reg_flags)) ret = -EFAULT;
return ret;
return -EINVAL;
}
#ifdef PFM_PMU_USES_DBR
......@@ -1655,7 +1773,7 @@ pfm_use_debug_registers(struct task_struct *task)
else
pfm_sessions.pfs_ptrace_use_dbregs++;
DBprintk(("ptrace_use_dbregs=%lu sys_use_dbregs=%lu by [%d] ret = %d\n",
DBprintk(("ptrace_use_dbregs=%u sys_use_dbregs=%u by [%d] ret = %d\n",
pfm_sessions.pfs_ptrace_use_dbregs,
pfm_sessions.pfs_sys_use_dbregs,
task->pid, ret));
......@@ -1673,7 +1791,6 @@ pfm_use_debug_registers(struct task_struct *task)
* performance monitoring, so we only decrement the number
* of "ptraced" debug register users to keep the count up to date
*/
int
pfm_release_debug_registers(struct task_struct *task)
{
......@@ -1702,6 +1819,7 @@ pfm_use_debug_registers(struct task_struct *task)
{
return 0;
}
int
pfm_release_debug_registers(struct task_struct *task)
{
......@@ -1721,9 +1839,12 @@ pfm_restart(struct task_struct *task, pfm_context_t *ctx, void *arg, int count,
if (!CTX_IS_ENABLED(ctx)) return -EINVAL;
if (task == current) {
DBprintk(("restarting self %d frozen=%d \n", current->pid, ctx->ctx_fl_frozen));
DBprintk(("restarting self %d frozen=%d ovfl_regs=0x%lx\n",
task->pid,
ctx->ctx_fl_frozen,
ctx->ctx_ovfl_regs[0]));
pfm_reset_regs(ctx, ctx->ctx_ovfl_regs, PFM_RELOAD_LONG_RESET);
pfm_reset_regs(ctx, ctx->ctx_ovfl_regs, PFM_PMD_LONG_RESET);
ctx->ctx_ovfl_regs[0] = 0UL;
......@@ -1806,18 +1927,18 @@ pfm_stop(struct task_struct *task, pfm_context_t *ctx, void *arg, int count,
ia64_set_dcr(ia64_get_dcr() & ~IA64_DCR_PP);
/* stop monitoring */
__asm__ __volatile__ ("rsm psr.pp;;"::: "memory");
pfm_clear_psr_pp();
ia64_srlz_i();
__get_cpu_var(pfm_dcr_pp) = 0;
PFM_CPUINFO_CLEAR(PFM_CPUINFO_DCR_PP);
ia64_psr(regs)->pp = 0;
} else {
/* stop monitoring */
__asm__ __volatile__ ("rum psr.up;;"::: "memory");
pfm_clear_psr_up();
ia64_srlz_i();
......@@ -1979,14 +2100,9 @@ pfm_write_ibr_dbr(int mode, struct task_struct *task, void *arg, int count, stru
int i, ret = 0;
/*
* for range restriction: psr.db must be cleared or the
* the PMU will ignore the debug registers.
*
* XXX: may need more in system wide mode,
* no task can have this bit set?
* we do not need to check for ipsr.db because we do clear ibr.x, dbr.r, and dbr.w
* ensuring that no real breakpoint can be installed via this call.
*/
if (ia64_psr(regs)->db == 1) return -EINVAL;
first_time = ctx->ctx_fl_using_dbreg == 0;
......@@ -2056,7 +2172,6 @@ pfm_write_ibr_dbr(int mode, struct task_struct *task, void *arg, int count, stru
*/
for (i = 0; i < count; i++, req++) {
if (__copy_from_user(&tmp, req, sizeof(tmp))) goto abort_mission;
rnum = tmp.dbreg_num;
......@@ -2145,7 +2260,7 @@ pfm_write_ibr_dbr(int mode, struct task_struct *task, void *arg, int count, stru
* XXX: for now we can only come here on EINVAL
*/
PFM_REG_RETFLAG_SET(tmp.dbreg_flags, PFM_REG_RETFL_EINVAL);
__put_user(tmp.dbreg_flags, &req->dbreg_flags);
if (__put_user(tmp.dbreg_flags, &req->dbreg_flags)) ret = -EFAULT;
}
return ret;
}
......@@ -2215,13 +2330,13 @@ pfm_start(struct task_struct *task, pfm_context_t *ctx, void *arg, int count,
if (ctx->ctx_fl_system) {
__get_cpu_var(pfm_dcr_pp) = 1;
PFM_CPUINFO_SET(PFM_CPUINFO_DCR_PP);
/* set user level psr.pp */
ia64_psr(regs)->pp = 1;
/* start monitoring at kernel level */
__asm__ __volatile__ ("ssm psr.pp;;"::: "memory");
pfm_set_psr_pp();
/* enable dcr pp */
ia64_set_dcr(ia64_get_dcr()|IA64_DCR_PP);
......@@ -2237,7 +2352,7 @@ pfm_start(struct task_struct *task, pfm_context_t *ctx, void *arg, int count,
ia64_psr(regs)->up = 1;
/* start monitoring at kernel level */
__asm__ __volatile__ ("sum psr.up;;"::: "memory");
pfm_set_psr_up();
ia64_srlz_i();
}
......@@ -2264,11 +2379,12 @@ pfm_enable(struct task_struct *task, pfm_context_t *ctx, void *arg, int count,
ia64_psr(regs)->up = 0; /* just to make sure! */
/* make sure monitoring is stopped */
__asm__ __volatile__ ("rsm psr.pp;;"::: "memory");
pfm_clear_psr_pp();
ia64_srlz_i();
__get_cpu_var(pfm_dcr_pp) = 0;
__get_cpu_var(pfm_syst_wide) = 1;
PFM_CPUINFO_CLEAR(PFM_CPUINFO_DCR_PP);
PFM_CPUINFO_SET(PFM_CPUINFO_SYST_WIDE);
if (ctx->ctx_fl_excl_idle) PFM_CPUINFO_SET(PFM_CPUINFO_EXCL_IDLE);
} else {
/*
* needed in case the task was a passive task during
......@@ -2279,7 +2395,7 @@ pfm_enable(struct task_struct *task, pfm_context_t *ctx, void *arg, int count,
ia64_psr(regs)->up = 0;
/* make sure monitoring is stopped */
__asm__ __volatile__ ("rum psr.up;;"::: "memory");
pfm_clear_psr_up();
ia64_srlz_i();
DBprintk(("clearing psr.sp for [%d]\n", current->pid));
......@@ -2331,6 +2447,7 @@ pfm_get_pmc_reset(struct task_struct *task, pfm_context_t *ctx, void *arg, int c
abort_mission:
PFM_REG_RETFLAG_SET(tmp.reg_flags, PFM_REG_RETFL_EINVAL);
if (__copy_to_user(req, &tmp, sizeof(tmp))) ret = -EFAULT;
return ret;
}
......@@ -2532,7 +2649,7 @@ pfm_ovfl_block_reset(void)
* use the local reference
*/
pfm_reset_regs(ctx, ctx->ctx_ovfl_regs, PFM_RELOAD_LONG_RESET);
pfm_reset_regs(ctx, ctx->ctx_ovfl_regs, PFM_PMD_LONG_RESET);
ctx->ctx_ovfl_regs[0] = 0UL;
......@@ -2591,19 +2708,11 @@ pfm_record_sample(struct task_struct *task, pfm_context_t *ctx, unsigned long ov
h->pid = current->pid;
h->cpu = smp_processor_id();
h->last_reset_value = ovfl_mask ? ctx->ctx_soft_pmds[ffz(~ovfl_mask)].lval : 0UL;
/*
* where did the fault happen
*/
h->ip = regs ? regs->cr_iip | ((regs->cr_ipsr >> 41) & 0x3): 0x0UL;
/*
* which registers overflowed
*/
h->regs = ovfl_mask;
h->regs = ovfl_mask; /* which registers overflowed */
/* guaranteed to monotonically increase on each cpu */
h->stamp = pfm_get_stamp();
h->period = 0UL; /* not yet used */
/* position for first pmd */
e = (unsigned long *)(h+1);
......@@ -2724,7 +2833,7 @@ pfm_overflow_handler(struct task_struct *task, pfm_context_t *ctx, u64 pmc0, str
* pfm_read_pmds().
*/
old_val = ctx->ctx_soft_pmds[i].val;
ctx->ctx_soft_pmds[i].val += 1 + pmu_conf.perf_ovfl_val;
ctx->ctx_soft_pmds[i].val += 1 + pmu_conf.ovfl_val;
/*
* check for overflow condition
......@@ -2739,9 +2848,7 @@ pfm_overflow_handler(struct task_struct *task, pfm_context_t *ctx, u64 pmc0, str
}
DBprintk_ovfl(("soft_pmd[%d].val=0x%lx old_val=0x%lx pmd=0x%lx ovfl_pmds=0x%lx ovfl_notify=0x%lx\n",
i, ctx->ctx_soft_pmds[i].val, old_val,
ia64_get_pmd(i) & pmu_conf.perf_ovfl_val, ovfl_pmds, ovfl_notify));
ia64_get_pmd(i) & pmu_conf.ovfl_val, ovfl_pmds, ovfl_notify));
}
/*
......@@ -2776,7 +2883,7 @@ pfm_overflow_handler(struct task_struct *task, pfm_context_t *ctx, u64 pmc0, str
*/
if (ovfl_notify == 0UL) {
if (ovfl_pmds)
pfm_reset_regs(ctx, &ovfl_pmds, PFM_RELOAD_SHORT_RESET);
pfm_reset_regs(ctx, &ovfl_pmds, PFM_PMD_SHORT_RESET);
return 0x0;
}
......@@ -2924,7 +3031,7 @@ pfm_overflow_handler(struct task_struct *task, pfm_context_t *ctx, u64 pmc0, str
}
static void
perfmon_interrupt (int irq, void *arg, struct pt_regs *regs)
pfm_interrupt_handler(int irq, void *arg, struct pt_regs *regs)
{
u64 pmc0;
struct task_struct *task;
......@@ -2932,6 +3039,14 @@ perfmon_interrupt (int irq, void *arg, struct pt_regs *regs)
pfm_stats[smp_processor_id()].pfm_ovfl_intr_count++;
/*
* if an alternate handler is registered, just bypass the default one
*/
if (pfm_alternate_intr_handler) {
(*pfm_alternate_intr_handler->handler)(irq, arg, regs);
return;
}
/*
* srlz.d done before arriving here
*
......@@ -2994,14 +3109,13 @@ perfmon_interrupt (int irq, void *arg, struct pt_regs *regs)
/* for debug only */
static int
perfmon_proc_info(char *page)
pfm_proc_info(char *page)
{
char *p = page;
int i;
p += sprintf(p, "enabled : %s\n", pmu_conf.pfm_is_disabled ? "No": "Yes");
p += sprintf(p, "fastctxsw : %s\n", pfm_sysctl.fastctxsw > 0 ? "Yes": "No");
p += sprintf(p, "ovfl_mask : 0x%lx\n", pmu_conf.perf_ovfl_val);
p += sprintf(p, "ovfl_mask : 0x%lx\n", pmu_conf.ovfl_val);
for(i=0; i < NR_CPUS; i++) {
if (cpu_is_online(i) == 0) continue;
......@@ -3009,16 +3123,18 @@ perfmon_proc_info(char *page)
p += sprintf(p, "CPU%-2d spurious intrs : %lu\n", i, pfm_stats[i].pfm_spurious_ovfl_intr_count);
p += sprintf(p, "CPU%-2d recorded samples : %lu\n", i, pfm_stats[i].pfm_recorded_samples_count);
p += sprintf(p, "CPU%-2d smpl buffer full : %lu\n", i, pfm_stats[i].pfm_full_smpl_buffer_count);
p += sprintf(p, "CPU%-2d syst_wide : %d\n", i, per_cpu(pfm_syst_info, i) & PFM_CPUINFO_SYST_WIDE ? 1 : 0);
p += sprintf(p, "CPU%-2d dcr_pp : %d\n", i, per_cpu(pfm_syst_info, i) & PFM_CPUINFO_DCR_PP ? 1 : 0);
p += sprintf(p, "CPU%-2d exclude idle : %d\n", i, per_cpu(pfm_syst_info, i) & PFM_CPUINFO_EXCL_IDLE ? 1 : 0);
p += sprintf(p, "CPU%-2d owner : %d\n", i, pmu_owners[i].owner ? pmu_owners[i].owner->pid: -1);
p += sprintf(p, "CPU%-2d syst_wide : %d\n", i, per_cpu(pfm_syst_wide, i));
p += sprintf(p, "CPU%-2d dcr_pp : %d\n", i, per_cpu(pfm_dcr_pp, i));
}
LOCK_PFS();
p += sprintf(p, "proc_sessions : %lu\n"
"sys_sessions : %lu\n"
"sys_use_dbregs : %lu\n"
"ptrace_use_dbregs : %lu\n",
p += sprintf(p, "proc_sessions : %u\n"
"sys_sessions : %u\n"
"sys_use_dbregs : %u\n"
"ptrace_use_dbregs : %u\n",
pfm_sessions.pfs_task_sessions,
pfm_sessions.pfs_sys_sessions,
pfm_sessions.pfs_sys_use_dbregs,
......@@ -3033,7 +3149,7 @@ perfmon_proc_info(char *page)
static int
perfmon_read_entry(char *page, char **start, off_t off, int count, int *eof, void *data)
{
int len = perfmon_proc_info(page);
int len = pfm_proc_info(page);
if (len <= off+count) *eof = 1;
......@@ -3046,17 +3162,57 @@ perfmon_read_entry(char *page, char **start, off_t off, int count, int *eof, voi
return len;
}
/*
* we come here as soon as PFM_CPUINFO_SYST_WIDE is set. This happens
* during pfm_enable() hence before pfm_start(). We cannot assume monitoring
* is active or inactive based on mode. We must rely on the value in
* cpu_data(i)->pfm_syst_info
*/
void
pfm_syst_wide_update_task(struct task_struct *task, int mode)
pfm_syst_wide_update_task(struct task_struct *task, unsigned long info, int is_ctxswin)
{
struct pt_regs *regs = (struct pt_regs *)((unsigned long) task + IA64_STK_OFFSET);
struct pt_regs *regs;
unsigned long dcr;
unsigned long dcr_pp;
regs--;
dcr_pp = info & PFM_CPUINFO_DCR_PP ? 1 : 0;
/*
* propagate the value of the dcr_pp bit to the psr
* pid 0 is guaranteed to be the idle task. There is one such task with pid 0
* on every CPU, so we can rely on the pid to identify the idle task.
*/
if ((info & PFM_CPUINFO_EXCL_IDLE) == 0 || task->pid) {
regs = (struct pt_regs *)((unsigned long) task + IA64_STK_OFFSET);
regs--;
ia64_psr(regs)->pp = is_ctxswin ? dcr_pp : 0;
return;
}
/*
* if monitoring has started
*/
ia64_psr(regs)->pp = mode ? __get_cpu_var(pfm_dcr_pp) : 0;
if (dcr_pp) {
dcr = ia64_get_dcr();
/*
* context switching in?
*/
if (is_ctxswin) {
/* mask monitoring for the idle task */
ia64_set_dcr(dcr & ~IA64_DCR_PP);
pfm_clear_psr_pp();
ia64_srlz_i();
return;
}
/*
* context switching out
* restore monitoring for next task
*
* Due to inlining, this odd if-then-else construction generates
* better code.
*/
ia64_set_dcr(dcr |IA64_DCR_PP);
pfm_set_psr_pp();
ia64_srlz_i();
}
}
void
......@@ -3067,11 +3223,10 @@ pfm_save_regs (struct task_struct *task)
ctx = task->thread.pfm_context;
/*
* save current PSR: needed because we modify it
*/
__asm__ __volatile__ ("mov %0=psr;;": "=r"(psr) :: "memory");
psr = pfm_get_psr();
/*
* stop monitoring:
......@@ -3369,7 +3524,7 @@ pfm_load_regs (struct task_struct *task)
*/
mask = pfm_sysctl.fastctxsw || ctx->ctx_fl_protected ? ctx->ctx_used_pmds[0] : ctx->ctx_reload_pmds[0];
for (i=0; mask; i++, mask>>=1) {
if (mask & 0x1) ia64_set_pmd(i, t->pmd[i] & pmu_conf.perf_ovfl_val);
if (mask & 0x1) ia64_set_pmd(i, t->pmd[i] & pmu_conf.ovfl_val);
}
/*
......@@ -3419,7 +3574,7 @@ pfm_reset_pmu(struct task_struct *task)
int i;
if (task != current) {
printk("perfmon: invalid task in ia64_reset_pmu()\n");
printk("perfmon: invalid task in pfm_reset_pmu()\n");
return;
}
......@@ -3428,6 +3583,7 @@ pfm_reset_pmu(struct task_struct *task)
/*
* install reset values for PMC. We skip PMC0 (done above)
* XXX: good up to 64 PMCS
*/
for (i=1; (pmu_conf.pmc_desc[i].type & PFM_REG_END) == 0; i++) {
if ((pmu_conf.pmc_desc[i].type & PFM_REG_IMPL) == 0) continue;
......@@ -3444,7 +3600,7 @@ pfm_reset_pmu(struct task_struct *task)
/*
* clear reset values for PMD.
* XXX: good up to 64 PMDS. Suppose that zero is a valid value.
* XXX: good up to 64 PMDS.
*/
for (i=0; (pmu_conf.pmd_desc[i].type & PFM_REG_END) == 0; i++) {
if ((pmu_conf.pmd_desc[i].type & PFM_REG_IMPL) == 0) continue;
......@@ -3477,13 +3633,13 @@ pfm_reset_pmu(struct task_struct *task)
*
* We never directly restore PMC0 so we do not include it in the mask.
*/
ctx->ctx_reload_pmcs[0] = pmu_conf.impl_regs[0] & ~0x1;
ctx->ctx_reload_pmcs[0] = pmu_conf.impl_pmcs[0] & ~0x1;
/*
* We must include all the PMD in this mask to avoid picking
* up stale value and leak information, especially directly
* at the user level when psr.sp=0
*/
ctx->ctx_reload_pmds[0] = pmu_conf.impl_regs[4];
ctx->ctx_reload_pmds[0] = pmu_conf.impl_pmds[0];
/*
* Keep track of the pmds we want to sample
......@@ -3493,7 +3649,7 @@ pfm_reset_pmu(struct task_struct *task)
*
* We ignore the unimplemented pmds specified by the user
*/
ctx->ctx_used_pmds[0] = ctx->ctx_smpl_regs[0] & pmu_conf.impl_regs[4];
ctx->ctx_used_pmds[0] = ctx->ctx_smpl_regs[0];
ctx->ctx_used_pmcs[0] = 1; /* always save/restore PMC[0] */
/*
......@@ -3547,16 +3703,17 @@ pfm_flush_regs (struct task_struct *task)
ia64_set_dcr(ia64_get_dcr() & ~IA64_DCR_PP);
/* stop monitoring */
__asm__ __volatile__ ("rsm psr.pp;;"::: "memory");
pfm_clear_psr_pp();
ia64_srlz_i();
__get_cpu_var(pfm_syst_wide) = 0;
__get_cpu_var(pfm_dcr_pp) = 0;
PFM_CPUINFO_CLEAR(PFM_CPUINFO_SYST_WIDE);
PFM_CPUINFO_CLEAR(PFM_CPUINFO_DCR_PP);
PFM_CPUINFO_CLEAR(PFM_CPUINFO_EXCL_IDLE);
} else {
/* stop monitoring */
__asm__ __volatile__ ("rum psr.up;;"::: "memory");
pfm_clear_psr_up();
ia64_srlz_i();
......@@ -3622,10 +3779,14 @@ pfm_flush_regs (struct task_struct *task)
val = ia64_get_pmd(i);
if (PMD_IS_COUNTING(i)) {
DBprintk(("[%d] pmd[%d] soft_pmd=0x%lx hw_pmd=0x%lx\n", task->pid, i, ctx->ctx_soft_pmds[i].val, val & pmu_conf.perf_ovfl_val));
DBprintk(("[%d] pmd[%d] soft_pmd=0x%lx hw_pmd=0x%lx\n",
task->pid,
i,
ctx->ctx_soft_pmds[i].val,
val & pmu_conf.ovfl_val));
/* collect latest results */
ctx->ctx_soft_pmds[i].val += val & pmu_conf.perf_ovfl_val;
ctx->ctx_soft_pmds[i].val += val & pmu_conf.ovfl_val;
/*
* now everything is in ctx_soft_pmds[] and we need
......@@ -3638,7 +3799,7 @@ pfm_flush_regs (struct task_struct *task)
* take care of overflow inline
*/
if (pmc0 & (1UL << i)) {
ctx->ctx_soft_pmds[i].val += 1 + pmu_conf.perf_ovfl_val;
ctx->ctx_soft_pmds[i].val += 1 + pmu_conf.ovfl_val;
DBprintk(("[%d] pmd[%d] overflowed soft_pmd=0x%lx\n",
task->pid, i, ctx->ctx_soft_pmds[i].val));
}
......@@ -3771,8 +3932,8 @@ pfm_inherit(struct task_struct *task, struct pt_regs *regs)
m = nctx->ctx_used_pmds[0] >> PMU_FIRST_COUNTER;
for(i = PMU_FIRST_COUNTER ; m ; m>>=1, i++) {
if ((m & 0x1) && pmu_conf.pmd_desc[i].type == PFM_REG_COUNTING) {
nctx->ctx_soft_pmds[i].val = nctx->ctx_soft_pmds[i].lval & ~pmu_conf.perf_ovfl_val;
thread->pmd[i] = nctx->ctx_soft_pmds[i].lval & pmu_conf.perf_ovfl_val;
nctx->ctx_soft_pmds[i].val = nctx->ctx_soft_pmds[i].lval & ~pmu_conf.ovfl_val;
thread->pmd[i] = nctx->ctx_soft_pmds[i].lval & pmu_conf.ovfl_val;
} else {
thread->pmd[i] = 0UL; /* reset to initial state */
}
......@@ -3939,30 +4100,14 @@ pfm_context_exit(struct task_struct *task)
UNLOCK_CTX(ctx);
LOCK_PFS();
pfm_unreserve_session(task, ctx->ctx_fl_system, 1UL << ctx->ctx_cpu);
if (ctx->ctx_fl_system) {
pfm_sessions.pfs_sys_session[ctx->ctx_cpu] = NULL;
pfm_sessions.pfs_sys_sessions--;
DBprintk(("freeing syswide session on CPU%ld\n", ctx->ctx_cpu));
/* update perfmon debug register usage counter */
if (ctx->ctx_fl_using_dbreg) {
if (pfm_sessions.pfs_sys_use_dbregs == 0) {
printk("perfmon: invalid release for [%d] sys_use_dbregs=0\n", task->pid);
} else
pfm_sessions.pfs_sys_use_dbregs--;
}
/*
* remove any CPU pinning
*/
set_cpus_allowed(task, ctx->ctx_saved_cpus_allowed);
} else {
pfm_sessions.pfs_task_sessions--;
}
UNLOCK_PFS();
pfm_context_free(ctx);
/*
......@@ -3990,8 +4135,7 @@ pfm_cleanup_smpl_buf(struct task_struct *task)
* Walk through the list and free the sampling buffer and psb
*/
while (psb) {
DBprintk(("[%d] freeing smpl @%p size %ld\n",
current->pid, psb->psb_hdr, psb->psb_size));
DBprintk(("[%d] freeing smpl @%p size %ld\n", current->pid, psb->psb_hdr, psb->psb_size));
pfm_rvfree(psb->psb_hdr, psb->psb_size);
tmp = psb->psb_next;
......@@ -4095,16 +4239,16 @@ pfm_cleanup_notifiers(struct task_struct *task)
if (ctx && ctx->ctx_notify_task == task) {
DBprintk(("trying for notifier [%d] in [%d]\n", task->pid, p->pid));
/*
* the spinlock is required to take care of a race condition with
* the send_sig_info() call. We must make sure that either the
* send_sig_info() completes using a valid task, or the
* notify_task is cleared before the send_sig_info() can pick up a
* stale value. Note that by the time this function is executed
* the 'task' is already detached from the tasklist. The problem
* is that the notifiers have a direct pointer to it. It is okay
* to send a signal to a task in this stage, it simply will have
* no effect. But it is better than sending to a completely
* destroyed task or worse to a new task using the same
* the spinlock is required to take care of a race condition
* with the send_sig_info() call. We must make sure that
* either the send_sig_info() completes using a valid task,
* or the notify_task is cleared before the send_sig_info()
* can pick up a stale value. Note that by the time this
* function is executed the 'task' is already detached from the
* tasklist. The problem is that the notifiers have a direct
* pointer to it. It is okay to send a signal to a task in this
* stage, it simply will have no effect. But it is better than sending
* to a completely destroyed task or worse to a new task using the same
* task_struct address.
*/
LOCK_CTX(ctx);
......@@ -4123,87 +4267,131 @@ pfm_cleanup_notifiers(struct task_struct *task)
}
static struct irqaction perfmon_irqaction = {
.handler = perfmon_interrupt,
.handler = pfm_interrupt_handler,
.flags = SA_INTERRUPT,
.name = "perfmon"
};
int
pfm_install_alternate_syswide_subsystem(pfm_intr_handler_desc_t *hdl)
{
int ret;
/* some sanity checks */
if (hdl == NULL || hdl->handler == NULL) return -EINVAL;
/* do the easy test first */
if (pfm_alternate_intr_handler) return -EBUSY;
/* reserve our session */
ret = pfm_reserve_session(NULL, 1, cpu_online_map);
if (ret) return ret;
if (pfm_alternate_intr_handler) {
	printk("perfmon: install_alternate, intr_handler not NULL after reserve\n");
	pfm_unreserve_session(NULL, 1, cpu_online_map);
	return -EINVAL;
}
pfm_alternate_intr_handler = hdl;
return 0;
}
int
pfm_remove_alternate_syswide_subsystem(pfm_intr_handler_desc_t *hdl)
{
if (hdl == NULL) return -EINVAL;
/* cannot remove someone else's handler! */
if (pfm_alternate_intr_handler != hdl) return -EINVAL;
pfm_alternate_intr_handler = NULL;
/*
* XXX: assume cpu_online_map has not changed since reservation
*/
pfm_unreserve_session(NULL, 1, cpu_online_map);
return 0;
}
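
A hypothetical client of this interface (names assumed) installs its handler, owns the PMU system-wide on all CPUs, and removes the handler when done:

static void my_handler(int irq, void *arg, struct pt_regs *regs)
{
	/* consume the PMU overflow interrupt */
}

static pfm_intr_handler_desc_t my_desc = { handler: my_handler };

	if (pfm_install_alternate_syswide_subsystem(&my_desc) == 0) {
		/* ... drive the PMU directly ... */
		pfm_remove_alternate_syswide_subsystem(&my_desc);
	}
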
/*
* perfmon initialization routine, called from the initcall() table
*/
int __init
perfmon_init (void)
pfm_init(void)
{
pal_perf_mon_info_u_t pm_info;
s64 status;
unsigned int n, n_counters, i;
pmu_conf.pfm_is_disabled = 1;
pmu_conf.disabled = 1;
printk("perfmon: version %u.%u (sampling format v%u.%u) IRQ %u\n",
printk("perfmon: version %u.%u IRQ %u\n",
PFM_VERSION_MAJ,
PFM_VERSION_MIN,
PFM_SMPL_VERSION_MAJ,
PFM_SMPL_VERSION_MIN,
IA64_PERFMON_VECTOR);
if ((status=ia64_pal_perf_mon_info(pmu_conf.impl_regs, &pm_info)) != 0) {
printk("perfmon: PAL call failed (%ld), perfmon disabled\n", status);
return -1;
}
pmu_conf.perf_ovfl_val = (1UL << pm_info.pal_perf_mon_info_s.width) - 1;
/*
* XXX: use the pfm_*_desc tables instead and simply verify with PAL
* compute the number of implemented PMD/PMC from the
* description tables
*/
pmu_conf.max_counters = pm_info.pal_perf_mon_info_s.generic;
pmu_conf.num_pmcs = find_num_pm_regs(pmu_conf.impl_regs);
pmu_conf.num_pmds = find_num_pm_regs(&pmu_conf.impl_regs[4]);
n = 0;
for (i=0; PMC_IS_LAST(i) == 0; i++) {
if (PMC_IS_IMPL(i) == 0) continue;
pmu_conf.impl_pmcs[i>>6] |= 1UL << (i&63);
n++;
}
pmu_conf.num_pmcs = n;
printk("perfmon: %u bits counters\n", pm_info.pal_perf_mon_info_s.width);
n = 0; n_counters = 0;
for (i=0; PMD_IS_LAST(i) == 0; i++) {
if (PMD_IS_IMPL(i) == 0) continue;
pmu_conf.impl_pmds[i>>6] |= 1UL << (i&63);
n++;
if (PMD_IS_COUNTING(i)) n_counters++;
}
pmu_conf.num_pmds = n;
pmu_conf.num_counters = n_counters;
printk("perfmon: %lu PMC/PMD pairs, %lu PMCs, %lu PMDs\n",
pmu_conf.max_counters, pmu_conf.num_pmcs, pmu_conf.num_pmds);
printk("perfmon: %u PMCs, %u PMDs, %u counters (%lu bits)\n",
pmu_conf.num_pmcs,
pmu_conf.num_pmds,
pmu_conf.num_counters,
ffz(pmu_conf.ovfl_val));
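
The i>>6 / i&63 indexing used to fill impl_pmcs[]/impl_pmds[] is the standard 64-bit bitmap idiom; sketched stand-alone (is_impl() and nregs are assumed helpers):

static unsigned long impl[4];	/* 4 x 64 bits, as in pmu_config_t */

static void fill_impl_mask(unsigned int nregs, int (*is_impl)(unsigned int))
{
	unsigned int i;

	for (i = 0; i < nregs; i++)
		if (is_impl(i))
			impl[i >> 6] |= 1UL << (i & 63);	/* word i/64, bit i%64 */
}
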
/* sanity check */
if (pmu_conf.num_pmds >= IA64_NUM_PMD_REGS || pmu_conf.num_pmcs >= IA64_NUM_PMC_REGS) {
printk(KERN_ERR "perfmon: not enough pmc/pmd, perfmon is DISABLED\n");
return -1; /* no need to continue anyway */
}
if (ia64_pal_debug_info(&pmu_conf.num_ibrs, &pmu_conf.num_dbrs)) {
printk(KERN_WARNING "perfmon: unable to get number of debug registers\n");
pmu_conf.num_ibrs = pmu_conf.num_dbrs = 0;
printk(KERN_ERR "perfmon: not enough pmc/pmd, perfmon disabled\n");
return -1;
}
/* PAL reports the number of pairs */
pmu_conf.num_ibrs <<=1;
pmu_conf.num_dbrs <<=1;
/*
* setup the register configuration descriptions for the CPU
*/
pmu_conf.pmc_desc = pfm_pmc_desc;
pmu_conf.pmd_desc = pfm_pmd_desc;
/* we are all set */
pmu_conf.pfm_is_disabled = 0;
/*
* for now here for debug purposes
*/
perfmon_dir = create_proc_read_entry ("perfmon", 0, 0, perfmon_read_entry, NULL);
if (perfmon_dir == NULL) {
printk(KERN_ERR "perfmon: cannot create /proc entry, perfmon disabled\n");
return -1;
}
/*
* create /proc/perfmon
*/
pfm_sysctl_header = register_sysctl_table(pfm_sysctl_root, 0);
/*
* initialize all our spinlocks
*/
spin_lock_init(&pfm_sessions.pfs_lock);
/* we are all set */
pmu_conf.disabled = 0;
return 0;
}
__initcall(perfmon_init);
__initcall(pfm_init);
void
perfmon_init_percpu (void)
pfm_init_percpu(void)
{
int i;
......@@ -4222,17 +4410,17 @@ perfmon_init_percpu (void)
*
* On McKinley, this code is ineffective until PMC4 is initialized.
*/
for (i=1; (pfm_pmc_desc[i].type & PFM_REG_END) == 0; i++) {
if ((pfm_pmc_desc[i].type & PFM_REG_IMPL) == 0) continue;
ia64_set_pmc(i, pfm_pmc_desc[i].default_value);
for (i=1; PMC_IS_LAST(i) == 0; i++) {
if (PMC_IS_IMPL(i) == 0) continue;
ia64_set_pmc(i, PMC_DFL_VAL(i));
}
for (i=0; (pfm_pmd_desc[i].type & PFM_REG_END) == 0; i++) {
if ((pfm_pmd_desc[i].type & PFM_REG_IMPL) == 0) continue;
for (i=0; PMD_IS_LAST(i) == 0; i++) {
if (PMD_IS_IMPL(i) == 0) continue;
ia64_set_pmd(i, 0UL);
}
ia64_set_pmc(0,1UL);
ia64_srlz_d();
}
#else /* !CONFIG_PERFMON */
......
/*
* This file contains the architected PMU register description tables
* and pmc checker used by perfmon.c.
*
* Copyright (C) 2002 Hewlett Packard Co
* Stephane Eranian <eranian@hpl.hp.com>
*/
#define RDEP(x) (1UL<<(x))
#if defined(CONFIG_ITANIUM) || defined(CONFIG_MCKINLEY)
#error "This file should only be used when CONFIG_ITANIUM and CONFIG_MCKINLEY are not defined"
#if defined(CONFIG_ITANIUM) || defined (CONFIG_MCKINLEY)
#error "This file should not be used when CONFIG_ITANIUM or CONFIG_MCKINLEY is defined"
#endif
static pfm_reg_desc_t pmc_desc[PMU_MAX_PMCS]={
static pfm_reg_desc_t pfm_gen_pmc_desc[PMU_MAX_PMCS]={
/* pmc0 */ { PFM_REG_CONTROL , 0, 0x1UL, -1UL, NULL, NULL, {0UL,0UL, 0UL, 0UL}, {0UL,0UL, 0UL, 0UL}},
/* pmc1 */ { PFM_REG_CONTROL , 0, 0x0UL, -1UL, NULL, NULL, {0UL,0UL, 0UL, 0UL}, {0UL,0UL, 0UL, 0UL}},
/* pmc2 */ { PFM_REG_CONTROL , 0, 0x0UL, -1UL, NULL, NULL, {0UL,0UL, 0UL, 0UL}, {0UL,0UL, 0UL, 0UL}},
......@@ -16,7 +23,7 @@ static pfm_reg_desc_t pmc_desc[PMU_MAX_PMCS]={
{ PFM_REG_END , 0, 0x0UL, -1UL, NULL, NULL, {0,}, {0,}}, /* end marker */
};
static pfm_reg_desc_t pmd_desc[PMU_MAX_PMDS]={
static pfm_reg_desc_t pfm_gen_pmd_desc[PMU_MAX_PMDS]={
/* pmd0 */ { PFM_REG_NOTIMPL , 0, 0x0UL, -1UL, NULL, NULL, {0,}, {0,}},
/* pmd1 */ { PFM_REG_NOTIMPL , 0, 0x0UL, -1UL, NULL, NULL, {0,}, {0,}},
/* pmd2 */ { PFM_REG_NOTIMPL , 0, 0x0UL, -1UL, NULL, NULL, {0,}, {0,}},
......@@ -27,3 +34,15 @@ static pfm_reg_desc_t pmd_desc[PMU_MAX_PMDS]={
/* pmd7 */ { PFM_REG_COUNTING, 0, 0x0UL, -1UL, NULL, NULL, {0UL,0UL, 0UL, 0UL}, {RDEP(7),0UL, 0UL, 0UL}},
{ PFM_REG_END , 0, 0x0UL, -1UL, NULL, NULL, {0,}, {0,}}, /* end marker */
};
/*
* impl_pmcs, impl_pmds are computed at runtime to minimize errors!
*/
static pmu_config_t pmu_conf={
disabled: 1,
ovfl_val: (1UL << 32) - 1,
num_ibrs: 8,
num_dbrs: 8,
pmd_desc: pfm_gen_pmd_desc,
pmc_desc: pfm_gen_pmc_desc
};
......@@ -15,7 +15,7 @@
static int pfm_ita_pmc_check(struct task_struct *task, unsigned int cnum, unsigned long *val, struct pt_regs *regs);
static int pfm_write_ibr_dbr(int mode, struct task_struct *task, void *arg, int count, struct pt_regs *regs);
static pfm_reg_desc_t pfm_pmc_desc[PMU_MAX_PMCS]={
static pfm_reg_desc_t pfm_ita_pmc_desc[PMU_MAX_PMCS]={
/* pmc0 */ { PFM_REG_CONTROL , 0, 0x1UL, -1UL, NULL, NULL, {0UL,0UL, 0UL, 0UL}, {0UL,0UL, 0UL, 0UL}},
/* pmc1 */ { PFM_REG_CONTROL , 0, 0x0UL, -1UL, NULL, NULL, {0UL,0UL, 0UL, 0UL}, {0UL,0UL, 0UL, 0UL}},
/* pmc2 */ { PFM_REG_CONTROL , 0, 0x0UL, -1UL, NULL, NULL, {0UL,0UL, 0UL, 0UL}, {0UL,0UL, 0UL, 0UL}},
......@@ -33,7 +33,7 @@ static pfm_reg_desc_t pfm_pmc_desc[PMU_MAX_PMCS]={
{ PFM_REG_END , 0, 0x0UL, -1UL, NULL, NULL, {0,}, {0,}}, /* end marker */
};
static pfm_reg_desc_t pfm_pmd_desc[PMU_MAX_PMDS]={
static pfm_reg_desc_t pfm_ita_pmd_desc[PMU_MAX_PMDS]={
/* pmd0 */ { PFM_REG_BUFFER , 0, 0UL, -1UL, NULL, NULL, {RDEP(1),0UL, 0UL, 0UL}, {RDEP(10),0UL, 0UL, 0UL}},
/* pmd1 */ { PFM_REG_BUFFER , 0, 0UL, -1UL, NULL, NULL, {RDEP(0),0UL, 0UL, 0UL}, {RDEP(10),0UL, 0UL, 0UL}},
/* pmd2 */ { PFM_REG_BUFFER , 0, 0UL, -1UL, NULL, NULL, {RDEP(3)|RDEP(17),0UL, 0UL, 0UL}, {RDEP(11),0UL, 0UL, 0UL}},
......@@ -55,6 +55,19 @@ static pfm_reg_desc_t pfm_pmd_desc[PMU_MAX_PMDS]={
{ PFM_REG_END , 0, 0UL, -1UL, NULL, NULL, {0,}, {0,}}, /* end marker */
};
/*
* impl_pmcs, impl_pmds are computed at runtime to minimize errors!
*/
static pmu_config_t pmu_conf={
disabled: 1,
ovfl_val: (1UL << 32) - 1,
num_ibrs: 8,
num_dbrs: 8,
pmd_desc: pfm_ita_pmd_desc,
pmc_desc: pfm_ita_pmc_desc
};
static int
pfm_ita_pmc_check(struct task_struct *task, unsigned int cnum, unsigned long *val, struct pt_regs *regs)
{
......
......@@ -16,7 +16,7 @@ static int pfm_mck_reserved(struct task_struct *task, unsigned int cnum, unsigne
static int pfm_mck_pmc_check(struct task_struct *task, unsigned int cnum, unsigned long *val, struct pt_regs *regs);
static int pfm_write_ibr_dbr(int mode, struct task_struct *task, void *arg, int count, struct pt_regs *regs);
static pfm_reg_desc_t pfm_pmc_desc[PMU_MAX_PMCS]={
static pfm_reg_desc_t pfm_mck_pmc_desc[PMU_MAX_PMCS]={
/* pmc0 */ { PFM_REG_CONTROL , 0, 0x1UL, -1UL, NULL, NULL, {0UL,0UL, 0UL, 0UL}, {0UL,0UL, 0UL, 0UL}},
/* pmc1 */ { PFM_REG_CONTROL , 0, 0x0UL, -1UL, NULL, NULL, {0UL,0UL, 0UL, 0UL}, {0UL,0UL, 0UL, 0UL}},
/* pmc2 */ { PFM_REG_CONTROL , 0, 0x0UL, -1UL, NULL, NULL, {0UL,0UL, 0UL, 0UL}, {0UL,0UL, 0UL, 0UL}},
......@@ -36,7 +36,7 @@ static pfm_reg_desc_t pfm_pmc_desc[PMU_MAX_PMCS]={
{ PFM_REG_END , 0, 0x0UL, -1UL, NULL, NULL, {0,}, {0,}}, /* end marker */
};
static pfm_reg_desc_t pfm_pmd_desc[PMU_MAX_PMDS]={
static pfm_reg_desc_t pfm_mck_pmd_desc[PMU_MAX_PMDS]={
/* pmd0 */ { PFM_REG_BUFFER , 0, 0x0UL, -1UL, NULL, NULL, {RDEP(1),0UL, 0UL, 0UL}, {RDEP(10),0UL, 0UL, 0UL}},
/* pmd1 */ { PFM_REG_BUFFER , 0, 0x0UL, -1UL, NULL, NULL, {RDEP(0),0UL, 0UL, 0UL}, {RDEP(10),0UL, 0UL, 0UL}},
/* pmd2 */ { PFM_REG_BUFFER , 0, 0x0UL, -1UL, NULL, NULL, {RDEP(3)|RDEP(17),0UL, 0UL, 0UL}, {RDEP(11),0UL, 0UL, 0UL}},
......@@ -58,6 +58,19 @@ static pfm_reg_desc_t pfm_pmd_desc[PMU_MAX_PMDS]={
{ PFM_REG_END , 0, 0x0UL, -1UL, NULL, NULL, {0,}, {0,}}, /* end marker */
};
/*
* impl_pmcs, impl_pmds are computed at runtime to minimize errors!
*/
static pmu_config_t pmu_conf={
disabled: 1,
ovfl_val: (1UL << 47) - 1,
num_ibrs: 8,
num_dbrs: 8,
pmd_desc: pfm_mck_pmd_desc,
pmc_desc: pfm_mck_pmc_desc
};
/*
* PMC reserved fields must have their power-up values preserved
*/
......
/*
* Architecture-specific setup.
*
* Copyright (C) 1998-2002 Hewlett-Packard Co
* Copyright (C) 1998-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
*/
#define __KERNEL_SYSCALLS__ /* see <asm/unistd.h> */
......@@ -96,7 +96,7 @@ show_regs (struct pt_regs *regs)
{
unsigned long ip = regs->cr_iip + ia64_psr(regs)->ri;
printk("\nPid: %d, comm: %20s\n", current->pid, current->comm);
printk("\nPid: %d, CPU %d, comm: %20s\n", current->pid, smp_processor_id(), current->comm);
printk("psr : %016lx ifs : %016lx ip : [<%016lx>] %s\n",
regs->cr_ipsr, regs->cr_ifs, ip, print_tainted());
print_symbol("ip is at %s\n", ip);
......@@ -144,6 +144,13 @@ show_regs (struct pt_regs *regs)
void
do_notify_resume_user (sigset_t *oldset, struct sigscratch *scr, long in_syscall)
{
if (fsys_mode(current, &scr->pt)) {
/* defer signal-handling etc. until we return to privilege-level 0. */
if (!ia64_psr(&scr->pt)->lp)
ia64_psr(&scr->pt)->lp = 1;
return;
}
#ifdef CONFIG_PERFMON
if (current->thread.pfm_ovfl_block_reset)
pfm_ovfl_block_reset();
......@@ -198,6 +205,10 @@ cpu_idle (void *unused)
void
ia64_save_extra (struct task_struct *task)
{
#ifdef CONFIG_PERFMON
unsigned long info;
#endif
if ((task->thread.flags & IA64_THREAD_DBG_VALID) != 0)
ia64_save_debug_regs(&task->thread.dbr[0]);
......@@ -205,8 +216,9 @@ ia64_save_extra (struct task_struct *task)
if ((task->thread.flags & IA64_THREAD_PM_VALID) != 0)
pfm_save_regs(task);
if (__get_cpu_var(pfm_syst_wide))
pfm_syst_wide_update_task(task, 0);
info = __get_cpu_var(pfm_syst_info);
if (info & PFM_CPUINFO_SYST_WIDE)
pfm_syst_wide_update_task(task, info, 0);
#endif
#ifdef CONFIG_IA32_SUPPORT
......@@ -218,6 +230,10 @@ ia64_save_extra (struct task_struct *task)
void
ia64_load_extra (struct task_struct *task)
{
#ifdef CONFIG_PERFMON
unsigned long info;
#endif
if ((task->thread.flags & IA64_THREAD_DBG_VALID) != 0)
ia64_load_debug_regs(&task->thread.dbr[0]);
......@@ -225,8 +241,9 @@ ia64_load_extra (struct task_struct *task)
if ((task->thread.flags & IA64_THREAD_PM_VALID) != 0)
pfm_load_regs(task);
if (__get_cpu_var(pfm_syst_wide))
pfm_syst_wide_update_task(task, 1);
info = __get_cpu_var(pfm_syst_info);
if (info & PFM_CPUINFO_SYST_WIDE)
pfm_syst_wide_update_task(task, info, 1);
#endif
#ifdef CONFIG_IA32_SUPPORT
......
......@@ -834,20 +834,18 @@ access_uarea (struct task_struct *child, unsigned long addr, unsigned long *data
}
#ifdef CONFIG_PERFMON
/*
* Check if debug registers are used
* by perfmon. This test must be done once we know that we can
* do the operation, i.e. the arguments are all valid, but before
* we start modifying the state.
* Check if debug registers are used by perfmon. This test must be done
* once we know that we can do the operation, i.e. the arguments are all
* valid, but before we start modifying the state.
*
* Perfmon needs to keep a count of how many processes are
* trying to modify the debug registers for system wide monitoring
* sessions.
* Perfmon needs to keep a count of how many processes are trying to
* modify the debug registers for system wide monitoring sessions.
*
* We also include read access here, because they may cause
* the PMU-installed debug register state (dbr[], ibr[]) to
* be reset. The two arrays are also used by perfmon, but
* we do not use IA64_THREAD_DBG_VALID. The registers are restored
* by the PMU context switch code.
* We also include read access here, because they may cause the
* PMU-installed debug register state (dbr[], ibr[]) to be reset. The two
* arrays are also used by perfmon, but we do not use
* IA64_THREAD_DBG_VALID. The registers are restored by the PMU context
* switch code.
*/
if (pfm_use_debug_registers(child)) return -1;
#endif
......
......@@ -265,7 +265,7 @@ smp_callin (void)
extern void ia64_init_itm(void);
#ifdef CONFIG_PERFMON
extern void perfmon_init_percpu(void);
extern void pfm_init_percpu(void);
#endif
cpuid = smp_processor_id();
......@@ -300,7 +300,7 @@ smp_callin (void)
#endif
#ifdef CONFIG_PERFMON
perfmon_init_percpu();
pfm_init_percpu();
#endif
local_irq_enable();
......
......@@ -20,7 +20,6 @@
#include <asm/shmparam.h>
#include <asm/uaccess.h>
unsigned long
arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len,
unsigned long pgoff, unsigned long flags)
......@@ -31,6 +30,20 @@ arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len
if (len > RGN_MAP_LIMIT)
return -ENOMEM;
#ifdef CONFIG_HUGETLB_PAGE
#define COLOR_HALIGN(addr) (((addr) + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1))
#define TASK_HPAGE_BASE ((REGION_HPAGE << REGION_SHIFT) | HPAGE_SIZE)
if (filp && is_file_hugepages(filp)) {
if ((REGION_NUMBER(addr) != REGION_HPAGE) || (addr & (HPAGE_SIZE -1)))
addr = TASK_HPAGE_BASE;
addr = COLOR_HALIGN(addr);
}
else {
if (REGION_NUMBER(addr) == REGION_HPAGE)
addr = 0;
}
#endif
if (!addr)
addr = TASK_UNMAPPED_BASE;
......
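A hedged worked example of COLOR_HALIGN above, assuming a 256MB huge page
(HPAGE_SIZE == 1UL << 28; the address is chosen purely for illustration):

	addr              = 0x2000000012345678
	addr + 0x0fffffff = 0x2000000022345677
	& ~0x0fffffffUL   = 0x2000000020000000   /* next 256MB boundary */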
/*
* Architecture-specific trap handling.
*
* Copyright (C) 1998-2002 Hewlett-Packard Co
* Copyright (C) 1998-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
*
* 05/12/00 grao <goutham.rao@intel.com> : added isr in siginfo for SIGFPE
......@@ -524,6 +524,23 @@ ia64_fault (unsigned long vector, unsigned long isr, unsigned long ifa,
case 29: /* Debug */
case 35: /* Taken Branch Trap */
case 36: /* Single Step Trap */
if (fsys_mode(current, regs)) {
extern char syscall_via_break[], __start_gate_section[];
/*
* Got a trap in fsys-mode: Taken Branch Trap and Single Step trap
* need special handling; Debug trap is not supposed to happen.
*/
if (unlikely(vector == 29)) {
die("Got debug trap in fsys-mode---not supposed to happen!",
regs, 0);
return;
}
/* re-do the system call via break 0x100000: */
regs->cr_iip = GATE_ADDR + (syscall_via_break - __start_gate_section);
ia64_psr(regs)->ri = 0;
ia64_psr(regs)->cpl = 3;
return;
}
switch (vector) {
case 29:
siginfo.si_code = TRAP_HWBKPT;
......@@ -563,7 +580,18 @@ ia64_fault (unsigned long vector, unsigned long isr, unsigned long ifa,
}
return;
case 34: /* Unimplemented Instruction Address Trap */
case 34:
if (isr & 0x2) {
/* Lower-Privilege Transfer Trap */
/*
* Just clear PSR.lp and then return immediately: all the
* interesting work (e.g., signal delivery) is done in the
* kernel exit path.
*/
ia64_psr(regs)->lp = 0;
return;
} else {
/* Unimplemented Instr. Address Trap */
if (user_mode(regs)) {
siginfo.si_signo = SIGILL;
siginfo.si_code = ILL_BADIADDR;
......@@ -576,6 +604,7 @@ ia64_fault (unsigned long vector, unsigned long isr, unsigned long ifa,
return;
}
sprintf(buf, "Unimplemented Instruction Address fault");
}
break;
case 45:
......
......@@ -331,12 +331,8 @@ set_rse_reg (struct pt_regs *regs, unsigned long r1, unsigned long val, int nat)
return;
}
/*
* Avoid using user_mode() here: with "epc", we cannot use the privilege level to
* infer whether the interrupt task was running on the kernel backing store.
*/
if (regs->r12 >= TASK_SIZE) {
DPRINT("ignoring kernel write to r%lu; register isn't on the RBS!", r1);
if (!user_stack(current, regs)) {
DPRINT("ignoring kernel write to r%lu; register isn't on the kernel RBS!", r1);
return;
}
......@@ -406,11 +402,7 @@ get_rse_reg (struct pt_regs *regs, unsigned long r1, unsigned long *val, int *na
return;
}
/*
* Avoid using user_mode() here: with "epc", we cannot use the privilege level to
* infer whether the interrupt task was running on the kernel backing store.
*/
if (regs->r12 >= TASK_SIZE) {
if (!user_stack(current, regs)) {
DPRINT("ignoring kernel read of r%lu; register isn't on the RBS!", r1);
goto fail;
}
......
......@@ -1997,16 +1997,18 @@ unw_create_gate_table (void)
{
extern char __start_gate_section[], __stop_gate_section[];
unsigned long *lp, start, end, segbase = unw.kernel_table.segment_base;
const struct unw_table_entry *entry, *first;
const struct unw_table_entry *entry, *first, *unw_table_end;
extern int ia64_unw_end;
size_t info_size, size;
char *info;
start = (unsigned long) __start_gate_section - segbase;
end = (unsigned long) __stop_gate_section - segbase;
unw_table_end = (struct unw_table_entry *) &ia64_unw_end;
size = 0;
first = lookup(&unw.kernel_table, start);
for (entry = first; entry->start_offset < end; ++entry)
for (entry = first; entry < unw_table_end && entry->start_offset < end; ++entry)
size += 3*8 + 8 + 8*UNW_LENGTH(*(u64 *) (segbase + entry->info_offset));
size += 8; /* reserve space for "end of table" marker */
......@@ -2021,7 +2023,7 @@ unw_create_gate_table (void)
lp = unw.gate_table;
info = (char *) unw.gate_table + size;
for (entry = first; entry->start_offset < end; ++entry, lp += 3) {
for (entry = first; entry < unw_table_end && entry->start_offset < end; ++entry, lp += 3) {
info_size = 8 + 8*UNW_LENGTH(*(u64 *) (segbase + entry->info_offset));
info -= info_size;
memcpy(info, (char *) segbase + entry->info_offset, info_size);
......
......@@ -159,7 +159,7 @@ GLOBAL_ENTRY(__copy_user)
mov ar.ec=2
(p10) br.dpnt.few .aligned_src_tail
;;
.align 32
// .align 32
1:
EX(.ex_handler, (p16) ld8 r34=[src0],16)
EK(.ex_handler, (p16) ld8 r38=[src1],16)
......@@ -316,7 +316,7 @@ EK(.ex_handler, (p[D]) st8 [dst1] = t15, 4*8)
(p7) mov ar.lc = r21
(p8) mov ar.lc = r0
;;
.align 32
// .align 32
1: lfetch.fault [src_pre_mem], 128
lfetch.fault.excl [dst_pre_mem], 128
br.cloop.dptk.few 1b
......@@ -522,7 +522,7 @@ EK(.ex_handler, (p17) st8 [dst1]=r39,8); \
shrp r21=r22,r38,shift; /* speculative work */ \
br.sptk.few .unaligned_src_tail /* branch out of jump table */ \
;;
.align 32
// .align 32
.jump_table:
COPYU(8) // unaligned cases
.jmp1:
......
......@@ -125,7 +125,7 @@ GLOBAL_ENTRY(memset)
(p_zr) br.cond.dptk.many .l1b // Jump to use stf.spill
;; }
.align 32 // -------------------------- // L1A: store ahead into cache lines; fill later
// .align 32 // -------------------------- // L1A: store ahead into cache lines; fill later
{ .mmi
and tmp = -(LINE_SIZE), cnt // compute end of range
mov ptr9 = ptr1 // used for prefetching
......@@ -194,7 +194,7 @@ GLOBAL_ENTRY(memset)
br.cond.dpnt.many .move_bytes_from_alignment // Branch no. 3
;; }
.align 32
// .align 32
.l1b: // ------------------------------------ // L1B: store ahead into cache lines; fill later
{ .mmi
and tmp = -(LINE_SIZE), cnt // compute end of range
......@@ -261,7 +261,7 @@ GLOBAL_ENTRY(memset)
and cnt = 0x1f, cnt // compute the remaining cnt
mov.i ar.lc = loopcnt
;; }
.align 32
// .align 32
.l2: // ------------------------------------ // L2A: store 32B in 2 cycles
{ .mmb
stf8 [ptr1] = fvalue, 8
......
......@@ -12,71 +12,42 @@
#include <linux/pagemap.h>
#include <linux/smp_lock.h>
#include <linux/slab.h>
#include <asm/mman.h>
#include <asm/pgalloc.h>
#include <asm/tlb.h>
#include <asm/tlbflush.h>
static struct vm_operations_struct hugetlb_vm_ops;
struct list_head htlbpage_freelist;
spinlock_t htlbpage_lock = SPIN_LOCK_UNLOCKED;
extern long htlbpagemem;
#include <linux/sysctl.h>
static long htlbpagemem;
int htlbpage_max;
static long htlbzone_pages;
static void zap_hugetlb_resources (struct vm_area_struct *);
struct vm_operations_struct hugetlb_vm_ops;
static LIST_HEAD(htlbpage_freelist);
static spinlock_t htlbpage_lock = SPIN_LOCK_UNLOCKED;
static struct page *
alloc_hugetlb_page (void)
static struct page *alloc_hugetlb_page(void)
{
struct list_head *curr, *head;
int i;
struct page *page;
spin_lock(&htlbpage_lock);
head = &htlbpage_freelist;
curr = head->next;
if (curr == head) {
if (list_empty(&htlbpage_freelist)) {
spin_unlock(&htlbpage_lock);
return NULL;
}
page = list_entry(curr, struct page, list);
list_del(curr);
page = list_entry(htlbpage_freelist.next, struct page, list);
list_del(&page->list);
htlbpagemem--;
spin_unlock(&htlbpage_lock);
set_page_count(page, 1);
memset(page_address(page), 0, HPAGE_SIZE);
for (i = 0; i < (HPAGE_SIZE/PAGE_SIZE); ++i)
clear_highpage(&page[i]);
return page;
}
static void
free_hugetlb_page (struct page *page)
{
spin_lock(&htlbpage_lock);
if ((page->mapping != NULL) && (page_count(page) == 2)) {
struct inode *inode = page->mapping->host;
int i;
ClearPageDirty(page);
remove_from_page_cache(page);
set_page_count(page, 1);
if ((inode->i_size -= HPAGE_SIZE) == 0) {
for (i = 0; i < MAX_ID; i++)
if (htlbpagek[i].key == inode->i_ino) {
htlbpagek[i].key = 0;
htlbpagek[i].in = NULL;
break;
}
kfree(inode);
}
}
if (put_page_testzero(page)) {
list_add(&page->list, &htlbpage_freelist);
htlbpagemem++;
}
spin_unlock(&htlbpage_lock);
}
static pte_t *
huge_pte_alloc (struct mm_struct *mm, unsigned long addr)
{
......@@ -126,63 +97,8 @@ set_huge_pte (struct mm_struct *mm, struct vm_area_struct *vma,
return;
}
static int
anon_get_hugetlb_page (struct mm_struct *mm, struct vm_area_struct *vma,
int write_access, pte_t * page_table)
{
struct page *page;
page = alloc_hugetlb_page();
if (page == NULL)
return -1;
set_huge_pte(mm, vma, page, page_table, write_access);
return 1;
}
static int
make_hugetlb_pages_present (unsigned long addr, unsigned long end, int flags)
{
int write;
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
pte_t *pte;
vma = find_vma(mm, addr);
if (!vma)
goto out_error1;
write = (vma->vm_flags & VM_WRITE) != 0;
if ((vma->vm_end - vma->vm_start) & (HPAGE_SIZE - 1))
goto out_error1;
spin_lock(&mm->page_table_lock);
do {
pte = huge_pte_alloc(mm, addr);
if ((pte) && (pte_none(*pte))) {
if (anon_get_hugetlb_page(mm, vma, write ? VM_WRITE : VM_READ, pte) == -1)
goto out_error;
} else
goto out_error;
addr += HPAGE_SIZE;
} while (addr < end);
spin_unlock(&mm->page_table_lock);
vma->vm_flags |= (VM_HUGETLB | VM_RESERVED);
if (flags & MAP_PRIVATE)
vma->vm_flags |= VM_DONTCOPY;
vma->vm_ops = &hugetlb_vm_ops;
return 0;
out_error:
if (addr > vma->vm_start) {
vma->vm_end = addr;
zap_hugetlb_resources(vma);
vma->vm_end = end;
}
spin_unlock(&mm->page_table_lock);
out_error1:
return -1;
}
int
copy_hugetlb_page_range (struct mm_struct *dst, struct mm_struct *src, struct vm_area_struct *vma)
int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
struct vm_area_struct *vma)
{
pte_t *src_pte, *dst_pte, entry;
struct page *ptepage;
......@@ -202,13 +118,12 @@ copy_hugetlb_page_range (struct mm_struct *dst, struct mm_struct *src, struct vm
addr += HPAGE_SIZE;
}
return 0;
nomem:
nomem:
return -ENOMEM;
}
int
follow_hugetlb_page (struct mm_struct *mm, struct vm_area_struct *vma,
follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
struct page **pages, struct vm_area_struct **vmas,
unsigned long *st, int *length, int i)
{
......@@ -234,8 +149,8 @@ follow_hugetlb_page (struct mm_struct *mm, struct vm_area_struct *vma,
i++;
len--;
start += PAGE_SIZE;
if (((start & HPAGE_MASK) == pstart) && len
&& (start < vma->vm_end))
if (((start & HPAGE_MASK) == pstart) && len &&
(start < vma->vm_end))
goto back1;
} while (len && start < vma->vm_end);
*length = len;
......@@ -243,51 +158,149 @@ follow_hugetlb_page (struct mm_struct *mm, struct vm_area_struct *vma,
return i;
}
static void
zap_hugetlb_resources (struct vm_area_struct *mpnt)
void free_huge_page(struct page *page)
{
BUG_ON(page_count(page));
BUG_ON(page->mapping);
INIT_LIST_HEAD(&page->list);
spin_lock(&htlbpage_lock);
list_add(&page->list, &htlbpage_freelist);
htlbpagemem++;
spin_unlock(&htlbpage_lock);
}
void huge_page_release(struct page *page)
{
struct mm_struct *mm = mpnt->vm_mm;
unsigned long len, addr, end;
pte_t *ptep;
if (!put_page_testzero(page))
return;
free_huge_page(page);
}
void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
{
struct mm_struct *mm = vma->vm_mm;
unsigned long address;
pte_t *pte;
struct page *page;
addr = mpnt->vm_start;
end = mpnt->vm_end;
len = end - addr;
do {
ptep = huge_pte_offset(mm, addr);
page = pte_page(*ptep);
pte_clear(ptep);
free_hugetlb_page(page);
addr += HPAGE_SIZE;
} while (addr < end);
mm->rss -= (len >> PAGE_SHIFT);
mpnt->vm_ops = NULL;
flush_tlb_range(mpnt, end - len, end);
BUG_ON(start & (HPAGE_SIZE - 1));
BUG_ON(end & (HPAGE_SIZE - 1));
spin_lock(&htlbpage_lock);
spin_unlock(&htlbpage_lock);
for (address = start; address < end; address += HPAGE_SIZE) {
pte = huge_pte_offset(mm, address);
if (pte_none(*pte))
continue;
page = pte_page(*pte);
huge_page_release(page);
pte_clear(pte);
}
mm->rss -= (end - start) >> PAGE_SHIFT;
flush_tlb_range(vma, start, end);
}
static void
unlink_vma (struct vm_area_struct *mpnt)
void zap_hugepage_range(struct vm_area_struct *vma, unsigned long start, unsigned long length)
{
struct mm_struct *mm = vma->vm_mm;
spin_lock(&mm->page_table_lock);
unmap_hugepage_range(vma, start, start + length);
spin_unlock(&mm->page_table_lock);
}
int hugetlb_prefault(struct address_space *mapping, struct vm_area_struct *vma)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
vma = mm->mmap;
if (vma == mpnt) {
mm->mmap = vma->vm_next;
} else {
while (vma->vm_next != mpnt) {
vma = vma->vm_next;
unsigned long addr;
int ret = 0;
BUG_ON(vma->vm_start & ~HPAGE_MASK);
BUG_ON(vma->vm_end & ~HPAGE_MASK);
spin_lock(&mm->page_table_lock);
for (addr = vma->vm_start; addr < vma->vm_end; addr += HPAGE_SIZE) {
unsigned long idx;
pte_t *pte = huge_pte_alloc(mm, addr);
struct page *page;
if (!pte) {
ret = -ENOMEM;
goto out;
}
if (!pte_none(*pte))
continue;
idx = ((addr - vma->vm_start) >> HPAGE_SHIFT)
+ (vma->vm_pgoff >> (HPAGE_SHIFT - PAGE_SHIFT));
page = find_get_page(mapping, idx);
if (!page) {
page = alloc_hugetlb_page();
if (!page) {
ret = -ENOMEM;
goto out;
}
add_to_page_cache(page, mapping, idx);
unlock_page(page);
}
vma->vm_next = mpnt->vm_next;
set_huge_pte(mm, vma, page, pte, vma->vm_flags & VM_WRITE);
}
rb_erase(&mpnt->vm_rb, &mm->mm_rb);
mm->mmap_cache = NULL;
mm->map_count--;
out:
spin_unlock(&mm->page_table_lock);
return ret;
}
int
set_hugetlb_mem_size (int count)
void update_and_free_page(struct page *page)
{
int j;
struct page *map;
map = page;
htlbzone_pages--;
for (j = 0; j < (HPAGE_SIZE / PAGE_SIZE); j++) {
map->flags &= ~(1 << PG_locked | 1 << PG_error | 1 << PG_referenced |
1 << PG_dirty | 1 << PG_active | 1 << PG_reserved |
1 << PG_private | 1 << PG_writeback);
set_page_count(map, 0);
map++;
}
set_page_count(page, 1);
__free_pages(page, HUGETLB_PAGE_ORDER);
}
int try_to_free_low(int count)
{
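	/*
	 * Hedged reading of the caller's arithmetic: when shrinking the
	 * pool, count arrives negative and each page freed here advances
	 * it toward zero, hence the ++count == 0 test below.
	 */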
struct list_head *p;
struct page *page, *map;
map = NULL;
spin_lock(&htlbpage_lock);
list_for_each(p, &htlbpage_freelist) {
if (map) {
list_del(&map->list);
update_and_free_page(map);
htlbpagemem--;
map = NULL;
if (++count == 0)
break;
}
page = list_entry(p, struct page, list);
if ((page_zone(page))->name[0] != 'H') // Look for non-Highmem
map = page;
}
if (map) {
list_del(&map->list);
update_and_free_page(map);
htlbpagemem--;
count++;
}
spin_unlock(&htlbpage_lock);
return count;
}
int set_hugetlb_mem_size(int count)
{
int j, lcount;
struct page *page, *map;
......@@ -298,7 +311,10 @@ set_hugetlb_mem_size (int count)
lcount = count;
else
lcount = count - htlbzone_pages;
if (lcount > 0) { /*Increase the mem size. */
if (lcount == 0)
return (int)htlbzone_pages;
if (lcount > 0) { /* Increase the mem size. */
while (lcount--) {
page = alloc_pages(__GFP_HIGHMEM, HUGETLB_PAGE_ORDER);
if (page == NULL)
......@@ -316,27 +332,79 @@ set_hugetlb_mem_size (int count)
}
return (int) htlbzone_pages;
}
/*Shrink the memory size. */
/* Shrink the memory size. */
lcount = try_to_free_low(lcount);
while (lcount++) {
page = alloc_hugetlb_page();
if (page == NULL)
break;
spin_lock(&htlbpage_lock);
htlbzone_pages--;
update_and_free_page(page);
spin_unlock(&htlbpage_lock);
map = page;
for (j = 0; j < (HPAGE_SIZE / PAGE_SIZE); j++) {
map->flags &= ~(1 << PG_locked | 1 << PG_error | 1 << PG_referenced |
1 << PG_dirty | 1 << PG_active | 1 << PG_reserved |
1 << PG_private | 1 << PG_writeback);
map++;
}
set_page_count(page, 1);
__free_pages(page, HUGETLB_PAGE_ORDER);
}
return (int) htlbzone_pages;
}
static struct vm_operations_struct hugetlb_vm_ops = {
.close = zap_hugetlb_resources
int hugetlb_sysctl_handler(ctl_table *table, int write, struct file *file, void *buffer, size_t *length)
{
proc_dointvec(table, write, file, buffer, length);
htlbpage_max = set_hugetlb_mem_size(htlbpage_max);
return 0;
}
static int __init hugetlb_setup(char *s)
{
if (sscanf(s, "%d", &htlbpage_max) <= 0)
htlbpage_max = 0;
return 1;
}
__setup("hugepages=", hugetlb_setup);
static int __init hugetlb_init(void)
{
int i, j;
struct page *page;
for (i = 0; i < htlbpage_max; ++i) {
page = alloc_pages(__GFP_HIGHMEM, HUGETLB_PAGE_ORDER);
if (!page)
break;
for (j = 0; j < HPAGE_SIZE/PAGE_SIZE; ++j)
SetPageReserved(&page[j]);
spin_lock(&htlbpage_lock);
list_add(&page->list, &htlbpage_freelist);
spin_unlock(&htlbpage_lock);
}
htlbpage_max = htlbpagemem = htlbzone_pages = i;
printk("Total HugeTLB memory allocated, %ld\n", htlbpagemem);
return 0;
}
module_init(hugetlb_init);
int hugetlb_report_meminfo(char *buf)
{
return sprintf(buf,
"HugePages_Total: %5lu\n"
"HugePages_Free: %5lu\n"
"Hugepagesize: %5lu kB\n",
htlbzone_pages,
htlbpagemem,
HPAGE_SIZE/1024);
}
int is_hugepage_mem_enough(size_t size)
{
if (size > (htlbpagemem << HPAGE_SHIFT))
return 0;
return 1;
}
static struct page *hugetlb_nopage(struct vm_area_struct * area, unsigned long address, int unused)
{
BUG();
return NULL;
}
struct vm_operations_struct hugetlb_vm_ops = {
.nopage = hugetlb_nopage,
};
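/*
 * Hedged note: hugetlb mappings are fully populated up front by
 * hugetlb_prefault(), so a ->nopage fault can only mean a kernel bug,
 * hence the BUG() in hugetlb_nopage() above.
 */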
......@@ -342,13 +342,6 @@ ia64_mmu_init (void *my_cpu_data)
* Set up the page tables.
*/
#ifdef CONFIG_HUGETLB_PAGE
long htlbpagemem;
int htlbpage_max;
extern long htlbzone_pages;
extern struct list_head htlbpage_freelist;
#endif
#ifdef CONFIG_DISCONTIGMEM
void
paging_init (void)
......@@ -462,29 +455,4 @@ mem_init (void)
#ifdef CONFIG_IA32_SUPPORT
ia32_gdt_init();
#endif
#ifdef CONFIG_HUGETLB_PAGE
{
long i;
int j;
struct page *page, *map;
if ((htlbzone_pages << (HPAGE_SHIFT - PAGE_SHIFT)) >= max_low_pfn)
htlbzone_pages = (max_low_pfn >> ((HPAGE_SHIFT - PAGE_SHIFT) + 1));
INIT_LIST_HEAD(&htlbpage_freelist);
for (i = 0; i < htlbzone_pages; i++) {
page = alloc_pages(__GFP_HIGHMEM, HUGETLB_PAGE_ORDER);
if (!page)
break;
map = page;
for (j = 0; j < (HPAGE_SIZE/PAGE_SIZE); j++) {
SetPageReserved(map);
map++;
}
list_add(&page->list, &htlbpage_freelist);
}
printk("Total Huge_TLB_Page memory pages allocated %ld \n", i);
htlbzone_pages = htlbpagemem = i;
htlbpage_max = (int)i;
}
#endif
}
#!/bin/sh
# Usage: unwcheck.sh <executable_file_name>
# Pre-requisite: readelf [from Gnu binutils package]
# Purpose: Check the following invariant
# For each code range in the input binary:
# Sum[ lengths of unwind regions] = Number of slots in code range.
# Author : Harish Patil
# First version: January 2002
# Modified : 2/13/2002
# Modified : 3/15/2002: duplicate detection
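# Note: an IA-64 bundle is 16 bytes and holds 3 instruction slots, which
# is why the script computes no_slots = (hi - lo)/16 * 3 below.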
readelf -u $1 | gawk '\
function todec(hexstr){
dec = 0;
l = length(hexstr);
for (i = 1; i <= l; i++)
{
c = substr(hexstr, i, 1);
if (c == "A")
dec = dec*16 + 10;
else if (c == "B")
dec = dec*16 + 11;
else if (c == "C")
dec = dec*16 + 12;
else if (c == "D")
dec = dec*16 + 13;
else if (c == "E")
dec = dec*16 + 14;
else if (c == "F")
dec = dec*16 + 15;
else
dec = dec*16 + c;
}
return dec;
}
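# (Hedged aside: since the script already requires gawk, todec() could be
# replaced by the built-in strtonum("0x" hexstr); the explicit loop is
# kept for portability and clarity.)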
BEGIN { first = 1; sum_rlen = 0; no_slots = 0; errors=0; no_code_ranges=0; }
{
if (NF==5 && $3=="info")
{
no_code_ranges += 1;
if (first == 0)
{
if (sum_rlen != no_slots)
{
print full_code_range;
print " ", "lo = ", lo, " hi =", hi;
print " ", "sum_rlen = ", sum_rlen, "no_slots = " no_slots;
print " "," ", "*******ERROR ***********";
print " "," ", "sum_rlen:", sum_rlen, " != no_slots:" no_slots;
errors += 1;
}
sum_rlen = 0;
}
full_code_range = $0;
code_range = $2;
gsub("..$", "", code_range);
gsub("^.", "", code_range);
split(code_range, addr, "-");
lo = toupper(addr[1]);
code_range_lo[no_code_ranges] = addr[1];
occurs[addr[1]] += 1;
full_range[addr[1]] = $0;
gsub("0X.[0]*", "", lo);
hi = toupper(addr[2]);
gsub("0X.[0]*", "", hi);
no_slots = (todec(hi) - todec(lo)) / 16 * 3;
first = 0;
}
if (index($0,"rlen") > 0 )
{
rlen_str = substr($0, index($0,"rlen"));
rlen = rlen_str;
gsub("rlen=", "", rlen);
gsub(")", "", rlen);
sum_rlen = sum_rlen + rlen;
}
}
END {
if (first == 0)
{
if (sum_rlen != no_slots)
{
print "code_range=", code_range;
print " ", "lo = ", lo, " hi =", hi;
print " ", "sum_rlen = ", sum_rlen, "no_slots = " no_slots;
print " "," ", "*******ERROR ***********";
print " "," ", "sum_rlen:", sum_rlen, " != no_slots:" no_slots;
errors += 1;
}
}
no_duplicates = 0;
for (i=1; i<=no_code_ranges; i++)
{
cr = code_range_lo[i];
if (reported_cr[cr]==1) continue;
if ( occurs[cr] > 1)
{
reported_cr[cr] = 1;
print "Code range low ", code_range_lo[i], ":", full_range[cr], " occurs: ", occurs[cr], " times.";
print " ";
no_duplicates++;
}
}
print "======================================"
print "Total errors:", errors, "/", no_code_ranges, " duplicates:", no_duplicates;
print "======================================"
}
'
......@@ -4,14 +4,7 @@ TARGET = include/asm-ia64/offsets.h
src = $(obj)
all:
fastdep:
mrproper: clean
clean:
rm -f $(obj)/print_offsets.s $(obj)/print_offsets $(obj)/offsets.h
clean-files := print_offsets.s print_offsets offsets.h
$(TARGET): $(obj)/offsets.h
@if ! cmp -s $(obj)/offsets.h ${TARGET}; then \
......
/*
* Utility to generate asm-ia64/offsets.h.
*
* Copyright (C) 1999-2002 Hewlett-Packard Co
* Copyright (C) 1999-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
*
* Note that this file has dual use: when building the kernel
......@@ -53,7 +53,9 @@ tab[] =
{ "UNW_FRAME_INFO_SIZE", sizeof (struct unw_frame_info) },
{ "", 0 }, /* spacer */
{ "IA64_TASK_THREAD_KSP_OFFSET", offsetof (struct task_struct, thread.ksp) },
{ "IA64_TASK_THREAD_ON_USTACK_OFFSET", offsetof (struct task_struct, thread.on_ustack) },
{ "IA64_TASK_PID_OFFSET", offsetof (struct task_struct, pid) },
{ "IA64_TASK_TGID_OFFSET", offsetof (struct task_struct, tgid) },
{ "IA64_PT_REGS_CR_IPSR_OFFSET", offsetof (struct pt_regs, cr_ipsr) },
{ "IA64_PT_REGS_CR_IIP_OFFSET", offsetof (struct pt_regs, cr_iip) },
{ "IA64_PT_REGS_CR_IFS_OFFSET", offsetof (struct pt_regs, cr_ifs) },
......
......@@ -131,10 +131,6 @@ SECTIONS
.data.cacheline_aligned : AT(ADDR(.data.cacheline_aligned) - PAGE_OFFSET)
{ *(.data.cacheline_aligned) }
/* Kernel symbol names for modules: */
.kstrtab : AT(ADDR(.kstrtab) - PAGE_OFFSET)
{ *(.kstrtab) }
/* Per-cpu data: */
. = ALIGN(PERCPU_PAGE_SIZE);
__phys_per_cpu_start = .;
......
......@@ -2,15 +2,22 @@
#define _ASM_IA64_ASMMACRO_H
/*
* Copyright (C) 2000-2001 Hewlett-Packard Co
* Copyright (C) 2000-2001, 2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
*/
#include <linux/config.h>
#define ENTRY(name) \
.align 32; \
.proc name; \
name:
#define ENTRY_MIN_ALIGN(name) \
.align 16; \
.proc name; \
name:
#define GLOBAL_ENTRY(name) \
.global name; \
ENTRY(name)
......@@ -52,4 +59,13 @@
99: x
#endif
#ifdef CONFIG_MCKINLEY
/* workaround for Itanium 2 Errata 9: */
# define MCKINLEY_E9_WORKAROUND \
br.call.sptk.many b7=1f;; \
1:
#else
# define MCKINLEY_E9_WORKAROUND
#endif
#endif /* _ASM_IA64_ASMMACRO_H */
......@@ -2,7 +2,7 @@
#define _ASM_IA64_BITOPS_H
/*
* Copyright (C) 1998-2002 Hewlett-Packard Co
* Copyright (C) 1998-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
*
* 02/06/02 find_next_bit() and find_first_bit() added from Erich Focht's ia64 O(1)
......@@ -320,7 +320,7 @@ __ffs (unsigned long x)
static inline unsigned long
ia64_fls (unsigned long x)
{
double d = x;
long double d = x;
long exp;
__asm__ ("getf.exp %0=%1" : "=r"(exp) : "f"(d));
......
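The switch from double to long double matters because ia64's long double
carries a 64-bit significand, so any 64-bit integer converts exactly;
double's 53-bit significand would round large inputs, and getf.exp could
then report an off-by-one exponent near a power of two. A hedged,
portable illustration of the same exponent trick (function name
hypothetical):

	#include <math.h>

	static inline long
	fls_via_exponent (unsigned long x)
	{
		/* ilogbl() returns floor(log2(x)), playing the role of getf.exp */
		return x ? ilogbl((long double) x) : -1;
	}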
......@@ -4,10 +4,12 @@
/*
* ELF-specific definitions.
*
* Copyright (C) 1998, 1999, 2002 Hewlett-Packard Co
* Copyright (C) 1998-1999, 2002-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
*/
#include <linux/config.h>
#include <asm/fpu.h>
#include <asm/page.h>
......@@ -88,6 +90,11 @@ extern void ia64_elf_core_copy_regs (struct pt_regs *src, elf_gregset_t dst);
relevant until we have real hardware to play with... */
#define ELF_PLATFORM 0
/*
* This should go into linux/elf.h...
*/
#define AT_SYSINFO 32
#ifdef __KERNEL__
struct elf64_hdr;
extern void ia64_set_personality (struct elf64_hdr *elf_ex, int ibcs2_interpreter);
......@@ -99,7 +106,14 @@ extern int dump_task_fpu (struct task_struct *, elf_fpregset_t *);
#define ELF_CORE_COPY_TASK_REGS(tsk, elf_gregs) dump_task_regs(tsk, elf_gregs)
#define ELF_CORE_COPY_FPREGS(tsk, elf_fpregs) dump_task_fpu(tsk, elf_fpregs)
#ifdef CONFIG_FSYS
#define ARCH_DLINFO \
do { \
extern int syscall_via_epc; \
NEW_AUX_ENT(AT_SYSINFO, syscall_via_epc); \
} while (0)
#endif
#endif /* __KERNEL__ */
#endif /* _ASM_IA64_ELF_H */
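With AT_SYSINFO in the auxiliary vector, user level can locate the gate
page's epc entry point without hard-coding an address. A hedged
user-space sketch (the auxv walk and function name are assumptions;
glibc offered no getauxval() helper at the time):

	#include <elf.h>

	static unsigned long
	find_sysinfo (char **envp)
	{
		Elf64_auxv_t *av;

		while (*envp)
			++envp;			/* the auxv follows the environment block */
		for (av = (Elf64_auxv_t *) (envp + 1); av->a_type != AT_NULL; ++av)
			if (av->a_type == 32)	/* AT_SYSINFO, per the #define above */
				return av->a_un.a_val;
		return 0;
	}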
......@@ -4,10 +4,12 @@
/*
* Compiler-dependent intrinsics.
*
* Copyright (C) 2002 Hewlett-Packard Co
* Copyright (C) 2002-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
*/
#include <linux/config.h>
/*
* Force an unresolved reference if someone tries to use
* ia64_fetch_and_add() with a bad value.
......
......@@ -28,6 +28,36 @@
#include <asm/processor.h>
#define MMU_CONTEXT_DEBUG 0
#if MMU_CONTEXT_DEBUG
#include <ia64intrin.h>
extern struct mmu_trace_entry {
char op;
u8 cpu;
u32 context;
void *mm;
} mmu_tbuf[1024];
extern volatile int mmu_tbuf_index;
# define MMU_TRACE(_op,_cpu,_mm,_ctx) \
do { \
int i = __sync_fetch_and_add(&mmu_tbuf_index, 1) % ARRAY_SIZE(mmu_tbuf); \
struct mmu_trace_entry e; \
e.op = (_op); \
e.cpu = (_cpu); \
e.mm = (_mm); \
e.context = (_ctx); \
mmu_tbuf[i] = e; \
} while (0)
#else
# define MMU_TRACE(op,cpu,mm,ctx) do { ; } while (0)
#endif
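/*
 * Hedged sketch of a dumper for the trace ring above (name
 * hypothetical); entries may be stale or torn, since the index bump and
 * the store are not performed atomically together:
 */
#if MMU_CONTEXT_DEBUG
static inline void
mmu_trace_dump (void)
{
	int i;

	for (i = 0; i < ARRAY_SIZE(mmu_tbuf); ++i) {
		struct mmu_trace_entry *e = &mmu_tbuf[i];

		printk("%c cpu%u ctx=%u mm=%p\n",
		       e->op, (unsigned) e->cpu, e->context, e->mm);
	}
}
#endif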
struct ia64_ctx {
spinlock_t lock;
unsigned int next; /* next context number to use */
......@@ -91,6 +121,7 @@ get_mmu_context (struct mm_struct *mm)
static inline int
init_new_context (struct task_struct *p, struct mm_struct *mm)
{
MMU_TRACE('N', smp_processor_id(), mm, 0);
mm->context = 0;
return 0;
}
......@@ -99,6 +130,7 @@ static inline void
destroy_context (struct mm_struct *mm)
{
/* Nothing to do. */
MMU_TRACE('D', smp_processor_id(), mm, mm->context);
}
static inline void
......@@ -138,7 +170,9 @@ activate_context (struct mm_struct *mm)
do {
context = get_mmu_context(mm);
MMU_TRACE('A', smp_processor_id(), mm, context);
reload_context(context);
MMU_TRACE('a', smp_processor_id(), mm, context);
/* in the unlikely event of a TLB-flush by another thread, redo the load: */
} while (unlikely(context != mm->context));
}
......
......@@ -40,6 +40,7 @@
#define PFM_FL_INHERIT_ALL 0x02 /* always clone pfm_context across fork() */
#define PFM_FL_NOTIFY_BLOCK 0x04 /* block task on user level notifications */
#define PFM_FL_SYSTEM_WIDE 0x08 /* create a system wide context */
#define PFM_FL_EXCL_IDLE 0x20 /* exclude idle task from system wide session */
/*
* PMC flags
......@@ -89,8 +90,9 @@ typedef struct {
unsigned long reg_reset_pmds[4]; /* which other counters to reset on overflow */
unsigned long reg_random_seed; /* seed value when randomization is used */
unsigned long reg_random_mask; /* bitmask used to limit random value */
unsigned long reg_last_reset_value;/* last value used to reset the PMD (PFM_READ_PMDS) */
unsigned long reserved[14]; /* for future use */
unsigned long reserved[13]; /* for future use */
} pfarg_reg_t;
typedef struct {
......@@ -123,7 +125,7 @@ typedef struct {
* Define the version numbers for both perfmon as a whole and the sampling buffer format.
*/
#define PFM_VERSION_MAJ 1U
#define PFM_VERSION_MIN 1U
#define PFM_VERSION_MIN 3U
#define PFM_VERSION (((PFM_VERSION_MAJ&0xffff)<<16)|(PFM_VERSION_MIN & 0xffff))
#define PFM_SMPL_VERSION_MAJ 1U
......@@ -156,13 +158,17 @@ typedef struct {
unsigned long stamp; /* timestamp */
unsigned long ip; /* where the overflow interrupt happened */
unsigned long regs; /* bitmask of which registers overflowed */
unsigned long period; /* unused */
unsigned long reserved; /* unused */
} perfmon_smpl_entry_t;
extern int perfmonctl(pid_t pid, int cmd, void *arg, int narg);
#ifdef __KERNEL__
typedef struct {
void (*handler)(int irq, void *arg, struct pt_regs *regs);
} pfm_intr_handler_desc_t;
extern void pfm_save_regs (struct task_struct *);
extern void pfm_load_regs (struct task_struct *);
......@@ -174,9 +180,24 @@ extern void pfm_cleanup_owners (struct task_struct *);
extern int pfm_use_debug_registers(struct task_struct *);
extern int pfm_release_debug_registers(struct task_struct *);
extern int pfm_cleanup_smpl_buf(struct task_struct *);
extern void pfm_syst_wide_update_task(struct task_struct *, int);
extern void pfm_syst_wide_update_task(struct task_struct *, unsigned long info, int is_ctxswin);
extern void pfm_ovfl_block_reset(void);
extern void perfmon_init_percpu(void);
extern void pfm_init_percpu(void);
/*
* hooks to allow VTune/Prospect to cooperate with perfmon.
* (reserved for system wide monitoring modules only)
*/
extern int pfm_install_alternate_syswide_subsystem(pfm_intr_handler_desc_t *h);
extern int pfm_remove_alternate_syswide_subsystem(pfm_intr_handler_desc_t *h);
/*
* describes the contents of the local_cpu_data->pfm_syst_info field
*/
#define PFM_CPUINFO_SYST_WIDE 0x1 /* if set, a system wide session exists */
#define PFM_CPUINFO_DCR_PP 0x2 /* if set the system wide session has started */
#define PFM_CPUINFO_EXCL_IDLE 0x4 /* the system wide session excludes the idle task */
#endif /* __KERNEL__ */
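A hedged sketch of how a context-switch hook might consume these bits
(it mirrors the pfm_syst_info checks added to ia64_save_extra(); using
pid 0 to identify the idle task is an assumption):

	unsigned long info = __get_cpu_var(pfm_syst_info);

	if ((info & PFM_CPUINFO_SYST_WIDE)
	    && !((info & PFM_CPUINFO_EXCL_IDLE) && task->pid == 0))
		pfm_syst_wide_update_task(task, info, is_ctxswin);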
......
......@@ -2,7 +2,7 @@
#define _ASM_IA64_PROCESSOR_H
/*
* Copyright (C) 1998-2002 Hewlett-Packard Co
* Copyright (C) 1998-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
* Stephane Eranian <eranian@hpl.hp.com>
* Copyright (C) 1999 Asit Mallick <asit.k.mallick@intel.com>
......@@ -223,7 +223,10 @@ typedef struct {
struct siginfo;
struct thread_struct {
__u64 flags; /* various thread flags (see IA64_THREAD_*) */
__u32 flags; /* various thread flags (see IA64_THREAD_*) */
/* writing on_ustack is performance-critical, so it's worth spending 8 bits on it... */
__u8 on_ustack; /* executing on user-stacks? */
__u8 pad[3];
__u64 ksp; /* kernel stack pointer */
__u64 map_base; /* base address for get_unmapped_area() */
__u64 task_size; /* limit for task size */
......@@ -277,6 +280,7 @@ struct thread_struct {
#define INIT_THREAD { \
.flags = 0, \
.on_ustack = 0, \
.ksp = 0, \
.map_base = DEFAULT_MAP_BASE, \
.task_size = DEFAULT_TASK_SIZE, \
......
......@@ -2,7 +2,7 @@
#define _ASM_IA64_PTRACE_H
/*
* Copyright (C) 1998-2002 Hewlett-Packard Co
* Copyright (C) 1998-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
* Stephane Eranian <eranian@hpl.hp.com>
*
......@@ -218,6 +218,13 @@ struct switch_stack {
# define ia64_task_regs(t) (((struct pt_regs *) ((char *) (t) + IA64_STK_OFFSET)) - 1)
# define ia64_psr(regs) ((struct ia64_psr *) &(regs)->cr_ipsr)
# define user_mode(regs) (((struct ia64_psr *) &(regs)->cr_ipsr)->cpl != 0)
# define user_stack(task,regs) ((long) regs - (long) task == IA64_STK_OFFSET - sizeof(*regs))
# define fsys_mode(task,regs) \
({ \
struct task_struct *_task = (task); \
struct pt_regs *_regs = (regs); \
!user_mode(_regs) && user_stack(_task, _regs); \
})
struct task_struct; /* forward decl */
......
......@@ -74,6 +74,27 @@ typedef struct {
#define SPIN_LOCK_UNLOCKED (spinlock_t) { 0 }
#define spin_lock_init(x) ((x)->lock = 0)
#define DEBUG_SPIN_LOCK 0
#if DEBUG_SPIN_LOCK
#include <ia64intrin.h>
#define _raw_spin_lock(x) \
do { \
unsigned long _timeout = 1000000000; \
volatile unsigned int _old = 0, _new = 1, *_ptr = &((x)->lock); \
do { \
if (_timeout-- == 0) { \
extern void dump_stack (void); \
printk("kernel DEADLOCK at %s:%d?\n", __FILE__, __LINE__); \
dump_stack(); \
} \
} while (__sync_val_compare_and_swap(_ptr, _old, _new) != _old); \
} while (0)
#else
/*
* Streamlined test_and_set_bit(0, (x)). We use test-and-test-and-set
* rather than a simple xchg to avoid writing the cache-line when
......@@ -95,6 +116,8 @@ typedef struct {
";;\n" \
:: "r"(&(x)->lock) : "ar.ccv", "p7", "r2", "r29", "memory")
#endif /* !DEBUG_SPIN_LOCK */
#define spin_is_locked(x) ((x)->lock != 0)
#define _raw_spin_unlock(x) do { barrier(); ((spinlock_t *) x)->lock = 0; } while (0)
#define _raw_spin_trylock(x) (cmpxchg_acq(&(x)->lock, 0, 1) == 0)
......
......@@ -117,62 +117,51 @@ ia64_insn_group_barrier (void)
*/
/* For spinlocks etc */
/* clearing psr.i is implicitly serialized (visible by next insn) */
/* setting psr.i requires data serialization */
#define __local_irq_save(x) __asm__ __volatile__ ("mov %0=psr;;" \
"rsm psr.i;;" \
: "=r" (x) :: "memory")
#define __local_irq_disable() __asm__ __volatile__ (";; rsm psr.i;;" ::: "memory")
#define __local_irq_restore(x) __asm__ __volatile__ ("cmp.ne p6,p7=%0,r0;;" \
"(p6) ssm psr.i;" \
"(p7) rsm psr.i;;" \
"(p6) srlz.d" \
:: "r" ((x) & IA64_PSR_I) \
: "p6", "p7", "memory")
#ifdef CONFIG_IA64_DEBUG_IRQ
extern unsigned long last_cli_ip;
# define __save_ip() __asm__ ("mov %0=ip" : "=r" (last_cli_ip))
# define local_irq_save(x) \
do { \
unsigned long ip, psr; \
unsigned long psr; \
\
__asm__ __volatile__ ("mov %0=psr;; rsm psr.i;;" : "=r" (psr) :: "memory"); \
if (psr & (1UL << 14)) { \
__asm__ ("mov %0=ip" : "=r"(ip)); \
last_cli_ip = ip; \
} \
__local_irq_save(psr); \
if (psr & IA64_PSR_I) \
__save_ip(); \
(x) = psr; \
} while (0)
# define local_irq_disable() \
do { \
unsigned long ip, psr; \
\
__asm__ __volatile__ ("mov %0=psr;; rsm psr.i;;" : "=r" (psr) :: "memory"); \
if (psr & (1UL << 14)) { \
__asm__ ("mov %0=ip" : "=r"(ip)); \
last_cli_ip = ip; \
} \
} while (0)
# define local_irq_disable() do { unsigned long x; local_irq_save(x); } while (0)
# define local_irq_restore(x) \
do { \
unsigned long ip, old_psr, psr = (x); \
unsigned long old_psr, psr = (x); \
\
__asm__ __volatile__ ("mov %0=psr;" \
"cmp.ne p6,p7=%1,r0;;" \
"(p6) ssm psr.i;" \
"(p7) rsm psr.i;;" \
"(p6) srlz.d" \
: "=r" (old_psr) : "r"((psr) & IA64_PSR_I) \
: "p6", "p7", "memory"); \
if ((old_psr & IA64_PSR_I) && !(psr & IA64_PSR_I)) { \
__asm__ ("mov %0=ip" : "=r"(ip)); \
last_cli_ip = ip; \
} \
local_save_flags(old_psr); \
__local_irq_restore(psr); \
if ((old_psr & IA64_PSR_I) && !(psr & IA64_PSR_I)) \
__save_ip(); \
} while (0)
#else /* !CONFIG_IA64_DEBUG_IRQ */
/* clearing of psr.i is implicitly serialized (visible by next insn) */
# define local_irq_save(x) __asm__ __volatile__ ("mov %0=psr;; rsm psr.i;;" \
: "=r" (x) :: "memory")
# define local_irq_disable() __asm__ __volatile__ (";; rsm psr.i;;" ::: "memory")
/* (potentially) setting psr.i requires data serialization: */
# define local_irq_restore(x) __asm__ __volatile__ ("cmp.ne p6,p7=%0,r0;;" \
"(p6) ssm psr.i;" \
"(p7) rsm psr.i;;" \
"srlz.d" \
:: "r"((x) & IA64_PSR_I) \
: "p6", "p7", "memory")
# define local_irq_save(x) __local_irq_save(x)
# define local_irq_disable() __local_irq_disable()
# define local_irq_restore(x) __local_irq_restore(x)
#endif /* !CONFIG_IA64_DEBUG_IRQ */
#define local_irq_enable() __asm__ __volatile__ (";; ssm psr.i;; srlz.d" ::: "memory")
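A hedged usage sketch of the canonical pairing these macros support:

	unsigned long flags;

	local_irq_save(flags);		/* saves PSR, clears psr.i */
	/* ... interrupt-free critical section ... */
	local_irq_restore(flags);	/* re-enables only if it was enabled before */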
......@@ -216,8 +205,8 @@ extern void ia64_save_extra (struct task_struct *task);
extern void ia64_load_extra (struct task_struct *task);
#ifdef CONFIG_PERFMON
DECLARE_PER_CPU(int, pfm_syst_wide);
# define PERFMON_IS_SYSWIDE() (get_cpu_var(pfm_syst_wide) != 0)
DECLARE_PER_CPU(unsigned long, pfm_syst_info);
# define PERFMON_IS_SYSWIDE() (get_cpu_var(pfm_syst_info) & 0x1)
#else
# define PERFMON_IS_SYSWIDE() (0)
#endif
......
......@@ -47,19 +47,22 @@ local_finish_flush_tlb_mm (struct mm_struct *mm)
static inline void
flush_tlb_mm (struct mm_struct *mm)
{
MMU_TRACE('F', smp_processor_id(), mm, mm->context);
if (!mm)
return;
goto out;
mm->context = 0;
if (atomic_read(&mm->mm_users) == 0)
return; /* happens as a result of exit_mmap() */
goto out; /* happens as a result of exit_mmap() */
#ifdef CONFIG_SMP
smp_flush_tlb_mm(mm);
#else
local_finish_flush_tlb_mm(mm);
#endif
out:
MMU_TRACE('f', smp_processor_id(), mm, mm->context);
}
extern void flush_tlb_range (struct vm_area_struct *vma, unsigned long start, unsigned long end);
......
......@@ -4,7 +4,7 @@
/*
* IA-64 Linux syscall numbers and inline-functions.
*
* Copyright (C) 1998-2002 Hewlett-Packard Co
* Copyright (C) 1998-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
*/
......@@ -223,8 +223,8 @@
#define __NR_sched_setaffinity 1231
#define __NR_sched_getaffinity 1232
#define __NR_set_tid_address 1233
/* #define __NR_alloc_hugepages 1234 reusable */
/* #define __NR_free_hugepages 1235 reusable */
/* 1234 available for reuse */
/* 1235 available for reuse */
#define __NR_exit_group 1236
#define __NR_lookup_dcookie 1237
#define __NR_io_setup 1238
......