- 05 Oct, 2016 4 commits
-
-
Namhyung Kim authored
When it's called with an offset less than or equal to the first event, it'll return a garbage value since the data is not initialized. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20161001101700.29146-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
The MTC packet provides an 8-bit slice of CTC which is related to TSC by the TMA packet, however the TMA packet only provides the lower 16 bits of CTC. If mtc_shift > 8 then some of the MTC bits are not in the CTC provided by the TMA packet. Fix up the last_mtc calculated from the TMA packet by copying the missing bits from the current MTC, assuming the least difference between the two, and that the current MTC comes after last_mtc. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.3+ Link: http://lkml.kernel.org/r/1475062896-22274-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
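The fix-up described above can be sketched as follows. This is an illustrative Python model, not the decoder's C code; the function name `fixup_last_mtc` and its argument order are assumptions made for the example. It copies the high MTC bits that the 16-bit TMA value cannot supply from the current MTC, then steps back by one "missing bit" if that would put last_mtc at or after the current MTC.

```python
def fixup_last_mtc(mtc: int, mtc_shift: int, last_mtc: int) -> int:
    """Return last_mtc with its missing high bits filled in from mtc.

    mtc       -- current 8-bit MTC value
    mtc_shift -- position of the MTC slice within CTC (only matters if > 8)
    last_mtc  -- 8-bit value derived from the 16-bit CTC in the TMA packet
    """
    # Lowest MTC bit that the 16-bit TMA CTC value could not provide.
    first_missing_bit = 1 << (16 - mtc_shift)
    # All MTC bits from first_missing_bit upwards, limited to 8 bits.
    mask = ~(first_missing_bit - 1) & 0xFF

    # Copy the missing bits from the current MTC.
    last_mtc |= mtc & mask

    # The current MTC must come after last_mtc; if it does not, the
    # smallest correction is one step of the first missing bit.
    if last_mtc >= mtc:
        last_mtc = (last_mtc - first_missing_bit) & 0xFF
    return last_mtc
```

With mtc_shift = 10 the TMA packet supplies only the low six MTC bits, so the top two bits are taken from the current MTC and then adjusted if needed.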
-
Adrian Hunter authored
In cycle-accurate mode, timestamps can be calculated from CYC packets. The decoder also estimates timestamps based on the number of instructions since the last timestamp. For that to work in cycle-accurate mode, the instruction count needs to be reset to zero when a timestamp is calculated from a CYC packet, but that wasn't happening, so fix it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.3+ Link: http://lkml.kernel.org/r/1475062896-22274-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ravi Bangoria authored
Perf uretprobe probes on the GEP (Global Entry Point), so it misses function calls that enter via the LEP (Local Entry Point). Fix that by probing on the LEP.

Objdump:

  00000000100005f0 <doit>:
    100005f0: 02 10 40 3c  lis r2,4098
    100005f4: 00 7f 42 38  addi r2,r2,32512
    100005f8: a6 02 08 7c  mflr r0
    100005fc: 10 00 01 f8  std r0,16(r1)
    10000600: f8 ff e1 fb  std r31,-8(r1)

Before applying patch:

  $ cat /sys/kernel/debug/tracing/uprobe_events
  r:probe_uprobe_test/doit /home/ravi/uprobe_test:0x00000000000005f0

After applying patch:

  $ cat /sys/kernel/debug/tracing/uprobe_events
  r:probe_uprobe_test/doit /home/ravi/uprobe_test:0x00000000000005f8

This is not the case with kretprobes because the kernel itself finds the LEP and probes on it.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1475576865-6562-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
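The 8-byte difference between the two probe offsets (0x5f0 before, 0x5f8 after) is the distance from the global to the local entry point. As a rough sketch, and not the perf implementation, the ELFv2 ABI encodes that distance in three bits of the symbol's st_other field. The constant values below are believed to match <elf.h> but should be treated as assumptions, and `local_entry_offset` and `uretprobe_offset` are hypothetical helper names.

```python
# Assumed values of the ELFv2 st_other constants (believed to match <elf.h>).
STO_PPC64_LOCAL_BIT = 5
STO_PPC64_LOCAL_MASK = 0xE0


def local_entry_offset(st_other: int) -> int:
    """Byte distance from the global entry point to the local entry point."""
    bits = (st_other & STO_PPC64_LOCAL_MASK) >> STO_PPC64_LOCAL_BIT
    # Encodings 0 and 1 mean "no separate local entry"; 2 -> 4 bytes,
    # 3 -> 8 bytes, 4 -> 16 bytes, and so on.
    return ((1 << bits) >> 2) << 2


def uretprobe_offset(gep_offset: int, st_other: int) -> int:
    """Offset to probe so that calls through either entry point are seen."""
    return gep_offset + local_entry_offset(st_other)
```

For the `doit` function in the objdump, the two-instruction GEP prologue (lis, addi) is 8 bytes long, which corresponds to an encoded value of 3 in the st_other bits.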
-
- 04 Oct, 2016 23 commits
-
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo-20161003' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes: - Allow vendors to provide JSON files describing PMU events, that then get parsed to generate C tables that are linked against perf, allowing the use of the names in their documentations, such as: # perf list l1d List of pre-defined events (to be used in -e): Cache: l1d.replacement [L1D data line replacements] l1d_pend_miss.fb_full [Cycles a demand request was blocked due to Fill Buffers inavailability] l1d_pend_miss.pending [L1D miss oustandings duration in cycles] l1d_pend_miss.pending_cycles [Cycles with L1D load Misses outstanding] l1d_pend_miss.pending_cycles_any [Cycles with L1D load Misses outstanding from any thread on physical core] l2_trans.l1d_wb [L1D writebacks that access L2 cache] Pipeline: cycle_activity.cycles_l1d_miss [Cycles while L1 cache miss demand load is outstanding] cycle_activity.cycles_l1d_pending [Cycles while L1 cache miss demand load is outstanding] cycle_activity.stalls_l1d_miss [Execution stalls while L1 cache miss demand load is outstanding] cycle_activity.stalls_l1d_pending [Execution stalls while L1 cache miss demand load is outstanding] The above example was done on a Broadwell based ThinkPad t450s after downloading and installing such JSON files which will be added to the tools/perf/pmu-events/ directory in a subsequent patchkit. Now one can use those names with -e/--event in all 'perf tools'. (Andi Kleen, Sukadev Bhattiprolu) - Add a missing pointer dereference in 'perf probe' (Colin Ian King) - Add support for building host programs to be used in generating files to be used in the build process, such as fixdep and jevents, fixing the usage of these features in a cross compilation setup (Jiri Olsa) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull CPU hotplug updates from Thomas Gleixner: "Yet another batch of cpu hotplug core updates and conversions: - Provide core infrastructure for multi instance drivers so the drivers do not have to keep custom lists. - Convert custom lists to the new infrastructure. The block-mq custom list conversion comes through the block tree and makes the diffstat tip over to more lines removed than added. - Handle unbalanced hotplug enable/disable calls more gracefully. - Remove the obsolete CPU_STARTING/DYING notifier support. - Convert another batch of notifier users. The relayfs changes which conflicted with the conversion have been shipped to me by Andrew. The remaining lot is targeted for 4.10 so that we finally can remove the rest of the notifiers" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) cpufreq: Fix up conversion to hotplug state machine blk/mq: Reserve hotplug states for block multiqueue x86/apic/uv: Convert to hotplug state machine s390/mm/pfault: Convert to hotplug state machine mips/loongson/smp: Convert to hotplug state machine mips/octeon/smp: Convert to hotplug state machine fault-injection/cpu: Convert to hotplug state machine padata: Convert to hotplug state machine cpufreq: Convert to hotplug state machine ACPI/processor: Convert to hotplug state machine virtio scsi: Convert to hotplug state machine oprofile/timer: Convert to hotplug state machine block/softirq: Convert to hotplug state machine lib/irq_poll: Convert to hotplug state machine x86/microcode: Convert to hotplug state machine sh/SH-X3 SMP: Convert to hotplug state machine ia64/mca: Convert to hotplug state machine ARM/OMAP/wakeupgen: Convert to hotplug state machine ARM/shmobile: Convert to hotplug state machine arm64/FP/SIMD: Convert to hotplug state machine ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull irq updates from Thomas Gleixner: "The irq department proudly presents: - A rework of the core infrastructure to optimally spread interrupts for multiqueue devices. The first version was a bit naive and failed to take thread siblings and other details into account. Developed in cooperation with Christoph and Keith. - Proper delegation of softirqs to ksoftirqd, so if ksoftirqd is active then no further softirq processing on interrupt return happens. Otherwise we try to delegate and still run another batch of network packets in the irq return path, which then tries to delegate to ksoftirqd ..... - A proper machine-parseable sysfs based alternative for /proc/interrupts. - ACPI support for the GICV3-ITS and ARM interrupt remapping - Two new irq chips from the ARM SoC zoo: STM32-EXTI and MVEBU-PIC - A new irq chip for the JCore (SuperH) - The usual pile of small fixlets in core and irqchip drivers" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) softirq: Let ksoftirqd do its job genirq: Make function __irq_do_set_handler() static ARM/dts: Add EXTI controller node to stm32f429 ARM/STM32: Select external interrupts controller drivers/irqchip: Add STM32 external interrupts support Documentation/dt-bindings: Document STM32 EXTI controller bindings irqchip/mips-gic: Use for_each_set_bit to iterate over local IRQs pci/msi: Retrieve affinity for a vector genirq/affinity: Remove old irq spread infrastructure genirq/msi: Switch to new irq spreading infrastructure genirq/affinity: Provide smarter irq spreading infrastructure genirq/msi: Add cpumask allocation to alloc_msi_entry genirq: Expose interrupt information through sysfs irqchip/gicv3-its: Use MADT ITS subtable to do PCI/MSI domain initialization irqchip/gicv3-its: Factor out PCI-MSI part that might be reused for ACPI irqchip/gicv3-its: Probe ITS in the ACPI way irqchip/gicv3-its: Refactor ITS DT init code to prepare for ACPI irqchip/gicv3-its: Cleanup for ITS domain initialization PCI/MSI: Setup MSI domain on a per-device basis using IORT ACPI table ACPI: Add new IORT functions to support MSI domain handling ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull timer updates from Thomas Gleixner: "A rather smallish set of updates for timers and timekeeping: - Two core fixes to prevent potential undefined behaviour about which gcc is complaining rightfully. - A fix to prevent stopping the tick on a (soon) offline CPU so it can complete the shutdown procedure. - Wait for clocks to stabilize before making decisions, so a not yet validated clock is not rejected. - The usual pile of fixes to the various clocksource drivers. - Core code typo and include fixlets" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Include the correct header for errno definitions clocksource/drivers/ti-32k: Prevent ftrace recursion clocksource/mips-gic-timer: Stop checking cpu_has_counter clocksource/mips-gic-timer: Print an error if IRQ setup fails tick/nohz: Prevent stopping the tick on an offline CPU clocksource/drivers/oxnas: Add OX820 compatible clocksource/drivers/timer-atmel-pit: Simplify IRQ handler clocksource/drivers/timer-atmel-pit: Remove uselesss WARN_ON_ONCE clocksource/drivers/timer-atmel-pit: Drop at91sam926x_pit_common_init clocksource/drivers/moxart: Replace panic by pr_err clocksource/drivers/moxart: Replace setup_irq by request_irq clocksource/drivers/moxart: Add Aspeed support clocksource/drivers/moxart: Use struct to hold state clocksource/drivers/moxart: Refactor enable/disable time: Avoid undefined behaviour in ktime_add_safe() time: Avoid undefined behaviour in timespec64_add_safe() timekeeping: Prints the amounts of time spent during suspend clocksource: Defer override invalidation unless clock is unstable hrtimer: Spelling fixes
-
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Linus Torvalds authored
Pull ARC updates from Vineet Gupta: - ARCv2 support for native 64-bit atomics using LLOCKD/SCONDD instructions - Support for upcoming 3.0 release of HS38 cores - Dwarf unwinder improvements: - enable unwinding of hand written assembler code using CFI pseudo-ops - switch to .eh_frame (as opposed to historic .debug_frame) - get rid of a bunch of ad hoc band-aids in the process - Misc fixes: - perf supporting generic cache-references and cache-misses (Alexey) - default NODE_SHIFT (Noam Camus) - usage of KFLAG instruction to set IE (Yuriy) - Platforms: - Add "model" property across the DT (Alexey) - Enable MODULE_* in defconfigs * tag 'arc-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: [plat*] enables MODULE* ARCv2: fix local_save_flags ARC: CONFIG_NODES_SHIFT fix default values ARCv2: intc: Use kflag if STATUS32.IE must be reset ARC: .exit.* sections can be discarded in .eh_frame regime ARC: dw2 unwind: enable cfi pseudo ops in string lib ARC: dw2 unwind: add infrastructure for adding cfi pseudo ops to asm ARC: entry: make ret_from_system_call local label ARC: dw2 unwind: don't force dwarf 2 ARC: dw2 unwind: switch to .eh_frame based unwinding ARC: dw2 unwind: factor CIE specifics for .eh_frame/.debug_frame ARC: module: support R_ARC_32_PCREL relocation arc: perf: Enable generic "cache-references" and "cache-misses" events ARC: [plat-eznps] add missing atomic_fetch_xxx operations ARCv2: Implement atomic64 based on LLOCKD/SCONDD instructions ARCv2: Support dynamic peripheral address space in HS38 rel 3.0 cores ARCv2: identify HS38 rel 3.0 cores ARCv2: Add support for ZeBu Emulation platform for HS cores arc: Add "model" properly in device tree description of all boards
-
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Linus Torvalds authored
Pull m68k updates from Geert Uytterhoeven: - cleanups - defconfig updates - GPG fingerprint update * tag 'm68k-for-v4.9-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Migrate exception table users off module.h and onto extable.h CREDITS: Update fingerprint for Geert Uytterhoeven m68k: Use IS_ENABLED() instead of checking for built-in or module m68k/defconfig: Update defconfigs for v4.8-rc1
-
Andi Kleen authored
Add support for the "frontend" extra MSR on Skylake in the JSON conversion. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-19-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Andi Kleen authored
The JSON event lists use a different encoding for fixed counters than perf does for instructions and cycles (ref-cycles is OK). This led to some common events like inst_retired.any or cpu_clk_unhalted.thread not counting when specified with their JSON name. Special case these events in the jevents conversion process. I prefer to not touch the JSON files for this, as it's intended that standard JSON files can be just dropped into the perf build without changes. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> [Fix minor compile error] Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-18-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
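The special-casing described above might look roughly like the sketch below. It is written in Python for illustration only (jevents itself is a C program), and the function and table names are hypothetical. The two event names come from the message; the encodings used for them are the architectural event codes for "instructions retired" (0xc0) and "unhalted core cycles" (0x3c), which is an assumption about what the conversion emits.

```python
from typing import Dict, Tuple

# Events that live on fixed counters and need a perf-friendly encoding.
# Names are from the commit message; encodings are assumed (see lead-in).
FIXED_COUNTER_OVERRIDES: Dict[str, str] = {
    "inst_retired.any": "event=0xc0",
    "cpu_clk_unhalted.thread": "event=0x3c",
}


def convert_event(entry: Dict[str, str]) -> Tuple[str, str]:
    """Turn one JSON event entry into (alias name, perf event terms)."""
    name = entry["EventName"].lower()

    # Special case first, so the JSON files themselves stay untouched.
    override = FIXED_COUNTER_OVERRIDES.get(name)
    if override is not None:
        return name, override

    # Generic path: build the terms from the EventCode and UMask fields.
    code = int(entry["EventCode"], 16)
    umask = int(entry["UMask"], 16)
    return name, f"event={code:#x},umask={umask:#x}"
```

Keeping the overrides in the converter rather than in the data matches the stated goal that standard JSON files can be dropped into the build unchanged.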
-
Andi Kleen authored
Make alias matching in the events parser case-insensitive. This is useful with the JSON events. perf uses lower case events, but the CPU manuals generally use upper case event names. The JSON files use lower case by default too. But if we search case insensitively then users can cut-n-paste the upper case event names. So the following works: % perf stat -e BR_INST_EXEC.TAKEN_INDIRECT_NEAR_CALL true Performance counter stats for 'true': 305 BR_INST_EXEC.TAKEN_INDIRECT_NEAR_CALL 0.000492799 seconds time elapsed Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-17-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
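A minimal sketch of case-insensitive alias lookup, assuming aliases are stored under a case-folded key. The `AliasTable` class and the placeholder encoding string are hypothetical and only illustrate the idea; a C implementation would more likely compare names with a case-insensitive string comparison.

```python
from typing import Dict, Optional


class AliasTable:
    """Maps event alias names to encodings, ignoring letter case."""

    def __init__(self) -> None:
        self._aliases: Dict[str, str] = {}

    def add(self, name: str, encoding: str) -> None:
        # Store under a case-folded key so lookups need not match case.
        self._aliases[name.casefold()] = encoding

    def lookup(self, name: str) -> Optional[str]:
        # Fold the user's spelling the same way before searching.
        return self._aliases.get(name.casefold())


table = AliasTable()
table.add("br_inst_exec.taken_indirect_near_call", "<encoding placeholder>")
```

With this, an upper case name pasted from a CPU manual resolves to the same alias as the lower case name from the JSON files.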
-
Sukadev Bhattiprolu authored
This avoids the JSON PMU events parser having to know whether its aliases are for perf stat or perf record. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-20-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Sukadev Bhattiprolu authored
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-16-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Andi Kleen authored
Add support to group the output of perf list by the Topic field in the JSON file. Example output: % perf list ... Cache: l1d.replacement [L1D data line replacements] l1d_pend_miss.pending [L1D miss oustandings duration in cycles] l1d_pend_miss.pending_cycles [Cycles with L1D load Misses outstanding] l2_l1d_wb_rqsts.all [Not rejected writebacks from L1D to L2 cache lines in any state] l2_l1d_wb_rqsts.hit_e [Not rejected writebacks from L1D to L2 cache lines in E state] l2_l1d_wb_rqsts.hit_m [Not rejected writebacks from L1D to L2 cache lines in M state] ... Pipeline: arith.fpu_div [Divide operations executed] arith.fpu_div_active [Cycles when divider is busy executing divide operations] baclears.any [Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end] br_inst_exec.all_branches [Speculative and retired branches] br_inst_exec.all_conditional [Speculative and retired macro-conditional branches] br_inst_exec.all_direct_jmp [Speculative and retired macro-unconditional branches excluding calls and indirects] br_inst_exec.all_direct_near_call [Speculative and retired direct near calls] br_inst_exec.all_indirect_jump_non_call_ret Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-14-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
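Grouping by topic can be sketched as below. This is an illustrative Python model rather than the perf list code; the function name `group_by_topic`, the tuple layout and the exact indentation are assumptions. The sample events are taken from the example output above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, str, str]  # (topic, name, description)


def group_by_topic(events: Iterable[Event]) -> str:
    """Render events under one heading per topic, sorted by topic and name."""
    groups: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for topic, name, desc in events:
        groups[topic].append((name, desc))

    lines: List[str] = []
    for topic in sorted(groups):
        lines.append(f"{topic}:")
        for name, desc in sorted(groups[topic]):
            lines.append(f"  {name}")
            lines.append(f"       [{desc}]")
    return "\n".join(lines)


sample = [
    ("Pipeline", "arith.fpu_div", "Divide operations executed"),
    ("Cache", "l1d.replacement", "L1D data line replacements"),
]
print(group_by_topic(sample))
```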
-
Sukadev Bhattiprolu authored
Previously we were dropping the useful longer descriptions that some events have in the event list completely. This patch makes them appear with perf list. Old perf list: baclears: baclears.all [Counts the number of baclears] vs new: perf list -v: ... baclears: baclears.all [The BACLEARS event counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. The BACLEARS.ANY event counts the number of baclears for any type of branch] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-13-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Sukadev Bhattiprolu authored
Implement support in jevents to parse long descriptions for events that may have them in the JSON files. A follow-on patch will make this long description available to the user through the 'perf list' command. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-11-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Andi Kleen authored
Add a PERF_CPUID variable to override the CPUID of the current CPU (within the current architecture). This is useful for testing, so that all event lists can be tested on a single system. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-10-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
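The override can be pictured with the sketch below, which assumes PERF_CPUID is read from the process environment and takes precedence over detection. The function name `effective_cpuid` and the `detect` callback are hypothetical; the CPUID strings follow the "GenuineIntel-6-4D" form used in the mapfile elsewhere in this series.

```python
import os
from typing import Callable, Mapping


def effective_cpuid(detect: Callable[[], str],
                    environ: Mapping[str, str] = os.environ) -> str:
    """Return the CPUID string to use for selecting an event table.

    A non-empty PERF_CPUID wins over whatever detect() reports, which
    lets one machine exercise the event lists of another CPU model.
    """
    override = environ.get("PERF_CPUID")
    if override:
        return override
    return detect()
```

Passing the environment as a parameter keeps the sketch easy to test without modifying the real process environment.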
-
Andi Kleen authored
Add a --no-desc flag to 'perf list' to not print the event descriptions that were earlier added for JSON events. This may be useful to get a less crowded listing. Printing descriptions remains the default, as that is the more useful behaviour for most users. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1473978296-20712-9-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Andi Kleen authored
Automatically adapt the now wider and word wrapped perf list output to wider terminals. This requires querying the terminal before the auto pager takes over, and exporting this information from the pager subsystem. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-8-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
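As a rough illustration of width-aware wrapping, the sketch below queries the terminal size and wraps a bracketed description under the event name. It uses Python's standard library rather than perf's pager code; `format_event`, the indentation widths and the 80-column fallback are assumptions. The ordering point from the message still applies: the width has to be read while standard output is still the terminal, before a pager replaces it with a pipe.

```python
import shutil
import textwrap
from typing import Optional


def format_event(name: str, desc: str, columns: Optional[int] = None) -> str:
    """Return the event name followed by its description wrapped to fit."""
    if columns is None:
        # Falls back to 80 columns when the size cannot be determined,
        # for example when output is already redirected to a pipe.
        columns = shutil.get_terminal_size(fallback=(80, 24)).columns

    indent = " " * 7
    wrapped = textwrap.wrap(
        f"[{desc}]",
        width=max(columns, 20),          # keep a sane minimum width
        initial_indent=indent,
        subsequent_indent=indent + " ",  # align under the text after "["
    )
    return "\n".join([f"  {name}"] + wrapped)


print(format_event("arith.fpu_div_active",
                   "Cycles when divider is busy executing divide operations",
                   columns=40))
```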
-
Andi Kleen authored
Add support to print alias descriptions in perf list, which are taken from the generated event files. The sorting code is changed to put the events with descriptions at the end. The descriptions are printed as possibly multiple word wrapped lines. Example output: % perf list ... arith.fpu_div [Divide operations executed] arith.fpu_div_active [Cycles when divider is busy executing divide operations] Committer notes: Further testing on a Broadwell machine (ThinkPad t450s), using these files: $ find tools/perf/pmu-events/arch/x86/ tools/perf/pmu-events/arch/x86/ tools/perf/pmu-events/arch/x86/Broadwell tools/perf/pmu-events/arch/x86/Broadwell/Cache.json tools/perf/pmu-events/arch/x86/Broadwell/Other.json tools/perf/pmu-events/arch/x86/Broadwell/Frontend.json tools/perf/pmu-events/arch/x86/Broadwell/Virtual-Memory.json tools/perf/pmu-events/arch/x86/Broadwell/Pipeline.json tools/perf/pmu-events/arch/x86/Broadwell/Floating-point.json tools/perf/pmu-events/arch/x86/Broadwell/Memory.json tools/perf/pmu-events/arch/x86/mapfile.csv $ Taken from: https://github.com/sukadev/linux/tree/json-code+data-v21/tools/perf/pmu-events/arch/x86/ to get this machinery to actually parse JSON files, generate $(OUTPUT)pmu-events/pmu-events.c, compile it and link it with perf, that will then use the table it contains, these files will be submitted right after this patchkit. 
[acme@jouet linux]$ perf list page_walker List of pre-defined events (to be used in -e): page_walker_loads.dtlb_l1 [Number of DTLB page walker hits in the L1+FB] page_walker_loads.dtlb_l2 [Number of DTLB page walker hits in the L2] page_walker_loads.dtlb_l3 [Number of DTLB page walker hits in the L3 + XSNP] page_walker_loads.dtlb_memory [Number of DTLB page walker hits in Memory] page_walker_loads.itlb_l1 [Number of ITLB page walker hits in the L1+FB] page_walker_loads.itlb_l2 [Number of ITLB page walker hits in the L2] page_walker_loads.itlb_l3 [Number of ITLB page walker hits in the L3 + XSNP] [acme@jouet linux]$ Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-7-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Andi Kleen authored
To work with existing mapfiles, assume that the first line in 'mapfile.csv' is a header line and skip over it. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1473978296-20712-15-git-send-email-sukadev@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
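Skipping the header can be sketched as below. This is an illustrative Python reader, not the jevents C code; `read_mapfile` is a hypothetical name and the header text in the sample is an assumed example. The two data rows are the Silvermont entries quoted elsewhere in this series.

```python
import csv
import io
from typing import List


def read_mapfile(text: str) -> List[List[str]]:
    """Parse mapfile.csv content, treating the first line as a header."""
    rows = csv.reader(io.StringIO(text))
    next(rows, None)  # discard the header line unconditionally
    return [row for row in rows if row]  # ignore blank lines


SAMPLE = (
    "Family-model,Version,Filename,EventType\n"  # assumed header wording
    "GenuineIntel-6-4D,V13,Silvermont_core,core\n"
    "GenuineIntel-6-4C,V13,Silvermont_core,core\n"
)
print(read_mapfile(SAMPLE))
```

Because the first line is dropped without inspection, a mapfile that lacks a header would lose its first mapping, which is the trade-off the message accepts in order to work with existing mapfiles.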
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 vdso updates from Ingo Molnar: "The main changes in this cycle centered around adding support for 32-bit compatible C/R of the vDSO on 64-bit kernels, by Dmitry Safonov" * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso: Use CONFIG_X86_X32_ABI to enable vdso prctl x86/vdso: Only define map_vdso_randomized() if CONFIG_X86_64 x86/vdso: Only define prctl_map_vdso() if CONFIG_CHECKPOINT_RESTORE x86/signal: Add SA_{X32,IA32}_ABI sa_flags x86/ptrace: Down with test_thread_flag(TIF_IA32) x86/coredump: Use pr_reg size, rather that TIF_IA32 flag x86/arch_prctl/vdso: Add ARCH_MAP_VDSO_* x86/vdso: Replace calculate_addr in map_vdso() with addr x86/vdso: Unmap vdso blob on vvar mapping failure
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 timer updates from Ingo Molnar: "This tree includes a HPET overhead micro-optimization plus new TSC frequencies for newer Intel CPUs" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tsc: Add additional Intel CPU models to the crystal quirk list x86/tsc: Use cpu id defines instead of hex constants x86/hpet: Reduce HPET counter read contention
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 platform changes from Ingo Molnar: "The main changes in this cycle were: - SGI UV updates (Andrew Banman) - Intel MID updates (Andy Shevchenko) - Initial Mellanox systems platform (Vadim Pasternak)" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/mellanox: Fix return value check in mlxplat_init() x86/platform/mellanox: Introduce support for Mellanox systems platform x86/platform/uv/BAU: Add UV4-specific functions x86/platform/uv/BAU: Fix payload queue setup on UV4 hardware x86/platform/uv/BAU: Disable software timeout on UV4 hardware x86/platform/uv/BAU: Populate ->uvhub_version with UV4 version information x86/platform/uv/BAU: Use generic function pointers x86/platform/uv/BAU: Add generic function pointers x86/platform/uv/BAU: Convert uv_physnodeaddr() use to uv_gpa_to_offset() x86/platform/uv/BAU: Clean up pq_init() x86/platform/uv/BAU: Clean up and update printks x86/platform/uv/BAU: Clean up vertical alignment x86/platform/intel-mid: Keep SRAM powered on at boot x86/platform/intel-mid: Add Intel Penwell to ID table x86/cpu: Rename Merrifield2 to Moorefield x86/platform/intel-mid: Implement power off sequence x86/platform/intel-mid: Enable SD card detection on Merrifield x86/platform/intel-mid: Enable WiFi on Intel Edison x86/platform/intel-mid: Run PWRMU command immediately
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 cleanups from Ingo Molnar: "Header file and a wrapper functions cleanup" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Migrate exception table users off module.h and onto extable.h x86: Clean up various simple wrapper functions
-
- 03 Oct, 2016 13 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 boot updates from Ingo Molnar: "The changes in this cycle were: - Save e820 table RAM footprint on larger kernel configurations. (Denys Vlasenko) - pmem related fixes (Dan Williams) - theoretical e820 boundary condition fix (Wei Yang)" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation x86/e820: Use much less memory for e820/e820_saved, save up to 120k x86/e820: Prepare e280 code for switch to dynamic storage x86/e820: Mark some static functions __init x86/e820: Fix very large 'size' handling boundary condition
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull low-level x86 updates from Ingo Molnar: "In this cycle this topic tree has become one of those 'super topics' that accumulated a lot of changes: - Add CONFIG_VMAP_STACK=y support to the core kernel and enable it on x86 - preceded by an array of changes. v4.8 saw preparatory changes in this area already - this is the rest of the work. Includes the thread stack caching performance optimization. (Andy Lutomirski) - switch_to() cleanups and all around enhancements. (Brian Gerst) - A large number of dumpstack infrastructure enhancements and an unwinder abstraction. The secret long term plan is safe(r) live patching plus maybe another attempt at debuginfo based unwinding - but all these current bits are standalone enhancements in a frame pointer based debug environment as well. (Josh Poimboeuf) - More __ro_after_init and const annotations. (Kees Cook) - Enable KASLR for the vmemmap memory region. (Thomas Garnier)" [ The virtually mapped stack changes are pretty fundamental, and not x86-specific per se, even if they are only used on x86 right now. ] * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits) x86/asm: Get rid of __read_cr4_safe() thread_info: Use unsigned long for flags x86/alternatives: Add stack frame dependency to alternative_call_2() x86/dumpstack: Fix show_stack() task pointer regression x86/dumpstack: Remove dump_trace() and related callbacks x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder oprofile/x86: Convert x86_backtrace() to use the new unwinder x86/stacktrace: Convert save_stack_trace_*() to use the new unwinder perf/x86: Convert perf_callchain_kernel() to use the new unwinder x86/unwind: Add new unwind interface and implementations x86/dumpstack: Remove NULL task pointer convention fork: Optimize task creation by caching two thread stacks per CPU if CONFIG_VMAP_STACK=y sched/core: Free the stack early if CONFIG_THREAD_INFO_IN_TASK lib/syscall: Pin the task stack in collect_syscall() x86/process: Pin the target stack in get_wchan() x86/dumpstack: Pin the target stack when dumping it kthread: Pin the stack via try_get_task_stack()/put_task_stack() in to_live_kthread() function sched/core: Add try_get_task_stack() and put_task_stack() x86/entry/64: Fix a minor comment rebase error iommu/amd: Don't put completion-wait semaphore on stack ...
-
Andi Kleen authored
Implement the code to match CPU types to mapfile types for x86 based on CPUID. This extends an existing similar function, but changes it to use the x86 mapfile CPU description. This allows resolving event lists generated by jevents. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-6-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
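Matching a CPU to a mapfile entry could be sketched like this, assuming the CPU description is a "vendor-family-model" string with the model in upper case hexadecimal (as in "GenuineIntel-6-4D") and that matching is an exact string comparison. The helper names are hypothetical, and the real code obtains vendor, family and model from the CPUID instruction rather than from arguments.

```python
from typing import Iterable, Optional, Sequence


def cpuid_string(vendor: str, family: int, model: int) -> str:
    """Build a mapfile-style CPU description, e.g. 'GenuineIntel-6-4D'."""
    return f"{vendor}-{family}-{model:X}"


def find_events_dir(cpuid: str,
                    mapfile_rows: Iterable[Sequence[str]]) -> Optional[str]:
    """Return the JSON directory name mapped to cpuid, or None if unknown."""
    for row_cpuid, _version, directory, _event_type in mapfile_rows:
        if row_cpuid == cpuid:
            return directory
    return None


ROWS = [
    ("GenuineIntel-6-4D", "V13", "Silvermont_core", "core"),
    ("GenuineIntel-6-4C", "V13", "Silvermont_core", "core"),
]
print(find_events_dir(cpuid_string("GenuineIntel", 6, 0x4D), ROWS))
```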
-
Sukadev Bhattiprolu authored
Implement code that returns the generic CPU ID string for Powerpc. This will be used to identify the specific table of PMU events to parse/compare user specified events against. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-5-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Sukadev Bhattiprolu authored
At run time (when 'perf' is starting up), locate the specific table of PMU events that corresponds to the current CPU. Using that table, create aliases for each of the PMU events in the CPU. Then use these aliases to parse the user specified perf event. In short this would allow the user to specify events using their aliases rather than raw event codes. Based on input and some earlier patches from Andi Kleen, Jiri Olsa. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-4-git-send-email-sukadev@linux.vnet.ibm.com [ Make pmu_add_cpu_aliases() return void, since it was returning just '0' and furthermore, even that was being discarded via an explicit (void) cast ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
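The lookup described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the perf implementation: the struct layouts, the example_map[] contents, the event encoding string, and the function names find_events_table() and resolve_alias() are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layouts for illustration; the real perf structures may differ. */
struct pmu_event {
    const char *name;   /* alias the user types */
    const char *event;  /* raw event encoding the alias expands to */
    const char *desc;
};

struct pmu_events_map {
    const char *cpuid;               /* key identifying a CPU model */
    const struct pmu_event *table;   /* events for that model */
};

/* Illustrative data only; both tables end with a NULL sentinel entry. */
static const struct pmu_event silvermont_events[] = {
    { "INST_RETIRED.ANY", "event=0x00,umask=0x01",
      "Instructions retired from execution." },
    { NULL, NULL, NULL },
};

static const struct pmu_events_map example_map[] = {
    { "GenuineIntel-6-4D", silvermont_events },
    { "GenuineIntel-6-4C", silvermont_events },
    { NULL, NULL },
};

/* Step 1: locate the event table that matches the current CPU. */
const struct pmu_event *find_events_table(const struct pmu_events_map *map,
                                          const char *cpuid)
{
    for (; map->cpuid != NULL; map++) {
        if (strcmp(map->cpuid, cpuid) == 0)
            return map->table;
    }
    return NULL;
}

/* Step 2: expand a user supplied alias into its raw event encoding. */
const char *resolve_alias(const struct pmu_event *table, const char *name)
{
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0)
            return table->event;
    }
    return NULL;
}
```

Two CPU keys pointing at one table mirrors the point made in the jevents commit that several CPU models can share a single set of event files. An unknown CPU or alias simply returns NULL, so the caller can fall back to raw event codes.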
-
Andi Kleen authored
This is a modified version of an earlier patch by Andi Kleen.

We expect architectures to create JSON files describing the performance monitoring (PMU) events that each CPU model/family of the architecture supports.

Following is an example of the JSON file entry for an x86 event:

    [
    ...
    {
    "EventCode": "0x00",
    "UMask": "0x01",
    "EventName": "INST_RETIRED.ANY",
    "BriefDescription": "Instructions retired from execution.",
    "PublicDescription": "Instructions retired from execution.",
    "Counter": "Fixed counter 1",
    "CounterHTOff": "Fixed counter 1",
    "SampleAfterValue": "2000003",
    "MSRIndex": "0",
    "MSRValue": "0",
    "TakenAlone": "0",
    "CounterMask": "0",
    "Invert": "0",
    "AnyThread": "0",
    "EdgeDetect": "0",
    "PEBS": "0",
    "PRECISE_STORE": "0",
    "Errata": "null",
    "Offcore": "0"
    },
    ...
    ]

All the PMU events supported by a CPU model/family must be grouped into "topics" such as "Pipelining", "Floating-point", "Virtual-memory" etc.

All events belonging to a topic must be placed in a separate JSON file (eg: "Pipelining.json") and all the topic JSON files for a CPU model must be in a separate directory.

Eg: for the CPU model "Silvermont_core":

    $ ls tools/perf/pmu-events/arch/x86/Silvermont_core
    Floating-point.json Memory.json Other.json Pipelining.json Virtualmemory.json

Finally, to allow multiple CPU models to share a single set of JSON files, architectures must provide a mapping between a model and its set of events:

    $ grep Silvermont tools/perf/pmu-events/arch/x86/mapfile.csv
    GenuineIntel-6-4D,V13,Silvermont_core,core
    GenuineIntel-6-4C,V13,Silvermont_core,core

which maps each CPU, identified by [vendor, family, model, version, type] to a directory of JSON files. Thus two (or more) CPU models support the set of PMU events listed in the directory:

    tools/perf/pmu-events/arch/x86/Silvermont_core/

Given this organization of files, the program, jevents:

    - locates all JSON files for each CPU-model of the architecture,
    - parses all JSON files for the CPU-model and generates a C-style "PMU-events table" (pmu-events.c) for the model,
    - locates a mapfile for the architecture,
    - builds a global table, mapping each model of CPU to the corresponding PMU-events table.

The 'pmu-events.c' is generated when building perf and added to libperf.a. The global pmu_events_map[] table in this pmu-events.c will be used in perf in a follow-on patch.

If the architecture does not have any JSON files or there is an error in processing them, an empty mapping file is created. This would allow the build of perf to proceed even if we are not able to provide aliases for events.

The parser for JSON files allows parsing Intel style JSON event files. This allows using an Intel event list directly with perf. The Intel event lists can be quite large and are too big to store in unswappable kernel memory.

The conversion from JSON to C-style is straightforward. The parser knows (very little) Intel specific information, and can be easily extended to handle fields for other CPUs.

The parser code is partially shared with an independent parsing library, which is 2-clause BSD licensed. To avoid any conflicts I marked those files as BSD licensed too. As part of perf they become GPLv2.

Committer notes:

Fixes:

1) Limit maxfds to 512 to avoid nftw() segfaulting on alloca() with a big rlim_max, as in docker containers - acme
2) Make jevents a hostprog, supporting cross compilation - jolsa
3) Use HOSTCC for jevents final step - acme
4) Define _GNU_SOURCE for asprintf, as we can't use CC's EXTRA_CFLAGS, that has to have --sysroot on the Android NDK 24 - acme
5) Removed $(srctree)/tools/perf/pmu-events/pmu-events.c from the 'clean' target, it is generated on $(OUTPUT)pmu-events/pmu-events.c, which is already taken care of in the original patch - acme

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-3-git-send-email-sukadev@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20160927141846.GA6589@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
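As a rough illustration of the mapfile step in this commit, the sketch below splits one mapfile.csv line into its four comma-separated fields. It is not the jevents parser: struct mapfile_entry, its field names and sizes, and parse_mapfile_line() are hypothetical, and the column meanings (CPU key, version, directory, type) are assumed from the two sample lines quoted in the commit message.

```c
#include <stdio.h>

/* Hypothetical record for one mapfile line; field sizes are arbitrary. */
struct mapfile_entry {
    char cpuid[64];    /* e.g. "GenuineIntel-6-4D" */
    char version[16];  /* e.g. "V13" */
    char dir[64];      /* directory holding the JSON files */
    char type[16];     /* e.g. "core" */
};

/*
 * Split "cpuid,version,dir,type" into its fields.
 * Returns 0 on success, -1 if the line does not have four fields.
 */
int parse_mapfile_line(const char *line, struct mapfile_entry *e)
{
    /* Each %N[^,] reads up to N characters that are not a comma. */
    int n = sscanf(line, "%63[^,],%15[^,],%63[^,],%15[^,\r\n]",
                   e->cpuid, e->version, e->dir, e->type);

    return n == 4 ? 0 : -1;
}
```

A table builder could call this once per line and then associate the cpuid field with the events generated from the JSON files under the dir field. Lines that do not match the four-field shape are rejected rather than partially used.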
-
Linus Torvalds authored
Pull x86 apic updates from Ingo Molnar: "The main changes are: - Persistent CPU/node numbering across CPU hotplug/unplug events. This is a pretty involved series of changes that first fetches all the information during bootup and then uses it for the various hotplug/unplug methods. (Gu Zheng, Dou Liyang) - IO-APIC hot-add/remove fixes and enhancements. (Rui Wang) - ... various fixes, cleanups and enhancements" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) x86/apic: Fix silent & fatal merge conflict in __generic_processor_info() acpi: Fix broken error check in map_processor() acpi: Validate processor id when mapping the processor acpi: Provide mechanism to validate processors in the ACPI tables x86/acpi: Set persistent cpuid <-> nodeid mapping when booting x86/acpi: Enable MADT APIs to return disabled apicids x86/acpi: Introduce persistent storage for cpuid <-> apicid mapping x86/acpi: Enable acpi to register all possible cpus at boot time x86/numa: Online memory-less nodes at boot time x86/apic: Get rid of apic_version[] array x86/apic: Order irq_enter/exit() calls correctly vs. ack_APIC_irq() x86/ioapic: Ignore root bridges without a companion ACPI device x86/apic: Update comment about disabling processor focus x86/smpboot: Check APIC ID before setting up default routing x86/ioapic: Fix IOAPIC failing to request resource x86/ioapic: Fix lost IOAPIC resource after hot-removal and hotadd x86/ioapic: Fix setup_res() failing to get resource x86/ioapic: Support hot-removal of IOAPICs present during boot x86/ioapic: Change prototype of acpi_ioapic_add() x86/apic, ACPI: Fix incorrect assignment when handling apic/x2apic entries ...
-
Linus Torvalds authored
Pull scheduler changes from Ingo Molnar: "The main changes are: - irqtime accounting cleanups and enhancements. (Frederic Weisbecker) - schedstat debugging enhancements, make it more broadly runtime available. (Josh Poimboeuf) - More work on asymmetric topology/capacity scheduling. (Morten Rasmussen) - sched/wait fixes and cleanups. (Oleg Nesterov) - PELT (per entity load tracking) improvements. (Peter Zijlstra) - Rewrite and enhance select_idle_siblings(). (Peter Zijlstra) - sched/numa enhancements/fixes (Rik van Riel) - sched/cputime scalability improvements (Stanislaw Gruszka) - Load calculation arithmetics fixes. (Dietmar Eggemann) - sched/deadline enhancements (Tommaso Cucinotta) - Fix utilization accounting when switching to the SCHED_NORMAL policy. (Vincent Guittot) - ... plus misc cleanups and enhancements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) sched/irqtime: Consolidate irqtime flushing code sched/irqtime: Consolidate accounting synchronization with u64_stats API u64_stats: Introduce IRQs disabled helpers sched/irqtime: Remove needless IRQs disablement on kcpustat update sched/irqtime: No need for preempt-safe accessors sched/fair: Fix min_vruntime tracking sched/debug: Add SCHED_WARN_ON() sched/core: Fix set_user_nice() sched/fair: Introduce set_curr_task() helper sched/core, ia64: Rename set_curr_task() sched/core: Fix incorrect utilization accounting when switching to fair class sched/core: Optimize SCHED_SMT sched/core: Rewrite and improve select_idle_siblings() sched/core: Replace sd_busy/nr_busy_cpus with sched_domain_shared sched/core: Introduce 'struct sched_domain_shared' sched/core: Restructure destroy_sched_domain() sched/core: Remove unused @cpu argument from destroy_sched_domain*() sched/wait: Introduce init_wait_entry() sched/wait: Avoid abort_exclusive_wait() in __wait_on_bit_lock() sched/wait: Avoid abort_exclusive_wait() in ___wait_event() ...
-
Linus Torvalds authored
Pull RAS updates from Ingo Molnar: "The main changes were: - Lots of enhancements for AMD SMCA (Scalable MCA features/extensions) systems: extract, decode and print more hardware error information and add matching support on the injection/testing side as well. (Yazen Ghannam) - Various MCE handling improvements on modern Intel Xeons. (Tony Luck) - Plus misc fixes and enhancements" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/RAS/mce_amd_inj: Remove debugfs dir recursively on exit x86/RAS/mce_amd_inj: Fix signed wrap around when decrementing index 'i' x86/RAS/mce_amd_inj: Fix some W= warnings x86/MCE/AMD, EDAC: Handle reserved bank 4 on Fam17h properly x86/mce/AMD: Extract the error address on SMCA systems x86/mce, EDAC/mce_amd: Print MCA_SYND and MCA_IPID during MCE on SMCA systems x86/mce/AMD: Save MCA_IPID in MCE struct on SMCA systems x86/mce/AMD: Ensure the deferred error interrupt is of type APIC on SMCA systems x86/mce/AMD: Update sysfs bank names for SMCA systems x86/mce/AMD, EDAC/mce_amd: Define and use tables for known SMCA IP types EDAC/mce_amd: Use SMCA prefix for error descriptions arrays EDAC/mce_amd: Add missing SMCA error descriptions x86/mce/AMD: Read MSRs on the CPU allocating the threshold blocks x86/RAS: Add syndrome support to mce_amd_inj EDAC/mce_amd: Print syndrome register value on SMCA systems x86/mce: Add support for new MCA_SYND register x86/mce/AMD: Use msr_ops.misc() in allocate_threshold_blocks() x86/mce: Drop X86_FEATURE_MCE_RECOVERY and the related model string test x86/mce: Improve memcpy_mcsafe() x86/mce: Add PCI quirks to identify Xeons with machine check recovery ...
-
Linus Torvalds authored
Pull perf updates from Ingo Molnar:

"The main kernel side changes were:

 - uprobes enhancements (Masami Hiramatsu)
 - Uncore group events enhancements (David Carrillo-Cisneros)
 - x86 Intel: Add support for Skylake server uncore PMUs (Kan Liang)
 - x86 Intel: LBR cleanups and enhancements, for better branch annotation tracking (Peter Zijlstra)
 - x86 Intel: Add support for PTWRITE and power event tracing (Alexander Shishkin)
 - ... various fixes, cleanups and smaller enhancements.

Lots of tooling changes - a couple of highlights:

 - Support event group view with hierarchy mode in 'perf top' and 'perf report' (Namhyung Kim) e.g.:

     $ perf record -e '{cycles,instructions}' make
     $ perf report --hierarchy --stdio
     ...
     # Overhead                Command / Shared Object / Symbol
     # ......................  ..................................
     ...
         25.74%  27.18%        sh
            19.96%  24.14%        libc-2.24.so
                9.55%  14.64%        [.] __strcmp_sse2
                1.54%   0.00%        [.] __tfind
                1.07%   1.13%        [.] _int_malloc
                0.95%   0.00%        [.] __strchr_sse2
                0.89%   1.39%        [.] __tsearch
                0.76%   0.00%        [.] strlen

 - Add branch stack / basic block info to 'perf annotate --stdio', where for each branch, we add an asm comment after the instruction with information on how often it was taken and predicted. See example with color output at: http://vger.kernel.org/~acme/perf/annotate_basic_blocks.png (Peter Zijlstra)
 - Add support for using symbols in address filters with Intel PT and ARM CoreSight (hardware assisted tracing facilities) (Adrian Hunter, Mathieu Poirier)
 - Add support for interacting with Coresight PMU ETMs/PTMs, that are IP blocks to perform hardware assisted tracing on an ARM CPU core (Mathieu Poirier)
 - Support generating cross arch probes, i.e. if you specify a vmlinux file for a different arch than the one in the host machine,

     $ perf probe --definition function_name args

   will generate the probe definition string needed to append to the target machine's /sys/kernel/debug/tracing/kprobe_events file, using scripting (Masami Hiramatsu).
 - Allow configuring the default 'perf report -s' sort order in ~/.perfconfig, for instance, "sym,dso" may be more fitting for kernel developers. (Arnaldo Carvalho de Melo)
 - ... plus lots of other changes, refactorings, features and fixes"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (149 commits)
  perf tests: Add dwarf unwind test for powerpc
  perf probe: Match linkage name with mangled name
  perf probe: Fix to cut off incompatible chars from group name
  perf probe: Skip if the function address is 0
  perf probe: Ignore the error of finding inline instance
  perf intel-pt: Fix decoding when there are address filters
  perf intel-pt: Enable decoder to handle TIP.PGD with missing IP
  perf intel-pt: Read address filter from AUXTRACE_INFO event
  perf intel-pt: Record address filter in AUXTRACE_INFO event
  perf intel-pt: Add a helper function for processing AUXTRACE_INFO
  perf intel-pt: Fix missing error codes processing auxtrace_info
  perf intel-pt: Add support for recording the max non-turbo ratio
  perf intel-pt: Fix snapshot overlap detection decoder errors
  perf probe: Increase debug level of SDT debug messages
  perf record: Add support for using symbols in address filters
  perf symbols: Add dso__last_symbol()
  perf record: Fix error paths
  perf record: Rename label 'out_symbol_exit'
  perf script: Fix vanished idle symbols
  perf evsel: Add support for address filters
  ...
-
Linus Torvalds authored
Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - rwsem micro-optimizations (Davidlohr Bueso) - Improve the implementation and optimize the performance of percpu-rwsems. (Peter Zijlstra.) - Convert all lglock users to better facilities such as percpu-rwsems or percpu-spinlocks and remove lglocks. (Peter Zijlstra) - Remove the ticket (spin)lock implementation. (Peter Zijlstra) - Korean translation of memory-barriers.txt and related fixes to the English document. (SeongJae Park) - misc fixes and cleanups" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/cmpxchg, locking/atomics: Remove superfluous definitions x86, locking/spinlocks: Remove ticket (spin)lock implementation locking/lglock: Remove lglock implementation stop_machine: Remove stop_cpus_lock and lg_double_lock/unlock() fs/locks: Use percpu_down_read_preempt_disable() locking/percpu-rwsem: Add down_read_preempt_disable() fs/locks: Replace lg_local with a per-cpu spinlock fs/locks: Replace lg_global with a percpu-rwsem locking/percpu-rwsem: Add DEFINE_STATIC_PERCPU_RWSEM and percpu_rwsem_assert_held() locking/pv-qspinlock: Use cmpxchg_release() in __pv_queued_spin_unlock() locking/rwsem, x86: Drop a bogus cc clobber futex: Add some more function commentry locking/hung_task: Show all locks locking/rwsem: Scan the wait_list for readers only once locking/rwsem: Remove a few useless comments locking/rwsem: Return void in __rwsem_mark_wake() locking, rcu, cgroup: Avoid synchronize_sched() in __cgroup_procs_write() locking/Documentation: Add Korean translation locking/Documentation: Fix a typo of example result locking/Documentation: Fix wrong section reference ...
-
Linus Torvalds authored
Pull EFI updates from Ingo Molnar: "Main changes in this cycle were: - Refactor the EFI memory map code into architecture neutral files and allow drivers to permanently reserve EFI boot services regions on x86, as well as ARM/arm64. (Matt Fleming) - Add ARM support for the EFI ESRT driver. (Ard Biesheuvel) - Make the EFI runtime services and efivar API interruptible by swapping spinlocks for semaphores. (Sylvain Chouleur) - Provide the EFI identity mapping for kexec which allows kexec to work on SGI/UV platforms without requiring the "noefi" kernel command line parameter. (Alex Thorlton) - Add debugfs node to dump EFI page tables on arm64. (Ard Biesheuvel) - Merge the EFI test driver being carried out of tree until now in the FWTS project. (Ivan Hu) - Expand the list of flags for classifying EFI regions as "RAM" on arm64 so we align with the UEFI spec. (Ard Biesheuvel) - Optimise out the EFI mixed mode if it's unsupported (CONFIG_X86_32) or disabled (CONFIG_EFI_MIXED=n) and switch the early EFI boot services function table for direct calls, alleviating us from having to maintain the custom function table. 
(Lukas Wunner) - Miscellaneous cleanups and fixes" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) x86/efi: Round EFI memmap reservations to EFI_PAGE_SIZE x86/efi: Allow invocation of arbitrary boot services x86/efi: Optimize away setup_gop32/64 if unused x86/efi: Use kmalloc_array() in efi_call_phys_prolog() efi/arm64: Treat regions with WT/WC set but WB cleared as memory efi: Add efi_test driver for exporting UEFI runtime service interfaces x86/efi: Defer efi_esrt_init until after memblock_x86_fill efi/arm64: Add debugfs node to dump UEFI runtime page tables x86/efi: Remove unused find_bits() function fs/efivarfs: Fix double kfree() in error path x86/efi: Map in physical addresses in efi_map_region_fixed lib/ucs2_string: Speed up ucs2_utf8size() firmware-gsmi: Delete an unnecessary check before the function call "dma_pool_destroy" x86/efi: Initialize status to ensure garbage is not returned on small size efi: Replace runtime services spinlock with semaphore efi: Don't use spinlocks for efi vars efi: Use a file local lock for efivars efi/arm*: esrt: Add missing call to efi_esrt_init() efi/esrt: Use memremap not ioremap to access ESRT table in memory x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data ...
-
Linus Torvalds authored
Pull core SMP updates from Ingo Molnar: "The two main changes are generic vCPU pinning and physical CPU SMP-call support, for Xen to be able to perform certain calls on specific physical CPUs - by Juergen Gross" * 'core-smp-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp: Allocate smp_call_on_cpu() workqueue on stack too hwmon: Use smp_call_on_cpu() for dell-smm i8k dcdbas: Make use of smp_call_on_cpu() xen: Add xen_pin_vcpu() to support calling functions on a dedicated pCPU smp: Add function to execute a function synchronously on a CPU virt, sched: Add generic vCPU pinning support xen: Sync xen header
-