1. 23 May, 2016 6 commits
  2. 20 May, 2016 15 commits
  3. 19 May, 2016 1 commit
  4. 18 May, 2016 1 commit
  5. 17 May, 2016 12 commits
    • perf tools: Separate accounting of contexts and real addresses in a stack trace · a29d5c9b
      Arnaldo Carvalho de Melo authored
      The perf_sample->ip_callchain->nr value includes all the entries in the
      ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
      while what the user expects is that what is in the kernel.perf_event_max_stack
      sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
      honoured in terms of IP addresses in the stack trace.
      
      So match the kernel support and validate chain->nr taking into account
      both kernel.perf_event_max_stack and kernel.perf_event_max_contexts_per_stack.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-mgx0jpzfdq4uq4abfa40byu0@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
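The tool-side check described in this commit can be modelled roughly as follows. This is a minimal sketch in Python, not the C code in tools/perf: the function names and the small limits used in the example are hypothetical, and only the PERF_CONTEXT_* marker values are taken from the perf UAPI header.

```python
# Illustrative model only; the real check is C code inside tools/perf.
# Context markers are very large u64 values, (u64)-N in the UAPI header.
PERF_CONTEXT_MAX = 2**64 - 4095     # anything >= this is a context marker
PERF_CONTEXT_KERNEL = 2**64 - 128
PERF_CONTEXT_USER = 2**64 - 512


def is_context(ip):
    """True if this ip[] entry is a PERF_CONTEXT_* marker, not an address."""
    return ip >= PERF_CONTEXT_MAX


def validate_chain(ips, max_stack, max_contexts_per_stack):
    """Validate chain->nr against both limits (hypothetical helper).

    Returns (real_addresses, contexts); raises ValueError if the chain is
    larger than the two limits combined allow.
    """
    if len(ips) > max_stack + max_contexts_per_stack:
        raise ValueError("callchain has more entries than both limits allow")
    contexts = sum(1 for ip in ips if is_context(ip))
    return len(ips) - contexts, contexts


chain = [PERF_CONTEXT_KERNEL, 0xffffffff81000000, 0xffffffff81000010,
         PERF_CONTEXT_USER, 0x400000]
print(validate_chain(chain, max_stack=3, max_contexts_per_stack=2))
```

The point of the sketch is that the total entry count is checked against the sum of the two knobs, while the address count is what the user-visible max stack refers to.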
    • perf core: Separate accounting of contexts and real addresses in a stack trace · c85b0334
      Arnaldo Carvalho de Melo authored
      The perf_sample->ip_callchain->nr value includes all the entries in the
      ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
      while what the user expects is that what is in the kernel.perf_event_max_stack
      sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
      honoured in terms of IP addresses in the stack trace.
      
      So allocate a bunch of extra entries for contexts, and do the accounting
      via perf_callchain_entry_ctx struct members.
      
      A new sysctl, kernel.perf_event_max_contexts_per_stack is also
      introduced for investigating possible bugs in the callchain
      implementation by some arch.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
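The separate accounting described here can be sketched as a small model. This is Python, not kernel code; the class and attribute names are hypothetical and only loosely follow the perf_callchain_entry_ctx idea of counting contexts and real addresses independently while both still land in the same ip[] array.

```python
class CallchainCtx:
    """Hypothetical model of a callchain entry plus its accounting context."""

    def __init__(self, max_stack, max_contexts):
        self.ip = []                  # models entry->ip[]; len() models entry->nr
        self.max_stack = max_stack    # limit on real addresses
        self.max_contexts = max_contexts  # limit on PERF_CONTEXT_* markers
        self.nr = 0                   # real addresses stored so far
        self.contexts = 0             # context markers stored so far
        self.contexts_maxed = False   # assumption: stop once contexts overflow

    def store_context(self, marker):
        """Store a context marker; it does not count against max_stack."""
        if self.contexts < self.max_contexts:
            self.ip.append(marker)
            self.contexts += 1
            return True
        self.contexts_maxed = True
        return False

    def store(self, addr):
        """Store a real address; only these count against max_stack."""
        if self.nr < self.max_stack and not self.contexts_maxed:
            self.ip.append(addr)
            self.nr += 1
            return True
        return False


ctx = CallchainCtx(max_stack=2, max_contexts=1)
ctx.store_context("KERNEL")      # marker, counted in ctx.contexts
ctx.store(0x1000)                # address 1
ctx.store(0x2000)                # address 2
print(ctx.store(0x3000))         # False: max_stack addresses already stored
print(len(ctx.ip), ctx.nr, ctx.contexts)
```

With this split, the total number of entries can exceed max_stack by up to max_contexts, which is why extra room is reserved for the context markers.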
    • perf core: Add perf_callchain_store_context() helper · 3e4de4ec
      Arnaldo Carvalho de Melo authored
      We need to have different helpers to account for how many contexts we
      have in the sample and for real addresses, so do it now as a prep
      patch, to ease review.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-q964tnyuqrxw5gld18vizs3c@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf core: Add a 'nr' field to perf_event_callchain_context · 3b1fff08
      Arnaldo Carvalho de Melo authored
      We will use it to count how many addresses are in the entry->ip[] array,
      excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really
      return the number of entries specified by the user via the relevant
      sysctl, kernel.perf_event_max_stack, or via the per event
      perf_event_attr.sample_max_stack knob.
      
      This way we keep the perf_sample->ip_callchain->nr meaning, that is the
      number of entries, be it real addresses or PERF_CONTEXT_ entries, while
      honouring the max_stack knobs, i.e. the end result will be max_stack
      entries if we have at least that many entries in a given stack trace.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf core: Pass max stack as a perf_callchain_entry context · cfbcf468
      Arnaldo Carvalho de Melo authored
      This makes perf_callchain_{user,kernel}() receive the max stack
      as context for the perf_callchain_entry, instead of accessing
      the global sysctl_perf_event_max_stack.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf core: Generalize max_stack sysctl handler · a831100a
      Arnaldo Carvalho de Melo authored
      So that it can be used for other stack related knobs, such as the
      upcoming one to tweak the max number of contexts per stack sample.
      
      In all those cases we can only change the value if there are no perf
      sessions collecting stacks, so they need to grab that mutex, etc.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-8t3fk94wuzp8m2z1n4gc0s17@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
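The rule stated above (a stack knob may only change while no perf session is collecting stacks, and the check happens under a mutex) can be sketched generically. This is a Python model with hypothetical names and illustrative initial values, not the kernel's sysctl handler.

```python
import threading


class StackKnobs:
    """Hypothetical model of one handler shared by several stack knobs."""

    def __init__(self):
        self._lock = threading.Lock()        # stands in for the callchain mutex
        self.active_callchain_events = 0     # sessions currently collecting stacks
        # Illustrative initial values, chosen for the example only.
        self.values = {"max_stack": 127, "max_contexts_per_stack": 8}

    def write(self, name, new_value):
        """Try to update a knob; returns False (busy) if stacks are in use."""
        if name not in self.values or new_value < 0:
            raise ValueError("unknown knob or negative value")
        with self._lock:
            if self.active_callchain_events:
                return False                 # the kernel would report "busy"
            self.values[name] = new_value
            return True


knobs = StackKnobs()
print(knobs.write("max_contexts_per_stack", 4))   # True: nothing is sampling
knobs.active_callchain_events = 1
print(knobs.write("max_stack", 64))               # False: a session is active
```

Because the handler is keyed by knob name rather than hard-wired to one variable, the same locking and busy check covers every stack related setting.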
    • perf symbols: Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE · 0a77582f
      Masami Hiramatsu authored
      Instead of using a raw string, use DSO__NAME_KALLSYMS and
      DSO__NAME_KCORE macros for kallsyms and kcore.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20160515031935.4017.50971.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Use cpu-clock event for cpu targets · a1f3d567
      Namhyung Kim authored
      Currently 'perf stat' always counts task-clock event by default.  But
      it's somewhat confusing for system-wide targets (especially with 'sleep
      N' as the 'sleep' task just sleeps and doesn't use cputime).  Changing
      to cpu-clock event instead for that case makes more sense IMHO.
      
      Before:
        # perf stat -a sleep 0.1
      
         Performance counter stats for 'system wide':
      
              403.038603      task-clock (msec)     #    4.001 CPUs utilized
                     150      context-switches      #    0.372 K/sec
                       7      cpu-migrations        #    0.017 K/sec
                      71      page-faults           #    0.176 K/sec
              23,705,169      cycles                #    0.059 GHz
              15,888,166      instructions          #    0.67  insn per cycle
               3,326,078      branches              #    8.253 M/sec
                  87,643      branch-misses         #    2.64% of all branches
      
             0.100737009 seconds time elapsed
      
        #
      
      After:
      
        # perf stat -a sleep 0.1
      
         Performance counter stats for 'system wide':
      
              404.271182      cpu-clock (msec)      #    4.000 CPUs utilized
                     143      context-switches      #    0.354 K/sec
                      13      cpu-migrations        #    0.032 K/sec
                      73      page-faults           #    0.181 K/sec
              22,119,220      cycles                #    0.055 GHz
              13,622,065      instructions          #    0.62  insn per cycle
               2,918,769      branches              #    7.220 M/sec
                  85,033      branch-misses         #    2.91% of all branches
      
             0.101073089 seconds time elapsed
      
        #
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1463119263-5569-3-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
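The selection this commit describes can be sketched as a one-line decision. This is a Python illustration with hypothetical parameter names, not the perf stat source: system-wide or CPU-list targets get cpu-clock, everything else keeps task-clock.

```python
def default_clock_event(system_wide=False, cpu_list=None):
    """Pick the default clock event for a 'perf stat' style target.

    Hypothetical helper: 'system_wide' models -a and 'cpu_list' models a
    CPU selection; any CPU-based target counts cpu-clock instead of
    task-clock, since a mostly sleeping task uses almost no task time.
    """
    if system_wide or cpu_list:
        return "cpu-clock"
    return "task-clock"


print(default_clock_event(system_wide=True))   # cpu-clock
print(default_clock_event())                   # task-clock
```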
    • perf stat: Update runtime using cpu-clock event · daf4f478
      Namhyung Kim authored
      Currently only the task-clock event updates runtime_nsec, so the
      shadow metrics cannot be shown when using cpu-clock events.  However,
      cpu-clock works basically the same as task-clock, so there is no
      reason not to update the runtime for it too, IMHO.
      
      Before:
      
        # perf stat -a -e cpu-clock,context-switches,page-faults,cycles sleep 0.1
      
          Performance counter stats for 'system wide':
      
               1217.759506      cpu-clock (msec)
                        93      context-switches
                        61      page-faults
                18,958,022      cycles
      
               0.101393794 seconds time elapsed
      
      After:
      
         Performance counter stats for 'system wide':
      
               1220.471884      cpu-clock (msec)          #   12.013 CPUs utilized
                       118      context-switches          #    0.097 K/sec
                        59      page-faults               #    0.048 K/sec
                17,941,247      cycles                    #    0.015 GHz
      
               0.101594777 seconds time elapsed
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1463119263-5569-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
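The metrics in the "After" output follow directly from the clock runtime. The arithmetic can be reproduced with a short Python sketch; the function names are hypothetical and the inputs are the numbers printed above.

```python
def cpus_utilized(clock_msec, elapsed_sec):
    """Runtime accumulated across CPUs divided by wall-clock time."""
    return (clock_msec / 1000.0) / elapsed_sec


def k_per_sec(count, clock_msec):
    """Events per millisecond of runtime, which is numerically K/sec."""
    return count / clock_msec


def ghz(cycles, clock_msec):
    """Cycles per nanosecond of runtime."""
    return cycles / (clock_msec * 1e6)


clock_msec, elapsed_sec = 1220.471884, 0.101594777
print(round(cpus_utilized(clock_msec, elapsed_sec), 3))  # CPUs utilized
print(round(k_per_sec(118, clock_msec), 3))              # context-switches K/sec
print(round(k_per_sec(59, clock_msec), 3))               # page-faults K/sec
print(round(ghz(17_941_247, clock_msec), 3))             # cycles GHz
```

Every one of these needs the runtime as a denominator, which is why nothing could be printed while only task-clock updated it.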
    • perf stat: Fix indentation of stalled backend cycle · b0404be8
      Namhyung Kim authored
      The commit 140aeadc ("perf stat: Abstract stat metrics printing")
      changed how shadow metrics are printed, but it failed to update the
      width of the stalled backend cycles event to 7.2% like the others.
      This resulted in misaligned output like below:
      
        Performance counter stats for 'pwd':
      
                0.638313      task-clock (msec)         #    0.567 CPUs utilized
                       0      context-switches          #    0.000 K/sec
                       0      cpu-migrations            #    0.000 K/sec
                      54      page-faults               #    0.085 M/sec
                 885,600      cycles                    #    1.387 GHz
                 558,438      stalled-cycles-frontend   #   63.06% frontend cycles idle
                 431,355      stalled-cycles-backend    #  48.71% backend cycles idle
                 674,956      instructions              #    0.76  insn per cycle
                                                        #    0.83  stalled cycles per insn
                 130,380      branches                  #  204.257 M/sec
           <not counted>      branch-misses
      
             0.001125426 seconds time elapsed
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Fixes: 140aeadc ("perf stat: Abstract stat metrics printing")
      Link: http://lkml.kernel.org/r/1463119263-5569-1-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
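The misalignment comes down to a field width of 6 versus 7 for the percentage. A small Python sketch reproduces both renderings; the helper name and the exact format string are assumptions chosen to match the output shown above, not the perf source.

```python
def stalled_metric(pct, what, width=7):
    """Render a 'stalled cycles' shadow metric with a given field width."""
    return f"# {pct:{width}.2f}% {what} cycles idle"


print(stalled_metric(63.06, "frontend"))          # width 7, as for other metrics
print(stalled_metric(48.71, "backend", width=6))  # old width: one column short
print(stalled_metric(48.71, "backend"))           # width 7: lines up again
```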
    • perf symbols: Store vdso buildid unconditionally · 6ae98ba6
      He Kuang authored
      When unwinding callchains on a different machine, vdso info should be
      available so that the unwind process won't be interrupted if an
      address falls into the vdso region. But in most cases the addresses of
      sample events are not in the vdso range, so the buildid of a zero-hit
      vdso won't be stored in perf.data.
      
      This patch stores vdso buildid regardless of whether the vdso is hit or
      not.
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1463042596-61703-3-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
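The behaviour change can be illustrated with a toy filter. This is a hypothetical Python model (the record layout and function name are invented for the example), showing only the rule that a vdso is kept even when no sample hit it.

```python
def dsos_to_store(dsos):
    """Return names of DSOs whose build-id should be written to perf.data.

    Hypothetical model: each DSO is a dict with 'name', 'hit' (at least one
    sample landed in it) and 'is_vdso'. The vdso is stored regardless of
    'hit' so that unwinding on another machine can still resolve it.
    """
    return [d["name"] for d in dsos if d["hit"] or d["is_vdso"]]


dsos = [
    {"name": "/usr/bin/app", "hit": True, "is_vdso": False},
    {"name": "/lib/libfoo.so", "hit": False, "is_vdso": False},
    {"name": "[vdso]", "hit": False, "is_vdso": True},
]
print(dsos_to_store(dsos))
```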
    • perf stat: Avoid fractional digits for integer scales · e3b03b6c
      Andi Kleen authored
      When the scaling factor is a full integer don't display fractional
      digits. This avoids unnecessary .00 output for topdown metrics with
      scale factors.
      
      v2: Remove redundant check.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1462489447-31832-7-git-send-email-andi@firstfloor.org
      [ Rename 'round' to 'stat_round' as 'round' is defined in math.h,
        included by this patch, and this breaks the build on ubuntu 12.04 ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
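The formatting rule in this commit can be sketched as follows. This is a Python illustration with a hypothetical helper name, not the perf stat code: the precision depends only on whether the scale factor itself is a whole number.

```python
def format_scaled(value, scale):
    """Format value * scale, dropping fractional digits for integer scales."""
    scaled = value * scale
    if float(scale).is_integer():
        return f"{scaled:.0f}"   # whole-number scale: no ".00" noise
    return f"{scaled:.2f}"       # fractional scale: keep two digits


print(format_scaled(3, 2.0))   # integer scale
print(format_scaled(3, 0.5))   # fractional scale
```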
  6. 16 May, 2016 5 commits
    • Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bc231d9e
      Linus Torvalds authored
      Pull x86 platform updates from Ingo Molnar:
       "The main change is the addition of SGI/UV4 support"
      
      * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
        x86/platform/UV: Fix incorrect nodes and pnodes for cpuless and memoryless nodes
        x86/platform/UV: Remove Obsolete GRU MMR address translation
        x86/platform/UV: Update physical address conversions for UV4
        x86/platform/UV: Build GAM reference tables
        x86/platform/UV: Support UV4 socket address changes
        x86/platform/UV: Add obtaining GAM Range Table from UV BIOS
        x86/platform/UV: Add UV4 addressing discovery function
        x86/platform/UV: Fold blade info into per node hub info structs
        x86/platform/UV: Allocate common per node hub info structs on local node
        x86/platform/UV: Move blade local processor ID to the per cpu info struct
        x86/platform/UV: Move scir info to the per cpu info struct
        x86/platform/UV: Create per cpu info structs to replace per hub info structs
        x86/platform/UV: Update MMIOH setup function to work for both UV3 and UV4
        x86/platform/UV: Clean up redunduncies after merge of UV4 MMR definitions
        x86/platform/UV: Add UV4 Specific MMR definitions
        x86/platform/UV: Prep for UV4 MMR updates
        x86/platform/UV: Add UV MMR Illegal Access Function
        x86/platform/UV: Add UV4 Specific Defines
        x86/platform/UV: Add UV Architecture Defines
        x86/platform/UV: Add Initial UV4 definitions
        ...
    • Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 62a00278
      Linus Torvalds authored
      Pull x86 debug cleanup from Ingo Molnar:
       "A printk() output simplification"
      
      * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/dumpstack: Combine some printk()s
    • Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bcea36df
      Linus Torvalds authored
      Pull x86 cleanup from Ingo Molnar:
       "Inline optimizations"
      
      * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86: Fix non-static inlines
    • Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 05e30f01
      Linus Torvalds authored
      Pull x86-64 defconfig update from Ingo Molnar:
       "Small defconfig addition"
      
      [ I'm not actually convinced our defconfig is sensible, but whatever ]
      
      * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/build/defconfig/64: Enable CONFIG_E1000E=y
    • Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9a45f036
      Linus Torvalds authored
      Pull x86 boot updates from Ingo Molnar:
       "The biggest changes in this cycle were:
      
         - prepare for more KASLR related changes, by restructuring, cleaning
           up and fixing the existing boot code.  (Kees Cook, Baoquan He,
           Yinghai Lu)
      
         - simplify/concentrate subarch handling code, eliminate
           paravirt_enabled() usage.  (Luis R Rodriguez)"
      
      * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
        x86/KASLR: Clarify purpose of each get_random_long()
        x86/KASLR: Add virtual address choosing function
        x86/KASLR: Return earliest overlap when avoiding regions
        x86/KASLR: Add 'struct slot_area' to manage random_addr slots
        x86/boot: Add missing file header comments
        x86/KASLR: Initialize mapping_info every time
        x86/boot: Comment what finalize_identity_maps() does
        x86/KASLR: Build identity mappings on demand
        x86/boot: Split out kernel_ident_mapping_init()
        x86/boot: Clean up indenting for asm/boot.h
        x86/KASLR: Improve comments around the mem_avoid[] logic
        x86/boot: Simplify pointer casting in choose_random_location()
        x86/KASLR: Consolidate mem_avoid[] entries
        x86/boot: Clean up pointer casting
        x86/boot: Warn on future overlapping memcpy() use
        x86/boot: Extract error reporting functions
        x86/boot: Correctly bounds-check relocations
        x86/KASLR: Clean up unused code from old 'run_size' and rename it to 'kernel_total_size'
        x86/boot: Fix "run_size" calculation
        x86/boot: Calculate decompression size during boot not build
        ...