1. 27 Apr, 2024 24 commits
    • perf record: Fix comment misspellings · 7cc72090
      Howard Chu authored
      Fix comment misspellings
      Signed-off-by: Howard Chu <howardchu95@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240425060427.1800663-1-howardchu95@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Update DSO binary type when trying build-id · 8f3ec810
      Namhyung Kim authored
      dso__disassemble_filename() tries to get the filename for objdump (or
      capstone) using the build-id.  But I found that it sometimes failed to
      disassemble some functions.

      It turned out that those functions belong to a DSO which has no binary
      type set.  It seems the binary type is set only for some special files,
      like the kernel (kallsyms or kcore) or BPF images.  And there is logic
      to skip a DSO with DSO_BINARY_TYPE__NOT_FOUND.

      Since it has already checked the build-id cache link, it should set the
      binary type to DSO_BINARY_TYPE__BUILD_ID_CACHE.
      
      Fixes: 873a8373 ("perf annotate: Skip DSOs not found")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240425005157.1104789-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Fallback disassemble to objdump when capstone fails · f35847de
      Namhyung Kim authored
      I found some cases where capstone failed to disassemble.  Probably my
      capstone is an old version, but in any case there is a chance it can
      fail, and when it did, it silently stopped in the middle.  In my case,
      it didn't understand the "RDPKRU" instruction.

      Let's check whether the capstone disassembly reached the end of the
      function, and fall back to objdump if not.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240425005157.1104789-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
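The fallback described in the commit above can be modeled roughly as follows. This is a minimal Python sketch, not perf's C code: the `Insn` tuple layout and the `primary`/`fallback` callables are hypothetical names chosen for illustration. It assumes the primary decoder stops silently at the first instruction it does not understand, so the caller checks whether the last decoded instruction ends at the function's end address.

```python
from typing import Callable, List, Tuple

# (address, length in bytes, text); a hypothetical layout for this sketch.
Insn = Tuple[int, int, str]
Disassembler = Callable[[int, int], List[Insn]]


def reached_end(insns: List[Insn], end: int) -> bool:
    """Return True if the decoded instructions cover the function up to `end`."""
    if not insns:
        return False
    addr, length, _ = insns[-1]
    return addr + length >= end


def disassemble_with_fallback(start: int, end: int,
                              primary: Disassembler,
                              fallback: Disassembler) -> List[Insn]:
    """Use `primary`; if it stopped before `end`, redo the work with `fallback`."""
    insns = primary(start, end)
    if reached_end(insns, end):
        return insns
    # The primary decoder gave up part-way, so discard its partial output.
    return fallback(start, end)
```

Discarding the partial output rather than resuming from the failure point keeps the result consistent, since the two decoders may format instructions differently.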
    • perf annotate-data: Check if 'struct annotation_source' was allocated on 'perf report' TUI · 47557db9
      Namhyung Kim authored
      Since sample accounting for code was removed when no symbol sort key is
      given for the 'perf report' TUI, the 'struct annotated_source' might
      not have been allocated yet.  Let's check if it's NULL first.
      
      Fixes: 6cdd977e ("perf report: Do not collect sample histogram unnecessarily")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240424230015.1054013-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Add a new test for 'perf annotate' · 281bf8f6
      Namhyung Kim authored
      Add a basic 'perf annotate' test:
      
        $ ./perf test annotate -vv
         76: perf annotate basic tests:
        --- start ---
        test child forked, pid 846989
         fbcd0-fbd55 l noploop
        perf does have symbol 'noploop'
        Basic perf annotate test
                 : 0     0xfbcd0 <noploop>:
            0.00 :   fbcd0:       pushq   %rbp
            0.00 :   fbcd1:       movq    %rsp, %rbp
            0.00 :   fbcd4:       pushq   %r12
            0.00 :   fbcd6:       pushq   %rbx
            0.00 :   fbcd7:       movl    $1, %ebx
            0.00 :   fbcdc:       subq    $0x10, %rsp
            0.00 :   fbce0:       movq    %fs:0x28, %rax
            0.00 :   fbce9:       movq    %rax, -0x18(%rbp)
            0.00 :   fbced:       xorl    %eax, %eax
            0.00 :   fbcef:       testl   %edi, %edi
            0.00 :   fbcf1:       jle     0xfbd04
            0.00 :   fbcf3:       movq    (%rsi), %rdi
            0.00 :   fbcf6:       movl    $0xa, %edx
            0.00 :   fbcfb:       xorl    %esi, %esi
            0.00 :   fbcfd:       callq   0x41920
            0.00 :   fbd02:       movl    %eax, %ebx
            0.00 :   fbd04:       leaq    -0x7b(%rip), %r12	# fbc90 <sighandler>
            0.00 :   fbd0b:       movl    $2, %edi
            0.00 :   fbd10:       movq    %r12, %rsi
            0.00 :   fbd13:       callq   0x40a00
            0.00 :   fbd18:       movl    $0xe, %edi
            0.00 :   fbd1d:       movq    %r12, %rsi
            0.00 :   fbd20:       callq   0x40a00
            0.00 :   fbd25:       movl    %ebx, %edi
            0.00 :   fbd27:       callq   0x407c0
            0.10 :   fbd2c:       movl    0x89785e(%rip), %eax	# 993590 <done>
            0.00 :   fbd32:       testl   %eax, %eax
           99.90 :   fbd34:       je      0xfbd2c
            0.00 :   fbd36:       movq    -0x18(%rbp), %rax
            0.00 :   fbd3a:       subq    %fs:0x28, %rax
            0.00 :   fbd43:       jne     0xfbd50
            0.00 :   fbd45:       addq    $0x10, %rsp
            0.00 :   fbd49:       xorl    %eax, %eax
            0.00 :   fbd4b:       popq    %rbx
            0.00 :   fbd4c:       popq    %r12
            0.00 :   fbd4e:       popq    %rbp
            0.00 :   fbd4f:       retq
            0.00 :   fbd50:       callq   0x407e0
            0.00 :   fbcd0:       pushq   %rbp
            0.00 :   fbcd1:       movq    %rsp, %rbp
            0.00 :   fbcd4:       pushq   %r12
            0.00 :   fbcd0:  push   %rbp
            0.00 :   fbcd1:  mov    %rsp,%rbp
            0.00 :   fbcd4:  push   %r12
        Basic annotate test [Success]
        ---- end(0) ----
         76: perf annotate basic tests                                       : Ok
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240424001231.849972-1-namhyung@kernel.org
      [ Improved the error messages a bit ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Tidy the setting of the default event name · bb65ff78
      Ian Rogers authored
      Add comments. Pass ownership of the event name to save on a strdup.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-17-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Minor grouping tidy up · afd876bb
      Ian Rogers authored
      Add comments. Ensure leader->group_name is freed before overwriting
      it.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-16-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-event: Constify event_symbol arrays · 4a20e793
      Ian Rogers authored
      Moves 352 bytes from .data to .data.rel.ro.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-15-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Improvements to modifier parsing · e30a7912
      Ian Rogers authored
      Use a struct/bitmap rather than a copied string from the lexer.

      In the lexer, give an improved error message when too many precise
      flags are given or when modifiers are repeated.
      
      Before:
      
        $ perf stat -e 'cycles:kuk' true
        event syntax error: 'cycles:kuk'
                                    \___ Bad modifier
        ...
        $ perf stat -e 'cycles:pppp' true
        event syntax error: 'cycles:pppp'
                                    \___ Bad modifier
        ...
        $ perf stat -e '{instructions:p,cycles:pp}:pp' -a true
        event syntax error: '..cycles:pp}:pp'
                                          \___ Bad modifier
        ...
      
      After:
      
        $ perf stat -e 'cycles:kuk' true
        event syntax error: 'cycles:kuk'
                                      \___ Duplicate modifier 'k' (kernel)
        ...
        $ perf stat -e 'cycles:pppp' true
        event syntax error: 'cycles:pppp'
                                       \___ Maximum precise value is 3
        ...
        $ perf stat -e '{instructions:p,cycles:pp}:pp' true
        event syntax error: '..cycles:pp}:pp'
                                          \___ Maximum combined precise value is 3, adding precision to "cycles:pp"
        ...
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-14-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
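The idea of parsing modifiers into a structured record, so that duplicates and excess precision can be reported precisely, can be sketched as below. This is an illustrative Python model under stated assumptions, not the perf lexer: only a small subset of modifiers is handled, the `Modifiers` record and `parse_modifiers` function are hypothetical, and the labels other than "kernel" as well as the "Bad modifier" fallback text are assumed for the example.

```python
from dataclasses import dataclass

# Subset of modifiers for illustration; labels other than "kernel" are assumed.
FLAGS = {"u": "user", "k": "kernel", "h": "hypervisor"}
MAX_PRECISE = 3


@dataclass
class Modifiers:
    user: bool = False
    kernel: bool = False
    hypervisor: bool = False
    precise: int = 0


def parse_modifiers(text: str) -> Modifiers:
    """Parse a modifier string such as "kp" into a Modifiers record.

    Raises ValueError for a repeated flag or more than MAX_PRECISE 'p's.
    """
    mods = Modifiers()
    for ch in text:
        if ch == "p":
            # Each 'p' raises the precise level by one.
            mods.precise += 1
            if mods.precise > MAX_PRECISE:
                raise ValueError(f"Maximum precise value is {MAX_PRECISE}")
        elif ch in FLAGS:
            name = FLAGS[ch]
            if getattr(mods, name):
                raise ValueError(f"Duplicate modifier '{ch}' ({name})")
            setattr(mods, name, True)
        else:
            # Hypothetical message for characters outside this sketch's subset.
            raise ValueError(f"Bad modifier '{ch}'")
    return mods
```

Because each flag has its own field, a repeated flag is detected at the moment it is seen, which is what makes a message naming the offending modifier possible.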
    • perf parse-events: Inline parse_events_evlist_error · e18601d8
      Ian Rogers authored
      Inline parse_events_evlist_error, which is only used in
      parse_events_error. Modify parse_events_error to report a parser
      error only when no error has already been reported. Make it clearer
      that this case only happens for unrecognized input.
      
      Before:
      
        $ perf stat -e 'cycles/period=99999999999999999999/' true
        event syntax error: 'cycles/period=99999999999999999999/'
                                          \___ parser error
      
        event syntax error: '..les/period=99999999999999999999/'
                                          \___ Bad base 10 number "99999999999999999999"
        Run 'perf list' for a list of valid events
      
         Usage: perf stat [<options>] [<command>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
        $ perf stat -e 'cycles:xyz' true
        event syntax error: 'cycles:xyz'
                                   \___ parser error
        Run 'perf list' for a list of valid events
      
         Usage: perf stat [<options>] [<command>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
      
      After:
      
        $ perf stat -e 'cycles/period=99999999999999999999/xyz' true
        event syntax error: '..les/period=99999999999999999999/xyz'
                                          \___ Bad base 10 number "99999999999999999999"
        Run 'perf list' for a list of valid events
      
         Usage: perf stat [<options>] [<command>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
        $ perf stat -e 'cycles:xyz' true
        event syntax error: 'cycles:xyz'
                                   \___ Unrecognized input
        Run 'perf list' for a list of valid events
      
         Usage: perf stat [<options>] [<command>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-13-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
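The reporting rule in the commit above (emit a generic message only when nothing more specific was recorded) can be illustrated with a small Python model. The `ParseErrors` class and `on_parser_failure` function are hypothetical names for this sketch and do not correspond to perf's C structures.

```python
from typing import List, Tuple


class ParseErrors:
    """Collects (column, message) pairs reported while parsing an event string."""

    def __init__(self) -> None:
        self.errors: List[Tuple[int, str]] = []

    def report(self, column: int, message: str) -> None:
        self.errors.append((column, message))


def on_parser_failure(errors: ParseErrors, column: int) -> None:
    """Called when the grammar rejects the input.

    A generic message is added only if nothing more specific was reported.
    """
    if not errors.errors:
        errors.report(column, "Unrecognized input")
```

With this rule, a bad number produces a single specific error instead of a generic one followed by the specific one.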
    • perf parse-events: Improve error message for bad numbers · ba5c371e
      Ian Rogers authored
      Use the error handler from the parse_state to give a more informative
      error message.
      
      Before:
      
        $ perf stat -e 'cycles/period=99999999999999999999/' true
        event syntax error: 'cycles/period=99999999999999999999/'
                                          \___ parser error
        Run 'perf list' for a list of valid events
      
         Usage: perf stat [<options>] [<command>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
      
      After:
      
        $ perf stat -e 'cycles/period=99999999999999999999/' true
        event syntax error: 'cycles/period=99999999999999999999/'
                                          \___ parser error
      
        event syntax error: '..les/period=99999999999999999999/'
                                          \___ Bad base 10 number "99999999999999999999"
        Run 'perf list' for a list of valid events
      
         Usage: perf stat [<options>] [<command>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-12-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Inline parse_events_update_lists · 4e5484b4
      Ian Rogers authored
      The helper function just wraps a splice and free. Making the free
      inline removes a comment, so then it just wraps a splice which we can
      make inline too.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-11-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Prefer sysfs/JSON hardware events over legacy · 617824a7
      Ian Rogers authored
      It was requested that RISC-V be able to add events to the perf tool so
      the PMU driver didn't need to map legacy events to config encodings:
      https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@rivosinc.com/
      
      This change makes the priority of events specified without a PMU the
      same as those specified with a PMU, namely sysfs and JSON events are
      checked first before using the legacy encoding.
      
      The hw_term is made more generic as a hardware_event that encodes a
      pair of string and int value, allowing parse_events_multi_pmu_add to
      fall back on a known encoding when the sysfs/JSON adding fails for
      core events. As this covers PE_VALUE_SYM_HW, that token is removed and
      related code simplified.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-10-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
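The lookup order introduced by the commit above (sysfs/JSON first, legacy encoding as the fallback) can be sketched in Python as follows. This is a simplified model, not the perf parser: `resolve_event`, the `LEGACY_HW` table, and the `pmu_events` dictionary are hypothetical stand-ins, and the encoding strings used with it are example values only.

```python
from typing import Dict, Optional, Tuple, Union

# Hypothetical legacy table for this sketch: event name -> fixed config value.
LEGACY_HW: Dict[str, int] = {"cycles": 0, "instructions": 1}


def resolve_event(name: str,
                  pmu_events: Dict[str, str]
                  ) -> Optional[Tuple[str, Union[str, int]]]:
    """Resolve `name` for one core PMU.

    `pmu_events` stands in for the PMU's sysfs/JSON events (name -> encoding).
    Returns (source, encoding), or None if the name is unknown.
    """
    # sysfs/JSON events are checked first, whether or not a PMU was named.
    if name in pmu_events:
        return ("sysfs/json", pmu_events[name])
    # Fall back to the known legacy encoding only when that lookup fails.
    if name in LEGACY_HW:
        return ("legacy", LEGACY_HW[name])
    return None
```

Under this ordering, a PMU driver that publishes its own event under a legacy name takes precedence, so the driver does not need to map legacy events to config encodings itself.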
    • perf parse-events: Constify parse_events_add_numeric · 5ccc4edf
      Ian Rogers authored
      Allow the term list to be const so that other functions can pass const
      term lists. Add const as necessary to called functions.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-9-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Handle PE_TERM_HW in name_or_raw · 9d0dba23
      Ian Rogers authored
      Avoid duplicate logic for name_or_raw and PE_TERM_HW by having a rule
      to turn PE_TERM_HW into a name_or_raw.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-8-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Legacy cache names on all PMUs and lower priority · 62593394
      Ian Rogers authored
      Prior behavior is to not look for legacy cache names in sysfs/JSON and
      to create events on all core PMUs. New behavior is to look for
      sysfs/JSON events first on all PMUs, for core PMUs add a legacy event
      if the sysfs/JSON event isn't present.
      
      This is done so that there is consistency with how event names in
      terms are handled and their prioritization of sysfs/JSON over
      legacy. It may make sense to use a legacy cache event name as an event
      name on a non-core PMU so we should allow it.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-7-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
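The new behavior for legacy cache names can be modeled as a loop over all PMUs. The sketch below is an assumption-laden Python illustration, not perf code: the `Pmu` record, its `is_core` flag, and `add_cache_event` are hypothetical, and the function assumes `name` has already been recognized as a valid legacy cache event name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Pmu:
    name: str
    is_core: bool
    # Stand-in for the PMU's sysfs/JSON events (name -> encoding).
    events: Dict[str, str] = field(default_factory=dict)


def add_cache_event(name: str, pmus: List[Pmu]) -> List[Tuple[str, str]]:
    """Return (pmu name, source) for every event created for `name`."""
    created = []
    for pmu in pmus:
        if name in pmu.events:
            # A sysfs/JSON event with this name wins on any PMU.
            created.append((pmu.name, "sysfs/json"))
        elif pmu.is_core:
            # Core PMUs fall back to the legacy cache encoding.
            created.append((pmu.name, "legacy"))
    return created
```

A non-core PMU contributes an event only when it actually defines one under that name, which matches the stated goal of allowing a legacy cache name to be reused as an event name on such a PMU.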
    • perf tests parse-events: Use "branches" rather than "cache-references" · 78fae207
      Ian Rogers authored
      Switch from "cache-references" to "branches" in test as Intel has a
      sysfs event for "cache-references" and changing the priority for sysfs
      over legacy causes the test to fail.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-6-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf pmu: Refactor perf_pmu__match() · f91fa2ae
      Ian Rogers authored
      Move all of the implementation to the pmu code. Don't allocate an
      fnmatch wildcard pattern, as matching that ignores the suffix already
      handles this; only use fnmatch if the given PMU name has a '*' in it.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
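The matching strategy described in the commit above (glob matching only when the name contains a '*', otherwise a comparison that ignores the PMU suffix) might look like this in Python. It is a sketch, not a port of perf_pmu__match(): `pmu_match` and `strip_suffix` are hypothetical, and the suffix is assumed here to be a trailing "_<digits>", which may not cover every form perf handles.

```python
import fnmatch
import re


def strip_suffix(pmu_name: str) -> str:
    """Drop a trailing "_<digits>" suffix; an assumed suffix form for this sketch."""
    return re.sub(r"_\d+$", "", pmu_name)


def pmu_match(pmu_name: str, pattern: str) -> bool:
    """Return True if `pattern` selects the PMU called `pmu_name`."""
    if "*" in pattern:
        # Only pay for glob matching when the user actually wrote a wildcard.
        return fnmatch.fnmatchcase(pmu_name, pattern)
    # Otherwise an exact match, or a match once the suffix is ignored.
    return pmu_name == pattern or strip_suffix(pmu_name) == pattern
```

For example, `pmu_match("uncore_imc_0", "uncore_imc")` succeeds through the suffix rule without building a wildcard pattern.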
    • perf parse-events: Avoid copying an empty list · 90b2c210
      Ian Rogers authored
      In parse_events_add_pmu, delay copying the list of terms until it is
      known the list contains terms.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Directly pass PMU to parse_events_add_pmu() · 63dfcde9
      Ian Rogers authored
      Avoid passing the name of a PMU and then finding it again; just pass
      the PMU directly. parse_events_multi_pmu_add_or_add_pmu() is the only
      version that needs to find a PMU, so move the find there. Remove the
      error message, as parse_events_multi_pmu_add_or_add_pmu() will give an
      error at the end when a name is neither a PMU name nor an event name.
      Without the error message being created, the location in the input
      parameter (loc) can be removed.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Factor out '<event_or_pmu>/.../' parsing · 8b734eaa
      Ian Rogers authored
      Factor out the case of an event or PMU name followed by a slash based
      term list. This is with a view to sharing the code with new legacy
      hardware parsing. Use early return to reduce indentation in the code.
      Make parse_events_add_pmu static now it doesn't need sharing with
      parse-events.y.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Atish Patra <atishp@rivosinc.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Beeman Strong <beeman@rivosinc.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240416061533.921723-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf scripts python: Add a script to run instances of 'perf script' in parallel · e0c48bf9
      Adrian Hunter authored
      Add a Python script to run a perf script command multiple times in
      parallel, using perf script options --cpu and --time so that each job
      processes a different chunk of the data.
      
      Extend the perf script tests to also test the new script.
      
      The script supports the use of normal 'perf script' options like
      --dlfilter and --script, so that the benefit of running parallel jobs
      naturally extends to them also. In addition, a command can be provided
      (refer to the --pipe-to option) to pipe standard output to a custom command.
      
      Refer to the script's own help text at the end of the patch for more
      details.
      
      The script is useful for Intel PT traces, which can be efficiently
      decoded by 'perf script' when split by CPU and/or time ranges. Running
      jobs in parallel can decrease the overall decoding time.
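
The general approach of splitting work by CPU and running the pieces concurrently can be sketched as follows. This is not the script added by the commit; it is a minimal Python illustration in which `per_cpu_commands` and `run_parallel` are hypothetical helpers, the CPU list is supplied by the caller, and only the 'perf script' --cpu option mentioned above is used.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def per_cpu_commands(base_cmd: Sequence[str],
                     cpus: Sequence[int]) -> List[List[str]]:
    """Build one command per CPU by appending --cpu <n> to the base command."""
    return [list(base_cmd) + ["--cpu", str(cpu)] for cpu in cpus]


def run_parallel(commands: List[List[str]], jobs: int) -> List[int]:
    """Run the commands with at most `jobs` at once; return exit codes in order."""
    def run(cmd: List[str]) -> int:
        # Output is discarded here; a real tool would write it to per-job files.
        return subprocess.run(cmd, stdout=subprocess.DEVNULL).returncode

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run, commands))
```

A caller might build the commands with `per_cpu_commands(["perf", "script", "-i", "perf.data"], range(4))` and pass the result to `run_parallel(..., jobs=4)`. Threads are sufficient here because each worker only waits on a child process.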
      
      Committer testing:
      
        Ian reported that shellcheck found some issues. I installed it, as
        there are no warnings about it not being available, but when available
        it fails the build with:
      
          TEST    /tmp/build/perf-tools-next/tests/shell/script.sh.shellcheck_log
          CC      /tmp/build/perf-tools-next/util/header.o
      
        In tests/shell/script.sh line 20:
                        rm -rf "${temp_dir}/"*
                               ^-------------^ SC2115 (warning): Use "${var:?}" to ensure this never expands to /* .
      
        In tests/shell/script.sh line 83:
                output1_dir="${temp_dir}/output1"
                ^---------^ SC2034 (warning): output1_dir appears unused. Verify use (or export if used externally).
      
        In tests/shell/script.sh line 84:
                output2_dir="${temp_dir}/output2"
                ^---------^ SC2034 (warning): output2_dir appears unused. Verify use (or export if used externally).
      
        In tests/shell/script.sh line 86:
                python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}"
                                    ^-----------^ SC2154 (warning): output_dir is referenced but not assigned (did you mean 'output1_dir'?).
      
        For more information:
          https://www.shellcheck.net/wiki/SC2034 -- output1_dir appears unused. Verif...
          https://www.shellcheck.net/wiki/SC2115 -- Use "${var:?}" to ensure this nev...
          https://www.shellcheck.net/wiki/SC2154 -- output_dir is referenced but not ...
      
      Did these fixes:
      
        -               rm -rf "${temp_dir}/"*
        +               rm -rf "${temp_dir:?}/"*
      
      And:
      
         @@ -83,8 +83,8 @@ test_parallel_perf()
                output1_dir="${temp_dir}/output1"
                output2_dir="${temp_dir}/output2"
                perf record -o "${perf_data}" --sample-cpu uname
        -       python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}"
        -       python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}"
        +       python3 "${pp}" -o "${output1_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}"
        +       python3 "${pp}" -o "${output2_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}"
      
      After that:
      
        root@number:~# perf test -vv "perf script tests"
         97: perf script tests:
        --- start ---
        test child forked, pid 4084139
        DB test
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.032 MB /tmp/perf-test-script.T4MJDr0L6J/perf.data (7 samples) ]
        <SNIP>
        DB test [Success]
        parallel-perf test
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.034 MB /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data (7 samples) ]
        Starting: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --time=91898.301933500, -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        There are 4 jobs: 2 completed, 2 running
        Finished: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --time=91898.301933500, -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        There are 4 jobs: 4 completed, 0 running
        All jobs finished successfully
        parallel-perf.py done
        Starting: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        There are 28 jobs: 4 completed, 0 running
        Starting: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        There are 28 jobs: 8 completed, 0 running
        Starting: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=10 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=10 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        There are 28 jobs: 12 completed, 0 running
        Starting: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        There are 28 jobs: 16 completed, 0 running
        Starting: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        There are 28 jobs: 20 completed, 0 running
        Starting: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=21 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=21 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        There are 28 jobs: 24 completed, 0 running
        Starting: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Starting: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        Finished: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        There are 28 jobs: 27 completed, 1 running
        Finished: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
        There are 28 jobs: 28 completed, 0 running
        All jobs finished successfully
        parallel-perf.py done
        parallel-perf test [Success]
        --- Cleaning up ---
        ---- end(0) ----
         97: perf script tests                                               : Ok
        root@number:~#
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240423133248.10206-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e0c48bf9
    • Arnaldo Carvalho de Melo's avatar
      tools lib rbtree: Pick some improvements from the kernel rbtree code · cd88c11c
      Arnaldo Carvalho de Melo authored
      The tools/lib/rbtree.c code came from the kernel, removing the
      EXPORT_SYMBOL() that makes sense only there. Unfortunately it is not
      being checked with tools/perf/check_headers.sh; will try to remedy this.
      Till then, pick the improvements from:
      
        b0687c11 ("lib/rbtree: use '+' instead of '|' for setting color.")
      
      That I noticed by doing:
      
        diff -u tools/lib/rbtree.c lib/rbtree.c
        diff -u tools/include/linux/rbtree_augmented.h include/linux/rbtree_augmented.h
      
      There is one other case, but let's pick it in a separate patch.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Noah Goldstein <goldstein.w.n@gmail.com>
      Link: https://lore.kernel.org/lkml/ZigZzeFoukzRKG1Q@x1
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cd88c11c
    • Arnaldo Carvalho de Melo's avatar
      perf tests shell kprobes: Add missing description as used by 'perf test' output · 7255fcc8
      Arnaldo Carvalho de Melo authored
      Before:
      
        root@x1:~# perf test 76
         76: SPDX-License-Identifier: GPL-2.0                                : Ok
        root@x1:~#
      
      After:
      
        root@x1:~# perf test 76
         76: Add 'perf probe's, list and remove them.                        : Ok
        root@x1:~#
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Veronika Molnarova <vmolnaro@redhat.com>
      Link: https://lore.kernel.org/lkml/ZigRDKUGkcDqD-yW@x1
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7255fcc8
  2. 23 Apr, 2024 1 commit
    • Arnaldo Carvalho de Melo's avatar
      tools arch x86: Sync the msr-index.h copy with the kernel sources · b29781af
      Arnaldo Carvalho de Melo authored
      To pick up the changes from these csets:
      
        be482ff9 ("x86/bhi: Enumerate Branch History Injection (BHI) bug")
        0f4a8376 ("x86/bhi: Define SPEC_CTRL_BHI_DIS_S")
      
      That cause no changes to tooling:
      
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > x86_msr.before
        $ objdump -dS /tmp/build/perf-tools-next/util/amd-sample-raw.o > amd-sample-raw.o.before
        $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
        $ make -C tools/perf O=/tmp/build/perf-tools-next
        <SNIP>
        CC      /tmp/build/perf-tools-next/trace/beauty/tracepoints/x86_msr.o
        <SNIP>
        CC      /tmp/build/perf-tools-next/util/amd-sample-raw.o
        <SNIP>
        $ objdump -dS /tmp/build/perf-tools-next/util/amd-sample-raw.o > amd-sample-raw.o.after
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > x86_msr.after
        $ diff -u x86_msr.before x86_msr.after
        $ diff -u amd-sample-raw.o.before amd-sample-raw.o.after
      
      Just silences this perf build warning:
      
        Warning: Kernel ABI header differences:
          diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lore.kernel.org/lkml/ZifCnEZFx5MZQuIW@x1
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b29781af
  3. 22 Apr, 2024 2 commits
    • Arnaldo Carvalho de Melo's avatar
      tools include UAPI: Sync linux/vhost.h with the kernel sources · e7a8074d
      Arnaldo Carvalho de Melo authored
      To get the changes in:
      
        2855c2a7 ("vhost-vdpa: change ioctl # for VDPA_GET_VRING_SIZE")
        1496c470 ("vhost-vdpa: uapi to support reporting per vq size")
      
      To pick up these changes and support them:
      
        $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before
        $ cp include/uapi/linux/vhost.h tools/perf/trace/beauty/include/uapi/linux/vhost.h
        $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after
        $ diff -u before after
        --- before	2024-04-22 13:39:37.185674799 -0300
        +++ after	2024-04-22 13:39:52.043344784 -0300
        @@ -50,5 +50,6 @@
         	[0x7F] = "VDPA_GET_VRING_DESC_GROUP",
         	[0x80] = "VDPA_GET_VQS_COUNT",
         	[0x81] = "VDPA_GET_GROUP_NUM",
        +	[0x82] = "VDPA_GET_VRING_SIZE",
         	[0x8] = "NEW_WORKER",
         };
        $
      
      For instance, see how those 'cmd' ioctl arguments get translated, now
      VDPA_GET_VRING_SIZE will be as well:
      
        # perf trace -a -e ioctl --max-events=10
             0.000 ( 0.011 ms): pipewire/2261 ioctl(fd: 60, cmd: SNDRV_PCM_HWSYNC, arg: 0x1)                   = 0
            21.353 ( 0.014 ms): pipewire/2261 ioctl(fd: 60, cmd: SNDRV_PCM_HWSYNC, arg: 0x1)                   = 0
            25.766 ( 0.014 ms): gnome-shell/2196 ioctl(fd: 14, cmd: DRM_I915_IRQ_WAIT, arg: 0x7ffe4a22c740)    = 0
            25.845 ( 0.034 ms): gnome-shel:cs0/2212 ioctl(fd: 14, cmd: DRM_I915_IRQ_EMIT, arg: 0x7fd43915dc70) = 0
            25.916 ( 0.011 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ADDFB2, arg: 0x7ffe4a22c8a0)       = 0
            25.941 ( 0.025 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ATOMIC, arg: 0x7ffe4a22c840)       = 0
            32.915 ( 0.009 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_RMFB, arg: 0x7ffe4a22cf9c)         = 0
            42.522 ( 0.013 ms): gnome-shell/2196 ioctl(fd: 14, cmd: DRM_I915_IRQ_WAIT, arg: 0x7ffe4a22c740)    = 0
            42.579 ( 0.031 ms): gnome-shel:cs0/2212 ioctl(fd: 14, cmd: DRM_I915_IRQ_EMIT, arg: 0x7fd43915dc70) = 0
            42.644 ( 0.010 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ADDFB2, arg: 0x7ffe4a22c8a0)       = 0
        #
      
      This addresses this perf tools build warning:
      
        diff -u tools/perf/trace/beauty/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
      
      But this specific process, usually boring, this time around caught a
      problem: the addition of VDPA_GET_VRING_SIZE used an ioctl number that
      was already taken. That went unnoticed until the
      tools/perf/trace/beauty/vhost_virtio_ioctl.sh script was run as part of
      the perf tools process of updating its copies of the system headers,
      which it uses to create id->string tables. The result broke the perf
      tools build, because there were multiple initializations in the strings
      table for the 0x80 entry...
      
      I'm adding here a link to the discussion, that is lacking in the fix for
      the reported problem, and a quote from one of the developers involved:
      
      "Thanks a lot for taking care of this! So given the header is actually
      buggy pls hang on to this change until I merge the fix for the header
      (you were CC'd on the patch).  It's great we have this redundancy which
      allowed us to catch the bug in time, and many thanks to Namhyung Kim for
      reporting the issue!"
      
      This is here as a hint for anyone thinking about ways to automate
      checking for these issues... ;-)
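One possible shape for such an automated check is sketched below. It is an assumption-laden illustration, not part of perf: the function name `find_duplicate_indexes` and the regular expression are hypothetical, and the input is assumed to be text in the `[0x80] = "NAME",` form shown in the diff above.

```python
import re
from collections import defaultdict

# Matches designated initializers such as:  [0x80] = "VDPA_GET_VQS_COUNT",
ENTRY_RE = re.compile(r'\[\s*(0[xX][0-9a-fA-F]+|\d+)\s*\]\s*=\s*"([^"]+)"')


def parse_index(text):
    """Parse a hex (0x...) or decimal array index."""
    if text.lower().startswith("0x"):
        return int(text, 16)
    return int(text, 10)


def find_duplicate_indexes(table_text):
    """Return {index: [names...]} for every index initialized more than once."""
    names_by_index = defaultdict(list)
    for index, name in ENTRY_RE.findall(table_text):
        names_by_index[parse_index(index)].append(name)
    return {
        index: names
        for index, names in names_by_index.items()
        if len(names) > 1
    }
```

Feeding it the generated table for a single array and failing when the result is non-empty would flag a reused ioctl number before the compiler does. Output containing more than one array would need to be split per array first, since the same index may legitimately appear in different tables.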
      
      Link: https://lore.kernel.org/lkml/20240402172151-mutt-send-email-mst@kernel.org
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zhu Lingshan <lingshan.zhu@intel.com>
      Link: https://lore.kernel.org/lkml/ZiaW-csEZLKK48BE@x1
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e7a8074d
    • Arnaldo Carvalho de Melo's avatar
      Merge remote-tracking branch 'torvalds/master' into perf-tools-next · 173b0b5b
      Arnaldo Carvalho de Melo authored
      To pick up fixes sent via perf-tools, by Namhyung Kim.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      173b0b5b
  4. 21 Apr, 2024 7 commits
    • Linus Torvalds's avatar
      Linux 6.9-rc5 · ed30a4a5
      Linus Torvalds authored
      ed30a4a5
    • Linus Torvalds's avatar
      Merge tag 'char-misc-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 48cf398f
      Linus Torvalds authored
      Pull char / misc driver fixes from Greg KH:
       "Here are some small char/misc and other driver fixes for 6.9-rc5.
        Included in here are the following:
      
         - binder driver fix for reported problem
      
         - speakup crash fix
      
         - mei driver fixes for reported problems
      
         - comedi driver fix
      
         - interconnect driver fixes
      
         - rtsx driver fix
      
         - peci.h kernel doc fix
      
        All of these have been in linux-next for over a week with no reported
        problems"
      
      * tag 'char-misc-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        peci: linux/peci.h: fix Excess kernel-doc description warning
        binder: check offset alignment in binder_get_object()
        comedi: vmk80xx: fix incomplete endpoint checking
        mei: vsc: Unregister interrupt handler for system suspend
        Revert "mei: vsc: Call wake_up() in the threaded IRQ handler"
        misc: rtsx: Fix rts5264 driver status incorrect when card removed
        mei: me: disable RPL-S on SPS and IGN firmwares
        speakup: Avoid crash on very long word
        interconnect: Don't access req_list while it's being manipulated
        interconnect: qcom: x1e80100: Remove inexistent ACV_PERF BCM
      48cf398f
    • Linus Torvalds's avatar
      Merge tag 'driver-core-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core · 4e90ba75
      Linus Torvalds authored
      Pull kernfs bugfix and documentation update from Greg KH:
       "Here are two changes for 6.9-rc5 that deal with "driver core" stuff,
        that do the following:
      
         - sysfs reference leak fix
      
         - embargoed-hardware-issues.rst update for Power
      
        Both of these have been in linux-next for over a week with no reported
        issues"
      
      * tag 'driver-core-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        Documentation: embargoed-hardware-issues.rst: Add myself for Power
        fs: sysfs: Fix reference leak in sysfs_break_active_protection()
      4e90ba75
    • Linus Torvalds's avatar
      Merge tag 'tty-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · c0c6b5c0
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are some small tty and serial driver fixes for 6.9-rc5 that
        resolve a bunch of reported problems. Included in here are:
      
         - MAINTAINERS and .mailmap update for Richard Genoud
      
         - serial core regression fixes from 6.9-rc1 changes
      
         - pci id cleanups
      
         - serial core crash fix
      
         - stm32 driver fixes
      
         - 8250 driver fixes
      
        All of these have been in linux-next for a while with no reported
        problems"
      
      * tag 'tty-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        serial: stm32: Reset .throttled state in .startup()
        serial: stm32: Return IRQ_NONE in the ISR if no handling happend
        serial: core: Fix missing shutdown and startup for serial base port
        serial: core: Clearing the circular buffer before NULLifying it
        MAINTAINERS: mailmap: update Richard Genoud's email address
        serial/pmac_zilog: Remove flawed mitigation for rx irq flood
        serial: 8250_pci: Remove redundant PCI IDs
        serial: core: Fix regression when runtime PM is not enabled
        serial: mxs-auart: add spinlock around changing cts state
        serial: 8250_dw: Revert: Do not reclock if already at correct rate
        serial: 8250_lpc18xx: disable clks on error in probe()
      c0c6b5c0
    • Linus Torvalds's avatar
      Merge tag 'usb-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 5fa0ab45
      Linus Torvalds authored
      Pull USB / Thunderbolt driver fixes from Greg KH:
       "Here are some small USB and Thunderbolt driver fixes for 6.9-rc5.
        Included in here are:
      
         - MAINTAINER file update for invalid email address
      
         - usb-serial device id updates
      
         - typec driver fixes
      
         - thunderbolt / usb4 driver fixes
      
         - usb core shutdown fixes
      
         - cdc-wdm driver revert for reported problem in -rc1
      
         - usb gadget driver fixes
      
         - xhci driver fixes
      
        All of these have been in linux-next for a while with no reported
        problems"
      
      * tag 'usb-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
        USB: serial: option: add Telit FN920C04 rmnet compositions
        usb: dwc3: ep0: Don't reset resource alloc flag
        Revert "usb: cdc-wdm: close race between read and workqueue"
        USB: serial: option: add Rolling RW101-GL and RW135-GL support
        USB: serial: option: add Lonsung U8300/U9300 product
        USB: serial: option: add support for Fibocom FM650/FG650
        USB: serial: option: support Quectel EM060K sub-models
        USB: serial: option: add Fibocom FM135-GL variants
        usb: misc: onboard_usb_hub: Disable the USB hub clock on failure
        thunderbolt: Avoid notify PM core about runtime PM resume
        thunderbolt: Fix wake configurations after device unplug
        usb: dwc2: host: Fix dereference issue in DDMA completion flow.
        usb: typec: mux: it5205: Fix ChipID value typo
        MAINTAINERS: Drop Li Yang as their email address stopped working
        usb: gadget: fsl: Initialize udc before using it
        usb: Disable USB3 LPM at shutdown
        usb: gadget: f_ncm: Fix UAF ncm object at re-bind after usb ep transport error
        usb: typec: tcpm: Correct the PDO counting in pd_set
        usb: gadget: functionfs: Wait for fences before enqueueing DMABUF
        usb: gadget: functionfs: Fix inverted DMA fence direction
        ...
      5fa0ab45
    • Linus Torvalds's avatar
      Merge tag 'sched_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3b680865
      Linus Torvalds authored
      Pull scheduler fix from Borislav Petkov:
      
       - Add a missing memory barrier in the concurrency ID mm switching
      
      * tag 'sched_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Add missing memory barrier in switch_mm_cid
      3b680865
    • Linus Torvalds's avatar
      Merge tag 'x86_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d07a0b86
      Linus Torvalds authored
      Pull x86 fixes from Borislav Petkov:
      
       - Fix CPU feature dependencies of GFNI, VAES, and VPCLMULQDQ
      
       - Print the correct error code when FRED reports a bad event type
      
       - Add a FRED-specific INT80 handler without the special dances that
         need to happen in the current one
      
       - Enable the using-the-default-return-thunk-but-you-should-not warning
         only on configs which actually enable those special return thunks
      
       - Check the proper feature flags when selecting BHI retpoline
         mitigation
      
      * tag 'x86_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpufeatures: Fix dependencies for GFNI, VAES, and VPCLMULQDQ
        x86/fred: Fix incorrect error code printout in fred_bad_type()
        x86/fred: Fix INT80 emulation for FRED
        x86/retpolines: Enable the default thunk warning only on relevant configs
        x86/bugs: Fix BHI retpoline check
      d07a0b86
  5. 20 Apr, 2024 6 commits
    • Linus Torvalds's avatar
      Merge tag 'block-6.9-20240420' of git://git.kernel.dk/linux · 977b1ef5
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Just two minor fixes that should go into the 6.9 kernel release, one
        fixing a regression with partition scanning errors, and one fixing a
        WARN_ON() that can get triggered if we race with a timer"
      
      * tag 'block-6.9-20240420' of git://git.kernel.dk/linux:
        blk-iocost: do not WARN if iocg was already offlined
        block: propagate partition scanning errors to the BLKRRPART ioctl
      977b1ef5
    • Linus Torvalds's avatar
      Merge tag 'email' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 39316e5f
      Linus Torvalds authored
      Pull email address update from James Bottomley:
       "My IBM email has stopped working, so update to a working email
        address"
      
      * tag 'email' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        MAINTAINERS: update to working email address
      39316e5f
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 81777226
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "This is a bit on the large side, mostly due to two changes:
      
         - Changes to disable some broken PMU virtualization (see below for
           details under "x86 PMU")
      
         - Clean up SVM's enter/exit assembly code so that it can be compiled
           without OBJECT_FILES_NON_STANDARD. This fixes a warning "Unpatched
           return thunk in use. This should not happen!" when running KVM
           selftests.
      
        Everything else is small bugfixes and selftest changes:
      
         - Fix a mostly benign bug in the gfn_to_pfn_cache infrastructure
           where KVM would allow userspace to refresh the cache with a bogus
           GPA. The bug has existed for quite some time, but was exposed by a
           new sanity check added in 6.9 (to ensure a cache is either
           GPA-based or HVA-based).
      
         - Drop an unused param from gfn_to_pfn_cache_invalidate_start() that
           got left behind during a 6.9 cleanup.
      
         - Fix a math goof in x86's hugepage logic for
           KVM_SET_MEMORY_ATTRIBUTES that results in an array overflow
           (detected by KASAN).
      
         - Fix a bug where KVM incorrectly clears root_role.direct when
           userspace sets guest CPUID.
      
         - Fix a dirty logging bug where KVM fails to write-protect
           SPTEs used by a nested guest, if KVM is using Page-Modification
           Logging and the nested hypervisor is NOT using EPT.
      
        x86 PMU:
      
         - Drop support for virtualizing adaptive PEBS, as KVM's
           implementation is architecturally broken without an obvious/easy
           path forward, and because exposing adaptive PEBS can leak host LBRs
           to the guest, i.e. can leak host kernel addresses to the guest.
      
         - Set the enable bits for general purpose counters in
           PERF_GLOBAL_CTRL at RESET time, as done by both Intel and AMD
           processors.
      
         - Disable LBR virtualization on CPUs that don't support LBR
           callstacks, as KVM unconditionally uses
           PERF_SAMPLE_BRANCH_CALL_STACK when creating the perf event, and
           would fail on such CPUs.
      
        Tests:
      
         - Fix a flaw in the max_guest_memory selftest that results in it
           exhausting the supply of ucall structures when run with more than
           256 vCPUs.
      
         - Mark KVM_MEM_READONLY as supported for RISC-V in
           set_memory_region_test"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits)
        KVM: Drop unused @may_block param from gfn_to_pfn_cache_invalidate_start()
        KVM: selftests: Add coverage of EPT-disabled to vmx_dirty_log_test
        KVM: x86/mmu: Fix and clarify comments about clearing D-bit vs. write-protecting
        KVM: x86/mmu: Remove function comments above clear_dirty_{gfn_range,pt_masked}()
        KVM: x86/mmu: Write-protect L2 SPTEs in TDP MMU when clearing dirty status
        KVM: x86/mmu: Precisely invalidate MMU root_role during CPUID update
        KVM: VMX: Disable LBR virtualization if the CPU doesn't support LBR callstacks
        perf/x86/intel: Expose existence of callback support to KVM
        KVM: VMX: Snapshot LBR capabilities during module initialization
        KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms
        KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatible
        KVM: x86: Stop compiling vmenter.S with OBJECT_FILES_NON_STANDARD
        KVM: SVM: Create a stack frame in __svm_sev_es_vcpu_run()
        KVM: SVM: Save/restore args across SEV-ES VMRUN via host save area
        KVM: SVM: Save/restore non-volatile GPRs in SEV-ES VMRUN via host save area
        KVM: SVM: Clobber RAX instead of RBX when discarding spec_ctrl_intercepted
        KVM: SVM: Drop 32-bit "support" from __svm_sev_es_vcpu_run()
        KVM: SVM: Wrap __svm_sev_es_vcpu_run() with #ifdef CONFIG_KVM_AMD_SEV
        KVM: SVM: Create a stack frame in __svm_vcpu_run() for unwinding
        KVM: SVM: Remove a useless zeroing of allocated memory
        ...
      81777226
    • Linus Torvalds's avatar
      Merge tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · e43afae4
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix wireguard loading failure on pre-Power10 due to Power10 crypto
         routines
      
       - Fix papr-vpd selftest failure due to missing variable initialization
      
       - Avoid unnecessary get/put in spapr_tce_platform_iommu_attach_dev()
      
      Thanks to Geetika Moolchandani, Jason Gunthorpe, Michal Suchánek, Nathan
      Lynch, and Shivaprasad G Bhat.
      
      * tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        selftests/powerpc/papr-vpd: Fix missing variable initialization
        powerpc/crypto/chacha-p10: Fix failure on non Power10
        powerpc/iommu: Refactor spapr_tce_platform_iommu_attach_dev()
    • Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 560d4e77
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "A couple clk driver fixes, a build fix, and a deadlock fix:
      
         - Mediatek mt7988 has broken PCIe because the wrong parent is used
      
         - Mediatek clk drivers may deadlock when registering their clks
           because the clk provider device is repeatedly runtime PM resumed
           and suspended during probe and clk registration.
      
           Resuming the clk provider device deadlocks with an ABBA deadlock
           due to genpd_lock and the clk prepare_lock. The fix is to keep the
           device runtime resumed while registering clks.
      
         - Another runtime PM related deadlock, this time with disabling
           unused clks during late init.
      
           We get an ABBA deadlock where a device is runtime PM resuming (or
           suspending) while the disabling of unused clks is happening in
           parallel. That runtime PM action calls into the clk framework and
           tries to grab the clk prepare_lock while the disabling of unused
           clks holds the prepare_lock and is waiting for that runtime PM
           action to complete.
      
           The fix is to runtime resume all the clk provider devices before
           grabbing the clk prepare_lock during disable unused.
      
         - A build fix to provide an empty devm_clk_rate_exclusive_get()
           function when CONFIG_COMMON_CLK=n"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: mediatek: mt7988-infracfg: fix clocks for 2nd PCIe port
        clk: mediatek: Do a runtime PM get on controllers during probe
        clk: Get runtime PM before walking tree for clk_summary
        clk: Get runtime PM before walking tree during disable_unused
        clk: Initialize struct clk_core kref earlier
        clk: Don't hold prepare_lock when calling kref_put()
        clk: Remove prepare_lock hold assertion in __clk_release()
        clk: Provide !COMMON_CLK dummy for devm_clk_rate_exclusive_get()
    • MAINTAINERS: update to working email address · 366c5cec
      James Bottomley authored
      jejb@linux.ibm.com no longer works.
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>