1. 12 Sep, 2023 15 commits
    • perf kwork top: Introduce new top utility · 55c40e50
      Yang Jihong authored
      Common tools for collecting CPU usage statistics, such as top, rely on
      statistics that the kernel gathers by timer-interrupt sampling, which
      they read periodically from /proc/stat.
      
      This method has some inaccuracies:
      
      1. In the tick interrupt, all the time between the last tick and the
         current tick is charged to the current task, even though the task may
         have been running for only part of that time.
      2. For each task, the top tool periodically reads the /proc/{PID}/status
         information, so tasks with a short life cycle may be missed.
      
      In conclusion, the top tool cannot accurately collect statistics on the
      CPU usage and running time of tasks.
      
      A statistical method based on the sched_switch tracepoint can accurately
      calculate the CPU usage of all tasks. It suits scenarios that require
      high-precision performance comparison data.
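The difference between the two accounting methods can be sketched with a toy model. The trace, task names, and tick period below are made up for illustration, and the functions are a simplified model rather than the kernel's or perf's actual code.

```python
from collections import defaultdict

# Hypothetical trace for one CPU: (timestamp_ms, prev_task, next_task).
# Each entry models a sched_switch event; the task names are made up.
SWITCHES = [
    (0.0, "idle", "A"),
    (1.5, "A", "B"),
    (3.9, "B", "A"),
    (4.2, "A", "idle"),
]
END_MS = 8.0
TICK_MS = 4.0


def running_task_at(switches, t):
    """Return the task on the CPU at time t (last switch at or before t)."""
    current = None
    for ts, _prev, nxt in switches:
        if ts <= t:
            current = nxt
    return current


def tick_accounting(switches, end_ms, tick_ms):
    """Charge each whole tick period to whichever task is running at the tick."""
    usage = defaultdict(float)
    t = tick_ms
    while t <= end_ms:
        usage[running_task_at(switches, t)] += tick_ms
        t += tick_ms
    return dict(usage)


def switch_accounting(switches, end_ms):
    """Charge the exact interval between consecutive switches to the task that ran."""
    usage = defaultdict(float)
    following = switches[1:] + [(end_ms, None, None)]
    for (ts, _prev, nxt), (next_ts, _p, _n) in zip(switches, following):
        usage[nxt] += next_ts - ts
    return dict(usage)
```

In this model, tick accounting charges a full 4 ms to task A and never sees task B, while switch accounting attributes about 1.8 ms to A and 2.4 ms to B.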
      
      Example usage:
      
        # perf kwork
      
         Usage: perf kwork [<options>] {record|report|latency|timehist|top}
      
            -D, --dump-raw-trace  dump raw trace in ASCII
            -f, --force           don't complain, do it
            -k, --kwork <kwork>   list of kwork to profile (irq, softirq, workqueue, sched, etc)
            -v, --verbose         be more verbose (show symbol address, etc)
      
        # perf kwork -k sched record -- perf bench sched messaging -g 1 -l 10000
        # Running 'sched/messaging' benchmark:
        # 20 sender and receiver processes per group
        # 1 groups == 40 processes run
      
             Total time: 14.074 [sec]
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 15.886 MB perf.data (129472 samples) ]
        # perf kwork top
      
        Total  : 115708.178 ms, 8 cpus
        %Cpu(s):   9.78% id
        %Cpu0   [|||||||||||||||||||||||||||     90.55%]
        %Cpu1   [|||||||||||||||||||||||||||     90.51%]
        %Cpu2   [||||||||||||||||||||||||||      88.57%]
        %Cpu3   [|||||||||||||||||||||||||||     91.18%]
        %Cpu4   [|||||||||||||||||||||||||||     91.09%]
        %Cpu5   [|||||||||||||||||||||||||||     90.88%]
        %Cpu6   [||||||||||||||||||||||||||      88.64%]
        %Cpu7   [|||||||||||||||||||||||||||     90.28%]
      
              PID    %CPU           RUNTIME  COMMMAND
          ----------------------------------------------------
             4113   22.23       3221.547 ms  sched-messaging
             4105   21.61       3131.495 ms  sched-messaging
             4119   21.53       3120.937 ms  sched-messaging
             4103   21.39       3101.614 ms  sched-messaging
             4106   21.37       3095.209 ms  sched-messaging
             4104   21.25       3077.269 ms  sched-messaging
             4115   21.21       3073.188 ms  sched-messaging
             4109   21.18       3069.022 ms  sched-messaging
             4111   20.78       3010.033 ms  sched-messaging
             4114   20.74       3007.073 ms  sched-messaging
             4108   20.73       3002.137 ms  sched-messaging
             4107   20.47       2967.292 ms  sched-messaging
             4117   20.39       2955.335 ms  sched-messaging
             4112   20.34       2947.080 ms  sched-messaging
             4118   20.32       2942.519 ms  sched-messaging
             4121   20.23       2929.865 ms  sched-messaging
             4110   20.22       2930.078 ms  sched-messaging
             4122   20.15       2919.542 ms  sched-messaging
             4120   19.77       2866.032 ms  sched-messaging
             4116   19.72       2857.660 ms  sched-messaging
             4127   16.19       2346.334 ms  sched-messaging
             4142   15.86       2297.600 ms  sched-messaging
             4141   15.62       2262.646 ms  sched-messaging
             4136   15.41       2231.408 ms  sched-messaging
             4130   15.38       2227.008 ms  sched-messaging
             4129   15.31       2217.692 ms  sched-messaging
             4126   15.21       2201.711 ms  sched-messaging
             4139   15.19       2200.722 ms  sched-messaging
             4137   15.10       2188.633 ms  sched-messaging
             4134   15.06       2182.082 ms  sched-messaging
             4132   15.02       2177.530 ms  sched-messaging
             4131   14.73       2131.973 ms  sched-messaging
             4125   14.68       2125.439 ms  sched-messaging
             4128   14.66       2122.255 ms  sched-messaging
             4123   14.65       2122.113 ms  sched-messaging
             4135   14.56       2107.144 ms  sched-messaging
             4133   14.51       2103.549 ms  sched-messaging
             4124   14.27       2066.671 ms  sched-messaging
             4140   14.17       2052.251 ms  sched-messaging
             4138   13.81       2000.361 ms  sched-messaging
                0   11.42       1652.009 ms  swapper/2
                0   11.35       1641.694 ms  swapper/6
                0    9.71       1405.108 ms  swapper/7
                0    9.48       1372.338 ms  swapper/1
                0    9.44       1366.013 ms  swapper/0
                0    9.11       1318.382 ms  swapper/5
                0    8.90       1287.582 ms  swapper/4
                0    8.81       1274.356 ms  swapper/3
             4100    2.61        379.328 ms  perf
             4101    1.16        169.487 ms  perf-exec
              151    0.65         94.741 ms  systemd-resolve
              249    0.36         53.030 ms  sd-resolve
              153    0.14         21.405 ms  systemd-timesyn
                1    0.10         16.200 ms  systemd
               16    0.09         15.785 ms  rcu_preempt
             4102    0.06          9.727 ms  perf
             4095    0.03          5.464 ms  kworker/7:1
               98    0.02          3.231 ms  jbd2/sda-8
              353    0.02          4.115 ms  sshd
               75    0.02          3.889 ms  kworker/2:1
               73    0.01          1.552 ms  kworker/5:1
               64    0.01          1.591 ms  kworker/4:1
               74    0.01          1.952 ms  kworker/3:1
               61    0.01          2.608 ms  kcompactd0
              397    0.01          1.602 ms  kworker/1:1
               69    0.01          1.817 ms  kworker/1:1H
               10    0.01          2.553 ms  kworker/u16:0
             2909    0.01          2.684 ms  kworker/0:2
             1211    0.00          0.426 ms  kworker/7:0
               97    0.00          0.153 ms  kworker/7:1H
               51    0.00          0.100 ms  ksoftirqd/7
              120    0.00          0.856 ms  systemd-journal
               76    0.00          1.414 ms  kworker/6:1
               46    0.00          0.246 ms  ksoftirqd/6
               45    0.00          0.164 ms  migration/6
               41    0.00          0.098 ms  ksoftirqd/5
               40    0.00          0.207 ms  migration/5
               86    0.00          1.339 ms  kworker/4:1H
               36    0.00          0.252 ms  ksoftirqd/4
               35    0.00          0.090 ms  migration/4
               31    0.00          0.156 ms  ksoftirqd/3
               30    0.00          0.073 ms  migration/3
               26    0.00          0.180 ms  ksoftirqd/2
               25    0.00          0.085 ms  migration/2
               21    0.00          0.106 ms  ksoftirqd/1
               20    0.00          0.118 ms  migration/1
              302    0.00          1.440 ms  systemd-logind
               17    0.00          0.132 ms  migration/0
               15    0.00          0.255 ms  ksoftirqd/0
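As a rough consistency check on the output above, the idle percentage can be recomputed from the swapper runtimes. This assumes the idle figure is the summed swapper runtime divided by the total time, which the commit message does not state explicitly; the function name is hypothetical.

```python
# Values copied from the 'perf kwork top' output above.
TOTAL_MS = 115708.178
SWAPPER_RUNTIME_MS = [
    1652.009, 1641.694, 1405.108, 1372.338,
    1366.013, 1318.382, 1287.582, 1274.356,
]


def idle_percent(total_ms, idle_runtimes_ms):
    """Idle share of the total time, as a percentage (assumed formula)."""
    return 100.0 * sum(idle_runtimes_ms) / total_ms


print(f"{idle_percent(TOTAL_MS, SWAPPER_RUNTIME_MS):.2f}% id")
```

Under that assumption the result agrees with the `%Cpu(s)` line in the output.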
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-10-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kwork: Add `root` parameter to work_sort() · b83b5071
      Yang Jihong authored
      Add a `struct rb_root_cached *root` parameter to work_sort() to sort the
      specified rb tree elements.
      
      No functional change.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-9-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kwork: Add sched record support · 38d8d013
      Yang Jihong authored
      Add a kwork_class type for sched to support recording and parsing of
      sched_switch events.
      
      For example:
      
        # perf kwork -h
      
         Usage: perf kwork [<options>] {record|report|latency|timehist}
      
            -D, --dump-raw-trace  dump raw trace in ASCII
            -f, --force           don't complain, do it
            -k, --kwork <kwork>   list of kwork to profile (irq, softirq, workqueue, sched, etc)
            -v, --verbose         be more verbose (show symbol address, etc)
      
        # perf kwork -k sched record true
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.083 MB perf.data (47 samples) ]
        # perf evlist
        sched:sched_switch
        dummy:HG
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-8-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kwork: Set default events list if not specified in setup_event_list() · 26b7254f
      Yang Jihong authored
      Currently, when no kwork event is specified, all events are configured by
      default. Use a default event list string instead, which is more flexible
      and easier to extend later.
      
      Also call setup_event_list() from each subcommand so that each one can
      apply its own settings.
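The idea of falling back to a default list string can be sketched as follows. The default string here is hypothetical (the commit message does not give the one perf uses), the class names are taken from the `-k` help text, and the function only mirrors the name of the C helper without reproducing it.

```python
# Hypothetical default list; the actual string used by perf is not given
# in the commit message.
DEFAULT_EVENT_LIST = "irq, softirq, workqueue"
# Class names taken from the '-k' help text.
KNOWN_CLASSES = ("irq", "softirq", "workqueue", "sched")


def setup_event_list(event_list=None, default=DEFAULT_EVENT_LIST):
    """Parse a comma-separated kwork list, falling back to a default string."""
    text = event_list if event_list else default
    names = [name.strip() for name in text.split(",") if name.strip()]
    unknown = [name for name in names if name not in KNOWN_CLASSES]
    if unknown:
        raise ValueError(f"unknown kwork class: {unknown[0]}")
    return names
```

Because the default is an ordinary string passed per call, a subcommand can supply its own default, for example `setup_event_list(default="sched")`.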
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-7-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kwork: Overwrite original atom in the list when a new atom is pushed. · 86c67c8a
      Yang Jihong authored
      work_push_atom() supports nesting, but none of the currently supported
      kworks are nested. Add an `overwrite` parameter to overwrite the original
      atom in the list.
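A minimal sketch of the overwrite behaviour, using a toy container in place of perf's real atom lists. The class and method names are hypothetical and atoms are reduced to bare timestamps.

```python
from collections import defaultdict


class AtomLists:
    """Toy stand-in for the per-work atom lists; not perf's real data structure."""

    def __init__(self):
        self._atoms = defaultdict(list)

    def push(self, work_key, timestamp, overwrite=False):
        """Append an atom; with overwrite=True, drop any pending atom first."""
        pending = self._atoms[work_key]
        if overwrite:
            pending.clear()
        pending.append(timestamp)

    def pending(self, work_key):
        """Return a copy of the atoms still pending for this work."""
        return list(self._atoms[work_key])
```

Without `overwrite`, repeated pushes for the same work accumulate as if they were nested; with it, only the most recent atom is kept.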
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-6-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kwork: Add `kwork` and `src_type` to work_init() for 'struct kwork_class' · 95064b33
      Yang Jihong authored
      To support different types of reports, add two parameters,
      `struct perf_kwork *kwork` and `enum kwork_trace_type src_type`, to
      work_init() of struct kwork_class for initialization in different scenarios.
      
      No functional change intended.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-5-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kwork: Set ordered_events to true in 'struct perf_tool' · 0c526579
      Yang Jihong authored
      'perf kwork' processes data based on timestamps and needs to sort events.
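Why timestamp ordering matters can be sketched with a small merge example. It assumes that records from different CPUs can appear out of timestamp order in the recorded data; the buffers, event names, and function below are hypothetical and do not reflect perf's ordered_events implementation.

```python
import heapq

# Hypothetical per-CPU buffers of (timestamp_ns, cpu, event) tuples.
# Each buffer is ordered on its own, but reading them back to back is not.
CPU0 = [(100, 0, "irq_entry"), (400, 0, "irq_exit")]
CPU1 = [(200, 1, "softirq_entry"), (300, 1, "softirq_exit")]


def ordered_events(*per_cpu_buffers):
    """Merge already-sorted per-CPU buffers into one timestamp-ordered stream."""
    return list(heapq.merge(*per_cpu_buffers))
```

Reading `CPU0 + CPU1` directly would deliver the event at 400 ns before the one at 200 ns, which breaks any analysis that pairs entry and exit events by time.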
      
      Fixes: f98919ec ("perf kwork: Implement 'report' subcommand")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-4-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kwork: Add the supported subcommands to the document · 76e0d8c8
      Yang Jihong authored
      Add missing report, latency and timehist subcommands to the document.
      
      Fixes: f98919ec ("perf kwork: Implement 'report' subcommand")
      Fixes: ad3d9f7a ("perf kwork: Implement perf kwork latency")
      Fixes: bcc8b3e8 ("perf kwork: Implement perf kwork timehist")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-3-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kwork: Fix incorrect and missing free atom in work_push_atom() · d3971008
      Yang Jihong authored
      1. Atoms are managed in page mode and should be released using atom_free()
         instead of free().
      2. When the event does not match, the atom needs to be freed.
      
      Fixes: f98919ec ("perf kwork: Implement 'report' subcommand")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-2-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Add perf_event_attr test for record dummy event · d50ad02c
      Yang Jihong authored
      If only a dummy event is recorded, a tracking event is not needed.
      Add a test for this scenario.
      
      Test result:
      
        # ./perf test list 2>&1 | grep 'Setup struct perf_event_attr'
         17: Setup struct perf_event_attr
        # ./perf test 17 -v
         17: Setup struct perf_event_attr                                    :
        --- start ---
        test child forked, pid 720198
        <SNIP>
        running './tests/attr/test-record-dummy-C0'
        <SNIP>
        test child finished with 0
        ---- end ----
        Setup struct perf_event_attr: Ok
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230904023340.12707-7-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Add test case for record sideband events · 23b97c7e
      Yang Jihong authored
      Add a new test case to record sideband events for all CPUs when tracing
      selected CPUs.
      
      Test result:
      
        # ./perf test list 2>&1 | grep 'perf record sideband tests'
         95: perf record sideband tests
        # ./perf test 95
         95: perf record sideband tests                                      : Ok
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230904023340.12707-6-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Track sideband events for all CPUs when tracing selected CPUs · 74b4f3ec
      Yang Jihong authored
      User space tasks can migrate between CPUs, so we need to track sideband
      events for all CPUs.
      
      The specific scenarios are as follows:
      
               CPU0                                 CPU1
        perf record -C 0 start
                                    taskA starts to be created and executed
                                      -> PERF_RECORD_COMM and PERF_RECORD_MMAP
                                         events only deliver to CPU1
                                    ......
                                      |
                                migrate to CPU0
                                      |
        Running on CPU0    <----------/
        ...
      
        perf record -C 0 stop
      
      Now perf samples the PC of taskA. However, perf does not record the
      PERF_RECORD_COMM and PERF_RECORD_MMAP events of taskA.
      Therefore, the comm and symbols of taskA cannot be parsed.
      
      The solution is to record sideband events for all CPUs when tracing
      selected CPUs. Because this modifies the default behavior, add related
      comments to the perf record man page.
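The migration scenario above can be modelled in a few lines. The event tuples, PID, and function name are illustrative only; real perf records carry far more information.

```python
# Hypothetical event stream: (cpu, kind, pid, payload). Names are illustrative only.
EVENTS = [
    (1, "COMM", 42, "taskA"),      # taskA is created on CPU1
    (1, "MMAP", 42, "/bin/taskA"),
    (0, "SAMPLE", 42, 0x401000),   # after migrating, taskA is sampled on CPU0
]


def resolve_comms(events, sample_cpus, sideband_cpus):
    """Return the comm seen for each sample, or None if its COMM event was not recorded."""
    comms = {}
    resolved = []
    for cpu, kind, pid, payload in events:
        if kind == "COMM" and cpu in sideband_cpus:
            comms[pid] = payload
        elif kind == "SAMPLE" and cpu in sample_cpus:
            resolved.append(comms.get(pid))
    return resolved
```

With sideband limited to CPU 0, `resolve_comms(EVENTS, {0}, {0})` cannot name the sampled task; widening sideband to both CPUs, `resolve_comms(EVENTS, {0}, {0, 1})`, resolves it while samples are still taken only on CPU 0.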
      
      The resulting sys_perf_event_open calls are as follows:
      
        # perf --debug verbose=3 record -e cpu-clock -C 1 true
        <SNIP>
        Opening: cpu-clock
        ------------------------------------------------------------
        perf_event_attr:
          type                             1 (PERF_TYPE_SOFTWARE)
          size                             136
          config                           0 (PERF_COUNT_SW_CPU_CLOCK)
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
          read_format                      ID|LOST
          disabled                         1
          inherit                          1
          freq                             1
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 5
        Opening: dummy:u
        ------------------------------------------------------------
        perf_event_attr:
          type                             1 (PERF_TYPE_SOFTWARE)
          size                             136
          config                           0x9 (PERF_COUNT_SW_DUMMY)
          { sample_period, sample_freq }   1
          sample_type                      IP|TID|TIME|CPU|IDENTIFIER
          read_format                      ID|LOST
          inherit                          1
          exclude_kernel                   1
          exclude_hv                       1
          mmap                             1
          comm                             1
          task                             1
          sample_id_all                    1
          exclude_guest                    1
          mmap2                            1
          comm_exec                        1
          ksymbol                          1
          bpf_event                        1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 6
        sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 7
        sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 9
        sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 10
        sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 11
        sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 12
        sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 13
        sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 14
        <SNIP>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230904023340.12707-5-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Move setting tracking events before record__init_thread_masks() · 1285ab30
      Yang Jihong authored
      User space tasks can migrate between CPUs, so when tracing selected CPUs,
      sideband for all CPUs is needed. In this case set the cpu map of the evsel
      to all online CPUs. This may modify the original cpu map of the evlist.
      
      Therefore, we need to check whether the preceding scenario applies before
      calling record__init_thread_masks().
      
      Dummy tracking is currently set up in record__open(); move it before
      record__init_thread_masks() and add a helper for unified processing.
      
      The resulting sys_perf_event_open calls are as follows:
      
        # perf --debug verbose=3 record -e cpu-clock -D 100 true
        <SNIP>
        Opening: cpu-clock
        ------------------------------------------------------------
        perf_event_attr:
          type                             1 (PERF_TYPE_SOFTWARE)
          size                             136
          config                           0 (PERF_COUNT_SW_CPU_CLOCK)
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|PERIOD|IDENTIFIER
          read_format                      ID|LOST
          disabled                         1
          inherit                          1
          freq                             1
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid 10318  cpu 0  group_fd -1  flags 0x8 = 5
        sys_perf_event_open: pid 10318  cpu 1  group_fd -1  flags 0x8 = 6
        sys_perf_event_open: pid 10318  cpu 2  group_fd -1  flags 0x8 = 7
        sys_perf_event_open: pid 10318  cpu 3  group_fd -1  flags 0x8 = 9
        sys_perf_event_open: pid 10318  cpu 4  group_fd -1  flags 0x8 = 10
        sys_perf_event_open: pid 10318  cpu 5  group_fd -1  flags 0x8 = 11
        sys_perf_event_open: pid 10318  cpu 6  group_fd -1  flags 0x8 = 12
        sys_perf_event_open: pid 10318  cpu 7  group_fd -1  flags 0x8 = 13
        Opening: dummy:u
        ------------------------------------------------------------
        perf_event_attr:
          type                             1 (PERF_TYPE_SOFTWARE)
          size                             136
          config                           0x9 (PERF_COUNT_SW_DUMMY)
          { sample_period, sample_freq }   1
          sample_type                      IP|TID|TIME|IDENTIFIER
          read_format                      ID|LOST
          disabled                         1
          inherit                          1
          exclude_kernel                   1
          exclude_hv                       1
          mmap                             1
          comm                             1
          enable_on_exec                   1
          task                             1
          sample_id_all                    1
          exclude_guest                    1
          mmap2                            1
          comm_exec                        1
          ksymbol                          1
          bpf_event                        1
        ------------------------------------------------------------
        sys_perf_event_open: pid 10318  cpu 0  group_fd -1  flags 0x8 = 14
        sys_perf_event_open: pid 10318  cpu 1  group_fd -1  flags 0x8 = 15
        sys_perf_event_open: pid 10318  cpu 2  group_fd -1  flags 0x8 = 16
        sys_perf_event_open: pid 10318  cpu 3  group_fd -1  flags 0x8 = 17
        sys_perf_event_open: pid 10318  cpu 4  group_fd -1  flags 0x8 = 18
        sys_perf_event_open: pid 10318  cpu 5  group_fd -1  flags 0x8 = 19
        sys_perf_event_open: pid 10318  cpu 6  group_fd -1  flags 0x8 = 20
        sys_perf_event_open: pid 10318  cpu 7  group_fd -1  flags 0x8 = 21
        <SNIP>
      
      'perf test' needs updated expected attr values for base-record and
      system-wide-dummy in test-record-C0:
      
      1. A dummy sideband event is now added when sampling specified CPUs. When
         the evlist contains evsels with different sample_type values,
         evlist__config() changes the default PERF_SAMPLE_ID bit to the
         PERF_SAMPLE_IDENTIFIER bit.
         The expected attr sample_type value of base-record and system-wide-dummy
         in test-record-C0 needs to be updated.
      
      2. perf record uses evlist__add_aux_dummy() instead of
         evlist__add_dummy() to add a dummy event.
         The expected value of the system-wide-dummy attr needs to be updated.
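The sample_type change in item 1 can be sketched as a simplified model. The bit values come from the perf_event_open ABI, but `config_sample_ids()` is a hypothetical helper that covers only the choice between PERF_SAMPLE_ID and PERF_SAMPLE_IDENTIFIER for a list with several events; it is not a port of evlist__config().

```python
# Bit values from the perf_event_open ABI (enum perf_event_sample_format).
PERF_SAMPLE_IP = 1 << 0
PERF_SAMPLE_TID = 1 << 1
PERF_SAMPLE_TIME = 1 << 2
PERF_SAMPLE_ID = 1 << 6
PERF_SAMPLE_CPU = 1 << 7
PERF_SAMPLE_PERIOD = 1 << 8
PERF_SAMPLE_IDENTIFIER = 1 << 16


def config_sample_ids(sample_types):
    """Simplified model for a multi-event list: use IDENTIFIER when the
    sample_type values differ across events, otherwise use ID."""
    mixed = len(set(sample_types)) > 1
    id_bit = PERF_SAMPLE_IDENTIFIER if mixed else PERF_SAMPLE_ID
    return [st | id_bit for st in sample_types]
```

In this model, a cpu-clock event with `IP|TID|TIME|PERIOD` next to a dummy event with `IP|TID|TIME` ends up with the IDENTIFIER bit on both, which matches the `IDENTIFIER` entries in the sample_type lines of the debug output.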
      
      The 'perf test' result is as follows:
      
        # ./perf test list  2>&1 | grep 'Setup struct perf_event_attr'
         17: Setup struct perf_event_attr
        # ./perf test 17
         17: Setup struct perf_event_attr                                    : Ok
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230904023340.12707-4-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1285ab30
    • Yang Jihong's avatar
      perf evlist: Add evlist__findnew_tracking_event() helper · 9c95e4ef
      Yang Jihong authored
      Currently, intel-bts, intel-pt, and arm-spe may add a tracking event to
      the evlist. We may need to search for the tracking event for some settings.
      
      Therefore, add evlist__findnew_tracking_event() helper.
      
      If system_wide is true, evlist__findnew_tracking_event() sets the cpu map
      of the evsel to all online CPUs.
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230904023340.12707-3-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9c95e4ef
    • Yang Jihong's avatar
      perf evlist: Add perf_evlist__go_system_wide() helper · f6ff1c76
      Yang Jihong authored
      For dummy events that keep tracking, we may need to modify their cpu_maps.
      
      For example, change the cpu_maps to record sideband events for all CPUs.
      
      Add perf_evlist__go_system_wide() helper to support this scenario.
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230904023340.12707-2-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f6ff1c76
  2. 11 Sep, 2023 16 commits
  3. 10 Sep, 2023 6 commits
    • Linus Torvalds's avatar
      Linux 6.6-rc1 · 0bb80ecc
      Linus Torvalds authored
      0bb80ecc
    • Linus Torvalds's avatar
      Merge tag 'topic/drm-ci-2023-08-31-1' of git://anongit.freedesktop.org/drm/drm · 1548b060
      Linus Torvalds authored
      Pull drm ci scripts from Dave Airlie:
       "This is a bunch of ci integration for the freedesktop gitlab instance
        where we currently do upstream userspace testing on diverse sets of
        GPU hardware. From my perspective I think it's an experiment worth
        going with and seeing how the benefits/noise play out keeping these
        files useful.
      
        Ideally I'd like to get this so we can do pre-merge testing on PRs
        eventually.
      
        Below is some info from danvet on why we've ended up making the
        decision and how we can roll it back if we decide it was a bad plan.
      
        Why in upstream?
      
         - like documentation, testcases, tools CI integration is one of these
           things where you can waste endless amounts of time if you
           accidentally have a version that doesn't match your source code
      
         - but also like the above, there's a balance, this is the initial cut
           of what we think makes sense to keep in sync vs out-of-tree,
           probably needs adjustment
      
         - gitlab supports out-of-repo gitlab integration and that's what's
           been used for the kernel in drm, but it results in per-driver
           fragmentation and lots of duplicated effort. the simple act of
           smashing an arbitrary winner into a topic branch already started
           surfacing patches on dri-devel and sparking good cross driver team
           discussions
      
        Why gitlab?
      
         - it's not any more shit than any of the other CI
      
         - drm userspace uses it extensively for everything in userspace, we
           have a lot of people and experience with this, including
           integration of hw testing labs
      
         - media userspace like gstreamer is also on gitlab.fd.o, and there's
           discussion to extend this to the media subsystem in some fashion
      
        Can this be shared?
      
         - there's definitely a pile of code that could move to scripts/ if
           other subsystems adopt ci integration in upstream kernel git. other
           bits are more drm/gpu specific like the igt-gpu-tests/tools
           integration
      
         - docker images can be run locally or in other CI runners
      
        Will we regret this?
      
         - it's all in one directory, intentionally, for easy deletion
      
         - probably 1-2 years in upstream to see whether this is worth it or a
           Big Mistake. that's roughly what it took to _really_ roll out solid
           CI in the bigger userspace projects we have on gitlab.fd.o like
           mesa3d"
      
      * tag 'topic/drm-ci-2023-08-31-1' of git://anongit.freedesktop.org/drm/drm:
        drm: ci: docs: fix build warning - add missing escape
        drm: Add initial ci/ subdirectory
      1548b060
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e56b2b60
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Fix preemption delays in the SGX code, remove unnecessarily
        UAPI-exported code, fix a ld.lld linker (in)compatibility quirk and
        make the x86 SMP init code a bit more conservative to fix kexec()
        lockups"
      
      * tag 'x86-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/sgx: Break up long non-preemptible delays in sgx_vepc_release()
        x86: Remove the arch_calc_vm_prot_bits() macro from the UAPI
        x86/build: Fix linker fill bytes quirk/incompatibility for ld.lld
        x86/smp: Don't send INIT to non-present and non-booted CPUs
      e56b2b60
    • Linus Torvalds's avatar
      Merge tag 'perf-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e79dbf03
      Linus Torvalds authored
      Pull x86 perf event fix from Ingo Molnar:
       "Work around a firmware bug in the uncore PMU driver, affecting certain
        Intel systems"
      
      * tag 'perf-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/uncore: Correct the number of CHAs on EMR
      e79dbf03
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of... · 535a265d
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
      
      Pull perf tools updates from Arnaldo Carvalho de Melo:
       "perf tools maintainership:
      
         - Add git information for perf-tools and perf-tools-next trees and
           branches to the MAINTAINERS file. That is where development now
           takes place and myself and Namhyung Kim have write access, more
           people to come as we emulate other maintainer groups.
      
        perf record:
      
         - Record kernel data maps when 'perf record --data' is used, so that
           global variables can be resolved and used in tools that do data
           profiling.
      
        perf trace:
      
         - Remove the old, experimental support for BPF events in which a .c
           file was passed as an event: "perf trace -e hello.c" to then get
           compiled and loaded.
      
           The only known usage for that, that shipped with the kernel as an
           example for such events, augmented the raw_syscalls tracepoints and
           was converted to a libbpf skeleton, reusing all the user space
           components and the BPF code connected to the syscalls.
      
           In the end just the way to glue the BPF part and the user space
           type beautifiers changed, now being performed by libbpf skeletons.
      
           The next step is to use BTF to do pretty printing of all syscall
           types, as discussed with Alan Maguire and others.
      
           Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all
           path/filenames/strings, some of the networking data structures,
           perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls
           and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5
           seconds:
      
            # perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5
               0.000 (   9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
               9.039 (   0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
                   ? (           ): gpm/991  ... [continued]: clock_nanosleep())               = 0
              10.133 (           ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ...
                   ? (           ): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
              30.276 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
             223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
              30.276 (2000.394 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
            1230.814 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
            1230.814 (1000.404 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
            2030.886 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
            2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
                   ? (           ): crond/1172  ... [continued]: clock_nanosleep())            = 0
            3242.699 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
            2030.886 (2000.385 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
            3728.078 (           ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ...
            3242.699 (1000.158 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
            4031.409 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
              10.133 (5000.375 ms): sleep/327642  ... [continued]: clock_nanosleep())          = 0
      
            Performance counter stats for 'sleep 5':
      
                   2,617,347      cycles
                   1,855,997      instructions                     #    0.71  insn per cycle
      
                 5.002282128 seconds time elapsed
      
                 0.000855000 seconds user
                 0.000852000 seconds sys
      
        perf annotate:
      
         - Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1)
           for licensing reasons, and we missed a build test on
           tools/perf/tests makefile.
      
           Since we now default to NDEBUG=1, we ended up segfaulting when
           building with BUILD_NONDISTRO=1 because a needed initialization
           routine was being "error checked" via an assert.
      
           Fix it by explicitly checking the result and aborting instead if it
           fails.
      
           It would be better to propagate the error back, but at least 'perf
           annotate' on samples collected for a BPF program is working again
           when perf is built with BUILD_NONDISTRO=1.
      
        perf report/top:
      
         - Add back TUI hierarchy mode header, that is seen when using 'perf
           report/top --hierarchy'.
      
         - Fix the number of entries for 'e' key in the TUI that was
           preventing navigation of lines when expanding an entry.
      
        perf report/script:
      
         - Support cross platform register handling, allowing a perf.data file
           collected on one architecture to have its sampled registers
           displayed correctly when analysis tools such as 'perf report' and
           'perf script' are used on a different architecture.
      
         - Fix handling of event attributes in pipe mode, i.e. when one uses:
      
        	perf record -o - | perf report -i -
      
           When no perf.data files are used.
      
         - Handle files generated via pipe mode with one version of perf and
           then read, also via pipe mode, with a different version of perf,
           where the event attr record may have changed. The record size
           field is used to properly support this version mismatch.
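
The size-based handling of mismatched record layouts described above can be sketched as follows. This is a minimal Python illustration, not perf's actual code: it assumes each record starts with the 8-byte perf_event_header (u32 type, u16 misc, u16 size), and the helpers `iter_records` and `make_record`, the record type value, and the payloads are all hypothetical. A reader that understands only the first few payload bytes relies on `size` to skip whatever a newer writer appended.

```python
import struct

# struct perf_event_header: u32 type, u16 misc, u16 size (little-endian assumed).
HEADER = struct.Struct("<IHH")


def iter_records(stream: bytes, known_payload: int):
    """Yield (type, payload) pairs, keeping only the bytes this reader understands.

    `size` covers header plus payload, so advancing by `size` stays in sync
    even when the writer appended fields that this reader does not know about.
    """
    offset = 0
    while offset + HEADER.size <= len(stream):
        rec_type, _misc, size = HEADER.unpack_from(stream, offset)
        if size < HEADER.size or offset + size > len(stream):
            raise ValueError("corrupt or truncated record")
        payload = stream[offset + HEADER.size : offset + size]
        yield rec_type, payload[:known_payload]
        offset += size


def make_record(rec_type: int, payload: bytes) -> bytes:
    """Build a demo record: header followed by an opaque payload."""
    return HEADER.pack(rec_type, 0, HEADER.size + len(payload)) + payload


# Hypothetical stream: an "old" 4-byte payload followed by a "new" 8-byte one.
old = make_record(64, b"AAAA")
new = make_record(64, b"BBBBxxxx")
records = list(iter_records(old + new, known_payload=4))
```

Both records are decoded with the older 4-byte layout; the extra bytes in the second record are skipped rather than misread as the start of the next record.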
      
        perf probe:
      
         - Accessing global variables from uprobes isn't supported, make the
           error message state that instead of stating that some minimal
           kernel version is needed to have that feature. This seems just a
           tool limitation, the kernel probably has all that is needed.
      
        perf tests:
      
         - Fix a reference count related leak in the dlfilter v0 API where the
           result of a thread__find_symbol_fb() is not matched with an
           addr_location__exit() to drop the reference counts of the resolved
           components (machine, thread, map, symbol, etc). Add a dlfilter test
           to make sure that doesn't regress.
      
         - Lots of fixes for the 'perf test' written in shell script related
           to problems found with the shellcheck utility.
      
         - Fixes for 'perf test' shell scripts testing features enabled when
           perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf
           counters.
      
         - Add perf record sample filtering test, things like the following
           example, that gets implemented as a BPF filter attached to the
           event:
      
             # perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'
      
         - Improve the way the task_analyzer test checks if libtraceevent is
           linked, using 'perf version --build-options' instead of the more
           expensive 'perf record -e "sched:sched_switch"'.
      
         - Add support for riscv in the mmap-basic test. (This went as well
           via the RiscV tree, same contents).
      
        libperf:
      
         - Implement riscv mmap support (This went as well via the RiscV tree,
           same contents).
      
        perf script:
      
         - New tool that converts perf.data files to the firefox profiler
           format so that one can use the visualizer at
           https://profiler.firefox.com/. Done by Anup Sharma as part of this
           year's Google Summer of Code.
      
           One can generate the output and upload it to the web interface but
           Anup also automated everything:
      
             perf script gecko -F 99 -a sleep 60
      
         - Support syscall name parsing on arm64.
      
         - Print "cgroup" field on the same line as "comm".
      
        perf bench:
      
         - Add new 'uprobe' benchmark to measure the overhead of uprobes
           with/without BPF programs attached to it.
      
         - breakpoints are not available on power9, skip that test.
      
        perf stat:
      
         - Add #num_cpus_online literal to be used in 'perf stat' metrics, and
           add this extra 'perf test' check that exemplifies its purpose:
      
        	TEST_ASSERT_VAL("#num_cpus_online",
                               expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0);
        	TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
        	TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);
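
The relation checked by the last assertion above can be illustrated with a small Python sketch. It is not how perf computes the literals: it only counts CPUs in a kernel cpulist string (the format used by files such as /sys/devices/system/cpu/online), and the function name `count_cpus` and the example CPU lists are hypothetical.

```python
def count_cpus(cpulist: str) -> int:
    """Count CPUs in a kernel cpulist string such as "0-3,5"."""
    total = 0
    for chunk in cpulist.strip().split(","):
        if not chunk:
            continue
        first, _, last = chunk.partition("-")
        # A single CPU ("5") has no range end, so it counts as one.
        total += int(last or first) - int(first) + 1
    return total


# Hypothetical machine: 8 CPUs, with CPUs 4 and 5 taken offline.
num_cpus = count_cpus("0-7")
num_cpus_online = count_cpus("0-3,6-7")
```

With CPUs offlined, the online count drops below the total, which is why a metric may need `#num_cpus_online` rather than `#num_cpus`.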
      
        Miscellaneous:
      
         - Improve tool startup time by lazily reading PMU, JSON, sysfs data.
      
         - Improve error reporting in the parsing of events, passing YYLTYPE
           to error routines, so that the output can show where the parsing
           error was found.
      
         - Add 'perf test' entries to check the parsing of events
           improvements.
      
         - Fix various leaks detected by -fsanitize=address, mostly
           things that would be freed at tool exit, including:
      
             - Free evsel->filter on the destructor.
      
             - Allow tools to register a thread->priv destructor and use it in
               'perf trace'.
      
             - Free evsel->priv in 'perf trace'.
      
             - Free string returned by synthesize_perf_probe_point() when the
               caller fails to do all it needs.
      
         - Adjust various compiler options to not treat some warnings as
           errors when building with broken headers found in things like
           python, flex, bison, as we otherwise build with -Werror. Some for
           gcc, some for clang, some for some specific version of those, some
           for some specific version of flex or bison, or some specific
           combination of these components, bah.
      
         - Allow customization of clang options for BPF target, this helps
           building on gentoo where there are other oddities where BPF targets
           get passed some compiler options intended for the native build, so
           building with WERROR=0 helps while these oddities are fixed.
      
         - Don't pass ERR_PTR() values to perf_session__delete() in 'perf top'
           and 'perf lock', fixing some segfaults when handling some odd
           failures.
      
         - Add LTO build option.
      
         - Fix format of unordered lists in the perf docs
           (tools/perf/Documentation)
      
         - Overhaul the bison files, using constructs such as YYNOMEM.
      
         - Remove unused tokens from the bison .y files.
      
         - Add more comments to various structs.
      
         - A few LoongArch enablement patches.
      
        Vendor events (JSON):
      
         - Add JSON metrics for Yitian 710 DDR (aarch64). Things like:
      
        	EventName, BriefDescription
        	visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.",
        	visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.",
        	op_is_dqsosc_mpc	       , "A DQS Oscillator MPC command to DRAM.",
        	op_is_dqsosc_mrr	       , "A DQS Oscillator MRR command to DRAM.",
        	op_is_tcr_mrr		       , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",
      
         - Add AmpereOne metrics (aarch64).
      
         - Update N2 and V2 metrics (aarch64) and events using Arm telemetry
           repo.
      
         - Update scale units and descriptions of common topdown metrics on
           aarch64. Things like:
             - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
             - "BriefDescription": "Frontend bound L1 topdown metric",
             + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
             + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",
      
         - Update events for intel: meteorlake to 1.04, sapphirerapids to
           1.15, Icelake+ metric constraints.
      
         - Update files for the power10 platform"
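
The rescaling of the aarch64 frontend-bound topdown metric listed above (from a ratio to a percentage) can be checked with a short Python sketch. The function name and the counter values are hypothetical; only the expression itself comes from the updated "MetricExpr".

```python
def frontend_bound_percent(stall_slot_frontend: float, slots: float,
                           cpu_cycles: float) -> float:
    """Updated expression: 100 * (stall_slot_frontend / (#slots * cpu_cycles))."""
    return 100 * (stall_slot_frontend / (slots * cpu_cycles))


# Hypothetical counts: 8 slots per cycle, 1,000,000 cycles, 2,000,000 stalled slots.
old_ratio = 2_000_000 / (8 * 1_000_000)
new_percent = frontend_bound_percent(2_000_000, 8, 1_000_000)
```

The old expression reports the same quantity as a fraction of total slots; the new one reports it as a percentage.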
      
      * tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits)
        perf parse-events: Fix driver config term
        perf parse-events: Fixes relating to no_value terms
        perf parse-events: Fix propagation of term's no_value when cloning
        perf parse-events: Name the two term enums
        perf list: Don't print Unit for "default_core"
        perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake
        perf dlfilter: Avoid leak in v0 API test use of resolve_address()
        perf metric: Add #num_cpus_online literal
        perf pmu: Remove str from perf_pmu_alias
        perf parse-events: Make common term list to strbuf helper
        perf parse-events: Minor help message improvements
        perf pmu: Avoid uninitialized use of alias->str
        perf jevents: Use "default_core" for events with no Unit
        perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
        perf test shell stat_bpf_counters: Fix test on Intel
        perf test shell record_bpf_filter: Skip 6.2 kernel
        libperf: Get rid of attr.id field
        perf tools: Convert to perf_record_header_attr_id()
        libperf: Add perf_record_header_attr_id()
        perf tools: Handle old data in PERF_RECORD_ATTR
        ...
      535a265d
    • Linus Torvalds's avatar
      Merge tag '6.6-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · fd3a5940
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
      
       - six smb3 client fixes including ones to allow controlling smb3
         directory caching timeout and limits, and one debugging improvement
      
       - one fix for nls Kconfig (don't need to expose NLS_UCS2_UTILS option)
      
       - one minor spnego registry update
      
      * tag '6.6-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        spnego: add missing OID to oid registry
        smb3: fix minor typo in SMB2_GLOBAL_CAP_LARGE_MTU
        cifs: update internal module version number for cifs.ko
        smb3: allow controlling maximum number of cached directories
        smb3: add trace point for queryfs (statfs)
        nls: Hide new NLS_UCS2_UTILS
        smb3: allow controlling length of time directory entries are cached with dir leases
        smb: propagate error code of extract_sharename()
      fd3a5940
  4. 09 Sep, 2023 3 commits
    • David Howells's avatar
      iov_iter: Kunit tests for page extraction · a3c57ab7
      David Howells authored
      Add some kunit tests for page extraction for ITER_BVEC, ITER_KVEC and
      ITER_XARRAY type iterators.  ITER_UBUF and ITER_IOVEC aren't dealt with
      as they require userspace VM interaction.  ITER_DISCARD isn't dealt with
      either as that can't be extracted.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a3c57ab7
    • David Howells's avatar
      iov_iter: Kunit tests for copying to/from an iterator · 2d71340f
      David Howells authored
      Add some kunit tests for copying to and from ITER_BVEC, ITER_KVEC and
      ITER_XARRAY type iterators.  ITER_UBUF and ITER_IOVEC aren't dealt with
      as they require userspace VM interaction.  ITER_DISCARD isn't dealt with
      either as that does nothing.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2d71340f
    • David Howells's avatar
      iov_iter: Fix iov_iter_extract_pages() with zero-sized entries · f741bd71
      David Howells authored
      iov_iter_extract_pages() doesn't correctly handle skipping over initial
      zero-length entries in ITER_KVEC and ITER_BVEC-type iterators.
      
      The problem is that it accidentally reduces maxsize to 0 when
      skipping and thus runs to the end of the array and returns 0.
      
      Fix this by sticking the calculated size-to-copy in a new variable
      rather than back in maxsize.
      
      Fixes: 7d58fe73 ("iov_iter: Add a function to extract a page list from an iterator")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f741bd71
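
The maxsize problem described in the iov_iter_extract_pages() fix above can be modeled in a few lines of Python. This is a simplified sketch, not the kernel code: it ignores offsets and page handling, treats an iterator as a plain list of segment lengths, and both function names are hypothetical. It only shows why writing the per-segment size back into maxsize breaks once a zero-length entry has been seen.

```python
def first_extract_size_buggy(seg_lens, maxsize):
    """Model of the bug: the clamped size is stored back into maxsize."""
    for seg_len in seg_lens:
        maxsize = min(maxsize, seg_len)  # a zero-length entry zeroes the limit
        if maxsize:
            return maxsize
    return 0  # ran off the end of the array


def first_extract_size_fixed(seg_lens, maxsize):
    """Model of the fix: keep the clamped size in a separate variable."""
    for seg_len in seg_lens:
        size = min(maxsize, seg_len)
        if size:
            return size
    return 0


# One leading zero-length entry followed by a 4096-byte entry.
segments = [0, 4096]
buggy = first_extract_size_buggy(segments, 8192)
fixed = first_extract_size_fixed(segments, 8192)
```

In the buggy model the zero-length entry clamps the limit to 0, so every later entry also yields 0; the fixed model skips the empty entry and returns the size of the first non-empty one.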