1. 28 Nov, 2017 15 commits
    • perf top: Ignore kptr_restrict when not sampling the kernel · df7ccfa2
      Arnaldo Carvalho de Melo authored
      If all events have attr.exclude_kernel set, no need to look at
      kptr_restrict.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-yegpzg5bf2im69g0tfizqaqz@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Ignore kptr_restrict when not sampling the kernel · b0ebd811
      Arnaldo Carvalho de Melo authored
      If we're not sampling the kernel, we shouldn't care about kptr_restrict,
      nor synthesize anything to assist in resolving kernel samples, such as
      the reference relocation symbol or kernel module information.
      
      Before:
      
        $ cat /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid
        2
        2
        $ perf record sleep 1
        WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
        check /proc/sys/kernel/kptr_restrict.
      
        Samples in kernel functions may not be resolved if a suitable vmlinux
        file is not found in the buildid cache or in the vmlinux path.
      
        Samples in kernel modules won't be resolved at all.
      
        If some relocation was applied (e.g. kexec) symbols may be misresolved
        even with a suitable vmlinux or kallsyms file.
      
        Couldn't record kernel reference relocation symbol
        Symbol resolution may be skewed if relocation was used (e.g. kexec).
        Check /proc/kallsyms permission or run as root.
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
        $ perf evlist -v
        cycles:uppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        $
      
      After:
      
        $ perf record sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data (10 samples) ]
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-t025e9zftbx2b8cq2w01g5e5@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Ignore kptr_restrict when not sampling the kernel · 3f0a4c87
      Arnaldo Carvalho de Melo authored
      If none of the evsels has attr.exclude_kernel set to zero, there will be
      no kernel samples, so there is no point in warning the user about
      problems in processing them.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-7dn926v3at8txxkky92aesz2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Add helper to check if attr.exclude_kernel is set in all evsels · 5b0d1cb4
      Arnaldo Carvalho de Melo authored
      The warning about kptr_restrict needs to be emitted only when it is set
      and we ask for kernel space samples, so add a helper to help with that.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-fh7drty6yljei9gxxzer6eup@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
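
As a rough illustration of the helper described in the commit above, the following sketch walks a list of events and reports whether every one of them excludes kernel samples. The `struct event_attr`, `struct evsel`, `all_exclude_kernel()` and `should_warn_kptr_restrict()` names are simplified stand-ins invented for this example, not perf's actual types or functions; only the `exclude_kernel` field name comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for perf's event structures. */
struct event_attr {
	unsigned int exclude_kernel : 1; /* 1: do not sample kernel space */
};

struct evsel {
	struct event_attr attr;
	struct evsel *next; /* singly linked list, for illustration only */
};

/*
 * Return true only if every event in the list excludes the kernel.
 * An empty list trivially returns true in this sketch.
 */
bool all_exclude_kernel(const struct evsel *head)
{
	for (const struct evsel *e = head; e != NULL; e = e->next) {
		if (!e->attr.exclude_kernel)
			return false; /* at least one event samples the kernel */
	}
	return true;
}

/* The warning only matters when kernel samples were actually requested. */
bool should_warn_kptr_restrict(const struct evsel *head, bool kptr_restricted)
{
	return kptr_restricted && !all_exclude_kernel(head);
}
```

With a helper of this shape, each tool can make the same decision in one place instead of repeating the loop.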
    • perf test shell: Fix test case probe libc's inet_pton on s390x · d5c5e46a
      Thomas Richter authored
      The 'perf test' case "probe libc's inet_pton & backtrace it with ping"
      fails on s390x. The reason is the 'realpath /lib64/ld*.so.* | uniq' line
      which returns 2 libraries:
      
              [root@s35lp76 shell]# realpath /lib64/ld*.so.* | uniq
              /usr/lib64/ld-2.26.so
              /usr/lib64/ld_pre_smc.so.1.0.1
              [root@s35lp76 shell]#
      
      This output makes the "perf probe" command lines invalid.
      
      Use ldd tool to find out the libraries required by "bash" and check if
      symbol "inet_pton" is part of the "libc" library.  Some distros do not
      have a /lib64 directory.
      
      I have also added a check for the existence of an IPv6 network interface
      before it is being used.
      
      Committer changes:
      
      We can't really use ldd for libc, as in some systems, such as x86_64, it
      has hardlinks and then ldd sees one and the kernel the other, so grep
      for libc in /proc/self/maps to get the one we'll receive from
      PERF_RECORD_MMAP.
      
      Thomas checked this change and acked it.
      Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Suggested-by: Hendrik Brückner <brueckner@linux.vnet.ibm.com>
      Reviewed-by: Hendrik Brückner <brueckner@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20171114133409.GN8836@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test shell: Fix check open filename arg using 'perf trace' on s390x · ccafc38f
      Thomas Richter authored
      This 'perf test' case fails on s390x. The 'touch' command on s390x uses
      the 'openat' system call to open the file named on the command line:
      
      [root@s35lp76 perf]# perf probe -l
        probe:vfs_getname    (on getname_flags:72@fs/namei.c with pathname)
      [root@s35lp76 perf]# perf trace -e open touch /tmp/abc
           0.400 ( 0.015 ms): touch/27542 open(filename:
      		/usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
      [root@s35lp76 perf]#
      
      There is no 'open' system call for file '/tmp/abc'. Instead the 'openat'
      system call is used:
      
      [root@s35lp76 perf]# strace touch /tmp/abc
          execve("/usr/bin/touch", ["touch", "/tmp/abc"], 0x3ffd547ec98
      			/* 30 vars */) = 0
          [...]
          openat(AT_FDCWD, "/tmp/abc", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3
          [...]
      
      On s390x the 'egrep' command does not find a matching pattern and
      returns an error.
      
      Fix this for s390x by creating a platform-dependent command line that
      enables the 'perf probe' call to listen to the 'openat' system call and
      get the expected output.
      Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      LPU-Reference: 20171114071847.2381-1-tmricht@linux.vnet.ibm.com
      Link: http://lkml.kernel.org/n/tip-3qf38jk0prz54rhmhyu871my@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Do not truncate instruction names at 6 chars · 05d0e62d
      Ravi Bangoria authored
      There are many instructions, esp on PowerPC, whose mnemonics are longer
      than 6 characters. Using precision limit causes truncation of such
      mnemonics.
      
      Fix this by removing the precision limit. Note that 'width' is still 6,
      so alignment is unaffected for mnemonics of length <= 6.
      
      Before:
      
         li     r11,-1
         xscvdp vs1,vs1
         add.   r10,r10,r11
      
      After:
      
        li     r11,-1
        xscvdpsxds vs1,vs1
        add.   r10,r10,r11
      Reported-by: Donald Stence <dstence@us.ibm.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Link: http://lkml.kernel.org/r/20171114032540.4564-1-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
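
The difference between a printf-style width and a precision, which the commit above relies on, can be seen in a small sketch. The two format strings and the `format_truncated()` / `format_full()` helper names are assumptions chosen for illustration; the commit message only states that the precision limit was removed while the width stayed at 6.

```c
#include <stdio.h>
#include <stddef.h>

/* Width 6 and precision 6: names longer than 6 characters are cut off. */
int format_truncated(char *buf, size_t size, const char *name, const char *ops)
{
	return snprintf(buf, size, "%-6.6s %s", name, ops);
}

/* Width 6 only: short names are still padded, long names are kept whole. */
int format_full(char *buf, size_t size, const char *name, const char *ops)
{
	return snprintf(buf, size, "%-6s %s", name, ops);
}
```

Formatting the mnemonic `xscvdpsxds` with the first helper yields `xscvdp`, matching the "Before" output, while the second keeps the full name and still pads `li` to six columns.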
    • perf help: Fix a bug during strstart() conversion · af98f227
      Namhyung Kim authored
      Commit 8e99b6d4 changed prefixcmp() to strstarts() but did not update
      the return value check in some places.  This makes perf help print
      annoying output even for sane config items, as shown below:
      
        $ perf help
        '.root': unsupported man viewer sub key.
        ...
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Taeung Song <treeze.taeung@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Sihyeon Jang <uneedsihyeon@gmail.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20171114001542.GA16464@sejong
      Fixes: 8e99b6d4 ("tools include: Adopt strstarts() from the kernel")
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
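
The class of bug described in the commit above comes from the two helpers having opposite return conventions. The sketch below shows one plausible form of the slip; the `prefixcmp_like()`, `strstarts_like()`, `is_man_key()` and `is_man_key_buggy()` functions and the `"man."` prefix are written for this example and are not copied from the perf sources.

```c
#include <stdbool.h>
#include <string.h>

/* prefixcmp()-style: returns 0 when 'str' starts with 'prefix'. */
int prefixcmp_like(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix));
}

/* strstarts()-style: returns true when 'str' starts with 'prefix'. */
bool strstarts_like(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Correct conversion: "!prefixcmp(...)" becomes "strstarts(...)". */
bool is_man_key(const char *var)
{
	return strstarts_like(var, "man.");
}

/* Buggy conversion: the negation was kept, so the test is inverted. */
bool is_man_key_buggy(const char *var)
{
	return !strstarts_like(var, "man.");
}
```

With the inverted check, unrelated config keys are treated as man-viewer keys, which is consistent with the spurious "unsupported man viewer sub key" message shown above.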
    • perf machine: Guard against NULL in machine__exit() · 4a2233b1
      Arnaldo Carvalho de Melo authored
      A recent fix for 'perf trace' introduced a bug where
      machine__exit(trace->host) could be called while trace->host was still
      NULL, so make this more robust by guarding against NULL, just like
      free() does.
      
      The problem happens, for instance, when !root users try to run 'perf
      trace':
      
        [acme@jouet linux]$ trace
        Error:	No permissions to read /sys/kernel/debug/tracing/events/raw_syscalls/sys_(enter|exit)
        Hint:	Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
      
        perf: Segmentation fault
        Obtained 7 stack frames.
        [0x4f1b2e]
        /lib64/libc.so.6(+0x3671f) [0x7f43a1dd971f]
        [0x4f3fec]
        [0x47468b]
        [0x42a2db]
        /lib64/libc.so.6(__libc_start_main+0xe9) [0x7f43a1dc3509]
        [0x42a6c9]
        Segmentation fault (core dumped)
        [acme@jouet linux]$
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrei Vagin <avagin@openvz.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vasily Averin <vvs@virtuozzo.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 33974a41 ("perf trace: Call machine__exit() at exit")
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
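
The "guard against NULL, just like free() does" idea from the commit above can be sketched as follows. The `struct machine_like` type, its `root_dir` member and the `machine_like_exit()` function are hypothetical simplifications for this example, not the real perf definitions.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an object that owns heap memory. */
struct machine_like {
	char *root_dir;
};

/*
 * Release the resources owned by 'm'. A NULL pointer is accepted and
 * ignored, mirroring free(NULL), so error paths can call this
 * unconditionally even if the object was never set up.
 */
void machine_like_exit(struct machine_like *m)
{
	if (m == NULL)
		return;

	free(m->root_dir);
	m->root_dir = NULL; /* avoid a dangling pointer after cleanup */
}
```

Making the teardown function tolerate NULL keeps early-exit paths simple, since callers no longer need to track whether initialization got far enough to allocate the object.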
    • perf script: Fix --per-event-dump for auxtrace synth evsels · 501e5bbe
      Arnaldo Carvalho de Melo authored
      When processing PERF_RECORD_AUXTRACE_INFO several perf_evsel entries
      will be synthesized and inserted into session->evlist, eventually ending
      in perf_script.tool.sample(), which ends up calling builtin-script.c's
      process_event(), that expects evsel->priv to be a perf_evsel_script
      object with a valid FILE pointer in fp.
      
      So we need to intercept the processing of PERF_RECORD_AUXTRACE_INFO and
      then set up evsel->priv for these newly created perf_evsel instances.
      This fixes the segfault in process_event(), which was trying to use a
      NULL FILE pointer.
      Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: yuzhoujian <yuzhoujian@didichuxing.com>
      Fixes: a14390fd ("perf script: Allow creating per-event dump files")
      Link: http://lkml.kernel.org/n/tip-bthnur8r8de01gxvn2qayx6e@git.kernel.org
      [ Merge fix by Ravi Bangoria before pushing upstream to preserve bisectability ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Fix up leftover perf_evsel_stat usage via evsel->priv · 8e2d8e20
      Arnaldo Carvalho de Melo authored
      I forgot one conversion, which got noticed by Thomas when running:
      
        $ perf stat  -e '{cpu-clock,instructions}' kill
        kill: not enough arguments
        Segmentation fault (core dumped)
        $
      
      Fix it: those stats are now in evsel->stats, no longer in evsel->priv.
      Reported-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Tested-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: e669e833 ("perf evsel: Restore evsel->priv as a tool private area")
      Link: http://lkml.kernel.org/r/20171109150046.GN4333@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Fix an exit code of trace__symbols_init · 35c33633
      Andrei Vagin authored
      Currently if trace_event__register_resolver() fails, we return -errno,
      but we can't be sure that errno isn't zero in this case.
      Signed-off-by: Andrei Vagin <avagin@openvz.org>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vasily Averin <vvs@virtuozzo.com>
      Link: http://lkml.kernel.org/r/20171108002246.8924-2-avagin@openvz.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
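
The hazard named in the commit above, returning `-errno` when `errno` may still be zero, can be shown with a small sketch. All three functions are hypothetical and written for this example; in particular, propagating the callee's return value is only one possible remedy, since the commit message does not spell out the fix.

```c
#include <errno.h>

/* Hypothetical callee that can fail without setting errno. */
int register_resolver_stub(int should_fail)
{
	return should_fail ? -1 : 0;
}

/* Fragile: if errno is 0 when the callee fails, this reports success. */
int symbols_init_fragile(int should_fail)
{
	int err = register_resolver_stub(should_fail);

	if (err)
		return -errno;
	return 0;
}

/* Sturdier: pass on the callee's own non-zero return value. */
int symbols_init_robust(int should_fail)
{
	int err = register_resolver_stub(should_fail);

	if (err)
		return err;
	return 0;
}
```

When `errno` happens to be 0, the fragile variant turns a failure into a zero return, so the caller carries on as if initialization had succeeded.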
    • perf record: Fix -c/-F options for cpu event aliases · 59622fd4
      Andi Kleen authored
      The Intel PMU event aliases have an implicit period= specifier to set the
      default period.
      
      Unfortunately this breaks overriding these periods with -c or -F,
      because the alias terms look like they are user specified to the
      internal parser, and user specified event qualifiers override the
      command line options.
      
      Track that they are coming from aliases by adding a "weak" state to the
      term. Any weak terms don't override command line options.
      
      I only did it for -c/-F for now, I think that's the only case that's
      broken currently.
      
      Before:
      
      $ perf record -c 1000 -vv -e uops_issued.any
      ...
        { sample_period, sample_freq }   2000003
      
      After:
      
      $ perf record -c 1000 -vv -e uops_issued.any
      ...
        { sample_period, sample_freq }   1000
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20171020202755.21410-2-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
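
The precedence rule described in the commit above (user-specified terms beat the command line, weak alias terms do not) can be sketched as a small decision function. The `struct period_term` layout and the `pick_sample_period()` function are assumptions made for this example and do not reflect perf's internal parser structures.

```c
#include <stdbool.h>

/* Hypothetical view of a parsed "period=" term attached to an event. */
struct period_term {
	bool present;         /* a period= term exists for the event */
	bool weak;            /* true if it came from an alias, not the user */
	unsigned long period; /* the value carried by the term */
};

/*
 * Choose the sample period:
 *  1. a user-specified (non-weak) term wins over everything;
 *  2. otherwise a command-line -c value wins over a weak alias term;
 *  3. otherwise the weak alias term supplies the default;
 *  4. otherwise fall back to the tool-wide default.
 */
unsigned long pick_sample_period(struct period_term term,
				 bool cmdline_given,
				 unsigned long cmdline_period,
				 unsigned long default_period)
{
	if (term.present && !term.weak)
		return term.period;
	if (cmdline_given)
		return cmdline_period;
	if (term.present)
		return term.period;
	return default_period;
}
```

With an alias term of 2000003 marked weak and `-c 1000` on the command line, this sketch selects 1000, in line with the "After" output above.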
    • perf record: Generate PERF_RECORD_{MMAP,COMM,EXEC} with --delay · dffdcbdb
      Arnaldo Carvalho de Melo authored
      When we use an initial delay, e.g.: 'perf record --delay 1000', we do not
      enable the events until that delay has passed after we started the workload,
      including the tracking event, i.e. the one for which we have attr.mmap, etc,
      enabled to ask the kernel to generate the PERF_RECORD_{MMAP,COMM,EXEC} metadata
      events that will then allow us to resolve addresses in samples to the map, dso
      and symbol. There will be a shadow that even synthesizing samples won't cover,
      i.e. the workload that we start and other processes forking while we
      wait for the initial delay to expire.
      
      So use a dummy event to be the tracking one and make it be enabled on exec.
      
      Before:
      
        # perf record --delay 1000 stress --cpu 1 --timeout 5
        stress: info: [9029] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
        stress: info: [9029] successful run completed in 5s
        [ perf record: Woken up 3 times to write data ]
        [ perf record: Captured and wrote 0.624 MB perf.data (15908 samples) ]
        # perf script | head
            :9031 9031 32001.826888:       1 cycles:ppp: ffffffff831aa30d event_function (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826893:       1 cycles:ppp: ffffffff8300d1a0 intel_bts_enable_local (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826895:       7 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826897:     103 cycles:ppp: ffffffff8300c331 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826899:    1615 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826902:   26724 cycles:ppp: ffffffff8384c6a7 native_irq_return_iret (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826913:  329739 cycles:ppp:     7fb2a5410932 [unknown] ([unknown])
            :9031 9031 32001.827033: 1225451 cycles:ppp:     7fb2a5410930 [unknown] ([unknown])
            :9031 9031 32001.827474: 1391725 cycles:ppp:     7fb2a5410930 [unknown] ([unknown])
            :9031 9031 32001.827978: 1233697 cycles:ppp:     7fb2a5410928 [unknown] ([unknown])
        #
      
      After:
      
        # perf record --delay 1000 stress --cpu 1 --timeout 5
        stress: info: [9741] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
        stress: info: [9741] successful run completed in 5s
        [ perf record: Woken up 3 times to write data ]
        [ perf record: Captured and wrote 0.751 MB perf.data (15976 samples) ]
        # perf script | head
           stress  9742 32110.959106:          1 cycles:ppp:  ffffffff831b26f6 __perf_event_task_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959110:       1 cycles:ppp: ffffffff8300c2e9 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959112:       7 cycles:ppp: ffffffff830231e0 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959115:     101 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959117:    1533 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959119:   23992 cycles:ppp: ffffffff831b0900 ctx_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959129:  329406 cycles:ppp:     7f4b1b661930 __random_r (/usr/lib64/libc-2.25.so)
           stress 9742 32110.959249: 1288322 cycles:ppp:     5566e1e7cbc9 hogcpu (/usr/bin/stress)
           stress 9742 32110.959712: 1464046 cycles:ppp:     7f4b1b66179e __random (/usr/lib64/libc-2.25.so)
           stress 9742 32110.960241: 1266918 cycles:ppp:     7f4b1b66195b __random_r (/usr/lib64/libc-2.25.so)
        #
      Reported-by: Bram Stolk <b.stolk@gmail.com>
      Tested-by: Bram Stolk <b.stolk@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 6619a53e ("perf record: Add --initial-delay option")
      Link: http://lkml.kernel.org/n/tip-nrdfchshqxf7diszhxcecqb9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Set the correct idx when adding dummy events · 555b4ec4
      Arnaldo Carvalho de Melo authored
      The evsel->idx field is used mainly to access the right bucket in
      per-event arrays such as the annotation ones, but also to set
      evsel->tracking, which in turn decides which of the events will ask
      for PERF_RECORD_{MMAP,COMM,EXEC} to be generated, i.e. which
      perf_event_attr will have its mmap, etc. fields set.
      
      When we were adding the "dummy" event using perf_evlist__add_dummy() we
      were not setting it correctly, which could result in multiple tracking
      events.
      
      Now that a dummy event is going to be used as the tracking one with
      'perf record --delay', the evlist may already be set up by the time we
      process the --delay setting, as with:
      
        perf record -e cycles,instructions --delay 1000 ./workload
      
      We will need to add a "dummy" event, then reset evsel->tracking for the
      first event, "cycles", and set it instead on the dummy one, also
      setting its attr.enable_on_exec, so that we get the PERF_RECORD_MMAP,
      etc. metadata events while waiting to enable the explicitly requested
      events. So let's get this straight and set the right evsel->idx.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Bram Stolk <b.stolk@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-nrdfchshqxf7diszhxcecqb9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 14 Nov, 2017 11 commits
    • x86 / CPU: Avoid unnecessary IPIs in arch_freq_get_on_cpu() · b29c6ef7
      Rafael J. Wysocki authored
      Even though aperfmperf_snapshot_khz() caches the samples.khz value to
      return if called again in a sufficiently short time, its caller,
      arch_freq_get_on_cpu(), still uses smp_call_function_single() to run it
      which may allow user space to trigger an IPI storm by reading from the
      scaling_cur_freq cpufreq sysfs file in a tight loop.
      
      To avoid that, move the decision on whether or not to return the cached
      samples.khz value to arch_freq_get_on_cpu().
      
      This change was part of commit 941f5f0f ("x86: CPU: Fix up "cpu MHz"
      in /proc/cpuinfo"), but it was not the reason for the revert and it
      remains applicable.
      
      Fixes: 4815d3c5 (cpufreq: x86: Make scaling_cur_freq behave more as expected)
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: WANG Chao <chao.wang@ucloud.cn>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 99306dfc
      Linus Torvalds authored
      Pull x86 timer updates from Thomas Gleixner:
       "These updates are related to TSC handling:
      
         - Support platforms which have synchronized TSCs but the boot CPU has
           a non zero TSC_ADJUST value, which is considered a firmware bug on
           normal systems.
      
           This applies to HPE/SGI UV platforms where the platform firmware
           uses TSC_ADJUST to ensure TSC synchronization across a huge number
           of sockets, but due to power on timings the boot CPU cannot be
           guaranteed to have a zero TSC_ADJUST register value.
      
         - Fix the ordering of udelay calibration and kvmclock_init()
      
         - Cleanup the udelay and calibration code"
      
      * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/tsc: Mark cyc2ns_init() and detect_art() __init
        x86/platform/UV: Mark tsc_check_sync as an init function
        x86/tsc: Make CONFIG_X86_TSC=n build work again
        x86/platform/UV: Add check of TSC state set by UV BIOS
        x86/tsc: Provide a means to disable TSC ART
        x86/tsc: Drastically reduce the number of firmware bug warnings
        x86/tsc: Skip TSC test and error messages if already unstable
        x86/tsc: Add option that TSC on Socket 0 being non-zero is valid
        x86/timers: Move simple_udelay_calibration() past kvmclock_init()
        x86/timers: Make recalibrate_cpu_khz() void
        x86/timers: Move the simple udelay calibration to tsc.h
    • Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3643b7e0
      Linus Torvalds authored
      Pull x86 cache resource updates from Thomas Gleixner:
       "This update provides updates to RDT:
      
        - A diagnostic framework for the Resource Director Technology (RDT)
          user interface (sysfs). The failure modes of the user interface are
          hard to diagnose from the error codes. An extra last command status
          file now provides sensible textual information about the failure so
          it's simpler to use.
      
        - A few minor cleanups and updates in the RDT code"
      
      * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/intel_rdt: Fix a silent failure when writing zero value schemata
        x86/intel_rdt: Fix potential deadlock during resctrl mount
        x86/intel_rdt: Fix potential deadlock during resctrl unmount
        x86/intel_rdt: Initialize bitmask of shareable resource if CDP enabled
        x86/intel_rdt: Remove redundant assignment
        x86/intel_rdt/cqm: Make integer rmid_limbo_count static
        x86/intel_rdt: Add documentation for "info/last_cmd_status"
        x86/intel_rdt: Add diagnostics when making directories
        x86/intel_rdt: Add diagnostics when writing the cpus file
        x86/intel_rdt: Add diagnostics when writing the tasks file
        x86/intel_rdt: Add diagnostics when writing the schemata file
        x86/intel_rdt: Add framework for better RDT UI diagnostics
    • Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b18d6289
      Linus Torvalds authored
      Pull x86 APIC updates from Thomas Gleixner:
       "This update provides a major overhaul of the APIC initialization and
        vector allocation code:
      
         - Unification of the APIC and interrupt mode setup which was
           scattered all over the place and was hard to follow. This also
           disentangles the timer setup from the APIC initialization which
           brings a clear separation of functionality.
      
           Great detective work from Dou Liyang!
      
         - Refactoring of the x86 vector allocation mechanism. The existing
           code was based on nested loops and rather convoluted APIC callbacks
           which had a horrible worst case behaviour and tried to serve all
           different use cases in one go. This led to quite odd hacks when
           supporting the new managed interrupt facility for multiqueue devices
           and made it more or less impossible to deal with the vector space
           exhaustion which was a major roadblock for server hibernation.
      
           Aside from that, the code dealing with cpu hotplug and the system
           vectors was disconnected from the actual vector management and
           allocation code, which made it hard to follow and maintain.
      
           Utilizing the new bitmap matrix allocator core mechanism, the new
           allocator and management code consolidates the handling of system
           vectors, legacy vectors, cpu hotplug mechanisms and the actual
           allocation which needs to be aware of system and legacy vectors and
           hotplug constraints into a single consistent entity.
      
           This has one visible change: The support for multi CPU targets of
           interrupts, which is only available on a certain subset of
           CPUs/APIC variants has been removed in favour of single interrupt
           targets. A proper analysis of the multi CPU target feature revealed
           that there is no real advantage as the vast majority of interrupts
           end up on the CPU with the lowest APIC id in the set of target CPUs
           anyway. That change was agreed on by the relevant folks and allowed
           the implementation to be simplified significantly and rather
           fragile constructs like the vector cleanup IPI to be replaced with
           straightforward and solid code.
      
           Furthermore this allowed a clean separation of the allocation details
           for legacy, normal and managed interrupts:
      
            * Legacy interrupts are no longer wasting 16 vectors
              unconditionally
      
            * Managed interrupts have now a guaranteed vector reservation, but
              the actual vector assignment happens when the interrupt is
              requested. It's guaranteed not to fail.
      
            * Normal interrupts no longer allocate vectors unconditionally
              when the interrupt is set up (IO/APIC init or MSI(X) enable).
              The mechanism has been switched to a best effort reservation
              mode. The actual allocation happens when the interrupt is
              requested. Contrary to managed interrupts the request can fail
              due to vector space exhaustion, but drivers must handle a failure
              of request_irq() anyway. When the interrupt is freed, the vector
              is handed back as well.
      
              This solves a long standing problem with large unconditional
              vector allocations for a certain class of enterprise devices
              which prevented server hibernation due to vector space
              exhaustion when the unused allocated vectors had to be migrated
              to CPU0 while unplugging all non boot CPUs.
      
           The code has been equipped with trace points and detailed debugfs
           information to aid analysis of the vector space"
      
      * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
        x86/vector/msi: Select CONFIG_GENERIC_IRQ_RESERVATION_MODE
        PCI/MSI: Set MSI_FLAG_MUST_REACTIVATE in core code
        genirq: Add config option for reservation mode
        x86/vector: Use correct per cpu variable in free_moved_vector()
        x86/apic/vector: Ignore set_affinity call for inactive interrupts
        x86/apic: Fix spelling mistake: "symmectic" -> "symmetric"
        x86/apic: Use dead_cpu instead of current CPU when cleaning up
        ACPI/init: Invoke early ACPI initialization earlier
        x86/vector: Respect affinity mask in irq descriptor
        x86/irq: Simplify hotplug vector accounting
        x86/vector: Switch IOAPIC to global reservation mode
        x86/vector/msi: Switch to global reservation mode
        x86/vector: Handle managed interrupts proper
        x86/io_apic: Reevaluate vector configuration on activate()
        iommu/amd: Reevaluate vector configuration on activate()
        iommu/vt-d: Reevaluate vector configuration on activate()
        x86/apic/msi: Force reactivation of interrupts at startup time
        x86/vector: Untangle internal state from irq_cfg
        x86/vector: Compile SMP only code conditionally
        x86/apic: Remove unused callbacks
        ...
      b18d6289
    • Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7d58e1c9
      Linus Torvalds authored
      Pull smp/hotplug updates from Thomas Gleixner:
       "No functional changes, just removal of obsolete and outdated defines,
        macros and documentation"
      
      * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpu/hotplug: Get rid of CPU hotplug notifier leftovers
        cpu/hotplug: Remove obsolete notifier macros
      7d58e1c9
    • Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2bcc6731
      Linus Torvalds authored
      Pull timer updates from Thomas Gleixner:
       "Yet another big pile of changes:
      
         - More year 2038 work from Arnd slowly reaching the point where we
           need to think about the syscalls themselves.
      
         - A new timer function which conditionally (re)arms a timer only
           when it's either not running or the new expiry time is sooner
           than the armed expiry time. This allows using a single timer for
           multiple timeout requirements without caring about the first
           expiry time at the call site.
      
         - A new NMI safe accessor to clock real time for the printk timestamp
           work. Can be used by tracing and perf as well if required.
      
         - A large number of timer setup conversions from Kees which got
           collected here because either maintainers requested so or they
           simply got ignored. As Kees pointed out already there are a few
           trivial merge conflicts and some redundant commits which were
           unavoidable due to the size of this conversion effort.
      
         - Avoid a redundant iteration in the timer wheel softirq processing.
      
         - Provide a mechanism to treat RTC implementations depending on their
           hardware properties, i.e. don't inflict the write at the 0.5
           seconds boundary which originates from the PC CMOS RTC to all RTCs.
           No functional change as drivers need to be updated separately.
      
         - The usual small updates to core code and clocksource drivers. Nothing
           really exciting"
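
The conditional (re)arm described in the list above can be sketched as a userspace model. This is an assumption-laden illustration, not the kernel implementation: the struct, the helper names and the return convention are all hypothetical, and time is a plain free-running tick counter.

```c
#include <stdbool.h>

/* Userspace model; struct and function names are hypothetical. */
struct reduce_timer {
	bool pending;          /* true while the timer is armed */
	unsigned long expires; /* absolute expiry time in ticks */
};

/* Wrap-safe "a is earlier than b" for a free-running tick counter. */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Arm the timer if it is idle, or pull the expiry forward if the new
 * one is sooner. A later expiry never pushes an armed timer back.
 * Returns true when the expiry was changed.
 */
static bool model_timer_reduce(struct reduce_timer *t, unsigned long expires)
{
	if (t->pending && !tick_before(expires, t->expires))
		return false;
	t->pending = true;
	t->expires = expires;
	return true;
}
```

With such a helper, several call sites with different timeouts can share one timer: each simply asks for its own deadline and the earliest one wins.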
      
      * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (111 commits)
        timers: Add a function to start/reduce a timer
        pstore: Use ktime_get_real_fast_ns() instead of __getnstimeofday()
        timer: Prepare to change all DEFINE_TIMER() callbacks
        netfilter: ipvs: Convert timers to use timer_setup()
        scsi: qla2xxx: Convert timers to use timer_setup()
        block/aoe: discover_timer: Convert timers to use timer_setup()
        ide: Convert timers to use timer_setup()
        drbd: Convert timers to use timer_setup()
        mailbox: Convert timers to use timer_setup()
        crypto: Convert timers to use timer_setup()
        drivers/pcmcia: omap1: Fix error in automated timer conversion
        ARM: footbridge: Fix typo in timer conversion
        drivers/sgi-xp: Convert timers to use timer_setup()
        drivers/pcmcia: Convert timers to use timer_setup()
        drivers/memstick: Convert timers to use timer_setup()
        drivers/macintosh: Convert timers to use timer_setup()
        hwrng/xgene-rng: Convert timers to use timer_setup()
        auxdisplay: Convert timers to use timer_setup()
        sparc/led: Convert timers to use timer_setup()
        mips: ip22/32: Convert timers to use timer_setup()
        ...
      2bcc6731
    • Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 670310df
      Linus Torvalds authored
      Pull irq core updates from Thomas Gleixner:
       "A rather large update for the interrupt core code and the irq chip drivers:
      
         - Add a new bitmap matrix allocator and supporting changes, which is
           used to replace the x86 vector allocator which comes with a separate
           pull request. This allows replacing the convoluted nested loop
           allocation function in x86 with a facility which properly supports
           the recently added property of managed interrupts and allows
           switching to a best effort vector reservation scheme, which
           addresses problems with vector exhaustion.
      
         - A large update to the ARM GIC-V3-ITS driver adding support for
           range selectors.
      
         - New interrupt controllers:
             - Meson and Meson8 GPIO
             - BCM7271 L2
             - Socionext EXIU
      
           If you expected that this will stop at some point, I have to
           disappoint you. There are new ones posted already. Sigh!
      
         - STM32 interrupt controller support for new platforms.
      
         - A pile of fixes, cleanups and updates to the MIPS GIC driver
      
         - The usual small fixes, cleanups and updates all over the place.
           The most visible one moves the irq chip drivers' Kconfig switches
           into a separate Kconfig menu"
      
      * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
        genirq: Fix type of shifting literal 1 in __setup_irq()
        irqdomain: Drop pointless NULL check in virq_debug_show_one
        genirq/proc: Return proper error code when irq_set_affinity() fails
        irq/work: Use llist_for_each_entry_safe
        irqchip: mips-gic: Print warning if inherited GIC base is used
        irqchip/mips-gic: Add pr_fmt and reword pr_* messages
        irqchip/stm32: Move the wakeup on interrupt mask
        irqchip/stm32: Fix initial values
        irqchip/stm32: Add stm32h7 support
        dt-bindings/interrupt-controllers: Add compatible string for stm32h7
        irqchip/stm32: Add multi-bank management
        irqchip/stm32: Select GENERIC_IRQ_CHIP
        irqchip/exiu: Add support for Socionext Synquacer EXIU controller
        dt-bindings: Add description of Socionext EXIU interrupt controller
        irqchip/gic-v3-its: Fix VPE activate callback return value
        irqchip: mips-gic: Make IPI bitmaps static
        irqchip: mips-gic: Share register writes in gic_set_type()
        irqchip: mips-gic: Remove gic_vpes variable
        irqchip: mips-gic: Use num_possible_cpus() to reserve IPIs
        irqchip: mips-gic: Configure EIC when CPUs come online
        ...
      670310df
    • Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 43ff2f4d
      Linus Torvalds authored
      Pull x86 platform updates from Ingo Molnar:
       "The main changes in this cycle were:
      
         - a refactoring of the early virt init code by merging 'struct
           x86_hyper' into 'struct x86_platform' and 'struct x86_init', which
           allows simplifications and also the addition of a new
           ->guest_late_init() callback. (Juergen Gross)
      
         - timer_setup() conversion of the UV code (Kees Cook)"
      
      * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/virt/xen: Use guest_late_init to detect Xen PVH guest
        x86/virt, x86/platform: Add ->guest_late_init() callback to hypervisor_x86 structure
        x86/virt, x86/acpi: Add test for ACPI_FADT_NO_VGA
        x86/virt: Add enum for hypervisors to replace x86_hyper
        x86/virt, x86/platform: Merge 'struct x86_hyper' into 'struct x86_platform' and 'struct x86_init'
        x86/platform/UV: Convert timers to use timer_setup()
      43ff2f4d
    • Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 13e57da4
      Linus Torvalds authored
      Pull x86 debug update from Ingo Molnar:
       "A single change enhancing stack traces by hiding wrapper function
        entries"
      
      * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/stacktrace: Avoid recording save_stack_trace() wrappers
      13e57da4
    • Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · eb4d47c8
      Linus Torvalds authored
      Pull x86 cleanups from Ingo Molnar:
       "Two changes: Propagate const/__initconst, and use ARRAY_SIZE() some
        more"
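
For readers unfamiliar with the idiom, a simplified userspace form of ARRAY_SIZE() is sketched below. The kernel's macro additionally rejects non-array arguments at build time; that check is left out here, and the sample table and function are hypothetical.

```c
#include <stddef.h>

/* Simplified form; the kernel's macro also rejects non-array arguments. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

static const int sample_levels[] = { 10, 20, 30, 40 }; /* hypothetical table */

/* Sum the table without hard-coding its length. */
static int sum_levels(void)
{
	int sum = 0;

	for (size_t i = 0; i < ARRAY_SIZE(sample_levels); i++)
		sum += sample_levels[i];
	return sum;
}
```

The loop bound follows the table automatically when entries are added or removed, which is the point of replacing open-coded sizeof divisions.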
      
      * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/events/amd/iommu: Make iommu_pmu const and __initconst
        x86: Use ARRAY_SIZE
      eb4d47c8
    • Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6a9f70b0
      Linus Torvalds authored
      Pull x86 boot updates from Ingo Molnar:
       "Three smaller changes:
      
         - clang fix
      
         - boot message beautification
      
         - unnecessary header inclusion removal"
      
      * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot: Disable Clang warnings about GNU extensions
        x86/boot: Remove unnecessary #include <generated/utsrelease.h>
        x86/boot: Spell out "boot CPU" for BP
      6a9f70b0
  3. 13 Nov, 2017 14 commits
    • Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d6ec9d9a
      Linus Torvalds authored
      Pull x86 core updates from Ingo Molnar:
       "Note that in this cycle most of the x86 topics interacted at a level
        that caused them to be merged into tip:x86/asm - but this should be a
        temporary phenomenon, hopefully we'll be back to the usual patterns in
        the next merge window.
      
        The main changes in this cycle were:
      
        Hardware enablement:
      
         - Add support for the Intel UMIP (User Mode Instruction Prevention)
           CPU feature. This is a security feature that disables certain
           instructions such as SGDT, SLDT, SIDT, SMSW and STR. (Ricardo Neri)
      
           [ Note that this is disabled by default for now, there are some
             smaller enhancements in the pipeline that I'll follow up with in
             the next 1-2 days, which will allow this to be enabled by default. ]
      
         - Add support for the AMD SEV (Secure Encrypted Virtualization) CPU
           feature, on top of SME (Secure Memory Encryption) support that was
           added in v4.14. (Tom Lendacky, Brijesh Singh)
      
         - Enable new SSE/AVX/AVX512 CPU features: AVX512_VBMI2, GFNI, VAES,
           VPCLMULQDQ, AVX512_VNNI, AVX512_BITALG. (Gayatri Kammela)
      
        Other changes:
      
         - A big series of entry code simplifications and enhancements (Andy
           Lutomirski)
      
         - Make the ORC unwinder default on x86 and various objtool
           enhancements. (Josh Poimboeuf)
      
         - 5-level paging enhancements (Kirill A. Shutemov)
      
         - Micro-optimize the entry code a bit (Borislav Petkov)
      
         - Improve the handling of interdependent CPU features in the early
           FPU init code (Andi Kleen)
      
         - Build system enhancements (Changbin Du, Masahiro Yamada)
      
         - ... plus misc enhancements, fixes and cleanups"
      
      * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (118 commits)
        x86/build: Make the boot image generation less verbose
        selftests/x86: Add tests for the STR and SLDT instructions
        selftests/x86: Add tests for User-Mode Instruction Prevention
        x86/traps: Fix up general protection faults caused by UMIP
        x86/umip: Enable User-Mode Instruction Prevention at runtime
        x86/umip: Force a page fault when unable to copy emulated result to user
        x86/umip: Add emulation code for UMIP instructions
        x86/cpufeature: Add User-Mode Instruction Prevention definitions
        x86/insn-eval: Add support to resolve 16-bit address encodings
        x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
        x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
        x86/insn-eval: Add support to resolve 32-bit address encodings
        x86/insn-eval: Compute linear address in several utility functions
        resource: Fix resource_size.cocci warnings
        X86/KVM: Clear encryption attribute when SEV is active
        X86/KVM: Decrypt shared per-cpu variables when SEV is active
        percpu: Introduce DEFINE_PER_CPU_DECRYPTED
        x86: Add support for changing memory encryption attribute in early boot
        x86/io: Unroll string I/O when SEV is active
        x86/boot: Add early boot support when running with SEV active
        ...
      d6ec9d9a
    • Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3e201463
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
       "The main updates in this cycle were:
      
         - Group balancing enhancements and cleanups (Brendan Jackman)
      
         - Move CPU isolation related functionality into its own
           kernel/sched/isolation.c file, with related 'housekeeping_*()'
           namespace and nomenclature et al. (Frederic Weisbecker)
      
         - Improve the interactive/cpu-intense fairness calculation (Josef
           Bacik)
      
         - Improve the PELT code and related cleanups (Peter Zijlstra)
      
         - Improve the logic of pick_next_task_fair() (Uladzislau Rezki)
      
         - Improve the RT IPI based balancing logic (Steven Rostedt)
      
         - Various micro-optimizations:

              - better !CONFIG_SCHED_DEBUG optimizations (Patrick Bellasi)
              - better idle loop (Cheng Jian)
      
         - ... plus misc fixes, cleanups and updates"
      
      * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
        sched/core: Optimize sched_feat() for !CONFIG_SCHED_DEBUG builds
        sched/sysctl: Fix attributes of some extern declarations
        sched/isolation: Document isolcpus= boot parameter flags, mark it deprecated
        sched/isolation: Add basic isolcpus flags
        sched/isolation: Move isolcpus= handling to the housekeeping code
        sched/isolation: Handle the nohz_full= parameter
        sched/isolation: Introduce housekeeping flags
        sched/isolation: Split out new CONFIG_CPU_ISOLATION=y config from CONFIG_NO_HZ_FULL
        sched/isolation: Rename is_housekeeping_cpu() to housekeeping_cpu()
        sched/isolation: Use its own static key
        sched/isolation: Make the housekeeping cpumask private
        sched/isolation: Provide a dynamic off-case to housekeeping_any_cpu()
        sched/isolation, watchdog: Use housekeeping_cpumask() instead of ad-hoc version
        sched/isolation: Move housekeeping related code to its own file
        sched/idle: Micro-optimize the idle loop
        sched/isolcpus: Fix "isolcpus=" boot parameter handling when !CONFIG_CPUMASK_OFFSTACK
        x86/tsc: Append the 'tsc=' description for the 'tsc=unstable' boot parameter
        sched/rt: Simplify the IPI based RT balancing logic
        block/ioprio: Use a helper to check for RT prio
        sched/rt: Add a helper to test for a RT task
        ...
      3e201463
    • Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f2be8bd5
      Linus Torvalds authored
      Pull RAS updates from Ingo Molnar:
       "Two minor updates to AMD SMCA support, plus a timer_setup() conversion"
      
      * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/MCE/AMD: Fix mce_severity_amd_smca() signature
        x86/MCE/AMD: Always give panic severity for UC errors in kernel context
        x86/mce: Convert timers to use timer_setup()
      f2be8bd5
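
The timer_setup() conversions mentioned in this and other pulls change the callback so that it receives the timer itself and recovers its enclosing structure from it, instead of being handed a separate data argument. A userspace sketch of that pattern follows. All names in it (cb_timer, cb_dev, model_container_of(), model_timer_setup()) are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stddef.h>

/* Userspace model of the callback pattern; all names are hypothetical. */
struct cb_timer {
	void (*function)(struct cb_timer *t); /* callback receives the timer */
};

struct cb_dev {
	int fired;             /* number of times the callback ran */
	struct cb_timer timer; /* embedded timer */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void cb_dev_timeout(struct cb_timer *t)
{
	struct cb_dev *dev = model_container_of(t, struct cb_dev, timer);

	dev->fired++;
}

/* Bind the callback once at setup time; no separate data cookie is stored. */
static void model_timer_setup(struct cb_timer *t,
			      void (*fn)(struct cb_timer *))
{
	t->function = fn;
}
```

Because the callback's argument type is the timer itself, the compiler can check the callback prototype, which an untyped data cookie does not allow.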
    • Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 31486372
      Linus Torvalds authored
      Pull perf updates from Ingo Molnar:
       "The main changes in this cycle were:
      
        Kernel:
      
         - kprobes updates: use better W^X patterns for code modifications,
           improve optprobes, remove jprobes. (Masami Hiramatsu, Kees Cook)
      
         - core fixes: event timekeeping (enabled/running times statistics)
           fixes, perf_event_read() locking fixes and cleanups, etc. (Peter
           Zijlstra)
      
         - Extend x86 Intel free-running PEBS support and support x86
           user-register sampling in perf record and perf script. (Andi Kleen)
      
        Tooling:
      
         - Completely rework the way inline frames are handled. Instead of
           querying for the inline nodes on-demand in the individual tools, we
           now create proper callchain nodes for inlined frames. (Milian
           Wolff)
      
         - 'perf trace' updates (Arnaldo Carvalho de Melo)
      
         - Implement a way to print formatted output to per-event files in
           'perf script' to facilitate generating flamegraphs, eliminating
           the need to write scripts to do that separation (yuzhoujian,
           Arnaldo Carvalho de Melo)
      
         - Update vendor events JSON metrics for Intel's Broadwell, Broadwell
           Server, Haswell, Haswell Server, IvyBridge, IvyTown, JakeTown,
           Sandy Bridge, Skylake, SkyLake Server - and Goldmont Plus V1 (Andi
           Kleen, Kan Liang)
      
         - Multithread the synthesizing of PERF_RECORD_ events for
           pre-existing threads in 'perf top', speeding up that phase, greatly
           improving the user experience in systems such as Intel's Knights
           Mill (Kan Liang)
      
         - Introduce the concept of weak groups in 'perf stat': try to set up
           a group, but if it's not schedulable fall back to not using a group.
           That gives us the best of both worlds: groups if they work, but
           still a usable fallback if they don't. (Andi Kleen)
      
         - perf sched timehist enhancements (David Ahern)
      
         - ... various other enhancements, updates, cleanups and fixes"
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (139 commits)
        kprobes: Don't spam the build log with deprecation warnings
        arm/kprobes: Remove jprobe test case
        arm/kprobes: Fix kretprobe test to check correct counter
        perf srcline: Show correct function name for srcline of callchains
        perf srcline: Fix memory leak in addr2inlines()
        perf trace beauty kcmp: Beautify arguments
        perf trace beauty: Implement pid_fd beautifier
        tools include uapi: Grab a copy of linux/kcmp.h
        perf callchain: Fix double mapping al->addr for children without self period
        perf stat: Make --per-thread update shadow stats to show metrics
        perf stat: Move the shadow stats scale computation in perf_stat__update_shadow_stats
        perf tools: Add perf_data_file__write function
        perf tools: Add struct perf_data_file
        perf tools: Rename struct perf_data_file to perf_data
        perf script: Print information about per-event-dump files
        perf trace beauty prctl: Generate 'option' string table from kernel headers
        tools include uapi: Grab a copy of linux/prctl.h
        perf script: Allow creating per-event dump files
        perf evsel: Restore evsel->priv as a tool private area
        perf script: Use event_format__fprintf()
        ...
      31486372
    • Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8e9a2dba
      Linus Torvalds authored
      Pull core locking updates from Ingo Molnar:
       "The main changes in this cycle are:
      
         - Another attempt at enabling cross-release lockdep dependency
           tracking (automatically part of CONFIG_PROVE_LOCKING=y), this time
           with better performance and fewer false positives. (Byungchul Park)
      
         - Introduce lockdep_assert_irqs_enabled()/disabled() and convert
           open-coded equivalents to lockdep variants. (Frederic Weisbecker)
      
         - Add down_read_killable() and use it in the VFS's iterate_dir()
           method. (Kirill Tkhai)
      
         - Convert remaining uses of ACCESS_ONCE() to
           READ_ONCE()/WRITE_ONCE(). Most of the conversion was Coccinelle
           driven. (Mark Rutland, Paul E. McKenney)
      
         - Get rid of lockless_dereference(), by strengthening Alpha atomics,
           strengthening READ_ONCE() with smp_read_barrier_depends() and thus
           being able to convert users of lockless_dereference() to
           READ_ONCE(). (Will Deacon)
      
         - Various micro-optimizations:
      
              - better PV qspinlocks (Waiman Long),
              - better x86 barriers (Michael S. Tsirkin)
              - better x86 refcounts (Kees Cook)
      
         - ... plus other fixes and enhancements. (Borislav Petkov, Juergen
           Gross, Miguel Bernal Marin)"
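
The ACCESS_ONCE() conversion above splits one macro into a load form and a store form. A simplified userspace sketch of the two is shown below. It assumes GNU C (__typeof__) and only covers scalar variables; the kernel's real definitions are more involved, and the variable and helper names are hypothetical.

```c
/* Simplified userspace stand-ins (GNU C); the kernel's versions do more. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

static int shared_flag; /* hypothetical variable shared with another context */

/* Store side: the compiler must emit exactly this store. */
static void publish_flag(int v)
{
	WRITE_ONCE(shared_flag, v);
}

/* Load side: the compiler must not merge or repeat this load. */
static int poll_flag(void)
{
	return READ_ONCE(shared_flag);
}
```

In this simplified form the macros only constrain the compiler. They do not by themselves order the access against other memory accesses.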
      
      * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
        locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE
        rcu: Use lockdep to assert IRQs are disabled/enabled
        netpoll: Use lockdep to assert IRQs are disabled/enabled
        timers/posix-cpu-timers: Use lockdep to assert IRQs are disabled/enabled
        sched/clock, sched/cputime: Use lockdep to assert IRQs are disabled/enabled
        irq_work: Use lockdep to assert IRQs are disabled/enabled
        irq/timings: Use lockdep to assert IRQs are disabled/enabled
        perf/core: Use lockdep to assert IRQs are disabled/enabled
        x86: Use lockdep to assert IRQs are disabled/enabled
        smp/core: Use lockdep to assert IRQs are disabled/enabled
        timers/hrtimer: Use lockdep to assert IRQs are disabled/enabled
        timers/nohz: Use lockdep to assert IRQs are disabled/enabled
        workqueue: Use lockdep to assert IRQs are disabled/enabled
        irq/softirqs: Use lockdep to assert IRQs are disabled/enabled
        locking/lockdep: Add IRQs disabled/enabled assertion APIs: lockdep_assert_irqs_enabled()/disabled()
        locking/pvqspinlock: Implement hybrid PV queued/unfair locks
        locking/rwlocks: Fix comments
        x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized
        block, locking/lockdep: Assign a lock_class per gendisk used for wait_for_completion()
        workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes
        ...
      8e9a2dba
    • Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6098850e
      Linus Torvalds authored
      Pull RCU updates from Ingo Molnar:
       "The main changes in this cycle are:
      
         - Documentation updates
      
         - RCU CPU stall-warning updates
      
         - Torture-test updates
      
         - Miscellaneous fixes
      
        Size wise the biggest updates are to documentation. Excluding
        documentation most of the code increase comes from a single commit
        which expands debugging"
      
      * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
        srcu: Add parameters to SRCU docbook comments
        doc: Rewrite confusing statement about memory barriers
        memory-barriers.txt: Fix typo in pairing example
        rcu/segcblist: Include rcupdate.h
        rcu: Add extended-quiescent-state testing advice
        rcu: Suppress lockdep false-positive ->boost_mtx complaints
        rcu: Do not include rtmutex_common.h unconditionally
        torture: Provide TMPDIR environment variable to specify tmpdir
        rcutorture: Dump writer stack if stalled
        rcutorture: Add interrupt-disable capability to stall-warning tests
        rcu: Suppress RCU CPU stall warnings while dumping trace
        rcu: Turn off tracing before dumping trace
        rcu: Make RCU CPU stall warnings check for irq-disabled CPUs
        sched,rcu: Make cond_resched() provide RCU quiescent state
        sched: Make resched_cpu() unconditional
        irq_work: Map irq_work_on_queue() to irq_work_on() in !SMP
        rcu: Create call_rcu_tasks() kthread at boot time
        rcu: Fix up pending cbs check in rcu_prepare_for_idle
        memory-barriers: Rework multicopy-atomicity section
        memory-barriers: Replace uses of "transitive"
        ...
      6098850e
    • Merge tag 'please-pull-gettime_vsyscall_update' of... · f08d8bcc
      Linus Torvalds authored
      Merge tag 'please-pull-gettime_vsyscall_update' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
      
      Pull ia64 update from Tony Luck:
       "Stop ia64 being the last holdout using GENERIC_TIME_VSYSCALL_OLD so
        that John Stultz can drop that code"
      
      * tag 'please-pull-gettime_vsyscall_update' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
        ia64: Update fsyscall gettime to use modern vsyscall_update
      f08d8bcc
    • Merge tag 'for-linus' of git://github.com/openrisc/linux · f3573b8f
      Linus Torvalds authored
      Pull OpenRISC updates from Stafford Horne:
       "The OpenRISC work is a bit more interesting this time, adding SMP
        support and a few general cleanups.
      
        Small Things:
      
         - Move OpenRISC docs into Documentation and clean them up
      
         - Document previously undocumented devicetree bindings
      
         - Update the or1ksim dts to use stdout-path
      
        OpenRISC SMP support details:
      
         - First the "use shadow registers" and "define CPU_BIG_ENDIAN as
           true" get the architecture ready for SMP.
      
         - The "add 1 and 2 byte cmpxchg support" and "use qspinlocks and
           qrwlocks" add the SMP locking infrastructure as needed, using the
           qspinlocks and qrwlocks as suggested by Peter Z while reviewing the
           original spinlocks implementation.
      
         - The "support for ompic" adds a new irqchip device which is used for
           IPI communication to support SMP.
      
         - The "initial SMP support" adds smp.c and makes changes to all of
           the necessary data-structures to be per-cpu.
      
        The remaining patches are bug fixes and debug helpers which I wanted
        to keep separate from the "initial SMP support" in order to allow them
        to be reviewed on their own. This includes:
      
         - add cacheflush support to fix icache aliasing
      
         - fix initial preempt state for secondary cpu tasks
      
         - sleep instead of spin on secondary wait
      
         - support framepointers and STACKTRACE_SUPPORT
      
         - enable LOCKDEP_SUPPORT and irqflags tracing
      
         - timer sync: Add tick timer sync logic
      
         - fix possible deadlock in timer sync, pointed out by mips guys
      
        Note: the irqchip patch was reviewed with Marc and we agreed to push
        it together with these patches"
      
      * tag 'for-linus' of git://github.com/openrisc/linux:
        openrisc: fix possible deadlock scenario during timer sync
        openrisc: pass endianness info to sparse
        openrisc: add tick timer multi-core sync logic
        openrisc: enable LOCKDEP_SUPPORT and irqflags tracing
        openrisc: support framepointers and STACKTRACE_SUPPORT
        openrisc: add simple_smp dts and defconfig for simulators
        openrisc: add cacheflush support to fix icache aliasing
        openrisc: sleep instead of spin on secondary wait
        openrisc: fix initial preempt state for secondary cpu tasks
        openrisc: initial SMP support
        irqchip: add initial support for ompic
        dt-bindings: add openrisc to vendor prefixes list
        openrisc: use qspinlocks and qrwlocks
        openrisc: add 1 and 2 byte cmpxchg support
        openrisc: use shadow registers to save regs on exception
        dt-bindings: openrisc: Add OpenRISC platform SoC
        Documentation: openrisc: Updates to README
        Documentation: Move OpenRISC docs out of arch/
        MAINTAINERS: Add OpenRISC pic maintainer
        openrisc: dts: or1ksim: Add stdout-path
      f3573b8f
    • Merge tag 'm68k-for-v4.15-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · 9e09d05c
      Linus Torvalds authored
      Pull m68k updates from Geert Uytterhoeven:
      
        - more printk modernization
      
        - various cleanups and fixes (incl. a race condition) for Mac
      
        - defconfig updates
      
      * tag 'm68k-for-v4.15-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k/defconfig: Update defconfigs for v4.14-rc7
        m68k/mac: Add mutual exclusion for IOP interrupt polling
        m68k/mac: Disentangle VIA/RBV and NuBus initialization
        m68k/mac: Disentangle VIA and OSS initialization
        m68k/mac: More printk modernization
      9e09d05c
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · d60a540a
      Linus Torvalds authored
      Pull s390 updates from Heiko Carstens:
       "Since Martin is on vacation you get the s390 pull request for the
        v4.15 merge window this time from me.
      
        Besides a lot of cleanups and bug fixes these are the most important
        changes:
      
         - a new regset for runtime instrumentation registers
      
         - hardware accelerated AES-GCM support for the aes_s390 module
      
         - support for the new CEX6S crypto cards
      
         - support for FORTIFY_SOURCE
      
         - addition of missing z13 and new z14 instructions to the in-kernel
           disassembler
      
         - generate opcode tables for the in-kernel disassembler out of a
           simple text file instead of having to manually maintain those
           tables
      
         - fast memset16, memset32 and memset64 implementations
      
         - removal of named saved segment support
      
         - hardware counter support for z14
      
         - queued spinlocks and queued rwlocks implementations for s390
      
         - use the stack_depth tracking feature for s390 BPF JIT
      
         - a new s390_sthyi system call which emulates the sthyi (store
           hypervisor information) instruction
      
         - removal of the old KVM virtio transport
      
         - an s390 specific CPU alternatives implementation which is used in
           the new spinlock code"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (88 commits)
        MAINTAINERS: add virtio-ccw.h to virtio/s390 section
        s390/noexec: execute kexec datamover without DAT
        s390: fix transactional execution control register handling
        s390/bpf: take advantage of stack_depth tracking
        s390: simplify transactional execution elf hwcap handling
        s390/zcrypt: Rework struct ap_qact_ap_info.
        s390/virtio: remove unused header file kvm_virtio.h
        s390: avoid undefined behaviour
        s390/disassembler: generate opcode tables from text file
        s390/disassembler: remove insn_to_mnemonic()
        s390/dasd: avoid calling do_gettimeofday()
        s390: vfio-ccw: Do not attempt to free no-op, test and tic cda.
        s390: remove named saved segment support
        s390/archrandom: Reconsider s390 arch random implementation
        s390/pci: do not require AIS facility
        s390/qdio: sanitize put_indicator
        s390/qdio: use atomic_cmpxchg
        s390/nmi: avoid using long-displacement facility
        s390: pass endianness info to sparse
        s390/decompressor: remove informational messages
        ...
    • Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu · 2101dd64
      Linus Torvalds authored
      Pull m68k updates from Greg Ungerer:
       "The bulk of the changes are to support the ColdFire 5441x SoC family
        with their MMU enabled. The parts have been supported for a long time
        now, but only in no-MMU mode.
      
        Angelo Dureghello has a new board with a 5441x and we have ironed out
        the last problems with MMU enabled on it. So there are also some
        changes to properly support that board too.
      
        Also a fix for a link problem when selecting the traditional 68k beep
        device in no-MMU configurations"
      
      * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
        m68k: add Sysam stmark2 open board support
        m68k: coldfire: add dspi0 module support
        m68k: pull mach_beep in setup.c
        m68k: allow ColdFire m5441x parts to run with MMU enabled
        m68k: fix ColdFire node shift size calculation
        m68k: move coldfire MMU initialization code
    • Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · b33e3cc5
      Linus Torvalds authored
      Pull security subsystem integrity updates from James Morris:
       "There is a mixture of bug fixes, code cleanup, preparatory code for
        new functionality and new functionality.
      
        Commit 26ddabfe ("evm: enable EVM when X509 certificate is
        loaded") enabled EVM without loading a symmetric key, but was limited
        to defining the x509 certificate pathname at build. Included in this
        set of patches is the ability to enable EVM, without loading the EVM
        symmetric key, from userspace. New is the ability to prevent the
        loading of an EVM symmetric key."
      
      * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        ima: Remove redundant conditional operator
        ima: Fix bool initialization/comparison
        ima: check signature enforcement against cmdline param instead of CONFIG
        module: export module signature enforcement status
        ima: fix hash algorithm initialization
        EVM: Only complain about a missing HMAC key once
        EVM: Allow userspace to signal an RSA key has been loaded
        EVM: Include security.apparmor in EVM measurements
        ima: call ima_file_free() prior to calling fasync
        integrity: use kernel_read_file_from_path() to read x509 certs
        ima: always measure and audit files in policy
        ima: don't remove the securityfs policy file
        vfs: fix mounting a filesystem with i_version
    • Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 55b3a0cb
      Linus Torvalds authored
      Pull general security subsystem updates from James Morris:
       "TPM (from Jarkko):
         - essential clean up for tpm_crb so that ARM64 and x86 versions do
           not distract each other as much as before
      
         - /dev/tpm0 now rejects too-short writes (a shorter buffer than
           specified in the command header)
      
         - use DMA-safe buffer in tpm_tis_spi
      
         - otherwise mostly minor fixes.
      
        Smack:
         - base support for overlayfs
      
        Capabilities:
         - BPRM_FCAPS fixes, from Richard Guy Briggs:
      
           The audit subsystem is adding a BPRM_FCAPS record when auditing
           setuid application execution (SYSCALL execve). This is not expected
           as it was supposed to be limited to when the file system actually
           had capabilities in an extended attribute. It lists all
           capabilities, which makes it really ugly to parse what is
           happening in the event. The PATH record correctly records the
           setuid bit and owner. Suppress the BPRM_FCAPS record on set*id.
      
        TOMOYO:
         - Y2038 timestamping fixes"
      
      * 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (28 commits)
        MAINTAINERS: update the IMA, EVM, trusted-keys, encrypted-keys entries
        Smack: Base support for overlayfs
        MAINTAINERS: remove David Safford as maintainer for encrypted+trusted keys
        tomoyo: fix timestamping for y2038
        capabilities: audit log other surprising conditions
        capabilities: fix logic for effective root or real root
        capabilities: invert logic for clarity
        capabilities: remove a layer of conditional logic
        capabilities: move audit log decision to function
        capabilities: use intuitive names for id changes
        capabilities: use root_priveleged inline to clarify logic
        capabilities: rename has_cap to has_fcap
        capabilities: intuitive names for cap gain status
        capabilities: factor out cap_bprm_set_creds privileged root
        tpm, tpm_tis: use ARRAY_SIZE() to define TPM_HID_USR_IDX
        tpm: fix duplicate inline declaration specifier
        tpm: fix type of a local variables in tpm_tis_spi.c
        tpm: fix type of a local variable in tpm2_map_command()
        tpm: fix type of a local variable in tpm2_get_cc_attrs_tbl()
        tpm-dev-common: Reject too short writes
        ...
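
The TPM item above (rejecting writes shorter than the length given in the command header) can be illustrated with a small user-space sketch. This is not the kernel's code: the function names are hypothetical, and it only assumes the standard 10-byte TPM command header (2-byte tag, 4-byte big-endian length, 4-byte command code).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 2-byte tag + 4-byte length + 4-byte command code. */
#define TPM_HEADER_SIZE 10

/* Hypothetical helper: read the big-endian length field after the tag. */
static uint32_t tpm_cmd_declared_len(const uint8_t *buf)
{
    return ((uint32_t)buf[2] << 24) | ((uint32_t)buf[3] << 16) |
           ((uint32_t)buf[4] << 8) | (uint32_t)buf[5];
}

/*
 * Hypothetical check: return 1 if a write of `written` bytes is long
 * enough, 0 if it is shorter than the header or than the length the
 * header itself declares.
 */
static int tpm_write_len_ok(const uint8_t *buf, size_t written)
{
    if (written < TPM_HEADER_SIZE)
        return 0; /* cannot even read the length field safely */
    return written >= tpm_cmd_declared_len(buf);
}
```

The header-size test comes first so that the length field is never read from a buffer too small to contain it.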
    • Merge tag 'mmc-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · dee02770
      Linus Torvalds authored
      Pull MMC updates from Ulf Hansson:
       "MMC core:
         - Introduce host claiming by context to support blkmq
         - Preparations for enabling CQE (eMMC CMDQ) requests
         - Re-factorizations to prepare for blkmq support
         - Re-factorizations to prepare for CQE support
         - Fix signal voltage switch for SD cards without power cycle
         - Convert RPMB to a character device
         - Export eMMC revision via sysfs
         - Support eMMC DT binding for fixed driver type
         - Document mmc_regulator_get_supply() API
      
        MMC host:
         - omap_hsmmc: Updated regulator management for PBIAS
         - sdhci-omap: Add new OMAP SDHCI driver
         - meson-mx-sdio: New driver for the Amlogic Meson8 and Meson8b SoCs
         - sdhci-pci: Add support for Intel CDF
         - sdhci-acpi: Fix voltage switch for some Intel host controllers
         - sdhci-msm: Enable delay circuit calibration clocks
         - sdhci-msm: Manage power IRQ properly
         - mediatek: Add support of mt2701/mt2712
         - mediatek: Updates management of clocks and tunings
         - mediatek: Upgrade eMMC HS400 support
         - rtsx_pci: Update tuning for gen3 PCI-Express
         - renesas_sdhi: Support R-Car Gen[123] fallback compatibility strings
         - Catch all errors when getting regulators
         - Various additional improvements and cleanups"
      
      * tag 'mmc-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (91 commits)
        sdhci-fujitsu: add support for setting the CMD_DAT_DELAY attribute
        dt-bindings: sdhci-fujitsu: document cmd-dat-delay property
        mmc: tmio: Replace msleep() of 20ms or less with usleep_range()
        mmc: dw_mmc: Convert timers to use timer_setup()
        mmc: dw_mmc: Cleanup the DTO timer like the CTO one
        mmc: vub300: Use common code in __download_offload_pseudocode()
        mmc: tmio: Use common error handling code in tmio_mmc_host_probe()
        mmc: Convert timers to use timer_setup()
        mmc: sdhci-acpi: Fix voltage switch for some Intel host controllers
        mmc: sdhci-acpi: Let devices define their own private data
        mmc: mediatek: perfer to use rise edge latching for cmd line
        mmc: mediatek: improve eMMC hs400 mode read performance
        mmc: mediatek: add latch-ck support
        mmc: mediatek: add support of source_cg clock
        mmc: mediatek: add stop_clk fix and enhance_rx support
        mmc: mediatek: add busy_check support
        mmc: mediatek: add async fifo and data tune support
        mmc: mediatek: add pad_tune0 support
        mmc: mediatek: make hs400_tune_response only for mt8173
        arm64: dts: mt8173: remove "mediatek, mt8135-mmc" from mmc nodes
        ...