1. 05 May, 2019 1 commit
    • Jiri Olsa's avatar
      perf/x86/intel: Fix race in intel_pmu_disable_event() · 6f55967a
      Jiri Olsa authored
      New race in x86_pmu_stop() was introduced by replacing the
      atomic __test_and_clear_bit() of cpuc->active_mask by separate
      test_bit() and __clear_bit() calls in the following commit:
      
        3966c3fe ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
      
      The race causes panic for PEBS events with enabled callchains:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
        ...
        RIP: 0010:perf_prepare_sample+0x8c/0x530
        Call Trace:
         <NMI>
         perf_event_output_forward+0x2a/0x80
         __perf_event_overflow+0x51/0xe0
         handle_pmi_common+0x19e/0x240
         intel_pmu_handle_irq+0xad/0x170
         perf_event_nmi_handler+0x2e/0x50
         nmi_handle+0x69/0x110
         default_do_nmi+0x3e/0x100
         do_nmi+0x11a/0x180
         end_repeat_nmi+0x16/0x1a
        RIP: 0010:native_write_msr+0x6/0x20
        ...
         </NMI>
         intel_pmu_disable_event+0x98/0xf0
         x86_pmu_stop+0x6e/0xb0
         x86_pmu_del+0x46/0x140
         event_sched_out.isra.97+0x7e/0x160
        ...
      
      The event is configured to make samples from PEBS drain code,
      but when it's disabled, we'll go through NMI path instead,
      where data->callchain will not get allocated and we'll crash:
      
                x86_pmu_stop
                  test_bit(hwc->idx, cpuc->active_mask)
                  intel_pmu_disable_event(event)
                  {
                    ...
                    intel_pmu_pebs_disable(event);
                    ...
      
      EVENT OVERFLOW ->  <NMI>
                           intel_pmu_handle_irq
                             handle_pmi_common
         TEST PASSES ->        test_bit(bit, cpuc->active_mask))
                                 perf_event_overflow
                                   perf_prepare_sample
                                   {
                                     ...
                                     if (!(sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY))
                                           data->callchain = perf_callchain(event, regs);
      
               CRASH ->              size += data->callchain->nr;
                                   }
                         </NMI>
                    ...
                    x86_pmu_disable_event(event)
                  }
      
                  __clear_bit(hwc->idx, cpuc->active_mask);
      
      Fixing this by disabling the event itself before setting
      off the PEBS bit.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Arcari <darcari@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Lendacky Thomas <Thomas.Lendacky@amd.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: 3966c3fe ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
      Link: http://lkml.kernel.org/r/20190504151556.31031-1-jolsa@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      6f55967a
  2. 03 May, 2019 3 commits
  3. 02 May, 2019 13 commits
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Remove needless asm/unistd.h include fixing build in some places · 7e221b81
      Arnaldo Carvalho de Melo authored
      We were including sys/syscall.h and asm/unistd.h, since sys/syscall.h
      includes asm/unistd.h, sometimes this leads to the redefinition of
      defines, breaking the build.
      
      Noticed on ARC with uCLibc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
      Link: https://lkml.kernel.org/n/tip-xjpf80o64i2ko74aj2jih0qg@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7e221b81
    • Arnaldo Carvalho de Melo's avatar
      tools arch uapi: Copy missing unistd.h headers for arc, hexagon and riscv · 18f90d37
      Arnaldo Carvalho de Melo authored
      Since those were introduced in:
      
        c8ce48f0 ("asm-generic: Make time32 syscall numbers optional")
      
      But when the asm-generic/unistd.h was sync'ed with tools/ in:
      
        1a787fc5 ("tools headers uapi: Sync copy of asm-generic/unistd.h with the kernel sources")
      
      I forgot to copy the files for the architectures that define
      __ARCH_WANT_TIME32_SYSCALLS, so the perf build was breaking there, as
      reported by Vineet Gupta for the ARC architecture.
      
      After updating my ARC container to use the glibc based toolchain + cross
      building libnuma, zlib and elfutils, I finally managed to reproduce the
      problem and verify that this now is fixed and will not regress as will
      be tested before each pull req sent upstream.
      Reported-by: default avatarVineet Gupta <Vineet.Gupta1@synopsys.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      CC: linux-snps-arc@lists.infradead.org
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/r/20190426193531.GC28586@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      18f90d37
    • Arnaldo Carvalho de Melo's avatar
      tools build: Add -ldl to the disassembler-four-args feature test · c638417e
      Arnaldo Carvalho de Melo authored
      Thomas Backlund reported that the perf build was failing on the Mageia 7
      distro, that is because it uses:
      
        cat /tmp/build/perf/feature/test-disassembler-four-args.make.output
        /usr/bin/ld: /usr/lib64/libbfd.a(plugin.o): in function `try_load_plugin':
        /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:243:
        undefined reference to `dlopen'
        /usr/bin/ld:
        /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:271:
        undefined reference to `dlsym'
        /usr/bin/ld:
        /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:256:
        undefined reference to `dlclose'
        /usr/bin/ld:
        /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:246:
        undefined reference to `dlerror'
        as we allow dynamic linking and loading
      
      Mageia 7 uses these linker flags:
        $ rpm --eval %ldflags
          -Wl,--as-needed -Wl,--no-undefined -Wl,-z,relro -Wl,-O1 -Wl,--build-id -Wl,--enable-new-dtags
      
      So add -ldl to this feature LDFLAGS.
      Reported-by: default avatarThomas Backlund <tmb@mageia.org>
      Tested-by: default avatarThomas Backlund <tmb@mageia.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Link: https://lkml.kernel.org/r/20190501173158.GC21436@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c638417e
    • Leo Yan's avatar
      perf cs-etm: Always allocate memory for cs_etm_queue::prev_packet · 35bb59c1
      Leo Yan authored
      Robert Walker reported a segmentation fault is observed when process
      CoreSight trace data; this issue can be easily reproduced by the command
      'perf report --itrace=i1000i' for decoding tracing data.
      
      If neither the 'b' flag (synthesize branches events) nor 'l' flag
      (synthesize last branch entries) are specified to option '--itrace',
      cs_etm_queue::prev_packet will not been initialised.  After merging the
      code to support exception packets and sample flags, there introduced a
      number of uses of cs_etm_queue::prev_packet without checking whether it
      is valid, for these cases any accessing to uninitialised prev_packet
      will cause crash.
      
      As cs_etm_queue::prev_packet is used more widely now and it's already
      hard to follow which functions have been called in a context where the
      validity of cs_etm_queue::prev_packet has been checked, this patch
      always allocates memory for cs_etm_queue::prev_packet.
      Reported-by: default avatarRobert Walker <robert.walker@arm.com>
      Suggested-by: default avatarRobert Walker <robert.walker@arm.com>
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Tested-by: default avatarRobert Walker <robert.walker@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Fixes: 7100b12c ("perf cs-etm: Generate branch sample for exception packet")
      Fixes: 24fff5eb ("perf cs-etm: Avoid stale branch samples when flush packet")
      Link: http://lkml.kernel.org/r/20190428083228.20246-1-leo.yan@linaro.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      35bb59c1
    • Leo Yan's avatar
      perf cs-etm: Don't check cs_etm_queue::prev_packet validity · cf0c37b6
      Leo Yan authored
      Since cs_etm_queue::prev_packet is allocated for all cases, it will
      never be NULL pointer; now validity checking prev_packet is pointless,
      remove all of them.
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Tested-by: default avatarRobert Walker <robert.walker@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190428083228.20246-2-leo.yan@linaro.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      cf0c37b6
    • Thomas Richter's avatar
      perf report: Report OOM in status line in the GTK UI · 167e418f
      Thomas Richter authored
      An -ENOMEM error is not reported in the GTK GUI.  Instead this error
      message pops up on the screen:
      
      [root@m35lp76 perf]# ./perf  report -i perf.data.error68-1
      
      	Processing events... [974K/3M]
      	Error:failed to process sample
      
      	0xf4198 [0x8]: failed to process type: 68
      
      However when I use the same perf.data file with --stdio it works:
      
      [root@m35lp76 perf]# ./perf  report -i perf.data.error68-1 --stdio \
      		| head -12
      
        # Total Lost Samples: 0
        #
        # Samples: 76K of event 'cycles'
        # Event count (approx.): 99056160000
        #
        # Overhead  Command          Shared Object      Symbol
        # ........  ...............  .................  .........
        #
           8.81%  find             [kernel.kallsyms]  [k] ftrace_likely_update
           8.74%  swapper          [kernel.kallsyms]  [k] ftrace_likely_update
           8.34%  sshd             [kernel.kallsyms]  [k] ftrace_likely_update
           2.19%  kworker/u512:1-  [kernel.kallsyms]  [k] ftrace_likely_update
      
      The sample precentage is a bit low.....
      
      The GUI always fails in the FINISHED_ROUND event (68) and does not
      indicate the reason why.
      
      When happened is the following. Perf report calls a lot of functions and
      down deep when a FINISHED_ROUND event is processed, these functions are
      called:
      
        perf_session__process_event()
        + perf_session__process_user_event()
          + process_finished_round()
            + ordered_events__flush()
              + __ordered_events__flush()
      	  + do_flush()
      	    + ordered_events__deliver_event()
      	      + perf_session__deliver_event()
      	        + machine__deliver_event()
      	          + perf_evlist__deliver_event()
      	            + process_sample_event()
      	              + hist_entry_iter_add() --> only called in GUI case!!!
      	                + hist_iter__report__callback()
      	                  + symbol__inc_addr_sample()
      
      	                    Now this functions runs out of memory and
      			    returns -ENOMEM. This is reported all the way up
      			    until function
      
      perf_session__process_event() returns to its caller, where -ENOMEM is
      changed to -EINVAL and processing stops:
      
       if ((skip = perf_session__process_event(session, event, head)) < 0) {
            pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
      	     head, event->header.size, event->header.type);
            err = -EINVAL;
            goto out_err;
       }
      
      This occurred in the FINISHED_ROUND event when it has to process some
      10000 entries and ran out of memory.
      
      This patch indicates the root cause and displays it in the status line
      of ther perf report GUI.
      
      Output before (on GUI status line):
      
        0xf4198 [0x8]: failed to process type: 68
      
      Output after:
      
        0xf4198 [0x8]: failed to process type: 68 [not enough memory]
      
      Committer notes:
      
      the 'skip' variable needs to be initialized to -EINVAL, so that when the
      size is less than sizeof(struct perf_event_attr) we avoid this valid
      compiler warning:
      
        util/session.c: In function ‘perf_session__process_events’:
        util/session.c:1936:7: error: ‘skip’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
           err = skip;
           ~~~~^~~~~~
        util/session.c:1874:6: note: ‘skip’ was declared here
          s64 skip;
              ^~~~
        cc1: all warnings being treated as errors
      Signed-off-by: default avatarThomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: default avatarHendrik Brueckner <brueckner@linux.ibm.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20190423105303.61683-1-tmricht@linux.ibm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      167e418f
    • Arnaldo Carvalho de Melo's avatar
      perf bench numa: Add define for RUSAGE_THREAD if not present · bf561d3c
      Arnaldo Carvalho de Melo authored
      While cross building perf to the ARC architecture on a fedora 30 host,
      we were failing with:
      
            CC       /tmp/build/perf/bench/numa.o
        bench/numa.c: In function ‘worker_thread’:
        bench/numa.c:1261:12: error: ‘RUSAGE_THREAD’ undeclared (first use in this function); did you mean ‘SIGEV_THREAD’?
          getrusage(RUSAGE_THREAD, &rusage);
                    ^~~~~~~~~~~~~
                    SIGEV_THREAD
        bench/numa.c:1261:12: note: each undeclared identifier is reported only once for each function it appears in
      
      [perfbuilder@60d5802468f6 perf]$ /arc_gnu_2019.03-rc1_prebuilt_uclibc_le_archs_linux_install/bin/arc-linux-gcc --version | head -1
      arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
      [perfbuilder@60d5802468f6 perf]$
      
      Trying to reproduce a report by Vineet, I noticed that, with just
      cross-built zlib and numactl libraries, I ended up with the above
      failure.
      
      So, since RUSAGE_THREAD is available as a define, check for that and
      numactl libraries, I ended up with the above failure.
      
      So, since RUSAGE_THREAD is available as a define in the system headers,
      check if it is defined in the 'perf bench numa' sources and define it if
      not.
      
      Now it builds and I have to figure out if the problem reported by Vineet
      only takes place if we have libelf or some other library available.
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
      Link: https://lkml.kernel.org/n/tip-2wb4r1gir9xrevbpq7qp0amk@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      bf561d3c
    • Leo Yan's avatar
      tools lib traceevent: Change tag string for error · 5f05182f
      Leo Yan authored
      The traceevent lib is used by the perf tool, and when executing
      
        perf test -v 6
      
      it outputs error log on the ARM64 platform:
      
        running test 33 '*:*'trace-cmd: No such file or directory
      
        [...]
      
        trace-cmd: Invalid argument
      
      The trace event parsing code originally came from trace-cmd so it keeps
      the tag string "trace-cmd" for errors, this easily introduces the
      impression that the perf tool launches trace-cmd command for trace event
      parsing, but in fact the related parsing is accomplished by the
      traceevent lib.
      
      This patch changes the tag string to "libtraceevent" so that we can
      avoid confusion and let users to more easily connect the error with
      traceevent lib.
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Acked-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20190424013802.27569-1-leo.yan@linaro.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5f05182f
    • Thadeu Lima de Souza Cascardo's avatar
      perf annotate: Fix build on 32 bit for BPF annotation · 01e985e9
      Thadeu Lima de Souza Cascardo authored
      Commit 6987561c ("perf annotate: Enable annotation of BPF programs") adds
      support for BPF programs annotations but the new code does not build on 32-bit.
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Fixes: 6987561c ("perf annotate: Enable annotation of BPF programs")
      Link: http://lkml.kernel.org/r/20190403194452.10845-1-cascardo@canonical.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      01e985e9
    • Arnaldo Carvalho de Melo's avatar
      tools uapi x86: Sync vmx.h with the kernel · 24e45b49
      Arnaldo Carvalho de Melo authored
      To pick up the changes from:
      
        2b27924b ("KVM: nVMX: always use early vmcs check when EPT is disabled")
      
      That causes this object in the tools/perf build process to be rebuilt:
      
        CC       /tmp/build/perf/arch/x86/util/kvm-stat.o
      
      But it isn't using VMX_ABORT_ prefixed constants, so no change in
      behaviour.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/vmx.h' differs from latest version at 'arch/x86/include/uapi/asm/vmx.h'
        diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Link: https://lkml.kernel.org/n/tip-bjbo3zc0r8i8oa0udpvftya6@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      24e45b49
    • Bo YU's avatar
      perf bpf: Return value with unlocking in perf_env__find_btf() · 2e712675
      Bo YU authored
      In perf_env__find_btf(), we're returning without unlocking
      "env->bpf_progs.lock". There may be cause lockdep issue.
      
      Detected by CoversityScan, CID# 1444762:(program hangs(LOCK))
      Signed-off-by: default avatarBo YU <tsu.yubo@gmail.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Fixes: 2db7b1e0: (perf bpf: Return NULL when RB tree lookup fails in perf_env__find_btf())
      Link: http://lkml.kernel.org/r/20190422080138.10088-1-tsu.yubo@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      2e712675
    • Kim Phillips's avatar
      MAINTAINERS: Include vendor specific files under arch/*/events/* · 1804569d
      Kim Phillips authored
      Add an explicit subdirectory specification for arch/x86/events/amd to
      the MAINTAINERS file, to distinguish it from its parent. This will
      produce the correct set of maintainers for the files found therein.
      Signed-off-by: default avatarKim Phillips <kim.phillips@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Gary Hook <Gary.Hook@amd.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Fixes: 39b0332a ("perf/x86: Move perf_event_amd.c ........... => x86/events/amd/core.c")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1804569d
    • Kim Phillips's avatar
      perf/x86/amd: Update generic hardware cache events for Family 17h · 0e3b74e2
      Kim Phillips authored
      Add a new amd_hw_cache_event_ids_f17h assignment structure set
      for AMD families 17h and above, since a lot has changed.  Specifically:
      
      L1 Data Cache
      
      The data cache access counter remains the same on Family 17h.
      
      For DC misses, PMCx041's definition changes with Family 17h,
      so instead we use the L2 cache accesses from L1 data cache
      misses counter (PMCx060,umask=0xc8).
      
      For DC hardware prefetch events, Family 17h breaks compatibility
      for PMCx067 "Data Prefetcher", so instead, we use PMCx05a "Hardware
      Prefetch DC Fills."
      
      L1 Instruction Cache
      
      PMCs 0x80 and 0x81 (32-byte IC fetches and misses) are backward
      compatible on Family 17h.
      
      For prefetches, we remove the erroneous PMCx04B assignment which
      counts how many software data cache prefetch load instructions were
      dispatched.
      
      LL - Last Level Cache
      
      Removing PMCs 7D, 7E, and 7F assignments, as they do not exist
      on Family 17h, where the last level cache is L3.  L3 counters
      can be accessed using the existing AMD Uncore driver.
      
      Data TLB
      
      On Intel machines, data TLB accesses ("dTLB-loads") are assigned
      to counters that count load/store instructions retired.  This
      is inconsistent with instruction TLB accesses, where Intel
      implementations report iTLB misses that hit in the STLB.
      
      Ideally, dTLB-loads would count higher level dTLB misses that hit
      in lower level TLBs, and dTLB-load-misses would report those
      that also missed in those lower-level TLBs, therefore causing
      a page table walk.  That would be consistent with instruction
      TLB operation, remove the redundancy between dTLB-loads and
      L1-dcache-loads, and prevent perf from producing artificially
      low percentage ratios, i.e. the "0.01%" below:
      
              42,550,869      L1-dcache-loads
              41,591,860      dTLB-loads
                   4,802      dTLB-load-misses          #    0.01% of all dTLB cache hits
               7,283,682      L1-dcache-stores
               7,912,392      dTLB-stores
                     310      dTLB-store-misses
      
      On AMD Families prior to 17h, the "Data Cache Accesses" counter is
      used, which is slightly better than load/store instructions retired,
      but still counts in terms of individual load/store operations
      instead of TLB operations.
      
      So, for AMD Families 17h and higher, this patch assigns "dTLB-loads"
      to a counter for L1 dTLB misses that hit in the L2 dTLB, and
      "dTLB-load-misses" to a counter for L1 DTLB misses that caused
      L2 DTLB misses and therefore also caused page table walks.  This
      results in a much more accurate view of data TLB performance:
      
              60,961,781      L1-dcache-loads
                   4,601      dTLB-loads
                     963      dTLB-load-misses          #   20.93% of all dTLB cache hits
      
      Note that for all AMD families, data loads and stores are combined
      in a single accesses counter, so no 'L1-dcache-stores' are reported
      separately, and stores are counted with loads in 'L1-dcache-loads'.
      
      Also note that the "% of all dTLB cache hits" string is misleading
      because (a) "dTLB cache": although TLBs can be considered caches for
      page tables, in this context, it can be misinterpreted as data cache
      hits because the figures are similar (at least on Intel), and (b) not
      all those loads (technically accesses) technically "hit" at that
      hardware level.  "% of all dTLB accesses" would be more clear/accurate.
      
      Instruction TLB
      
      On Intel machines, 'iTLB-loads' measure iTLB misses that hit in the
      STLB, and 'iTLB-load-misses' measure iTLB misses that also missed in
      the STLB and completed a page table walk.
      
      For AMD Family 17h and above, for 'iTLB-loads' we replace the
      erroneous instruction cache fetches counter with PMCx084
      "L1 ITLB Miss, L2 ITLB Hit".
      
      For 'iTLB-load-misses' we still use PMCx085 "L1 ITLB Miss,
      L2 ITLB Miss", but set a 0xff umask because without it the event
      does not get counted.
      
      Branch Predictor (BPU)
      
      PMCs 0xc2 and 0xc3 continue to be valid across all AMD Families.
      
      Node Level Events
      
      Family 17h does not have a PMCx0e9 counter, and corresponding counters
      have not been made available publicly, so for now, we mark them as
      unsupported for Families 17h and above.
      
      Reference:
      
        "Open-Source Register Reference For AMD Family 17h Processors Models 00h-2Fh"
        Released 7/17/2018, Publication #56255, Revision 3.03:
        https://www.amd.com/system/files/TechDocs/56255_OSRR.pdf
      
      [ mingo: tidied up the line breaks. ]
      Signed-off-by: default avatarKim Phillips <kim.phillips@amd.com>
      Cc: <stable@vger.kernel.org> # v4.9+
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-perf-users@vger.kernel.org
      Fixes: e40ed154 ("perf/x86: Add perf support for AMD family-17h processors")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      0e3b74e2
  4. 01 May, 2019 7 commits
  5. 30 Apr, 2019 4 commits
  6. 29 Apr, 2019 5 commits
    • Linus Torvalds's avatar
      Merge tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 83a50840
      Linus Torvalds authored
      Pull seccomp fixes from Kees Cook:
       "Syzbot found a use-after-free bug in seccomp due to flags that should
        not be allowed to be used together.
      
        Tycho fixed this, I updated the self-tests, and the syzkaller PoC has
        been running for several days without triggering KASan (before this
        fix, it would reproduce). These patches have also been in -next for
        almost a week, just to be sure.
      
         - Add logic for making some seccomp flags exclusive (Tycho)
      
         - Update selftests for exclusivity testing (Kees)"
      
      * tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        seccomp: Make NEW_LISTENER and TSYNC flags exclusive
        selftests/seccomp: Prepare for exclusive seccomp flags
      83a50840
    • Linus Torvalds's avatar
      x86: make ZERO_PAGE() at least parse its argument · 80871482
      Linus Torvalds authored
      This doesn't really do anything, but at least we now parse teh
      ZERO_PAGE() address argument so that we'll catch the most obvious errors
      in usage next time they'll happen.
      
      See commit 6a5c5d26 ("rdma: fix build errors on s390 and MIPS due to
      bad ZERO_PAGE use") what happens when we don't have any use of the macro
      argument at all.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      80871482
    • Linus Torvalds's avatar
      rdma: fix build errors on s390 and MIPS due to bad ZERO_PAGE use · 6a5c5d26
      Linus Torvalds authored
      The parameter to ZERO_PAGE() was wrong, but since all architectures
      except for MIPS and s390 ignore it, it wasn't noticed until 0-day
      reported the build error.
      
      Fixes: 67f269b3 ("RDMA/ucontext: Fix regression with disassociate")
      Cc: stable@vger.kernel.org
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6a5c5d26
    • Paulo Alcantara's avatar
      selinux: use kernel linux/socket.h for genheaders and mdp · dfbd199a
      Paulo Alcantara authored
      When compiling genheaders and mdp from a newer host kernel, the
      following error happens:
      
          In file included from scripts/selinux/genheaders/genheaders.c:18:
          ./security/selinux/include/classmap.h:238:2: error: #error New
          address family defined, please update secclass_map.  #error New
          address family defined, please update secclass_map.  ^~~~~
          make[3]: *** [scripts/Makefile.host:107:
          scripts/selinux/genheaders/genheaders] Error 1 make[2]: ***
          [scripts/Makefile.build:599: scripts/selinux/genheaders] Error 2
          make[1]: *** [scripts/Makefile.build:599: scripts/selinux] Error 2
          make[1]: *** Waiting for unfinished jobs....
      
      Instead of relying on the host definition, include linux/socket.h in
      classmap.h to have PF_MAX.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaulo Alcantara <paulo@paulo.ac>
      Acked-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      [PM: manually merge in mdp.c, subject line tweaks]
      Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
      dfbd199a
    • Linus Torvalds's avatar
      Linux 5.1-rc7 · 37624b58
      Linus Torvalds authored
      37624b58
  7. 28 Apr, 2019 6 commits
    • Jan Kara's avatar
      fsnotify: Fix NULL ptr deref in fanotify_get_fsid() · b1da6a51
      Jan Kara authored
      fanotify_get_fsid() is reading mark->connector->fsid under srcu. It can
      happen that it sees mark not fully initialized or mark that is already
      detached from the object list. In these cases mark->connector
      can be NULL leading to NULL ptr dereference. Fix the problem by
      being careful when reading mark->connector and check it for being NULL.
      Also use WRITE_ONCE when writing the mark just to prevent compiler from
      doing something stupid.
      
      Reported-by: syzbot+15927486a4f1bfcbaf91@syzkaller.appspotmail.com
      Fixes: 77115225 ("fanotify: cache fsid in fsnotify_mark_connector")
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      b1da6a51
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 9520b532
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "A small number of ARM fixes
      
         - Fix function tracer and unwinder dependencies so that we don't end
           up building kernels that will crash
      
         - Fix ARMv7M nommu initialisation (missing register initialisation)
      
         - Fix EFI decompressor entry (ensuring barrier instructions are
           enabled prior to use)"
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 8857/1: efi: enable CP15 DMB instructions before cleaning the cache
        ARM: 8856/1: NOMMU: Fix CCR register faulty initialization when MPU is disabled
        ARM: fix function graph tracer and unwinder dependencies
      9520b532
    • Linus Torvalds's avatar
      Merge tag 'powerpc-5.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 0d82044e
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "A one-liner to make our Radix MMU support depend on HUGETLB_PAGE. We
        use some of the hugetlb inlines (eg. pud_huge()) when operating on the
        linear mapping and if they're compiled into empty wrappers we can
        corrupt memory.
      
        Then two fixes to our VFIO IOMMU code. The first is not a regression
        but fixes the locking to avoid a user-triggerable deadlock.
      
        The second does fix a regression since rc1, and depends on the first
        fix. It makes it possible to run guests with large amounts of memory
        again (~256GB).
      
        Thanks to Alexey Kardashevskiy"
      
      * tag 'powerpc-5.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/mm_iommu: Allow pinning large regions
        powerpc/mm_iommu: Fix potential deadlock
        powerpc/mm/radix: Make Radix require HUGETLB_PAGE
      0d82044e
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20190428' of git://git.kernel.dk/linux-block · 975a0f40
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A set of io_uring fixes that should go into this release. In
        particular, this contains:
      
         - The mutex lock vs ctx ref count fix (me)
      
         - Removal of a dead variable (me)
      
         - Two race fixes (Stefan)
      
         - Ring head/tail condition fix for poll full SQ detection (Stefan)"
      
      * tag 'for-linus-20190428' of git://git.kernel.dk/linux-block:
        io_uring: remove 'state' argument from io_{read,write} path
        io_uring: fix poll full SQ detection
        io_uring: fix race condition when sq threads goes sleeping
        io_uring: fix race condition reading SQ entries
        io_uring: fail io_uring_register(2) on a dying io_uring instance
      975a0f40
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 14f974d7
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "One core bug fix and a few driver ones
      
         - FRWR memory registration for hfi1/qib didn't work with with some
           iovas causing a NFSoRDMA failure regression due to a fix in the NFS
           side
      
         - A command flow error in mlx5 allowed user space to send a corrupt
           command (and also smash the kernel stack we've since learned)
      
         - Fix a regression and some bugs with device hot unplug that was
           discovered while reviewing Andrea's patches
      
         - hns has a failure if the user asks for certain QP configurations"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/hns: Bugfix for mapping user db
        RDMA/ucontext: Fix regression with disassociate
        RDMA/mlx5: Use rdma_user_map_io for mapping BAR pages
        RDMA/mlx5: Do not allow the user to write to the clock page
        IB/mlx5: Fix scatter to CQE in DCT QP creation
        IB/rdmavt: Fix frwr memory registration
      14f974d7
    • Linus Torvalds's avatar
      Merge tag 'dmaengine-fix-5.1-rc7' of git://git.infradead.org/users/vkoul/slave-dma · 72a6e35d
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
      
       - fix for wrong register use in mediatek driver
      
       - fix in sh driver for glitch is tx_status and treating 0 a valid
         residue for cyclic
      
       - fix in bcm driver for using right memory allocation flag
      
      * tag 'dmaengine-fix-5.1-rc7' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: mediatek-cqdma: fix wrong register usage in mtk_cqdma_start
        dmaengine: sh: rcar-dmac: Fix glitch in dmaengine_tx_status
        dmaengine: sh: rcar-dmac: With cyclic DMA residue 0 is valid
        dmaengine: bcm2835: Avoid GFP_KERNEL in device_prep_slave_sg
      72a6e35d
  8. 27 Apr, 2019 1 commit