1. 03 Nov, 2021 2 commits
  2. 04 Oct, 2021 2 commits
    • Andrea Arcangeli's avatar
      x86: deduplicate the spectre_v2_user documentation · d9bbdbf3
      Andrea Arcangeli authored
      This would need updating to make prctl be the new default, but it's
      simpler to delete it and refer to the dup.
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20201105001406.13005-2-aarcange@redhat.com
      d9bbdbf3
    • Andrea Arcangeli's avatar
      x86: change default to spec_store_bypass_disable=prctl spectre_v2_user=prctl · 2f46993d
      Andrea Arcangeli authored
      Switch the kernel default of SSBD and STIBP to the ones with
      CONFIG_SECCOMP=n (i.e. spec_store_bypass_disable=prctl
      spectre_v2_user=prctl) even if CONFIG_SECCOMP=y.
      
      Several motivations listed below:
      
      - If SMT is enabled the seccomp jail can still attack the rest of the
        system even with spectre_v2_user=seccomp by using MDS-HT (except on
        XEON PHI where MDS can be tamed with SMT left enabled, but that's a
        special case). Setting STIBP become a very expensive window dressing
        after MDS-HT was discovered.
      
      - The seccomp jail cannot attack the kernel with spectre-v2-HT
        regardless (even if STIBP is not set), but with MDS-HT the seccomp
        jail can attack the kernel too.
      
      - With spec_store_bypass_disable=prctl the seccomp jail can attack the
        other userland (guest or host mode) using spectre-v2-HT, but the
        userland attack is already mitigated by both ASLR and pid namespaces
        for host userland and through virt isolation with libkrun or
        kata. (if something if somebody is worried about spectre-v2-HT it's
        best to mount proc with hidepid=2,gid=proc on workstations where not
        all apps may run under container runtimes, rather than slowing down
        all seccomp jails, but the best is to add pid namespaces to the
        seccomp jail). As opposed MDS-HT is not mitigated and the seccomp
        jail can still attack all other host and guest userland if SMT is
        enabled even with spec_store_bypass_disable=seccomp.
      
      - If full security is required then MDS-HT must also be mitigated with
        nosmt and then spectre_v2_user=prctl and spectre_v2_user=seccomp
        would become identical.
      
      - Setting spectre_v2_user=seccomp is overall lower priority than to
        setting javascript.options.wasm false in about:config to protect
        against remote wasm MDS-HT, instead of worrying about Spectre-v2-HT
        and STIBP which again is already statistically well mitigated by
        other means in userland and it's fully mitigated in kernel with
        retpolines (unlike the wasm assist call with MDS-HT).
      
      - SSBD is needed to prevent reading the JIT memory and the primary
        user being the OpenJDK. However the primary user of SSBD wouldn't be
        covered by spec_store_bypass_disable=seccomp because it doesn't use
        seccomp and the primary user also explicitly declined to set
        PR_SET_SPECULATION_CTRL+PR_SPEC_STORE_BYPASS despite it easily
        could. In fact it would need to set it only when the sandboxing
        mechanism is enabled for javaws applets, but it still declined it by
        declaring security within the same user address space as an
        untenable objective for their JIT, even in the sandboxing case where
        performance would be a lesser concern (for the record: I kind of
        disagree in not setting PR_SPEC_STORE_BYPASS in the sandbox case and
        I prefer to run javaws through a wrapper that sets
        PR_SPEC_STORE_BYPASS if I need). In turn it can be inferred that
        even if the primary user of SSBD would use seccomp, they would
        invoke it with SECCOMP_FILTER_FLAG_SPEC_ALLOW by now.
      
      - runc/crun already set SECCOMP_FILTER_FLAG_SPEC_ALLOW by default, k8s
        and podman have a default json seccomp allowlist that cannot be
        slowed down, so for the #1 seccomp user this change is already a
        noop.
      
      - systemd/sshd or other apps that use seccomp, if they really need
        STIBP or SSBD, they need to explicitly set the
        PR_SET_SPECULATION_CTRL by now. The stibp/ssbd seccomp blind
        catch-all approach was done probably initially with a wishful
        thinking objective to pretend to have a peace of mind that it could
        magically fix it all. That was wishful thinking before MDS-HT was
        discovered, but after MDS-HT has been discovered it become just
        window dressing.
      
      - For qemu "-sandbox" seccomp jail it wouldn't make sense to set STIBP
        or SSBD. SSBD doesn't help with KVM because there's no JIT (if it's
        needed with TCG it should be an opt-in with
        PR_SET_SPECULATION_CTRL+PR_SPEC_STORE_BYPASS and it shouldn't
        slowdown KVM for nothing). For qemu+KVM STIBP would be even more
        window dressing than it is for all other apps, because in the
        qemu+KVM case there's not only the MDS attack to worry about with
        SMT enabled. Even after disabling SMT, there's still a theoretical
        spectre-v2 attack possible within the same thread context from guest
        mode to host ring3 that the host kernel retpoline mitigation has no
        theoretical chance to mitigate. On some kernels a
        ibrs-always/ibrs-retpoline opt-in model is provided that will
        enabled IBRS in the qemu host ring3 userland which fixes this
        theoretical concern. Only after enabling IBRS in the host userland
        it would then make sense to proceed and worry about STIBP and an
        attack on the other host userland, but then again SMT would need to
        be disabled for full security anyway, so that would render STIBP
        again a noop.
      
      - last but not the least: the lack of "spec_store_bypass_disable=prctl
        spectre_v2_user=prctl" means the moment a guest boots and
        sshd/systemd runs, the guest kernel will write to SPEC_CTRL MSR
        which will make the guest vmexit forever slower, forcing KVM to
        issue a very slow rdmsr instruction at every vmexit. So the end
        result is that SPEC_CTRL MSR is only available in GCE. Most other
        public cloud providers don't expose SPEC_CTRL, which means that not
        only STIBP/SSBD isn't available, but IBPB isn't available either
        (which would cause no overhead to the guest or the hypervisor
        because it's write only and requires no reading during vmexit). So
        the current default already net loss in security (missing IBPB)
        which means most public cloud providers cannot achieve a fully
        secure guest with nosmt (and nosmt is enough to fully mitigate
        MDS-HT). It also means GCE and is unfairly penalized in performance
        because it provides the option to enable full security in the guest
        as an opt-in (i.e. nosmt and IBPB). So this change will allow all
        cloud providers to expose SPEC_CTRL without incurring into any
        hypervisor slowdown and at the same time it will remove the unfair
        penalization of GCE performance for doing the right thing and it'll
        allow to get full security with nosmt with IBPB being available (and
        STIBP becoming meaningless).
      
      Example to put things in prospective: the STIBP enabled in seccomp has
      never been about protecting apps using seccomp like sshd from an
      attack from a malicious userland, but to the contrary it has always
      been about protecting the system from an attack from sshd, after a
      successful remote network exploit against sshd. In fact initially it
      wasn't obvious STIBP would work both ways (STIBP was about preventing
      the task that runs with STIBP to be attacked with spectre-v2-HT, but
      accidentally in the STIBP case it also prevents the attack in the
      other direction). In the hypothetical case that sshd has been remotely
      exploited the last concern should be STIBP being set, because it'll be
      still possible to obtain info even from the kernel by using MDS if
      nosmt wasn't set (and if it was set, STIBP is a noop in the first
      place). As opposed kernel cannot leak anything with spectre-v2 HT
      because of retpolines and the userland is mitigated by ASLR already
      and ideally PID namespaces too. If something it'd be worth checking if
      sshd run the seccomp thread under pid namespaces too if available in
      the running kernel. SSBD also would be a noop for sshd, since sshd
      uses no JIT. If sshd prefers to keep doing the STIBP window dressing
      exercise, it still can even after this change of defaults by opting-in
      with PR_SPEC_INDIRECT_BRANCH.
      
      Ultimately setting SSBD and STIBP by default for all seccomp jails is
      a bad sweet spot and bad default with more cons than pros that end up
      reducing security in the public cloud (by giving an huge incentive to
      not expose SPEC_CTRL which would be needed to get full security with
      IBPB after setting nosmt in the guest) and by excessively hurting
      performance to more secure apps using seccomp that end up having to
      opt out with SECCOMP_FILTER_FLAG_SPEC_ALLOW.
      
      The following is the verified result of the new default with SMT
      enabled:
      
      (gdb) print spectre_v2_user_stibp
      $1 = SPECTRE_V2_USER_PRCTL
      (gdb) print spectre_v2_user_ibpb
      $2 = SPECTRE_V2_USER_PRCTL
      (gdb) print ssb_mode
      $3 = SPEC_STORE_BYPASS_PRCTL
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20201104235054.5678-1-aarcange@redhat.comAcked-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Link: https://lore.kernel.org/lkml/AAA2EF2C-293D-4D5B-BFA6-FF655105CD84@redhat.comAcked-by: default avatarWaiman Long <longman@redhat.com>
      Link: https://lore.kernel.org/lkml/c0722838-06f7-da6b-138f-e0f26362f16a@redhat.com
      2f46993d
  3. 20 Sep, 2021 2 commits
    • Linus Torvalds's avatar
      Linux 5.15-rc2 · e4e737bb
      Linus Torvalds authored
      e4e737bb
    • Linus Torvalds's avatar
      pci_iounmap'2: Electric Boogaloo: try to make sense of it all · 316e8d79
      Linus Torvalds authored
      Nathan Chancellor reports that the recent change to pci_iounmap in
      commit 9caea000 ("parisc: Declare pci_iounmap() parisc version only
      when CONFIG_PCI enabled") causes build errors on arm64.
      
      It took me about two hours to convince myself that I think I know what
      the logic of that mess of #ifdef's in the <asm-generic/io.h> header file
      really aim to do, and rewrite it to be easier to follow.
      
      Famous last words.
      
      Anyway, the code has now been lifted from that grotty header file into
      lib/pci_iomap.c, and has fairly extensive comments about what the logic
      is.  It also avoids indirecting through another confusing (and badly
      named) helper function that has other preprocessor config conditionals.
      
      Let's see what odd architecture did something else strange in this area
      to break things.  But my arm64 cross build is clean.
      
      Fixes: 9caea000 ("parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled")
      Reported-by: default avatarNathan Chancellor <nathan@kernel.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Ulrich Teichert <krypton@ulrich-teichert.org>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      316e8d79
  4. 19 Sep, 2021 18 commits
  5. 18 Sep, 2021 12 commits
    • Linus Torvalds's avatar
      alpha: move __udiv_qrnnd library function to arch/alpha/lib/ · d4d016ca
      Linus Torvalds authored
      We already had the implementation for __udiv_qrnnd (unsigned divide for
      multi-precision arithmetic) as part of the alpha math emulation code.
      
      But you can disable the math emulation code - even if you shouldn't -
      and then the MPI code that actually wants this functionality (and is
      needed by various crypto functions) will fail to build.
      
      So move the extended-precision divide code to be a regular library
      function, just like all the regular division code is.  That way ie is
      available regardless of math-emulation.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d4d016ca
    • Linus Torvalds's avatar
      alpha: mark 'Jensen' platform as no longer broken · ab41f75e
      Linus Torvalds authored
      Ok, it almost certainly is still broken on actual hardware, but the
      immediate reason for it having been marked BROKEN was a build error that
      is fixed by just making sure the low-level IO header file is included
      sufficiently early that the __EXTERN_INLINE hackery takes effect.
      
      This was marked broken back in 2017 by commit 1883c9f4 ("alpha: mark
      jensen as broken"), but Ulrich Teichert made me look at it as part of my
      cross-build work to make sure -Werror actually does the right thing.
      
      There are lots of alpha configurations that do not build cleanly, but
      now it's no longer because Jensen wouldn't be buildable.  That said,
      because the Jensen platform doesn't force PCI to be enabled (Jensen only
      had EISA), it ends up being somewhat interesting as a source of odd
      configs.
      Reported-by: default avatarUlrich Teichert <krypton@ulrich-teichert.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ab41f75e
    • Andrii Nakryiko's avatar
      perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id() · 219d720e
      Andrii Nakryiko authored
      Perf code re-implements libbpf's btf__load_from_kernel_by_id() API as
      a weak function, presumably to dynamically link against old version of
      libbpf shared library. Unfortunately this causes compilation warning
      when perf is compiled against libbpf v0.6+.
      
      For now, just ignore deprecation warning, but there might be a better
      solution, depending on perf's needs.
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: kernel-team@fb.com
      LPU-Reference: 20210914170004.4185659-1-andrii@kernel.org
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      219d720e
    • Ian Rogers's avatar
      libperf evsel: Make use of FD robust. · aba5daeb
      Ian Rogers authored
      FD uses xyarray__entry that may return NULL if an index is out of
      bounds. If NULL is returned then a segv happens as FD unconditionally
      dereferences the pointer. This was happening in a case of with perf
      iostat as shown below. The fix is to make FD an "int*" rather than an
      int and handle the NULL case as either invalid input or a closed fd.
      
        $ sudo gdb --args perf stat --iostat  list
        ...
        Breakpoint 1, perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
        50      {
        (gdb) bt
         #0  perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
         #1  0x000055555585c188 in evsel__open_cpu (evsel=0x5555560951a0, cpus=0x555556093410,
            threads=0x555556086fb0, start_cpu=0, end_cpu=1) at util/evsel.c:1792
         #2  0x000055555585cfb2 in evsel__open (evsel=0x5555560951a0, cpus=0x0, threads=0x555556086fb0)
            at util/evsel.c:2045
         #3  0x000055555585d0db in evsel__open_per_thread (evsel=0x5555560951a0, threads=0x555556086fb0)
            at util/evsel.c:2065
         #4  0x00005555558ece64 in create_perf_stat_counter (evsel=0x5555560951a0,
            config=0x555555c34700 <stat_config>, target=0x555555c2f1c0 <target>, cpu=0) at util/stat.c:590
         #5  0x000055555578e927 in __run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
            at builtin-stat.c:833
         #6  0x000055555578f3c6 in run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
            at builtin-stat.c:1048
         #7  0x0000555555792ee5 in cmd_stat (argc=1, argv=0x7fffffffe4a0) at builtin-stat.c:2534
         #8  0x0000555555835ed3 in run_builtin (p=0x555555c3f540 <commands+288>, argc=3,
            argv=0x7fffffffe4a0) at perf.c:313
         #9  0x0000555555836154 in handle_internal_command (argc=3, argv=0x7fffffffe4a0) at perf.c:365
         #10 0x000055555583629f in run_argv (argcp=0x7fffffffe2ec, argv=0x7fffffffe2e0) at perf.c:409
         #11 0x0000555555836692 in main (argc=3, argv=0x7fffffffe4a0) at perf.c:539
        ...
        (gdb) c
        Continuing.
        Error:
        The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
        /bin/dmesg | grep -i perf may provide additional information.
      
        Program received signal SIGSEGV, Segmentation fault.
        0x00005555559b03ea in perf_evsel__close_fd_cpu (evsel=0x5555560951a0, cpu=1) at evsel.c:166
        166                     if (FD(evsel, cpu, thread) >= 0)
      
      v3. fixes a bug in perf_evsel__run_ioctl where the sense of a branch was
          backward.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20210918054440.2350466-1-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      aba5daeb
    • Michael Petlan's avatar
      perf machine: Initialize srcline string member in add_location struct · 57f0ff05
      Michael Petlan authored
      It's later supposed to be either a correct address or NULL. Without the
      initialization, it may contain an undefined value which results in the
      following segmentation fault:
      
        # perf top --sort comm -g --ignore-callees=do_idle
      
      terminates with:
      
        #0  0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
        #1  0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
        #2  0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
        #3  hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
        #4  0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
            sample_self=sample_self@entry=true) at util/hist.c:657
        #5  0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
            sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
        #6  0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
            at util/hist.c:1056
        #7  iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
        #8  0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
            at util/hist.c:1231
        #9  0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
            at builtin-top.c:842
        #10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
        #11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
        #12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
        #13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
        #14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
        #15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
        #16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
        #17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
        #18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6
      
      If you look at the frame #2, the code is:
      
      488	 if (he->srcline) {
      489          he->srcline = strdup(he->srcline);
      490          if (he->srcline == NULL)
      491              goto err_rawdata;
      492	 }
      
      If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
      it gets strdupped and strdupping a rubbish random string causes the problem.
      
      Also, if you look at the commit 1fb7d06a, it adds the srcline property
      into the struct, but not initializing it everywhere needed.
      
      Committer notes:
      
      Now I see, when using --ignore-callees=do_idle we end up here at line
      2189 in add_callchain_ip():
      
      2181         if (al.sym != NULL) {
      2182                 if (perf_hpp_list.parent && !*parent &&
      2183                     symbol__match_regex(al.sym, &parent_regex))
      2184                         *parent = al.sym;
      2185                 else if (have_ignore_callees && root_al &&
      2186                   symbol__match_regex(al.sym, &ignore_callees_regex)) {
      2187                         /* Treat this symbol as the root,
      2188                            forgetting its callees. */
      2189                         *root_al = al;
      2190                         callchain_cursor_reset(cursor);
      2191                 }
      2192         }
      
      And the al that doesn't have the ->srcline field initialized will be
      copied to the root_al, so then, back to:
      
      1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
      1212                          int max_stack_depth, void *arg)
      1213 {
      1214         int err, err2;
      1215         struct map *alm = NULL;
      1216
      1217         if (al)
      1218                 alm = map__get(al->map);
      1219
      1220         err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
      1221                                         iter->evsel, al, max_stack_depth);
      1222         if (err) {
      1223                 map__put(alm);
      1224                 return err;
      1225         }
      1226
      1227         err = iter->ops->prepare_entry(iter, al);
      1228         if (err)
      1229                 goto out;
      1230
      1231         err = iter->ops->add_single_entry(iter, al);
      1232         if (err)
      1233                 goto out;
      1234
      
      That al at line 1221 is what hist_entry_iter__add() (called from
      sample__resolve_callchain()) saw as 'root_al', and then:
      
              iter->ops->add_single_entry(iter, al);
      
      will go on with al->srcline with a bogus value, I'll add the above
      sequence to the cset and apply, thanks!
      Signed-off-by: default avatarMichael Petlan <mpetlan@redhat.com>
      CC: Milian Wolff <milian.wolff@kdab.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Fixes: 1fb7d06a ("perf report Use srcline from callchain for hist entries")
      Link: https //lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
      Reported-by: default avatarJuri Lelli <jlelli@redhat.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      57f0ff05
    • Adrian Hunter's avatar
      perf script: Fix ip display when type != attr->type · ff6f41fb
      Adrian Hunter authored
      set_print_ip_opts() was not being called when type != attr->type
      because there is not a one-to-one relationship between output types
      and attr->type. That resulted in ip not printing.
      
      The attr_type() function is removed, and the match of attr->type to
      output type is corrected.
      
      Example on ADL using taskset to select an atom cpu:
      
       # perf record -e cpu_atom/cpu-cycles/ taskset 0x1000 uname
       Linux
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]
      
       Before:
      
        # perf script | head
               taskset   428 [-01] 10394.179041:          1 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179043:          1 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179044:         11 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179045:        407 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179046:      16789 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179052:     676300 cpu_atom/cpu-cycles/:
                 uname   428 [-01] 10394.179278:    4079859 cpu_atom/cpu-cycles/:
      
       After:
      
        # perf script | head
               taskset   428 10394.179041:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179043:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179044:         11 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179045:        407 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179046:      16789 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179052:     676300 cpu_atom/cpu-cycles/:      7f829ef73800 cfree+0x0 (/lib/libc-2.32.so)
                 uname   428 10394.179278:    4079859 cpu_atom/cpu-cycles/:  ffffffff95bae912 vma_interval_tree_remove+0x1f2 ([kernel.kallsyms])
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/20210911133053.15682-1-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ff6f41fb
    • Ravi Bangoria's avatar
      perf annotate: Fix fused instr logic for assembly functions · 7efbcc8c
      Ravi Bangoria authored
      Some x86 microarchitectures fuse a subset of cmp/test/ALU instructions
      with branch instructions, and thus perf annotate highlight such valid
      pairs as fused.
      
      When annotated with source, perf uses struct disasm_line to contain
      either source or instruction line from objdump output. Usually, a C
      statement generates multiple instructions which include such
      cmp/test/ALU + branch instruction pairs. But in case of assembly
      function, each individual assembly source line generate one
      instruction.
      
      The 'perf annotate' instruction fusion logic assumes the previous
      disasm_line as the previous instruction line, which is wrong because,
      for assembly function, previous disasm_line contains source line.  And
      thus perf fails to highlight valid fused instruction pairs for assembly
      functions.
      
      Fix it by searching backward until we find an instruction line and
      consider that disasm_line as fused with current branch instruction.
      
      Before:
               │    cmpq    %rcx, RIP+8(%rsp)
          0.00 │      cmp    %rcx,0x88(%rsp)
               │    je      .Lerror_bad_iret      <--- Source line
          0.14 │   ┌──je     b4                   <--- Instruction line
               │   │movl    %ecx, %eax
      
      After:
               │    cmpq    %rcx, RIP+8(%rsp)
          0.00 │   ┌──cmp    %rcx,0x88(%rsp)
               │   │je      .Lerror_bad_iret
          0.14 │   ├──je     b4
               │   │movl    %ecx, %eax
      Reviewed-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Signed-off-by: default avatarRavi Bangoria <ravi.bangoria@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https //lore.kernel.org/r/20210911043854.8373-1-ravi.bangoria@amd.com
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7efbcc8c
    • Linus Torvalds's avatar
      Merge tag 's390-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 93ff9f13
      Linus Torvalds authored
      Pull s390 fixes from Vasily Gorbik:
      
       - Fix potential out-of-range access during secure boot facility
         detection.
      
       - Fully validate the VMA before calling follow_pte() in pci code.
      
       - Remove arch specific WARN_DYNAMIC_STACK config option.
      
       - Fix zcrypto kernel doc comments.
      
       - Update defconfigs.
      
      * tag 's390-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390: remove WARN_DYNAMIC_STACK
        s390/ap: fix kernel doc comments
        s390: update defconfigs
        s390/sclp: fix Secure-IPL facility detection
        s390/pci_mmio: fully validate the VMA before calling follow_pte()
      93ff9f13
    • Linus Torvalds's avatar
      Merge tag 'devicetree-fixes-for-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · d1a88690
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Revert fw_devlink tracking 'phy-handle' links. This broke at least a
         few platforms. A better solution is being worked on.
      
       - Add Samsung UFS binding which fell thru the cracks
      
       - Doc reference fixes from Mauro
      
       - Fix for restricted DMA error handling
      
      * tag 'devicetree-fixes-for-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: arm: Fix Toradex compatible typo
        of: restricted dma: Fix condition for rmem init
        dt-bindings: arm: mediatek: mmsys: update mediatek,mmsys.yaml reference
        dt-bindings: net: dsa: sja1105: update nxp,sja1105.yaml reference
        dt-bindings: ufs: Add bindings for Samsung ufs host
        Revert "of: property: fw_devlink: Add support for "phy-handle" property"
      d1a88690
    • Linus Torvalds's avatar
      tgafb: clarify dependencies · cd395d52
      Linus Torvalds authored
      The TGA boards were based on the DECchip 21030 PCI graphics accelerator
      used mainly for alpha, and existed in a TURBOchannel (TC) version for
      the DECstation (MIPS) workstations.
      
      However, the config option for the TGA code is a bit confused, and says
      
      	depends on FB && (ALPHA || TC)
      
      because people didn't really want to enable the option for random PCI
      environments, so the "ALPHA" stands in for that case (while the TC case
      is then the MIPS DECstation case).
      
      So that config dependency is kind of a mixture of architecture and bus
      choices.  But it's incorrect, in that there were non-PCI-based alpha
      hardware, and then the driver just causes warnings:
      
        drivers/video/fbdev/tgafb.c:1532:13: error: ‘tgafb_unregister’ defined but not used [-Werror=unused-function]
         1532 | static void tgafb_unregister(struct device *dev)
              |             ^~~~~~~~~~~~~~~~
        drivers/video/fbdev/tgafb.c:1387:12: error: ‘tgafb_register’ defined but not used [-Werror=unused-function]
         1387 | static int tgafb_register(struct device *dev)
              |            ^~~~~~~~~~~~~~
      
      so let's make the config option dependencies a bit more explict:
      
      	depends on FB
      	depends on PCI || TC
      	depends on ALPHA || TC
      
      where that first "FB" is the software configuration dependency, the
      second "PCI || TC" is the hardware bus dependency, while that final
      "ALPHA || TC" dependency is the "don't bother asking except for these
      situations.
      
      We could make that third case have "COMPILE_TEST" as an option, and mark
      the register/unregister functions as __maybe_unused, but I'm not sure
      it's really worth it.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cd395d52
    • Linus Torvalds's avatar
      alpha: make 'Jensen' IO functions build again · cc9d3aaa
      Linus Torvalds authored
      The Jensen IO functions are overly copmplicated because some of the IO
      addresses refer to special 'local IO' ports, and they get accessed
      differently.
      
      That then makes gcc not actually inline them, and since they were marked
      "extern inline" when included through the regular <asm/io.h> path, and
      then only marked "inline" when included from sys_jensen.c, you never
      necessarily got a body for the IO functions at all.
      
      The intent of the sys_jensen.c code is to actually get the non-inlined
      copy generated, so remove the 'inline' from the magic macro that is
      supposed to sort this all out.
      
      Also, do not mix 'extern inline' functions (that may or may not be
      inlined and will not generate a function body if they are not) with
      'static inline' (that _will_ generate a function body when not inlined).
      Because gcc will complain about this situation:
      
         error: ‘jensen_bus_outb’ is static but used in inline function ‘jensen_outb’ which is not static
      
      because gcc basically doesn't know whether to generate a body for that
      static inline function or not for that call site.
      
      So make all of these use that __EXTERN_INLINE marker.  Gcc will
      generally not inline these things on use, and then generate the function
      body out-of-line in sys_jensen.c.
      
      This makes the core IO functions build for the alpha Jensen config.
      
      Not that the rest then builds, because it turns out Jensen also doesn't
      enable PCI, which then makes other drievrs very unhappy, but that's a
      separate issue.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cc9d3aaa
    • Linus Torvalds's avatar
      spi: Fix tegra20 build with CONFIG_PM=n · efafec27
      Linus Torvalds authored
      Without CONFIG_PM enabled, the SET_RUNTIME_PM_OPS() macro ends up being
      empty, and the only use of tegra_slink_runtime_{resume,suspend} goes
      away, resulting in
      
        drivers/spi/spi-tegra20-slink.c:1200:12: error: ‘tegra_slink_runtime_resume’ defined but not used [-Werror=unused-function]
         1200 | static int tegra_slink_runtime_resume(struct device *dev)
              |            ^~~~~~~~~~~~~~~~~~~~~~~~~~
        drivers/spi/spi-tegra20-slink.c:1188:12: error: ‘tegra_slink_runtime_suspend’ defined but not used [-Werror=unused-function]
         1188 | static int tegra_slink_runtime_suspend(struct device *dev)
              |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      mark the functions __maybe_unused to make the build happy.
      
      This hits the alpha allmodconfig build (and others).
      Reported-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      efafec27
  6. 17 Sep, 2021 4 commits