1. 07 Oct, 2021 12 commits
    • Pali Rohár's avatar
      PCI: aardvark: Fix checking for link up via LTSSM state · 661c399a
      Pali Rohár authored
      Current implementation of advk_pcie_link_up() is wrong as it marks also
      link disabled or hot reset states as link up.
      
      Fix it by marking link up only to those states which are defined in PCIe
      Base specification 3.0, Table 4-14: Link Status Mapped to the LTSSM.
      
      To simplify implementation, Define macros for every LTSSM state which
      aardvark hardware can return in CFG_REG register.
      
      Fix also checking for link training according to the same Table 4-14.
      Define a new function advk_pcie_link_training() for this purpose.
      
      Link: https://lore.kernel.org/r/20211005180952.6812-13-kabel@kernel.org
      Fixes: 8c39d710 ("PCI: aardvark: Add Aardvark PCI host controller driver")
      Signed-off-by: default avatarPali Rohár <pali@kernel.org>
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: default avatarMarek Behún <kabel@kernel.org>
      Cc: stable@vger.kernel.org
      Cc: Remi Pommarel <repk@triplefau.lt>
      661c399a
    • Pali Rohár's avatar
      PCI: aardvark: Fix link training · f76b36d4
      Pali Rohár authored
      Fix multiple link training issues in aardvark driver. The main reason of
      these issues was misunderstanding of what certain registers do, since their
      names and comments were misleading: before commit 96be36db ("PCI:
      aardvark: Replace custom macros by standard linux/pci_regs.h macros"), the
      pci-aardvark.c driver used custom macros for accessing standard PCIe Root
      Bridge registers, and misleading comments did not help to understand what
      the code was really doing.
      
      After doing more tests and experiments I've come to the conclusion that the
      SPEED_GEN register in aardvark sets the PCIe revision / generation
      compliance and forces maximal link speed. Both GEN3 and GEN2 values set the
      read-only PCI_EXP_FLAGS_VERS bits (PCIe capabilities version of Root
      Bridge) to value 2, while GEN1 value sets PCI_EXP_FLAGS_VERS to 1, which
      matches with PCI Express specifications revisions 3, 2 and 1 respectively.
      Changing SPEED_GEN also sets the read-only bits PCI_EXP_LNKCAP_SLS and
      PCI_EXP_LNKCAP2_SLS to corresponding speed.
      
      (Note that PCI Express rev 1 specification does not define PCI_EXP_LNKCAP2
       and PCI_EXP_LNKCTL2 registers and when SPEED_GEN is set to GEN1 (which
       also sets PCI_EXP_FLAGS_VERS set to 1), lspci cannot access
       PCI_EXP_LNKCAP2 and PCI_EXP_LNKCTL2 registers.)
      
      Changing PCIe link speed can be done via PCI_EXP_LNKCTL2_TLS bits of
      PCI_EXP_LNKCTL2 register. Armada 3700 Functional Specifications says that
      the default value of PCI_EXP_LNKCTL2_TLS is based on SPEED_GEN value, but
      tests showed that the default value is always 8.0 GT/s, independently of
      speed set by SPEED_GEN. So after setting SPEED_GEN, we must also set value
      in PCI_EXP_LNKCTL2 register via PCI_EXP_LNKCTL2_TLS bits.
      
      Triggering PCI_EXP_LNKCTL_RL bit immediately after setting LINK_TRAINING_EN
      bit actually doesn't do anything. Tests have shown that a delay is needed
      after enabling LINK_TRAINING_EN bit. As triggering PCI_EXP_LNKCTL_RL
      currently does nothing, remove it.
      
      Commit 43fc679c ("PCI: aardvark: Improve link training") introduced
      code which sets SPEED_GEN register based on negotiated link speed from
      PCI_EXP_LNKSTA_CLS bits of PCI_EXP_LNKSTA register. This code was added to
      fix detection of Compex WLE900VX (Atheros QCA9880) WiFi GEN1 PCIe cards, as
      otherwise these cards were "invisible" on PCIe bus (probably because they
      crashed). But apparently more people reported the same issues with these
      cards also with other PCIe controllers [1] and I was able to reproduce this
      issue also with other "noname" WiFi cards based on Atheros QCA9890 chip
      (with the same PCI vendor/device ids as Atheros QCA9880). So this is not an
      issue in aardvark but rather an issue in Atheros QCA98xx chips. Also, this
      issue only exists if the kernel is compiled with PCIe ASPM support, and a
      generic workaround for this is to change PCIe Bridge to 2.5 GT/s link speed
      via PCI_EXP_LNKCTL2_TLS_2_5GT bits in PCI_EXP_LNKCTL2 register [2], before
      triggering PCI_EXP_LNKCTL_RL bit. This workaround also works when SPEED_GEN
      is set to value GEN2 (5 GT/s). So remove this hack completely in the
      aardvark driver and always set SPEED_GEN to value from 'max-link-speed' DT
      property. Fix for Atheros QCA98xx chips is handled separately by patch [2].
      
      These two things (code for triggering PCI_EXP_LNKCTL_RL bit and changing
      SPEED_GEN value) also explain why commit 69644945 ("PCI: aardvark:
      Train link immediately after enabling training") somehow fixed detection of
      those problematic Compex cards with Atheros chips: if triggering link
      retraining (via PCI_EXP_LNKCTL_RL bit) was done immediately after enabling
      link training (via LINK_TRAINING_EN), it did nothing. If there was a
      specific delay, aardvark HW already initialized PCIe link and therefore
      triggering link retraining caused the above issue. Compex cards triggered
      link down event and disappeared from the PCIe bus.
      
      Commit f4c7d053 ("PCI: aardvark: Wait for endpoint to be ready before
      training link") added 100ms sleep before calling 'Start link training'
      command and explained that it is a requirement of PCI Express
      specification. But the code after this 100ms sleep was not doing 'Start
      link training', rather it triggered PCI_EXP_LNKCTL_RL bit via PCIe Root
      Bridge to put link into Recovery state.
      
      The required delay after fundamental reset is already done in function
      advk_pcie_wait_for_link() which also checks whether PCIe link is up.
      So after removing the code which triggers PCI_EXP_LNKCTL_RL bit on PCIe
      Root Bridge, there is no need to wait 100ms again. Remove the extra
      msleep() call and update comment about the delay required by the PCI
      Express specification.
      
      According to Marvell Armada 3700 Functional Specifications, Link training
      should be enabled via aardvark register LINK_TRAINING_EN after selecting
      PCIe generation and x1 lane. There is no need to disable it prior resetting
      card via PERST# signal. This disabling code was introduced in commit
      5169a985 ("PCI: aardvark: Issue PERST via GPIO") as a workaround for
      some Atheros cards. It turns out that this also is Atheros specific issue
      and affects any PCIe controller, not only aardvark. Moreover this Atheros
      issue was triggered by juggling with PCI_EXP_LNKCTL_RL, LINK_TRAINING_EN
      and SPEED_GEN bits interleaved with sleeps. Now, after removing triggering
      PCI_EXP_LNKCTL_RL, there is no need to explicitly disable LINK_TRAINING_EN
      bit. So remove this code too. The problematic Compex cards described in
      previous git commits are correctly detected in advk_pcie_train_link()
      function even after applying all these changes.
      
      Note that with this patch, and also prior this patch, some NVMe disks which
      support PCIe GEN3 with 8 GT/s speed are negotiated only at the lowest link
      speed 2.5 GT/s, independently of SPEED_GEN value. After manually triggering
      PCI_EXP_LNKCTL_RL bit (e.g. from userspace via setpci), these NVMe disks
      change link speed to 5 GT/s when SPEED_GEN was configured to GEN2. This
      issue first needs to be properly investigated. I will send a fix in the
      future.
      
      On the other hand, some other GEN2 PCIe cards with 5 GT/s speed are
      autonomously by HW autonegotiated at full 5 GT/s speed without need of any
      software interaction.
      
      Armada 3700 Functional Specifications describes the following steps for
      link training: set SPEED_GEN to GEN2, enable LINK_TRAINING_EN, poll until
      link training is complete, trigger PCI_EXP_LNKCTL_RL, poll until signal
      rate is 5 GT/s, poll until link training is complete, enable ASPM L0s.
      
      The requirement for triggering PCI_EXP_LNKCTL_RL can be explained by the
      need to achieve 5 GT/s speed (as changing link speed is done by throw to
      recovery state entered by PCI_EXP_LNKCTL_RL) or maybe as a part of enabling
      ASPM L0s (but in this case ASPM L0s should have been enabled prior
      PCI_EXP_LNKCTL_RL).
      
      It is unknown why the original pci-aardvark.c driver was triggering
      PCI_EXP_LNKCTL_RL bit before waiting for the link to be up. This does not
      align with neither PCIe base specifications nor with Armada 3700 Functional
      Specification. (Note that in older versions of aardvark, this bit was
      called incorrectly PCIE_CORE_LINK_TRAINING, so this may be the reason.)
      
      It is also unknown why Armada 3700 Functional Specification says that it is
      needed to trigger PCI_EXP_LNKCTL_RL for GEN2 mode, as according to PCIe
      base specification 5 GT/s speed negotiation is supposed to be entirely
      autonomous, even if initial speed is 2.5 GT/s.
      
      [1] - https://lore.kernel.org/linux-pci/87h7l8axqp.fsf@toke.dk/
      [2] - https://lore.kernel.org/linux-pci/20210326124326.21163-1-pali@kernel.org/
      
      Link: https://lore.kernel.org/r/20211005180952.6812-12-kabel@kernel.orgSigned-off-by: default avatarPali Rohár <pali@kernel.org>
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: default avatarMarek Behún <kabel@kernel.org>
      f76b36d4
    • Pali Rohár's avatar
      PCI: aardvark: Simplify initialization of rootcap on virtual bridge · 454c5327
      Pali Rohár authored
      PCIe config space can be initialized also before pci_bridge_emul_init()
      call, so move rootcap initialization after PCI config space initialization.
      
      This simplifies the function a little since it removes one if (ret < 0)
      check.
      
      Link: https://lore.kernel.org/r/20211005180952.6812-11-kabel@kernel.orgSigned-off-by: default avatarPali Rohár <pali@kernel.org>
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: default avatarMarek Behún <kabel@kernel.org>
      454c5327
    • Pali Rohár's avatar
      PCI: aardvark: Implement re-issuing config requests on CRS response · 223dec14
      Pali Rohár authored
      Commit 43f5c77b ("PCI: aardvark: Fix reporting CRS value") fixed
      handling of CRS response and when CRSSVE flag was not enabled it marked CRS
      response as failed transaction (due to simplicity).
      
      But pci-aardvark.c driver is already waiting up to the PIO_RETRY_CNT count
      for PIO config response and so we can with a small change implement
      re-issuing of config requests as described in PCIe base specification.
      
      This change implements re-issuing of config requests when response is CRS.
      Set upper bound of wait cycles to around PIO_RETRY_CNT, afterwards the
      transaction is marked as failed and an all-ones value is returned as
      before.
      
      We do this by returning appropriate error codes from function
      advk_pcie_check_pio_status(). On CRS we return -EAGAIN and caller then
      reissues transaction.
      
      Link: https://lore.kernel.org/r/20211005180952.6812-10-kabel@kernel.orgSigned-off-by: default avatarPali Rohár <pali@kernel.org>
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: default avatarMarek Behún <kabel@kernel.org>
      223dec14
    • Marek Behún's avatar
      PCI: aardvark: Deduplicate code in advk_pcie_rd_conf() · 67cb2a4c
      Marek Behún authored
      Avoid code repetition in advk_pcie_rd_conf() by handling errors with
      goto jump, as is customary in kernel.
      
      Link: https://lore.kernel.org/r/20211005180952.6812-9-kabel@kernel.org
      Fixes: 43f5c77b ("PCI: aardvark: Fix reporting CRS value")
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      67cb2a4c
    • Pali Rohár's avatar
      PCI: aardvark: Do not unmask unused interrupts · 1fb95d7d
      Pali Rohár authored
      There are lot of undocumented interrupt bits. To prevent unwanted
      spurious interrupts, fix all *_ALL_MASK macros to define all interrupt
      bits, so that driver can properly mask all interrupts, including those
      which are undocumented.
      
      Link: https://lore.kernel.org/r/20211005180952.6812-8-kabel@kernel.org
      Fixes: 8c39d710 ("PCI: aardvark: Add Aardvark PCI host controller driver")
      Signed-off-by: default avatarPali Rohár <pali@kernel.org>
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: default avatarMarek Behún <kabel@kernel.org>
      Cc: stable@vger.kernel.org
      1fb95d7d
    • Pali Rohár's avatar
      PCI: aardvark: Do not clear status bits of masked interrupts · a7ca6d7f
      Pali Rohár authored
      The PCIE_ISR1_REG says which interrupts are currently set / active,
      including those which are masked.
      
      The driver currently reads this register and looks if some unmasked
      interrupts are active, and if not, it clears status bits of _all_
      interrupts, including the masked ones.
      
      This is incorrect, since, for example, some drivers may poll these bits.
      
      Remove this clearing, and also remove this early return statement
      completely, since it does not change functionality in any way.
      
      Link: https://lore.kernel.org/r/20211005180952.6812-7-kabel@kernel.org
      Fixes: 8c39d710 ("PCI: aardvark: Add Aardvark PCI host controller driver")
      Signed-off-by: default avatarPali Rohár <pali@kernel.org>
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: default avatarMarek Behún <kabel@kernel.org>
      Cc: stable@vger.kernel.org
      a7ca6d7f
    • Pali Rohár's avatar
      PCI: aardvark: Fix configuring Reference clock · 46ef6090
      Pali Rohár authored
      Commit 36669701 ("PCI: aardvark: Add PHY support") introduced
      configuration of PCIe Reference clock via PCIE_CORE_REF_CLK_REG register,
      but did it incorrectly.
      
      PCIe Reference clock differential pair is routed from system board to
      endpoint card, so on CPU side it has output direction. Therefore it is
      required to enable transmitting and disable receiving.
      
      Default configuration according to Armada 3700 Functional Specifications is
      enabled receiver part and disabled transmitter.
      
      We need this change because otherwise PCIe Reference clock is configured to
      some undefined state when differential pair is used for both transmitting
      and receiving.
      
      Fix this by disabling receiver part.
      
      Link: https://lore.kernel.org/r/20211005180952.6812-6-kabel@kernel.org
      Fixes: 36669701 ("PCI: aardvark: Add PHY support")
      Signed-off-by: default avatarPali Rohár <pali@kernel.org>
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: default avatarMarek Behún <kabel@kernel.org>
      Cc: stable@vger.kernel.org
      46ef6090
    • Pali Rohár's avatar
      PCI: aardvark: Fix preserving PCI_EXP_RTCTL_CRSSVE flag on emulated bridge · d419052b
      Pali Rohár authored
      Commit 43f5c77b ("PCI: aardvark: Fix reporting CRS value") started
      using CRSSVE flag for handling CRS responses.
      
      PCI_EXP_RTCTL_CRSSVE flag is stored only in emulated config space buffer
      and there is handler for PCI_EXP_RTCTL register. So every read operation
      from config space automatically clears CRSSVE flag as it is not defined in
      PCI_EXP_RTCTL read handler.
      
      Fix this by reading current CRSSVE bit flag from emulated space buffer and
      appending it to PCI_EXP_RTCTL read response.
      
      Link: https://lore.kernel.org/r/20211005180952.6812-5-kabel@kernel.org
      Fixes: 43f5c77b ("PCI: aardvark: Fix reporting CRS value")
      Signed-off-by: default avatarPali Rohár <pali@kernel.org>
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: default avatarMarek Behún <kabel@kernel.org>
      d419052b
    • Marek Behún's avatar
      PCI: aardvark: Don't spam about PIO Response Status · 464de7e7
      Marek Behún authored
      Use dev_dbg() instead of dev_err() in advk_pcie_check_pio_status().
      
      For example CRS is not an error status, it just says that the request
      should be retried.
      
      Link: https://lore.kernel.org/r/20211005180952.6812-4-kabel@kernel.org
      Fixes: 8c39d710 ("PCI: aardvark: Add Aardvark PCI host controller driver")
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      464de7e7
    • Pali Rohár's avatar
      PCI: aardvark: Fix PCIe Max Payload Size setting · a4e17d65
      Pali Rohár authored
      Change PCIe Max Payload Size setting in PCIe Device Control register to 512
      bytes to align with PCIe Link Initialization sequence as defined in Marvell
      Armada 3700 Functional Specification. According to the specification,
      maximal Max Payload Size supported by this device is 512 bytes.
      
      Without this kernel prints suspicious line:
      
          pci 0000:01:00.0: Upstream bridge's Max Payload Size set to 256 (was 16384, max 512)
      
      With this change it changes to:
      
          pci 0000:01:00.0: Upstream bridge's Max Payload Size set to 256 (was 512, max 512)
      
      Link: https://lore.kernel.org/r/20211005180952.6812-3-kabel@kernel.org
      Fixes: 8c39d710 ("PCI: aardvark: Add Aardvark PCI host controller driver")
      Signed-off-by: default avatarPali Rohár <pali@kernel.org>
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: default avatarMarek Behún <kabel@kernel.org>
      Cc: stable@vger.kernel.org
      a4e17d65
    • Pali Rohár's avatar
      PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros · 460275f1
      Pali Rohár authored
      Define a macro PCI_EXP_DEVCTL_PAYLOAD_* for every possible Max Payload
      Size in linux/pci_regs.h, in the same style as PCI_EXP_DEVCTL_READRQ_*.
      
      Link: https://lore.kernel.org/r/20211005180952.6812-2-kabel@kernel.orgSigned-off-by: default avatarPali Rohár <pali@kernel.org>
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: default avatarMarek Behún <kabel@kernel.org>
      Reviewed-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      460275f1
  2. 20 Sep, 2021 2 commits
    • Linus Torvalds's avatar
      Linux 5.15-rc2 · e4e737bb
      Linus Torvalds authored
      e4e737bb
    • Linus Torvalds's avatar
      pci_iounmap'2: Electric Boogaloo: try to make sense of it all · 316e8d79
      Linus Torvalds authored
      Nathan Chancellor reports that the recent change to pci_iounmap in
      commit 9caea000 ("parisc: Declare pci_iounmap() parisc version only
      when CONFIG_PCI enabled") causes build errors on arm64.
      
      It took me about two hours to convince myself that I think I know what
      the logic of that mess of #ifdef's in the <asm-generic/io.h> header file
      really aim to do, and rewrite it to be easier to follow.
      
      Famous last words.
      
      Anyway, the code has now been lifted from that grotty header file into
      lib/pci_iomap.c, and has fairly extensive comments about what the logic
      is.  It also avoids indirecting through another confusing (and badly
      named) helper function that has other preprocessor config conditionals.
      
      Let's see what odd architecture did something else strange in this area
      to break things.  But my arm64 cross build is clean.
      
      Fixes: 9caea000 ("parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled")
      Reported-by: default avatarNathan Chancellor <nathan@kernel.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Ulrich Teichert <krypton@ulrich-teichert.org>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      316e8d79
  3. 19 Sep, 2021 18 commits
  4. 18 Sep, 2021 8 commits
    • Linus Torvalds's avatar
      alpha: move __udiv_qrnnd library function to arch/alpha/lib/ · d4d016ca
      Linus Torvalds authored
      We already had the implementation for __udiv_qrnnd (unsigned divide for
      multi-precision arithmetic) as part of the alpha math emulation code.
      
      But you can disable the math emulation code - even if you shouldn't -
      and then the MPI code that actually wants this functionality (and is
      needed by various crypto functions) will fail to build.
      
      So move the extended-precision divide code to be a regular library
      function, just like all the regular division code is.  That way ie is
      available regardless of math-emulation.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d4d016ca
    • Linus Torvalds's avatar
      alpha: mark 'Jensen' platform as no longer broken · ab41f75e
      Linus Torvalds authored
      Ok, it almost certainly is still broken on actual hardware, but the
      immediate reason for it having been marked BROKEN was a build error that
      is fixed by just making sure the low-level IO header file is included
      sufficiently early that the __EXTERN_INLINE hackery takes effect.
      
      This was marked broken back in 2017 by commit 1883c9f4 ("alpha: mark
      jensen as broken"), but Ulrich Teichert made me look at it as part of my
      cross-build work to make sure -Werror actually does the right thing.
      
      There are lots of alpha configurations that do not build cleanly, but
      now it's no longer because Jensen wouldn't be buildable.  That said,
      because the Jensen platform doesn't force PCI to be enabled (Jensen only
      had EISA), it ends up being somewhat interesting as a source of odd
      configs.
      Reported-by: default avatarUlrich Teichert <krypton@ulrich-teichert.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ab41f75e
    • Andrii Nakryiko's avatar
      perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id() · 219d720e
      Andrii Nakryiko authored
      Perf code re-implements libbpf's btf__load_from_kernel_by_id() API as
      a weak function, presumably to dynamically link against old version of
      libbpf shared library. Unfortunately this causes compilation warning
      when perf is compiled against libbpf v0.6+.
      
      For now, just ignore deprecation warning, but there might be a better
      solution, depending on perf's needs.
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: kernel-team@fb.com
      LPU-Reference: 20210914170004.4185659-1-andrii@kernel.org
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      219d720e
    • Ian Rogers's avatar
      libperf evsel: Make use of FD robust. · aba5daeb
      Ian Rogers authored
      FD uses xyarray__entry that may return NULL if an index is out of
      bounds. If NULL is returned then a segv happens as FD unconditionally
      dereferences the pointer. This was happening in a case of with perf
      iostat as shown below. The fix is to make FD an "int*" rather than an
      int and handle the NULL case as either invalid input or a closed fd.
      
        $ sudo gdb --args perf stat --iostat  list
        ...
        Breakpoint 1, perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
        50      {
        (gdb) bt
         #0  perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
         #1  0x000055555585c188 in evsel__open_cpu (evsel=0x5555560951a0, cpus=0x555556093410,
            threads=0x555556086fb0, start_cpu=0, end_cpu=1) at util/evsel.c:1792
         #2  0x000055555585cfb2 in evsel__open (evsel=0x5555560951a0, cpus=0x0, threads=0x555556086fb0)
            at util/evsel.c:2045
         #3  0x000055555585d0db in evsel__open_per_thread (evsel=0x5555560951a0, threads=0x555556086fb0)
            at util/evsel.c:2065
         #4  0x00005555558ece64 in create_perf_stat_counter (evsel=0x5555560951a0,
            config=0x555555c34700 <stat_config>, target=0x555555c2f1c0 <target>, cpu=0) at util/stat.c:590
         #5  0x000055555578e927 in __run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
            at builtin-stat.c:833
         #6  0x000055555578f3c6 in run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
            at builtin-stat.c:1048
         #7  0x0000555555792ee5 in cmd_stat (argc=1, argv=0x7fffffffe4a0) at builtin-stat.c:2534
         #8  0x0000555555835ed3 in run_builtin (p=0x555555c3f540 <commands+288>, argc=3,
            argv=0x7fffffffe4a0) at perf.c:313
         #9  0x0000555555836154 in handle_internal_command (argc=3, argv=0x7fffffffe4a0) at perf.c:365
         #10 0x000055555583629f in run_argv (argcp=0x7fffffffe2ec, argv=0x7fffffffe2e0) at perf.c:409
         #11 0x0000555555836692 in main (argc=3, argv=0x7fffffffe4a0) at perf.c:539
        ...
        (gdb) c
        Continuing.
        Error:
        The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
        /bin/dmesg | grep -i perf may provide additional information.
      
        Program received signal SIGSEGV, Segmentation fault.
        0x00005555559b03ea in perf_evsel__close_fd_cpu (evsel=0x5555560951a0, cpu=1) at evsel.c:166
        166                     if (FD(evsel, cpu, thread) >= 0)
      
      v3. fixes a bug in perf_evsel__run_ioctl where the sense of a branch was
          backward.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20210918054440.2350466-1-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      aba5daeb
    • Michael Petlan's avatar
      perf machine: Initialize srcline string member in add_location struct · 57f0ff05
      Michael Petlan authored
      It's later supposed to be either a correct address or NULL. Without the
      initialization, it may contain an undefined value which results in the
      following segmentation fault:
      
        # perf top --sort comm -g --ignore-callees=do_idle
      
      terminates with:
      
        #0  0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
        #1  0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
        #2  0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
        #3  hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
        #4  0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
            sample_self=sample_self@entry=true) at util/hist.c:657
        #5  0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
            sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
        #6  0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
            at util/hist.c:1056
        #7  iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
        #8  0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
            at util/hist.c:1231
        #9  0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
            at builtin-top.c:842
        #10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
        #11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
        #12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
        #13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
        #14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
        #15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
        #16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
        #17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
        #18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6
      
      If you look at the frame #2, the code is:
      
      488	 if (he->srcline) {
      489          he->srcline = strdup(he->srcline);
      490          if (he->srcline == NULL)
      491              goto err_rawdata;
      492	 }
      
      If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
      it gets strdupped and strdupping a rubbish random string causes the problem.
      
      Also, if you look at the commit 1fb7d06a, it adds the srcline property
      into the struct, but not initializing it everywhere needed.
      
      Committer notes:
      
      Now I see, when using --ignore-callees=do_idle we end up here at line
      2189 in add_callchain_ip():
      
      2181         if (al.sym != NULL) {
      2182                 if (perf_hpp_list.parent && !*parent &&
      2183                     symbol__match_regex(al.sym, &parent_regex))
      2184                         *parent = al.sym;
      2185                 else if (have_ignore_callees && root_al &&
      2186                   symbol__match_regex(al.sym, &ignore_callees_regex)) {
      2187                         /* Treat this symbol as the root,
      2188                            forgetting its callees. */
      2189                         *root_al = al;
      2190                         callchain_cursor_reset(cursor);
      2191                 }
      2192         }
      
      And the al that doesn't have the ->srcline field initialized will be
      copied to the root_al, so then, back to:
      
      1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
      1212                          int max_stack_depth, void *arg)
      1213 {
      1214         int err, err2;
      1215         struct map *alm = NULL;
      1216
      1217         if (al)
      1218                 alm = map__get(al->map);
      1219
      1220         err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
      1221                                         iter->evsel, al, max_stack_depth);
      1222         if (err) {
      1223                 map__put(alm);
      1224                 return err;
      1225         }
      1226
      1227         err = iter->ops->prepare_entry(iter, al);
      1228         if (err)
      1229                 goto out;
      1230
      1231         err = iter->ops->add_single_entry(iter, al);
      1232         if (err)
      1233                 goto out;
      1234
      
      That al at line 1221 is what hist_entry_iter__add() (called from
      sample__resolve_callchain()) saw as 'root_al', and then:
      
              iter->ops->add_single_entry(iter, al);
      
      will go on with al->srcline with a bogus value, I'll add the above
      sequence to the cset and apply, thanks!
      Signed-off-by: default avatarMichael Petlan <mpetlan@redhat.com>
      CC: Milian Wolff <milian.wolff@kdab.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Fixes: 1fb7d06a ("perf report Use srcline from callchain for hist entries")
      Link: https //lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
      Reported-by: default avatarJuri Lelli <jlelli@redhat.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      57f0ff05
    • Adrian Hunter's avatar
      perf script: Fix ip display when type != attr->type · ff6f41fb
      Adrian Hunter authored
      set_print_ip_opts() was not being called when type != attr->type
      because there is not a one-to-one relationship between output types
      and attr->type. That resulted in ip not printing.
      
      The attr_type() function is removed, and the match of attr->type to
      output type is corrected.
      
      Example on ADL using taskset to select an atom cpu:
      
       # perf record -e cpu_atom/cpu-cycles/ taskset 0x1000 uname
       Linux
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]
      
       Before:
      
        # perf script | head
               taskset   428 [-01] 10394.179041:          1 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179043:          1 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179044:         11 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179045:        407 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179046:      16789 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179052:     676300 cpu_atom/cpu-cycles/:
                 uname   428 [-01] 10394.179278:    4079859 cpu_atom/cpu-cycles/:
      
       After:
      
        # perf script | head
               taskset   428 10394.179041:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179043:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179044:         11 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179045:        407 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179046:      16789 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179052:     676300 cpu_atom/cpu-cycles/:      7f829ef73800 cfree+0x0 (/lib/libc-2.32.so)
                 uname   428 10394.179278:    4079859 cpu_atom/cpu-cycles/:  ffffffff95bae912 vma_interval_tree_remove+0x1f2 ([kernel.kallsyms])
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/20210911133053.15682-1-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ff6f41fb
    • Ravi Bangoria's avatar
      perf annotate: Fix fused instr logic for assembly functions · 7efbcc8c
      Ravi Bangoria authored
      Some x86 microarchitectures fuse a subset of cmp/test/ALU instructions
      with branch instructions, and thus perf annotate highlight such valid
      pairs as fused.
      
      When annotated with source, perf uses struct disasm_line to contain
      either source or instruction line from objdump output. Usually, a C
      statement generates multiple instructions which include such
      cmp/test/ALU + branch instruction pairs. But in case of assembly
      function, each individual assembly source line generate one
      instruction.
      
      The 'perf annotate' instruction fusion logic assumes the previous
      disasm_line as the previous instruction line, which is wrong because,
      for assembly function, previous disasm_line contains source line.  And
      thus perf fails to highlight valid fused instruction pairs for assembly
      functions.
      
      Fix it by searching backward until we find an instruction line and
      consider that disasm_line as fused with current branch instruction.
      
      Before:
               │    cmpq    %rcx, RIP+8(%rsp)
          0.00 │      cmp    %rcx,0x88(%rsp)
               │    je      .Lerror_bad_iret      <--- Source line
          0.14 │   ┌──je     b4                   <--- Instruction line
               │   │movl    %ecx, %eax
      
      After:
               │    cmpq    %rcx, RIP+8(%rsp)
          0.00 │   ┌──cmp    %rcx,0x88(%rsp)
               │   │je      .Lerror_bad_iret
          0.14 │   ├──je     b4
               │   │movl    %ecx, %eax
      Reviewed-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Signed-off-by: default avatarRavi Bangoria <ravi.bangoria@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https //lore.kernel.org/r/20210911043854.8373-1-ravi.bangoria@amd.com
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7efbcc8c
    • Linus Torvalds's avatar
      Merge tag 's390-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 93ff9f13
      Linus Torvalds authored
      Pull s390 fixes from Vasily Gorbik:
      
       - Fix potential out-of-range access during secure boot facility
         detection.
      
       - Fully validate the VMA before calling follow_pte() in pci code.
      
       - Remove arch specific WARN_DYNAMIC_STACK config option.
      
       - Fix zcrypto kernel doc comments.
      
       - Update defconfigs.
      
      * tag 's390-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390: remove WARN_DYNAMIC_STACK
        s390/ap: fix kernel doc comments
        s390: update defconfigs
        s390/sclp: fix Secure-IPL facility detection
        s390/pci_mmio: fully validate the VMA before calling follow_pte()
      93ff9f13