1. 13 Dec, 2023 16 commits
    • Nathan Lynch's avatar
      powerpc/pseries/papr-sysparm: Validate buffer object lengths · 35aae182
      Nathan Lynch authored
      The ability to get and set system parameters will be exposed to user
      space, so let's get a little more strict about malformed
      papr_sysparm_buf objects.
      
      * Create accessors for the length field of struct papr_sysparm_buf.
        The length is always stored in MSB order and this is better than
        spreading the necessary conversions all over.
      
      * Reject attempts to submit invalid buffers to RTAS.
      
      * Warn if RTAS returns a buffer with an invalid length, clamping the
        returned length to a safe value that won't overrun the buffer.
      
      These are meant as precautionary measures to mitigate both firmware
      and kernel bugs in this area, should they arise, but I am not aware of
      any.
      Signed-off-by: default avatarNathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-10-e9eafd0c8c6c@linux.ibm.com
      35aae182
    • Nathan Lynch's avatar
      powerpc/pseries: Add papr-vpd character driver for VPD retrieval · 514f6ff4
      Nathan Lynch authored
      PowerVM LPARs may retrieve Vital Product Data (VPD) for system
      components using the ibm,get-vpd RTAS function.
      
      We can expose this to user space with a /dev/papr-vpd character
      device, where the programming model is:
      
        struct papr_location_code plc = { .str = "", }; /* obtain all VPD */
        int devfd = open("/dev/papr-vpd", O_RDONLY);
        int vpdfd = ioctl(devfd, PAPR_VPD_CREATE_HANDLE, &plc);
        size_t size = lseek(vpdfd, 0, SEEK_END);
        char *buf = malloc(size);
        pread(devfd, buf, size, 0);
      
      When a file descriptor is obtained from ioctl(PAPR_VPD_CREATE_HANDLE),
      the file contains the result of a complete ibm,get-vpd sequence. The
      file contents are immutable from the POV of user space. To get a new
      view of the VPD, the client must create a new handle.
      
      This design choice insulates user space from most of the complexities
      that ibm,get-vpd brings:
      
      * ibm,get-vpd must be called more than once to obtain complete
        results.
      
      * Only one ibm,get-vpd call sequence should be in progress at a time;
        interleaved sequences will disrupt each other. Callers must have a
        protocol for serializing their use of the function.
      
      * A call sequence in progress may receive a "VPD changed, try again"
        status, requiring the client to abandon the sequence and start
        over.
      
      The memory required for the VPD buffers seems acceptable, around 20KB
      for all VPD on one of my systems. And the value of the
      /rtas/ibm,vpd-size DT property (the estimated maximum size of VPD) is
      consistently 300KB across various systems I've checked.
      
      I've implemented support for this new ABI in the rtas_get_vpd()
      function in librtas, which the vpdupdate command currently uses to
      populate its VPD database. I've verified that an unmodified vpdupdate
      binary generates an identical database when using a librtas.so that
      prefers the new ABI.
      
      Along with the papr-vpd.h header exposed to user space, this
      introduces a common papr-miscdev.h uapi header to share a base ioctl
      ID with similar drivers to come.
      Tested-by: default avatarMichal Suchánek <msuchanek@suse.de>
      Signed-off-by: default avatarNathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-9-e9eafd0c8c6c@linux.ibm.com
      514f6ff4
    • Nathan Lynch's avatar
      powerpc/rtas: Warn if per-function lock isn't held · e3681107
      Nathan Lynch authored
      If the function descriptor has a populated lock member, then callers
      are required to hold it across calls. Now that the firmware activation
      sequence is appropriately guarded, we can warn when the requirement
      isn't satisfied.
      
      __do_enter_rtas_trace() gets reorganized a bit as a result of
      performing the function descriptor lookup unconditionally now.
      Reviewed-by: default avatar"Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
      Signed-off-by: default avatarNathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-8-e9eafd0c8c6c@linux.ibm.com
      e3681107
    • Nathan Lynch's avatar
      powerpc/rtas: Serialize firmware activation sequences · dc7637c4
      Nathan Lynch authored
      Use rtas_ibm_activate_firmware_lock to prevent interleaving call
      sequences of the ibm,activate-firmware RTAS function, which typically
      requires multiple calls to complete the update. While the spec does
      not specifically prohibit interleaved sequences, there's almost
      certainly no advantage to allowing them.
      Reviewed-by: default avatar"Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
      Signed-off-by: default avatarNathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-7-e9eafd0c8c6c@linux.ibm.com
      dc7637c4
    • Nathan Lynch's avatar
      powerpc/rtas: Facilitate high-level call sequences · adf7a019
      Nathan Lynch authored
      On RTAS platforms there is a general restriction that the OS must not
      enter RTAS on more than one CPU at a time. This low-level
      serialization requirement is satisfied by holding a spin
      lock (rtas_lock) across most RTAS function invocations.
      
      However, some pseries RTAS functions require multiple successive calls
      to complete a logical operation. Beginning a new call sequence for such a
      function may disrupt any other sequences of that function already in
      progress. Safe and reliable use of these functions effectively
      requires higher-level serialization beyond what is already done at the
      level of RTAS entry and exit.
      
      Where a sequence-based RTAS function is invoked only through
      sys_rtas(), with no in-kernel users, there is no issue as far as the
      kernel is concerned. User space is responsible for appropriately
      serializing its call sequences. (Whether user space code actually
      takes measures to prevent sequence interleaving is another matter.)
      Examples of such functions currently include ibm,platform-dump and
      ibm,get-vpd.
      
      But where a sequence-based RTAS function has both user space and
      in-kernel uesrs, there is a hazard. Even if the in-kernel call sites
      of such a function serialize their sequences correctly, a user of
      sys_rtas() can invoke the same function at any time, potentially
      disrupting a sequence in progress.
      
      So in order to prevent disruption of kernel-based RTAS call sequences,
      they must serialize not only with themselves but also with sys_rtas()
      users, somehow. Preferably without adding more function-specific hacks
      to sys_rtas(). This is a prerequisite for adding an in-kernel call
      sequence of ibm,get-vpd, which is in a change to follow.
      
      Note that it has never been feasible for the kernel to prevent
      sys_rtas()-based sequences from being disrupted because control
      returns to user space on every call. sys_rtas()-based users of these
      functions have always been, and continue to be, responsible for
      coordinating their call sequences with other users, even those which
      may invoke the RTAS functions through less direct means than
      sys_rtas(). This is an unavoidable consequence of exposing
      sequence-based RTAS functions through sys_rtas().
      
      * Add an optional mutex member to struct rtas_function.
      
      * Statically define a mutex for each RTAS function with known call
        sequence serialization requirements, and assign its address to the
        .lock member of the corresponding function table entry, along with
        justifying commentary.
      
      * In sys_rtas(), if the table entry for the RTAS function being
        called has a populated lock member, acquire it before taking
        rtas_lock and entering RTAS.
      
      * Kernel-based RTAS call sequences are expected to access the
        appropriate mutex explicitly by name. For example, a user of the
        ibm,activate-firmware RTAS function would do:
      
              int token = rtas_function_token(RTAS_FN_IBM_ACTIVATE_FIRMWARE);
              int fwrc;
      
              mutex_lock(&rtas_ibm_activate_firmware_lock);
      
              do {
                      fwrc = rtas_call(token, 0, 1, NULL);
              } while (rtas_busy_delay(fwrc));
      
              mutex_unlock(&rtas_ibm_activate_firmware_lock);
      
      There should be no perceivable change introduced here except that
      concurrent callers of the same RTAS function via sys_rtas() may block
      on a mutex instead of spinning on rtas_lock.
      Signed-off-by: default avatarNathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-6-e9eafd0c8c6c@linux.ibm.com
      adf7a019
    • Nathan Lynch's avatar
      powerpc/rtas: Move token validation from block_rtas_call() to sys_rtas() · e7582edb
      Nathan Lynch authored
      The rtas system call handler sys_rtas() delegates certain input
      validation steps to a helper function: block_rtas_call(). One of these
      steps ensures that the user-supplied token value maps to a known RTAS
      function. This is done by performing a "reverse" token-to-function
      lookup via rtas_token_to_function_untrusted() to obtain an
      rtas_function object.
      
      In changes to come, sys_rtas() itself will need the function
      descriptor for the token. To prepare:
      
      * Move the lookup and validation up into sys_rtas() and pass the
        resulting rtas_function pointer to block_rtas_call(), which is
        otherwise unconcerned with the token value.
      
      * Change block_rtas_call() to report the RTAS function name instead of
        the token value on validation failures, since it can now rely on
        having a valid function descriptor.
      
      One behavior change is that sys_rtas() now silently errors out when
      passed a bad token, before calling block_rtas_call(). So we will no
      longer log "RTAS call blocked - exploit attempt?" on invalid
      tokens. This is consistent with how sys_rtas() currently handles other
      "metadata" (nargs and nret), while block_rtas_call() is primarily
      concerned with validating the arguments to be passed to specific RTAS
      functions.
      Signed-off-by: default avatarNathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-5-e9eafd0c8c6c@linux.ibm.com
      e7582edb
    • Nathan Lynch's avatar
      powerpc/rtas: Add function return status constants · 9592aa5a
      Nathan Lynch authored
      Not all of the generic RTAS function statuses specified in PAPR have
      symbolic constants and descriptions in rtas.h. Fix this, providing a
      little more background, slightly updating the existing wording, and
      improving the formatting.
      Reviewed-by: default avatar"Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
      Signed-off-by: default avatarNathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-4-e9eafd0c8c6c@linux.ibm.com
      9592aa5a
    • Nathan Lynch's avatar
      powerpc/rtas: Fall back to linear search on failed token->function lookup · 669acc7e
      Nathan Lynch authored
      Enabling any of the powerpc:rtas_* tracepoints at boot is likely to
      result in an oops on RTAS platforms. For example, booting a QEMU
      pseries model with 'trace_event=powerpc:rtas_input' in the command
      line leads to:
      
        BUG: Kernel NULL pointer dereference on read at 0x00000008
        Oops: Kernel access of bad area, sig: 7 [#1]
        NIP [c00000000004231c] do_enter_rtas+0x1bc/0x460
        LR [c00000000004231c] do_enter_rtas+0x1bc/0x460
        Call Trace:
          do_enter_rtas+0x1bc/0x460 (unreliable)
          rtas_call+0x22c/0x4a0
          rtas_get_boot_time+0x80/0x14c
          read_persistent_clock64+0x124/0x150
          read_persistent_wall_and_boot_offset+0x28/0x58
          timekeeping_init+0x70/0x348
          start_kernel+0xa0c/0xc1c
          start_here_common+0x1c/0x20
      
      (This is preceded by a warning for the failed lookup in
      rtas_token_to_function().)
      
      This happens when __do_enter_rtas_trace() attempts a token to function
      descriptor lookup before the xarray containing the mappings has been
      set up.
      
      Fall back to linear scan of the table if rtas_token_to_function_xarray
      is empty.
      
      Fixes: 24098f58 ("powerpc/rtas: add tracepoints around RTAS entry")
      Reviewed-by: default avatar"Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
      Signed-off-by: default avatarNathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-3-e9eafd0c8c6c@linux.ibm.com
      669acc7e
    • Nathan Lynch's avatar
      powerpc/rtas: Add for_each_rtas_function() iterator · c500c6e7
      Nathan Lynch authored
      Add a convenience macro for iterating over every element of the
      internal function table and convert the one site that can use it. An
      additional user of the macro is anticipated in changes to follow.
      Reviewed-by: default avatar"Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
      Signed-off-by: default avatarNathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-2-e9eafd0c8c6c@linux.ibm.com
      c500c6e7
    • Nathan Lynch's avatar
      powerpc/rtas: Avoid warning on invalid token argument to sys_rtas() · 01e346ff
      Nathan Lynch authored
      rtas_token_to_function() WARNs when passed an invalid token; it's
      meant to catch bugs in kernel-based users of RTAS functions. However,
      user space controls the token value passed to rtas_token_to_function()
      by block_rtas_call(), so user space with sufficient privilege to use
      sys_rtas() can trigger the warnings at will:
      
        unexpected failed lookup for token 2048
        WARNING: CPU: 20 PID: 2247 at arch/powerpc/kernel/rtas.c:556
          rtas_token_to_function+0xfc/0x110
        ...
        NIP rtas_token_to_function+0xfc/0x110
        LR  rtas_token_to_function+0xf8/0x110
        Call Trace:
          rtas_token_to_function+0xf8/0x110 (unreliable)
          sys_rtas+0x188/0x880
          system_call_exception+0x268/0x530
          system_call_common+0x160/0x2c4
      
      It's desirable to continue warning on bogus tokens in
      rtas_token_to_function(). Currently it is used to look up RTAS
      function descriptors when tracing, where we know there has to have
      been a successful descriptor lookup by different means already, and it
      would be a serious inconsistency for the reverse lookup to fail.
      
      So instead of weakening rtas_token_to_function()'s contract by
      removing the warnings, introduce rtas_token_to_function_untrusted(),
      which has no opinion on failed lookups. Convert block_rtas_call() and
      rtas_token_to_function() to use it.
      
      Fixes: 8252b882 ("powerpc/rtas: improve function information lookups")
      Signed-off-by: default avatarNathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-1-e9eafd0c8c6c@linux.ibm.com
      01e346ff
    • Kajol Jain's avatar
      powerpc/hv-gpci: Add return value check in affinity_domain_via_partition_show function · 070b71f4
      Kajol Jain authored
      To access hv-gpci kernel interface files data, the
      "Enable Performance Information Collection" option has to be set
      in hmc. Incase that option is not set and user try to read
      the interface files, it should give error message as
      operation not permitted.
      
      Result of accessing added interface files with disabled
      performance collection option:
      
      [command]# cat processor_bus_topology
      cat: processor_bus_topology: Operation not permitted
      
      [command]# cat processor_config
      cat: processor_config: Operation not permitted
      
      [command]# cat affinity_domain_via_domain
      cat: affinity_domain_via_domain: Operation not permitted
      
      [command]# cat affinity_domain_via_virtual_processor
      cat: affinity_domain_via_virtual_processor: Operation not permitted
      
      [command]# cat affinity_domain_via_partition
      
      Based on above result there is no error message when reading
      affinity_domain_via_partition file because of missing
      check for failed hcall. Fix this issue by adding
      a check in the start of affinity_domain_via_partition_show
      function, to return error incase hcall fails, with error type
      other then H_PARAMETER.
      
      Fixes: a15e0d6a ("powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity domain via partition information")
      Reported-by: default avatarDisha Goel <disgoel@linux.vnet.ibm.com>
      Signed-off-by: default avatarKajol Jain <kjain@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231116122033.160964-1-kjain@linux.ibm.com
      070b71f4
    • Michael Ellerman's avatar
      selftests/powerpc: Check all FPRs in fpu_syscall test · 1bdf2258
      Michael Ellerman authored
      There is a selftest that checks if FPRs are corrupted across a fork, aka
      clone. It was added as part of the series that optimised the clone path
      to save the parent's FP state without "giving up" (turning off FP).
      
      See commit 8792468d ("powerpc: Add the ability to save FPU without
      giving it up").
      
      The test encodes the assumption that FPRs 0-13 are volatile across the
      syscall, by only checking the volatile FPRs are not changed by the fork.
      There was also a comment in the fpu_preempt test alluding to that:
      
        The check_fpu function in asm only checks the non volatile registers
        as it is reused from the syscall test
      
      It is true that the function call ABI treats f0-f13 as volatile,
      however the syscall ABI has since been documented as *not* treating those
      registers as volatile. See commit 7b8845a2 ("powerpc/64: Document
      the syscall ABI").
      
      So change the test to check all FPRs are not corrupted by the syscall.
      Note that this currently fails, because save_fpu() etc. do not restore
      f0/vsr0.
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231128132748.1990179-5-mpe@ellerman.id.au
      1bdf2258
    • Michael Ellerman's avatar
      selftests/powerpc: Run fpu_preempt test for 60 seconds · 60d2c3af
      Michael Ellerman authored
      The FPU preempt test only runs for 20 seconds, which is not particularly
      long. Run it for 60 seconds to increase the chance of detecting
      corruption.
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231128132748.1990179-4-mpe@ellerman.id.au
      60d2c3af
    • Michael Ellerman's avatar
      selftests/powerpc: Generate better bit patterns for FPU tests · 2ba107f6
      Michael Ellerman authored
      The fpu_preempt test randomly initialises an array of doubles to try and
      detect FPU register corruption.
      
      However the values it generates do not occupy the full range of values
      possible in the 64-bit double, meaning some partial register corruption
      could go undetected.
      
      Without getting too carried away, add some better initialisation to
      generate values that occupy more bits.
      
      Sample values before:
      
        f0             902677510               (raw 0x41cae6e203000000)
        f1             325217596               (raw 0x41b3626d3c000000)
        f2             1856578300              (raw 0x41dbaa48bf000000)
        f3             1247189984              (raw 0x41d295a6f8000000)
      
      And after:
      
        f0             1.1078153481413311e-09  (raw 0x3e13083932805cc2)
        f1             1.0576648474801922e+17  (raw 0x43777c20eb88c261)
        f2             -6.6245033413594075e-10 (raw 0xbe06c2f989facae9)
        f3             3.0085988827156291e+18  (raw 0x43c4e0585f2df37b)
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231128132748.1990179-3-mpe@ellerman.id.au
      2ba107f6
    • Michael Ellerman's avatar
      selftests/powerpc: Check all FPRs in fpu_preempt · e5d00aaa
      Michael Ellerman authored
      There's a selftest that checks FPRs aren't corrupted by preemption, or
      just process scheduling. However it only checks the non-volatile FPRs,
      meaning corruption of the volatile FPRs could go undetected.
      
      The check_fpu function it calls is used by several other tests, so for
      now add a new routine to check all the FPRs. Increase the size of the
      array of FPRs to 32, and initialise them all with random values.
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231128132748.1990179-2-mpe@ellerman.id.au
      e5d00aaa
    • Michael Ellerman's avatar
      selftests/powerpc: Fix error handling in FPU/VMX preemption tests · 9dbd5927
      Michael Ellerman authored
      The FPU & VMX preemption tests do not check for errors returned by the
      low-level asm routines, preempt_fpu() / preempt_vsx() respectively.
      That means any register corruption detected by the asm routines does not
      result in a test failure.
      
      Fix it by returning the return value of the asm routines from the
      pthread child routines.
      
      Fixes: e5ab8be6 ("selftests/powerpc: Test preservation of FPU and VMX regs across preemption")
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231128132748.1990179-1-mpe@ellerman.id.au
      9dbd5927
  2. 07 Dec, 2023 7 commits
  3. 01 Dec, 2023 7 commits
  4. 30 Nov, 2023 5 commits
  5. 28 Nov, 2023 1 commit
    • Nathan Lynch's avatar
      powerpc/rtas_pci: rename and properly expose config access APIs · 9be4feb7
      Nathan Lynch authored
      The rtas_read_config() and rtas_write_config() functions in
      kernel/rtas_pci.c have external linkage and two users in arch/powerpc:
      the rtas_pci code itself and the pseries platform's "enhanced error
      handling" (EEH) support code.
      
      The prototypes for these functions in asm/ppc-pci.h have until now
      been guarded by CONFIG_EEH since the only external caller is the
      pseries EEH code. However, this presumably has always generated
      warnings when built with !CONFIG_EEH and -Wmissing-prototypes:
      
        arch/powerpc/kernel/rtas_pci.c:46:5: error: no previous prototype for
        function 'rtas_read_config' [-Werror,-Wmissing-prototypes]
           46 | int rtas_read_config(struct pci_dn *pdn, int where,
                                     int size, u32 *val)
      
        arch/powerpc/kernel/rtas_pci.c:98:5: error: no previous prototype for
        function 'rtas_write_config' [-Werror,-Wmissing-prototypes]
           98 | int rtas_write_config(struct pci_dn *pdn, int where,
                                      int size, u32 val)
      
      The introduction of commit c6345dfa6e3e ("Makefile.extrawarn: turn on
      missing-prototypes globally") forces the issue.
      
      The efika and chrp platform code have (static) functions with the same
      names but different signatures. We may as well eliminate the potential
      for conflicts and confusion by renaming the globally visible versions
      as their prototypes get moved out of the CONFIG_EEH-guarded region;
      their current names are too generic anyway. Since they operate on
      objects of the type 'struct pci_dn *', give them the slightly more
      verbose prefix "rtas_pci_dn_" and fix up all the call sites.
      
      Fixes: c6345dfa6e3e ("Makefile.extrawarn: turn on missing-prototypes globally")
      Reported-by: default avatarLinux Kernel Functional Testing <lkft@linaro.org>
      Closes: https://lore.kernel.org/linuxppc-dev/CA+G9fYt0LLXtjSz+Hkf3Fhm-kf0ZQanrhUS+zVZGa3O+Wt2+vg@mail.gmail.com/Signed-off-by: default avatarNathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231127-rtas-pci-rw-config-v1-1-385d29ace3df@linux.ibm.com
      9be4feb7
  6. 27 Nov, 2023 4 commits