1. 03 Dec, 2020 6 commits
  2. 02 Dec, 2020 8 commits
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 3bb61aa6
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "I'm sad to say that we've got an unusually large arm64 fixes pull for
        rc7 which addresses numerous significant instrumentation issues with
        our entry code.
      
        Without these patches, lockdep is hopelessly unreliable in some
        configurations [1,2] and syzkaller is therefore not a lot of use
        because it's so noisy.
      
        Although much of this has always been broken, it appears to have been
        exposed more readily by other changes such as 044d0d6d ("lockdep:
        Only trace IRQ edges") and general lockdep improvements around IRQ
        tracing and NMIs.
      
        Fixing this properly required moving much of the instrumentation hooks
        from our entry assembly into C, which Mark has been working on for the
        last few weeks. We're not quite ready to move to the recently added
        generic functions yet, but the code here has been deliberately written
        to mimic that closely so we can look at cleaning things up once we
        have a bit more breathing room.
      
        Having said all that, the second version of these patches was posted
        last week and I pushed it into our CI (kernelci and cki) along with a
        commit which forced on PROVE_LOCKING, NO_HZ_FULL and
        CONTEXT_TRACKING_FORCE. The result? We found a real bug in the
        md/raid10 code [3].
      
        Oh, and there's also a really silly typo patch that's unrelated.
      
        Summary:
      
         - Fix numerous issues with instrumentation and exception entry
      
         - Fix hideous typo in unused register field definition"
      
      [1] https://lore.kernel.org/r/CACT4Y+aAzoJ48Mh1wNYD17pJqyEcDnrxGfApir=-j171TnQXhw@mail.gmail.com
      [2] https://lore.kernel.org/r/20201119193819.GA2601289@elver.google.com
      [3] https://lore.kernel.org/r/94c76d5e-466a-bc5f-e6c2-a11b65c39f83@redhat.com
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: mte: Fix typo in macro definition
        arm64: entry: fix EL1 debug transitions
        arm64: entry: fix NMI {user, kernel}->kernel transitions
        arm64: entry: fix non-NMI kernel<->kernel transitions
        arm64: ptrace: prepare for EL1 irq/rcu tracking
        arm64: entry: fix non-NMI user<->kernel transitions
        arm64: entry: move el1 irq/nmi logic to C
        arm64: entry: prepare ret_to_user for function call
        arm64: entry: move enter_from_user_mode to entry-common.c
        arm64: entry: mark entry code as noinstr
        arm64: mark idle code as noinstr
        arm64: syscall: exit userspace before unmasking exceptions
      3bb61aa6
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 2c6ffa9e
      Linus Torvalds authored
      Pull vdpa fixes from Michael Tsirkin:
       "A couple of fixes that surfaced at the last minute"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        vhost_vdpa: return -EFAULT if copy_to_user() fails
        vdpa: mlx5: fix vdpa/vhost dependencies
      2c6ffa9e
    • Merge tag 'sound-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · bb95d607
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Here are the pending sound fixes for 5.10: all small device-specific
        fixes, and nothing particular stands out, so far"
      
      * tag 'sound-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda/realtek: Add mute LED quirk to yet another HP x360 model
        ALSA: hda/realtek: Fix bass speaker DAC assignment on Asus Zephyrus G14
        ALSA: hda/generic: Add option to enforce preferred_dacs pairs
        ALSA: usb-audio: US16x08: fix value count for level meters
        ALSA: hda/realtek - Add new codec supported for ALC897
        ASoC: rt5682: change SAR voltage threshold
        ASoC: wm_adsp: fix error return code in wm_adsp_load()
        ALSA: hda/realtek: Enable headset of ASUS UX482EG & B9400CEA with ALC294
        ASoC: qcom: Fix enabling BCLK and LRCLK in LPAIF invalid state
        ALSA: hda/realtek - Fixed Dell AIO wrong sound tone
        ASoC: Intel: bytcr_rt5640: Fix HP Pavilion x2 Detachable quirks
      bb95d607
    • Merge tag 'trace-v5.10-rc6-bootconfig' of... · 8a02ec8f
      Linus Torvalds authored
      Merge tag 'trace-v5.10-rc6-bootconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull bootconfig fixes from Steven Rostedt:
       "Have bootconfig size and checksum be little endian
      
        In case the bootconfig is created on one kind of endian machine, and
        then read on the other kind of endian kernel, the size and checksum
        will be incorrect. Instead, have both the size and checksum always be
        little endian and have the tool and the kernel convert it from little
        endian to or from the host endian"
      
      * tag 'trace-v5.10-rc6-bootconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        docs: bootconfig: Add the endianness of fields
        tools/bootconfig: Store size and checksum in footer as le32
        bootconfig: Load size and checksum in the footer as le32
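
The endianness fix described above can be illustrated with a minimal userspace sketch. This is not the kernel's or the tool's code: the helper names (put_le32, get_le32, footer_write, footer_read) and the 8-byte size-then-checksum layout are assumptions made for illustration only. The point is that encoding byte by byte gives the same on-disk bytes on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value as little endian bytes. */
static void put_le32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v & 0xff);
	dst[1] = (uint8_t)((v >> 8) & 0xff);
	dst[2] = (uint8_t)((v >> 16) & 0xff);
	dst[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Hypothetical helper: load a little endian 32-bit value into host order. */
static uint32_t get_le32(const uint8_t *src)
{
	return (uint32_t)src[0] | ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
}

/* Illustrative footer layout (an assumption): size, then checksum. */
static void footer_write(uint8_t footer[8], uint32_t size, uint32_t csum)
{
	assert(footer != NULL);
	put_le32(footer, size);
	put_le32(footer + 4, csum);
}

static void footer_read(const uint8_t footer[8], uint32_t *size, uint32_t *csum)
{
	assert(footer != NULL);
	*size = get_le32(footer);
	*csum = get_le32(footer + 4);
}
```

Because neither helper depends on the host byte order, a footer written on a big endian machine reads back correctly on a little endian one, and vice versa.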
      8a02ec8f
    • s390: fix irq state tracing · b1cae1f8
      Heiko Carstens authored
      With commit 58c644ba ("sched/idle: Fix arch_cpu_idle() vs
      tracing") common code calls arch_cpu_idle() with a lockdep state that
      indicates irqs are on.
      
      This doesn't work very well for s390: psw_idle() will enable interrupts
      to wait for an interrupt. As soon as an interrupt occurs the interrupt
      handler will verify if the old context was psw_idle(). If that is the
      case the interrupt enablement bits in the old program status word will
      be cleared.
      
      A subsequent test in both the external as well as the io interrupt
      handler checks if in the old context interrupts were enabled. Due to
      the above patching of the old program status word it is assumed the
      old context had interrupts disabled, and therefore a call to
      TRACE_IRQS_OFF (aka trace_hardirqs_off_caller) is skipped. Which in
      turn makes lockdep incorrectly "think" that interrupts are enabled
      within the interrupt handler.
      
      Fix this by unconditionally calling TRACE_IRQS_OFF when entering
      interrupt handlers. Also unconditionally call TRACE_IRQS_ON when
      leaving interrupt handlers.
      
      This leaves the special psw_idle() case, which now returns with
      interrupts disabled, but has an "irqs on" lockdep state. So callers of
      psw_idle() must adjust the state on their own, if required. This is
      currently only __udelay_disabled().
      
      Fixes: 58c644ba ("sched/idle: Fix arch_cpu_idle() vs tracing")
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      b1cae1f8
    • s390/pci: fix CPU address in MSI for directed IRQ · a2bd4097
      Alexander Gordeev authored
      The directed MSIs are delivered to CPUs whose address is
      written to the MSI message address. The current code assumes
      that a CPU logical number (as it is seen by the kernel)
      is also the CPU address.
      
      The above assumption is not correct, as the CPU address
      is rather the value returned by the STAP instruction. That
      value does not necessarily match the kernel logical CPU
      number.
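
The difference between a logical CPU number and a CPU address can be sketched as follows. Everything here is hypothetical: the lookup table, its values, and the MSI address encoding (base and shift) are invented for illustration and do not reflect the s390 implementation.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS		4
#define SKETCH_MSI_BASE		0xF0000000ULL	/* made-up base address */
#define SKETCH_MSI_CPU_SHIFT	8		/* made-up field position */

/*
 * Hypothetical table: index is the kernel's logical CPU number, value is
 * the CPU address that CPU would report. The values are invented.
 */
static const uint16_t sketch_cpu_address[SKETCH_NR_CPUS] = {
	0x0000, 0x0002, 0x0004, 0x0006,
};

/* Buggy variant: assumes the logical number is the CPU address. */
static uint64_t sketch_msi_addr_buggy(unsigned int cpu)
{
	return SKETCH_MSI_BASE | ((uint64_t)cpu << SKETCH_MSI_CPU_SHIFT);
}

/* Fixed variant: translates the logical number to the CPU address first. */
static uint64_t sketch_msi_addr_fixed(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	return SKETCH_MSI_BASE |
	       ((uint64_t)sketch_cpu_address[cpu] << SKETCH_MSI_CPU_SHIFT);
}
```

The two variants agree only where the logical number happens to equal the address (CPU 0 in this made-up table), which is why such a bug can go unnoticed on some configurations.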
      
      Fixes: e979ce7b ("s390/pci: provide support for CPU directed interrupts")
      Cc: <stable@vger.kernel.org> # v5.2+
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
      Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      a2bd4097
    • vhost_vdpa: return -EFAULT if copy_to_user() fails · 2c602741
      Dan Carpenter authored
      The copy_to_user() function returns the number of bytes remaining to be
      copied, but the caller should return -EFAULT to the user instead.
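
A minimal userspace sketch of the pattern follows. mock_copy_to_user() is a stand-in that only mimics the "returns bytes not copied" convention, and struct sketch_range and both get_range functions are hypothetical names, not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-in for copy_to_user(): returns the number of bytes that
 * could NOT be copied. A NULL destination models a faulting user pointer.
 */
static unsigned long mock_copy_to_user(void *to, const void *from,
				       unsigned long n)
{
	if (to == NULL)
		return n;
	memcpy(to, from, n);
	return 0;
}

struct sketch_range {
	unsigned long long first;
	unsigned long long last;
};

/* Buggy pattern: leaks the "bytes remaining" count as the return value. */
static long get_range_buggy(struct sketch_range *uptr,
			    const struct sketch_range *r)
{
	assert(r != NULL);
	return (long)mock_copy_to_user(uptr, r, sizeof(*r));
}

/* Fixed pattern: any short copy becomes -EFAULT. */
static long get_range_fixed(struct sketch_range *uptr,
			    const struct sketch_range *r)
{
	assert(r != NULL);
	if (mock_copy_to_user(uptr, r, sizeof(*r)))
		return -EFAULT;
	return 0;
}
```

With the buggy pattern a failed copy returns a positive byte count, which a caller checking for negative error codes would treat as success.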
      
      Fixes: 1b48dc03 ("vhost: vdpa: report iova range")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Link: https://lore.kernel.org/r/X8c32z5EtDsMyyIL@mwanda
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      2c602741
    • vdpa: mlx5: fix vdpa/vhost dependencies · 98701a2a
      Randy Dunlap authored
      drivers/vdpa/mlx5/ uses vhost_iotlb*() interfaces, so select
      VHOST_IOTLB to make them be built.
      
      However, if VHOST_IOTLB is the only VHOST symbol that is
      set/enabled, the object file still won't be built because
      drivers/Makefile won't descend into drivers/vhost/ to build it,
      so make drivers/Makefile build the needed binary whenever
      VHOST_IOTLB is set, like it does for VHOST_RING.
      
      Fixes these build errors:
      ERROR: modpost: "vhost_iotlb_itree_next" [drivers/vdpa/mlx5/mlx5_vdpa.ko] undefined!
      ERROR: modpost: "vhost_iotlb_itree_first" [drivers/vdpa/mlx5/mlx5_vdpa.ko] undefined!
      
      Fixes: 29064bfd ("vdpa/mlx5: Add support library for mlx5 VDPA implementation")
      Fixes: aff90770 ("vdpa/mlx5: Fix dependency on MLX5_CORE")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Eli Cohen <eli@mellanox.com>
      Cc: Parav Pandit <parav@mellanox.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Cc: virtualization@lists.linux-foundation.org
      Cc: Saeed Mahameed <saeedm@nvidia.com>
      Cc: Leon Romanovsky <leonro@nvidia.com>
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/r/20201128213905.27409-1-rdunlap@infradead.org
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      98701a2a
  3. 01 Dec, 2020 14 commits
  4. 30 Nov, 2020 12 commits
    • gfs2: Fix deadlock between gfs2_{create_inode,inode_lookup} and delete_work_func · dd0ecf54
      Andreas Gruenbacher authored
      In gfs2_create_inode and gfs2_inode_lookup, make sure to cancel any pending
      delete work before taking the inode glock.  Otherwise, gfs2_cancel_delete_work
      may block waiting for delete_work_func to complete, and delete_work_func may
      block trying to acquire the inode glock in gfs2_inode_lookup.
      Reported-by: Alexander Aring <aahringo@redhat.com>
      Fixes: a0e3cc65 ("gfs2: Turn gl_delete into a delayed work")
      Cc: stable@vger.kernel.org # v5.8+
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      dd0ecf54
    • cifs: fix potential use-after-free in cifs_echo_request() · 21225336
      Paulo Alcantara authored
      This patch fixes a potential use-after-free bug in
      cifs_echo_request().
      
      For instance,
      
        thread 1
        --------
        cifs_demultiplex_thread()
          clean_demultiplex_info()
            kfree(server)
      
        thread 2 (workqueue)
        --------
        apic_timer_interrupt()
          smp_apic_timer_interrupt()
            irq_exit()
              __do_softirq()
                run_timer_softirq()
                  call_timer_fn()
                    cifs_echo_request() <- use-after-free in server ptr
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      CC: Stable <stable@vger.kernel.org>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      21225336
    • cifs: allow syscalls to be restarted in __smb_send_rqst() · 6988a619
      Paulo Alcantara authored
      A customer has reported that several files in their multi-threaded app
      were left with size of 0 because most of the read(2) calls returned
      -EINTR and they assumed no bytes were read.  Obviously, they could
      have fixed it by simply retrying on -EINTR.
      
      We noticed that most of the -EINTR results on read(2) were due to the
      real-time signal (SIGRT_1) that glibc sends to apply credential changes
      process-wide,
      and its signal handler had been established with SA_RESTART, in which
      case those calls could have been automatically restarted by the
      kernel.
      
      Let the kernel decide whether or not to restart the syscalls when
      there is a signal pending in __smb_send_rqst() by returning
      -ERESTARTSYS.  If it can't, it will return -EINTR anyway.
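
For comparison, the application-side workaround mentioned above (retrying on -EINTR) can be sketched like this. read_retry() is a hypothetical helper name; it simply reissues read(2) when the call was interrupted before any data was transferred.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: retry read(2) when it fails with EINTR, i.e. when a
 * signal interrupted the call before any bytes were transferred.
 */
static ssize_t read_retry(int fd, void *buf, size_t count)
{
	ssize_t n;

	assert(buf != NULL);
	do {
		n = read(fd, buf, count);
	} while (n < 0 && errno == EINTR);

	return n;
}
```

When the kernel returns -ERESTARTSYS internally and the handler was installed with SA_RESTART, the restart happens transparently and userspace never sees EINTR, so a loop like this becomes a fallback rather than a requirement.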
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      CC: Stable <stable@vger.kernel.org>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      6988a619
    • ring-buffer: Set the right timestamp in the slow path of __rb_reserve_next() · 8785f51a
      Andrea Righi authored
      In the slow path of __rb_reserve_next() nested events can happen
      between evaluating the timestamp delta of the current event and updating
      write_stamp via local_cmpxchg(); in this case the delta is not valid
      anymore and it should be set to 0 (same timestamp as the interrupting
      event), since the event that we are currently processing is not the last
      event in the buffer.
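
The race can be modelled in userspace with C11 atomics. This is a simplified sketch, not the ring buffer code: struct sketch_buffer, reserve_delta(), and the nested callback that simulates an interrupting event are all hypothetical, and the real slow path checks more conditions than a single compare-and-exchange.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_buffer {
	_Atomic uint64_t write_stamp;
};

/* Hypothetical hook used to simulate an event nesting inside ours. */
typedef void (*nested_fn)(struct sketch_buffer *b);

static uint64_t reserve_delta(struct sketch_buffer *b, uint64_t ts,
			      nested_fn nested)
{
	uint64_t before = atomic_load(&b->write_stamp);
	uint64_t delta;

	assert(ts >= before);
	delta = ts - before;

	if (nested)
		nested(b);	/* window in which an interrupt may log an event */

	/*
	 * If write_stamp changed under us, an interrupting event was written
	 * after the delta was computed. Our event is no longer the last one,
	 * so the delta is stale: use 0 (same timestamp as the interrupter).
	 */
	if (!atomic_compare_exchange_strong(&b->write_stamp, &before, ts))
		delta = 0;

	return delta;
}

/* Simulated nested event that moves the write stamp forward. */
static void nested_event(struct sketch_buffer *b)
{
	atomic_store(&b->write_stamp, 150);
}
```

In the uninterrupted case the delta is the plain difference; in the interrupted case the write stamp keeps the nested event's value and the outer event records a delta of 0.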
      
      Link: https://lkml.kernel.org/r/X8IVJcp1gRE+FJCJ@xps-13-7390
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lwn.net/Articles/831207
      Fixes: a389d86f ("ring-buffer: Have nested events still record running time stamp")
      Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      8785f51a
    • ring-buffer: Update write stamp with the correct ts · 55ea4cf4
      Steven Rostedt (VMware) authored
      The write stamp, used to calculate deltas between events, was updated with
      the stale "ts" value in the "info" structure, and not with the updated "ts"
      variable. This caused the deltas between events to be inaccurate and,
      when crossing into a new sub buffer, made time appear to go backwards.
      
      Link: https://lkml.kernel.org/r/20201124223917.795844-1-elavila@google.com
      
      Cc: stable@vger.kernel.org
      Fixes: a389d86f ("ring-buffer: Have nested events still record running time stamp")
      Reported-by: "J. Avila" <elavila@google.com>
      Tested-by: Daniel Mentz <danielmentz@google.com>
      Tested-by: Will McVicker <willmcvicker@google.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      55ea4cf4
    • arm64: mte: Fix typo in macro definition · 9e5344e0
      Vincenzo Frascino authored
      UL in the definition of SYS_TFSR_EL1_TF1 was misspelled, causing
      compilation issues when trying to implement in-kernel MTE async
      mode.

      Fix the macro by correcting the typo.
      
      Note: MTE async mode will be introduced with a future series.
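
Why a typo like this can sit unnoticed is easy to show in isolation: the preprocessor only expands a macro where it is used. The sketch below uses invented SKETCH_* names and a local UL() definition (assumptions, not the arm64 headers); the misspelled UK() never reaches the compiler because SKETCH_TF1_TYPO is never used.

```c
#include <assert.h>

#define UL(x)			x##UL	/* local stand-in for the kernel's UL() */
#define SKETCH_TF1_SHIFT	1	/* illustrative bit position */

/* Misspelled helper name: harmless until something expands this macro. */
#define SKETCH_TF1_TYPO		(UK(1) << SKETCH_TF1_SHIFT)

/* Corrected definition. */
#define SKETCH_TF1		(UL(1) << SKETCH_TF1_SHIFT)

static_assert(SKETCH_TF1 == 2UL, "corrected macro must expand to bit 1");

static unsigned long sketch_tf1(void)
{
	return SKETCH_TF1;
}
```

Replacing SKETCH_TF1 with SKETCH_TF1_TYPO inside sketch_tf1() would turn the latent typo into a build failure, which matches the situation described above where the error only surfaced once new code started using the field.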
      
      Fixes: c058b1c4 ("arm64: mte: system register definitions")
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Link: https://lore.kernel.org/r/20201130170709.22309-1-vincenzo.frascino@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      9e5344e0
    • arm64: entry: fix EL1 debug transitions · 2a9b3e6a
      Mark Rutland authored
      In debug_exception_enter() and debug_exception_exit() we trace hardirqs
      on/off while RCU isn't guaranteed to be watching, and we don't save and
      restore the hardirq state, and so may return with this having changed.
      
      Handle this appropriately with new entry/exit helpers which do the bare
      minimum to ensure this is appropriately maintained, without marking
      debug exceptions as NMIs. These are placed in entry-common.c with the
      other entry/exit helpers.
      
      In future we'll want to reconsider whether some debug exceptions should
      be NMIs, but this will require a significant refactoring, and for now
      this should prevent issues with lockdep and RCU.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201130115950.22492-12-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      2a9b3e6a
    • arm64: entry: fix NMI {user, kernel}->kernel transitions · f0cd5ac1
      Mark Rutland authored
      Exceptions which can be taken at (almost) any time are considered to be
      NMIs. On arm64 that includes:
      
      * SDEI events
      * GICv3 Pseudo-NMIs
      * Kernel stack overflows
      * Unexpected/unhandled exceptions
      
      ... but currently debug exceptions (BRKs, breakpoints, watchpoints,
      single-step) are not considered NMIs.
      
      As these can be taken at any time, kernel features (lockdep, RCU,
      ftrace) may not be in a consistent kernel state. For example, we may
      take an NMI from the idle code or partway through an entry/exit path.
      
      While nmi_enter() and nmi_exit() handle most of this state, notably they
      don't save/restore the lockdep state across an NMI being taken and
      handled. When interrupts are enabled and an NMI is taken, lockdep may
      see interrupts become disabled within the NMI code, but not see
      interrupts become enabled when returning from the NMI, leaving lockdep
      believing interrupts are disabled when they are actually enabled.
      
      The x86 code handles this in idtentry_{enter,exit}_nmi(), which will
      shortly be moved to the generic entry code. As we can't use either yet,
      we copy the x86 approach in arm64-specific helpers. All the NMI
      entrypoints are marked as noinstr to prevent any instrumentation
      handling code being invoked before the state has been corrected.
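
The save/restore idea can be modelled with a small userspace sketch. The single boolean standing in for lockdep's view of the hardirq state, struct nmi_state_sketch, and both function names are hypothetical; the real helpers also deal with RCU, ftrace, and the NMI count.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for lockdep's per-CPU view of whether hardirqs are enabled. */
static bool lockdep_irqs_enabled = true;

/* Hypothetical per-exception state saved on entry. */
struct nmi_state_sketch {
	bool lockdep;
};

static void sketch_nmi_enter(struct nmi_state_sketch *s)
{
	s->lockdep = lockdep_irqs_enabled;	/* remember the interrupted state */
	lockdep_irqs_enabled = false;		/* NMI code runs with IRQs masked */
}

static void sketch_nmi_exit(const struct nmi_state_sketch *s)
{
	assert(!lockdep_irqs_enabled);
	if (s->lockdep)
		lockdep_irqs_enabled = true;	/* restore only if it was on */
}
```

Without the restore step in sketch_nmi_exit(), an NMI taken with interrupts enabled would leave the model permanently reporting them as disabled, which is the inconsistency described above.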
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201130115950.22492-11-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      f0cd5ac1
    • arm64: entry: fix non-NMI kernel<->kernel transitions · 7cd1ea10
      Mark Rutland authored
      There are periods in kernel mode when RCU is not watching and/or the
      scheduler tick is disabled, but we can still take exceptions such as
      interrupts. The arm64 exception handlers do not account for this, and
      it's possible that RCU is not watching while an exception handler runs.
      
      The x86/generic entry code handles this by ensuring that all (non-NMI)
      kernel exception handlers call irqentry_enter() and irqentry_exit(),
      which handle RCU, lockdep, and IRQ flag tracing. We can't yet move to
      the generic entry code, and already handle the user<->kernel transitions
      elsewhere, so we add new kernel<->kernel transition helpers along the
      lines of the generic entry code.
      
      Since we now track interrupts becoming masked when an exception is
      taken, local_daif_inherit() is modified to track interrupts becoming
      re-enabled when the original context is inherited. To balance the
      entry/exit paths, each handler masks all DAIF exceptions before
      exit_to_kernel_mode().
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201130115950.22492-10-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      7cd1ea10
    • arm64: ptrace: prepare for EL1 irq/rcu tracking · 1ec2f2c0
      Mark Rutland authored
      Exceptions from EL1 may be taken when RCU isn't watching (e.g. in idle
      sequences), or when the lockdep hardirqs state is transiently out-of-sync with
      the hardware state (e.g. in the middle of local_irq_enable()). To
      correctly handle these cases, we'll need to save/restore this state
      across some exceptions taken from EL1.
      
      A series of subsequent patches will update EL1 exception handlers to
      handle this. In preparation for this, and to avoid dependencies between
      those patches, this patch adds two new fields to struct pt_regs so that
      exception handlers can track this state.
      
      Note that this is placed in pt_regs as some entry/exit sequences such as
      el1_irq are invoked from assembly, which makes it very difficult to add
      a separate structure as with the irqentry_state used by x86. We can
      separate this once more of the exception logic is moved to C. While the
      fields only need to be bool, they are both made u64 to keep pt_regs
      16-byte aligned.
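
The size argument can be checked with a toy layout. The structs below are invented and much smaller in meaning than arm64's struct pt_regs, and the field names are assumptions; the only point is that appending two bool fields breaks a 16-byte size multiple, while two u64 fields preserve it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Invented base layout: 34 * 8 = 272 bytes, a multiple of 16. */
struct regs_base {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
};

/* Appending two bools: the size is padded, but not to a multiple of 16. */
struct regs_bool {
	struct regs_base base;
	bool lockdep_hardirqs;	/* hypothetical field name */
	bool exit_rcu;		/* hypothetical field name */
};

/* Appending two u64s: 272 + 16 = 288 bytes, still a multiple of 16. */
struct regs_u64 {
	struct regs_base base;
	uint64_t lockdep_hardirqs;
	uint64_t exit_rcu;
};

static_assert(sizeof(struct regs_base) % 16 == 0, "base must be 16-byte sized");
static_assert(sizeof(struct regs_u64) % 16 == 0, "u64 fields keep the size");
```

A compile-time assertion like the last one is a cheap way to catch a later change that silently breaks the size requirement.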
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201130115950.22492-9-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      1ec2f2c0
    • arm64: entry: fix non-NMI user<->kernel transitions · 23529049
      Mark Rutland authored
      When built with PROVE_LOCKING, NO_HZ_FULL, and CONTEXT_TRACKING_FORCE, the
      kernel will WARN() at boot time that interrupts are enabled when we call
      context_tracking_user_enter(), despite the DAIF flags indicating that
      IRQs are masked.
      
      The problem is that we're not tracking IRQ flag changes accurately, and
      so lockdep believes interrupts are enabled when they are not (and
      vice-versa). We can shuffle things to make this more accurate. For
      kernel->user transitions there are a number of constraints we need to
      consider:
      
      1) When we call __context_tracking_user_enter() HW IRQs must be disabled
         and lockdep must be up-to-date with this.
      
      2) Userspace should be treated as having IRQs enabled from the PoV of
         both lockdep and tracing.
      
      3) As context_tracking_user_enter() stops RCU from watching, we cannot
         use RCU after calling it.
      
      4) IRQ flag tracing and lockdep have state that must be manipulated
         before RCU is disabled.
      
      ... with similar constraints applying for user->kernel transitions, with
      the ordering reversed.
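
The ordering constraints listed above can be expressed as a small userspace model. All names and the four booleans are hypothetical stand-ins for hardware, lockdep, tracing, and RCU state; the asserts encode the constraints and the statement order mirrors the ordering the text describes, not the actual arm64 or generic entry code.

```c
#include <assert.h>
#include <stdbool.h>

static bool hw_irqs_enabled;		/* false: IRQs are masked in hardware */
static bool lockdep_irqs_enabled;	/* lockdep's view of the IRQ state */
static bool tracing_irqs_enabled;	/* IRQ flag tracing's view */
static bool rcu_watching = true;

/* kernel -> user */
static void sketch_exit_to_user_mode(void)
{
	/* Constraint 1: IRQs masked and lockdep agrees before context tracking. */
	assert(!hw_irqs_enabled && !lockdep_irqs_enabled);

	/* Constraint 4: tracing state is updated while RCU is still usable. */
	assert(rcu_watching);
	tracing_irqs_enabled = true;

	/* Constraint 3: context tracking stops RCU from watching. */
	rcu_watching = false;

	/* Constraint 2: userspace counts as IRQs-on; this step needs no RCU. */
	lockdep_irqs_enabled = true;
}

/* user -> kernel: the same steps with the ordering reversed */
static void sketch_enter_from_user_mode(void)
{
	assert(!hw_irqs_enabled);	/* exception entry masks IRQs */
	lockdep_irqs_enabled = false;	/* no RCU needed for this step */
	rcu_watching = true;		/* context tracking wakes RCU */
	tracing_irqs_enabled = false;	/* tracing is safe now that RCU watches */
}
```

Swapping any two steps in sketch_exit_to_user_mode() either trips an assert or leaves one of the views out of sync, which is a compact way to see why the ordering matters.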
      
      The generic entry code has enter_from_user_mode() and
      exit_to_user_mode() helpers to handle this. We can't use those directly,
      so we add arm64 copies for now (without the instrumentation markers
      which aren't used on arm64). These replace the existing user_exit() and
      user_exit_irqoff() calls spread throughout handlers, and the exception
      unmasking is left as-is.
      
      Note that:
      
      * The accounting for debug exceptions from userspace now happens in
        el0_dbg() and ret_to_user(), so this is removed from
        debug_exception_enter() and debug_exception_exit(). As
        user_exit_irqoff() wakes RCU, the userspace-specific check is removed.
      
      * The accounting for syscalls now happens in el0_svc(),
        el0_svc_compat(), and ret_to_user(), so this is removed from
        el0_svc_common(). This does not adversely affect the workaround for
        erratum 1463225, as this does not depend on any of the state tracking.
      
      * In ret_to_user() we mask interrupts with local_daif_mask(), and so we
        need to inform lockdep and tracing. Here a trace_hardirqs_off() is
        sufficient and safe as we have not yet exited kernel context and RCU
        is usable.
      
      * As PROVE_LOCKING selects TRACE_IRQFLAGS, the ifdeferry in entry.S only
        needs to check for the latter.
      
      * EL0 SError handling will be dealt with in a subsequent patch, as this
        needs to be treated as an NMI.
      
      Prior to this patch, booting an appropriately-configured kernel would
      result in spats as below:
      
      | DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
      | WARNING: CPU: 2 PID: 1 at kernel/locking/lockdep.c:5280 check_flags.part.54+0x1dc/0x1f0
      | Modules linked in:
      | CPU: 2 PID: 1 Comm: init Not tainted 5.10.0-rc3 #3
      | Hardware name: linux,dummy-virt (DT)
      | pstate: 804003c5 (Nzcv DAIF +PAN -UAO -TCO BTYPE=--)
      | pc : check_flags.part.54+0x1dc/0x1f0
      | lr : check_flags.part.54+0x1dc/0x1f0
      | sp : ffff80001003bd80
      | x29: ffff80001003bd80 x28: ffff66ce801e0000
      | x27: 00000000ffffffff x26: 00000000000003c0
      | x25: 0000000000000000 x24: ffffc31842527258
      | x23: ffffc31842491368 x22: ffffc3184282d000
      | x21: 0000000000000000 x20: 0000000000000001
      | x19: ffffc318432ce000 x18: 0080000000000000
      | x17: 0000000000000000 x16: ffffc31840f18a78
      | x15: 0000000000000001 x14: ffffc3184285c810
      | x13: 0000000000000001 x12: 0000000000000000
      | x11: ffffc318415857a0 x10: ffffc318406614c0
      | x9 : ffffc318415857a0 x8 : ffffc31841f1d000
      | x7 : 647261685f706564 x6 : ffffc3183ff7c66c
      | x5 : ffff66ce801e0000 x4 : 0000000000000000
      | x3 : ffffc3183fe00000 x2 : ffffc31841500000
      | x1 : e956dc24146b3500 x0 : 0000000000000000
      | Call trace:
      |  check_flags.part.54+0x1dc/0x1f0
      |  lock_is_held_type+0x10c/0x188
      |  rcu_read_lock_sched_held+0x70/0x98
      |  __context_tracking_enter+0x310/0x350
      |  context_tracking_enter.part.3+0x5c/0xc8
      |  context_tracking_user_enter+0x6c/0x80
      |  finish_ret_to_user+0x2c/0x13c
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201130115950.22492-8-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      23529049
    • arm64: entry: move el1 irq/nmi logic to C · 105fc335
      Mark Rutland authored
      In preparation for reworking the EL1 irq/nmi entry code, move the
      existing logic to C. We no longer need the asm_nmi_enter() and
      asm_nmi_exit() wrappers, so these are removed. The new C functions are
      marked noinstr, which prevents compiler instrumentation and runtime
      probing.
      
      In subsequent patches we'll want the new C helpers to be called in all
      cases, so we don't bother wrapping the calls with ifdeferry. Even when
      the new C functions are stubs the trivial calls are unlikely to have a
      measurable impact on the IRQ or NMI paths anyway.
      
      Prototypes are added to <asm/exception.h> as otherwise (in some
      configurations) GCC will complain about the lack of a forward
      declaration. We already do this for existing functions, e.g.
      enter_from_user_mode().
      
      The new helpers are marked as noinstr (which prevents all
      instrumentation, tracing, and kprobes). Otherwise, there should be no
      functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201130115950.22492-7-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      105fc335