1. 28 May, 2020 5 commits
    • squashfs: Make use of local lock in multi_cpu decompressor · fd56200a
      Julia Cartwright authored
      The squashfs multi CPU decompressor makes use of get_cpu_ptr() to
      acquire a pointer to per-CPU data. get_cpu_ptr() implicitly disables
      preemption which serializes the access to the per-CPU data.
      
      But decompression can take quite some time depending on the size. The
      observed preempt disabled times in real world scenarios went up to 8ms,
      causing massive wakeup latencies. This happens on all CPUs as the
      decompression is fully parallelized.
      
      Replace the implicit preemption control with an explicit local lock.
      This allows RT kernels to substitute it with a real per CPU lock, which
      serializes the access but keeps the code section preemptible. On non RT
      kernels this maps to preempt_disable() as before, i.e. no functional
      change.
      
      [ bigeasy: Use local_lock(), patch description]
      Reported-by: Alexander Stein <alexander.stein@systec-electronic.com>
      Signed-off-by: Julia Cartwright <julia@ni.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20200527201119.1692513-5-bigeasy@linutronix.de
    • mm/swap: Use local_lock for protection · b01b2141
      Ingo Molnar authored
      The various struct pagevec per CPU variables are protected by disabling
      either preemption or interrupts across the critical sections. Inside
      these sections spinlocks have to be acquired.
      
      These spinlocks are regular spinlock_t types which are converted to
      "sleeping" spinlocks on PREEMPT_RT enabled kernels. Obviously sleeping
      locks cannot be acquired in preemption or interrupt disabled sections.
      
      local locks provide a trivial way to substitute preempt and interrupt
      disable instances. On a non PREEMPT_RT enabled kernel local_lock() maps
      to preempt_disable() and local_lock_irq() to local_irq_disable().
      
      Create lru_rotate_pvecs containing the pagevec and the local lock.
      Create lru_pvecs containing the remaining pagevecs and the local lock.
      Add lru_add_drain_cpu_zone() which is used from compact_zone() to avoid
      exporting the pvec structure.
      
      Change the relevant call sites to acquire these locks instead of using
      preempt_disable() / get_cpu() / get_cpu_var() and local_irq_disable() /
      local_irq_save().
      
      There is neither a functional change nor a change in the generated
      binary code for non PREEMPT_RT enabled non-debug kernels.
      
      When lockdep is enabled local locks have lockdep maps embedded. These
      allow lockdep to validate the protections, i.e. inappropriate usage of a
      preemption-only protected section would result in a lockdep warning
      while the same problem would not be noticed with a plain
      preempt_disable() based protection.
      
      local locks also improve readability as they provide a named scope for
      the protections, whereas preempt/interrupt disable sections are opaque
      and scopeless.
      
      Finally local locks allow PREEMPT_RT to substitute them with real
      locking primitives to ensure the correctness of operation in a fully
      preemptible kernel.
      
      [ bigeasy: Adapted to use local_lock ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20200527201119.1692513-4-bigeasy@linutronix.de
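
      The mm/swap commit above bundles each group of pagevecs with the local
      lock that protects it, so the protection scope has a name that tooling can
      check. The following is a minimal userspace sketch of that idea, not
      kernel code: every demo_* name is hypothetical, a plain flag stands in for
      the local lock, and the "not locked" result is only loosely analogous to a
      lockdep warning (real lockdep tracks usage contexts, which this model does
      not attempt).

```c
/* Userspace model only, not kernel code. Every demo_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum { DEMO_PVEC_SIZE = 4 };   /* arbitrary capacity chosen for the sketch */

enum demo_status { DEMO_OK, DEMO_FULL, DEMO_NOT_LOCKED };

struct demo_pagevec {
    unsigned int nr;
    int pages[DEMO_PVEC_SIZE];   /* ints stand in for struct page pointers */
};

/* The lock lives next to the data it protects, which names the scope. */
struct demo_lru_pvecs {
    bool lock_held;              /* stand-in for the local lock */
    struct demo_pagevec lru_add;
};

static struct demo_lru_pvecs demo_pvecs;   /* a single "CPU" for simplicity */

static void demo_lock(struct demo_lru_pvecs *p)
{
    assert(!p->lock_held);
    p->lock_held = true;
}

static void demo_unlock(struct demo_lru_pvecs *p)
{
    assert(p->lock_held);
    p->lock_held = false;
}

/* Refuses to touch the pagevec unless the caller is inside the locked scope. */
static enum demo_status demo_pagevec_add(struct demo_lru_pvecs *p, int page)
{
    if (!p->lock_held)
        return DEMO_NOT_LOCKED;
    if (p->lru_add.nr == DEMO_PVEC_SIZE)
        return DEMO_FULL;
    p->lru_add.pages[p->lru_add.nr++] = page;
    return DEMO_OK;
}
```

      A caller would wrap demo_pagevec_add() in demo_lock()/demo_unlock(); an
      access outside that pair is reported instead of silently succeeding, which
      is the kind of misuse a bare preempt_disable() section cannot flag.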
    • radix-tree: Use local_lock for protection · cfa6705d
      Sebastian Andrzej Siewior authored
      The radix-tree and idr preload mechanisms use preempt_disable() to protect
      the complete operation between xxx_preload() and xxx_preload_end().
      
      As the code inside the preempt disabled section acquires regular spinlocks,
      which are converted to 'sleeping' spinlocks on a PREEMPT_RT kernel and
      eventually calls into a memory allocator, this conflicts with the RT
      semantics.
      
      Convert it to a local_lock which allows RT kernels to substitute them with
      a real per CPU lock. On non RT kernels this maps to preempt_disable() as
      before, but also provides lockdep coverage of the critical region.
      No functional change.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20200527201119.1692513-3-bigeasy@linutronix.de
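
      The preload pattern described in the radix-tree commit above (fill a
      per-CPU cache, keep the protection held while the caller consumes from it,
      release in xxx_preload_end()) can be sketched in userspace as follows.
      This is an illustration under stated assumptions, not the kernel
      implementation: every pre_* name is hypothetical, a flag stands in for the
      local lock, malloc() stands in for the node allocator, and only one "CPU"
      is modeled.

```c
/* Userspace model only, not kernel code. Every pre_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum { PRE_MAX = 4, PRE_NODE_BYTES = 64 };   /* arbitrary sizes for the sketch */

struct pre_cache {
    bool lock_held;          /* stand-in for the local lock */
    int nr;                  /* number of cached nodes */
    void *nodes[PRE_MAX];
};

static struct pre_cache pre_cpu_cache;   /* a single "CPU" for simplicity */

static void pre_lock(struct pre_cache *c)
{
    assert(!c->lock_held);
    c->lock_held = true;
}

static void pre_unlock(struct pre_cache *c)
{
    assert(c->lock_held);
    c->lock_held = false;
}

/*
 * Fill the cache. Returns 0 with the lock held on success, or -1 with the
 * lock released if an allocation fails.
 */
static int pre_preload(struct pre_cache *c)
{
    pre_lock(c);
    while (c->nr < PRE_MAX) {
        void *node;

        pre_unlock(c);                   /* allocate outside the locked scope */
        node = malloc(PRE_NODE_BYTES);
        if (!node)
            return -1;
        pre_lock(c);
        if (c->nr < PRE_MAX)
            c->nodes[c->nr++] = node;
        else
            free(node);                  /* cache was refilled meanwhile */
    }
    return 0;
}

/* Take a preloaded node; only valid between pre_preload() and pre_preload_end(). */
static void *pre_take_node(struct pre_cache *c)
{
    if (!c->lock_held || c->nr == 0)
        return NULL;
    return c->nodes[--c->nr];
}

static void pre_preload_end(struct pre_cache *c)
{
    pre_unlock(c);
}
```

      The point of the sketch is the shape of the scope: the section between
      pre_preload() and pre_preload_end() is the named critical region that the
      commit converts from preempt_disable() to a local lock.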
    • locking: Introduce local_lock() · 91710728
      Thomas Gleixner authored
      preempt_disable() and local_irq_disable/save() are in principle per CPU big
      kernel locks. This has several downsides:
      
        - The protection scope is unknown
      
        - Violation of protection rules is hard to detect by instrumentation
      
        - For PREEMPT_RT such sections, unless in low level critical code, can
          violate the preemptibility constraints.
      
      To address this PREEMPT_RT introduced the concept of local_locks which are
      strictly per CPU.
      
      The lock operations map to preempt_disable(), local_irq_disable/save() and
      the enabling counterparts on non RT enabled kernels.
      
      If lockdep is enabled local locks gain a lock map which tracks the usage
      context. This will catch cases where an area is protected by
      preempt_disable() but the access also happens from interrupt context. local
      locks have identified quite a few such issues over the years; the most
      recent example is:
      
        b7d5dc21 ("random: add a spinlock_t to struct batched_entropy")
      
      Aside from the lockdep coverage this also improves code readability as it
      precisely annotates the protection scope.
      
      PREEMPT_RT substitutes these local locks with 'sleeping' spinlocks to
      protect such sections while maintaining preemptibility and CPU locality.
      
      local locks can replace:
      
        - preempt_disable()/enable() pairs
        - local_irq_disable/enable() pairs
        - local_irq_save/restore() pairs
      
      They are also used to replace code which implicitly disables preemption
      like:
      
        - get_cpu()/put_cpu()
        - get_cpu_var()/put_cpu_var()
      
      with PREEMPT_RT friendly constructs.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20200527201119.1692513-2-bigeasy@linutronix.de
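
      The two mappings described in the local_lock() commit above can be
      illustrated with a small userspace model. This is a sketch under stated
      assumptions, not kernel code: every model_* name is hypothetical, an
      integer stands in for the preempt count, and a flag stands in for the
      per-CPU lock that PREEMPT_RT substitutes.

```c
/* Userspace model only, not kernel code. Every model_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum { MODEL_NR_CPUS = 2 };

struct model_cpu {
    int preempt_count;   /* > 0 stands for "preemption disabled" */
    bool lock_held;      /* stands for the per-CPU lock used on PREEMPT_RT */
    long counter;        /* the per-CPU data being protected */
};

static struct model_cpu model_cpus[MODEL_NR_CPUS];

/* Non-RT mapping: bump the preempt count. RT mapping: take the per-CPU lock. */
static void model_local_lock(struct model_cpu *c, bool rt)
{
    if (rt) {
        assert(!c->lock_held);   /* no recursion in this model */
        c->lock_held = true;
    } else {
        c->preempt_count++;
    }
}

static void model_local_unlock(struct model_cpu *c, bool rt)
{
    if (rt) {
        assert(c->lock_held);
        c->lock_held = false;
    } else {
        c->preempt_count--;
    }
}

/*
 * Increment one CPU's counter inside the protected scope and return the
 * preempt count observed there: 1 for the non-RT mapping, 0 for the RT
 * mapping, where the section stays preemptible.
 */
static int model_counter_inc(int cpu, bool rt)
{
    struct model_cpu *c = &model_cpus[cpu];
    int seen;

    model_local_lock(c, rt);
    c->counter++;
    seen = c->preempt_count;
    model_local_unlock(c, rt);
    return seen;
}
```

      The caller's code is identical in both modes; only what the lock operation
      maps to changes, which is why the commits in this series can describe the
      conversion as having no functional change on non-RT kernels.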
  2. 24 May, 2020 5 commits
    • Linux 5.7-rc7 · 9cb1fd0e
      Linus Torvalds authored
    • Merge tag 'efi-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 98790bba
      Linus Torvalds authored
      Pull EFI fixes from Thomas Gleixner:
       "A set of EFI fixes:
      
         - Don't return a garbage screen info when EFI framebuffer is not
           available
      
         - Make the early EFI console work properly with wider fonts instead
           of drawing garbage
      
         - Prevent a memory buffer leak in allocate_e820()
      
         - Print the firmware error record properly so it can be decoded by
           users
      
         - Fix a symbol clash in the host tool build which only happens with
           newer compilers.
      
         - Add a missing check for the event log version of TPM which caused
           boot failures on several Dell systems due to an attempt to decode
           SHA-1 format with the crypto agile algorithm"
      
      * tag 'efi-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tpm: check event log version before reading final events
        efi: Pull up arch-specific prototype efi_systab_show_arch()
        x86/boot: Mark global variables as static
        efi: cper: Add support for printing Firmware Error Record Reference
        efi/libstub/x86: Avoid EFI map buffer alloc in allocate_e820()
        efi/earlycon: Fix early printk for wider fonts
        efi/libstub: Avoid returning uninitialized data from setup_graphics()
    • Merge tag 'x86-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 667b6249
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "Two fixes for x86:
      
         - Unbreak stack dumps for inactive tasks by interpreting the special
           first frame left by __switch_to_asm() correctly.
      
           The recent change not to skip the first frame so ORC and frame
           unwinder behave in the same way caused all entries to be
           unreliable, i.e. prepended with '?'.
      
         - Use cpumask_available() instead of an implicit NULL check of a
           cpumask_var_t in mmio trace to prevent a Clang build warning"
      
      * tag 'x86-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/unwind/orc: Fix unwind_get_return_address_ptr() for inactive tasks
        x86/mmiotrace: Use cpumask_available() for cpumask_var_t variables
    • Merge tag 'sched-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9e61d12b
      Linus Torvalds authored
      Pull scheduler fixes from Thomas Gleixner:
       "A set of fixes for the scheduler:
      
         - Fix handling of throttled parents in enqueue_task_fair() completely.
      
           The recent fix overlooked a corner case where the first iteration
           terminates due to an entity already being on the runqueue which
           makes the list management incomplete and later triggers the
           assertion which checks for completeness.
      
         - Fix a similar problem in unthrottle_cfs_rq().
      
         - Show the correct uclamp values in procfs which prints the effective
           value twice instead of requested and effective"
      
      * tag 'sched-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list
        sched/debug: Fix requested task uclamp values shown in procfs
        sched/fair: Fix enqueue_task_fair() warning some more
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · caffb99b
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix RCU warnings in ipv6 multicast router code, from Madhuparna
          Bhowmik.
      
       2) Nexthop attributes aren't being checked properly because of
          mis-initialized iterator, from David Ahern.
      
       3) Revert ip_idents_reserve() change as it caused performance
          regressions and was just working around what is really a UBSAN bug
          in the compiler. From Yuqi Jin.
      
       4) Read MAC address properly from ROM in bmac driver (double iteration
          proceeds past end of address array), from Jeremy Kerr.
      
       5) Add Microsoft Surface device IDs to r8152, from Marc Payne.
      
       6) Prevent reference to freed SKB in __netif_receive_skb_core(), from
          Boris Sukholitko.
      
       7) Fix ACK discard behavior in rxrpc, from David Howells.
      
       8) Preserve flow hash across packet scrubbing in wireguard, from Jason
          A. Donenfeld.
      
       9) Cap option length properly for SO_BINDTODEVICE in AX25, from Eric
          Dumazet.
      
      10) Fix encryption error checking in kTLS code, from Vadim Fedorenko.
      
      11) Missing BPF prog ref release in flow dissector, from Jakub Sitnicki.
      
      12) dst_cache must be used with BH disabled in tipc, from Eric Dumazet.
      
      13) Fix use after free in mlxsw driver, from Jiri Pirko.
      
      14) Order kTLS key destruction properly in mlx5 driver, from Tariq
          Toukan.
      
      15) Check devm_platform_ioremap_resource() return value properly in
          several drivers, from Tiezhu Yang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits)
        net: smsc911x: Fix runtime PM imbalance on error
        net/mlx4_core: fix a memory leak bug.
        net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspend
        net: phy: mscc: fix initialization of the MACsec protocol mode
        net: stmmac: don't attach interface until resume finishes
        net: Fix return value about devm_platform_ioremap_resource()
        net/mlx5: Fix error flow in case of function_setup failure
        net/mlx5e: CT: Correctly get flow rule
        net/mlx5e: Update netdev txq on completions during closure
        net/mlx5: Annotate mutex destroy for root ns
        net/mlx5: Don't maintain a case of del_sw_func being null
        net/mlx5: Fix cleaning unmanaged flow tables
        net/mlx5: Fix memory leak in mlx5_events_init
        net/mlx5e: Fix inner tirs handling
        net/mlx5e: kTLS, Destroy key object after destroying the TIS
        net/mlx5e: Fix allowed tc redirect merged eswitch offload cases
        net/mlx5: Avoid processing commands before cmdif is ready
        net/mlx5: Fix a race when moving command interface to events mode
        net/mlx5: Add command entry handling completion
        rxrpc: Fix a memory leak in rxkad_verify_response()
        ...
  3. 23 May, 2020 30 commits