1. 09 Oct, 2023 1 commit
    • Juergen Gross's avatar
      xen/events: replace evtchn_rwlock with RCU · 87797fad
      Juergen Gross authored
      In unprivileged Xen guests event handling can cause a deadlock with
      Xen console handling. The evtchn_rwlock and the hvc_lock are taken in
      opposite sequence in __hvc_poll() and in Xen console IRQ handling.
      Normally this is no problem, as the evtchn_rwlock is taken as a reader
      in both paths, but as soon as an event channel is being closed, the
      lock will be taken as a writer, which will cause read_lock() to block:
      
      CPU0                     CPU1                CPU2
      (IRQ handling)           (__hvc_poll())      (closing event channel)
      
      read_lock(evtchn_rwlock)
                               spin_lock(hvc_lock)
                                                   write_lock(evtchn_rwlock)
                                                       [blocks]
      spin_lock(hvc_lock)
          [blocks]
                              read_lock(evtchn_rwlock)
                                  [blocks due to writer waiting,
                                   and not in_interrupt()]
      
      This issue can be avoided by replacing evtchn_rwlock with RCU in
      xen_free_irq(). Note that RCU is used only to delay freeing of the
      irq_info memory. There is no RCU based dereferencing or replacement of
      pointers involved.
      
      In order to avoid potential races between removing the irq_info
      reference and handling of interrupts, set the irq_info pointer to NULL
      only when freeing its memory. The IRQ itself must be freed at that
      time, too, as otherwise the same IRQ number could be allocated again
      before handling of the old instance would have been finished.
      
      This is XSA-441 / CVE-2023-34324.
      
      Fixes: 54c9de89 ("xen/events: add a new "late EOI" evtchn framework")
      Reported-by: default avatarMarek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarJulien Grall <jgrall@amazon.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      87797fad
  2. 08 Oct, 2023 4 commits
  3. 07 Oct, 2023 10 commits
  4. 06 Oct, 2023 22 commits
  5. 05 Oct, 2023 3 commits
    • Jeff Moyer's avatar
      io-wq: fully initialize wqe before calling cpuhp_state_add_instance_nocalls() · 0f8baa3c
      Jeff Moyer authored
      I received a bug report with the following signature:
      
      [ 1759.937637] BUG: unable to handle page fault for address: ffffffffffffffe8
      [ 1759.944564] #PF: supervisor read access in kernel mode
      [ 1759.949732] #PF: error_code(0x0000) - not-present page
      [ 1759.954901] PGD 7ab615067 P4D 7ab615067 PUD 7ab617067 PMD 0
      [ 1759.960596] Oops: 0000 1 PREEMPT SMP PTI
      [ 1759.964804] CPU: 15 PID: 109 Comm: cpuhp/15 Kdump: loaded Tainted: G X ------- — 5.14.0-362.3.1.el9_3.x86_64 #1
      [ 1759.976609] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 06/20/2018
      [ 1759.985181] RIP: 0010:io_wq_for_each_worker.isra.0+0x24/0xa0
      [ 1759.990877] Code: 90 90 90 90 90 90 0f 1f 44 00 00 41 56 41 55 41 54 55 48 8d 6f 78 53 48 8b 47 78 48 39 c5 74 4f 49 89 f5 49 89 d4 48 8d 58 e8 <8b> 13 85 d2 74 32 8d 4a 01 89 d0 f0 0f b1 0b 75 5c 09 ca 78 3d 48
      [ 1760.009758] RSP: 0000:ffffb6f403603e20 EFLAGS: 00010286
      [ 1760.015013] RAX: 0000000000000000 RBX: ffffffffffffffe8 RCX: 0000000000000000
      [ 1760.022188] RDX: ffffb6f403603e50 RSI: ffffffffb11e95b0 RDI: ffff9f73b09e9400
      [ 1760.029362] RBP: ffff9f73b09e9478 R08: 000000000000000f R09: 0000000000000000
      [ 1760.036536] R10: ffffffffffffff00 R11: ffffb6f403603d80 R12: ffffb6f403603e50
      [ 1760.043712] R13: ffffffffb11e95b0 R14: ffffffffb28531e8 R15: ffff9f7a6fbdf548
      [ 1760.050887] FS: 0000000000000000(0000) GS:ffff9f7a6fbc0000(0000) knlGS:0000000000000000
      [ 1760.059025] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1760.064801] CR2: ffffffffffffffe8 CR3: 00000007ab610002 CR4: 00000000007706e0
      [ 1760.071976] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1760.079150] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1760.086325] PKRU: 55555554
      [ 1760.089044] Call Trace:
      [ 1760.091501] <TASK>
      [ 1760.093612] ? show_trace_log_lvl+0x1c4/0x2df
      [ 1760.097995] ? show_trace_log_lvl+0x1c4/0x2df
      [ 1760.102377] ? __io_wq_cpu_online+0x54/0xb0
      [ 1760.106584] ? __die_body.cold+0x8/0xd
      [ 1760.110356] ? page_fault_oops+0x134/0x170
      [ 1760.114479] ? kernelmode_fixup_or_oops+0x84/0x110
      [ 1760.119298] ? exc_page_fault+0xa8/0x150
      [ 1760.123247] ? asm_exc_page_fault+0x22/0x30
      [ 1760.127458] ? __pfx_io_wq_worker_affinity+0x10/0x10
      [ 1760.132453] ? __pfx_io_wq_worker_affinity+0x10/0x10
      [ 1760.137446] ? io_wq_for_each_worker.isra.0+0x24/0xa0
      [ 1760.142527] __io_wq_cpu_online+0x54/0xb0
      [ 1760.146558] cpuhp_invoke_callback+0x109/0x460
      [ 1760.151029] ? __pfx_io_wq_cpu_offline+0x10/0x10
      [ 1760.155673] ? __pfx_smpboot_thread_fn+0x10/0x10
      [ 1760.160320] cpuhp_thread_fun+0x8d/0x140
      [ 1760.164266] smpboot_thread_fn+0xd3/0x1a0
      [ 1760.168297] kthread+0xdd/0x100
      [ 1760.171457] ? __pfx_kthread+0x10/0x10
      [ 1760.175225] ret_from_fork+0x29/0x50
      [ 1760.178826] </TASK>
      [ 1760.181022] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc vfat fat dm_multipath intel_rapl_msr intel_rapl_common isst_if_common ipmi_ssif nfit libnvdimm mgag200 i2c_algo_bit ioatdma drm_shmem_helper drm_kms_helper acpi_ipmi syscopyarea x86_pkg_temp_thermal sysfillrect ipmi_si intel_powerclamp sysimgblt ipmi_devintf coretemp acpi_power_meter ipmi_msghandler rapl pcspkr dca intel_pch_thermal intel_cstate ses lpc_ich intel_uncore enclosure hpilo mei_me mei acpi_tad fuse drm xfs sd_mod sg bnx2x nvme nvme_core crct10dif_pclmul crc32_pclmul nvme_common ghash_clmulni_intel smartpqi tg3 t10_pi mdio uas libcrc32c crc32c_intel scsi_transport_sas usb_storage hpwdt wmi dm_mirror dm_region_hash dm_log dm_mod
      [ 1760.248623] CR2: ffffffffffffffe8
      
      A cpu hotplug callback was issued before wq->all_list was initialized.
      This results in a null pointer dereference.  The fix is to fully setup
      the io_wq before calling cpuhp_state_add_instance_nocalls().
      Signed-off-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Link: https://lore.kernel.org/r/x49y1ghnecs.fsf@segfault.boston.devel.redhat.comSigned-off-by: default avatarJens Axboe <axboe@kernel.dk>
      0f8baa3c
    • Xuewen Yan's avatar
      cpufreq: schedutil: Update next_freq when cpufreq_limits change · 9e0bc36a
      Xuewen Yan authored
      When cpufreq's policy is 'single', there is a scenario that will
      cause sg_policy's next_freq to be unable to update.
      
      When the CPU's util is always max, the cpufreq will be max,
      and then if we change the policy's scaling_max_freq to be a
      lower freq, indeed, the sg_policy's next_freq need change to
      be the lower freq, however, because the cpu_is_busy, the next_freq
      would keep the max_freq.
      
      For example:
      
      The cpu7 is a single CPU:
      
        unisoc:/sys/devices/system/cpu/cpufreq/policy7 # while true;do done& [1] 4737
        unisoc:/sys/devices/system/cpu/cpufreq/policy7 # taskset -p 80 4737
        pid 4737's current affinity mask: ff
        pid 4737's new affinity mask: 80
        unisoc:/sys/devices/system/cpu/cpufreq/policy7 # cat scaling_max_freq
        2301000
        unisoc:/sys/devices/system/cpu/cpufreq/policy7 # cat scaling_cur_freq
        2301000
        unisoc:/sys/devices/system/cpu/cpufreq/policy7 # echo 2171000 > scaling_max_freq
        unisoc:/sys/devices/system/cpu/cpufreq/policy7 # cat scaling_max_freq
        2171000
      
      At this time, the sg_policy's next_freq would stay at 2301000, which
      is wrong.
      
      To fix this, add a check for the ->need_freq_update flag.
      
      [ mingo: Clarified the changelog. ]
      Co-developed-by: default avatarGuohua Yan <guohua.yan@unisoc.com>
      Signed-off-by: default avatarXuewen Yan <xuewen.yan@unisoc.com>
      Signed-off-by: default avatarGuohua Yan <guohua.yan@unisoc.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Acked-by: default avatar"Rafael J. Wysocki" <rafael@kernel.org>
      Link: https://lore.kernel.org/r/20230719130527.8074-1-xuewen.yan@unisoc.com
      9e0bc36a
    • Renan Guilherme Lebre Ramos's avatar