1. 30 Apr, 2024 3 commits
    • thermal: core: Move passive polling management to the core · 042a3d80
      Rafael J. Wysocki authored
      Passive polling is enabled by setting the 'passive' field in
      struct thermal_zone_device to a positive value so long as the
      'passive_delay_jiffies' field is greater than zero.  It causes
      the thermal core to actively check the thermal zone temperature
      periodically which in theory should be done after crossing a
      passive trip point on the way up in order to allow governors to
      react more rapidly to temperature changes and adjust mitigation
      more precisely.
      
      However, the 'passive' field in struct thermal_zone_device is currently
      managed by governors which is quite problematic.  First of all, only
      two governors, Step-Wise and Power Allocator, update that field at
      all, so the other governors do not benefit from passive polling,
      although in principle they should.  Moreover, if the zone governor is
      changed from, say, Step-Wise to Fair-Share after 'passive' has been
      incremented by the former, it is not going to be reset back to zero by
      the latter even if the zone temperature falls down below all passive
      trip points.
      
      For this reason, make handle_thermal_trip() increment 'passive'
      to enable passive polling for the given thermal zone whenever a
      passive trip point is crossed on the way up and decrement it
      whenever a passive trip point is crossed on the way down.  Also
      remove the 'passive' field updates from governors and additionally
      clear it in thermal_zone_device_init() to prevent passive polling
      from being enabled after a system resume just because it was enabled
      before suspending the system.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
      Tested-by: Lukasz Luba <lukasz.luba@arm.com>
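The counting scheme described in 042a3d80 can be pictured with a small user-space model. This is a minimal sketch under stated assumptions, not the kernel code: `struct zone_model`, `trip_crossed_up()`, `trip_crossed_down()`, `zone_init()` and `passive_polling_enabled()` are hypothetical names, and the guard against decrementing below zero is an assumption of the sketch rather than something the commit message states.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel data structures. */
enum trip_type { TRIP_ACTIVE, TRIP_PASSIVE };

struct zone_model {
	int passive;                         /* passive trips currently exceeded */
	unsigned long passive_delay_jiffies; /* polling interval, 0 = disabled */
};

/* A trip point was crossed on the way up: count it if it is passive. */
void trip_crossed_up(struct zone_model *tz, enum trip_type type)
{
	if (type == TRIP_PASSIVE)
		tz->passive++;
}

/* A trip point was crossed on the way down: undo the earlier increment. */
void trip_crossed_down(struct zone_model *tz, enum trip_type type)
{
	/* The "> 0" guard is an assumption made by this sketch. */
	if (type == TRIP_PASSIVE && tz->passive > 0)
		tz->passive--;
}

/* Zone (re)initialization, e.g. after resume: forget stale passive state. */
void zone_init(struct zone_model *tz)
{
	tz->passive = 0;
}

/* Passive polling runs only if both conditions from the message hold. */
bool passive_polling_enabled(const struct zone_model *tz)
{
	return tz->passive > 0 && tz->passive_delay_jiffies > 0;
}
```

Because the counter lives in the zone model and not in any governor, switching governors in this sketch cannot leave it stuck at a nonzero value.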
    • thermal: core: Do not call handle_thermal_trip() if zone temperature is invalid · 202aa0d4
      Rafael J. Wysocki authored
      Make __thermal_zone_device_update() bail out if update_temperature()
      fails to update the zone temperature because __thermal_zone_get_temp()
      has returned an error and the current zone temperature is
      THERMAL_TEMP_INVALID (user space receiving netlink thermal messages,
      thermal debug code and thermal governors may get confused otherwise).
      
      Fixes: 9ad18043 ("thermal: core: Send trip crossing notifications at init time if needed")
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
      Tested-by: Lukasz Luba <lukasz.luba@arm.com>
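The early exit described in 202aa0d4 can be illustrated with a user-space model. This is only a sketch: `struct zone_model`, `zone_update()`, `read_temp_ok()`, `read_temp_fail()` and the sentinel `MODEL_TEMP_INVALID` (including its value) are hypothetical stand-ins, and the `trips_handled` counter merely marks the point where trip handling would otherwise run.

```c
/* Hypothetical sentinel standing in for THERMAL_TEMP_INVALID. */
#define MODEL_TEMP_INVALID (-274000)

struct zone_model {
	int temperature;   /* last known temperature, or the sentinel */
	int trips_handled; /* how many times trip handling was reached */
};

typedef int (*read_temp_fn)(int *temp);

/* Two fake sensors used for illustration only. */
int read_temp_ok(int *temp)
{
	*temp = 45000;
	return 0;
}

int read_temp_fail(int *temp)
{
	(void)temp;
	return -1;
}

/* Returns 0 if trip handling ran, -1 if the update bailed out early. */
int zone_update(struct zone_model *tz, read_temp_fn read_temp)
{
	int temp;

	/* A failed read leaves the previous temperature untouched. */
	if (read_temp(&temp) == 0)
		tz->temperature = temp;

	/* No valid reading has ever been obtained: skip trip handling. */
	if (tz->temperature == MODEL_TEMP_INVALID)
		return -1;

	tz->trips_handled++;
	return 0;
}
```

In this model a failed read only causes a bail-out while the zone has never had a valid temperature; once a valid value is stored, a later failure falls through with the old value.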
    • thermal: trip: Add missing empty code line · 1502718a
      Rafael J. Wysocki authored
      Add missing empty line of code to thermal_zone_trip_id().
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
  2. 26 Apr, 2024 7 commits
    • thermal/debugfs: Avoid printing zero duration for mitigation events in progress · bd700ba9
      Rafael J. Wysocki authored
      If a thermal mitigation event is in progress, its duration value has
      not been updated yet, so 0 will be printed as the event duration by
      tze_seq_show() which is confusing.
      
      Avoid doing that by marking the beginning of the event with the
      KTIME_MIN duration value and making tze_seq_show() compute the current
      event duration on the fly, in which case '>' will be printed instead of
      '=' in the event duration value field.
      
      Similarly, for trip points that have been crossed on the way down, mark
      the end of mitigation with the KTIME_MAX timestamp value and make
      tze_seq_show() compute the current duration on the fly for the trip
      points still involved in the mitigation, in which cases the duration
      value printed by it will be prepended with a '>' character.
      
      Fixes: 7ef01f22 ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
      Tested-by: Lukasz Luba <lukasz.luba@arm.com>
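The sentinel approach from bd700ba9 can be sketched in user space as follows. The names `DURATION_IN_PROGRESS` and `format_duration()` are hypothetical, `INT64_MIN` stands in for KTIME_MIN, and the exact output layout is an assumption; only the '=' versus '>' distinction comes from the commit message.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a KTIME_MIN duration marking "still running". */
#define DURATION_IN_PROGRESS INT64_MIN

/*
 * Write the duration field into buf. A finished event prints its stored
 * duration after '='. An event still in progress has no stored duration,
 * so the elapsed time so far is computed on the fly and printed after '>'.
 */
int format_duration(char *buf, size_t len, int64_t duration_ms,
		    int64_t start_ms, int64_t now_ms)
{
	char c = '=';

	if (duration_ms == DURATION_IN_PROGRESS) {
		duration_ms = now_ms - start_ms;
		c = '>';
	}

	return snprintf(buf, len, "%c%lld", c, (long long)duration_ms);
}
```

Using a value that no real duration can take avoids a separate "in progress" flag, at the cost of requiring every reader of the field to check for the sentinel first.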
    • thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add() · 31a0fa00
      Rafael J. Wysocki authored
      If cdev_dt_seq_show() runs before the first state transition of a cooling
      device, it will not print any state residency information for it, even
      though it might be reasonably expected to print residency information for
      the initial state of the cooling device.
      
      For this reason, rearrange the code to get the initial state of a cooling
      device at the registration time and pass it to thermal_debug_cdev_add(),
      so that the latter can create a duration record for that state which will
      allow cdev_dt_seq_show() to print its residency information.
      
      Fixes: 755113d7 ("thermal/debugfs: Add thermal cooling device debugfs information")
      Reported-by: Lukasz Luba <lukasz.luba@arm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
      Tested-by: Lukasz Luba <lukasz.luba@arm.com>
    • thermal/debugfs: Create records for cdev states as they get used · f4ae18fc
      Rafael J. Wysocki authored
      Because thermal_debug_cdev_state_update() only creates a duration record
      for the old state of a cooling device, if its new state is used for the
      first time, there will be no record for it and cdev_dt_seq_show() will
      not print the duration information for it even though it contains code
      to compute the duration value in that case.
      
      Address this by making thermal_debug_cdev_state_update() create a
      duration record for the new state if there is none.
      
      Fixes: 755113d7 ("thermal/debugfs: Add thermal cooling device debugfs information")
      Reported-by: Lukasz Luba <lukasz.luba@arm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
      Tested-by: Lukasz Luba <lukasz.luba@arm.com>
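The fix in f4ae18fc amounts to a find-or-create step for the new state as well as the old one. A minimal user-space sketch, assuming a singly linked list of records; `struct duration_record`, `find_record()`, `find_or_create_record()` and `state_update()` are hypothetical names and do not mirror the kernel's data layout:

```c
#include <stdlib.h>

/* Hypothetical per-state residency record. */
struct duration_record {
	int state;
	long long total_ms;
	struct duration_record *next;
};

struct duration_record *find_record(struct duration_record *head, int state)
{
	for (; head; head = head->next)
		if (head->state == state)
			return head;
	return NULL;
}

struct duration_record *find_or_create_record(struct duration_record **head,
					      int state)
{
	struct duration_record *rec = find_record(*head, state);

	if (rec)
		return rec;

	rec = calloc(1, sizeof(*rec));
	if (!rec)
		return NULL;

	rec->state = state;
	rec->next = *head;
	*head = rec;
	return rec;
}

/* Account time spent in old_state and make sure new_state has a record. */
int state_update(struct duration_record **head, int old_state, int new_state,
		 long long elapsed_ms)
{
	struct duration_record *old_rec = find_or_create_record(head, old_state);

	if (!old_rec)
		return -1;
	old_rec->total_ms += elapsed_ms;

	/* Without this step a first-time new state would have no record. */
	if (!find_or_create_record(head, new_state))
		return -1;

	return 0;
}
```

With the record created at transition time, a reader that walks the list can report residency for the current state even before the device leaves it.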
    • 8c882f17
      Rafael J. Wysocki authored
    • thermal/debugfs: Prevent use-after-free from occurring after cdev removal · d351eb0a
      Rafael J. Wysocki authored
      Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
      run in parallel with thermal_debug_cdev_state_update() and it may free
      the struct thermal_debugfs object used by the latter after it has been
      checked against NULL.
      
      If that happens, thermal_debug_cdev_state_update() will access memory
      that has been freed already causing the kernel to crash.
      
      Address this by using cdev->lock in thermal_debug_cdev_remove() around
      the cdev->debugfs value check (in case the same cdev is removed at the
      same time in two different threads) and its reset to NULL.
      
      Fixes: 755113d7 ("thermal/debugfs: Add thermal cooling device debugfs information")
      Cc: 6.8+ <stable@vger.kernel.org> # 6.8+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
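The locking pattern described in d351eb0a (check the pointer and reset it to NULL under the same lock the updater uses, then free outside the lock) can be sketched in user space with a pthread mutex. Everything here is a hypothetical model: `struct cdev_model`, `struct debug_model`, `debug_state_update()` and `debug_remove()` are illustrative names, and taking the lock inside the update function is a simplification of this sketch.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the cooling device and its debug object. */
struct debug_model {
	int last_state;
};

struct cdev_model {
	pthread_mutex_t lock;
	struct debug_model *debugfs;
};

/* Updater: the NULL check and the access happen under the same lock. */
int debug_state_update(struct cdev_model *cdev, int state)
{
	int ret = -1;

	pthread_mutex_lock(&cdev->lock);
	if (cdev->debugfs) {
		cdev->debugfs->last_state = state;
		ret = 0;
	}
	pthread_mutex_unlock(&cdev->lock);

	return ret;
}

/* Remover: detach the object under the lock, free it afterwards. */
void debug_remove(struct cdev_model *cdev)
{
	struct debug_model *dbg;

	pthread_mutex_lock(&cdev->lock);
	dbg = cdev->debugfs;
	cdev->debugfs = NULL; /* a concurrent second remover now sees NULL */
	pthread_mutex_unlock(&cdev->lock);

	free(dbg); /* free(NULL) is a no-op, so double removal is harmless */
}
```

Once the pointer is cleared under the lock, no updater can observe a non-NULL pointer to memory that is about to be freed, which is the race the commit message describes.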
    • thermal/debugfs: Fix two locking issues with thermal zone debug · c7f7c372
      Rafael J. Wysocki authored
      With the current thermal zone locking arrangement in the debugfs code,
      user space can open the "mitigations" file for a thermal zone before
      the zone's debugfs pointer is set which will result in a NULL pointer
      dereference in tze_seq_start().
      
      Moreover, thermal_debug_tz_remove() is not called under the thermal
      zone lock, so it can run in parallel with the other functions accessing
      the thermal zone's struct thermal_debugfs object.  Then, it may clear
      tz->debugfs after one of those functions has checked it and the
      struct thermal_debugfs object may be freed prematurely.
      
      To address the first problem, pass a pointer to the thermal zone's
      struct thermal_debugfs object to debugfs_create_file() in
      thermal_debug_tz_add() and make tze_seq_start(), tze_seq_next(),
      tze_seq_stop(), and tze_seq_show() retrieve it from s->private
      instead of a pointer to the thermal zone object.  This will ensure
      that tz_debugfs will be valid across the "mitigations" file accesses
      until thermal_debugfs_remove_id() called by thermal_debug_tz_remove()
      removes that file.
      
      To address the second problem, use tz->lock in thermal_debug_tz_remove()
      around the tz->debugfs value check (in case the same thermal zone is
      removed at the same time in two different threads) and its reset to NULL.
      
      Fixes: 7ef01f22 ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
      Cc: 6.8+ <stable@vger.kernel.org> # 6.8+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
    • thermal/debugfs: Free all thermal zone debug memory on zone removal · 72c1afff
      Rafael J. Wysocki authored
      Because thermal_debug_tz_remove() does not free all memory allocated for
      thermal zone diagnostics, some of that memory becomes unreachable after
      freeing the thermal zone's struct thermal_debugfs object.
      
      Address this by making thermal_debug_tz_remove() free all of the memory
      in question.
      
      Fixes: 7ef01f22 ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
      Cc: 6.8+ <stable@vger.kernel.org> # 6.8+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
  3. 24 Apr, 2024 14 commits
  4. 23 Apr, 2024 6 commits
  5. 21 Apr, 2024 7 commits
    • Linux 6.9-rc5 · ed30a4a5
      Linus Torvalds authored
    • Merge tag 'char-misc-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 48cf398f
      Linus Torvalds authored
      Pull char / misc driver fixes from Greg KH:
       "Here are some small char/misc and other driver fixes for 6.9-rc5.
        Included in here are the following:
      
         - binder driver fix for reported problem
      
         - speakup crash fix
      
         - mei driver fixes for reported problems
      
         - comedi driver fix
      
         - interconnect driver fixes
      
         - rtsx driver fix
      
         - peci.h kernel doc fix
      
        All of these have been in linux-next for over a week with no reported
        problems"
      
      * tag 'char-misc-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        peci: linux/peci.h: fix Excess kernel-doc description warning
        binder: check offset alignment in binder_get_object()
        comedi: vmk80xx: fix incomplete endpoint checking
        mei: vsc: Unregister interrupt handler for system suspend
        Revert "mei: vsc: Call wake_up() in the threaded IRQ handler"
        misc: rtsx: Fix rts5264 driver status incorrect when card removed
        mei: me: disable RPL-S on SPS and IGN firmwares
        speakup: Avoid crash on very long word
        interconnect: Don't access req_list while it's being manipulated
        interconnect: qcom: x1e80100: Remove inexistent ACV_PERF BCM
    • Merge tag 'driver-core-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core · 4e90ba75
      Linus Torvalds authored
      Pull kernfs bugfix and documentation update from Greg KH:
       "Here are two changes for 6.9-rc5 that deal with "driver core" stuff,
        that do the following:
      
         - sysfs reference leak fix
      
         - embargoed-hardware-issues.rst update for Power
      
        Both of these have been in linux-next for over a week with no reported
        issues"
      
      * tag 'driver-core-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        Documentation: embargoed-hardware-issues.rst: Add myself for Power
        fs: sysfs: Fix reference leak in sysfs_break_active_protection()
    • Merge tag 'tty-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · c0c6b5c0
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are some small tty and serial driver fixes for 6.9-rc5 that
        resolve a bunch of reported problems. Included in here are:
      
         - MAINTAINERS and .mailmap update for Richard Genoud
      
         - serial core regression fixes from 6.9-rc1 changes
      
         - pci id cleanups
      
         - serial core crash fix
      
         - stm32 driver fixes
      
         - 8250 driver fixes
      
        All of these have been in linux-next for a while with no reported
        problems"
      
      * tag 'tty-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        serial: stm32: Reset .throttled state in .startup()
        serial: stm32: Return IRQ_NONE in the ISR if no handling happend
        serial: core: Fix missing shutdown and startup for serial base port
        serial: core: Clearing the circular buffer before NULLifying it
        MAINTAINERS: mailmap: update Richard Genoud's email address
        serial/pmac_zilog: Remove flawed mitigation for rx irq flood
        serial: 8250_pci: Remove redundant PCI IDs
        serial: core: Fix regression when runtime PM is not enabled
        serial: mxs-auart: add spinlock around changing cts state
        serial: 8250_dw: Revert: Do not reclock if already at correct rate
        serial: 8250_lpc18xx: disable clks on error in probe()
    • Merge tag 'usb-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 5fa0ab45
      Linus Torvalds authored
      Pull USB / Thunderbolt driver fixes from Greg KH:
       "Here are some small USB and Thunderbolt driver fixes for 6.9-rc5.
        Included in here are:
      
         - MAINTAINER file update for invalid email address
      
         - usb-serial device id updates
      
         - typec driver fixes
      
         - thunderbolt / usb4 driver fixes
      
         - usb core shutdown fixes
      
         - cdc-wdm driver revert for reported problem in -rc1
      
         - usb gadget driver fixes
      
         - xhci driver fixes
      
        All of these have been in linux-next for a while with no reported
        problems"
      
      * tag 'usb-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
        USB: serial: option: add Telit FN920C04 rmnet compositions
        usb: dwc3: ep0: Don't reset resource alloc flag
        Revert "usb: cdc-wdm: close race between read and workqueue"
        USB: serial: option: add Rolling RW101-GL and RW135-GL support
        USB: serial: option: add Lonsung U8300/U9300 product
        USB: serial: option: add support for Fibocom FM650/FG650
        USB: serial: option: support Quectel EM060K sub-models
        USB: serial: option: add Fibocom FM135-GL variants
        usb: misc: onboard_usb_hub: Disable the USB hub clock on failure
        thunderbolt: Avoid notify PM core about runtime PM resume
        thunderbolt: Fix wake configurations after device unplug
        usb: dwc2: host: Fix dereference issue in DDMA completion flow.
        usb: typec: mux: it5205: Fix ChipID value typo
        MAINTAINERS: Drop Li Yang as their email address stopped working
        usb: gadget: fsl: Initialize udc before using it
        usb: Disable USB3 LPM at shutdown
        usb: gadget: f_ncm: Fix UAF ncm object at re-bind after usb ep transport error
        usb: typec: tcpm: Correct the PDO counting in pd_set
        usb: gadget: functionfs: Wait for fences before enqueueing DMABUF
        usb: gadget: functionfs: Fix inverted DMA fence direction
        ...
    • Merge tag 'sched_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3b680865
      Linus Torvalds authored
      Pull scheduler fix from Borislav Petkov:
      
       - Add a missing memory barrier in the concurrency ID mm switching
      
      * tag 'sched_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Add missing memory barrier in switch_mm_cid
    • Merge tag 'x86_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d07a0b86
      Linus Torvalds authored
      Pull x86 fixes from Borislav Petkov:
      
       - Fix CPU feature dependencies of GFNI, VAES, and VPCLMULQDQ
      
       - Print the correct error code when FRED reports a bad event type
      
       - Add a FRED-specific INT80 handler without the special dances that
         need to happen in the current one
      
       - Enable the using-the-default-return-thunk-but-you-should-not warning
         only on configs which actually enable those special return thunks
      
       - Check the proper feature flags when selecting BHI retpoline
         mitigation
      
      * tag 'x86_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpufeatures: Fix dependencies for GFNI, VAES, and VPCLMULQDQ
        x86/fred: Fix incorrect error code printout in fred_bad_type()
        x86/fred: Fix INT80 emulation for FRED
        x86/retpolines: Enable the default thunk warning only on relevant configs
        x86/bugs: Fix BHI retpoline check
  6. 20 Apr, 2024 3 commits
    • Merge tag 'block-6.9-20240420' of git://git.kernel.dk/linux · 977b1ef5
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Just two minor fixes that should go into the 6.9 kernel release, one
        fixing a regression with partition scanning errors, and one fixing a
        WARN_ON() that can get triggered if we race with a timer"
      
      * tag 'block-6.9-20240420' of git://git.kernel.dk/linux:
        blk-iocost: do not WARN if iocg was already offlined
        block: propagate partition scanning errors to the BLKRRPART ioctl
    • Merge tag 'email' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 39316e5f
      Linus Torvalds authored
      Pull email address update from James Bottomley:
       "My IBM email has stopped working, so update to a working email
        address"
      
      * tag 'email' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        MAINTAINERS: update to working email address
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 81777226
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "This is a bit on the large side, mostly due to two changes:
      
         - Changes to disable some broken PMU virtualization (see below for
           details under "x86 PMU")
      
         - Clean up SVM's enter/exit assembly code so that it can be compiled
           without OBJECT_FILES_NON_STANDARD. This fixes a warning "Unpatched
           return thunk in use. This should not happen!" when running KVM
           selftests.
      
        Everything else is small bugfixes and selftest changes:
      
         - Fix a mostly benign bug in the gfn_to_pfn_cache infrastructure
           where KVM would allow userspace to refresh the cache with a bogus
           GPA. The bug has existed for quite some time, but was exposed by a
           new sanity check added in 6.9 (to ensure a cache is either
           GPA-based or HVA-based).
      
         - Drop an unused param from gfn_to_pfn_cache_invalidate_start() that
           got left behind during a 6.9 cleanup.
      
         - Fix a math goof in x86's hugepage logic for
           KVM_SET_MEMORY_ATTRIBUTES that results in an array overflow
           (detected by KASAN).
      
         - Fix a bug where KVM incorrectly clears root_role.direct when
           userspace sets guest CPUID.
      
         - Fix a dirty logging bug where KVM fails to write-protect
           SPTEs used by a nested guest, if KVM is using Page-Modification
           Logging and the nested hypervisor is NOT using EPT.
      
        x86 PMU:
      
         - Drop support for virtualizing adaptive PEBS, as KVM's
           implementation is architecturally broken without an obvious/easy
           path forward, and because exposing adaptive PEBS can leak host LBRs
           to the guest, i.e. can leak host kernel addresses to the guest.
      
         - Set the enable bits for general purpose counters in
           PERF_GLOBAL_CTRL at RESET time, as done by both Intel and AMD
           processors.
      
         - Disable LBR virtualization on CPUs that don't support LBR
           callstacks, as KVM unconditionally uses
           PERF_SAMPLE_BRANCH_CALL_STACK when creating the perf event, and
           would fail on such CPUs.
      
        Tests:
      
         - Fix a flaw in the max_guest_memory selftest that results in it
           exhausting the supply of ucall structures when run with more than
           256 vCPUs.
      
         - Mark KVM_MEM_READONLY as supported for RISC-V in
           set_memory_region_test"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits)
        KVM: Drop unused @may_block param from gfn_to_pfn_cache_invalidate_start()
        KVM: selftests: Add coverage of EPT-disabled to vmx_dirty_log_test
        KVM: x86/mmu: Fix and clarify comments about clearing D-bit vs. write-protecting
        KVM: x86/mmu: Remove function comments above clear_dirty_{gfn_range,pt_masked}()
        KVM: x86/mmu: Write-protect L2 SPTEs in TDP MMU when clearing dirty status
        KVM: x86/mmu: Precisely invalidate MMU root_role during CPUID update
        KVM: VMX: Disable LBR virtualization if the CPU doesn't support LBR callstacks
        perf/x86/intel: Expose existence of callback support to KVM
        KVM: VMX: Snapshot LBR capabilities during module initialization
        KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms
        KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatible
        KVM: x86: Stop compiling vmenter.S with OBJECT_FILES_NON_STANDARD
        KVM: SVM: Create a stack frame in __svm_sev_es_vcpu_run()
        KVM: SVM: Save/restore args across SEV-ES VMRUN via host save area
        KVM: SVM: Save/restore non-volatile GPRs in SEV-ES VMRUN via host save area
        KVM: SVM: Clobber RAX instead of RBX when discarding spec_ctrl_intercepted
        KVM: SVM: Drop 32-bit "support" from __svm_sev_es_vcpu_run()
        KVM: SVM: Wrap __svm_sev_es_vcpu_run() with #ifdef CONFIG_KVM_AMD_SEV
        KVM: SVM: Create a stack frame in __svm_vcpu_run() for unwinding
        KVM: SVM: Remove a useless zeroing of allocated memory
        ...