1. 15 Nov, 2022 3 commits
  2. 27 Oct, 2022 2 commits
    • perf: Optimize perf_tp_event() · 571f97f7
      Ravi Bangoria authored
      Use the event group trees to iterate only perf_tracepoint events.
      
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    • perf: Rewrite core context handling · bd275681
      Peter Zijlstra authored
      There have been various issues and limitations with the way perf uses
      (task) contexts to track events. Most notable is the single hardware
      PMU task context, which has resulted in a number of yucky things (both
      proposed and merged).
      
      Notably:
       - HW breakpoint PMU
       - ARM big.little PMU / Intel ADL PMU
       - Intel Branch Monitoring PMU
       - AMD IBS PMU
       - S390 cpum_cf PMU
       - PowerPC trace_imc PMU
      
      *Current design:*
      
      Currently we have a per task and per cpu perf_event_contexts:
      
        task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context
             ^                                 |    ^     |           ^
             `---------------------------------'    |     `--> pmu ---'
                                                    v           ^
                                               perf_event ------'
      
      Each task has an array of pointers to a perf_event_context. Each
      perf_event_context has a direct relation to a PMU and a group of
      events for that PMU. The task related perf_event_context's have a
      pointer back to that task.
      
      Each PMU has a per-cpu pointer to a per-cpu perf_cpu_context, which
      includes a perf_event_context, which again has a direct relation to
      that PMU, and a group of events for that PMU.
      
      The perf_cpu_context also tracks which task context is currently
      associated with that CPU and includes a few other things like the
      hrtimer for rotation etc.
      
      Each perf_event is then associated with its PMU and one
      perf_event_context.
      
      *Proposed design:*
      
      The new design proposed by this patch reduces this to a single task
      context and a single CPU context, but adds some intermediate data
      structures:
      
        task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context
             ^                           |   ^ ^
             `---------------------------'   | |
                                             | |    perf_cpu_pmu_context <--.
                                             | `----.    ^                  |
                                             |      |    |                  |
                                             |      v    v                  |
                                             | ,--> perf_event_pmu_context  |
                                             | |                            |
                                             | |                            |
                                             v v                            |
                                        perf_event ---> pmu ----------------'
      
      With the new design, perf_event_context will hold all events for all
      pmus in the (respective pinned/flexible) rbtrees. This can be achieved
      by adding pmu to rbtree key:
      
        {cpu, pmu, cgroup, group_index}
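
      As a rough illustration of how such a composite key orders events, the
      sketch below compares keys field by field. It is a userspace
      approximation, not the kernel code: struct event_key, cmp_u64(),
      event_key_cmp() and sort_event_keys() are hypothetical names, and the
      pmu and cgroup fields are reduced to plain integers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the composite sort key. */
struct event_key {
	int cpu;
	uintptr_t pmu;        /* stand-in for a PMU pointer or id */
	uint64_t cgroup_id;   /* 0 when the event has no cgroup */
	uint64_t group_index; /* insertion order within the context */
};

static int cmp_u64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/* Compare field by field; the first differing field decides the order. */
int event_key_cmp(const void *pa, const void *pb)
{
	const struct event_key *a = pa, *b = pb;
	int c;

	if (a->cpu != b->cpu)
		return a->cpu < b->cpu ? -1 : 1;
	c = cmp_u64(a->pmu, b->pmu);
	if (c)
		return c;
	c = cmp_u64(a->cgroup_id, b->cgroup_id);
	if (c)
		return c;
	return cmp_u64(a->group_index, b->group_index);
}

/* Sort keys so that entries for one {cpu, pmu} pair become contiguous. */
void sort_event_keys(struct event_key *keys, size_t n)
{
	assert(keys != NULL || n == 0);
	qsort(keys, n, sizeof(*keys), event_key_cmp);
}
```

      Because pmu sits directly after cpu in the key, all events of one pmu
      on one cpu end up adjacent, so a walk can be limited to a single pmu
      without visiting the events of the others.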
      
      Each perf_event_context carries a list of perf_event_pmu_context which
      is used to hold per-pmu-per-context state. For example, it keeps track
      of currently active events for that pmu, a pmu specific task_ctx_data,
      a flag to tell whether rotation is required or not etc.
      
      Additionally, perf_cpu_pmu_context is used to hold per-pmu-per-cpu
      state like hrtimer details to drive the event rotation, a pointer to
      perf_event_pmu_context of currently running task and some other
      ancillary information.
      
      Each perf_event is associated with its pmu, perf_event_context and
      perf_event_pmu_context.
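
      The relationships described above can be sketched with a few
      simplified structures. Everything here is a hypothetical stand-in
      (struct event_context, struct event_pmu_context, find_pmu_ctx(),
      attach_event()); the real kernel structures carry far more state and
      use different field names.

```c
#include <assert.h>
#include <stddef.h>

/* All types below are simplified, hypothetical stand-ins. */
struct pmu {
	const char *name;
};

/* Per-pmu-per-context state. */
struct event_pmu_context {
	struct pmu *pmu;
	int nr_events;        /* events of this pmu in the context */
	int rotate_necessary; /* set when rotation is required */
	struct event_pmu_context *next;
};

/* One context for all pmus, with a list of per-pmu nodes. */
struct event_context {
	struct event_pmu_context *pmu_ctx_list;
};

/* Each event points at its pmu, its context and its per-pmu node. */
struct event {
	struct pmu *pmu;
	struct event_context *ctx;
	struct event_pmu_context *pmu_ctx;
};

/* Return the per-pmu node for @pmu in @ctx, or NULL if there is none. */
struct event_pmu_context *find_pmu_ctx(struct event_context *ctx,
				       struct pmu *pmu)
{
	struct event_pmu_context *epc;

	for (epc = ctx->pmu_ctx_list; epc; epc = epc->next)
		if (epc->pmu == pmu)
			return epc;
	return NULL;
}

/*
 * Link @ev into @ctx. @spare is caller-provided storage that is used
 * only when @ctx has no node for @pmu yet. Returns the node in use.
 */
struct event_pmu_context *attach_event(struct event_context *ctx,
				       struct event *ev, struct pmu *pmu,
				       struct event_pmu_context *spare)
{
	struct event_pmu_context *epc;

	assert(ctx && ev && pmu);
	epc = find_pmu_ctx(ctx, pmu);
	if (!epc) {
		assert(spare != NULL);
		spare->pmu = pmu;
		spare->nr_events = 0;
		spare->rotate_necessary = 0;
		spare->next = ctx->pmu_ctx_list;
		ctx->pmu_ctx_list = spare;
		epc = spare;
	}
	ev->pmu = pmu;
	ev->ctx = ctx;
	ev->pmu_ctx = epc;
	epc->nr_events++;
	return epc;
}
```

      Two events of the same pmu attached to one context share a single
      per-pmu node, while an event of a different pmu gets its own node in
      the same context.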
      
      Further optimizations to the current implementation are possible. For
      example, ctx_resched() can be optimized to reschedule only the events
      of a single pmu.
      
      Much thanks to Ravi for picking this up and pushing it towards
      completion.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20221008062424.313-1-ravi.bangoria@amd.com
  3. 23 Oct, 2022 9 commits
  4. 22 Oct, 2022 21 commits
  5. 21 Oct, 2022 5 commits
    • Merge tag '6.1-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · bd8e9634
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
      
       - memory leak fixes
      
       - fixes for directory leases, including an important one which fixes a
         problem noticed by git functional tests
      
       - fixes relating to missing free_xid calls (helpful for
         tracing/debugging of entry/exit into cifs.ko)
      
       - a multichannel fix
      
       - a small cleanup fix (use of list_move instead of list_del/list_add)
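
      To make the last item concrete, here is a minimal userspace sketch of
      what replacing a list_del()/list_add() pair with list_move() amounts
      to. The helpers below are simplified reimplementations written for
      this example and only borrow the kernel's names; in particular, this
      list_del() re-links the entry to itself, which is an assumption of the
      sketch rather than the kernel's behavior.

```c
#include <assert.h>

/* Simplified circular doubly linked list, for illustration only. */
struct list_head {
	struct list_head *next, *prev;
};

void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert @entry right after @head. */
void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink @entry from whatever list it is on. */
void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = entry;
	entry->prev = entry;
}

/* One call instead of the open-coded list_del() + list_add() pair. */
void list_move(struct list_head *entry, struct list_head *head)
{
	assert(entry != head);
	list_del(entry);
	list_add(entry, head);
}
```

      A caller that previously wrote the two calls back to back can use the
      single helper, which states the intent (moving an entry between lists)
      directly.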
      
      * tag '6.1-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: update internal module number
        cifs: fix memory leaks in session setup
        cifs: drop the lease for cached directories on rmdir or rename
        smb3: interface count displayed incorrectly
        cifs: Fix memory leak when build ntlmssp negotiate blob failed
        cifs: set rc to -ENOENT if we can not get a dentry for the cached dir
        cifs: use LIST_HEAD() and list_move() to simplify code
        cifs: Fix xid leak in cifs_get_file_info_unix()
        cifs: Fix xid leak in cifs_ses_add_channel()
        cifs: Fix xid leak in cifs_flock()
        cifs: Fix xid leak in cifs_copy_file_range()
        cifs: Fix xid leak in cifs_create()
    • Merge tag 'nfsd-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · 022c028f
      Linus Torvalds authored
      Pull nfsd fixes from Chuck Lever:
       "Fixes for patches merged in v6.1"
      
      * tag 'nfsd-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
        nfsd: ensure we always call fh_verify_error tracepoint
        NFSD: unregister shrinker when nfsd_init_net() fails
    • x86/fpu: Fix copy_xstate_to_uabi() to copy init states correctly · 471f0aa7
      Chang S. Bae authored
      When an extended state component is not present in fpstate but is in
      its init state, the function copies it from init_fpstate via
      copy_feature().

      However, dynamic states are not present in init_fpstate because their
      init states are all zeros. Retrieving them from init_fpstate then
      explodes like this:
      
       BUG: kernel NULL pointer dereference, address: 0000000000000000
       ...
       RIP: 0010:memcpy_erms+0x6/0x10
        ? __copy_xstate_to_uabi_buf+0x381/0x870
        fpu_copy_guest_fpstate_to_uabi+0x28/0x80
        kvm_arch_vcpu_ioctl+0x14c/0x1460 [kvm]
        ? __this_cpu_preempt_check+0x13/0x20
        ? vmx_vcpu_put+0x2e/0x260 [kvm_intel]
        kvm_vcpu_ioctl+0xea/0x6b0 [kvm]
        ? kvm_vcpu_ioctl+0xea/0x6b0 [kvm]
        ? __fget_light+0xd4/0x130
        __x64_sys_ioctl+0xe3/0x910
        ? debug_smp_processor_id+0x17/0x20
        ? fpregs_assert_state_consistent+0x27/0x50
        do_syscall_64+0x3f/0x90
        entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Adjust the 'mask' to zero out the userspace buffer for the features that
      are available neither from fpstate nor from init_fpstate.
      
      The dynamic features depend on the compacted XSAVE format. Ensure it is
      enabled before reading XCOMP_BV in init_fpstate.
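
      The three-way decision described above can be sketched as follows.
      This is not the kernel's implementation: enum copy_source,
      pick_source() and copyable_mask() are hypothetical helpers, and the
      two masks stand in for "features present in fpstate" and "features
      backed by init_fpstate".

```c
#include <assert.h>
#include <stdint.h>

/* Where the bytes for one feature in the user buffer come from. */
enum copy_source {
	COPY_FROM_FPSTATE,
	COPY_FROM_INIT,
	ZERO_FILL,
};

/*
 * Hypothetical helper: choose the source for feature bit @feature.
 * @fpstate_mask: features present in the task's fpstate
 * @init_mask:    features that init_fpstate actually has storage for
 */
enum copy_source pick_source(unsigned int feature, uint64_t fpstate_mask,
			     uint64_t init_mask)
{
	uint64_t bit;

	assert(feature < 64);
	bit = UINT64_C(1) << feature;
	if (fpstate_mask & bit)
		return COPY_FROM_FPSTATE;
	if (init_mask & bit)
		return COPY_FROM_INIT;
	/* Neither source has data: never dereference, just write zeros. */
	return ZERO_FILL;
}

/* Features from @requested that may be copied rather than zero-filled. */
uint64_t copyable_mask(uint64_t requested, uint64_t fpstate_mask,
		       uint64_t init_mask)
{
	return requested & (fpstate_mask | init_mask);
}
```

      In this sketch, a dynamic feature that is absent from both masks falls
      through to ZERO_FILL, which corresponds to zeroing the userspace
      buffer instead of reading from init_fpstate.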
      
      Fixes: 2308ee57 ("x86/fpu/amx: Enable the AMX feature in 64-bit mode")
      Reported-by: Yuan Yao <yuan.yao@intel.com>
      Suggested-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Tested-by: Yuan Yao <yuan.yao@intel.com>
      Link: https://lore.kernel.org/lkml/BYAPR11MB3717EDEF2351C958F2C86EED95259@BYAPR11MB3717.namprd11.prod.outlook.com/
      Link: https://lkml.kernel.org/r/20221021185844.13472-1-chang.seok.bae@intel.com
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · ed537795
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Two small changes, one in the lpfc driver and the other in the core.
      
        The core change is an additional footgun guard which prevents users
        from writing the wrong state to sysfs and causing a hang"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: lpfc: Fix memory leak in lpfc_create_port()
        scsi: core: Restrict legal sdev_state transitions via sysfs
    • Merge tag 'block-6.1-2022-10-20' of git://git.kernel.dk/linux · d4b7332e
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - NVMe pull request via Christoph:
            - fix nvme-hwmon for DMA non-coherent architectures (Serge Semin)
            - add a nvme-hwmon maintainer (Christoph Hellwig)
            - fix error pointer dereference in error handling (Dan Carpenter)
            - fix invalid memory reference in nvmet_subsys_attr_qid_max_show
              (Daniel Wagner)
            - don't limit the DMA segment size in nvme-apple (Russell King)
            - fix workqueue MEM_RECLAIM flushing dependency (Sagi Grimberg)
            - disable write zeroes on various Kingston SSDs (Xander Li)
      
       - fix a memory leak with block device tracing (Ye)
      
       - flexible-array fix for ublk (Yushan)
      
       - document the ublk recovery feature from this merge window
         (ZiyangZhang)
      
       - remove dead bfq variable in struct (Yuwei)
      
       - error handling rq clearing fix (Yu)
      
       - add an IRQ safety check for the cached bio freeing (Pavel)
      
       - drbd bio cloning fix (Christoph)
      
      * tag 'block-6.1-2022-10-20' of git://git.kernel.dk/linux:
        blktrace: remove unnessary stop block trace in 'blk_trace_shutdown'
        blktrace: fix possible memleak in '__blk_trace_remove'
        blktrace: introduce 'blk_trace_{start,stop}' helper
        bio: safeguard REQ_ALLOC_CACHE bio put
        block, bfq: remove unused variable for bfq_queue
        drbd: only clone bio if we have a backing device
        ublk_drv: use flexible-array member instead of zero-length array
        nvmet: fix invalid memory reference in nvmet_subsys_attr_qid_max_show
        nvmet: fix workqueue MEM_RECLAIM flushing dependency
        nvme-hwmon: kmalloc the NVME SMART log buffer
        nvme-hwmon: consistently ignore errors from nvme_hwmon_init
        nvme: add Guenther as nvme-hwmon maintainer
        nvme-apple: don't limit DMA segement size
        nvme-pci: disable write zeroes on various Kingston SSD
        nvme: fix error pointer dereference in error handling
        Documentation: document ublk user recovery feature
        blk-mq: fix null pointer dereference in blk_mq_clear_rq_mapping()