1. 22 Nov, 2017 12 commits
    • drm/i915: Call i915_gem_init_userptr() before taking struct_mutex · ee48700d
      Chris Wilson authored
      We don't need struct_mutex to initialise userptr (it just allocates a
      workqueue for itself etc), but we do need struct_mutex later on in
      i915_gem_init() in order to feed requests onto the HW.
      
      This should break the following lockdep chain:
      
      [  385.697902] ======================================================
      [  385.697907] WARNING: possible circular locking dependency detected
      [  385.697913] 4.14.0-CI-Patchwork_7234+ #1 Tainted: G     U
      [  385.697917] ------------------------------------------------------
      [  385.697922] perf_pmu/2631 is trying to acquire lock:
      [  385.697927]  (&mm->mmap_sem){++++}, at: [<ffffffff811bfe1e>] __might_fault+0x3e/0x90
      [  385.697941]
                     but task is already holding lock:
      [  385.697946]  (&cpuctx_mutex){+.+.}, at: [<ffffffff8116fe8c>] perf_event_ctx_lock_nested+0xbc/0x1d0
      [  385.697957]
                     which lock already depends on the new lock.
      
      [  385.697963]
                     the existing dependency chain (in reverse order) is:
      [  385.697970]
                     -> #4 (&cpuctx_mutex){+.+.}:
      [  385.697980]        __mutex_lock+0x86/0x9b0
      [  385.697985]        perf_event_init_cpu+0x5a/0x90
      [  385.697991]        perf_event_init+0x178/0x1a4
      [  385.697997]        start_kernel+0x27f/0x3f1
      [  385.698003]        verify_cpu+0x0/0xfb
      [  385.698006]
                     -> #3 (pmus_lock){+.+.}:
      [  385.698015]        __mutex_lock+0x86/0x9b0
      [  385.698020]        perf_event_init_cpu+0x21/0x90
      [  385.698025]        cpuhp_invoke_callback+0xca/0xc00
      [  385.698030]        _cpu_up+0xa7/0x170
      [  385.698035]        do_cpu_up+0x57/0x70
      [  385.698039]        smp_init+0x62/0xa6
      [  385.698044]        kernel_init_freeable+0x97/0x193
      [  385.698050]        kernel_init+0xa/0x100
      [  385.698055]        ret_from_fork+0x27/0x40
      [  385.698058]
                     -> #2 (cpu_hotplug_lock.rw_sem){++++}:
      [  385.698068]        cpus_read_lock+0x39/0xa0
      [  385.698073]        apply_workqueue_attrs+0x12/0x50
      [  385.698078]        __alloc_workqueue_key+0x1d8/0x4d8
      [  385.698134]        i915_gem_init_userptr+0x5f/0x80 [i915]
      [  385.698176]        i915_gem_init+0x7c/0x390 [i915]
      [  385.698213]        i915_driver_load+0x99e/0x15c0 [i915]
      [  385.698250]        i915_pci_probe+0x33/0x90 [i915]
      [  385.698256]        pci_device_probe+0xa1/0x130
      [  385.698262]        driver_probe_device+0x293/0x440
      [  385.698267]        __driver_attach+0xde/0xe0
      [  385.698272]        bus_for_each_dev+0x5c/0x90
      [  385.698277]        bus_add_driver+0x16d/0x260
      [  385.698282]        driver_register+0x57/0xc0
      [  385.698287]        do_one_initcall+0x3e/0x160
      [  385.698292]        do_init_module+0x5b/0x1fa
      [  385.698297]        load_module+0x2374/0x2dc0
      [  385.698302]        SyS_finit_module+0xaa/0xe0
      [  385.698307]        entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  385.698311]
                     -> #1 (&dev->struct_mutex){+.+.}:
      [  385.698320]        __mutex_lock+0x86/0x9b0
      [  385.698361]        i915_mutex_lock_interruptible+0x4c/0x130 [i915]
      [  385.698403]        i915_gem_fault+0x206/0x760 [i915]
      [  385.698409]        __do_fault+0x1a/0x70
      [  385.698413]        __handle_mm_fault+0x7c4/0xdb0
      [  385.698417]        handle_mm_fault+0x154/0x300
      [  385.698440]        __do_page_fault+0x2d6/0x570
      [  385.698445]        page_fault+0x22/0x30
      [  385.698449]
                     -> #0 (&mm->mmap_sem){++++}:
      [  385.698459]        lock_acquire+0xaf/0x200
      [  385.698464]        __might_fault+0x68/0x90
      [  385.698470]        _copy_to_user+0x1e/0x70
      [  385.698475]        perf_read+0x1aa/0x290
      [  385.698480]        __vfs_read+0x23/0x120
      [  385.698484]        vfs_read+0xa3/0x150
      [  385.698488]        SyS_read+0x45/0xb0
      [  385.698493]        entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  385.698497]
                     other info that might help us debug this:
      
      [  385.698505] Chain exists of:
                       &mm->mmap_sem --> pmus_lock --> &cpuctx_mutex
      
      [  385.698517]  Possible unsafe locking scenario:
      
      [  385.698522]        CPU0                    CPU1
      [  385.698526]        ----                    ----
      [  385.698529]   lock(&cpuctx_mutex);
      [  385.698553]                                lock(pmus_lock);
      [  385.698558]                                lock(&cpuctx_mutex);
      [  385.698564]   lock(&mm->mmap_sem);
      [  385.698568]
                      *** DEADLOCK ***
      
      [  385.698574] 1 lock held by perf_pmu/2631:
      [  385.698578]  #0:  (&cpuctx_mutex){+.+.}, at: [<ffffffff8116fe8c>] perf_event_ctx_lock_nested+0xbc/0x1d0
      [  385.698589]
                     stack backtrace:
      [  385.698595] CPU: 3 PID: 2631 Comm: perf_pmu Tainted: G     U          4.14.0-CI-Patchwork_7234+ #1
      [  385.698602] Hardware name:                  /NUC6CAYB, BIOS AYAPLCEL.86A.0040.2017.0619.1722 06/19/2017
      [  385.698609] Call Trace:
      [  385.698615]  dump_stack+0x5f/0x86
      [  385.698621]  print_circular_bug.isra.18+0x1d0/0x2c0
      [  385.698627]  __lock_acquire+0x19c3/0x1b60
      [  385.698634]  ? generic_exec_single+0x77/0xe0
      [  385.698640]  ? lock_acquire+0xaf/0x200
      [  385.698644]  lock_acquire+0xaf/0x200
      [  385.698650]  ? __might_fault+0x3e/0x90
      [  385.698655]  __might_fault+0x68/0x90
      [  385.698660]  ? __might_fault+0x3e/0x90
      [  385.698665]  _copy_to_user+0x1e/0x70
      [  385.698670]  perf_read+0x1aa/0x290
      [  385.698675]  __vfs_read+0x23/0x120
      [  385.698682]  ? __fget+0x101/0x1f0
      [  385.698686]  vfs_read+0xa3/0x150
      [  385.698691]  SyS_read+0x45/0xb0
      [  385.698696]  entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  385.698701] RIP: 0033:0x7ff1c46876ed
      [  385.698705] RSP: 002b:00007fff13552f90 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
      [  385.698712] RAX: ffffffffffffffda RBX: ffffc90000647ff0 RCX: 00007ff1c46876ed
      [  385.698718] RDX: 0000000000000010 RSI: 00007fff13552fa0 RDI: 0000000000000005
      [  385.698723] RBP: 000056063d300580 R08: 0000000000000000 R09: 0000000000000060
      [  385.698729] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000046
      [  385.698734] R13: 00007fff13552c6f R14: 00007ff1c6279d00 R15: 00007ff1c6279a40
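
The dependency chain in the report can be modelled as a directed graph in which an edge A -> B means "B was acquired while A was held". The sketch below is only an illustration, not kernel code: the edge list is a reading of entries #0 to #4 in the trace, and `find_cycle` is a hypothetical helper. It shows that the five edges form a cycle, and that dropping the struct_mutex -> cpu_hotplug_lock edge (which is what moving i915_gem_init_userptr() out from under struct_mutex does) removes it.

```python
from collections import defaultdict

# Edges read off the lockdep report: (lock held, lock acquired).
EDGES = [
    ("cpuctx_mutex", "mmap_sem"),          # 0: perf_read -> _copy_to_user
    ("mmap_sem", "struct_mutex"),          # 1: page fault -> i915_gem_fault
    ("struct_mutex", "cpu_hotplug_lock"),  # 2: i915_gem_init -> workqueue alloc
    ("cpu_hotplug_lock", "pmus_lock"),     # 3: CPU hotplug -> perf_event_init_cpu
    ("pmus_lock", "cpuctx_mutex"),         # 4: perf_event_init_cpu
]


def find_cycle(edges):
    """Return one cycle as a list of lock names, or None if the graph is acyclic."""
    graph = defaultdict(list)
    for held, wanted in edges:
        graph[held].append(wanted)

    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every node starts WHITE
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                # Back edge: the cycle is the path from nxt to here.
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in list(graph):
        if colour[start] == WHITE:
            found = visit(start)
            if found:
                return found
    return None


# The same graph with the edge removed by this patch.
FIXED_EDGES = [e for e in EDGES if e != ("struct_mutex", "cpu_hotplug_lock")]

if __name__ == "__main__":
    print("before:", find_cycle(EDGES))
    print("after: ", find_cycle(FIXED_EDGES))
```
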
      
      Testcase: igt/perf_pmu
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171122172621.16158-1-chris@chris-wilson.co.uk
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    • drm/i915: Remove success dmesg noise for intel_rotate_pages() · 62d0fe45
      Chris Wilson authored
      During selftesting intel_rotate_pages() is very, very verbose without
      giving us any information. Suppress the noise.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171122145646.1859-1-chris@chris-wilson.co.uk
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    • drm/i915/selftests: Use NOWARN for large allocations · c65c8b0f
      Chris Wilson authored
      We may try to do a large kmalloc for the permutation array, falling back
      to a smaller array/test if the first allocation fails. Since we are
      intentionally trying a large allocation which may fail, pass __GFP_NOWARN.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103842
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171122120600.27025-1-chris@chris-wilson.co.uk
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
    • drm/i915/pmu: Add RC6 residency metrics · 6060b6ae
      Tvrtko Ursulin authored
      For clients like intel-gpu-overlay it is easier to read the
      counters via the perf API than having to parse sysfs.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-9-tvrtko.ursulin@linux.intel.com
    • drm/i915: Convert intel_rc6_residency_us to ns · 36cc8b96
      Tvrtko Ursulin authored
      Will be used for exposing the PMU counters.
      
      v2:
       * Move intel_runtime_pm_get/put to the callers. (Chris Wilson)
       * Restore full unit conversion precision.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-8-tvrtko.ursulin@linux.intel.com
    • drm/i915/pmu: Add interrupt count metric · 0cd4684d
      Tvrtko Ursulin authored
      For clients like intel-gpu-overlay it is easier to read the
      count via the perf API than having to parse /proc.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-7-tvrtko.ursulin@linux.intel.com
    • drm/i915/pmu: Wire up engine busy stats to PMU · b3add01e
      Tvrtko Ursulin authored
      We can use engine busy stats instead of the sampling timer for
      better accuracy.
      
      By doing this we replace the stochastic sampling with a busyness
      metric derived directly from engine activity. This is driven by the
      context switch interrupt, so it is as accurate as we can get from
      software tracking.
      
      As a secondary benefit, we can also avoid running the sampling timer
      when only the busyness metric is enabled.
      
      v2: Rebase.
      v3:
       * Rebase, comments.
       * Leave engine busyness controls out of workers.
      v4: Checkpatch cleanup.
      v5: Added comment to pmu_needs_timer change.
      v6:
       * Rebase.
       * Fix style of some comments. (Chris Wilson)
      v7: Rebase and commit message update. (Chris Wilson)
      v8: Add delayed stats disabling to improve accuracy in face of
          CPU hotplug events.
      v9: Rebase.
      v10: Rebase - i915_modparams.enable_execlists removal.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-6-tvrtko.ursulin@linux.intel.com
    • drm/i915: Engine busy time tracking · 30e17b78
      Tvrtko Ursulin authored
      Track total time requests have been executing on the hardware.
      
      We add a new kernel API to allow software tracking of the time GPU
      engines spend executing requests.
      
      Both per-engine and global APIs are added, with the latter also
      being exported for use by external users.
      
      v2:
       * Squashed with the internal API.
       * Dropped static key.
       * Made per-engine.
       * Store time in monotonic ktime.
      
      v3: Moved stats clearing to disable.
      
      v4:
       * Comments.
       * Don't export the API just yet.
      
      v5: Whitespace cleanup.
      
      v6:
       * Rename ref to active.
       * Drop engine aggregate stats for now.
       * Account initial busy period after enabling stats.
      
      v7:
       * Rebase.
      
      v8:
       * Move context in notification after the notifier. (Chris Wilson)
      
      v9:
      
      In cases where stats tracking is getting disabled while there is
      an active context on an engine, add up the current value to the
      total. This also implies we don't clear the total when tracking
      is disabled any longer. There is no real need to do so because
      we define the stats as relative while enabled, meaning
      comparison between two samples while tracking is enabled is the
      valid usage. However, when busy stats are later plugged into
      the perf PMU API, it is beneficial to not reset the total, since
      the PMU core likes to do some counter disable/enable cycles on
      startup, and while doing so during a single long context
      executing on an engine we would lose some accuracy and so make
      unit testing more difficult than it needs to be.
      
      v10:
       * Fix accounting for preemption.
      
      v11:
       * Rebase for i915_modparams.enable_execlists removal.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-5-tvrtko.ursulin@linux.intel.com
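
The accounting described in this commit message can be sketched as a small state machine. This is an illustration under simplifying assumptions, not the driver's code: the class and method names are hypothetical, timestamps are plain integers (for example nanoseconds) passed in by the caller, and the count of active contexts is tracked even while stats are disabled so that enabling can pick up a busy period already in progress (the v6 note). Following the v9 note, disabling folds the current busy period into the total and the total is never cleared.

```python
class EngineBusyStats:
    """Accumulate the time during which at least one context is on the engine."""

    def __init__(self):
        self.enabled = False
        self.active = 0     # contexts currently executing on the engine
        self.start = None   # timestamp at which the current busy period began
        self.total = 0      # accumulated busy time; never reset

    def enable(self, now):
        if not self.enabled:
            self.enabled = True
            if self.active > 0:
                # Engine is already busy: count the period from this point.
                self.start = now

    def disable(self, now):
        if self.enabled and self.start is not None:
            # Fold the in-progress busy period into the total.
            self.total += now - self.start
        self.start = None
        self.enabled = False

    def context_in(self, now):
        if self.enabled and self.active == 0:
            self.start = now
        self.active += 1

    def context_out(self, now):
        self.active -= 1
        if self.enabled and self.active == 0 and self.start is not None:
            self.total += now - self.start
            self.start = None

    def busy_time(self, now):
        """Total busy time so far, including any busy period still running."""
        running = now - self.start if self.start is not None else 0
        return self.total + running


if __name__ == "__main__":
    stats = EngineBusyStats()
    stats.enable(0)
    stats.context_in(10)
    stats.context_out(40)
    print(stats.busy_time(50))
```

Only the difference between two `busy_time()` readings taken while tracking is enabled is meaningful, which matches the "relative while enabled" definition in the v9 note.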
    • drm/i915: Wrap context schedule notification · 73fd9d38
      Tvrtko Ursulin authored
      No functional change, just something which will be handy in the
      following patch.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-4-tvrtko.ursulin@linux.intel.com
    • drm/i915/pmu: Suspend sampling when GPU is idle · feff0dc6
      Tvrtko Ursulin authored
      If only a subset of events is enabled we can afford to suspend
      the sampling timer when the GPU is idle and so save some cycles
      and power.
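
The decision of whether the sampling timer must run can be expressed as bitwise filtering of the enabled-event mask. The sketch below is a hedged illustration only: the bit names and values are hypothetical, and it assumes that the per-engine samplers are the events that can be skipped while the GPU is idle, while frequency sampling still needs the timer. The driver's actual masks and helper may differ.

```python
# Hypothetical event bits; the driver's real layout may differ.
ENGINE_BUSY = 1 << 0
ENGINE_WAIT = 1 << 1
ENGINE_SEMA = 1 << 2
ACTUAL_FREQ = 1 << 3
REQUESTED_FREQ = 1 << 4
INTERRUPTS = 1 << 5   # read directly, never needs the sampling timer

ENGINE_SAMPLERS = ENGINE_BUSY | ENGINE_WAIT | ENGINE_SEMA
TIMER_DRIVEN = ENGINE_SAMPLERS | ACTUAL_FREQ | REQUESTED_FREQ


def needs_timer(enabled, gpu_active):
    """Return True if any enabled event still requires the sampling timer."""
    # Keep only the events that are produced by periodic sampling.
    needed = enabled & TIMER_DRIVEN
    if not gpu_active:
        # Assumption: an idle GPU has nothing to sample per engine.
        needed &= ~ENGINE_SAMPLERS
    return needed != 0


if __name__ == "__main__":
    print(needs_timer(ENGINE_BUSY, gpu_active=False))
    print(needs_timer(ENGINE_BUSY | ACTUAL_FREQ, gpu_active=False))
```
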
      
      v2: Rebase and limit timer even more.
      v3: Rebase.
      v4: Rebase.
      v5: Skip action if perf PMU failed to register.
      v6: Checkpatch cleanup.
      v7:
       * Add a common helper to start the timer if needed. (Chris Wilson)
       * Add comment explaining bitwise logic in pmu_needs_timer.
      v8: Fix some comments styles. (Chris Wilson)
      v9: Rebase.
      v10: Move function declarations to i915_pmu.h.
      v11: Rename functions to i915_pmu_gt_(un)parked. (Chris Wilson)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-3-tvrtko.ursulin@linux.intel.com
    • drm/i915/pmu: Expose a PMU interface for perf queries · b46a33e2
      Tvrtko Ursulin authored
      From: Chris Wilson <chris@chris-wilson.co.uk>
      From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      
      The first goal is to be able to measure GPU (and individual ring) busyness
      without having to poll registers from userspace. (Which not only incurs
      holding the forcewake lock indefinitely, perturbing the system, but also
      runs the risk of hanging the machine.) As an alternative we can use the
      perf event counter interface to sample the ring registers periodically
      and send those results to userspace.
      
      Functionality we are exporting to userspace is via the existing perf PMU
      API and can be exercised via the existing tools. For example:
      
        perf stat -a -e i915/rcs0-busy/ -I 1000
      
      Will print the render engine busyness once per second. All the performance
      counters can be enumerated (perf list) and have their unit of measure
      correctly reported in sysfs.
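
Tools such as perf discover these counters through the generic perf sysfs layout, where each PMU has a `type` file and an `events/` directory whose entries hold terms like `config=0x...`, optionally with a matching `.unit` file. The sketch below is an illustration of reading that layout; `parse_event_terms` and `describe_event` are hypothetical helper names, and the `i915` directory and `rcs0-busy` event name are taken from the perf stat example above, so they are only present on a kernel with this PMU registered.

```python
from pathlib import Path

# Generic perf sysfs location for a PMU named "i915" (assumed from the example).
PMU_DIR = Path("/sys/bus/event_source/devices/i915")


def parse_event_terms(text):
    """Parse a sysfs event description such as 'config=0x10' into a dict of ints."""
    terms = {}
    for term in text.strip().split(","):
        if not term:
            continue
        name, _, value = term.partition("=")
        # A bare term without '=value' is treated as a flag set to 1.
        terms[name.strip()] = int(value, 0) if value else 1
    return terms


def describe_event(name, pmu_dir=PMU_DIR):
    """Return (pmu type, event terms, unit) for one event, or None if absent."""
    event = pmu_dir / "events" / name
    if not event.is_file():
        return None
    pmu_type = int((pmu_dir / "type").read_text())
    unit_file = pmu_dir / "events" / (name + ".unit")
    unit = unit_file.read_text().strip() if unit_file.is_file() else None
    return pmu_type, parse_event_terms(event.read_text()), unit


if __name__ == "__main__":
    # Prints None on machines without the i915 PMU.
    print(describe_event("rcs0-busy"))
```

The returned type and config values are what a program would place in `perf_event_attr` when opening the counter itself rather than going through the perf tool.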
      
      v1-v2 (Chris Wilson):
      
      v2: Use a common timer for the ring sampling.
      
      v3: (Tvrtko Ursulin)
       * Decouple uAPI from i915 engine ids.
       * Complete uAPI defines.
       * Refactor some code to helpers for clarity.
       * Skip sampling disabled engines.
       * Expose counters in sysfs.
       * Pass in fake regs to avoid null ptr deref in perf core.
       * Convert to class/instance uAPI.
       * Use shared driver code for rc6 residency, power and frequency.
      
      v4: (Dmitry Rogozhkin)
       * Register PMU with .task_ctx_nr=perf_invalid_context
       * Expose cpumask for the PMU with the single CPU in the mask
       * Properly support pmu->stop(): it should call pmu->read()
       * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
       * Introduce refcounting of event subscriptions.
       * Make pmu.busy_stats a refcounter to avoid busy stats going away
         with some deleted event.
       * Expose cpumask for i915 PMU to avoid multiple events creation of
         the same type followed by counter aggregation by perf-stat.
       * Track CPUs getting online/offline to migrate perf context. If (likely)
         cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be
         needed to see effect of CPU status tracking.
       * End result is that only global events are supported and perf stat
         works correctly.
       * Deny perf driver level sampling - it is prohibited for uncore PMU.
      
      v5: (Tvrtko Ursulin)
      
       * Don't hardcode number of engine samplers.
       * Rewrite event ref-counting for correctness and simplicity.
       * Store initial counter value when starting already enabled events
         to correctly report values to all listeners.
       * Fix RC6 residency readout.
       * Comments, GPL header.
      
      v6:
       * Add missing entry to v4 changelog.
       * Fix accounting in CPU hotplug case by copying the approach from
         arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin)
      
      v7:
       * Log failure message only on failure.
       * Remove CPU hotplug notification state on unregister.
      
      v8:
       * Fix error unwind on failed registration.
       * Checkpatch cleanup.
      
      v9:
       * Drop the energy metric, it is available via intel_rapl_perf.
         (Ville Syrjälä)
       * Use HAS_RC6(p). (Chris Wilson)
       * Handle unsupported non-engine events. (Dmitry Rogozhkin)
       * Rebase for intel_rc6_residency_ns needing caller managed
         runtime pm.
       * Drop HAS_RC6 checks from the read callback since creating those
         events will be rejected at init time already.
       * Add counter units to sysfs so perf stat output is nicer.
       * Cleanup the attribute tables for brevity and readability.
      
      v10:
       * Fixed queued accounting.
      
      v11:
       * Move intel_engine_lookup_user to intel_engine_cs.c
       * Commit update. (Joonas Lahtinen)
      
      v12:
       * More accurate sampling. (Chris Wilson)
       * Store and report frequency in MHz for better usability from
         perf stat.
       * Removed metrics: queued, interrupts, rc6 counters.
       * Sample engine busyness based on seqno difference only
         for less MMIO (and forcewake) on all platforms. (Chris Wilson)
      
      v13:
       * Comment spelling, use mul_u32_u32 to work around potential GCC
         issue and some code alignment changes. (Chris Wilson)
      
      v14:
       * Rebase.
      
      v15:
       * Rebase for RPS refactoring.
      
      v16:
       * Use the dynamic slot in the CPU hotplug state machine so that we are
         free to setup our state as multi-instance. Previously we were re-using
         the CPUHP_AP_PERF_X86_UNCORE_ONLINE slot which is neither used as
         multi-instance, nor owned by our driver to start with.
       * Register the CPU hotplug handlers after the PMU, otherwise the callback
         will get called before the PMU is initialized which can end up in
         perf_pmu_migrate_context with an un-initialized base.
       * Added workaround for a probable bug in cpuhp core.
      
      v17:
       * Remove workaround for the cpuhp bug.
      
      v18:
       * Rebase for drm_i915_gem_engine_class getting upstream before us.
      
      v19:
       * Rebase. (trivial)
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-2-tvrtko.ursulin@linux.intel.com
    • drm/i915: Extract intel_get_cagf · c84b2705
      Tvrtko Ursulin authored
      Code to be shared between debugfs and the PMU implementation.
      
      v2: Checkpatch cleanup.
      v3: Also consolidate i915_sysfs.c/gt_act_freq_mhz_show.
      v4: Rebase.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-1-tvrtko.ursulin@linux.intel.com
  2. 21 Nov, 2017 11 commits
  3. 20 Nov, 2017 17 commits