1. 02 Oct, 2019 1 commit
    • Chris Wilson's avatar
      drm/i915/userptr: Never allow userptr into the mappable GGTT · a4311745
      Chris Wilson authored
      Daniel Vetter uncovered a nasty cycle in using the mmu-notifiers to
      invalidate userptr objects which also happen to be pulled into GGTT
      mmaps. That is when we unbind the userptr object (on mmu invalidation),
      we revoke all CPU mmaps, which may then recurse into mmu invalidation.
      
      We looked for ways of breaking the cycle, but the revocation on
      invalidation is required and cannot be avoided. The only solution we
      could see was to not allow such GGTT bindings of userptr objects in the
      first place. In practice, no one really wants to use a GGTT mmapping of
      a CPU pointer...
      
      Just before Daniel's explosive lockdep patches land in v5.4-rc1, we got
      a genuine blip from CI:
      
      <4>[  246.793958] ======================================================
      <4>[  246.793972] WARNING: possible circular locking dependency detected
      <4>[  246.793989] 5.3.0-gbd6c56f50d15-drmtip_372+ #1 Tainted: G     U
      <4>[  246.794003] ------------------------------------------------------
      <4>[  246.794017] kswapd0/145 is trying to acquire lock:
      <4>[  246.794030] 000000003f565be6 (&dev->struct_mutex/1){+.+.}, at: userptr_mn_invalidate_range_start+0x18f/0x220 [i915]
      <4>[  246.794250]
                        but task is already holding lock:
      <4>[  246.794263] 000000001799cef9 (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe6/0x2a0
      <4>[  246.794291]
                        which lock already depends on the new lock.
      
      <4>[  246.794307]
                        the existing dependency chain (in reverse order) is:
      <4>[  246.794322]
                        -> #3 (&anon_vma->rwsem){++++}:
      <4>[  246.794344]        down_write+0x33/0x70
      <4>[  246.794357]        __vma_adjust+0x3d9/0x7b0
      <4>[  246.794370]        __split_vma+0x16a/0x180
      <4>[  246.794385]        mprotect_fixup+0x2a5/0x320
      <4>[  246.794399]        do_mprotect_pkey+0x208/0x2e0
      <4>[  246.794413]        __x64_sys_mprotect+0x16/0x20
      <4>[  246.794429]        do_syscall_64+0x55/0x1c0
      <4>[  246.794443]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  246.794456]
                        -> #2 (&mapping->i_mmap_rwsem){++++}:
      <4>[  246.794478]        down_write+0x33/0x70
      <4>[  246.794493]        unmap_mapping_pages+0x48/0x130
      <4>[  246.794519]        i915_vma_revoke_mmap+0x81/0x1b0 [i915]
      <4>[  246.794519]        i915_vma_unbind+0x11d/0x4a0 [i915]
      <4>[  246.794519]        i915_vma_destroy+0x31/0x300 [i915]
      <4>[  246.794519]        __i915_gem_free_objects+0xb8/0x4b0 [i915]
      <4>[  246.794519]        drm_file_free.part.0+0x1e6/0x290
      <4>[  246.794519]        drm_release+0xa6/0xe0
      <4>[  246.794519]        __fput+0xc2/0x250
      <4>[  246.794519]        task_work_run+0x82/0xb0
      <4>[  246.794519]        do_exit+0x35b/0xdb0
      <4>[  246.794519]        do_group_exit+0x34/0xb0
      <4>[  246.794519]        __x64_sys_exit_group+0xf/0x10
      <4>[  246.794519]        do_syscall_64+0x55/0x1c0
      <4>[  246.794519]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  246.794519]
                        -> #1 (&vm->mutex){+.+.}:
      <4>[  246.794519]        i915_gem_shrinker_taints_mutex+0x6d/0xe0 [i915]
      <4>[  246.794519]        i915_address_space_init+0x9f/0x160 [i915]
      <4>[  246.794519]        i915_ggtt_init_hw+0x55/0x170 [i915]
      <4>[  246.794519]        i915_driver_probe+0xc9f/0x1620 [i915]
      <4>[  246.794519]        i915_pci_probe+0x43/0x1b0 [i915]
      <4>[  246.794519]        pci_device_probe+0x9e/0x120
      <4>[  246.794519]        really_probe+0xea/0x3d0
      <4>[  246.794519]        driver_probe_device+0x10b/0x120
      <4>[  246.794519]        device_driver_attach+0x4a/0x50
      <4>[  246.794519]        __driver_attach+0x97/0x130
      <4>[  246.794519]        bus_for_each_dev+0x74/0xc0
      <4>[  246.794519]        bus_add_driver+0x13f/0x210
      <4>[  246.794519]        driver_register+0x56/0xe0
      <4>[  246.794519]        do_one_initcall+0x58/0x300
      <4>[  246.794519]        do_init_module+0x56/0x1f6
      <4>[  246.794519]        load_module+0x25bd/0x2a40
      <4>[  246.794519]        __se_sys_finit_module+0xd3/0xf0
      <4>[  246.794519]        do_syscall_64+0x55/0x1c0
      <4>[  246.794519]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  246.794519]
                        -> #0 (&dev->struct_mutex/1){+.+.}:
      <4>[  246.794519]        __lock_acquire+0x15d8/0x1e90
      <4>[  246.794519]        lock_acquire+0xa6/0x1c0
      <4>[  246.794519]        __mutex_lock+0x9d/0x9b0
      <4>[  246.794519]        userptr_mn_invalidate_range_start+0x18f/0x220 [i915]
      <4>[  246.794519]        __mmu_notifier_invalidate_range_start+0x85/0x110
      <4>[  246.794519]        try_to_unmap_one+0x76b/0x860
      <4>[  246.794519]        rmap_walk_anon+0x104/0x280
      <4>[  246.794519]        try_to_unmap+0xc0/0xf0
      <4>[  246.794519]        shrink_page_list+0x561/0xc10
      <4>[  246.794519]        shrink_inactive_list+0x220/0x440
      <4>[  246.794519]        shrink_node_memcg+0x36e/0x740
      <4>[  246.794519]        shrink_node+0xcb/0x490
      <4>[  246.794519]        balance_pgdat+0x241/0x580
      <4>[  246.794519]        kswapd+0x16c/0x530
      <4>[  246.794519]        kthread+0x119/0x130
      <4>[  246.794519]        ret_from_fork+0x24/0x50
      <4>[  246.794519]
                        other info that might help us debug this:
      
      <4>[  246.794519] Chain exists of:
                          &dev->struct_mutex/1 --> &mapping->i_mmap_rwsem --> &anon_vma->rwsem
      
      <4>[  246.794519]  Possible unsafe locking scenario:
      
      <4>[  246.794519]        CPU0                    CPU1
      <4>[  246.794519]        ----                    ----
      <4>[  246.794519]   lock(&anon_vma->rwsem);
      <4>[  246.794519]                                lock(&mapping->i_mmap_rwsem);
      <4>[  246.794519]                                lock(&anon_vma->rwsem);
      <4>[  246.794519]   lock(&dev->struct_mutex/1);
      <4>[  246.794519]
                         *** DEADLOCK ***
      
      v2: Say no to mmap_ioctl
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111744
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111870Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190928082546.3473-1-chris@chris-wilson.co.uk
      a4311745
  2. 01 Oct, 2019 2 commits
    • Srinivasan S's avatar
      drm/i915/dp: Fix DP MST error after unplugging TypeC cable · 99785b86
      Srinivasan S authored
      This patch avoids DP MST payload error message in dmesg, as it is trying
      to update the payload to the disconnected DP MST device. After DP MST
      device is disconnected we should not be updating the payload and
      hence remove the error.
      
      v2: Removed the connector status check and converted from error to debug.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111632Signed-off-by: default avatarSrinivasan S <srinivasan.s@intel.com>
      Signed-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/1569371742-109402-1-git-send-email-srinivasan.s@intel.com
      99785b86
    • Chris Wilson's avatar
      drm/i915: Initialise breadcrumb lists on the virtual engine · f8db4d05
      Chris Wilson authored
      With deferring the breadcrumb signalling to the virtual engine (thanks
      preempt-to-busy) we need to make sure the lists and irq-worker are ready
      to send a signal.
      
      [41958.710544] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [41958.710553] #PF: supervisor write access in kernel mode
      [41958.710556] #PF: error_code(0x0002) - not-present page
      [41958.710558] PGD 0 P4D 0
      [41958.710562] Oops: 0002 [#1] SMP
      [41958.710565] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G     U            5.3.0+ #207
      [41958.710568] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
      [41958.710602] RIP: 0010:i915_request_enable_breadcrumb+0xe1/0x130 [i915]
      [41958.710605] Code: 8b 44 24 30 48 89 41 08 48 89 08 48 8b 85 98 01 00 00 48 8d 8d 90 01 00 00 48 89 95 98 01 00 00 49 89 4c 24 28 49 89 44 24 30 <48> 89 10 f0 80 4b 30 10 c6 85 88 01 00 00 00 e9 1a ff ff ff 48 83
      [41958.710609] RSP: 0018:ffffc90000003de0 EFLAGS: 00010046
      [41958.710612] RAX: 0000000000000000 RBX: ffff888735424480 RCX: ffff8887cddb2190
      [41958.710614] RDX: ffff8887cddb3570 RSI: ffff888850362190 RDI: ffff8887cddb2188
      [41958.710617] RBP: ffff8887cddb2000 R08: ffff8888503624a8 R09: 0000000000000100
      [41958.710619] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8887cddb3548
      [41958.710622] R13: 0000000000000000 R14: 0000000000000046 R15: ffff888850362070
      [41958.710625] FS:  0000000000000000(0000) GS:ffff88885ea00000(0000) knlGS:0000000000000000
      [41958.710628] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [41958.710630] CR2: 0000000000000000 CR3: 0000000002c09002 CR4: 00000000001606f0
      [41958.710633] Call Trace:
      [41958.710636]  <IRQ>
      [41958.710668]  __i915_request_submit+0x12b/0x160 [i915]
      [41958.710693]  virtual_submit_request+0x67/0x120 [i915]
      [41958.710720]  __unwind_incomplete_requests+0x131/0x170 [i915]
      [41958.710744]  execlists_dequeue+0xb40/0xe00 [i915]
      [41958.710771]  execlists_submission_tasklet+0x10f/0x150 [i915]
      [41958.710776]  tasklet_action_common.isra.17+0x41/0xa0
      [41958.710781]  __do_softirq+0xc8/0x221
      [41958.710785]  irq_exit+0xa6/0xb0
      [41958.710788]  smp_apic_timer_interrupt+0x4d/0x80
      [41958.710791]  apic_timer_interrupt+0xf/0x20
      [41958.710794]  </IRQ>
      
      Fixes: cb2377a9 ("drm/i915: Fixup preempt-to-busy vs reset of a virtual request")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191001103518.9113-1-chris@chris-wilson.co.uk
      f8db4d05
  3. 30 Sep, 2019 2 commits
  4. 27 Sep, 2019 15 commits
  5. 26 Sep, 2019 8 commits
  6. 25 Sep, 2019 12 commits