1. 06 Sep, 2018 17 commits
    • drm/nouveau/disp: remove unused struct member · 60655770
      Ben Skeggs authored
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau/devinit: don't fail when PMU/PRE_OS is missing from VBIOS · 0a6986c6
      Ben Skeggs authored
      This Falcon application doesn't appear to be present on some newer
      systems, so let's not fail init if we can't find it.
      
      TBD: is there a way to determine whether it *should* be there?
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau/mmu: don't attempt to dereference vmm without valid instance pointer · 51ed833c
      Ben Skeggs authored
      Fixes oopses in certain failure paths.
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau: fix oops in client init failure path · a43b16dd
      Ben Skeggs authored
      The NV_ERROR macro requires drm->client to be initialised, which it may not
      be at this stage of the init process.
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau: Fix nouveau_connector_ddc_detect() · d5986a1c
      Lyude Paul authored
      It looks like when we moved over to using
      drm_connector_for_each_possible_encoder() in nouveau, one rather
      important part of this function got dropped by accident:
      
      	/*          Right   v   here */
      	for (i = 0; nv_encoder = NULL, i < DRM_CONNECTOR_MAX_ENCODER; i++) {
      		int id = connector->encoder_ids[i];
      		if (id == 0)
      			break;
      
      Since it's rather difficult to notice: the conditional in this loop is
      actually:
      
      	nv_encoder = NULL, i < DRM_CONNECTOR_MAX_ENCODER
      
      Meaning that all early breaks result in nv_encoder keeping its value,
      otherwise nv_encoder = NULL. Ugh.
      
      Since this got dropped, nouveau_connector_ddc_detect() now returns an
      encoder for every single connector, regardless of whether or not it's
      detected:
      
          [ 1780.056185] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for DP-2
      
      So: fix this to ensure we only return an encoder if we actually found
      one, and clean up the rest of the function while we're at it since it's
      nearly impossible to read properly.
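
The effect of the comma operator in the loop condition can be reproduced outside the kernel. The following is a minimal sketch, not the driver code: `struct encoder`, `detect_with_comma()` and `detect_without_reset()` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for an encoder; not the real nouveau type. */
struct encoder {
	int id;
	int responds;
};

/*
 * Old-style loop: the condition "found = NULL, i < n" resets `found`
 * before every bounds check, so only an early break preserves it.
 */
static struct encoder *detect_with_comma(struct encoder *encs, size_t n)
{
	struct encoder *found = NULL;
	size_t i;

	for (i = 0; found = NULL, i < n; i++) {
		found = &encs[i];
		if (found->responds)
			break;
	}
	return found;
}

/*
 * The same loop with the reset dropped: when nothing responds, the
 * last encoder visited is returned instead of NULL.
 */
static struct encoder *detect_without_reset(struct encoder *encs, size_t n)
{
	struct encoder *found = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		found = &encs[i];
		if (found->responds)
			break;
	}
	return found;
}
```

With two encoders that both fail to respond, the first function returns NULL while the second returns the last encoder, which matches the "returns an encoder for every single connector" symptom described above.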
      
      Changes since v1:
      - Don't skip ddc probing for LVDS if we can't switch DDC through
        vga-switcheroo, just do the DDC probing without calling
        vga_switcheroo_lock_ddc() - skeggsb
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Fixes: ddba766d ("drm/nouveau: Use drm_connector_for_each_possible_encoder()")
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau/drm/nouveau: Don't forget to cancel hpd_work on suspend/unload · 2f7ca781
      Lyude Paul authored
      Currently, there's nothing in nouveau that actually cancels this work
      struct. So, cancel it on suspend/unload. Otherwise, if we're unlucky
      enough, hpd_work might try to keep running up until the system is
      suspended.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau/drm/nouveau: Prevent handling ACPI HPD events too early · 79e765ad
      Lyude Paul authored
      On most systems with ACPI hotplugging support, it seems that we always
      receive a hotplug event once we re-enable EC interrupts even if the GPU
      hasn't even been resumed yet.
      
      This can cause problems since even though we schedule hpd_work to handle
      connector reprobing for us, hpd_work synchronizes on
      pm_runtime_get_sync() to wait until the device is ready to perform
      reprobing. Since runtime suspend/resume callbacks are disabled before
      the PM core calls ->suspend(), any calls to pm_runtime_get_sync() during
      this period will grab a runtime PM ref and return immediately with
      -EACCES. Because we schedule hpd_work from our ACPI HPD handler, and
      hpd_work synchronizes on pm_runtime_get_sync(), this causes us to launch
      a connector reprobe immediately even if the GPU isn't actually resumed
      just yet. This causes various warnings in dmesg and occasionally also
      prevents some displays connected to the dedicated GPU from coming back
      up after suspend. Example:
      
      usb 1-4: USB disconnect, device number 14
      usb 1-4.1: USB disconnect, device number 15
      WARNING: CPU: 0 PID: 838 at drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:170 nouveau_dp_detect+0x17e/0x370 [nouveau]
      CPU: 0 PID: 838 Comm: kworker/0:6 Not tainted 4.17.14-201.Lyude.bz1477182.V3.fc28.x86_64 #1
      Hardware name: LENOVO 20EQS64N00/20EQS64N00, BIOS N1EET77W (1.50 ) 03/28/2018
      Workqueue: events nouveau_display_hpd_work [nouveau]
      RIP: 0010:nouveau_dp_detect+0x17e/0x370 [nouveau]
      RSP: 0018:ffffa15143933cf0 EFLAGS: 00010293
      RAX: 0000000000000000 RBX: ffff8cb4f656c400 RCX: 0000000000000000
      RDX: ffffa1514500e4e4 RSI: ffffa1514500e4e4 RDI: 0000000001009002
      RBP: ffff8cb4f4a8a800 R08: ffffa15143933cfd R09: ffffa15143933cfc
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8cb4fb57a000
      R13: ffff8cb4fb57a000 R14: ffff8cb4f4a8f800 R15: ffff8cb4f656c418
      FS:  0000000000000000(0000) GS:ffff8cb51f400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f78ec938000 CR3: 000000073720a003 CR4: 00000000003606f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       ? _cond_resched+0x15/0x30
       nouveau_connector_detect+0x2ce/0x520 [nouveau]
       ? _cond_resched+0x15/0x30
       ? ww_mutex_lock+0x12/0x40
       drm_helper_probe_detect_ctx+0x8b/0xe0 [drm_kms_helper]
       drm_helper_hpd_irq_event+0xa8/0x120 [drm_kms_helper]
       nouveau_display_hpd_work+0x2a/0x60 [nouveau]
       process_one_work+0x187/0x340
       worker_thread+0x2e/0x380
       ? pwq_unbound_release_workfn+0xd0/0xd0
       kthread+0x112/0x130
       ? kthread_create_worker_on_cpu+0x70/0x70
       ret_from_fork+0x35/0x40
      Code: 4c 8d 44 24 0d b9 00 05 00 00 48 89 ef ba 09 00 00 00 be 01 00 00 00 e8 e1 09 f8 ff 85 c0 0f 85 b2 01 00 00 80 7c 24 0c 03 74 02 <0f> 0b 48 89 ef e8 b8 07 f8 ff f6 05 51 1b c8 ff 02 0f 84 72 ff
      ---[ end trace 55d811b38fc8e71a ]---
      
      So, to fix this we attempt to grab a runtime PM reference in the ACPI
      handler itself asynchronously. If the GPU is already awake (it will have
      normal hotplugging at this point) or runtime PM callbacks are currently
      disabled on the device, we drop our reference without updating the
      autosuspend delay. We only schedule connector reprobes when we
      successfully managed to queue up a resume request with our asynchronous
      PM ref.
      
      This also has the added benefit of preventing redundant connector
      reprobes from ACPI while the GPU is runtime resumed!
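
The decision made in the ACPI handler can be sketched as a pure function of the value returned by the asynchronous get. This is an illustration under stated assumptions, not the patch itself: it assumes the asynchronous call is pm_runtime_get() and that it returns 1 when the device is already active, -EACCES when runtime PM is disabled, and 0 when a resume request was queued. `enum hpd_action` and `acpi_hpd_decide()` are hypothetical names, and treating every other error as a warning is also an assumption.

```c
#include <errno.h>

/* Hypothetical outcomes for the ACPI hotplug handler. */
enum hpd_action {
	HPD_DROP_REF,          /* drop the ref, leave autosuspend delay alone */
	HPD_SCHEDULE_REPROBE,  /* a resume was queued: schedule hpd_work */
	HPD_WARN               /* unexpected error (assumed handling) */
};

/*
 * `get_ret` stands in for the return value of the asynchronous
 * runtime PM get described in the commit message.
 */
static enum hpd_action acpi_hpd_decide(int get_ret)
{
	/* GPU already awake, or runtime PM callbacks currently disabled. */
	if (get_ret == 1 || get_ret == -EACCES)
		return HPD_DROP_REF;

	/* Resume request successfully queued: reprobe once it completes. */
	if (get_ret == 0)
		return HPD_SCHEDULE_REPROBE;

	return HPD_WARN;
}
```

Only the queued-resume case leads to a reprobe, which is also why redundant reprobes disappear while the GPU is already runtime resumed.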
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: Karol Herbst <kherbst@redhat.com>
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1477182#c41
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau: Reset MST branching unit before enabling · fa3cdf8d
      Lyude Paul authored
      When probing a new MST device, it's not safe to make any assumptions
      about its current state. While most well-mannered MST hubs will just
      disable the branching unit on hotplug disconnects, this isn't enough to
      save us from various other scenarios that might have resulted in
      something writing to the MST branching unit before we got control of it.
      This could happen if a previous probe we tried failed, if we're booting
      in kexec context and the hub is still in the state the last kernel put
      it in, etc.
      
      Luckily, there is no reason we can't just reset the branching unit
      every time we enable a new topology. So, fix this by resetting it on
      enabling new topologies to ensure that we always start off with a clean,
      unmodified topology state on MST sinks.
      
      This fixes occasional hard-lockups on my P50's laptop dock (e.g. AUX
      times out all DPCD transactions) observed after multiple docks, undocks,
      and module reloads.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: Karol Herbst <karolherbst@gmail.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau: Only write DP_MSTM_CTRL when needed · b26b4590
      Lyude Paul authored
      Currently, nouveau will re-write the DP_MSTM_CTRL register for an MST
      hub every time it receives a long HPD pulse on DP. This isn't actually
      necessary and additionally, has some unintended side effects.
      
      With the P50 I've got here, rewriting DP_MSTM_CTRL constantly seems to
      make it rather likely (1 out of 5 times usually) that bringing up MST
      with its ThinkPad dock will fail and result in sideband messages timing
      out in the middle. Afterwards, successive probes don't manage to get the
      dock to communicate properly over MST sideband.
      
      Many times sideband message timeouts from MST hubs are indicative of
      either the source or the sink dropping an ESI event, which can cause
      DRM's perspective of the topology's current state to go out of sync with
      reality. While it's tough to really know for sure what's happening to
      the dock, using userspace tools to write to DP_MSTM_CTRL in the middle
      of the MST link probing process does appear to make things flaky. It's
      possible that when we write to DP_MSTM_CTRL, the function that gets
      triggered to respond in the dock's firmware temporarily puts it in a
      state where it might end up not reporting an ESI to the source, or ends
      up dropping a sideband message we sent it.
      
      So, to fix this we make it so that when probing an MST topology, we
      respect its current state. If the dock's already enabled, we simply
      read DP_MSTM_CTRL and disable the topology if its value is not what we
      expected. Otherwise, we perform the normal MST probing dance. We avoid
      taking any action except if the state of the MST topology actually
      changes.
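
The probing policy described above can be sketched as a small decision function. This is a simplified model under assumptions, not the driver code: `enum mst_action` and `mst_probe_decide()` are hypothetical names, and the register value is treated as an opaque byte compared against whatever the driver expects rather than decoded bit by bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes of an MST probe. */
enum mst_action {
	MST_LEAVE_ALONE,  /* state unchanged: do not write DP_MSTM_CTRL */
	MST_DISABLE,      /* hub is in an unexpected state: disable topology */
	MST_PROBE         /* not enabled yet: run the normal probing sequence */
};

/*
 * already_enabled: whether the driver currently has the topology enabled.
 * ctrl_read:       value read back from DP_MSTM_CTRL.
 * ctrl_expected:   value the driver expects to find there.
 */
static enum mst_action mst_probe_decide(bool already_enabled,
					uint8_t ctrl_read,
					uint8_t ctrl_expected)
{
	if (!already_enabled)
		return MST_PROBE;

	if (ctrl_read != ctrl_expected)
		return MST_DISABLE;

	/* Nothing changed, so the register is never rewritten. */
	return MST_LEAVE_ALONE;
}
```

The point of the sketch is the last branch: a long HPD pulse on an already-enabled, consistent topology results in a read only, never a write.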
      
      This fixes MST sideband message timeouts and detection failures on my
      P50 with its ThinkPad dock.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: Karol Herbst <karolherbst@gmail.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau: Remove useless poll_enable() call in drm_load() · 7326ead9
      Lyude Paul authored
      Again, this doesn't do anything. drm_kms_helper_poll_enable() will have
      already been called in nouveau_display_init().
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Karol Herbst <kherbst@redhat.com>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Cc: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau: Remove useless poll_disable() call in switcheroo_set_state() · 0d7b2d4d
      Lyude Paul authored
      This won't do anything but potentially make us miss hotplugs. We already
      call drm_kms_helper_poll_disable() in
      nouveau_pmops_suspend()->nouveau_display_suspend()->nouveau_display_fini()
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Karol Herbst <kherbst@redhat.com>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Cc: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau: Remove useless poll_enable() call in switcheroo_set_state() · 0445f753
      Lyude Paul authored
      This doesn't do anything; drm_kms_helper_poll_enable() gets called in
      nouveau_pmops_resume()->nouveau_display_resume()->nouveau_display_init()
      already.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Karol Herbst <kherbst@redhat.com>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Cc: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau: Fix deadlocks in nouveau_connector_detect() · 3e1a1275
      Lyude Paul authored
      When we disable hotplugging on the GPU, we need to be able to
      synchronize with each connector's hotplug interrupt handler before the
      interrupt is finally disabled. This can be a problem however, since
      nouveau_connector_detect() currently grabs a runtime power reference
      when handling connector probing. This will deadlock the runtime suspend
      handler like so:
      
      [  861.480896] INFO: task kworker/0:2:61 blocked for more than 120 seconds.
      [  861.483290]       Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
      [  861.485158] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  861.486332] kworker/0:2     D    0    61      2 0x80000000
      [  861.487044] Workqueue: events nouveau_display_hpd_work [nouveau]
      [  861.487737] Call Trace:
      [  861.488394]  __schedule+0x322/0xaf0
      [  861.489070]  schedule+0x33/0x90
      [  861.489744]  rpm_resume+0x19c/0x850
      [  861.490392]  ? finish_wait+0x90/0x90
      [  861.491068]  __pm_runtime_resume+0x4e/0x90
      [  861.491753]  nouveau_display_hpd_work+0x22/0x60 [nouveau]
      [  861.492416]  process_one_work+0x231/0x620
      [  861.493068]  worker_thread+0x44/0x3a0
      [  861.493722]  kthread+0x12b/0x150
      [  861.494342]  ? wq_pool_ids_show+0x140/0x140
      [  861.494991]  ? kthread_create_worker_on_cpu+0x70/0x70
      [  861.495648]  ret_from_fork+0x3a/0x50
      [  861.496304] INFO: task kworker/6:2:320 blocked for more than 120 seconds.
      [  861.496968]       Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
      [  861.497654] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  861.498341] kworker/6:2     D    0   320      2 0x80000080
      [  861.499045] Workqueue: pm pm_runtime_work
      [  861.499739] Call Trace:
      [  861.500428]  __schedule+0x322/0xaf0
      [  861.501134]  ? wait_for_completion+0x104/0x190
      [  861.501851]  schedule+0x33/0x90
      [  861.502564]  schedule_timeout+0x3a5/0x590
      [  861.503284]  ? mark_held_locks+0x58/0x80
      [  861.503988]  ? _raw_spin_unlock_irq+0x2c/0x40
      [  861.504710]  ? wait_for_completion+0x104/0x190
      [  861.505417]  ? trace_hardirqs_on_caller+0xf4/0x190
      [  861.506136]  ? wait_for_completion+0x104/0x190
      [  861.506845]  wait_for_completion+0x12c/0x190
      [  861.507555]  ? wake_up_q+0x80/0x80
      [  861.508268]  flush_work+0x1c9/0x280
      [  861.508990]  ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
      [  861.509735]  nvif_notify_put+0xb1/0xc0 [nouveau]
      [  861.510482]  nouveau_display_fini+0xbd/0x170 [nouveau]
      [  861.511241]  nouveau_display_suspend+0x67/0x120 [nouveau]
      [  861.511969]  nouveau_do_suspend+0x5e/0x2d0 [nouveau]
      [  861.512715]  nouveau_pmops_runtime_suspend+0x47/0xb0 [nouveau]
      [  861.513435]  pci_pm_runtime_suspend+0x6b/0x180
      [  861.514165]  ? pci_has_legacy_pm_support+0x70/0x70
      [  861.514897]  __rpm_callback+0x7a/0x1d0
      [  861.515618]  ? pci_has_legacy_pm_support+0x70/0x70
      [  861.516313]  rpm_callback+0x24/0x80
      [  861.517027]  ? pci_has_legacy_pm_support+0x70/0x70
      [  861.517741]  rpm_suspend+0x142/0x6b0
      [  861.518449]  pm_runtime_work+0x97/0xc0
      [  861.519144]  process_one_work+0x231/0x620
      [  861.519831]  worker_thread+0x44/0x3a0
      [  861.520522]  kthread+0x12b/0x150
      [  861.521220]  ? wq_pool_ids_show+0x140/0x140
      [  861.521925]  ? kthread_create_worker_on_cpu+0x70/0x70
      [  861.522622]  ret_from_fork+0x3a/0x50
      [  861.523299] INFO: task kworker/6:0:1329 blocked for more than 120 seconds.
      [  861.523977]       Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
      [  861.524644] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  861.525349] kworker/6:0     D    0  1329      2 0x80000000
      [  861.526073] Workqueue: events nvif_notify_work [nouveau]
      [  861.526751] Call Trace:
      [  861.527411]  __schedule+0x322/0xaf0
      [  861.528089]  schedule+0x33/0x90
      [  861.528758]  rpm_resume+0x19c/0x850
      [  861.529399]  ? finish_wait+0x90/0x90
      [  861.530073]  __pm_runtime_resume+0x4e/0x90
      [  861.530798]  nouveau_connector_detect+0x7e/0x510 [nouveau]
      [  861.531459]  ? ww_mutex_lock+0x47/0x80
      [  861.532097]  ? ww_mutex_lock+0x47/0x80
      [  861.532819]  ? drm_modeset_lock+0x88/0x130 [drm]
      [  861.533481]  drm_helper_probe_detect_ctx+0xa0/0x100 [drm_kms_helper]
      [  861.534127]  drm_helper_hpd_irq_event+0xa4/0x120 [drm_kms_helper]
      [  861.534940]  nouveau_connector_hotplug+0x98/0x120 [nouveau]
      [  861.535556]  nvif_notify_work+0x2d/0xb0 [nouveau]
      [  861.536221]  process_one_work+0x231/0x620
      [  861.536994]  worker_thread+0x44/0x3a0
      [  861.537757]  kthread+0x12b/0x150
      [  861.538463]  ? wq_pool_ids_show+0x140/0x140
      [  861.539102]  ? kthread_create_worker_on_cpu+0x70/0x70
      [  861.539815]  ret_from_fork+0x3a/0x50
      [  861.540521]
                     Showing all locks held in the system:
      [  861.541696] 2 locks held by kworker/0:2/61:
      [  861.542406]  #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
      [  861.543071]  #1: 0000000076868126 ((work_completion)(&drm->hpd_work)){+.+.}, at: process_one_work+0x1b3/0x620
      [  861.543814] 1 lock held by khungtaskd/64:
      [  861.544535]  #0: 0000000059db4b53 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
      [  861.545160] 3 locks held by kworker/6:2/320:
      [  861.545896]  #0: 00000000d9e1bc59 ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
      [  861.546702]  #1: 00000000c9f92d84 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
      [  861.547443]  #2: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: nouveau_display_fini+0x96/0x170 [nouveau]
      [  861.548146] 1 lock held by dmesg/983:
      [  861.548889] 2 locks held by zsh/1250:
      [  861.549605]  #0: 00000000348e3cf6 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
      [  861.550393]  #1: 000000007009a7a8 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
      [  861.551122] 6 locks held by kworker/6:0/1329:
      [  861.551957]  #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
      [  861.552765]  #1: 00000000ddb499ad ((work_completion)(&notify->work)#2){+.+.}, at: process_one_work+0x1b3/0x620
      [  861.553582]  #2: 000000006e013cbe (&dev->mode_config.mutex){+.+.}, at: drm_helper_hpd_irq_event+0x6c/0x120 [drm_kms_helper]
      [  861.554357]  #3: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: drm_helper_hpd_irq_event+0x78/0x120 [drm_kms_helper]
      [  861.555227]  #4: 0000000044f294d9 (crtc_ww_class_acquire){+.+.}, at: drm_helper_probe_detect_ctx+0x3d/0x100 [drm_kms_helper]
      [  861.556133]  #5: 00000000db193642 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0x4b/0x130 [drm]
      
      [  861.557864] =============================================
      
      [  861.559507] NMI backtrace for cpu 2
      [  861.560363] CPU: 2 PID: 64 Comm: khungtaskd Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
      [  861.561197] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018
      [  861.561948] Call Trace:
      [  861.562757]  dump_stack+0x8e/0xd3
      [  861.563516]  nmi_cpu_backtrace.cold.3+0x14/0x5a
      [  861.564269]  ? lapic_can_unplug_cpu.cold.27+0x42/0x42
      [  861.565029]  nmi_trigger_cpumask_backtrace+0xa1/0xae
      [  861.565789]  arch_trigger_cpumask_backtrace+0x19/0x20
      [  861.566558]  watchdog+0x316/0x580
      [  861.567355]  kthread+0x12b/0x150
      [  861.568114]  ? reset_hung_task_detector+0x20/0x20
      [  861.568863]  ? kthread_create_worker_on_cpu+0x70/0x70
      [  861.569598]  ret_from_fork+0x3a/0x50
      [  861.570370] Sending NMI from CPU 2 to CPUs 0-1,3-7:
      [  861.571426] NMI backtrace for cpu 6 skipped: idling at intel_idle+0x7f/0x120
      [  861.571429] NMI backtrace for cpu 7 skipped: idling at intel_idle+0x7f/0x120
      [  861.571432] NMI backtrace for cpu 3 skipped: idling at intel_idle+0x7f/0x120
      [  861.571464] NMI backtrace for cpu 5 skipped: idling at intel_idle+0x7f/0x120
      [  861.571467] NMI backtrace for cpu 0 skipped: idling at intel_idle+0x7f/0x120
      [  861.571469] NMI backtrace for cpu 4 skipped: idling at intel_idle+0x7f/0x120
      [  861.571472] NMI backtrace for cpu 1 skipped: idling at intel_idle+0x7f/0x120
      [  861.572428] Kernel panic - not syncing: hung_task: blocked tasks
      
      So: fix this by making it so that normal hotplug handling /only/ happens
      so long as the GPU is currently awake without any pending runtime PM
      requests. In the event that a hotplug occurs while the device is
      suspending or resuming, we can simply defer our response until the GPU
      is fully runtime resumed again.
      
      Changes since v4:
      - Use a new trick I came up with using pm_runtime_get() instead of the
        hackish junk we had before
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Karol Herbst <kherbst@redhat.com>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Cc: stable@vger.kernel.org
      Cc: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau/drm/nouveau: Use pm_runtime_get_noresume() in connector_detect() · 6833fb1e
      Lyude Paul authored
      It's true we can't resume the device from poll workers in
      nouveau_connector_detect(). We can however, prevent the autosuspend
      timer from elapsing immediately if it hasn't already without risking any
      sort of deadlock with the runtime suspend/resume operations. So do that
      instead of entirely avoiding grabbing a power reference.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Karol Herbst <kherbst@redhat.com>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Cc: stable@vger.kernel.org
      Cc: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nouveau/drm/nouveau: Fix deadlock with fb_helper with async RPM requests · 7fec8f53
      Lyude Paul authored
      Currently, nouveau uses the generic drm_fb_helper_output_poll_changed()
      function provided by DRM as its output_poll_changed callback.
      Unfortunately however, this function doesn't grab runtime PM references
      early enough and even if it did, we can't block waiting for the device to
      resume in output_poll_changed() since it's very likely that we'll need
      to grab the fb_helper lock at some point during the runtime resume
      process. This currently results in deadlocking like so:
      
      [  246.669625] INFO: task kworker/4:0:37 blocked for more than 120 seconds.
      [  246.673398]       Not tainted 4.18.0-rc5Lyude-Test+ #2
      [  246.675271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  246.676527] kworker/4:0     D    0    37      2 0x80000000
      [  246.677580] Workqueue: events output_poll_execute [drm_kms_helper]
      [  246.678704] Call Trace:
      [  246.679753]  __schedule+0x322/0xaf0
      [  246.680916]  schedule+0x33/0x90
      [  246.681924]  schedule_preempt_disabled+0x15/0x20
      [  246.683023]  __mutex_lock+0x569/0x9a0
      [  246.684035]  ? kobject_uevent_env+0x117/0x7b0
      [  246.685132]  ? drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
      [  246.686179]  mutex_lock_nested+0x1b/0x20
      [  246.687278]  ? mutex_lock_nested+0x1b/0x20
      [  246.688307]  drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
      [  246.689420]  drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
      [  246.690462]  drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
      [  246.691570]  output_poll_execute+0x198/0x1c0 [drm_kms_helper]
      [  246.692611]  process_one_work+0x231/0x620
      [  246.693725]  worker_thread+0x214/0x3a0
      [  246.694756]  kthread+0x12b/0x150
      [  246.695856]  ? wq_pool_ids_show+0x140/0x140
      [  246.696888]  ? kthread_create_worker_on_cpu+0x70/0x70
      [  246.697998]  ret_from_fork+0x3a/0x50
      [  246.699034] INFO: task kworker/0:1:60 blocked for more than 120 seconds.
      [  246.700153]       Not tainted 4.18.0-rc5Lyude-Test+ #2
      [  246.701182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  246.702278] kworker/0:1     D    0    60      2 0x80000000
      [  246.703293] Workqueue: pm pm_runtime_work
      [  246.704393] Call Trace:
      [  246.705403]  __schedule+0x322/0xaf0
      [  246.706439]  ? wait_for_completion+0x104/0x190
      [  246.707393]  schedule+0x33/0x90
      [  246.708375]  schedule_timeout+0x3a5/0x590
      [  246.709289]  ? mark_held_locks+0x58/0x80
      [  246.710208]  ? _raw_spin_unlock_irq+0x2c/0x40
      [  246.711222]  ? wait_for_completion+0x104/0x190
      [  246.712134]  ? trace_hardirqs_on_caller+0xf4/0x190
      [  246.713094]  ? wait_for_completion+0x104/0x190
      [  246.713964]  wait_for_completion+0x12c/0x190
      [  246.714895]  ? wake_up_q+0x80/0x80
      [  246.715727]  ? get_work_pool+0x90/0x90
      [  246.716649]  flush_work+0x1c9/0x280
      [  246.717483]  ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
      [  246.718442]  __cancel_work_timer+0x146/0x1d0
      [  246.719247]  cancel_delayed_work_sync+0x13/0x20
      [  246.720043]  drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper]
      [  246.721123]  nouveau_pmops_runtime_suspend+0x3d/0xb0 [nouveau]
      [  246.721897]  pci_pm_runtime_suspend+0x6b/0x190
      [  246.722825]  ? pci_has_legacy_pm_support+0x70/0x70
      [  246.723737]  __rpm_callback+0x7a/0x1d0
      [  246.724721]  ? pci_has_legacy_pm_support+0x70/0x70
      [  246.725607]  rpm_callback+0x24/0x80
      [  246.726553]  ? pci_has_legacy_pm_support+0x70/0x70
      [  246.727376]  rpm_suspend+0x142/0x6b0
      [  246.728185]  pm_runtime_work+0x97/0xc0
      [  246.728938]  process_one_work+0x231/0x620
      [  246.729796]  worker_thread+0x44/0x3a0
      [  246.730614]  kthread+0x12b/0x150
      [  246.731395]  ? wq_pool_ids_show+0x140/0x140
      [  246.732202]  ? kthread_create_worker_on_cpu+0x70/0x70
      [  246.732878]  ret_from_fork+0x3a/0x50
      [  246.733768] INFO: task kworker/4:2:422 blocked for more than 120 seconds.
      [  246.734587]       Not tainted 4.18.0-rc5Lyude-Test+ #2
      [  246.735393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  246.736113] kworker/4:2     D    0   422      2 0x80000080
      [  246.736789] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
      [  246.737665] Call Trace:
      [  246.738490]  __schedule+0x322/0xaf0
      [  246.739250]  schedule+0x33/0x90
      [  246.739908]  rpm_resume+0x19c/0x850
      [  246.740750]  ? finish_wait+0x90/0x90
      [  246.741541]  __pm_runtime_resume+0x4e/0x90
      [  246.742370]  nv50_disp_atomic_commit+0x31/0x210 [nouveau]
      [  246.743124]  drm_atomic_commit+0x4a/0x50 [drm]
      [  246.743775]  restore_fbdev_mode_atomic+0x1c8/0x240 [drm_kms_helper]
      [  246.744603]  restore_fbdev_mode+0x31/0x140 [drm_kms_helper]
      [  246.745373]  drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper]
      [  246.746220]  drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper]
      [  246.746884]  drm_fb_helper_hotplug_event.part.28+0x96/0xb0 [drm_kms_helper]
      [  246.747675]  drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
      [  246.748544]  drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
      [  246.749439]  nv50_mstm_hotplug+0x15/0x20 [nouveau]
      [  246.750111]  drm_dp_send_link_address+0x177/0x1c0 [drm_kms_helper]
      [  246.750764]  drm_dp_check_and_send_link_address+0xa8/0xd0 [drm_kms_helper]
      [  246.751602]  drm_dp_mst_link_probe_work+0x51/0x90 [drm_kms_helper]
      [  246.752314]  process_one_work+0x231/0x620
      [  246.752979]  worker_thread+0x44/0x3a0
      [  246.753838]  kthread+0x12b/0x150
      [  246.754619]  ? wq_pool_ids_show+0x140/0x140
      [  246.755386]  ? kthread_create_worker_on_cpu+0x70/0x70
      [  246.756162]  ret_from_fork+0x3a/0x50
      [  246.756847]
                 Showing all locks held in the system:
      [  246.758261] 3 locks held by kworker/4:0/37:
      [  246.759016]  #0: 00000000f8df4d2d ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.759856]  #1: 00000000e6065461 ((work_completion)(&(&dev->mode_config.output_poll_work)->work)){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.760670]  #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
      [  246.761516] 2 locks held by kworker/0:1/60:
      [  246.762274]  #0: 00000000fff6be0f ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.762982]  #1: 000000005ab44fb4 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.763890] 1 lock held by khungtaskd/64:
      [  246.764664]  #0: 000000008cb8b5c3 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
      [  246.765588] 5 locks held by kworker/4:2/422:
      [  246.766440]  #0: 00000000232f0959 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.767390]  #1: 00000000bb59b134 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.768154]  #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_restore_fbdev_mode_unlocked+0x4c/0xb0 [drm_kms_helper]
      [  246.768966]  #3: 000000004c8f0b6b (crtc_ww_class_acquire){+.+.}, at: restore_fbdev_mode_atomic+0x4b/0x240 [drm_kms_helper]
      [  246.769921]  #4: 000000004c34a296 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8a/0x1b0 [drm]
      [  246.770839] 1 lock held by dmesg/1038:
      [  246.771739] 2 locks held by zsh/1172:
      [  246.772650]  #0: 00000000836d0438 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
      [  246.773680]  #1: 000000001f4f4d48 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
      
      [  246.775522] =============================================
      
      After trying dozens of different solutions, I found one very simple one
      that should also have the benefit of preventing us from having to fight
      locking for the rest of our lives. So, we work around these deadlocks by
      deferring all fbcon hotplug events that happen after the runtime suspend
      process starts until after the device is resumed again.
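
The deferral described above can be pictured with a small userspace sketch. This is only an illustration under stated assumptions: `struct fbcon_state`, `fbcon_hotplug_event()`, `fbcon_suspend()` and `fbcon_resume()` are hypothetical names invented here, not nouveau's actual functions, and a pthread mutex stands in for whatever locking the driver really uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's fbcon bookkeeping (not nouveau's struct). */
struct fbcon_state {
    pthread_mutex_t lock;
    bool suspended;        /* runtime suspend has started */
    bool hotplug_pending;  /* at least one event arrived while suspended */
    int events_handled;    /* number of hotplug events actually processed */
};

#define FBCON_STATE_INIT { PTHREAD_MUTEX_INITIALIZER, false, false, 0 }

/* Placeholder for the real hotplug work (probing outputs, restoring modes). */
static void handle_hotplug(struct fbcon_state *s)
{
    s->events_handled++;
}

/* Called for every hotplug: process it now, or only record it while suspended. */
void fbcon_hotplug_event(struct fbcon_state *s)
{
    assert(s != NULL);
    pthread_mutex_lock(&s->lock);
    if (s->suspended)
        s->hotplug_pending = true;
    else
        handle_hotplug(s);
    pthread_mutex_unlock(&s->lock);
}

/* From this point on, hotplug events are deferred rather than handled. */
void fbcon_suspend(struct fbcon_state *s)
{
    assert(s != NULL);
    pthread_mutex_lock(&s->lock);
    s->suspended = true;
    pthread_mutex_unlock(&s->lock);
}

/* Replay a single deferred hotplug, if any, once the device is back. */
void fbcon_resume(struct fbcon_state *s)
{
    assert(s != NULL);
    pthread_mutex_lock(&s->lock);
    s->suspended = false;
    if (s->hotplug_pending) {
        s->hotplug_pending = false;
        handle_hotplug(s);
    }
    pthread_mutex_unlock(&s->lock);
}
```

In this sketch any number of events arriving during suspend collapse into one replayed event on resume, since a hotplug handler re-probes the outputs anyway. The event path never has to wait for the device to resume, which is the property the commit message is after.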
      
      Changes since v7:
       - Fixup commit message - Daniel Vetter
      
      Changes since v6:
       - Remove unused nouveau_fbcon_hotplugged_in_suspend() - Ilia
      
      Changes since v5:
       - Come up with the (hopefully final) solution for solving this dumb
         problem, one that is a lot less likely to cause issues with locking in
         the future. This should work around all deadlock conditions with fbcon
         brought up thus far.
      
      Changes since v4:
       - Add nouveau_fbcon_hotplugged_in_suspend() to work around the deadlock
         condition that Lukas described
       - Just move all of this out of drm_fb_helper. It seems that other DRM
         drivers have already figured out other workarounds for this. If other
         drivers do end up needing this in the future, we can just move this
         back into drm_fb_helper again.
      
      Changes since v3:
      - Actually check if fb_helper is NULL in both new helpers
      - Actually check drm_fbdev_emulation in both new helpers
      - Don't fire off a fb_helper hotplug unconditionally; only do it if
        the following conditions are true (as otherwise, calling this in the
        wrong spot will cause Bad Things to happen):
        - fb_helper hotplug handling was actually inhibited previously
        - fb_helper actually has a delayed hotplug pending
        - fb_helper is actually bound
        - fb_helper is actually initialized
      - Add __must_check to drm_fb_helper_suspend_hotplug(). There's no
        situation where a driver would actually want to use this without
        checking the return value, so enforce that
      - Rewrite and clarify the documentation for both helpers.
      - Make sure to return true in the drm_fb_helper_suspend_hotplug() stub
        that's provided in drm_fb_helper.h when CONFIG_DRM_FBDEV_EMULATION
        isn't enabled
      - Actually grab the toplevel fb_helper lock in
        drm_fb_helper_resume_hotplug(), since it's possible other activity
        (such as a hotplug) could be going on at the same time the driver
        calls drm_fb_helper_resume_hotplug(). We need this to check whether or
        not drm_fb_helper_hotplug_event() needs to be called anyway
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Karol Herbst <kherbst@redhat.com>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Cc: stable@vger.kernel.org
      Cc: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      7fec8f53
    • Lyude Paul's avatar
      drm/nouveau: Remove duplicate poll_enable() in pmops_runtime_suspend() · 611ce855
      Lyude Paul authored
      Since actual hotplug notifications don't get disabled until
      nouveau_display_fini() is called, all this will do is cause any hotplugs
      that happen between this drm_kms_helper_poll_disable() call and the
      actual hotplug disablement to potentially be dropped if ACPI isn't
      around to help us.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Acked-by: Karol Herbst <kherbst@redhat.com>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Cc: stable@vger.kernel.org
      Cc: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      611ce855
    • Lyude Paul's avatar
      drm/nouveau/drm/nouveau: Fix bogus drm_kms_helper_poll_enable() placement · d77ef138
      Lyude Paul authored
      Turns out this part is my fault for not noticing when reviewing
      9a2eba33 ("drm/nouveau: Fix drm poll_helper handling"). Currently
      we call drm_kms_helper_poll_enable() from nouveau_display_hpd_work().
      This makes basically no sense however, because that means we're calling
      drm_kms_helper_poll_enable() every time we schedule the hotplug
      detection work. This is also against the advice mentioned in
      drm_kms_helper_poll_enable()'s documentation:
      
       Note that calls to enable and disable polling must be strictly ordered,
       which is automatically the case when they're only call from
       suspend/resume callbacks.
      
      Of course, hotplugs can't really be ordered. They could even happen
      immediately after we called drm_kms_helper_poll_disable() in
      nouveau_display_fini(), which can lead to all sorts of issues.
      
      Additionally, enabling polling /after/ we call
      drm_helper_hpd_irq_event() could also mean that we'd miss a hotplug
      event anyway, since drm_helper_hpd_irq_event() wouldn't bother trying to
      probe connectors so long as polling is disabled.
      
      So, simply move this back into nouveau_display_init() again. The race
      condition that both of these patches attempted to work around has
      already been fixed properly in
      
        d61a5c10 ("drm/nouveau: Fix deadlock on runtime suspend")
      
      Fixes: 9a2eba33 ("drm/nouveau: Fix drm poll_helper handling")
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Acked-by: Karol Herbst <kherbst@redhat.com>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Cc: Lukas Wunner <lukas@wunner.de>
      Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      d77ef138
  2. 26 Aug, 2018 10 commits
    • Linus Torvalds's avatar
      Linux 4.19-rc1 · 5b394b2d
      Linus Torvalds authored
      5b394b2d
    • Linus Torvalds's avatar
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b933d6eb
      Linus Torvalds authored
      Pull timer update from Thomas Gleixner:
       "New defines for the compat time* types so they can be shared between
        32bit and 64bit builds. Not used yet, but merging them now allows the
        actual conversions to be merged through different maintainer trees
        without dependencies
      
        We still have compat interfaces for 32bit on 64bit even with the new
        2038 safe timespec/val variants because pointer size is different. And
        for the old style timespec/val interfaces we need yet another 'compat'
        interface for both 32bit native and 32bit on 64bit"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        y2038: Provide aliases for compat helpers
      b933d6eb
    • Linus Torvalds's avatar
      Merge branch 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax · aba16dc5
      Linus Torvalds authored
      Pull IDA updates from Matthew Wilcox:
       "A better IDA API:
      
            id = ida_alloc(ida, GFP_xxx);
            ida_free(ida, id);
      
        rather than the cumbersome ida_simple_get(), ida_simple_remove().
      
        The new IDA API is similar to ida_simple_get() but better named.  The
        internal restructuring of the IDA code removes the bitmap
        preallocation nonsense.
      
        I hope the net -200 lines of code is convincing"
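
To make the allocate/free pairing above concrete, here is a toy userspace model of an ID allocator with the same calling shape. It is not the kernel's IDA: `struct toy_ida`, `toy_ida_alloc()` and `toy_ida_free()` are hypothetical names, the capacity is fixed at 64 IDs, and the `GFP_xxx` argument is dropped because this sketch never allocates memory.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_IDA_MAX 64  /* capacity of this toy; an assumption of the sketch */

/* One bit per ID: a set bit means the ID is currently in use. */
struct toy_ida {
    uint64_t used;
};

/* Hand out the lowest unused ID, or -ENOSPC when all IDs are taken. */
int toy_ida_alloc(struct toy_ida *ida)
{
    assert(ida != NULL);
    for (int id = 0; id < TOY_IDA_MAX; id++) {
        uint64_t bit = UINT64_C(1) << id;
        if (!(ida->used & bit)) {
            ida->used |= bit;
            return id;
        }
    }
    return -ENOSPC;
}

/* Return an ID to the pool so a later allocation can reuse it. */
void toy_ida_free(struct toy_ida *ida, int id)
{
    assert(ida != NULL && id >= 0 && id < TOY_IDA_MAX);
    ida->used &= ~(UINT64_C(1) << id);
}
```

A caller would write `int id = toy_ida_alloc(&ida);`, check for a negative return, and later call `toy_ida_free(&ida, id);`, mirroring the two-line usage quoted in the pull request.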
      
      * 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax: (29 commits)
        ida: Change ida_get_new_above to return the id
        ida: Remove old API
        test_ida: check_ida_destroy and check_ida_alloc
        test_ida: Convert check_ida_conv to new API
        test_ida: Move ida_check_max
        test_ida: Move ida_check_leaf
        idr-test: Convert ida_check_nomem to new API
        ida: Start new test_ida module
        target/iscsi: Allocate session IDs from an IDA
        iscsi target: fix session creation failure handling
        drm/vmwgfx: Convert to new IDA API
        dmaengine: Convert to new IDA API
        ppc: Convert vas ID allocation to new IDA API
        media: Convert entity ID allocation to new IDA API
        ppc: Convert mmu context allocation to new IDA API
        Convert net_namespace to new IDA API
        cb710: Convert to new IDA API
        rsxx: Convert to new IDA API
        osd: Convert to new IDA API
        sd: Convert to new IDA API
        ...
      aba16dc5
    • Linus Torvalds's avatar
      Merge tag 'gcc-plugins-v4.19-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · c4726e77
      Linus Torvalds authored
      Pull gcc plugin fix from Kees Cook:
       "Lift gcc test into Kconfig. This is for better behavior when the
        kernel is built with Clang, reported by Stefan Agner"
      
      * tag 'gcc-plugins-v4.19-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        gcc-plugins: Disable when building under Clang
      c4726e77
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d207ea8e
      Linus Torvalds authored
      Pull perf updates from Thomas Gleixner:
       "Kernel:
         - Improve kallsyms coverage
         - Add x86 entry trampolines to kcore
         - Fix ARM SPE handling
         - Correct PPC event post processing
      
        Tools:
         - Make the build system more robust
         - Small fixes and enhancements all over the place
         - Update kernel ABI header copies
         - Preparatory work for converting libtraceevent to a shared library
         - License cleanups"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits)
        tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy'
        tools arch x86: Update tools's copy of cpufeatures.h
        perf python: Fix pyrf_evlist__read_on_cpu() interface
        perf mmap: Store real cpu number in 'struct perf_mmap'
        perf tools: Remove ext from struct kmod_path
        perf tools: Add gzip_is_compressed function
        perf tools: Add lzma_is_compressed function
        perf tools: Add is_compressed callback to compressions array
        perf tools: Move the temp file processing into decompress_kmodule
        perf tools: Use compression id in decompress_kmodule()
        perf tools: Store compression id into struct dso
        perf tools: Add compression id into 'struct kmod_path'
        perf tools: Make is_supported_compression() static
        perf tools: Make decompress_to_file() function static
        perf tools: Get rid of dso__needs_decompress() call in __open_dso()
        perf tools: Get rid of dso__needs_decompress() call in symbol__disassemble()
        perf tools: Get rid of dso__needs_decompress() call in read_object_code()
        tools lib traceevent: Change to SPDX License format
        perf llvm: Allow passing options to llc in addition to clang
        perf parser: Improve error message for PMU address filters
        ...
      d207ea8e
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2a8a2b7c
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
      
       - Correct the L1TF fallout on 32bit and the off by one in the 'too much
         RAM for protection' calculation.
      
       - Add a helpful kernel message for the 'too much RAM' case
      
       - Unbreak the VDSO in case the compiler decides to use indirect
         jumps/calls and emits retpolines which cannot be resolved because the
         kernel uses its own thunks, which does not work for the VDSO. Make it
         use the builtin thunks.
      
       - Re-export start_thread() which was unexported when the 32/64bit
         implementation was unified. start_thread() is required by modular
         binfmt handlers.
      
       - Trivial cleanups
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/speculation/l1tf: Suggest what to do on systems with too much RAM
        x86/speculation/l1tf: Fix off-by-one error when warning that system has too much RAM
        x86/kvm/vmx: Remove duplicate l1d flush definitions
        x86/speculation/l1tf: Fix overflow in l1tf_pfn_limit() on 32bit
        x86/process: Re-export start_thread()
        x86/mce: Add notifier_block forward declaration
        x86/vdso: Fix vDSO build if a retpoline is emitted
      2a8a2b7c
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · de375035
      Linus Torvalds authored
      Pull irq update from Thomas Gleixner:
       "A small set of updates/fixes for the irq subsystem:
      
         - Allow GICv3 interrupts to be configured as wake-up sources to
           enable wakeup from suspend
      
         - Make the error handling of the STM32 irqchip init function work
      
         - A set of small cleanups and improvements"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3: Allow interrupt to be configured as wake-up sources
        irqchip/tango: Set irq handler and data in one go
        dt-bindings: irqchip: renesas-irqc: Document r8a774a1 support
        irqchip/s3c24xx: Remove unneeded comparison of unsigned long to 0
        irqchip/stm32: Fix init error handling
        irqchip/bcm7038-l1: Hide cpu offline callback when building for !SMP
      de375035
    • Linus Torvalds's avatar
      Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a9ce3233
      Linus Torvalds authored
      Pull locking update from Thomas Gleixner:
       "Mark the switch cases which fall through to the next case with the
        proper comment so the fallthrough compiler checks can be enabled"
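
As a hedged illustration of what such an annotation looks like, the sketch below marks an intentional fall-through with a comment of the kind GCC's `-Wimplicit-fallthrough` check recognises at its default level. The enum and function are hypothetical and unrelated to the futex code.

```c
#include <assert.h>

/* Hypothetical operations, invented for this example. */
enum toy_op { TOY_OP_RESET, TOY_OP_START, TOY_OP_STOP };

/* Count how many setup steps an operation performs. */
int toy_steps(enum toy_op op)
{
    int steps = 0;

    assert(op == TOY_OP_RESET || op == TOY_OP_START || op == TOY_OP_STOP);

    switch (op) {
    case TOY_OP_RESET:
        steps++;        /* a reset first clears state ... */
        /* fall through */
    case TOY_OP_START:
        steps++;        /* ... and then starts like a normal start */
        break;
    case TOY_OP_STOP:
        break;
    }
    return steps;
}
```

Without the comment, a compiler with the fall-through check enabled would warn at `case TOY_OP_START:`; with it, only unintended fall-throughs are reported.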
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        futex: Mark expected switch fall-throughs
      a9ce3233
    • Linus Torvalds's avatar
      Merge tag 'libnvdimm-for-4.19_dax-memory-failure' of... · 2923b27e
      Linus Torvalds authored
      Merge tag 'libnvdimm-for-4.19_dax-memory-failure' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm
      
      Pull libnvdimm memory-failure update from Dave Jiang:
       "As it stands, memory_failure() gets thoroughly confused by dev_pagemap
        backed mappings. The recovery code has specific enabling for several
        possible page states and needs new enabling to handle poison in dax
        mappings.
      
        In order to support reliable reverse mapping of user space addresses:
      
         1/ Add new locking in the memory_failure() rmap path to prevent races
            that would typically be handled by the page lock.
      
         2/ Since dev_pagemap pages are hidden from the page allocator and the
            "compound page" accounting machinery, add a mechanism to determine
            the size of the mapping that encompasses a given poisoned pfn.
      
         3/ Given pmem errors can be repaired, change the speculatively
            accessed poison protection, mce_unmap_kpfn(), to be reversible and
            otherwise allow ongoing access from the kernel.
      
        A side effect of this enabling is that MADV_HWPOISON becomes usable
        for dax mappings, however the primary motivation is to allow the
        system to survive userspace consumption of hardware-poison via dax.
        Specifically the current behavior is:
      
           mce: Uncorrected hardware memory error in user-access at af34214200
           {1}[Hardware Error]: It has been corrected by h/w and requires no further action
           mce: [Hardware Error]: Machine check events logged
           {1}[Hardware Error]: event severity: corrected
           Memory failure: 0xaf34214: reserved kernel page still referenced by 1 users
           [..]
           Memory failure: 0xaf34214: recovery action for reserved kernel page: Failed
           mce: Memory error not recovered
           <reboot>
      
        ...and with these changes:
      
           Injecting memory failure for pfn 0x20cb00 at process virtual address 0x7f763dd00000
           Memory failure: 0x20cb00: Killing dax-pmd:5421 due to hardware memory corruption
           Memory failure: 0x20cb00: recovery action for dax page: Recovered
      
        Given all the cross dependencies I propose taking this through
        nvdimm.git with acks from Naoya, x86/core, x86/RAS, and of course dax
        folks"
      
      * tag 'libnvdimm-for-4.19_dax-memory-failure' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm:
        libnvdimm, pmem: Restore page attributes when clearing errors
        x86/memory_failure: Introduce {set, clear}_mce_nospec()
        x86/mm/pat: Prepare {reserve, free}_memtype() for "decoy" addresses
        mm, memory_failure: Teach memory_failure() about dev_pagemap pages
        filesystem-dax: Introduce dax_lock_mapping_entry()
        mm, memory_failure: Collect mapping size in collect_procs()
        mm, madvise_inject_error: Let memory_failure() optionally take a page reference
        mm, dev_pagemap: Do not clear ->mapping on final put
        mm, madvise_inject_error: Disable MADV_SOFT_OFFLINE for ZONE_DEVICE pages
        filesystem-dax: Set page->index
        device-dax: Set page->index
        device-dax: Enable page_mapping()
        device-dax: Convert to vmf_insert_mixed and vm_fault_t
      2923b27e
    • Linus Torvalds's avatar
      Merge tag 'libnvdimm-for-4.19_misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm · 828bf6e9
      Linus Torvalds authored
      Pull libnvdimm updates from Dave Jiang:
       "Collection of misc libnvdimm patches for 4.19 submission:
      
         - Adding support to read locked nvdimm capacity.
      
         - Change test code to make DSM failure code injection an override.
      
         - Add support for calculating the maximum contiguous area for a namespace.
      
         - Add support for queueing a short ARS when there is an ongoing ARS for
           nvdimm.
      
         - Allow NULL to be passed in to ->direct_access() for kaddr and pfn
           params.
      
         - Improve smart injection support for nvdimm emulation testing.
      
         - Fix test code that supports emulating controller temperature.
      
         - Fix hang on error before devm_memremap_pages()
      
         - Fix a bug that causes user memory corruption when data is returned to
           the user for ars_status.
      
         - Maintainer updates for Ross Zwisler emails and adding Jan Kara to
           fsdax"
      
      * tag 'libnvdimm-for-4.19_misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm:
        libnvdimm: fix ars_status output length calculation
        device-dax: avoid hang on error before devm_memremap_pages()
        tools/testing/nvdimm: improve emulation of smart injection
        filesystem-dax: Do not request kaddr and pfn when not required
        md/dm-writecache: Don't request pointer dummy_addr when not required
        dax/super: Do not request a pointer kaddr when not required
        tools/testing/nvdimm: kaddr and pfn can be NULL to ->direct_access()
        s390, dcssblk: kaddr and pfn can be NULL to ->direct_access()
        libnvdimm, pmem: kaddr and pfn can be NULL to ->direct_access()
        acpi/nfit: queue issuing of ars when an uc error notification comes in
        libnvdimm: Export max available extent
        libnvdimm: Use max contiguous area for namespace size
        MAINTAINERS: Add Jan Kara for filesystem DAX
        MAINTAINERS: update Ross Zwisler's email address
        tools/testing/nvdimm: Fix support for emulating controller temperature
        tools/testing/nvdimm: Make DSM failure code injection an override
        acpi, nfit: Prefer _DSM over _LSR for namespace label reads
        libnvdimm: Introduce locked DIMM capacity support
      828bf6e9
  3. 25 Aug, 2018 8 commits
    • Linus Torvalds's avatar
      Merge tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · b3262720
      Linus Torvalds authored
      Pull ARM SoC late updates from Olof Johansson:
       "A couple of late-merged changes that would be useful to get in this
        merge window:
      
         - Driver support for reset of audio complex on Meson platforms. The
           audio driver went in this merge window, and these changes have been
           in -next for a while (just not in our tree).
      
         - Power management fixes for IOMMU on Rockchip platforms, getting
           closer to kexec working on them, including Chromebooks.
      
         - Another pass updating "arm,psci" -> "psci" for some properties that
           have snuck in since last time it was done"
      
      * tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        iommu/rockchip: Move irq request past pm_runtime_enable
        iommu/rockchip: Handle errors returned from PM framework
        arm64: rockchip: Force CONFIG_PM on Rockchip systems
        ARM: rockchip: Force CONFIG_PM on Rockchip systems
        arm64: dts: Fix various entry-method properties to reflect documentation
        reset: imx7: Fix always writing bits as 0
        reset: meson: add meson audio arb driver
        reset: meson: add dt-bindings for meson-axg audio arb
      b3262720
    • Linus Torvalds's avatar
      Merge tag 'kbuild-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 1bc27677
      Linus Torvalds authored
      Pull more Kbuild updates from Masahiro Yamada:
      
       - add build_{menu,n,g,x}config targets for compile-testing Kconfig
      
       - fix and improve recursive dependency detection in Kconfig
      
       - fix parallel building of menuconfig/nconfig
      
       - fix syntax error in clang-version.sh
      
       - suppress distracting log from syncconfig
      
       - remove obsolete "rpm" target
      
       - remove VMLINUX_SYMBOL(_STR) macro entirely
      
       - fix microblaze build with CONFIG_DYNAMIC_FTRACE
      
       - move compiler test for dead code/data elimination to Kconfig
      
       - rename well-known LDFLAGS variable to KBUILD_LDFLAGS
      
       - misc fixes and cleanups
      
      * tag 'kbuild-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: rename LDFLAGS to KBUILD_LDFLAGS
        kbuild: pass LDFLAGS to recordmcount.pl
        kbuild: test dead code/data elimination support in Kconfig
        initramfs: move gen_initramfs_list.sh from scripts/ to usr/
        vmlinux.lds.h: remove stale <linux/export.h> include
        export.h: remove VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR()
        Coccinelle: remove pci_alloc_consistent semantic to detect in zalloc-simple.cocci
        kbuild: make sorting initramfs contents independent of locale
        kbuild: remove "rpm" target, which is alias of "rpm-pkg"
        kbuild: Fix LOADLIBES rename in Documentation/kbuild/makefiles.txt
        kconfig: suppress "configuration written to .config" for syncconfig
        kconfig: fix "Can't open ..." in parallel build
        kbuild: Add a space after `!` to prevent parsing as file pattern
        scripts: modpost: check memory allocation results
        kconfig: improve the recursive dependency report
        kconfig: report recursive dependency involving 'imply'
        kconfig: error out when seeing recursive dependency
        kconfig: add build-only configurator targets
        scripts/dtc: consolidate include path options in Makefile
      1bc27677
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20180825' of git://git.kernel.dk/linux-block · b8dcdab3
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A few small fixes for this merge window:
      
         - Locking imbalance fix for bcache (Shan Hai)
      
         - A few small fixes for wbt. One is a cleanup/prep, one is a fix for
           an existing issue, and the last two are fixes for changes that went
           into this merge window (me)"
      
      * tag 'for-linus-20180825' of git://git.kernel.dk/linux-block:
        blk-wbt: don't maintain inflight counts if disabled
        blk-wbt: fix has-sleeper queueing check
        blk-wbt: use wq_has_sleeper() for wq active check
        blk-wbt: move disable check into get_limit()
        bcache: release dc->writeback_lock properly in bch_writeback_thread()
      b8dcdab3
    • Linus Torvalds's avatar
      Merge tag 'upstream-4.19-rc1-fix' of git://git.infradead.org/linux-ubifs · db84abf5
      Linus Torvalds authored
      Pull UBIFS fix from Richard Weinberger:
       "Remove an empty file from UBIFS source"
      
      * tag 'upstream-4.19-rc1-fix' of git://git.infradead.org/linux-ubifs:
        ubifs: Remove empty file.h
      db84abf5
    • Linus Torvalds's avatar
      Merge tag '4.19-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6 · 04faac10
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Three small SMB3 fixes, one for stable"
      
      * tag '4.19-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: update internal module version number for cifs.ko to 2.12
        cifs: check kmalloc before use
        cifs: check if SMB2 PDU size has been padded and suppress the warning
        cifs: create a define for how many iovs we need for an SMB2_open()
      04faac10
    • Linus Torvalds's avatar
      mm/cow: don't bother write protecting already write-protected pages · 1b2de5d0
      Linus Torvalds authored
      This is not normally noticeable, but repeated forks are unnecessarily
      expensive because they repeatedly dirty the parent page tables during
      the page table copy operation.
      
      It's trivial to just avoid write protecting the page table entry if it
      was already not writable.
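
The idea can be sketched with a toy page-table entry. This is an illustration only, not the kernel's copy path: `toy_pte_t`, `TOY_PTE_WRITE` and `toy_cow_wrprotect()` are hypothetical, and the return value merely models whether the parent's page table had to be written to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_pte_t;

#define TOY_PTE_WRITE UINT64_C(0x2)  /* hypothetical "writable" bit */

/*
 * Share one entry copy-on-write between parent and child.
 * Returns 1 if the parent's entry had to be modified, 0 if it was left alone.
 */
int toy_cow_wrprotect(toy_pte_t *parent_pte, toy_pte_t *child_pte)
{
    int modified = 0;

    assert(parent_pte != NULL && child_pte != NULL);

    /* Only touch the parent's entry if it is still writable. */
    if (*parent_pte & TOY_PTE_WRITE) {
        *parent_pte &= ~TOY_PTE_WRITE;
        modified = 1;
    }

    /* The child always receives the write-protected entry. */
    *child_pte = *parent_pte;
    return modified;
}
```

On a second fork of the same parent the entry is already read-only, so the function returns 0 and the parent's page table is not written again, which is the saving the commit describes.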
      
      This patch was inspired by
      
          https://bugzilla.kernel.org/show_bug.cgi?id=200447
      
      which points to an ancient "waste time re-doing fork" issue in the
      presence of lots of signals.
      
      That bug was fixed by Eric Biederman's signal handling series
      culminating in commit c3ad2c3b ("signal: Don't restart fork when
      signals come in"), but the unnecessary work for repeated forks is still
      worth just fixing, particularly since the fix is trivial.
      
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1b2de5d0
    • Colin Ian King's avatar
      hpfs: remove unnecessary checks on the value of r when assigning error code · e0fcfe1f
      Colin Ian King authored
      At the point where r is being checked for different values, r is always
      going to be equal to 2, as the previous if statements jump to end or end1
      if r is not 2.  Hence the assignment to err can be simplified to just
      an assignment without any checks on the value of r.
      
      Detected by CoverityScan, CID#1226737 ("Logically dead code")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e0fcfe1f
    • Jens Axboe's avatar
      libata: maintainership update · 7634ccd2
      Jens Axboe authored
      Tejun Heo wrote:
      >
      > I asked Jens whether he could take care of the libata tree and he
      > thankfully agreed, so, from now on, Jens will be the libata
      > maintainer.
      >
      > Thanks a lot!
      
      Thanks for your work in this area. I still remember the first linux
      storage summit we did in Vancouver 2001, Tejun was invited to talk about
      his libata error handling work. Before that, it was basically a crapshoot
      if we recovered properly or not... A lot of water has flowed under
      the bridge since then!
      
      Here's an "official" patch. Linus, can you apply it?
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7634ccd2
  4. 24 Aug, 2018 5 commits
    • Linus Torvalds's avatar
      Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 05193597
      Linus Torvalds authored
      Pull libata updates from Tejun Heo:
       "Nothing too interesting. Mostly ahci and ahci_platform changes, many
        around power management"
      
      * 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (22 commits)
        ata: ahci_platform: enable to get and control reset
        ata: libahci_platform: add reset control support
        ata: add an extra argument to ahci_platform_get_resources()
        ata: sata_rcar: Add r8a77965 support
        ata: sata_rcar: exclude setting of PHY registers in Gen3
        ata: sata_rcar: really mask all interrupts on Gen2 and later
        Revert "ata: ahci_platform: allow disabling of hotplug to save power"
        ata: libahci: Allow reconfigure of DEVSLP register
        ata: libahci: Correct setting of DEVSLP register
        ata: ahci: Enable DEVSLP by default on x86 with SLP_S0
        ata: ahci: Support state with min power but Partial low power state
        Revert "ata: ahci_platform: convert kcalloc to devm_kcalloc"
        ata: sata_rcar: Add rudimentary Runtime PM support
        ata: sata_rcar: Provide a short-hand for &pdev->dev
        ata: Only output sg element mapped number in verbose debug
        ata: Guard ata_scsi_dump_cdb() by ATA_VERBOSE_DEBUG
        ata: ahci_platform: convert kcalloc to devm_kcalloc
        ata: ahci_platform: convert kzallloc to kcalloc
        ata: ahci_platform: correct parameter documentation for ahci_platform_shutdown
        libata: remove ata_sff_data_xfer_noirq()
        ...
      05193597
    • Linus Torvalds's avatar
      Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 59676610
      Linus Torvalds authored
      Pull cgroup updates from Tejun Heo:
       "Just one commit from Steven to take out spin lock from trace event
        handlers"
      
      * 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup/tracing: Move taking of spin lock out of trace event handlers
      59676610
    • Linus Torvalds's avatar
      Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 9022ada8
      Linus Torvalds authored
      Pull workqueue updates from Tejun Heo:
       "Over the lockdep cross-release churn, workqueue lost some of the
        existing annotations. Johannes Berg restored and also improved
        them"
      
      * 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: re-add lockdep dependencies for flushing
        workqueue: skip lockdep wq dependency in cancel_work_sync()
      9022ada8
    • Linus Torvalds's avatar
      Merge tag 'iommu-updates-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 18b8bfdf
      Linus Torvalds authored
      Pull IOMMU updates from Joerg Roedel:
      
       - PASID table handling updates for the Intel VT-d driver. It implements
         a global PASID space now so that applications using multiple devices
         will just have one PASID.
      
       - A new config option to make iommu passthrough mode the default.
      
       - New sysfs attribute for iommu groups to export the type of the
         default domain.
      
       - A debugfs interface (for debug only) usable by IOMMU drivers to
         export internals to user-space.
      
       - R-Car Gen3 SoCs support for the ipmmu-vmsa driver
      
       - The ARM-SMMU now aborts transactions from unknown devices and devices
         not attached to any domain.
      
       - Various cleanups and smaller fixes all over the place.
      
      * tag 'iommu-updates-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (42 commits)
        iommu/omap: Fix cache flushes on L2 table entries
        iommu: Remove the ->map_sg indirection
        iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel
        iommu/arm-smmu-v3: Prevent any devices access to memory without registration
        iommu/ipmmu-vmsa: Don't register as BUS IOMMU if machine doesn't have IPMMU-VMSA
        iommu/ipmmu-vmsa: Clarify supported platforms
        iommu/ipmmu-vmsa: Fix allocation in atomic context
        iommu: Add config option to set passthrough as default
        iommu: Add sysfs attribute for domain type
        iommu/arm-smmu-v3: sync the OVACKFLG to PRIQ consumer register
        iommu/arm-smmu: Error out only if not enough context interrupts
        iommu/io-pgtable-arm-v7s: Abort allocation when table address overflows the PTE
        iommu/io-pgtable-arm: Fix pgtable allocation in selftest
        iommu/vt-d: Remove the obsolete per iommu pasid tables
        iommu/vt-d: Apply per pci device pasid table in SVA
        iommu/vt-d: Allocate and free pasid table
        iommu/vt-d: Per PCI device pasid table interfaces
        iommu/vt-d: Add for_each_device_domain() helper
        iommu/vt-d: Move device_domain_info to header
        iommu/vt-d: Apply global PASID in SVA
        ...
      18b8bfdf
    • Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux · d972604f
      Linus Torvalds authored
      Pull thermal management updates from Zhang Rui:
      
       - Add Daniel Lezcano as the reviewer of thermal framework and SoC
         driver changes (Daniel Lezcano).
      
       - Fix a bug in intel_dts_soc_thermal driver, which does not translate
         IO-APIC GSI (Global System Interrupt) into Linux irq number (Hans de
         Goede).
      
       - For device tree bindings, allow cooling devices that share the same
         trip point with the same contribution value to share a cooling map
         (Viresh Kumar).
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
        dt-bindings: thermal: Allow multiple devices to share cooling map
        MAINTAINERS: Add Daniel Lezcano as designated reviewer for thermal
        Thermal: Intel SoC DTS: Translate IO-APIC GSI number to linux irq number
      d972604f