1. 14 Sep, 2017 19 commits
    • watchdog/core: Clean up header mess · 3b371b59
      Thomas Gleixner authored
      Having the same #ifdef in various places does not make it more
      readable. Collect stuff into one place.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194147.627096864@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Further simplify sysctl handling · e8b62b2d
      Thomas Gleixner authored
      Use a single function to update sysctl changes. This is not a high
      frequency user space interface and it's root only.
      
      Preparatory patch to cleanup the sysctl variable handling.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194147.549114957@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Get rid of the thread teardown/setup dance · d57108d4
      Thomas Gleixner authored
      The lockup detector reconfiguration tears down all watchdog threads when
      the watchdog is disabled and sets them up again when it's enabled.
      
      That's a pointless exercise. The watchdog threads are not consuming an
      insane amount of resources, so it's enough to set them up at init time and
      keep them in parked position when the watchdog is disabled and unpark them
      when it is reenabled. The smpboot thread infrastructure takes care of
      keeping the force parked threads in place even across cpu hotplug.
      
      Aside from that the code implements the park/unpark facility of smp hotplug
      threads on its own, which is even more pointless. We have functionality in
      the smpboot thread code to do so.
      
      Use the new thread management functions and get rid of the unholy mess.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194147.470370113@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Create new thread handling infrastructure · 2eb2527f
      Thomas Gleixner authored
      The lockup detector reconfiguration tears down all watchdog threads when
      the watchdog is disabled and sets them up again when it's enabled.
      
      That's a pointless exercise. The watchdog threads are not consuming an
      insane amount of resources, so it's enough to set them up at init time and
      keep them in parked position when the watchdog is disabled and unpark them
      when it is reenabled. The smpboot thread infrastructure takes care of
      keeping the force parked threads in place even across cpu hotplug.
      
      Another horrible mechanism is the open coded park/unpark loops which are
      used for reconfiguration of the watchdog. The smpboot infrastructure allows
      exactly the same via smpboot_update_cpumask_thread_percpu(), which is cpu
      hotplug safe. Using that instead of the open coded loops allows getting rid
      of the hotplug locking mess in the watchdog code.
      
      Implement a clean infrastructure which allows replacing the open coded
      nonsense.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194147.377182587@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • smpboot/threads, watchdog/core: Avoid runtime allocation · 0d85923c
      Thomas Gleixner authored
      smpboot_update_cpumask_threads_percpu() allocates a temporary cpumask at
      runtime. This is suboptimal because the call site needs more code size for
      proper error handling than a statically allocated temporary mask requires
      data size.
      
      Add a static temporary cpumask. The function is globally serialized, so no
      further protection is required.
      
      Remove the half-baked error handling in the watchdog code and get rid of
      the export as there are no in-tree modular users of that function.
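
The trade-off described above can be illustrated with a minimal userspace sketch. This is not the kernel code: `struct mask`, `update_mask()` and `NR_CPUS_SKETCH` are hypothetical names, and the pthread mutex merely stands in for whatever global serialization the real function relies on. Because every caller runs under the same lock, a single static scratch buffer is enough and the function has no allocation-failure path.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define NR_CPUS_SKETCH 64	/* hypothetical size, for illustration only */

/* One byte per CPU, holding 0 or 1. A stand-in for a real cpumask. */
struct mask {
	unsigned char bits[NR_CPUS_SKETCH];
};

static pthread_mutex_t update_lock = PTHREAD_MUTEX_INITIALIZER;

/* Scratch space in static storage; only valid while update_lock is held. */
static struct mask tmp_mask;

/*
 * Replace *cur with *want and return how many entries changed.
 * There is no malloc(), so the caller has no out-of-memory case to handle.
 */
int update_mask(struct mask *cur, const struct mask *want)
{
	int changed = 0;

	assert(cur != NULL && want != NULL);
	pthread_mutex_lock(&update_lock);

	/* tmp = cur XOR want: the entries whose state has to flip. */
	for (int i = 0; i < NR_CPUS_SKETCH; i++) {
		tmp_mask.bits[i] = cur->bits[i] ^ want->bits[i];
		if (tmp_mask.bits[i] != 0)
			changed++;
	}
	memcpy(cur, want, sizeof(*cur));

	pthread_mutex_unlock(&update_lock);
	return changed;
}
```

The cost of this design is a fixed amount of data that is always resident, which the message argues is smaller than the code needed to handle an allocation failure properly.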
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194147.297288838@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Split out cpumask write function · 05ba3de7
      Thomas Gleixner authored
      Split the write part of the cpumask proc handler out into a separate helper
      to avoid deep indentation. This also reduces the patch complexity in the
      following cleanups.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194147.218075991@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Clean up the #ifdef maze · 368a7e2c
      Thomas Gleixner authored
      The #ifdef maze in this file is horrible, group stuff at least a bit so one
      can figure out what belongs to what.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194147.139629546@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Clean up stub functions · 2b9d7f23
      Thomas Gleixner authored
      Having stub functions which take a full page is not helping the
      readability of code.
      
      Condense them and move the doubled #ifdef variant into the SYSFS section.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194147.045545271@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Remove the park_in_progress obfuscation · 01f0a027
      Thomas Gleixner authored
      Commit:
      
        b94f5118 ("kernel/watchdog: prevent false hardlockup on overloaded system")
      
      tries to fix the following issue:
      
      proc_write()
         set_sample_period()    <--- New sample period becomes visible
      			  <----- Broken starts
         proc_watchdog_update()
           watchdog_enable_all_cpus()		watchdog_hrtimer_fn()
           update_watchdog_all_cpus()		   restart_timer(sample_period)
              watchdog_park_threads()
      
      					thread->park()
      					  disable_nmi()
      			  <----- Broken ends
      
      The reason why this is broken is that the update of the watchdog threshold
      becomes immediately effective and visible for the hrtimer function which
      uses that value to rearm the timer. But the NMI/perf side still uses the
      old value up to the point where it is disabled. If the rate has been
      lowered then the NMI can run fast enough to 'detect' a hard lockup because
      the timer has not fired due to the longer period.
      
      The patch 'fixed' this by adding a variable:
      
      proc_write()
         set_sample_period()
      					<----- Broken starts
         proc_watchdog_update()
           watchdog_enable_all_cpus()		watchdog_hrtimer_fn()
           update_watchdog_all_cpus()		   restart_timer(sample_period)
               watchdog_park_threads()
      	  park_in_progress = 1
      					<----- Broken ends
      				        nmi_watchdog()
      					  if (park_in_progress)
      					     return;
      
      The only effect of this variable was to make the window where the breakage
      can hit small enough that it was no longer observable in testing. From a
      correctness point of view it is a pointless bandaid which merely papers
      over the root cause: the unsynchronized update of the variable.
      
      Looking deeper into the related code paths unearthed similar problems in
      the watchdog_start()/stop() functions.
      
       watchdog_start()
      	perf_nmi_event_start()
      	hrtimer_start()
      
       watchdog_stop()
      	hrtimer_cancel()
      	perf_nmi_event_stop()
      
      In both cases the call order is wrong because if the task gets preempted
      or the VM gets scheduled out long enough after the first call, then there is
      a chance that the next NMI will see a stale hrtimer interrupt count and
      trigger a false positive hard lockup splat.
      
      Get rid of park_in_progress so the code can be gradually deobfuscated and
      pruned from several layers of duct tape papering over the root cause,
      which has been either ignored or not understood at all.
      
      Once this is removed the underlying problem will be fixed by rewriting the
      proc interface to do a proper synchronized update.
      
      Address the start/stop() ordering problem as well by reverting the call
      order, so this part is at least correct now.
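
The reversed ordering described above can be sketched as a small userspace model. Everything here is hypothetical: the `*_sketch` functions only record the order in which they are called and stand in for the real hrtimer and perf calls. The point is the invariant: the timer that feeds the interrupt count is armed before the NMI checker starts, and the checker is stopped before the timer is cancelled.

```c
#include <assert.h>
#include <stddef.h>

enum step { HRTIMER_START, NMI_START, NMI_STOP, HRTIMER_CANCEL };

#define LOG_MAX 8
static enum step call_log[LOG_MAX];
static size_t call_count;

/* Append one step to the call log. */
static void record(enum step s)
{
	assert(call_count < LOG_MAX);
	call_log[call_count++] = s;
}

/* Hypothetical stand-ins for the real timer and perf calls. */
static void hrtimer_start_sketch(void)   { record(HRTIMER_START); }
static void hrtimer_cancel_sketch(void)  { record(HRTIMER_CANCEL); }
static void nmi_event_start_sketch(void) { record(NMI_START); }
static void nmi_event_stop_sketch(void)  { record(NMI_STOP); }

/* Start: arm the timer first, then the NMI that checks its progress. */
void watchdog_start_sketch(void)
{
	hrtimer_start_sketch();
	nmi_event_start_sketch();
}

/* Stop: silence the checker first, then cancel the timer it depends on. */
void watchdog_stop_sketch(void)
{
	nmi_event_stop_sketch();
	hrtimer_cancel_sketch();
}

/* Accessors so callers can inspect the recorded order. */
size_t call_log_len(void)       { return call_count; }
enum step call_log_at(size_t i) { assert(i < call_count); return call_log[i]; }
```

With this order a delay between the two calls, for example through preemption, leaves at worst a running timer without a checker, never a checker looking at a timer that is not running.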
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1709052038270.2393@nanos
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/hardlockup/perf: Prevent CPU hotplug deadlock · 941154bd
      Thomas Gleixner authored
      The following deadlock is possible in the watchdog hotplug code:
      
        cpus_write_lock()
          ...
            takedown_cpu()
              smpboot_park_threads()
                smpboot_park_thread()
                  kthread_park()
                    ->park() := watchdog_disable()
                      watchdog_nmi_disable()
                        perf_event_release_kernel();
                          put_event()
                            _free_event()
                              ->destroy() := hw_perf_event_destroy()
                                x86_release_hardware()
                                  release_ds_buffers()
                                    get_online_cpus()
      
      when a per cpu watchdog perf event is destroyed which drops the last
      reference to the PMU hardware. The cleanup code there invokes
      get_online_cpus() which instantly deadlocks because the hotplug percpu
      rwsem is write locked.
      
      To solve this add a deferring mechanism:
      
        cpus_write_lock()
      			   kthread_park()
      			    watchdog_nmi_disable(deferred)
      			      perf_event_disable(event);
      			      move_event_to_deferred(event);
      			   ....
        cpus_write_unlock()
        cleanup_deferred_events()
          perf_event_release_kernel()
      
      This is still properly serialized against concurrent hotplug via the
      cpu_add_remove_lock, which is held by the task which initiated the hotplug
      event.
      
      This is also used to handle event destruction when the watchdog threads are
      parked via other mechanisms than CPU hotplug.
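
The deferring mechanism outlined above can be modelled in userspace as follows. This is a sketch under stated assumptions, not the kernel implementation: `struct event`, the `*_sketch` functions and the boolean that models the write-held hotplug lock are all hypothetical. The disable path only stops the event and queues it; the release, which would need the hotplug lock for reading, runs after the write lock has been dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical, for illustration only */

struct event {
	bool enabled;
	bool released;
};

static struct event events[NR_CPUS_SKETCH];
static struct event *deferred[NR_CPUS_SKETCH];
static size_t nr_deferred;
static bool hotplug_write_locked;	/* models cpus_write_lock() being held */

void hotplug_write_lock_sketch(void)   { hotplug_write_locked = true; }
void hotplug_write_unlock_sketch(void) { hotplug_write_locked = false; }

/* The release path would take the hotplug lock for reading, so it must
 * never run while the write lock is held. Model that as a precondition. */
static void event_release_sketch(struct event *ev)
{
	assert(!hotplug_write_locked);
	ev->released = true;
}

/* Park path: may run with the hotplug lock write-held. */
void watchdog_nmi_disable_sketch(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	assert(nr_deferred < NR_CPUS_SKETCH);

	events[cpu].enabled = false;		/* stop the event right away */
	deferred[nr_deferred++] = &events[cpu];	/* postpone the release */
}

/* Runs after the write lock is dropped; returns the number released. */
size_t cleanup_deferred_events_sketch(void)
{
	size_t n = nr_deferred;

	for (size_t i = 0; i < n; i++)
		event_release_sketch(deferred[i]);
	nr_deferred = 0;
	return n;
}

bool event_is_released(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	return events[cpu].released;
}
```

Splitting "disable" from "release" is what breaks the lock dependency: the part that must happen immediately needs no hotplug lock, and the part that needs it can wait.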
      Analyzed-by: Peter Zijlstra <peterz@infradead.org>
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194146.884469246@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/hardlockup/perf: Remove broken self disable on failure · 20d853fd
      Thomas Gleixner authored
      The self disabling feature is broken vs. CPU hotplug locking:
      
      CPU 0			   CPU 1
      cpus_write_lock();
       cpu_up(1)
         wait_for_completion()
      			   ....
      			   unpark_watchdog()
      			   ->unpark()
      			     perf_event_create() <- fails
      			       watchdog_enable &= ~NMI_WATCHDOG;
      			   ....
      cpus_write_unlock();
      			   CPU 2
      cpus_write_lock()
       cpu_down(2)
         wait_for_completion()
      			   wakeup(watchdog);
      			     watchdog()
      			     if (!(watchdog_enable & NMI_WATCHDOG))
      				watchdog_nmi_disable()
      				  perf_event_disable()
      				  ....
      				  cpus_read_lock();
      
      			   stop_smpboot_threads()
      			     park_watchdog();
      			       wait_for_completion(watchdog->parked);
      
      Result: End of hotplug and instantaneous full lockup of the machine.
      
      There is a similar problem with disabling the watchdog via the user space
      interface as the sysctl function fiddles with watchdog_enable directly.
      
      It's very debatable whether this is required at all. Consider a watchdog
      which works nicely on N CPUs and fails to enable on the N + 1 CPU, either
      during hotplug or because the user space interface disabled it via the
      sysctl cpumask and some perf user then grabbed the counter, which is
      unavailable for the watchdog when the sysctl cpumask gets changed back.
      
      There is no real justification for this.
      
      One of the reasons WHY this is done is the utter stupidity of the init code
      of the perf NMI watchdog. Instead of checking upfront at boot whether PERF
      is available and functional at all, it just does this check at run time
      over and over when user space fiddles with the sysctl. That's broken beyond
      repair along with the idiotic error code dependent warn level printks and
      the even more silly printk rate limiting.
      
      If the init code checks whether perf works at boot time, then this mess can
      be more or less avoided completely. Perf does not come magically into life
      at runtime. Brain usage while coding is overrated.
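
The "check once at boot" idea can be sketched as a probe whose result is cached. This is a minimal userspace illustration, not the kernel code: `probe_perf_sketch()` is a hypothetical stand-in that always succeeds, whereas a real probe would try to create a perf event, and the other names are likewise invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

static int probe_calls;		/* how often the probe actually ran */
static bool perf_available;	/* cached result of the init-time probe */
static bool probed;

/* Hypothetical probe. A real one would attempt to create a perf event. */
static bool probe_perf_sketch(void)
{
	probe_calls++;
	return true;
}

/* Init path: probe exactly once and remember the answer. */
void watchdog_init_sketch(void)
{
	perf_available = probe_perf_sketch();
	probed = true;
}

/* Runtime (sysctl) path: consult the cached result, never re-probe.
 * Returns 0 on success and -1 if perf was found unavailable at init. */
int watchdog_enable_sketch(void)
{
	assert(probed);
	return perf_available ? 0 : -1;
}

int probe_call_count(void)
{
	return probe_calls;
}
```

However often user space toggles the setting, the probe runs once, so the runtime path has no failure mode that depends on repeating the availability check.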
      
      Remove the cruft and add a temporary safeguard which gets removed later.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194146.806708429@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Mark hardlockup_detector_disable() __init · 7a355820
      Thomas Gleixner authored
      The function is only used by the KVM init code. Mark it __init to prevent
      creative abuse.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194146.727134632@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Rename watchdog_proc_mutex · 946d1977
      Thomas Gleixner authored
      Following patches will use the mutex for other purposes as well. Rename it
      as it is no longer a proc specific thing.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194146.647714850@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Rework CPU hotplug locking · b7a34981
      Thomas Gleixner authored
      The watchdog proc interface causes extensive recursive locking of the CPU
      hotplug percpu rwsem, which is deadlock prone.
      
      Replace the get/put_online_cpus() pairs with cpu_hotplug_disable()/enable()
      calls for now. Later patches will remove that requirement completely.
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194146.568079057@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Remove broken suspend/resume interfaces · 5490125d
      Thomas Gleixner authored
      This interface has several issues:
      
       - It's causing recursive locking of the hotplug lock.
      
       - It's complete overkill to tear down all threads and then recreate them.
      
      The same can be achieved with the simple hardlockup_detector_perf_stop /
      restart() interfaces. The abuse from the busy looping poweroff() loop of
      PARISC has been solved as well.
      
      Remove the cruft.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194146.487537732@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • parisc, watchdog/core: Use lockup_detector_stop() · 47bb4baf
      Thomas Gleixner authored
      The broken lockup_detector_suspend/resume() interface is going away. Use
      the new lockup_detector_soft_poweroff() interface to stop the watchdog from
      the busy looping power off routine.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: linux-parisc@vger.kernel.org
      Link: http://lkml.kernel.org/r/20170912194146.407385557@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/core: Provide interface to stop from poweroff() · 6554fd8c
      Thomas Gleixner authored
      PARISC has a busy looping power off routine. If the watchdog is enabled
      the watchdog timer will still fire, but the thread is not running, which
      causes the softlockup watchdog to trigger.
      
      Provide an interface which allows turning the watchdog off.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: linux-parisc@vger.kernel.org
      Link: http://lkml.kernel.org/r/20170912194146.327343752@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel, watchdog/core: Sanitize PMU HT bug workaround · 2406e3b1
      Peter Zijlstra authored
      The lockup_detector_suspend/resume() interface is broken in several ways
      especially as it results in recursive locking of the CPU hotplug lock.
      
      Use the new stop/restart interface in the perf NMI watchdog to temporarily
      disable and reenable the already active watchdog events. That's enough to
      handle it.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194146.247141871@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • watchdog/hardlockup: Provide interface to stop/restart perf events · d0b6e0a8
      Peter Zijlstra authored
      Provide an interface to stop and restart perf NMI watchdog events on all
      CPUs. This is only usable during init and especially for handling the perf
      HT bug on Intel machines. It's safe to use it this way as nothing can
      start/stop the NMI watchdog in parallel.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194146.167649596@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 13 Sep, 2017 21 commits
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 46c1e79f
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "A handful of tooling fixes"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf stat: Wait for the correct child
        perf tools: Support running perf binaries with a dash in their name
        perf config: Check not only section->from_system_config but also item's
        perf ui progress: Fix progress update
        perf ui progress: Make sure we always define step value
        perf tools: Open perf.data with O_CLOEXEC flag
        tools lib api: Fix make DEBUG=1 build
        perf tests: Fix compile when libunwind's unwind.h is available
        tools include linux: Guard against redefinition of some macros
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ec846ecd
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
       "Three CPU hotplug related fixes and a debugging improvement"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/debug: Add debugfs knob for "sched_debug"
        sched/core: WARN() when migrating to an offline CPU
        sched/fair: Plug hole between hotplug and active_load_balance()
        sched/fair: Avoid newidle balance for !active CPUs
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b5df1b3a
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "The main changes are the PCID fixes from Andy, but there's also two
        hyperv fixes and two paravirt updates"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/hyper-v: Remove duplicated HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED definition
        x86/hyper-V: Allocate the IDT entry early in boot
        paravirt: Switch maintainer
        x86/paravirt: Remove no longer used paravirt functions
        x86/mm/64: Initialize CR4.PCIDE early
        x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3
        x86/mm: Get rid of VM_BUG_ON in switch_tlb_irqs_off()
    • Merge tag 'openrisc-for-linus' of git://github.com/openrisc/linux · 9888e4d4
      Linus Torvalds authored
      Pull OpenRISC fixlet from Stafford Horne:
       "Fix warning for upcoming work to remove linux/vmalloc.h from
        asm-generic/io.h"
      
      * tag 'openrisc-for-linus' of git://github.com/openrisc/linux:
        openrisc: add forward declaration for struct vm_area_struct
    • Merge tag 'modules-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux · 4791bccc
      Linus Torvalds authored
      Pull modules updates from Jessica Yu:
       "Summary of modules changes for the 4.14 merge window:
      
         - minor code cleanups and fixes
      
         - modpost: avoid building modules that have names that exceed the
           size of the name field in struct module"
      
      * tag 'modules-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
        module: Remove const attribute from alias for MODULE_DEVICE_TABLE
        module: fix ddebug_remove_module()
        modpost: abort if module name is too long
    • Fix up MAINTAINERS file sorting · 3882a734
      Linus Torvalds authored
      Another merge window, another MAINTAINERS file disaster.
      
      People have serious problems with the alphabet and sorting, and poor
      Jérôme Glisse and Radim Krčmář get their names mangled by locale issues,
      turning them into some mangled mess (probably others do too, but those
      two stood out when sorting things again).
      
      And we now have two copies of the same 'AS3645A LED FLASH CONTROLLER
      DRIVER' in the tree and in the MAINTAINERS file, but that's a separate
      issue - the duplication is real, and I left them as two entries for the
      same name.
      
      This does not try to sort the actual section pattern entries, although I
      may end up doing that later.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3882a734
    • Linus Torvalds's avatar
      Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · f60a2abf
      Linus Torvalds authored
      Pull clk updates from Stephen Boyd:
       "The diff is dominated by the Allwinner A10/A20 SoCs getting converted
        to the sunxi-ng framework. Otherwise, the heavy hitters are various
        drivers for SoCs like AT91, Amlogic, Renesas, and Rockchip. There are
        some other new clk drivers in here too but overall this is just a
        bunch of clk drivers for various different pieces of hardware and a
        collection of non-critical fixes for clk drivers.
      
        New Drivers:
         - Allwinner R40 SoCs
         - Renesas R-Car Gen3 USB 2.0 clock selector PHY
         - Atmel AT91 audio PLL
         - Uniphier PXs3 SoCs
         - ARC HSDK Board PLLs
         - AXS10X Board PLLs
         - STMicroelectronics STM32H743 SoCs
      
        Removed Drivers:
         - Non-compiling mb86s7x support
      
        Updates:
         - Allwinner A10/A20 SoCs converted to sunxi-ng framework
         - Allwinner H3 CPU clk fixes
         - Renesas R-Car D3 SoC
         - Renesas V2H and M3-W modules
         - Samsung Exynos5420/5422/5800 audio fixes
         - Rockchip fractional clk approximation fixes
         - Rockchip rk3126 SoC support within the rk3128 driver
         - Amlogic gxbb CEC32 and sd_emmc clks
         - Amlogic meson8b reset controller support
         - IDT VersaClock 5P49V5925/5P49V6901 support
         - Qualcomm MSM8996 SMMU clks
         - Various 'const' applications for struct clk_ops
         - si5351 PLL reset bugfix
         - Uniphier audio on LD11/LD20 and ethernet support on LD11/LD20/Pro4/PXs2
         - Assorted Tegra clk driver fixes"
      
      * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (120 commits)
        clk: si5351: fix PLL reset
        ASoC: atmel-classd: remove aclk clock
        ASoC: atmel-classd: remove aclk clock from DT binding
        clk: at91: clk-generated: make gclk determine audio_pll rate
        clk: at91: clk-generated: create function to find best_diff
        clk: at91: add audio pll clock drivers
        dt-bindings: clk: at91: add audio plls to the compatible list
        clk: at91: clk-generated: remove useless divisor loop
        clk: mb86s7x: Drop non-building driver
        clk: ti: check for null return in strrchr to avoid null dereferencing
        clk: Don't write error code into divider register
        clk: uniphier: add video input subsystem clock
        clk: uniphier: add audio system clock
        clk: stm32h7: Add stm32h743 clock driver
        clk: gate: expose clk_gate_ops::is_enabled
        clk: nxp: clk-lpc32xx: rename clk_gate_is_enabled()
        clk: uniphier: add PXs3 clock data
        clk: hi6220: change watchdog clock source
        clk: Kconfig: Name RK805 in Kconfig for COMMON_CLK_RK808
        clk: cs2000: Add cs2000_set_saved_rate
        ...
      f60a2abf
    • Linus Torvalds's avatar
      Merge tag 'rtc-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux · 561a8eb3
      Linus Torvalds authored
      Pull RTC updates from Alexandre Belloni:
       "Subsystem:
         - remove .open() and .release() RTC ops
         - constify i2c_device_id
      
        New driver:
         - Realtek RTD1295
         - Android emulator (goldfish) RTC
      
        Drivers:
         - ds1307: Beginning of a huge cleanup
         - s35390a: handle invalid RTC time
         - sun6i: external oscillator gate support"
      
      * tag 'rtc-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (40 commits)
        rtc: ds1307: use octal permissions
        rtc: ds1307: fix braces
        rtc: ds1307: fix alignments and blank lines
        rtc: ds1307: use BIT
        rtc: ds1307: use u32
        rtc: ds1307: use sizeof
        rtc: ds1307: remove regs member
        rtc: Add Realtek RTD1295
        dt-bindings: rtc: Add Realtek RTD1295
        rtc: sun6i: Add support for the external oscillator gate
        rtc: goldfish: Add RTC driver for Android emulator
        dt-bindings: Add device tree binding for Goldfish RTC driver
        rtc: ds1307: add basic support for ds1341 chip
        rtc: ds1307: remove member nvram_offset from struct ds1307
        rtc: ds1307: factor out offset to struct chip_desc
        rtc: ds1307: factor out rtc_ops to struct chip_desc
        rtc: ds1307: factor out irq_handler to struct chip_desc
        rtc: ds1307: improve irq setup
        rtc: ds1307: constify struct chip_desc variables
        rtc: ds1307: improve trickle charger initialization
        ...
      561a8eb3
    • Linus Torvalds's avatar
      Merge tag 'sound-fix-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 2818d0d7
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Most of the commits are trivial cleanup patches, while one commit is a
        significant fix for the race at ALSA sequencer that was spotted by
        syzkaller"
      
      * tag 'sound-fix-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: seq: Cancel pending autoload work at unbinding device
        ALSA: firewire: Use common error handling code in snd_motu_stream_start_duplex()
        ALSA: asihpi: Kill BUG_ON() usages
        ALSA: core: Use %pS printk format for direct addresses
        ALSA: ymfpci: Use common error handling code in snd_ymfpci_create()
        ALSA: ymfpci: Use common error handling code in snd_card_ymfpci_probe()
        ALSA: 6fire: Use common error handling code in usb6fire_chip_probe()
        ALSA: usx2y: Use common error handling code in submit_urbs()
        ALSA: us122l: Use common error handling code in us122l_create_card()
        ALSA: hdspm: Use common error handling code in snd_hdspm_probe()
        ALSA: rme9652: Use common code in hdsp_get_iobox_version()
        ALSA: maestro3: Use common error handling code in two functions
      2818d0d7
    • Linus Torvalds's avatar
      Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · cc4238bd
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "A tiny update: one patch corrects a Kconfig problem with the shift of
        the SAS SMP code to BSG and the other removes a vestige of user space
        target mode"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: scsi_transport_sas: select BLK_DEV_BSGLIB
        scsi: Remove Scsi_Host.uspace_req_q
      cc4238bd
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 80a0d644
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Small collection of fixes that would be nice to have in -rc1. This
        contains:
      
         - NVMe pull request from Christoph, mostly with fixes for nvme-pci,
           host memory buffer in particular.
      
         - Error handling fixup for cgwb_create(), in case allocation of 'wb'
           fails. From Christophe Jaillet.
      
         - Ensure that trace_block_getrq() gets the 'dev' in an appropriate
           fashion, to avoid a potential NULL deref. From Greg Thelen.
      
         - Regression fix for dm-mq with blk-mq, fixing a problem with
           stacking IO schedulers. From me.
      
         - string.h fixup, fixing an issue with memcpy_and_pad(). This
           original change came in through an NVMe dependency, which is why
           I'm including it here. From Martin Wilck.
      
         - Fix potential int overflow in __blkdev_sectors_to_bio_pages(), from
           Mikulas.
      
         - MBR enable fix for sed-opal, from Scott"
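
The summary above does not say how the overflow in __blkdev_sectors_to_bio_pages() was fixed, so the following is only a generic sketch of how a sectors-to-pages conversion can overflow and one way to avoid it. All names and constants here (pages_via_bytes, pages_via_division, the *_SKETCH macros) are hypothetical and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants only; assumed values, not the kernel's definitions. */
#define SECTOR_SIZE_SKETCH 512u
#define PAGE_SIZE_SKETCH   4096u
#define MAX_PAGES_SKETCH   256u

static_assert(PAGE_SIZE_SKETCH % SECTOR_SIZE_SKETCH == 0,
              "a page holds a whole number of sectors");

/*
 * Overflow-prone variant: converts the sector count to bytes first.
 * The left shift by 9 wraps around once nr_sects reaches 2^55.
 */
static unsigned int pages_via_bytes(uint64_t nr_sects)
{
    uint64_t bytes = (nr_sects << 9) + PAGE_SIZE_SKETCH - 1;
    uint64_t pages = bytes / PAGE_SIZE_SKETCH;

    return pages < MAX_PAGES_SKETCH ? (unsigned int)pages : MAX_PAGES_SKETCH;
}

/*
 * Safer variant: divides by sectors-per-page and rounds up, so no
 * intermediate value is ever larger than the input.
 */
static unsigned int pages_via_division(uint64_t nr_sects)
{
    uint64_t sects_per_page = PAGE_SIZE_SKETCH / SECTOR_SIZE_SKETCH;
    uint64_t pages = nr_sects / sects_per_page +
                     (nr_sects % sects_per_page != 0);

    return pages < MAX_PAGES_SKETCH ? (unsigned int)pages : MAX_PAGES_SKETCH;
}
```

With a very large sector count the first variant wraps and reports too few pages, while the second clamps to the maximum as intended.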
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        block: directly insert blk-mq request from blk_insert_cloned_request()
        mm/backing-dev.c: fix an error handling path in 'cgwb_create()'
        string.h: un-fortify memcpy_and_pad
        nvme-pci: implement the HMB entry number and size limitations
        nvme-pci: propagate (some) errors from host memory buffer setup
        nvme-pci: use appropriate initial chunk size for HMB allocation
        nvme-pci: fix host memory buffer allocation fallback
        nvme: fix lightnvm check
        block: fix integer overflow in __blkdev_sectors_to_bio_pages()
        block: sed-opal: Set MBRDone on S3 resume path if TPER is MBREnabled
        block: tolerate tracing of NULL bio
      80a0d644
    • Linus Torvalds's avatar
      Merge tag 'docs-4.14' of git://git.lwn.net/linux · 20e52ee5
      Linus Torvalds authored
      Pull documentation fixes from Jonathan Corbet:
       "A cleanup from Mauro that needed to wait for the media pull, plus a
        handful of other fixes that wandered in"
      
      * tag 'docs-4.14' of git://git.lwn.net/linux:
        kokr/memory-barriers.txt: Apply atomic_t.txt change
        kokr/doc: Update memory-barriers.txt for read-to-write dependencies
        docs-rst: don't require adjustbox anymore
        docs-rst: conf.py: only setup notice box colors if Sphinx < 1.6
        docs-rst: conf.py: remove lscape from LaTeX preamble
      20e52ee5
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · e7989f97
      Linus Torvalds authored
      Pull fuse updates from Miklos Szeredi:
       "This fixes a regression (spotted by the Sandstorm.io folks) in the pid
        namespace handling introduced in 4.12.
      
        There's also a fix for honoring sync/dsync flags for pwritev2()"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: getattr cleanup
        fuse: honor iocb sync flags on write
        fuse: allow server to run in different pid_ns
      e7989f97
    • Linus Torvalds's avatar
      Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · c353f88f
      Linus Torvalds authored
      Pull overlayfs updates from Miklos Szeredi:
       "This fixes d_ino correctness in readdir, which brings overlayfs on par
        with normal filesystems regarding inode number semantics, as long as
        all layers are on the same filesystem.
      
        There are also some bug fixes, one in particular (random ioctl's
        shouldn't be able to modify lower layers) that touches some vfs code,
        but of course no-op for non-overlay fs"
      
      * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: fix false positive ESTALE on lookup
        ovl: don't allow writing ioctl on lower layer
        ovl: fix relatime for directories
        vfs: add flags to d_real()
        ovl: cleanup d_real for negative
        ovl: constant d_ino for non-merge dirs
        ovl: constant d_ino across copy up
        ovl: fix readdir error value
        ovl: check snprintf return
      c353f88f
    • Vitaly Kuznetsov's avatar
      x86/hyper-v: Remove duplicated HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED definition · 1278f58c
      Vitaly Kuznetsov authored
      Commits:
      
        7dcf90e9 ("PCI: hv: Use vPCI protocol version 1.2")
        628f54cc ("x86/hyper-v: Support extended CPU ranges for TLB flush hypercalls")
      
      added the same definition and they came in through different trees.
      Fix the duplication.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: devel@linuxdriverproject.org
      Link: http://lkml.kernel.org/r/20170911150620.3998-1-vkuznets@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1278f58c
    • K. Y. Srinivasan's avatar
      x86/hyper-V: Allocate the IDT entry early in boot · 213ff44a
      K. Y. Srinivasan authored
      Allocate the hypervisor callback IDT entry early in the boot sequence.
      
      The previous code would allocate the entry as part of registering the handler
      when the vmbus driver loaded, and this caused a problem for the IDT cleanup
      that Thomas is working on for v4.15.
      Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: apw@canonical.com
      Cc: devel@linuxdriverproject.org
      Cc: gregkh@linuxfoundation.org
      Cc: jasowang@redhat.com
      Cc: olaf@aepfle.de
      Link: http://lkml.kernel.org/r/20170908231557.2419-1-kys@exchange.microsoft.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      213ff44a
    • Juergen Gross's avatar
      paravirt: Switch maintainer · 30c1bbff
      Juergen Gross authored
      Jeremy Fitzhardinge is stepping down as a paravirt maintainer. I'll
      replace him.
      
      While at it, update the file list to the actual pattern.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akataria@vmware.com
      Cc: chrisw@sous-sol.org
      Cc: jeremy@goop.org
      Cc: rusty@rustcorp.com.au
      Cc: virtualization@lists.linux-foundation.org
      Link: http://lkml.kernel.org/r/20170905143407.9227-1-jgross@suse.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      30c1bbff
    • Juergen Gross's avatar
      x86/paravirt: Remove no longer used paravirt functions · 87930019
      Juergen Gross authored
      With removal of lguest some of the paravirt functions are no longer
      needed:
      
      	->read_cr4()
      	->store_idt()
      	->set_pmd_at()
      	->set_pud_at()
      	->pte_update()
      
      Remove them.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akataria@vmware.com
      Cc: boris.ostrovsky@oracle.com
      Cc: chrisw@sous-sol.org
      Cc: jeremy@goop.org
      Cc: rusty@rustcorp.com.au
      Cc: virtualization@lists.linux-foundation.org
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/20170904102527.25409-1-jgross@suse.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      87930019
    • Andy Lutomirski's avatar
      x86/mm/64: Initialize CR4.PCIDE early · c7ad5ad2
      Andy Lutomirski authored
      cpu_init() is weird: it's called rather late (after early
      identification and after most MMU state is initialized) on the boot
      CPU but is called extremely early (before identification) on secondary
      CPUs.  It's called just late enough on the boot CPU that its CR4 value
      isn't propagated to mmu_cr4_features.
      
      Even if we put CR4.PCIDE into mmu_cr4_features, we'd hit two
      problems.  First, we'd crash in the trampoline code.  That's
      fixable, and I tried that.  It turns out that mmu_cr4_features is
      totally ignored by secondary_start_64(), though, so even with the
      trampoline code fixed, it wouldn't help.
      
      This means that we don't currently have CR4.PCIDE reliably initialized
      before we start playing with cpu_tlbstate.  This is very fragile and
      tends to cause boot failures if I make even small changes to the TLB
      handling code.
      
      Make it more robust: initialize CR4.PCIDE earlier on the boot CPU
      and propagate it to secondary CPUs in start_secondary().
      
      ( Yes, this is ugly.  I think we should have improved mmu_cr4_features
        to actually control CR4 during secondary bootup, but that would be
        fairly intrusive at this stage. )
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Reported-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Tested-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Fixes: 660da7c9 ("x86/mm: Enable CR4.PCIDE on supported systems")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c7ad5ad2
    • Andy Lutomirski's avatar
      x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3 · f34902c5
      Andy Lutomirski authored
      Jiri reported a resume-from-hibernation failure triggered by PCID.
      The root cause appears to be rather odd.  The hibernation asm
      restores a CR3 value that comes from the image header.  If the image
      kernel has PCID on, it's entirely reasonable for this CR3 value to
      have one of the low 12 bits set.  The restore code restores it with
      CR4.PCIDE=0, which means that those low 12 bits are accepted by the
      CPU but are either ignored or interpreted as a caching mode.  This
      is odd, but still works.  We blow up later when the image kernel
      restores CR4, though, since changing CR4.PCIDE with CR3[11:0] != 0
      is illegal.  Boom!
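
The masking described in the subject line can be sketched in plain C as follows. This is a minimal user-space illustration, assuming only what the message states (the PCID lives in CR3[11:0]); the macro and helper names are hypothetical and are not claimed to match the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* CR3[11:0] carries the PCID when CR4.PCIDE=1. Hypothetical name. */
#define CR3_PCID_MASK_SKETCH UINT64_C(0xFFF)

static_assert(CR3_PCID_MASK_SKETCH == (UINT64_C(1) << 12) - 1,
              "the PCID field is the low 12 bits of CR3");

/*
 * Hypothetical helper: return the CR3 value with the PCID bits cleared,
 * so that it can be loaded while CR4.PCIDE=0 and CR4.PCIDE can later be
 * turned on without CR3[11:0] being non-zero.
 */
static inline uint64_t cr3_without_pcid(uint64_t cr3)
{
    return cr3 & ~CR3_PCID_MASK_SKETCH;
}
```

Clearing the bits in the saved value means the page-table base address is preserved while the condition that makes the later CR4 write illegal never arises.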
      
      FWIW, it's entirely unclear to me what's supposed to happen if a PAE
      kernel restores a non-PAE image or vice versa.  Ditto for LA57.
      Reported-by: Jiri Kosina <jikos@kernel.org>
      Tested-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 660da7c9 ("x86/mm: Enable CR4.PCIDE on supported systems")
      Link: http://lkml.kernel.org/r/18ca57090651a6341e97083883f9e814c4f14684.1504847163.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f34902c5
    • Andy Lutomirski's avatar
      x86/mm: Get rid of VM_BUG_ON in switch_tlb_irqs_off() · a376e7f9
      Andy Lutomirski authored
      If we hit the VM_BUG_ON(), we're detecting a genuinely bad situation,
      but we're very unlikely to get a useful call trace.
      
      Make it a warning instead.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/3b4e06bbb382ca54a93218407c93925ff5871546.1504847163.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a376e7f9