1. 17 Jun, 2019 7 commits
    • Daniel Bristot de Oliveira's avatar
      x86/jump_label: Batch jump label updates · ba54f0c3
      Daniel Bristot de Oliveira authored
      Currently, the jump label of a static key is transformed via the arch
      specific function:
      
          void arch_jump_label_transform(struct jump_entry *entry,
                                         enum jump_label_type type)
      
      The new approach (batch mode) uses two arch functions, the first has the
      same arguments of the arch_jump_label_transform(), and is the function:
      
          bool arch_jump_label_transform_queue(struct jump_entry *entry,
                                               enum jump_label_type type)
      
      Rather than transforming the code, it adds the jump_entry in a queue of
      entries to be updated. This functions returns true in the case of a
      successful enqueue of an entry. If it returns false, the caller must to
      apply the queue and then try to queue again, for instance, because the
      queue is full.
      
      This function expects the caller to sort the entries by the address before
      enqueueuing then. This is already done by the arch independent code, though.
      
      After queuing all jump_entries, the function:
      
          void arch_jump_label_transform_apply(void)
      
      Applies the changes in the queue.
      Signed-off-by: default avatarDaniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris von Recklinghausen <crecklin@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott Wood <swood@redhat.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/57b4caa654bad7e3b066301c9a9ae233dea065b5.1560325897.git.bristot@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ba54f0c3
    • Daniel Bristot de Oliveira's avatar
      jump_label: Batch updates if arch supports it · c2ba8a15
      Daniel Bristot de Oliveira authored
      If the architecture supports the batching of jump label updates, use it!
      
      An easy way to see the benefits of this patch is switching the
      schedstats on and off. For instance:
      
      -------------------------- %< ----------------------------
        #!/bin/sh
        while [ true ]; do
            sysctl -w kernel.sched_schedstats=1
            sleep 2
            sysctl -w kernel.sched_schedstats=0
            sleep 2
        done
      -------------------------- >% ----------------------------
      
      while watching the IPI count:
      
      -------------------------- %< ----------------------------
        # watch -n1 "cat /proc/interrupts | grep Function"
      -------------------------- >% ----------------------------
      
      With the current mode, it is possible to see +- 168 IPIs each 2 seconds,
      while with this patch the number of IPIs goes to 3 each 2 seconds.
      
      Regarding the performance impact of this patch set, I made two measurements:
      
          The time to update a key (the task that is causing the change)
          The time to run the int3 handler (the side effect on a thread that
                                            hits the code being changed)
      
      The schedstats static key was chosen as the key to being switched on and off.
      The reason being is that it is used in more than 56 places, in a hot path. The
      change in the schedstats static key will be done with the following command:
      
      while [ true ]; do
          sysctl -w kernel.sched_schedstats=1
          usleep 500000
          sysctl -w kernel.sched_schedstats=0
          usleep 500000
      done
      
      In this way, they key will be updated twice per second. To force the hit of the
      int3 handler, the system will also run a kernel compilation with two jobs per
      CPU. The test machine is a two nodes/24 CPUs box with an Intel Xeon processor
      @2.27GHz.
      
      Regarding the update part, on average, the regular kernel takes 57 ms to update
      the schedstats key, while the kernel with the batch updates takes just 1.4 ms
      on average. Although it seems to be too good to be true, it makes sense: the
      schedstats key is used in 56 places, so it was expected that it would take
      around 56 times to update the keys with the current implementation, as the
      IPIs are the most expensive part of the update.
      
      Regarding the int3 handler, the non-batch handler takes 45 ns on average, while
      the batch version takes around 180 ns. At first glance, it seems to be a high
      value. But it is not, considering that it is doing 56 updates, rather than one!
      It is taking four times more, only. This gain is possible because the patch
      uses a binary search in the vector: log2(56)=5.8. So, it was expected to have
      an overhead within four times.
      
      (voice of tv propaganda) But, that is not all! As the int3 handler keeps on for
      a shorter period (because the update part is on for a shorter time), the number
      of hits in the int3 handler decreased by 10%.
      
      The question then is: Is it worth paying the price of "135 ns" more in the int3
      handler?
      
      Considering that, in this test case, we are saving the handling of 53 IPIs,
      that takes more than these 135 ns, it seems to be a meager price to be paid.
      Moreover, the test case was forcing the hit of the int3, in practice, it
      does not take that often. While the IPI takes place on all CPUs, hitting
      the int3 handler or not!
      
      For instance, in an isolated CPU with a process running in user-space
      (nohz_full use-case), the chances of hitting the int3 handler is barely zero,
      while there is no way to avoid the IPIs. By bounding the IPIs, we are improving
      a lot this scenario.
      Signed-off-by: default avatarDaniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris von Recklinghausen <crecklin@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott Wood <swood@redhat.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/acc891dbc2dbc9fd616dd680529a2337b1d1274c.1560325897.git.bristot@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      c2ba8a15
    • Daniel Bristot de Oliveira's avatar
      x86/alternative: Batch of patch operations · c0213b0a
      Daniel Bristot de Oliveira authored
      Currently, the patch of an address is done in three steps:
      
      -- Pseudo-code #1 - Current implementation ---
      
              1) add an int3 trap to the address that will be patched
                  sync cores (send IPI to all other CPUs)
              2) update all but the first byte of the patched range
                  sync cores (send IPI to all other CPUs)
              3) replace the first byte (int3) by the first byte of replacing opcode
                  sync cores (send IPI to all other CPUs)
      
      -- Pseudo-code #1 ---
      
      When a static key has more than one entry, these steps are called once for
      each entry. The number of IPIs then is linear with regard to the number 'n' of
      entries of a key: O(n*3), which is O(n).
      
      This algorithm works fine for the update of a single key. But we think
      it is possible to optimize the case in which a static key has more than
      one entry. For instance, the sched_schedstats jump label has 56 entries
      in my (updated) fedora kernel, resulting in 168 IPIs for each CPU in
      which the thread that is enabling the key is _not_ running.
      
      With this patch, rather than receiving a single patch to be processed, a vector
      of patches is passed, enabling the rewrite of the pseudo-code #1 in this
      way:
      
      -- Pseudo-code #2 - This patch  ---
      1)  for each patch in the vector:
              add an int3 trap to the address that will be patched
      
          sync cores (send IPI to all other CPUs)
      
      2)  for each patch in the vector:
              update all but the first byte of the patched range
      
          sync cores (send IPI to all other CPUs)
      
      3)  for each patch in the vector:
              replace the first byte (int3) by the first byte of replacing opcode
      
          sync cores (send IPI to all other CPUs)
      -- Pseudo-code #2 - This patch  ---
      
      Doing the update in this way, the number of IPI becomes O(3) with regard
      to the number of keys, which is O(1).
      
      The batch mode is done with the function text_poke_bp_batch(), that receives
      two arguments: a vector of "struct text_to_poke", and the number of entries
      in the vector.
      
      The vector must be sorted by the addr field of the text_to_poke structure,
      enabling the binary search of a handler in the poke_int3_handler function
      (a fast path).
      Signed-off-by: default avatarDaniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris von Recklinghausen <crecklin@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott Wood <swood@redhat.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/ca506ed52584c80f64de23f6f55ca288e5d079de.1560325897.git.bristot@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      c0213b0a
    • Daniel Bristot de Oliveira's avatar
      jump_label: Sort entries of the same key by the code · 0f133021
      Daniel Bristot de Oliveira authored
      In the batching mode, all the entries of a given key are updated at once.
      During the update of a key, a hit in the int3 handler will check if the
      hitting code address belongs to one of these keys.
      
      To optimize the search of a given code in the vector of entries being
      updated, a binary search is used. The binary search relies on the order
      of the entries of a key by its code. Hence the keys need to be sorted
      by the code too, so sort the entries of a given key by the code.
      Signed-off-by: default avatarDaniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris von Recklinghausen <crecklin@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott Wood <swood@redhat.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/f57ae83e0592418ba269866bb7ade570fc8632e0.1560325897.git.bristot@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      0f133021
    • Daniel Bristot de Oliveira's avatar
      x86/jump_label: Add a __jump_label_set_jump_code() helper · 4cc6620b
      Daniel Bristot de Oliveira authored
      Move the definition of the code to be written from
      __jump_label_transform() to a specialized function. No functional
      change.
      Signed-off-by: default avatarDaniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris von Recklinghausen <crecklin@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott Wood <swood@redhat.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/d2f52a0010ecd399cf9b02a65bcf5836571b9e52.1560325897.git.bristot@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      4cc6620b
    • Daniel Bristot de Oliveira's avatar
      jump_label: Add a jump_label_can_update() helper · e1aacb3f
      Daniel Bristot de Oliveira authored
      Move the check if a jump_entry is valid to a function. No functional
      change.
      Signed-off-by: default avatarDaniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris von Recklinghausen <crecklin@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott Wood <swood@redhat.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/56b69bd3f8e644ed64f2dbde7c088030b8cbe76b.1560325897.git.bristot@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e1aacb3f
    • Ingo Molnar's avatar
      410df0c5
  2. 16 Jun, 2019 4 commits
    • Linus Torvalds's avatar
      Linux 5.2-rc5 · 9e0babf2
      Linus Torvalds authored
      9e0babf2
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 963172d9
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "The accumulated fixes from this and last week:
      
         - Fix vmalloc TLB flush and map range calculations which lead to
           stale TLBs, spurious faults and other hard to diagnose issues.
      
         - Use fault_in_pages_writable() for prefaulting the user stack in the
           FPU code as it's less fragile than the current solution
      
         - Use the PF_KTHREAD flag when checking for a kernel thread instead
           of current->mm as the latter can give the wrong answer due to
           use_mm()
      
         - Compute the vmemmap size correctly for KASLR and 5-Level paging.
           Otherwise this can end up with a way too small vmemmap area.
      
         - Make KASAN and 5-level paging work again by making sure that all
           invalid bits are masked out when computing the P4D offset. This
           worked before but got broken recently when the LDT remap area was
           moved.
      
         - Prevent a NULL pointer dereference in the resource control code
           which can be triggered with certain mount options when the
           requested resource is not available.
      
         - Enforce ordering of microcode loading vs. perf initialization on
           secondary CPUs. Otherwise perf tries to access a non-existing MSR
           as the boot CPU marked it as available.
      
         - Don't stop the resource control group walk early otherwise the
           control bitmaps are not updated correctly and become inconsistent.
      
         - Unbreak kgdb by returning 0 on success from
           kgdb_arch_set_breakpoint() instead of an error code.
      
         - Add more Icelake CPU model defines so depending changes can be
           queued in other trees"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback
        x86/kasan: Fix boot with 5-level paging and KASAN
        x86/fpu: Don't use current->mm to check for a kthread
        x86/kgdb: Return 0 from kgdb_arch_set_breakpoint()
        x86/resctrl: Prevent NULL pointer dereference when local MBM is disabled
        x86/resctrl: Don't stop walking closids when a locksetup group is found
        x86/fpu: Update kernel's FPU state before using for the fsave header
        x86/mm/KASLR: Compute the size of the vmemmap section properly
        x86/fpu: Use fault_in_pages_writeable() for pre-faulting
        x86/CPU: Add more Icelake model numbers
        mm/vmalloc: Avoid rare case of flushing TLB with weird arguments
        mm/vmalloc: Fix calculation of direct map addr range
      963172d9
    • Linus Torvalds's avatar
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · efba92d5
      Linus Torvalds authored
      Pull timer fixes from Thomas Gleixner:
       "A set of small fixes:
      
         - Repair the ktime_get_coarse() functions so they actually deliver
           what they are supposed to: tick granular time stamps. The current
           code missed to add the accumulated nanoseconds part of the
           timekeeper so the resulting granularity was 1 second.
      
         - Prevent the tracer from infinitely recursing into time getter
           functions in the arm architectured timer by marking these functions
           notrace
      
         - Fix a trivial compiler warning caused by wrong qualifier ordering"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timekeeping: Repair ktime_get_coarse*() granularity
        clocksource/drivers/arm_arch_timer: Don't trace count reader functions
        clocksource/drivers/timer-ti-dm: Change to new style declaration
      efba92d5
    • Linus Torvalds's avatar
      Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f763cf8e
      Linus Torvalds authored
      Pull RAS fixes from Thomas Gleixner:
       "Two small fixes for RAS:
      
         - Use a proper search algorithm to find the correct element in the
           CEC array. The replacement was a better choice than fixing the
           crash causes by the original search function with horrible duct
           tape.
      
         - Move the timer based decay function into thread context so it can
           actually acquire the mutex which protects the CEC array to prevent
           corruption"
      
      * 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        RAS/CEC: Convert the timer callback to a workqueue
        RAS/CEC: Fix binary search function
      f763cf8e
  3. 15 Jun, 2019 12 commits
    • Linus Torvalds's avatar
      Merge tag 'platform-drivers-x86-v5.2-3' of git://git.infradead.org/linux-platform-drivers-x86 · e01e060f
      Linus Torvalds authored
      Pull x86 platform driver fixes from Andy Shevchenko:
      
       - fix a couple of Mellanox driver enumeration issues
      
       - fix ASUS laptop regression with backlight
      
       - fix Dell computers that got a wrong mode (tablet versus laptop) after
         resume
      
      * tag 'platform-drivers-x86-v5.2-3' of git://git.infradead.org/linux-platform-drivers-x86:
        platform/mellanox: mlxreg-hotplug: Add devm_free_irq call to remove flow
        platform/x86: mlx-platform: Fix parent device in i2c-mux-reg device registration
        platform/x86: intel-vbtn: Report switch events when event wakes device
        platform/x86: asus-wmi: Only Tell EC the OS will handle display hotkeys from asus_nb_wmi
      e01e060f
    • Linus Torvalds's avatar
      Merge tag 'usb-5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · ff39074b
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some small USB driver fixes for 5.2-rc5
      
        Nothing major, just some small gadget fixes, usb-serial new device
        ids, a few new quirks, and some small fixes for some regressions that
        have been found after the big 5.2-rc1 merge.
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'usb-5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: typec: Make sure an alt mode exist before getting its partner
        usb: gadget: udc: lpc32xx: fix return value check in lpc32xx_udc_probe()
        usb: gadget: dwc2: fix zlp handling
        usb: dwc2: Set actual frame number for completed ISOC transfer for none DDMA
        usb: gadget: udc: lpc32xx: allocate descriptor with GFP_ATOMIC
        usb: gadget: fusb300_udc: Fix memory leak of fusb300->ep[i]
        usb: phy: mxs: Disable external charger detect in mxs_phy_hw_init()
        usb: dwc2: Fix DMA cache alignment issues
        usb: dwc2: host: Fix wMaxPacketSize handling (fix webcam regression)
        USB: Fix chipmunk-like voice when using Logitech C270 for recording audio.
        USB: usb-storage: Add new ID to ums-realtek
        usb: typec: ucsi: ccg: fix memory leak in do_flash
        USB: serial: option: add Telit 0x1260 and 0x1261 compositions
        USB: serial: pl2303: add Allied Telesis VT-Kit3
        USB: serial: option: add support for Simcom SIM7500/SIM7600 RNDIS mode
      ff39074b
    • Linus Torvalds's avatar
      Merge tag 'powerpc-5.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · fa1827d7
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "One fix for a regression introduced by our 32-bit KASAN support, which
        broke booting on machines with "bootx" early debugging enabled.
      
        A fix for a bug which broke kexec on 32-bit, introduced by changes to
        the 32-bit STRICT_KERNEL_RWX support in v5.1.
      
        Finally two fixes going to stable for our THP split/collapse handling,
        discovered by Nick. The first fixes random crashes and/or corruption
        in guests under sufficient load.
      
        Thanks to: Nicholas Piggin, Christophe Leroy, Aaro Koskinen, Mathieu
        Malaterre"
      
      * tag 'powerpc-5.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/32s: fix booting with CONFIG_PPC_EARLY_DEBUG_BOOTX
        powerpc/64s: __find_linux_pte() synchronization vs pmdp_invalidate()
        powerpc/64s: Fix THP PMD collapse serialisation
        powerpc: Fix kexec failure on book3s/32
      fa1827d7
    • Linus Torvalds's avatar
      Merge tag 'trace-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 6a71398c
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Out of range read of stack trace output
      
       - Fix for NULL pointer dereference in trace_uprobe_create()
      
       - Fix to a livepatching / ftrace permission race in the module code
      
       - Fix for NULL pointer dereference in free_ftrace_func_mapper()
      
       - A couple of build warning clean ups
      
      * tag 'trace-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Fix NULL pointer dereference in free_ftrace_func_mapper()
        module: Fix livepatch/ftrace module text permissions race
        tracing/uprobe: Fix obsolete comment on trace_uprobe_create()
        tracing/uprobe: Fix NULL pointer dereference in trace_uprobe_create()
        tracing: Make two symbols static
        tracing: avoid build warning with HAVE_NOP_MCOUNT
        tracing: Fix out-of-range read in trace_stack_print()
      6a71398c
    • Borislav Petkov's avatar
      x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback · 78f4e932
      Borislav Petkov authored
      Adric Blake reported the following warning during suspend-resume:
      
        Enabling non-boot CPUs ...
        x86: Booting SMP configuration:
        smpboot: Booting Node 0 Processor 1 APIC 0x2
        unchecked MSR access error: WRMSR to 0x10f (tried to write 0x0000000000000000) \
         at rIP: 0xffffffff8d267924 (native_write_msr+0x4/0x20)
        Call Trace:
         intel_set_tfa
         intel_pmu_cpu_starting
         ? x86_pmu_dead_cpu
         x86_pmu_starting_cpu
         cpuhp_invoke_callback
         ? _raw_spin_lock_irqsave
         notify_cpu_starting
         start_secondary
         secondary_startup_64
        microcode: sig=0x806ea, pf=0x80, revision=0x96
        microcode: updated to revision 0xb4, date = 2019-04-01
        CPU1 is up
      
      The MSR in question is MSR_TFA_RTM_FORCE_ABORT and that MSR is emulated
      by microcode. The log above shows that the microcode loader callback
      happens after the PMU restoration, leading to the conjecture that
      because the microcode hasn't been updated yet, that MSR is not present
      yet, leading to the #GP.
      
      Add a microcode loader-specific hotplug vector which comes before
      the PERF vectors and thus executes earlier and makes sure the MSR is
      present.
      
      Fixes: 400816f6 ("perf/x86/intel: Implement support for TSX Force Abort")
      Reported-by: default avatarAdric Blake <promarbler14@gmail.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Cc: x86@kernel.org
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=203637
      78f4e932
    • Linus Torvalds's avatar
      Merge branch 'for-5.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 0011572c
      Linus Torvalds authored
      Pull cgroup fixes from Tejun Heo:
       "This has an unusually high density of tricky fixes:
      
         - task_get_css() could deadlock when it races against a dying cgroup.
      
         - cgroup.procs didn't list thread group leaders with live threads.
      
           This could mislead readers to think that a cgroup is empty when
           it's not. Fixed by making PROCS iterator include dead tasks. I made
           a couple mistakes making this change and this pull request contains
           a couple follow-up patches.
      
         - When cpusets run out of online cpus, it updates cpusmasks of member
           tasks in bizarre ways. Joel improved the behavior significantly"
      
      * 'for-5.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cpuset: restore sanity to cpuset_cpus_allowed_fallback()
        cgroup: Fix css_task_iter_advance_css_set() cset skip condition
        cgroup: css_task_iter_skip()'d iterators must be advanced before accessed
        cgroup: Include dying leaders with live threads in PROCS iterations
        cgroup: Implement css_task_iter_skip()
        cgroup: Call cgroup_release() before __exit_signal()
        docs cgroups: add another example size for hugetlb
        cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()
      0011572c
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2019-06-14' of git://anongit.freedesktop.org/drm/drm · 6aa7a22b
      Linus Torvalds authored
      Pull drm fixes from Daniel Vetter:
       "Nothing unsettling here, also not aware of anything serious still
        pending.
      
        The edid override regression fix took a bit longer since this seems to
        be an area with an overabundance of bad options. But the fix we have
        now seems like a good path forward.
      
        Next week it should be back to Dave.
      
        Summary:
      
         - fix regression on amdgpu on SI
      
         - fix edid override regression
      
         - driver fixes: amdgpu, i915, mediatek, meson, panfrost
      
         - fix writecombine for vmap in gem-shmem helper (used by panfrost)
      
         - add more panel quirks"
      
      * tag 'drm-fixes-2019-06-14' of git://anongit.freedesktop.org/drm/drm: (25 commits)
        drm/amdgpu: return 0 by default in amdgpu_pm_load_smu_firmware
        drm/amdgpu: Fix bounds checking in amdgpu_ras_is_supported()
        drm: add fallback override/firmware EDID modes workaround
        drm/edid: abstract override/firmware EDID retrieval
        drm/i915/perf: fix whitelist on Gen10+
        drm/i915/sdvo: Implement proper HDMI audio support for SDVO
        drm/i915: Fix per-pixel alpha with CCS
        drm/i915/dmc: protect against reading random memory
        drm/i915/dsi: Use a fuzzy check for burst mode clock check
        drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc
        drm/panfrost: Require the simple_ondemand governor
        drm/panfrost: make devfreq optional again
        drm/gem_shmem: Use a writecombine mapping for ->vaddr
        drm: panel-orientation-quirks: Add quirk for GPD MicroPC
        drm: panel-orientation-quirks: Add quirk for GPD pocket2
        drm/meson: fix G12A primary plane disabling
        drm/meson: fix primary plane disabling
        drm/meson: fix G12A HDMI PLL settings for 4K60 1000/1001 variations
        drm/mediatek: call mtk_dsi_stop() after mtk_drm_crtc_atomic_disable()
        drm/mediatek: clear num_pipes when unbind driver
        ...
      6aa7a22b
    • Linus Torvalds's avatar
      Merge tag 'gfs2-v5.2.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · 40665244
      Linus Torvalds authored
      Pull gfs2 fix from Andreas Gruenbacher:
       "Fix rounding error in gfs2_iomap_page_prepare"
      
      * tag 'gfs2-v5.2.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
        gfs2: Fix rounding error in gfs2_iomap_page_prepare
      40665244
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 1ed1fa5f
      Linus Torvalds authored
      Pull SCSI fix from James Bottomley:
       "A single bug fix for hpsa.
      
        The user visible consequences aren't clear, but the ioaccel2 raid
        acceleration may misfire on the malformed request assuming the payload
        is big enough to require chaining (more than 31 sg entries)"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: hpsa: correct ioaccel2 chaining
      1ed1fa5f
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20190614' of git://git.kernel.dk/linux-block · 7b103151
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Remove references to old schedulers for the scheduler switching and
         blkio controller documentation (Andreas)
      
       - Kill duplicate check for report zone for null_blk (Chaitanya)
      
       - Two bcache fixes (Coly)
      
       - Ensure that mq-deadline is selected if zoned block device is enabled,
         as we need that to support them (Damien)
      
       - Fix io_uring memory leak (Eric)
      
       - ps3vram fallout from LBDAF removal (Geert)
      
       - Redundant blk-mq debugfs debugfs_create return check cleanup (Greg)
      
       - Extend NOPLM quirk for ST1000LM024 drives (Hans)
      
       - Remove error path warning that can now trigger after the queue
         removal/addition fixes (Ming)
      
      * tag 'for-linus-20190614' of git://git.kernel.dk/linux-block:
        block/ps3vram: Use %llu to format sector_t after LBDAF removal
        libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk
        bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached
        bcache: fix stack corruption by PRECEDING_KEY()
        blk-mq: remove WARN_ON(!q->elevator) from blk_mq_sched_free_requests
        blkio-controller.txt: Remove references to CFQ
        block/switching-sched.txt: Update to blk-mq schedulers
        null_blk: remove duplicate check for report zone
        blk-mq: no need to check return value of debugfs_create functions
        io_uring: fix memory leak of UNIX domain socket inode
        block: force select mq-deadline for zoned block devices
      7b103151
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 5dcedf46
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "I2C has two simple but wanted driver fixes for you"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: pca-platform: Fix GPIO lookup code
        i2c: acorn: fix i2c warning
      5dcedf46
    • Casey Schaufler's avatar
      Smack: Restore the smackfsdef mount option and add missing prefixes · 6e7739fc
      Casey Schaufler authored
      The 5.1 mount system rework changed the smackfsdef mount option to
      smackfsdefault.  This fixes the regression by making smackfsdef treated
      the same way as smackfsdefault.
      
      Also fix the smack_param_specs[] to have "smack" prefixes on all the
      names.  This isn't visible to a user unless they either:
      
       (a) Try to mount a filesystem that's converted to the internal mount API
           and that implements the ->parse_monolithic() context operation - and
           only then if they call security_fs_context_parse_param() rather than
           security_sb_eat_lsm_opts().
      
           There are no examples of this upstream yet, but nfs will probably want
           to do this for nfs2 or nfs3.
      
       (b) Use fsconfig() to configure the filesystem - in which case
           security_fs_context_parse_param() will be called.
      
      This issue is that smack_sb_eat_lsm_opts() checks for the "smack" prefix
      on the options, but smack_fs_context_parse_param() does not.
      
      Fixes: c3300aaf ("smack: get rid of match_token()")
      Fixes: 2febd254 ("smack: Implement filesystem context security hooks")
      Cc: stable@vger.kernel.org
      Reported-by: default avatarJose Bollo <jose.bollo@iot.bzh>
      Signed-off-by: default avatarCasey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Tested-by: default avatarCasey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6e7739fc
  4. 14 Jun, 2019 17 commits
    • Wei Li's avatar
      ftrace: Fix NULL pointer dereference in free_ftrace_func_mapper() · 04e03d9a
      Wei Li authored
      The mapper may be NULL when called from register_ftrace_function_probe()
      with probe->data == NULL.
      
      This issue can be reproduced as follow (it may be covered by compiler
      optimization sometime):
      
      / # cat /sys/kernel/debug/tracing/set_ftrace_filter
      #### all functions enabled ####
      / # echo foo_bar:dump > /sys/kernel/debug/tracing/set_ftrace_filter
      [  206.949100] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      [  206.952402] Mem abort info:
      [  206.952819]   ESR = 0x96000006
      [  206.955326]   Exception class = DABT (current EL), IL = 32 bits
      [  206.955844]   SET = 0, FnV = 0
      [  206.956272]   EA = 0, S1PTW = 0
      [  206.956652] Data abort info:
      [  206.957320]   ISV = 0, ISS = 0x00000006
      [  206.959271]   CM = 0, WnR = 0
      [  206.959938] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000419f3a000
      [  206.960483] [0000000000000000] pgd=0000000411a87003, pud=0000000411a83003, pmd=0000000000000000
      [  206.964953] Internal error: Oops: 96000006 [#1] SMP
      [  206.971122] Dumping ftrace buffer:
      [  206.973677]    (ftrace buffer empty)
      [  206.975258] Modules linked in:
      [  206.976631] Process sh (pid: 281, stack limit = 0x(____ptrval____))
      [  206.978449] CPU: 10 PID: 281 Comm: sh Not tainted 5.2.0-rc1+ #17
      [  206.978955] Hardware name: linux,dummy-virt (DT)
      [  206.979883] pstate: 60000005 (nZCv daif -PAN -UAO)
      [  206.980499] pc : free_ftrace_func_mapper+0x2c/0x118
      [  206.980874] lr : ftrace_count_free+0x68/0x80
      [  206.982539] sp : ffff0000182f3ab0
      [  206.983102] x29: ffff0000182f3ab0 x28: ffff8003d0ec1700
      [  206.983632] x27: ffff000013054b40 x26: 0000000000000001
      [  206.984000] x25: ffff00001385f000 x24: 0000000000000000
      [  206.984394] x23: ffff000013453000 x22: ffff000013054000
      [  206.984775] x21: 0000000000000000 x20: ffff00001385fe28
      [  206.986575] x19: ffff000013872c30 x18: 0000000000000000
      [  206.987111] x17: 0000000000000000 x16: 0000000000000000
      [  206.987491] x15: ffffffffffffffb0 x14: 0000000000000000
      [  206.987850] x13: 000000000017430e x12: 0000000000000580
      [  206.988251] x11: 0000000000000000 x10: cccccccccccccccc
      [  206.988740] x9 : 0000000000000000 x8 : ffff000013917550
      [  206.990198] x7 : ffff000012fac2e8 x6 : ffff000012fac000
      [  206.991008] x5 : ffff0000103da588 x4 : 0000000000000001
      [  206.991395] x3 : 0000000000000001 x2 : ffff000013872a28
      [  206.991771] x1 : 0000000000000000 x0 : 0000000000000000
      [  206.992557] Call trace:
      [  206.993101]  free_ftrace_func_mapper+0x2c/0x118
      [  206.994827]  ftrace_count_free+0x68/0x80
      [  206.995238]  release_probe+0xfc/0x1d0
      [  206.995555]  register_ftrace_function_probe+0x4a8/0x868
      [  206.995923]  ftrace_trace_probe_callback.isra.4+0xb8/0x180
      [  206.996330]  ftrace_dump_callback+0x50/0x70
      [  206.996663]  ftrace_regex_write.isra.29+0x290/0x3a8
      [  206.997157]  ftrace_filter_write+0x44/0x60
      [  206.998971]  __vfs_write+0x64/0xf0
      [  206.999285]  vfs_write+0x14c/0x2f0
      [  206.999591]  ksys_write+0xbc/0x1b0
      [  206.999888]  __arm64_sys_write+0x3c/0x58
      [  207.000246]  el0_svc_common.constprop.0+0x408/0x5f0
      [  207.000607]  el0_svc_handler+0x144/0x1c8
      [  207.000916]  el0_svc+0x8/0xc
      [  207.003699] Code: aa0003f8 a9025bf5 aa0103f5 f946ea80 (f9400303)
      [  207.008388] ---[ end trace 7b6d11b5f542bdf1 ]---
      [  207.010126] Kernel panic - not syncing: Fatal exception
      [  207.011322] SMP: stopping secondary CPUs
      [  207.013956] Dumping ftrace buffer:
      [  207.014595]    (ftrace buffer empty)
      [  207.015632] Kernel Offset: disabled
      [  207.017187] CPU features: 0x002,20006008
      [  207.017985] Memory Limit: none
      [  207.019825] ---[ end Kernel panic - not syncing: Fatal exception ]---
      
      Link: http://lkml.kernel.org/r/20190606031754.10798-1-liwei391@huawei.comSigned-off-by: default avatarWei Li <liwei391@huawei.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      04e03d9a
    • Josh Poimboeuf's avatar
      module: Fix livepatch/ftrace module text permissions race · 9f255b63
      Josh Poimboeuf authored
      It's possible for livepatch and ftrace to be toggling a module's text
      permissions at the same time, resulting in the following panic:
      
        BUG: unable to handle page fault for address: ffffffffc005b1d9
        #PF: supervisor write access in kernel mode
        #PF: error_code(0x0003) - permissions violation
        PGD 3ea0c067 P4D 3ea0c067 PUD 3ea0e067 PMD 3cc13067 PTE 3b8a1061
        Oops: 0003 [#1] PREEMPT SMP PTI
        CPU: 1 PID: 453 Comm: insmod Tainted: G           O  K   5.2.0-rc1-a188339c #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
        RIP: 0010:apply_relocate_add+0xbe/0x14c
        Code: fa 0b 74 21 48 83 fa 18 74 38 48 83 fa 0a 75 40 eb 08 48 83 38 00 74 33 eb 53 83 38 00 75 4e 89 08 89 c8 eb 0a 83 38 00 75 43 <89> 08 48 63 c1 48 39 c8 74 2e eb 48 83 38 00 75 32 48 29 c1 89 08
        RSP: 0018:ffffb223c00dbb10 EFLAGS: 00010246
        RAX: ffffffffc005b1d9 RBX: 0000000000000000 RCX: ffffffff8b200060
        RDX: 000000000000000b RSI: 0000004b0000000b RDI: ffff96bdfcd33000
        RBP: ffffb223c00dbb38 R08: ffffffffc005d040 R09: ffffffffc005c1f0
        R10: ffff96bdfcd33c40 R11: ffff96bdfcd33b80 R12: 0000000000000018
        R13: ffffffffc005c1f0 R14: ffffffffc005e708 R15: ffffffff8b2fbc74
        FS:  00007f5f447beba8(0000) GS:ffff96bdff900000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: ffffffffc005b1d9 CR3: 000000003cedc002 CR4: 0000000000360ea0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         klp_init_object_loaded+0x10f/0x219
         ? preempt_latency_start+0x21/0x57
         klp_enable_patch+0x662/0x809
         ? virt_to_head_page+0x3a/0x3c
         ? kfree+0x8c/0x126
         patch_init+0x2ed/0x1000 [livepatch_test02]
         ? 0xffffffffc0060000
         do_one_initcall+0x9f/0x1c5
         ? kmem_cache_alloc_trace+0xc4/0xd4
         ? do_init_module+0x27/0x210
         do_init_module+0x5f/0x210
         load_module+0x1c41/0x2290
         ? fsnotify_path+0x3b/0x42
         ? strstarts+0x2b/0x2b
         ? kernel_read+0x58/0x65
         __do_sys_finit_module+0x9f/0xc3
         ? __do_sys_finit_module+0x9f/0xc3
         __x64_sys_finit_module+0x1a/0x1c
         do_syscall_64+0x52/0x61
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The above panic occurs when loading two modules at the same time with
      ftrace enabled, where at least one of the modules is a livepatch module:
      
      CPU0					CPU1
      klp_enable_patch()
        klp_init_object_loaded()
          module_disable_ro()
          					ftrace_module_enable()
      					  ftrace_arch_code_modify_post_process()
      				    	    set_all_modules_text_ro()
            klp_write_object_relocations()
              apply_relocate_add()
      	  *patches read-only code* - BOOM
      
      A similar race exists when toggling ftrace while loading a livepatch
      module.
      
      Fix it by ensuring that the livepatch and ftrace code patching
      operations -- and their respective permissions changes -- are protected
      by the text_mutex.
      
      Link: http://lkml.kernel.org/r/ab43d56ab909469ac5d2520c5d944ad6d4abd476.1560474114.git.jpoimboe@redhat.comReported-by: default avatarJohannes Erdfelt <johannes@erdfelt.com>
      Fixes: 444d13ff ("modules: add ro_after_init support")
      Acked-by: default avatarJessica Yu <jeyu@kernel.org>
      Reviewed-by: default avatarPetr Mladek <pmladek@suse.com>
      Reviewed-by: default avatarMiroslav Benes <mbenes@suse.cz>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      9f255b63
    • Eiichi Tsukata's avatar
      tracing/uprobe: Fix obsolete comment on trace_uprobe_create() · a4158345
      Eiichi Tsukata authored
      Commit 0597c49c ("tracing/uprobes: Use dyn_event framework for
      uprobe events") cleaned up the usage of trace_uprobe_create(), and the
      function has been no longer used for removing uprobe/uretprobe.
      
      Link: http://lkml.kernel.org/r/20190614074026.8045-2-devel@etsukata.comReviewed-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: default avatarEiichi Tsukata <devel@etsukata.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      a4158345
    • Eiichi Tsukata's avatar
      tracing/uprobe: Fix NULL pointer dereference in trace_uprobe_create() · f01098c7
      Eiichi Tsukata authored
      Just like the case of commit 8b05a3a7 ("tracing/kprobes: Fix NULL
      pointer dereference in trace_kprobe_create()"), writing an incorrectly
      formatted string to uprobe_events can trigger NULL pointer dereference.
      
      Reporeducer:
      
        # echo r > /sys/kernel/debug/tracing/uprobe_events
      
      dmesg:
      
        BUG: kernel NULL pointer dereference, address: 0000000000000000
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 8000000079d12067 P4D 8000000079d12067 PUD 7b7ab067 PMD 0
        Oops: 0000 [#1] PREEMPT SMP PTI
        CPU: 0 PID: 1903 Comm: bash Not tainted 5.2.0-rc3+ #15
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
        RIP: 0010:strchr+0x0/0x30
        Code: c0 eb 0d 84 c9 74 18 48 83 c0 01 48 39 d0 74 0f 0f b6 0c 07 3a 0c 06 74 ea 19 c0 83 c8 01 c3 31 c0 c3 0f 1f 84 00 00 00 00 00 <0f> b6 07 89 f2 40 38 f0 75 0e eb 13 0f b6 47 01 48 83 c
        RSP: 0018:ffffb55fc0403d10 EFLAGS: 00010293
      
        RAX: ffff993ffb793400 RBX: 0000000000000000 RCX: ffffffffa4852625
        RDX: 0000000000000000 RSI: 000000000000002f RDI: 0000000000000000
        RBP: ffffb55fc0403dd0 R08: ffff993ffb793400 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
        R13: ffff993ff9cc1668 R14: 0000000000000001 R15: 0000000000000000
        FS:  00007f30c5147700(0000) GS:ffff993ffda00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 000000007b628000 CR4: 00000000000006f0
        Call Trace:
         trace_uprobe_create+0xe6/0xb10
         ? __kmalloc_track_caller+0xe6/0x1c0
         ? __kmalloc+0xf0/0x1d0
         ? trace_uprobe_create+0xb10/0xb10
         create_or_delete_trace_uprobe+0x35/0x90
         ? trace_uprobe_create+0xb10/0xb10
         trace_run_command+0x9c/0xb0
         trace_parse_run_command+0xf9/0x1eb
         ? probes_open+0x80/0x80
         __vfs_write+0x43/0x90
         vfs_write+0x14a/0x2a0
         ksys_write+0xa2/0x170
         do_syscall_64+0x7f/0x200
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Link: http://lkml.kernel.org/r/20190614074026.8045-1-devel@etsukata.com
      
      Cc: stable@vger.kernel.org
      Fixes: 0597c49c ("tracing/uprobes: Use dyn_event framework for uprobe events")
      Reviewed-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: default avatarEiichi Tsukata <devel@etsukata.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      f01098c7
    • YueHaibing's avatar
      tracing: Make two symbols static · ff585c5b
      YueHaibing authored
      Fix sparse warnings:
      
      kernel/trace/trace.c:6927:24: warning:
       symbol 'get_tracing_log_err' was not declared. Should it be static?
      kernel/trace/trace.c:8196:15: warning:
       symbol 'trace_instance_dir' was not declared. Should it be static?
      
      Link: http://lkml.kernel.org/r/20190614153210.24424-1-yuehaibing@huawei.comAcked-by: default avatarTom Zanussi <tom.zanussi@linux.intel.com>
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      ff585c5b
    • Vasily Gorbik's avatar
      tracing: avoid build warning with HAVE_NOP_MCOUNT · cbdaeaf0
      Vasily Gorbik authored
      Selecting HAVE_NOP_MCOUNT enables -mnop-mcount (if gcc supports it)
      and sets CC_USING_NOP_MCOUNT. Reuse __is_defined (which is suitable for
      testing CC_USING_* defines) to avoid conditional compilation and fix
      the following gcc 9 warning on s390:
      
      kernel/trace/ftrace.c:2514:1: warning: ‘ftrace_code_disable’ defined
      but not used [-Wunused-function]
      
      Link: http://lkml.kernel.org/r/patch.git-1a82d13f33ac.your-ad-here.call-01559732716-ext-6629@work.hours
      
      Fixes: 2f4df001 ("tracing: Add -mcount-nop option support")
      Signed-off-by: default avatarVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      cbdaeaf0
    • Eiichi Tsukata's avatar
      tracing: Fix out-of-range read in trace_stack_print() · becf33f6
      Eiichi Tsukata authored
      Puts range check before dereferencing the pointer.
      
      Reproducer:
      
        # echo stacktrace > trace_options
        # echo 1 > events/enable
        # cat trace > /dev/null
      
      KASAN report:
      
        ==================================================================
        BUG: KASAN: use-after-free in trace_stack_print+0x26b/0x2c0
        Read of size 8 at addr ffff888069d20000 by task cat/1953
      
        CPU: 0 PID: 1953 Comm: cat Not tainted 5.2.0-rc3+ #5
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
        Call Trace:
         dump_stack+0x8a/0xce
         print_address_description+0x60/0x224
         ? trace_stack_print+0x26b/0x2c0
         ? trace_stack_print+0x26b/0x2c0
         __kasan_report.cold+0x1a/0x3e
         ? trace_stack_print+0x26b/0x2c0
         kasan_report+0xe/0x20
         trace_stack_print+0x26b/0x2c0
         print_trace_line+0x6ea/0x14d0
         ? tracing_buffers_read+0x700/0x700
         ? trace_find_next_entry_inc+0x158/0x1d0
         s_show+0xea/0x310
         seq_read+0xaa7/0x10e0
         ? seq_escape+0x230/0x230
         __vfs_read+0x7c/0x100
         vfs_read+0x16c/0x3a0
         ksys_read+0x121/0x240
         ? kernel_write+0x110/0x110
         ? perf_trace_sys_enter+0x8a0/0x8a0
         ? syscall_slow_exit_work+0xa9/0x410
         do_syscall_64+0xb7/0x390
         ? prepare_exit_to_usermode+0x165/0x200
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x7f867681f910
        Code: b6 fe ff ff 48 8d 3d 0f be 08 00 48 83 ec 08 e8 06 db 01 00 66 0f 1f 44 00 00 83 3d f9 2d 2c 00 00 75 10 b8 00 00 00 00 04
        RSP: 002b:00007ffdabf23488 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
        RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f867681f910
        RDX: 0000000000020000 RSI: 00007f8676cde000 RDI: 0000000000000003
        RBP: 00007f8676cde000 R08: ffffffffffffffff R09: 0000000000000000
        R10: 0000000000000871 R11: 0000000000000246 R12: 00007f8676cde000
        R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000000ec0
      
        Allocated by task 1214:
         save_stack+0x1b/0x80
         __kasan_kmalloc.constprop.0+0xc2/0xd0
         kmem_cache_alloc+0xaf/0x1a0
         getname_flags+0xd2/0x5b0
         do_sys_open+0x277/0x5a0
         do_syscall_64+0xb7/0x390
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
        Freed by task 1214:
         save_stack+0x1b/0x80
         __kasan_slab_free+0x12c/0x170
         kmem_cache_free+0x8a/0x1c0
         putname+0xe1/0x120
         do_sys_open+0x2c5/0x5a0
         do_syscall_64+0xb7/0x390
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
        The buggy address belongs to the object at ffff888069d20000
         which belongs to the cache names_cache of size 4096
        The buggy address is located 0 bytes inside of
         4096-byte region [ffff888069d20000, ffff888069d21000)
        The buggy address belongs to the page:
        page:ffffea0001a74800 refcount:1 mapcount:0 mapping:ffff88806ccd1380 index:0x0 compound_mapcount: 0
        flags: 0x100000000010200(slab|head)
        raw: 0100000000010200 dead000000000100 dead000000000200 ffff88806ccd1380
        raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
        page dumped because: kasan: bad access detected
      
        Memory state around the buggy address:
         ffff888069d1ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         ffff888069d1ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        >ffff888069d20000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                           ^
         ffff888069d20080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
         ffff888069d20100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
        ==================================================================
      
      Link: http://lkml.kernel.org/r/20190610040016.5598-1-devel@etsukata.com
      
      Fixes: 4285f2fc ("tracing: Remove the ULONG_MAX stack trace hackery")
      Signed-off-by: default avatarEiichi Tsukata <devel@etsukata.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      becf33f6
    • Andreas Gruenbacher's avatar
      gfs2: Fix rounding error in gfs2_iomap_page_prepare · 2741b672
      Andreas Gruenbacher authored
      The pos and len arguments to the iomap page_prepare callback are not
      block aligned, so we need to take that into account when computing the
      number of blocks.
      
      Fixes: d0a22a4b ("gfs2: Fix iomap write page reclaim deadlock")
      Signed-off-by: default avatarAndreas Gruenbacher <agruenba@redhat.com>
      2741b672
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 72a20cee
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "Here are some arm64 fixes for -rc5.
      
        The only non-trivial change (in terms of the diffstat) is fixing our
        SVE ptrace API for big-endian machines, but the majority of this is
        actually the addition of much-needed comments and updates to the
        documentation to try to avoid this mess biting us again in future.
      
        There are still a couple of small things on the horizon, but nothing
        major at this point.
      
        Summary:
      
         - Fix broken SVE ptrace API when running in a big-endian configuration
      
         - Fix performance regression due to off-by-one in TLBI range checking
      
         - Fix build regression when using Clang"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64/sve: Fix missing SVE/FPSIMD endianness conversions
        arm64: tlbflush: Ensure start/end of address range are aligned to stride
        arm64: Don't unconditionally add -Wno-psabi to KBUILD_CFLAGS
      72a20cee
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · fd6b99fa
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "16 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm/devm_memremap_pages: fix final page put race
        PCI/P2PDMA: track pgmap references per resource, not globally
        lib/genalloc: introduce chunk owners
        PCI/P2PDMA: fix the gen_pool_add_virt() failure path
        mm/devm_memremap_pages: introduce devm_memunmap_pages
        drivers/base/devres: introduce devm_release_action()
        mm/vmscan.c: fix trying to reclaim unevictable LRU page
        coredump: fix race condition between collapse_huge_page() and core dumping
        mm/mlock.c: change count_mm_mlocked_page_nr return type
        mm: mmu_gather: remove __tlb_reset_range() for force flush
        fs/ocfs2: fix race in ocfs2_dentry_attach_lock()
        mm/vmscan.c: fix recent_rotated history
        mm/mlock.c: mlockall error for flag MCL_ONFAULT
        scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE
        mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node
        mm: memcontrol: don't batch updates of local VM stats and events
      fd6b99fa
    • Daniel Vetter's avatar
      Merge branch 'drm-fixes-5.2' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · e14c5873
      Daniel Vetter authored
      Fixes for 5.2:
      - Extend previous vce fix for resume to uvd and vcn
      - Fix bounds checking in ras debugfs interface
      - Fix a regression on SI using amdgpu
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      From: Alex Deucher <alexdeucher@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190613021856.3307-1-alexander.deucher@amd.com
      e14c5873
    • Linus Torvalds's avatar
      Merge tag 'iommu-fixes-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · c78ad1be
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
      
       - three fixes for Intel VT-d to fix a potential dead-lock, a formatting
         fix and a bit setting fix
      
       - one fix for the ARM-SMMU to make it work on some platforms with
         sub-optimal SMMU emulation
      
      * tag 'iommu-fixes-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/arm-smmu: Avoid constant zero in TLBI writes
        iommu/vt-d: Set the right field for Page Walk Snoop
        iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock
        iommu: Add missing new line for dma type
      c78ad1be
    • Linus Torvalds's avatar
      Merge tag 'gpio-v5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · 7617c9a0
      Linus Torvalds authored
      Pull GPIO fix from Linus Walleij:
       "A single fix for the PCA953x driver affecting some fringe variants of
        the chip"
      
      * tag 'gpio-v5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        gpio: pca953x: hack to fix 24 bit gpio expanders
      7617c9a0
    • Linus Torvalds's avatar
      Merge tag 'sound-5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · bcb46a0e
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "It might feel like deja vu to receive a bulk of changes at rc5, and it
        happens again; we've got a collection of fixes for ASoC. Most of fixes
        are targeted for the newly merged SOF (Sound Open Firmware) stuff and
        the relevant fixes for Intel platforms.
      
        Other than that, there are a few regression fixes for the recent ASoC
        core changes and HD-audio quirk, as well as a couple of FireWire fixes
        and for other ASoC codecs"
      
      * tag 'sound-5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (54 commits)
        Revert "ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops"
        ALSA: ice1712: Check correct return value to snd_i2c_sendbytes (EWS/DMX 6Fire)
        ALSA: oxfw: allow PCM capture for Stanton SCS.1m
        ALSA: firewire-motu: fix destruction of data for isochronous resources
        ASoC: Intel: sst: fix kmalloc call with wrong flags
        ASoC: core: Fix deadlock in snd_soc_instantiate_card()
        SoC: rt274: Fix internal jack assignment in set_jack callback
        ALSA: hdac: fix memory release for SST and SOF drivers
        ASoC: SOF: Intel: hda: use the defined ppcap functions
        ASoC: core: move DAI pre-links initiation to snd_soc_instantiate_card
        ASoC: Intel: cht_bsw_rt5672: fix kernel oops with platform_name override
        ASoC: Intel: cht_bsw_nau8824: fix kernel oops with platform_name override
        ASoC: Intel: bytcht_es8316: fix kernel oops with platform_name override
        ASoC: Intel: cht_bsw_max98090: fix kernel oops with platform_name override
        ASoC: sun4i-i2s: Add offset to RX channel select
        ASoC: sun4i-i2s: Fix sun8i tx channel offset mask
        ASoC: max98090: remove 24-bit format support if RJ is 0
        ASoC: da7219: Fix build error without CONFIG_I2C
        ASoC: SOF: Intel: hda: Fix COMPILE_TEST build error
        ASoC: SOF: fix DSP oops definitions in FW ABI
        ...
      bcb46a0e
    • Andrey Ryabinin's avatar
      x86/kasan: Fix boot with 5-level paging and KASAN · f3176ec9
      Andrey Ryabinin authored
      Since commit d52888aa ("x86/mm: Move LDT remap out of KASLR region on
      5-level paging") kernel doesn't boot with KASAN on 5-level paging machines.
      The bug is actually in early_p4d_offset() and introduced by commit
      12a8cc7f ("x86/kasan: Use the same shadow offset for 4- and 5-level paging")
      
      early_p4d_offset() tries to convert pgd_val(*pgd) value to a physical
      address. This doesn't make sense because pgd_val() already contains the
      physical address.
      
      It did work prior to commit d52888aa because the result of
      "__pa_nodebug(pgd_val(*pgd)) & PTE_PFN_MASK" was the same as "pgd_val(*pgd)
      & PTE_PFN_MASK". __pa_nodebug() just set some high bits which were masked
      out by applying PTE_PFN_MASK.
      
      After the change of the PAGE_OFFSET offset in commit d52888aa
      __pa_nodebug(pgd_val(*pgd)) started to return a value with more high bits
      set and PTE_PFN_MASK wasn't enough to mask out all of them. So it returns a
      wrong not even canonical address and crashes on the attempt to dereference
      it.
      
      Switch back to pgd_val() & PTE_PFN_MASK to cure the issue.
      
      Fixes: 12a8cc7f ("x86/kasan: Use the same shadow offset for 4- and 5-level paging")
      Reported-by: default avatarKirill A. Shutemov <kirill@shutemov.name>
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: kasan-dev@googlegroups.com
      Cc: stable@vger.kernel.org
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20190614143149.2227-1-aryabinin@virtuozzo.com
      f3176ec9
    • Thomas Gleixner's avatar
      timekeeping: Repair ktime_get_coarse*() granularity · e3ff9c36
      Thomas Gleixner authored
      Jason reported that the coarse ktime based time getters advance only once
      per second and not once per tick as advertised.
      
      The code reads only the monotonic base time, which advances once per
      second. The nanoseconds are accumulated on every tick in xtime_nsec up to
      a second and the regular time getters take this nanoseconds offset into
      account, but the ktime_get_coarse*() implementation fails to do so.
      
      Add the accumulated xtime_nsec value to the monotonic base time to get the
      proper per tick advancing coarse tinme.
      
      Fixes: b9ff604c ("timekeeping: Add ktime_get_coarse_with_offset")
      Reported-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Clemens Ladisch <clemens@ladisch.de>
      Cc: Sultan Alsawaf <sultan@kerneltoast.com>
      Cc: Waiman Long <longman@redhat.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1906132136280.1791@nanos.tec.linutronix.de
      e3ff9c36
    • Daniel Vetter's avatar
      Merge tag 'drm-misc-fixes-2019-06-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · 744ed8cb
      Daniel Vetter authored
      Sean writes:
      
      meson: A few G12A fixes across the driver (Neil)
      quirks: A couple quirks for GPD devices (Hans)
      gem_shmem: Use writecombine when vmapping non-dmabuf BOs (Boris)
      panfrost: A couple tweaks to requiring devfreq (Neil & Ezequiel)
      edid: Ensure we return the override mode when ddc probe fails (Jani)
      
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Neil Armstrong <narmstrong@baylibre.com>
      Cc: Boris Brezillon <boris.brezillon@collabora.com>
      Cc: Ezequiel Garcia <ezequiel@collabora.com>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      From: Sean Paul <sean@poorly.run>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190613143946.GA24233@art_vandelay
      744ed8cb