1. 15 Mar, 2013 26 commits
    • tracing: Only clear trace buffer on module unload if event was traced · 575380da
      Steven Rostedt (Red Hat) authored
      Currently, when a module with events is unloaded, the trace buffer is
      cleared. This is just a safety net in case the module might have some
      strange callback when its event is outputted. But there's no reason
      to reset the buffer if the module didn't have any of its events traced.
      
      Add a flag called WAS_ENABLED to the event "call" structure. It gets set
      the first time the event is enabled and never gets cleared. When a
      module gets unloaded, if any of its events have this flag set, then the
      trace buffer will get cleared.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
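The sticky-flag idea in the commit above can be modeled in user space. This is a minimal Python sketch, not kernel code: the class `EventCall`, the bit values, and the helper `must_clear_buffer_on_unload` are hypothetical names chosen for illustration. Only the flag name WAS_ENABLED comes from the commit message.

```python
# Hypothetical bit values; the kernel's real flag layout is not shown in the log.
EVENT_ENABLED = 1 << 0
EVENT_WAS_ENABLED = 1 << 1


class EventCall:
    """Toy stand-in for an event "call" structure."""

    def __init__(self, name):
        self.name = name
        self.flags = 0

    def enable(self):
        # WAS_ENABLED is sticky: it is set on enable and never cleared.
        self.flags |= EVENT_ENABLED | EVENT_WAS_ENABLED

    def disable(self):
        # Only the ENABLED bit is dropped; WAS_ENABLED survives.
        self.flags &= ~EVENT_ENABLED


def must_clear_buffer_on_unload(module_events):
    """Return True if any of the module's events was ever enabled."""
    return any(ev.flags & EVENT_WAS_ENABLED for ev in module_events)
```

A module whose events were never enabled leaves the buffer alone, while one enable (even if later disabled) is enough to request a clear on unload.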
    • tracing: Add comment for trace event flag IGNORE_ENABLE · 2a30c11f
      Steven Rostedt (Red Hat) authored
      All the trace event flags have comments except the IGNORE_ENABLE flag,
      which is set for ftrace internal events that should not be enabled
      via the debugfs "enable" file. That is, if the top level enable file
      is set, it will enable all events. It used to just check the ftrace
      event call descriptor "reg" field and skip those without it, but now
      some ftrace internal events have a reg field but still need to be
      skipped. The flag was created to ignore those events.
      
      Now document it.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: Init waitqueue for blocked readers · f1dc6725
      Steven Rostedt (Red Hat) authored
      The move of blocked readers to the ring buffer left out the
      init of the wait queue that is used. Tests missed this due to running
      stress tests against the buffers, which didn't allow for any
      readers to end up waiting. Running a simple read and wait triggered
      a bug.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix some section mismatch warnings · 523c8113
      Li Zefan authored
      As we've added __init annotation to field-defining functions, we should
      add __refdata annotation to event_call variables, which reference those
      functions.
      
      Link: http://lkml.kernel.org/r/51343C1F.2050502@huawei.com
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix trace events build without modules · 315326c1
      Steven Rostedt (Red Hat) authored
      The new multi-buffers added a descriptor that kept track of module
      events, and the directories they use, with struct ftrace_module_file_ops.
      This is used to add a ref count to keep modules from unloading while
      their files are being accessed.
      
      As the descriptor is only needed when CONFIG_MODULES is enabled, it
      is only declared when the config is enabled. But that struct is
      dereferenced in a few areas outside the #ifdef CONFIG_MODULES.
      
      By adding some helper routines and moving code around a little,
      events can be compiled again without modules.
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Add __per_cpu annotation to trace array percpu data pointer · 34ef61b1
      Steven Rostedt (Red Hat) authored
      With the conversion of the data array to per cpu, sparse now complains
      about the use of per_cpu_ptr() on the variable. But the variable is
      allocated with alloc_percpu() and is fine to use. But since the structure
      that contains the data variable does not annotate it as such, sparse
      gives out a lot of false warnings.
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing/syscalls: Annotate field-defining functions with __init · b8aae39f
      Li Zefan authored
      These two functions are called during kernel boot only.
      
      Link: http://lkml.kernel.org/r/51258796.7020704@huawei.com
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Annotate event field-defining functions with __init · 7e4f44b1
      Li Zefan authored
      Those functions are called either during kernel boot or module init.
      
      Before:
      
      $ dmesg | grep 'Freeing unused kernel memory'
      Freeing unused kernel memory: 1208k freed
      Freeing unused kernel memory: 1360k freed
      Freeing unused kernel memory: 1960k freed
      
      After:
      
      $ dmesg | grep 'Freeing unused kernel memory'
      Freeing unused kernel memory: 1236k freed
      Freeing unused kernel memory: 1388k freed
      Freeing unused kernel memory: 1960k freed
      
      Link: http://lkml.kernel.org/r/5125877D.5000201@huawei.com
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Add a helper function for event print functions · f71130de
      Li Zefan authored
      Move duplicate code in event print functions to a helper function.
      
      This shrinks the size of the kernel by ~13K.
      
         text    data      bss      dec     hex filename
      6596137 1743966 10138672 18478775 119f6b7 vmlinux.o.old
      6583002 1743849 10138672 18465523 119c2f3 vmlinux.o.new
      
      Link: http://lkml.kernel.org/r/51258746.2060304@huawei.com
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing/ring-buffer: Move poll wake ups into ring buffer code · 15693458
      Steven Rostedt (Red Hat) authored
      Move the logic to wake up on ring buffer data into the ring buffer
      code itself. This simplifies the tracing code a lot and also has the
      added benefit that waiters on one of the instance buffers can be woken
      only when data is added to that instance instead of data added to
      any instance.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix read blocking on trace_pipe_raw · b627344f
      Steven Rostedt authored
      If the ring buffer is empty, a read to trace_pipe_raw won't block.
      The tracing code has the infrastructure to wake up waiting readers,
      but the trace_pipe_raw doesn't take advantage of that.
      
      When a read is done to trace_pipe_raw without the O_NONBLOCK flag
      set, have the read block until there's data in the requested buffer.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix polling on trace_pipe_raw · cc60cdc9
      Steven Rostedt authored
      The trace_pipe_raw never implemented polling and this was causing
      issues for several utilities. This is now implemented.
      
      Blocked reads still are on the TODO list.
      Reported-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Tested-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Do not block on splice if either file or splice NONBLOCK flag is set · 189e5784
      Steven Rostedt (Red Hat) authored
      Currently only the splice NONBLOCK flag is checked to determine if
      the splice read should block or not. But the file descriptor NONBLOCK
      flag also needs to be checked.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
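The blocking decision described in the commit above comes down to checking two independent flag words. This is a minimal user-space sketch in Python, assuming the Linux value 0x02 for SPLICE_F_NONBLOCK; the function name `splice_read_may_block` is hypothetical and is not the kernel's.

```python
import os

# Value of SPLICE_F_NONBLOCK in the Linux splice headers; defined here so the
# sketch does not depend on the Python version exposing it.
SPLICE_F_NONBLOCK = 0x02


def splice_read_may_block(file_flags, splice_flags):
    """Return True only if neither the file nor the splice call asked for non-blocking."""
    if file_flags & os.O_NONBLOCK:
        return False
    if splice_flags & SPLICE_F_NONBLOCK:
        return False
    return True
```

Before the fix only the second check existed, so a descriptor opened with O_NONBLOCK could still block in a splice read.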
    • tracing: Use direct field, type and system names · 92edca07
      Steven Rostedt authored
      The names used to display the field and type in the event format
      files are copied, as well as the system name that is displayed.
      
      All these names are created by constant values passed in.
      If one of these values were to be removed by a module, the module
      would also be required to remove any event it created.
      
      By using the strings directly, we can save over 100K of memory.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Use kmem_cache_alloc instead of kmalloc in trace_events.c · d1a29143
      Steven Rostedt authored
      The event structures used by the trace events are mostly persistent,
      but they are also allocated by kmalloc, which is not the best at
      allocating space for what is used. By converting these kmallocs
      into kmem_cache_allocs, we can save over 50K of space that is
      permanently allocated.
      
      After boot we have:
      
      slab name           active allocated size
      ---------           ------ --------- ----
      ftrace_event_file      979      1005   56   67    1
      ftrace_event_field    2301      2310   48   77    1
      
      The ftrace_event_file has at boot up 979 active objects out of
      1005 allocated in the slabs. Each object is 56 bytes. In a normal
      kmalloc, that would allocate 64 bytes for each object.
      
       1005 - 979  = 26 objects not used
       26 * 56 = 1456 bytes wasted
      
      But if we used kmalloc:
      
       64 - 56 = 8 bytes unused per allocation
       8 * 979 = 7832 bytes wasted
      
       7832 - 1456 = 6376 bytes in savings
      
      Doing the same for ftrace_event_field where there's 2301 objects
      allocated in a slab that can hold 2310 with 48 bytes each we have:
      
       2310 - 2301 = 9 objects not used
       9 * 48 = 432 bytes wasted
      
      A kmalloc would also use 64 bytes per object:
      
       64 - 48 = 16 bytes unused per allocation
       16 * 2301 = 36816 bytes wasted!
      
       36816 - 432 = 36384 bytes in savings
      
      This change gives us a total of 42760 bytes in savings, at least
      on my machine. But as there are a lot of these persistent objects
      for all configurations that use trace points, this is a net win.
      
      Thanks to Ezequiel Garcia for his trace_analyze presentation which
      pointed out the wasted space in my code.
      
      Cc: Ezequiel Garcia <elezegarcia@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
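The savings arithmetic in the commit above can be reproduced with a few lines of Python. The object counts and sizes are taken from the message; the function names and the `CACHES` table are illustrative, and the 64-byte kmalloc bucket is the figure the message itself assumes.

```python
KMALLOC_BUCKET = 64  # bytes kmalloc would use per object, per the commit message

# name: (active objects, allocated objects, object size in bytes)
CACHES = {
    "ftrace_event_file": (979, 1005, 56),
    "ftrace_event_field": (2301, 2310, 48),
}


def cache_waste(active, allocated, size):
    """Bytes tied up in allocated-but-unused objects of a dedicated cache."""
    return (allocated - active) * size


def kmalloc_waste(active, size, bucket=KMALLOC_BUCKET):
    """Bytes of padding if every active object came from a kmalloc bucket."""
    return (bucket - size) * active


def savings(active, allocated, size):
    """Padding avoided minus slack introduced by the dedicated cache."""
    return kmalloc_waste(active, size) - cache_waste(active, allocated, size)


def total_savings(caches=CACHES):
    return sum(savings(*row) for row in caches.values())
```

With the numbers from the message, `total_savings()` evaluates to 42760, matching the total stated there.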
    • tracing: Get trace_events kernel command line working again · 77248221
      Steven Rostedt authored
      With the new descriptors used to allow multiple buffers in the
      tracing directory added, the kernel command line parameter
      trace_events=... no longer works. This is because the top level
      (global) trace array now has a list of descriptors associated
      with the events and the files in the debugfs directory. But in
      early bootup, when the command line is processed and the events
      enabled, the trace array list of events has not been set up yet.
      
      Without the list of events in the trace array, the setting of
      events to record will fail because it would not match any events.
      
      The solution is to set up the top level array in two stages.
      The first is to just add the ftrace file descriptors that just point
      to the events. This will allow events to be enabled and start tracing.
      The second stage is called after the filesystem is set up, and this
      stage will create the debugfs event files and directories associated
      with the trace array events.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
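The two-stage setup described in the commit above can be pictured with a small Python model. Everything here is hypothetical scaffolding: the class `TraceArray`, its method names, and the flat `events/<name>` path layout are assumptions made for illustration and do not mirror the kernel's data structures.

```python
class TraceArray:
    """Toy model of the top level trace array and its event descriptors."""

    def __init__(self):
        self.event_files = {}  # event name -> {"enabled": bool}
        self.fs_entries = []   # paths created once the filesystem exists

    def early_init(self, event_names):
        # Stage 1: add descriptors that just point to the events, so that
        # boot-time enabling has something to match against.
        for name in event_names:
            self.event_files[name] = {"enabled": False}

    def enable(self, name):
        # Fails when the event list has not been set up yet, which models
        # the failure the commit message describes.
        entry = self.event_files.get(name)
        if entry is None:
            return False
        entry["enabled"] = True
        return True

    def late_init(self):
        # Stage 2: runs after the filesystem is up and creates the
        # (hypothetical) per-event entries.
        self.fs_entries = ["events/" + name for name in sorted(self.event_files)]
```

Enabling an event before `early_init()` fails, enabling it between the two stages works, and the filesystem entries only appear after `late_init()`.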
    • tracing: Add rmdir to remove multibuffer instances · 0c8916c3
      Steven Rostedt authored
      Add a method to the hijacked dentry descriptor of the
      "instances" directory to allow for rmdir to remove an
      instance of a multibuffer.
      
      Example:
      
        cd /debug/tracing/instances
        mkdir hello
        ls
      hello/
        rmdir hello
        ls
      
      Like the mkdir method, the i_mutex is dropped for the instances
      directory. The instances directory is created at boot up and can
      not be renamed or removed. The trace_types_lock mutex is used to
      synchronize adding and removing of instances.
      
      I've run several stress tests with different threads trying to
      create and delete directories of the same name, and it has stood
      up fine.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Add interface to allow multiple trace buffers · 277ba044
      Steven Rostedt authored
      Add the interface ("instances" directory) to add multiple buffers
      to ftrace. To create a new instance, simply do a mkdir in the
      instances directory. This will create a directory with the following:
      
       # cd instances
       # mkdir foo
       # ls foo
      buffer_size_kb        free_buffer  trace_clock    trace_pipe
      buffer_total_size_kb  set_event    trace_marker   tracing_enabled
      events/               trace        trace_options  tracing_on
      
      Currently only events are able to be set, and there isn't a way
      to delete a buffer when one is created (yet).
      
      Note, the i_mutex lock is dropped from the parent "instances"
      directory during the mkdir operation. As the "instances" directory
      can not be renamed or deleted (created on boot), I do not see
      any harm in dropping the lock. The creation of the sub directories
      is protected by trace_types_lock mutex, which only lets one
      instance get into the code path at a time. If two tasks try to
      create or delete directories of the same name, only one will occur
      and the other will fail with -EEXIST.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Make syscall events suitable for multiple buffers · 12ab74ee
      Steven Rostedt authored
      Currently the syscall events record into the global buffer. But if
      multiple buffers are in place, then we need to have syscall events
      record in the proper buffers.
      
      By adding descriptors to pass to the syscall event functions, the
      syscall events can now record into the buffers that have been assigned
      to them (one event may be applied to multiple buffers).
      
      This will allow tracing high volume syscalls along with seldom occurring
      syscalls without losing the seldom syscall events.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Replace the static global per_cpu arrays with allocated per_cpu · a7603ff4
      Steven Rostedt authored
      The global and max-tr currently use static per_cpu arrays for the CPU data
      descriptors. But in order to get new allocated trace_arrays, they need to
      be allocated per_cpu arrays. Instead of using the static arrays, switch
      the global and max-tr to use allocated data.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Pass the ftrace_file to the buffer lock reserve code · ccb469a1
      Steven Rostedt authored
      Pass the struct ftrace_event_file *ftrace_file to
      trace_event_buffer_lock_reserve() (a new function that replaces
      trace_current_buffer_lock_reserve()).
      
      The ftrace_file holds a pointer to the trace_array that is in use.
      In the case of multiple buffers with different trace_arrays, this
      allows different events to be recorded into different buffers.
      
      Also fixed some of the stale comments in include/trace/ftrace.h
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Encapsulate global_trace and remove dependencies on global vars · 2b6080f2
      Steven Rostedt authored
      The global_trace variable in kernel/trace/trace.c has been kept 'static' and
      local to that file so that it would not be used too much outside of that
      file. This has paid off, even though there were lots of changes to make
      the trace_array structure more generic (not depending on global_trace).
      
      Removal of a lot of direct usages of global_trace is needed to be able to
      create more trace_arrays such that we can add multiple buffers.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Use RING_BUFFER_ALL_CPUS for TRACE_PIPE_ALL_CPU · ae3b5093
      Steven Rostedt authored
      Both RING_BUFFER_ALL_CPUS and TRACE_PIPE_ALL_CPU are defined as
      -1 and used to say that all the ring buffers are to be modified
      or read (instead of just a single cpu, which would be >= 0).
      
      There's no reason to keep TRACE_PIPE_ALL_CPU, as it has also started
      to be used for more than what it was created for, and now that
      the ring buffer code added a generic RING_BUFFER_ALL_CPUS define,
      we can clean up the trace code to use that instead and remove
      the TRACE_PIPE_ALL_CPU macro.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Separate out trace events from global variables · ae63b31e
      Steven Rostedt authored
      The trace events for ftrace are all defined via global variables.
      The arrays of events and event systems are linked to a global list.
      This prevents multiple users of the event system (what to enable and
      what not to).
      
      Adding descriptors to represent the event/file relation, as well
      as which trace_array descriptor they are associated with, allows
      for more than one set of events to be defined. Once the trace events
      files have a link between the trace event and the trace_array they
      are associated with, we can create multiple trace_arrays that can
      record separate events in separate buffers.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Prevent buffer overwrite disabled for latency tracers · 613f04a0
      Steven Rostedt (Red Hat) authored
      The latency tracers require the buffers to be in overwrite mode,
      otherwise they get screwed up. Force the buffers to stay in overwrite
      mode when latency tracers are enabled.
      
      Added a flag_changed() method to the tracer structure to allow
      the tracers to see what flags are being changed, and also be able
      to prevent the change from happening.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
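The veto mechanism in the commit above is a callback that is consulted before an option changes. This is a Python sketch under stated assumptions: only the method name flag_changed() comes from the commit message, while the classes, the "overwrite" option key, the `set_tracer_flag` helper, and the choice of -EINVAL as the rejection value are hypothetical.

```python
import errno


class Tracer:
    """Base tracer: sees flag changes and accepts all of them."""

    name = "nop"

    def flag_changed(self, flag, enabled):
        # Return 0 to accept the change, or a negative errno to veto it.
        return 0


class LatencyTracer(Tracer):
    """Tracer that needs the buffers to stay in overwrite mode."""

    name = "irqsoff"

    def flag_changed(self, flag, enabled):
        if flag == "overwrite" and not enabled:
            return -errno.EINVAL  # assumed error value, for illustration
        return 0


def set_tracer_flag(tracer, options, flag, enabled):
    """Apply a flag change only if the current tracer does not object."""
    ret = tracer.flag_changed(flag, enabled)
    if ret:
        return ret
    options[flag] = enabled
    return 0
```

With a latency tracer active, an attempt to clear "overwrite" is rejected and the option dictionary is left untouched; with the base tracer the same request goes through.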
    • tracing: Keep overwrite in sync between regular and snapshot buffers · 80902822
      Steven Rostedt (Red Hat) authored
      Changing the overwrite mode for the ring buffer via the trace
      option only sets the normal buffer. But the snapshot buffer could
      swap with it, and then the snapshot would be in non overwrite mode
      and the normal buffer would be in overwrite mode, even though the
      option flag states otherwise.
      
      Keep the two buffers overwrite modes in sync.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 14 Mar, 2013 1 commit
  3. 13 Mar, 2013 1 commit
    • tracing: Fix free of probe entry by calling call_rcu_sched() · 740466bc
      Steven Rostedt (Red Hat) authored
      Because function tracing is very invasive, and can even trace
      calls to rcu_read_lock(), RCU access in function tracing is done
      with preempt_disable_notrace(). This requires a synchronize_sched()
      for updates and not a synchronize_rcu().
      
      Function probes (traceon, traceoff, etc) must be freed after
      a synchronize_sched() after its entry has been removed from the
      hash. But call_rcu() is used. Fix this by using call_rcu_sched().
      
      Also fix the usage to use hlist_del_rcu() instead of hlist_del().
      
      Cc: stable@vger.kernel.org
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  4. 12 Mar, 2013 1 commit
    • tracing: Fix race in snapshot swapping · 2721e72d
      Steven Rostedt (Red Hat) authored
      Although the swap is wrapped with a spin_lock, the assignment
      of the temp buffer used to swap is not within that lock.
      It needs to be moved into that lock, otherwise two swaps
      happening on two different CPUs, can end up using the wrong
      temp buffer to assign in the swap.
      
      Luckily, all current callers of the swap function appear to have
      their own locks. But in case something is added that allows two
      different callers to call the swap, then there's a chance that
      this race can trigger and corrupt the buffers.
      
      New code is coming soon that will allow for this race to trigger.
      
      I've Cc'd stable, so this bug will not show up if someone backports
      one of the changes that can trigger this bug.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
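The shape of the fix in the commit above is that the temporary used for the swap is assigned while the lock is held. The following Python sketch illustrates that pattern with a `threading.Lock`. It is a user-space analogy, not the kernel's spin_lock code, and the names `BufferPair` and `hammer` are hypothetical.

```python
import threading


class BufferPair:
    """Toy stand-in for a main buffer and a snapshot buffer."""

    def __init__(self, main, snapshot):
        self.main = main
        self.snapshot = snapshot
        self._lock = threading.Lock()

    def swap(self):
        # The temporary is read inside the lock, so no other thread can
        # change self.main between reading it and writing it back.
        with self._lock:
            tmp = self.main
            self.main = self.snapshot
            self.snapshot = tmp


def hammer(pair, swaps_per_thread=1000, threads=4):
    """Call swap() concurrently from several threads."""

    def worker():
        for _ in range(swaps_per_thread):
            pair.swap()

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

Because every swap is atomic with respect to the others, an even total number of swaps leaves the two buffers where they started and never duplicates or loses one.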
  5. 08 Mar, 2013 1 commit
  6. 07 Mar, 2013 2 commits
    • tracing: Do not return EINVAL in snapshot when not allocated · c9960e48
      Steven Rostedt (Red Hat) authored
      To use the tracing snapshot feature, writing a '1' into the snapshot
      file causes the snapshot buffer to be allocated if it has not already
      been allocated and does a 'swap' with the main buffer, so that the
      snapshot now contains what was in the main buffer, and the main buffer
      now writes to what was the snapshot buffer.
      
      To free the snapshot buffer, a '0' is written into the snapshot file.
      
      To clear the snapshot buffer, any number but a '0' or '1' is written
      into the snapshot file. But if the buffer is not allocated, the write
      returns the -EINVAL error code. This is rather pointless. It is better
      just to do nothing and return success.
      Acked-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
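The write semantics of the snapshot file, as described in the commit above, can be summarized as a small state machine. This is a toy Python model: the class `SnapshotFile` and its list-based buffers are hypothetical, and only the meaning of the values '0', '1', and "anything else" is taken from the commit message.

```python
class SnapshotFile:
    """Toy model of writes to the tracing snapshot file."""

    def __init__(self):
        self.allocated = False
        self.main = []      # events recorded in the main buffer
        self.snapshot = []  # events held in the snapshot buffer

    def write(self, value):
        if value == 0:
            # Free the snapshot buffer.
            self.snapshot = []
            self.allocated = False
        elif value == 1:
            # Allocate if needed, then swap with the main buffer.
            self.allocated = True
            self.main, self.snapshot = self.snapshot, self.main
        elif self.allocated:
            # Any other number clears the snapshot without freeing it.
            self.snapshot = []
        # Not allocated and nothing to clear: do nothing, report success
        # (the behavior this commit introduces in place of -EINVAL).
        return 0
```

In this model a clear request on an unallocated snapshot is a no-op that succeeds, and it never allocates the buffer as a side effect.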
    • tracing: Add help of snapshot feature when snapshot is empty · d8741e2e
      Steven Rostedt (Red Hat) authored
      When cat'ing the snapshot file, instead of showing an empty trace
      header like the trace file does, show how to use the snapshot
      feature.
      
      Also, this is a good place to show if the snapshot has been allocated
      or not. Users may want to "pre allocate" the snapshot to have a fast
      "swap" of the current buffer. Otherwise, a swap would be slow and might
      fail as it would need to allocate the snapshot buffer, and that might
      fail under tight memory constraints.
      
      Here's what it looked like before:
      
       # tracer: nop
       #
       # entries-in-buffer/entries-written: 0/0   #P:4
       #
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
      
      Here's what it looks like now:
      
       # tracer: nop
       #
       #
       # * Snapshot is freed *
       #
       # Snapshot commands:
       # echo 0 > snapshot : Clears and frees snapshot buffer
       # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
       #                      Takes a snapshot of the main buffer.
       # echo 2 > snapshot : Clears snapshot buffer (but does not allocate)
       #                      (Doesn't have to be '2' works with any number that
       #                       is not a '0' or '1')
      Acked-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  7. 28 Feb, 2013 2 commits
  8. 20 Feb, 2013 6 commits
    • Merge branch 'tip/perf/core' of... · ff1fb5f6
      Ingo Molnar authored
      Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent
      
      Pull two fixes from Steven Rostedt.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86: Add Intel IvyBridge event scheduling constraints · 69943182
      Stephane Eranian authored
      Intel IvyBridge processor has different constraints compared
      to SandyBridge. Therefore it needs its own constraint table.
      This patch adds the constraint table.
      
      Without this patch, the events listed in the patch may not be
      scheduled correctly and bogus counts may be collected.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: ak@linux.intel.com
      Cc: acme@redhat.com
      Cc: jolsa@redhat.com
      Cc: namhyung.kim@lge.com
      Link: http://lkml.kernel.org/r/1361355312-3323-1-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge branch 'for-3.9-async' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · ece8e0b2
      Linus Torvalds authored
      Pull async changes from Tejun Heo:
       "These are followups for the earlier deadlock issue involving async
        ending up waiting for itself through block requesting module[1].  The
        following changes are made by these commits.
      
         - Instead of requesting default elevator on each request_queue init,
           block now requests it once early during boot.
      
         - Kmod triggers warning if invoked from an async worker.
      
         - Async synchronization implementation has been reimplemented.  It's
           a lot simpler now."
      
      * 'for-3.9-async' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        async: initialise list heads to fix crash
        async: replace list of active domains with global list of pending items
        async: keep pending tasks on async_domain and remove async_pending
        async: use ULLONG_MAX for infinity cookie value
        async: bring sanity to the use of words domain and running
        async, kmod: warn on synchronous request_module() from async workers
        block: don't request module during elevator init
        init, block: try to load default elevator module early during boot
    • Merge branch 'for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 67cb104b
      Linus Torvalds authored
      Pull workqueue changes from Tejun Heo:
       "A lot of reorganization is going on mostly to prepare for worker pools
        with custom attributes so that workqueue can replace custom pool
        implementations in places including writeback and btrfs and make CPU
        assignment in crypto more flexible.
      
        workqueue evolved from purely per-cpu design and implementation, so
        there are a lot of assumptions regarding being bound to CPUs and even
        unbound workqueues are implemented as an extension of the model -
        workqueues running on the special unbound CPU.  Bulk of changes this
        round are about promoting worker_pools as the top level abstraction
        replacing global_cwq (global cpu workqueue).  At this point, I'm
        fairly confident about getting custom worker pools working pretty soon
        and ready for the next merge window.
      
        Lai's patches are replacing the convoluted mb() dancing workqueue has
        been doing with much simpler mechanism which only depends on
        assignment atomicity of long.  For details, please read the commit
        message of 0b3dae68 ("workqueue: simplify is-work-item-queued-here
        test").  While the change ends up adding one pointer to struct
        delayed_work, the inflation in percentage is less than five percent
        and it decouples delayed_work logic a lot more cleaner from usual work
        handling, removes the unusual memory barrier dancing, and allows for
        further simplification, so I think the trade-off is acceptable.
      
        There will be two more workqueue related pull requests and there are
        some shared commits among them.  I'll write further pull requests
        assuming this pull request is pulled first."
      
      * 'for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (37 commits)
        workqueue: un-GPL function delayed_work_timer_fn()
        workqueue: rename cpu_workqueue to pool_workqueue
        workqueue: reimplement is_chained_work() using current_wq_worker()
        workqueue: fix is_chained_work() regression
        workqueue: pick cwq instead of pool in __queue_work()
        workqueue: make get_work_pool_id() cheaper
        workqueue: move nr_running into worker_pool
        workqueue: cosmetic update in try_to_grab_pending()
        workqueue: simplify is-work-item-queued-here test
        workqueue: make work->data point to pool after try_to_grab_pending()
        workqueue: add delayed_work->wq to simplify reentrancy handling
        workqueue: make work_busy() test WORK_STRUCT_PENDING first
        workqueue: replace WORK_CPU_NONE/LAST with WORK_CPU_END
        workqueue: post global_cwq removal cleanups
        workqueue: rename nr_running variables
        workqueue: remove global_cwq
        workqueue: remove worker_pool->gcwq
        workqueue: replace for_each_worker_pool() with for_each_std_worker_pool()
        workqueue: make freezing/thawing per-pool
        workqueue: make hotplug processing per-pool
        ...
    • Merge branch 'for-3.9-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 1eaec821
      Linus Torvalds authored
      Pull workqueue [delayed_]work_pending() cleanups from Tejun Heo:
       "This is part of on-going cleanups to remove / minimize usages of
        workqueue interfaces which are deprecated and/or misleading.
      
        This round drops a number of usages of [delayed_]work_pending(), which
        are dangerous as they lack any form of synchronization and thus often
        lead to buggy / unnecessary code.  There are a couple legitimate use
        cases in kernel.  Hopefully, they can be converted and
        [delayed_]work_pending() can be removed completely.  Even if not,
        removing most of misuses should make it more difficult to find
        examples of misuses and thus slow down growth of them.
      
        These changes are independent from other workqueue changes."
      
      * 'for-3.9-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        wimax/i2400m: fix i2400m->wake_tx_skb handling
        kprobes: fix wait_for_kprobe_optimizer()
        ipw2x00: simplify scan_event handling
        video/exynos: don't use [delayed_]work_pending()
        tty/max3100: don't use [delayed_]work_pending()
        x86/mce: don't use [delayed_]work_pending()
        rfkill: don't use [delayed_]work_pending()
        wl1251: don't use [delayed_]work_pending()
        thinkpad_acpi: don't use [delayed_]work_pending()
        mwifiex: don't use [delayed_]work_pending()
        sja1000: don't use [delayed_]work_pending()
    • Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1a13c0b1
      Linus Torvalds authored
      Pull x86 UV3 support update from Ingo Molnar:
       "Support for the SGI Ultraviolet System 3 (UV3) platform - the upcoming
        third major iteration and upscaling of the SGI UV supercomputing
        platform."
      
      * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, uv, uv3: Trim MMR register definitions after code changes for SGI UV3
        x86, uv, uv3: Check current gru hub support for SGI UV3
        x86, uv, uv3: Update Time Support for SGI UV3
        x86, uv, uv3: Update x2apic Support for SGI UV3
        x86, uv, uv3: Update Hub Info for SGI UV3
        x86, uv, uv3: Update ACPI Check to include SGI UV3
        x86, uv, uv3: Update MMR register definitions for SGI Ultraviolet System 3 (UV3)