    tracing: Optimize trace_printk() with one arg to use trace_puts() · 9d3c752c
    Steven Rostedt (Red Hat) authored
    Although trace_printk() is extremely fast, especially when it uses
    trace_bprintk() (which writes args straight to the buffer instead of
    inserting them into a string), it still has the overhead of calling
    one of the printf()/sprintf() functions, which need to scan the fmt
    string to determine what args, if any, it has.
    
    This is a waste of precious CPU cycles if the trace_printk() format
    is a single constant string with no args. It is better to use
    trace_puts(), which does not have the overhead of the fmt scanning.
    
    But wouldn't it be nice if the developer didn't have to think about
    such things, and the compiler would just do it for them?
    
      trace_printk("this string has no args\n");
      [...]
      trace_printk("this string does %p %d\n", foo, bar);
    
    As it is critical for tracing to have the least amount of overhead,
    especially when dealing with race conditions and when you want to
    eliminate any "Heisenbugs", you want trace_printk() to use the
    fastest possible means of tracing.
    
    Currently the macro magic determines whether it can use
    trace_bprintk() or, if the fmt is a dynamic string (a variable),
    must fall back to the slow trace_printk() method that does a full
    snprintf() before copying the result into the buffer, whereas
    trace_bprintk() only copies the pointer to the fmt and the args
    into the buffer.
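
    The difference between the two paths can be sketched in user space.
    This is only an illustration: the record layouts and the names
    record_eager(), record_deferred() and render_deferred() are
    hypothetical, and the real ring buffer code looks nothing like this.
    The eager path formats at trace time; the deferred path stores the
    fmt pointer plus raw args and formats only when the trace is read.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record types; the real ring buffer layout differs. */
struct eager_record {
	char text[64];		/* fully formatted at trace time */
};

struct deferred_record {
	const char *fmt;	/* pointer to the constant format string */
	long args[2];		/* raw argument values, formatted later */
};

/* Slow path: pay for the full format scan while tracing. */
static void record_eager(struct eager_record *rec, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(rec->text, sizeof(rec->text), fmt, ap);
	va_end(ap);
}

/* Fast path: only copy the fmt pointer and the args. */
static void record_deferred(struct deferred_record *rec, const char *fmt,
			    long a, long b)
{
	rec->fmt = fmt;
	rec->args[0] = a;
	rec->args[1] = b;
}

/* The formatting cost is paid here, when the trace is read back. */
static void render_deferred(const struct deferred_record *rec,
			    char *out, size_t len)
{
	snprintf(out, len, rec->fmt, rec->args[0], rec->args[1]);
}
```

    Both paths produce the same text in the end; the deferred one just
    moves the formatting work out of the traced code path. It only works
    when fmt outlives the record, which is why a dynamic fmt cannot take
    this path.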
    
    Well, now there's a way to spend some more Hogwarts cash and come
    up with new fancy macro magic.
    
      #define trace_printk(fmt, ...)			\
      do {							\
    	char _______STR[] = __stringify((__VA_ARGS__));	\
    	if (sizeof(_______STR) > 3)			\
    		do_trace_printk(fmt, ##__VA_ARGS__);	\
    	else						\
    		trace_puts(fmt);			\
      } while (0)
    
    The above needs a bit of explaining (both here and in the comments).
    
    By stringifying the __VA_ARGS__, we can, at compile time, determine
    the number of args that are being passed to trace_printk(). The extra
    parentheses are required, otherwise the compiler complains about
    too many parameters for __stringify if there is more than one arg.
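
    A small user-space sketch of why the parentheses matter. The
    PAREN_STR() helpers are hypothetical stand-ins that assume a
    single-parameter stringify macro; they are not the kernel's
    __stringify().

```c
#include <string.h>

/* Single-parameter stringify helpers, named only for this sketch. */
#define PAREN_STR_1(x)	#x
#define PAREN_STR(x)	PAREN_STR_1(x)

/*
 * The parentheses turn "a, b" into one macro argument, "(a, b)".
 * Without them, PAREN_STR(a, b) passes two arguments to a
 * one-parameter macro and fails to compile.
 */
static const char two_args[] = PAREN_STR((a, b));

/* With nothing inside, only the parentheses themselves remain. */
static const char no_args[] = PAREN_STR(());
```

    Here two_args holds "(a, b)" and no_args holds "()", so the
    parentheses always contribute exactly two characters to the result.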
    
    When there are no args, the __stringify((__VA_ARGS__)) converts into
    "()\0", a string of 3 characters. Anything else will be a string
    containing more than 3 characters. Now we assign that string to a
    local char array, and then take the sizeof() of that array.
    If it is greater than 3 characters, we know trace_printk() has args
    and we need to do the full "do_trace_printk()" on them, otherwise
    it was only passed a single arg and we can optimize to use trace_puts().
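
    The same check can be tried outside the kernel. The sketch below is
    a user-space approximation, not the kernel code: demo_trace(),
    demo_puts(), demo_printf() and the call counters are hypothetical
    names, and it relies on the GNU ", ##__VA_ARGS__" extension (as the
    macro above does), so it assumes GCC or Clang.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stringify helpers for this sketch. */
#define DEMO_STR_1(x)	#x
#define DEMO_STR(x)	DEMO_STR_1(x)

/* Counters so a caller can see which path was taken. */
static int puts_calls;
static int printf_calls;

/* Stand-in for trace_puts(): no format scanning. */
static void demo_puts(const char *s)
{
	puts_calls++;
	fputs(s, stdout);
}

/* Stand-in for do_trace_printk(): full format processing. */
static void demo_printf(const char *fmt, ...)
{
	va_list ap;

	printf_calls++;
	va_start(ap, fmt);
	vprintf(fmt, ap);
	va_end(ap);
}

/*
 * sizeof(_str) is a compile-time constant: 3 for "()" when there
 * are no args, larger otherwise, so the compiler can drop the
 * branch that is never taken.
 */
#define demo_trace(fmt, ...)					\
do {								\
	char _str[] = DEMO_STR((__VA_ARGS__));			\
	if (sizeof(_str) > 3)					\
		demo_printf(fmt, ##__VA_ARGS__);		\
	else							\
		demo_puts(fmt);					\
} while (0)
```

    Calling demo_trace("no args\n") bumps only puts_calls, while
    demo_trace("%d %d\n", 1, 2) bumps only printf_calls.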
    
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Signed-off-by: Steven "The King of Nasty Macros!" Rostedt <rostedt@goodmis.org>