Fix argdist, trace, tplist to use the libbcc USDT support (#698)

* Allow argdist to enable USDT probes without a pid

The current code would only pass the pid to the USDT class, thereby not allowing USDT probes to be enabled from the binary path only. If the probe doesn't have a semaphore, it can actually be enabled for all processes in a uniform fashion -- which is now supported.

* Reintroduce USDT support into tplist

To print USDT probe information, tplist needs an API to return the probe data, including the number of arguments and locations for each probe. This commit introduces this API, called bcc_usdt_foreach, and invokes it from the revised tplist implementation. Although the result is not 100% identical to the original tplist, which could also print the probe argument information, this is not strictly required for users of the argdist and trace tools, which is why it was omitted for now.

* Fix trace.py tracepoint support

Somehow, the import of the Perf class was omitted from tracepoint.py, which would cause failures when trace enables kernel tracepoints.

* trace: Native bcc USDT support

trace now works again by using the new bcc USDT support instead of the home-grown Python USDT parser. This required an additional change in the BPF Python API to allow multiple USDT context objects to be passed to the constructor in order to support multiple USDT probes in a single invocation of trace. Otherwise, the USDT-related code in trace was greatly simplified, and uses the `bpf_usdt_readarg` macros to obtain probe argument values.

One minor inconvenience that was introduced in the bcc USDT API is that USDT probes with multiple locations that reside in a shared object *must* have a pid specified to enable, even if they don't have an associated semaphore. The reason is that the bcc USDT code figures out which location invoked the probe by inspecting `ctx->ip`, which, for shared objects, can only be determined when the specific process context is available to figure out where the shared object was loaded.

This limitation did not previously exist, because instead of looking at `ctx->ip`, the Python USDT reader generated separate code for each probe location with an incrementing identifier. It's not a very big deal because it only means that some probes can't be enabled without specifying a process id, which is almost always desired anyway for USDT probes.

argdist has not yet been retrofitted with support for multiple USDT probes, and needs to be updated in a separate commit.

* argdist: Support multiple USDT probes

argdist now supports multiple USDT probes, as it did before the transition to the native bcc USDT support. This requires aggregating the USDT objects from each probe and passing them together to the BPF constructor when the probes are initialized and attached. Also add a more descriptive exception message to the USDT class when it fails to enable a probe.
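
For illustration, the enumeration callback behind bcc_usdt_foreach can be sketched in pure Python with ctypes. This is a minimal sketch under stated assumptions, not the bcc implementation: `fake_usdt_foreach` and `SAMPLE_PROBES` are hypothetical stand-ins so the snippet runs without libbcc, while the structure layout mirrors the `bcc_usdt` struct added in this commit. The sample record reuses the loop_iter values shown in tplist_example.txt.

```python
import ctypes as ct


class bcc_usdt(ct.Structure):
    # Same field order as the C struct declared in bcc_usdt.h by this commit.
    _fields_ = [
        ("provider", ct.c_char_p),
        ("name", ct.c_char_p),
        ("bin_path", ct.c_char_p),
        ("semaphore", ct.c_ulonglong),
        ("num_locations", ct.c_int),
        ("num_arguments", ct.c_int),
    ]


# Callback type: void (*)(struct bcc_usdt *)
USDT_CB = ct.CFUNCTYPE(None, ct.POINTER(bcc_usdt))

# Hypothetical canned data standing in for what libbcc would report.
SAMPLE_PROBES = [
    (b"basic_usdt", b"loop_iter", b"/home/vagrant/basic_usdt", 0x601036, 2, 2),
]


def fake_usdt_foreach(records, callback):
    """Hypothetical stand-in for lib.bcc_usdt_foreach: one callback per probe."""
    for record in records:
        info = bcc_usdt(*record)
        callback(ct.pointer(info))


def enumerate_probes(records):
    probes = []

    def _add_probe(probe):
        p = probe.contents
        # Copy every field out right away: the struct is only valid
        # for the duration of the callback.
        probes.append({
            "provider": p.provider.decode(),
            "name": p.name.decode(),
            "bin_path": p.bin_path.decode(),
            "semaphore": p.semaphore,
            "num_locations": p.num_locations,
            "num_arguments": p.num_arguments,
        })

    fake_usdt_foreach(records, USDT_CB(_add_probe))
    return probes


if __name__ == "__main__":
    for probe in enumerate_probes(SAMPLE_PROBES):
        print("%s %s:%s [sema 0x%x]" % (probe["bin_path"], probe["provider"],
                                        probe["name"], probe["semaphore"]))
```

Copying the fields inside the callback follows the same approach as the `USDTProbe` constructor in this commit, which reads each field from `probe.contents` before the callback returns.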

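The aggregation of several USDT contexts into one BPF program can be sketched as follows. This is a simplified illustration that assumes each context exposes a `get_text()` method returning either the generated argument-reading code or None; `StubUSDTContext` and `prepend_usdt_text` are hypothetical names used only so the snippet runs without bcc installed.

```python
class StubUSDTContext(object):
    """Hypothetical stand-in for a USDT context; returns canned text."""

    def __init__(self, text):
        self._text = text

    def get_text(self):
        return self._text


def prepend_usdt_text(program, usdt_contexts):
    """Prepend each context's generated code to the BPF program text."""
    for usdt_context in usdt_contexts:
        usdt_text = usdt_context.get_text()
        if usdt_text is None:
            # Mirrors the failure mode described in the commit message:
            # a multi-location probe in a shared object without a pid.
            raise Exception("can't generate USDT probe arguments; "
                            "possible cause is missing pid when a probe "
                            "in a shared object has multiple locations")
        program = usdt_text + program
    return program


if __name__ == "__main__":
    contexts = [StubUSDTContext("/* args for probe A */\n"),
                StubUSDTContext("/* args for probe B */\n")]
    print(prepend_usdt_text("int probe_fn(struct pt_regs *ctx) { return 0; }\n",
                            contexts))
```

Because each context's text is prepended in turn, the text of the last context ends up first in the final program.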
Commit 69e361ac authored Sep 27, 2016 by Sasha Goldshtein Committed by 4ast Sep 27, 2016
11 changed files
--- a/src/cc/bcc_usdt.h
+++ b/src/cc/bcc_usdt.h
@@ -26,6 +26,18 @@ void *bcc_usdt_new_frompid(int pid);
 void *bcc_usdt_new_frompath(const char *path);
 void bcc_usdt_close(void *usdt);

+struct bcc_usdt {
+    const char *provider;
+    const char *name;
+    const char *bin_path;
+    uint64_t semaphore;
+    int num_locations;
+    int num_arguments;
+};
+
+typedef void (*bcc_usdt_cb)(struct bcc_usdt *);
+void bcc_usdt_foreach(void *usdt, bcc_usdt_cb callback);
+
 int bcc_usdt_enable_probe(void *, const char *, const char *);
 const char *bcc_usdt_genargs(void *);


--- a/src/cc/usdt.cc
+++ b/src/cc/usdt.cc
@@ -24,6 +24,7 @@
 #include "bcc_proc.h"
 #include "usdt.h"
 #include "vendor/tinyformat.hpp"
+#include "bcc_usdt.h"

 namespace USDT {

@@ -255,6 +256,19 @@ bool Context::enable_probe(const std::string &probe_name,
  return p && p->enable(fn_name);
 }

+void Context::each(each_cb callback) {
+  for (const auto &probe : probes_) {
+    struct bcc_usdt info = {0};
+    info.provider = probe->provider().c_str();
+    info.bin_path = probe->bin_path().c_str();
+    info.name = probe->name().c_str();
+    info.semaphore = probe->semaphore();
+    info.num_locations = probe->num_locations();
+    info.num_arguments = probe->num_arguments();
+    callback(&info);
+  }
+}
+
 void Context::each_uprobe(each_uprobe_cb callback) {
  for (auto &p : probes_) {
    if (!p->enabled())
@@ -288,7 +302,6 @@ Context::~Context() {
 }

 extern "C" {
-#include "bcc_usdt.h"

 void *bcc_usdt_new_frompid(int pid) {
  USDT::Context *ctx = new USDT::Context(pid);
@@ -331,6 +344,11 @@ const char *bcc_usdt_genargs(void *usdt) {
  return storage_.c_str();
 }

+void bcc_usdt_foreach(void *usdt, bcc_usdt_cb callback) {
+  USDT::Context *ctx = static_cast<USDT::Context *>(usdt);
+  ctx->each(callback);
+}
+
 void bcc_usdt_foreach_uprobe(void *usdt, bcc_usdt_uprobe_cb callback) {
  USDT::Context *ctx = static_cast<USDT::Context *>(usdt);
  ctx->each_uprobe(callback);

--- a/src/cc/usdt.h
+++ b/src/cc/usdt.h
@@ -23,6 +23,8 @@
 #include "syms.h"
 #include "vendor/optional.hpp"

+struct bcc_usdt;
+
 namespace USDT {

 using std::experimental::optional;
@@ -148,6 +150,7 @@ public:

  size_t num_locations() const { return locations_.size(); }
  size_t num_arguments() const { return locations_.front().arguments_.size(); }
+  uint64_t semaphore()   const { return semaphore_; }

  uint64_t address(size_t n = 0) const { return locations_[n].address_; }
  bool usdt_getarg(std::ostream &stream);
@@ -194,6 +197,9 @@ public:
  bool enable_probe(const std::string &probe_name, const std::string &fn_name);
  bool generate_usdt_args(std::ostream &stream);

+  typedef void (*each_cb)(struct bcc_usdt *);
+  void each(each_cb callback);
+
  typedef void (*each_uprobe_cb)(const char *, const char *, uint64_t, int);
  void each_uprobe(each_uprobe_cb callback);
 };

--- a/src/python/bcc/__init__.py
+++ b/src/python/bcc/__init__.py
@@ -149,7 +149,7 @@ class BPF(object):
        return None

    def __init__(self, src_file="", hdr_file="", text=None, cb=None, debug=0,
-            cflags=[], usdt=None):
+            cflags=[], usdt_contexts=[]):
        """Create a a new BPF module with the given source code.

        Note:
@@ -179,7 +179,15 @@ class BPF(object):
        self.tables = {}
        cflags_array = (ct.c_char_p * len(cflags))()
        for i, s in enumerate(cflags): cflags_array[i] = s.encode("ascii")
-        if usdt and text: text = usdt.get_text() + text
+        if text:
+            for usdt_context in usdt_contexts:
+                usdt_text = usdt_context.get_text()
+                if usdt_text is None:
+                    raise Exception("can't generate USDT probe arguments; " +
+                                    "possible cause is missing pid when a " +
+                                    "probe in a shared object has multiple " +
+                                    "locations")
+                text = usdt_context.get_text() + text

        if text:
            self.module = lib.bpf_module_create_c_from_string(text.encode("ascii"),
@@ -197,7 +205,8 @@ class BPF(object):
        if not self.module:
            raise Exception("Failed to compile BPF module %s" % src_file)

-        if usdt: usdt.attach_uprobes(self)
+        for usdt_context in usdt_contexts:
+            usdt_context.attach_uprobes(self)

        # If any "kprobe__" or "tracepoint__" prefixed functions were defined,
        # they will be loaded and attached here.

--- a/src/python/bcc/libbcc.py
+++ b/src/python/bcc/libbcc.py
@@ -157,7 +157,23 @@ lib.bcc_usdt_enable_probe.argtypes = [ct.c_void_p, ct.c_char_p, ct.c_char_p]
 lib.bcc_usdt_genargs.restype = ct.c_char_p
 lib.bcc_usdt_genargs.argtypes = [ct.c_void_p]

-_USDT_CB = ct.CFUNCTYPE(None, ct.c_char_p, ct.c_char_p, ct.c_ulonglong, ct.c_int)
+class bcc_usdt(ct.Structure):
+    _fields_ = [
+            ('provider', ct.c_char_p),
+            ('name', ct.c_char_p),
+            ('bin_path', ct.c_char_p),
+            ('semaphore', ct.c_ulonglong),
+            ('num_locations', ct.c_int),
+            ('num_arguments', ct.c_int),
+        ]
+
+_USDT_CB = ct.CFUNCTYPE(None, ct.POINTER(bcc_usdt))
+
+lib.bcc_usdt_foreach.restype = None
+lib.bcc_usdt_foreach.argtypes = [ct.c_void_p, _USDT_CB]
+
+_USDT_PROBE_CB = ct.CFUNCTYPE(None, ct.c_char_p, ct.c_char_p,
+                              ct.c_ulonglong, ct.c_int)

 lib.bcc_usdt_foreach_uprobe.restype = None
-lib.bcc_usdt_foreach_uprobe.argtypes = [ct.c_void_p, _USDT_CB]
+lib.bcc_usdt_foreach_uprobe.argtypes = [ct.c_void_p, _USDT_PROBE_CB]
--- a/src/python/bcc/tracepoint.py
+++ b/src/python/bcc/tracepoint.py
@@ -16,6 +16,7 @@ import ctypes as ct
 import multiprocessing
 import os
 import re
+from .perf import Perf

 class Tracepoint(object):
        enabled_tracepoints = []

--- a/src/python/bcc/usdt.py
+++ b/src/python/bcc/usdt.py
@@ -12,34 +12,67 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-from .libbcc import lib, _USDT_CB
+from .libbcc import lib, _USDT_CB, _USDT_PROBE_CB
+
+class USDTProbe(object):
+    def __init__(self, usdt):
+        self.provider = usdt.provider
+        self.name = usdt.name
+        self.bin_path = usdt.bin_path
+        self.semaphore = usdt.semaphore
+        self.num_locations = usdt.num_locations
+        self.num_arguments = usdt.num_arguments
+
+    def __str__(self):
+        return "%s %s:%s [sema 0x%x]\n  %d location(s)\n  %d argument(s)" % \
+               (self.bin_path, self.provider, self.name, self.semaphore,
+                self.num_locations, self.num_arguments)
+
+    def short_name(self):
+        return "%s:%s" % (self.provider, self.name)

 class USDT(object):
    def __init__(self, pid=None, path=None):
-        if pid:
+        if pid and pid != -1:
            self.pid = pid
            self.context = lib.bcc_usdt_new_frompid(pid)
            if self.context == None:
-                raise Exception("USDT failed to instrument PID %d" % pid) 
+                raise Exception("USDT failed to instrument PID %d" % pid)
        elif path:
            self.path = path
            self.context = lib.bcc_usdt_new_frompath(path)
            if self.context == None:
-                raise Exception("USDT failed to instrument path %s" % path) 
+                raise Exception("USDT failed to instrument path %s" % path)
+        else:
+            raise Exception("either a pid or a binary path must be specified")

    def enable_probe(self, probe, fn_name):
        if lib.bcc_usdt_enable_probe(self.context, probe, fn_name) != 0:
-            raise Exception("failed to enable probe '%s'" % probe)
+            raise Exception(("failed to enable probe '%s'; a possible cause " +
+                            "can be that the probe requires a pid to enable") %
+                            probe)

    def get_text(self):
        return lib.bcc_usdt_genargs(self.context)

+    def enumerate_probes(self):
+        probes = []
+        def _add_probe(probe):
+            probes.append(USDTProbe(probe.contents))
+
+        lib.bcc_usdt_foreach(self.context, _USDT_CB(_add_probe))
+        return probes
+
+    # This is called by the BPF module's __init__ when it realizes that there
+    # is a USDT context and probes need to be attached.
    def attach_uprobes(self, bpf):
        probes = []
        def _add_probe(binpath, fn_name, addr, pid):
            probes.append((binpath, fn_name, addr, pid))

-        lib.bcc_usdt_foreach_uprobe(self.context, _USDT_CB(_add_probe))
+        lib.bcc_usdt_foreach_uprobe(self.context, _USDT_PROBE_CB(_add_probe))

        for (binpath, fn_name, addr, pid) in probes:
-            bpf.attach_uprobe(name=binpath, fn_name=fn_name, addr=addr, pid=pid)
+            bpf.attach_uprobe(name=binpath, fn_name=fn_name,
+                              addr=addr, pid=pid)
+
--- a/tools/argdist.py
+++ b/tools/argdist.py
@@ -175,8 +175,9 @@ u64 __time = bpf_ktime_get_ns();
                        self._bail("no exprs specified")
                self.exprs = exprs.split(',')

-        def __init__(self, bpf, type, specifier):
-                self.pid = bpf.args.pid
+        def __init__(self, tool, type, specifier):
+                self.usdt_ctx = None
+                self.pid = tool.args.pid
                self.raw_spec = specifier
                self._validate_specifier()

@@ -200,8 +201,7 @@ u64 __time = bpf_ktime_get_ns();
                        self.library = parts[1]
                        self.probe_func_name = "%s_probe%d" % \
                                (self.function, Probe.next_probe_index)
-                        bpf.enable_usdt_probe(self.function,
-                                        fn_name=self.probe_func_name)
+                        self._enable_usdt_probe()
                else:
                        self.library = parts[1]
                self.is_user = len(self.library) > 0
@@ -242,8 +242,10 @@ u64 __time = bpf_ktime_get_ns();
                        (self.function, Probe.next_probe_index)
                Probe.next_probe_index += 1

-        def close(self):
-                pass
+        def _enable_usdt_probe(self):
+                self.usdt_ctx = USDT(path=self.library, pid=self.pid)
+                self.usdt_ctx.enable_probe(
+                        self.function, self.probe_func_name)

        def _substitute_exprs(self):
                def repl(expr):
@@ -262,12 +264,17 @@ u64 __time = bpf_ktime_get_ns();
                else:
                        return "%s v%d;\n" % (self.expr_types[i], i)

+        def _generate_usdt_arg_assignment(self, i):
+                expr = self.exprs[i]
+                if self.probe_type == "u" and expr[0:3] == "arg":
+                        return ("        u64 %s = 0;\n" +
+                                "        bpf_usdt_readarg(%s, ctx, &%s);\n") % \
+                                (expr, expr[3], expr)
+                else:
+                        return ""
+
        def _generate_field_assignment(self, i):
-                text = ""
-                if self.probe_type == "u" and self.exprs[i][0:3] == "arg":
-                    text = ("        u64 %s;\n" + 
-                           "        bpf_usdt_readarg(%s, ctx, &%s);\n") % \
-                           (self.exprs[i], self.exprs[i][3], self.exprs[i])
+                text = self._generate_usdt_arg_assignment(i)
                if self._is_string(self.expr_types[i]):
                        return (text + "        bpf_probe_read(&__key.v%d.s," +
                                " sizeof(__key.v%d.s), (void *)%s);\n") % \
@@ -291,8 +298,9 @@ u64 __time = bpf_ktime_get_ns();

        def _generate_key_assignment(self):
                if self.type == "hist":
-                        return "%s __key = %s;\n" % \
-                                (self.expr_types[0], self.exprs[0])
+                        return self._generate_usdt_arg_assignment(0) + \
+                               ("%s __key = %s;\n" % \
+                                (self.expr_types[0], self.exprs[0]))
                else:
                        text = "struct %s_key_t __key = {};\n" % \
                                self.probe_hash_name
@@ -590,11 +598,6 @@ argdist -p 2780 -z 120 \\
                        print("at least one specifier is required")
                        exit()

-        def enable_usdt_probe(self, probe_name, fn_name):
-                if not self.usdt_ctx:
-                        self.usdt_ctx = USDT(pid=self.args.pid)
-                self.usdt_ctx.enable_probe(probe_name, fn_name)
-
        def _generate_program(self):
                bpf_source = """
 struct __string_t { char s[%d]; };
@@ -610,9 +613,13 @@ struct __string_t { char s[%d]; };
                for probe in self.probes:
                        bpf_source += probe.generate_text()
                if self.args.verbose:
-                        if self.usdt_ctx: print(self.usdt_ctx.get_text())
+                        for text in [probe.usdt_ctx.get_text() \
+                                     for probe in self.probes if probe.usdt_ctx]:
+                            print(text)
                        print(bpf_source)
-                self.bpf = BPF(text=bpf_source, usdt=self.usdt_ctx)
+                usdt_contexts = [probe.usdt_ctx
+                                 for probe in self.probes if probe.usdt_ctx]
+                self.bpf = BPF(text=bpf_source, usdt_contexts=usdt_contexts)

        def _attach(self):
                Tracepoint.attach(self.bpf)
@@ -637,12 +644,6 @@ struct __string_t { char s[%d]; };
                           count_so_far >= self.args.count:
                                exit()

-        def _close_probes(self):
-                for probe in self.probes:
-                        probe.close()
-                        if self.args.verbose:
-                                print("closed probe: " + str(probe))
-
        def run(self):
                try:
                        self._create_probes()
@@ -654,7 +655,6 @@ struct __string_t { char s[%d]; };
                                traceback.print_exc()
                        elif sys.exc_info()[0] is not SystemExit:
                                print(sys.exc_info()[1])
-                self._close_probes()

 if __name__ == "__main__":
        Tool().run()
--- a/tools/tplist.py
+++ b/tools/tplist.py
@@ -13,7 +13,7 @@ import os
 import re
 import sys

-from bcc import USDTReader
+from bcc import USDT

 trace_root = "/sys/kernel/debug/tracing"
 event_root = os.path.join(trace_root, "events")
@@ -21,7 +21,7 @@ event_root = os.path.join(trace_root, "events")
 parser = argparse.ArgumentParser(description=
                "Display kernel tracepoints or USDT probes and their formats.",
                formatter_class=argparse.RawDescriptionHelpFormatter)
-parser.add_argument("-p", "--pid", type=int, default=-1, help=
+parser.add_argument("-p", "--pid", type=int, default=None, help=
                "List USDT probes in the specified process")
 parser.add_argument("-l", "--lib", default="", help=
                "List USDT probes in the specified library or executable")
@@ -65,23 +65,23 @@ def print_tracepoints():
                                print_tpoint(category, event)

 def print_usdt(pid, lib):
-        reader = USDTReader(bin_path=lib, pid=pid)
+        reader = USDT(path=lib, pid=pid)
        probes_seen = []
-        for probe in reader.probes:
-                probe_name = "%s:%s" % (probe.provider, probe.name)
+        for probe in reader.enumerate_probes():
+                probe_name = probe.short_name()
                if not args.filter or fnmatch.fnmatch(probe_name, args.filter):
                        if probe_name in probes_seen:
                                continue
                        probes_seen.append(probe_name)
                        if args.variables:
-                                print(probe.display_verbose())
+                                print(probe)
                        else:
                                print("%s %s:%s" % (probe.bin_path,
-                                        probe.provider, probe.name))
+                                                    probe.provider, probe.name))

 if __name__ == "__main__":
        try:
-                if args.pid != -1 or args.lib != "":
+                if args.pid or args.lib != "":
                        print_usdt(args.pid, args.lib)
                else:
                        print_tracepoints()

--- a/tools/tplist_example.txt
+++ b/tools/tplist_example.txt
@@ -17,25 +17,18 @@ $ tplist -l basic_usdt
 /home/vagrant/basic_usdt basic_usdt:loop_iter
 /home/vagrant/basic_usdt basic_usdt:end_main

-The loop_iter probe sounds interesting. What are the locations of that
-probe, and which variables are available?
+The loop_iter probe sounds interesting. How many arguments are available?

 $ tplist '*loop_iter' -l basic_usdt -v
 /home/vagrant/basic_usdt basic_usdt:loop_iter [sema 0x601036]
-  location 0x400550 raw args: -4@$42 8@%rax
-    4   signed bytes @ constant 42
-    8 unsigned bytes @ register %rax
-  location 0x40056f raw args: 8@-8(%rbp) 8@%rax
-    8 unsigned bytes @ -8(%rbp)
-    8 unsigned bytes @ register %rax
+  2 location(s)
+  2 argument(s)

 This output indicates that the loop_iter probe is used in two locations
-in the basic_usdt executable. The first location passes a constant value,
-42, to the probe. The second location passes a variable value located at
-an offset from the %rbp register. Don't worry -- you don't have to trace
-the register values yourself. The argdist and trace tools understand the
-probe format and can print out the arguments automatically -- you can
-refer to them as arg1, arg2, and so on.
+in the basic_usdt executable, and that it has two arguments. Fortunately,
+the argdist and trace tools understand the probe format and can print out
+the arguments automatically -- you can refer to them as arg1, arg2, and
+so on.

 Try to explore with some common libraries on your system and see if they
 contain UDST probes. Here are two examples you might find interesting:

--- a/tools/trace.py
+++ b/tools/trace.py
@@ -59,6 +59,7 @@ class Probe(object):
                cls.pid = args.pid or -1

        def __init__(self, probe, string_size):
+                self.usdt = None
                self.raw_probe = probe
                self.string_size = string_size
                Probe.probe_count += 1
@@ -145,30 +146,15 @@ class Probe(object):
                        # We will discover the USDT provider by matching on
                        # the USDT name in the specified library
                        self._find_usdt_probe()
-                        self._enable_usdt_probe()
                else:
                        self.library = parts[1]
                        self.function = parts[2]

-        def _enable_usdt_probe(self):
-                if self.usdt.need_enable():
-                        if Probe.pid == -1:
-                                self._bail("probe needs pid to enable")
-                        self.usdt.enable(Probe.pid)
-
-        def _disable_usdt_probe(self):
-                if self.probe_type == "u" and self.usdt.need_enable():
-                        self.usdt.disable(Probe.pid)
-
-        def close(self):
-                self._disable_usdt_probe()
-
        def _find_usdt_probe(self):
-                reader = USDTReader(bin_path=self.library)
-                for probe in reader.probes:
+                self.usdt = USDT(path=self.library, pid=Probe.pid)
+                for probe in self.usdt.enumerate_probes():
                        if probe.name == self.usdt_name:
-                                self.usdt = probe
-                                return
+                                return # Found it, will enable later
                self._bail("unrecognized USDT probe %s" % self.usdt_name)

        def _parse_filter(self, filt):
@@ -219,7 +205,8 @@ class Probe(object):
        def _replace_args(self, expr):
                for alias, replacement in Probe.aliases.items():
                        # For USDT probes, we replace argN values with the
-                        # actual arguments for that probe.
+                        # actual arguments for that probe obtained using special
+                        # bpf_readarg_N macros emitted at BPF construction.
                        if alias.startswith("arg") and self.probe_type == "u":
                                continue
                        expr = expr.replace(alias, replacement)
@@ -294,15 +281,21 @@ BPF_PERF_OUTPUT(%s);

        def _generate_field_assign(self, idx):
                field_type = self.types[idx]
-                expr = self.values[idx]
+                expr = self.values[idx].strip()
+                text = ""
+                if self.probe_type == "u" and expr[0:3] == "arg":
+                        text = ("        u64 %s;\n" +
+                                "        bpf_usdt_readarg(%s, ctx, &%s);\n") % \
+                                (expr, expr[3], expr)
+
                if field_type == "s":
-                        return """
+                        return text + """
        if (%s != 0) {
                bpf_probe_read(&__data.v%d, sizeof(__data.v%d), (void *)%s);
        }
 """                     % (expr, idx, idx, expr)
                if field_type in Probe.fmt_types:
-                        return "        __data.v%d = (%s)%s;\n" % \
+                        return text + "        __data.v%d = (%s)%s;\n" % \
                                        (idx, Probe.c_type[field_type], expr)
                self._bail("unrecognized field type %s" % field_type)

@@ -324,23 +317,17 @@ BPF_PERF_OUTPUT(%s);
                        pid_filter = ""

                prefix = ""
-                qualifier = ""
                signature = "struct pt_regs *ctx"
                if self.probe_type == "t":
                        data_decl += self.tp.generate_struct()
                        prefix = self.tp.generate_get_struct()
-                elif self.probe_type == "u":
-                        signature += ", int __loc_id"
-                        prefix = self.usdt.generate_usdt_cases(
-                                pid=Probe.pid if Probe.pid != -1 else None)
-                        qualifier = "static inline"

                data_fields = ""
                for i, expr in enumerate(self.values):
                        data_fields += self._generate_field_assign(i)

                text = """
-%s int %s(%s)
+int %s(%s)
 {
        %s
        %s
@@ -355,15 +342,10 @@ BPF_PERF_OUTPUT(%s);
        return 0;
 }
 """
-                text = text % (qualifier, self.probe_name, signature,
+                text = text % (self.probe_name, signature,
                               pid_filter, prefix, self.filter,
                               self.struct_name, data_fields, self.events_name)

-                if self.probe_type == "u":
-                        self.usdt_thunk_names = []
-                        text += self.usdt.generate_usdt_thunks(
-                                        self.probe_name, self.usdt_thunk_names)
-
                return data_decl + "\n" + text

        @classmethod
@@ -421,11 +403,7 @@ BPF_PERF_OUTPUT(%s);
                        self._bail("unable to find library %s" % self.library)

                if self.probe_type == "u":
-                        for i, location in enumerate(self.usdt.locations):
-                                bpf.attach_uprobe(name=libpath,
-                                        addr=location.address,
-                                        fn_name=self.usdt_thunk_names[i],
-                                        pid=Probe.pid)
+                        pass # Was already enabled by the BPF constructor
                elif self.probe_type == "r":
                        bpf.attach_uretprobe(name=libpath,
                                             sym=self.function,
@@ -511,7 +489,16 @@ trace 'u:pthread:pthread_create (arg4 != 0)'
                        print(self.program)

        def _attach_probes(self):
-                self.bpf = BPF(text=self.program)
+                usdt_contexts = []
+                for probe in self.probes:
+                    if probe.usdt:
+                        # USDT probes must be enabled before the BPF object
+                        # is initialized, because that's where the actual
+                        # uprobe is being attached.
+                        probe.usdt.enable_probe(
+                                probe.usdt_name, probe.probe_name)
+                        usdt_contexts.append(probe.usdt)
+                self.bpf = BPF(text=self.program, usdt_contexts=usdt_contexts)
                Tracepoint.attach(self.bpf)
                for probe in self.probes:
                        if self.args.verbose:
@@ -530,12 +517,6 @@ trace 'u:pthread:pthread_create (arg4 != 0)'
                while True:
                        self.bpf.kprobe_poll()

-        def _close_probes(self):
-                for probe in self.probes:
-                        probe.close()
-                        if self.args.verbose:
-                                print("closed probe: " + str(probe))
-
        def run(self):
                try:
                        self._create_probes()
@@ -547,7 +528,6 @@ trace 'u:pthread:pthread_create (arg4 != 0)'
                                traceback.print_exc()
                        elif sys.exc_info()[0] is not SystemExit:
                                print(sys.exc_info()[1])
-                self._close_probes()

 if __name__ == "__main__":
       Tool().run()