1. 17 Aug, 2017 3 commits
    • Fixes for debian and ubuntu docker build · eb819caf
      Pavel Safronov authored
      * Fixed build for debian and ubuntu
      * Bumped debian and ubuntu versions (fix some build-dependency issues)
      * Make debian and ubuntu Dockerfiles use the same build script
      * Build dependencies are now installed automatically via pbuilder
    • Merge pull request #1294 from iovisor/yhs_dev · 9de830ae
      4ast authored
      avoid large map memory allocation in userspace
    • avoid large map memory allocation in userspace · 067219b2
      Yonghong Song authored
      In bcc, internal BPF_F_TABLE defines a structure to
      contain all the table information for later easy
      extraction. A global structure will be defined
      with this type. Note that this structure will be
      allocated by LLVM during compilation.
      
      In the table structure, one of the fields is:
         _leaf_type data[_max_entries]
      
      If both _leaf_type and _max_entries are large,
      LLVM consumes a significant amount of memory. One
      example of a large _leaf_type is the BPF_STACK_TRACE
      map, whose leaf is 127*8=1016 bytes. If _max_entries
      is large as well, the memory consumed by LLVM grows
      accordingly.
      
      This patch replaces
        _leaf_type data[_max_entries]
      with
        unsigned int max_entries
      
      The detail of a test example can be found in issue #1291.
      For the example in #1291, without this patch, for a
      BPF_STACK_TRACE map with 1M entries, the RSS is roughly
      3GB (roughly 3KB per entry). With this patch, it is 5.8MB.
      Signed-off-by: Yonghong Song <yhs@fb.com>
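
To make the saving concrete, here is a minimal C sketch of the two layouts. It is not the actual BPF_F_TABLE macro: the struct names, the `leaf_t` type, and the 1024-entry count are assumptions chosen for illustration. Only the 127*8-byte leaf size and the `data[]` versus `max_entries` fields come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 1024   /* illustrative only; the commit discusses 1M entries */
#define STACK_DEPTH 127    /* 127 * 8 = 1016 bytes per leaf, as in the commit */

/* Hypothetical leaf type, sized like a BPF_STACK_TRACE leaf. */
typedef struct { uint64_t ip[STACK_DEPTH]; } leaf_t;

/* Before the patch (simplified): the descriptor embeds one leaf per entry,
 * so the compiler has to allocate the whole array for the global. */
struct table_before {
    uint32_t key;
    leaf_t leaf;
    leaf_t data[MAX_ENTRIES];
};

/* After the patch (simplified): only the entry count is recorded. */
struct table_after {
    uint32_t key;
    leaf_t leaf;
    unsigned int max_entries;
};

/* Size of the descriptor that embeds the data array. */
size_t table_before_size(void) { return sizeof(struct table_before); }

/* Size of the descriptor that only stores the count. */
size_t table_after_size(void) { return sizeof(struct table_after); }
```

In this sketch the first descriptor grows linearly with MAX_ENTRIES, while the second stays at roughly the size of a single leaf regardless of how many entries the map declares.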
  2. 16 Aug, 2017 5 commits
  3. 15 Aug, 2017 1 commit
  4. 11 Aug, 2017 10 commits
  5. 01 Aug, 2017 2 commits
  6. 31 Jul, 2017 2 commits
  7. 28 Jul, 2017 1 commit
  8. 27 Jul, 2017 2 commits
    • Merge pull request #1263 from iovisor/yhs_dev · 2cc96a8c
      Brenden Blanco authored
      permit multiple pids attaching to the same probe
    • permit multiple pids attaching to the same probe · 0ba15075
      Yonghong Song authored
      Currently, if more than one pid-associated USDT context attaches to
      the same probe, the usdt readarg code will be generated twice and
      the compiler will complain.
      
      This patch solves the issue by preventing code duplication if
      a previous context with the same mnt point and exec binary
      has generated the code for the same probe. The event name is
      also changed to have the pid embedded so that different pid-associated
      uprobe events will have different names.
      
      This patch introduces an internal uprobe event name
      discrepancy. It is a good idea to have event name
      generation in libbpf so that both C++ API and Python API
      will have consistent name conventions. This will be
      addressed in a subsequent commit as it is largely
      a different issue.
      Signed-off-by: Yonghong Song <yhs@fb.com>
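
As an illustration of embedding the pid in the event name, the following C sketch builds a per-pid name. The function name `make_event_name` and the `p_<path>_0x<addr>_<pid>` layout are assumptions made for this example, not the format bcc is confirmed to use. The point is only that two pids attaching to the same binary and address end up with distinct names.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: build a uprobe event name that includes the pid.
 * Returns 0 on success, -1 if the name does not fit in buf. */
int make_event_name(char *buf, size_t len, const char *binpath,
                    unsigned long addr, int pid)
{
    int n = snprintf(buf, len, "p_%s_0x%lx_%d", binpath, addr, pid);
    if (n < 0 || (size_t)n >= len)
        return -1;

    /* Replace characters such as '/' and '.' so that the result contains
     * only letters, digits, and underscores. */
    for (char *p = buf; *p != '\0'; p++) {
        if (!isalnum((unsigned char)*p) && *p != '_')
            *p = '_';
    }
    return 0;
}
```

Calling it with the same path and address but pids 100 and 200 yields two different strings, which is the property the commit relies on.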
  9. 19 Jul, 2017 1 commit
  10. 18 Jul, 2017 4 commits
    • generate proper usdt code to prevent llvm meddling with ctx->#fields · 8206f547
      Yonghong Song authored
      Qin reported a test case where llvm still messes with ctx->#fields.
      For code like below:
        switch(ctx->ip) {
          case 0x7fdf2ede9820ULL: *((int64_t *)dest) = *(volatile int64_t *)&ctx->r12; return 0;
          case 0x7fdf2edecd9cULL: *((int64_t *)dest) = *(volatile int64_t *)&ctx->bx; return 0;
        }
      The compiler still generates:
          # r1 is the pointer to the ctx
          r1 += 24
          goto LBB0_4
        LBB0_3:
          r1 += 40
        LBB0_4:
          r3 = *(u64 *)(r1 + 0)
      The verifier will reject the above code since the last load is not in
      "ctx + field_offset" format.
      
      The responsible llvm optimization pass is CFGSimplifyPass. Its main implementation
      is in llvm/lib/Transforms/Utils/SimplifyCFG.cpp. The main routine that does the
      optimization is SinkThenElseCodeToEnd. The routine canSinkInstructions is used to
      determine whether an insn is a candidate for sinking.
      
      Unfortunately, volatile load/store is not a condition to prevent the optimization.
      But inline assembly is a condition which can prevent further optimization.
      
      In this patch, instead of using volatile to annotate ctx->#field access, we do
      normal ctx->#field access but put a compiler inline assembly memory barrier
         __asm__ __volatile__("": : :"memory");
      after the field access.
      
      Tested with the usdt unit test case, the usdt_samples example, and a couple of
      usdt unit tests developed in the past.
      Signed-off-by: Yonghong Song <yhs@fb.com>
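
The pattern described above can be sketched in plain C as follows. This is a userspace illustration, not the code bcc generates: `struct fake_ctx`, the function name `read_arg`, and the `-1` fallback return are assumptions, while the two `ip` constants and the barrier statement are taken from the commit message. It assumes a compiler that accepts GCC-style inline assembly (GCC or Clang).

```c
#include <stdint.h>

/* Hypothetical stand-in for the register context; the real ctx type
 * is not reproduced here. */
struct fake_ctx {
    uint64_t r12;
    uint64_t bx;
    uint64_t ip;
};

/* Read the probe argument selected by ctx->ip into *dest.
 * Each case does a plain field access followed by a compiler memory
 * barrier, instead of a volatile access. */
int read_arg(const struct fake_ctx *ctx, int64_t *dest)
{
    switch (ctx->ip) {
    case 0x7fdf2ede9820ULL:
        *dest = (int64_t)ctx->r12;
        __asm__ __volatile__("" : : : "memory");
        return 0;
    case 0x7fdf2edecd9cULL:
        *dest = (int64_t)ctx->bx;
        __asm__ __volatile__("" : : : "memory");
        return 0;
    }
    return -1; /* assumption: unknown ip is reported as an error */
}
```

The empty asm statement emits no instructions. According to the commit message, its presence in each branch is what keeps the sinking optimization from merging the two loads into one load through a computed pointer.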
    • MySQL tracing without USDT (#1239) · 5f7035e4
      Igor Mazur authored
      Support tracing MySQL queries even when MySQL is built
      without USDT support, by using uprobes on internal functions
      responsible for command (query) dispatching.
    • Merge pull request #1259 from iovisor/yhs_dev · 87abe2a3
      4ast authored
      Fix a clang memory leak
    • Fix a clang memory leak · 6ed2229d
      Yonghong Song authored
      In clang frontend actions, several compiler invocations are made
      for the rewriter and for transforming source code to IR. During the
      invocation that transforms source code to IR, CodeGenOpts.DisableFree
      controls whether the top target machine structure is
      freed for that particular clang invocation,
      and its default value is TRUE.
      
      See clang:lib/CodeGen/BackendUtil.cpp:
        ~EmitAssemblyHelper() {
          if (CodeGenOpts.DisableFree)
            BuryPointer(std::move(TM));
        }
      
      So by default, the memory held by TM will not be freed, even if
      the BPF module itself is freed. This is even more problematic
      when continuous building/loading/unloading happens in a
      long-lived service.
      
      This patch explicitly sets CodeGenOpts.DisableFree to FALSE
      so memory can be properly freed. I did a simple experiment
      to compile/load/unload an empty BPF program and the saving
      is roughly 0.5MB.
      Signed-off-by: Yonghong Song <yhs@fb.com>
  11. 17 Jul, 2017 2 commits
  12. 14 Jul, 2017 2 commits
  13. 11 Jul, 2017 3 commits
    • cc: Add open_perf_event to the C/C++ API (#1232) · 4180333c
      Romain authored
    • memleak: expand allocator coverage (#1214) · 2c1799c9
      Rinat Ibragimov authored
      * memleak: handle libc allocation functions other than malloc
      
      * memleak: use tracepoints to track kernel allocations
      
      * memleak: add combined-only mode
      
      With a large number of outstanding allocations, the amount of data passed from
      the kernel becomes large, which slows everything down.
      
      This patch calculates allocation statistics inside the kernel, allowing the
      user-space part to pull only the combined statistics, thus significantly
      reducing the amount of data passed.
      
      * memleak: increase hashtable capacities
      
      A lot of allocations happen in the kernel. The default values are not
      enough to keep up.
      
      * test: add a test for the memleak tool
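
The idea behind the combined-only mode can be sketched in plain C: keep one running total per allocation call stack instead of one record per outstanding allocation. This is a userspace illustration under stated assumptions, not the memleak tool's BPF code: the struct and function names, the fixed 16-slot table, and the modulo indexing are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_STACKS 16 /* hypothetical table size, for illustration only */

/* Combined statistics for all outstanding allocations from one stack. */
struct combined_alloc_info {
    uint64_t total_size;
    uint64_t number_of_allocs;
};

static struct combined_alloc_info combined[NUM_STACKS];

/* Account for an allocation of `size` bytes made from `stack_id`. */
void record_alloc(uint32_t stack_id, uint64_t size)
{
    struct combined_alloc_info *info = &combined[stack_id % NUM_STACKS];
    info->total_size += size;
    info->number_of_allocs += 1;
}

/* Account for freeing `size` bytes that were allocated from `stack_id`.
 * Unmatched frees are ignored so the counters never underflow. */
void record_free(uint32_t stack_id, uint64_t size)
{
    struct combined_alloc_info *info = &combined[stack_id % NUM_STACKS];
    if (info->number_of_allocs == 0 || info->total_size < size)
        return;
    info->total_size -= size;
    info->number_of_allocs -= 1;
}

/* The reader only needs this one small record per stack. */
const struct combined_alloc_info *lookup_combined(uint32_t stack_id)
{
    return &combined[stack_id % NUM_STACKS];
}
```

With this layout the amount of data the reader pulls depends on the number of distinct stacks rather than on the number of outstanding allocations, which is the reduction the commit describes.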
    • Add USDT sample (#1229) · b4691fba
      bveldhoen authored
      This sample contains:
          - A library with an operation that uses usdt probes.
          - A console application that calls the operation.
          - Scripts to trace the latency of the operation.
          - Corresponding cmake files.
  14. 07 Jul, 2017 1 commit
  15. 06 Jul, 2017 1 commit