    generate proper usdt code to prevent llvm meddling with ctx->#fields · 8206f547
    Yonghong Song authored
    Qin reported a test case where llvm still meddles with ctx->#field accesses.
    For code like the following:
      switch(ctx->ip) {
        case 0x7fdf2ede9820ULL: *((int64_t *)dest) = *(volatile int64_t *)&ctx->r12; return 0;
        case 0x7fdf2edecd9cULL: *((int64_t *)dest) = *(volatile int64_t *)&ctx->bx; return 0;
      }
    The compiler still generates:
        # r1 is the pointer to the ctx
        r1 += 24
        goto LBB0_4
      LBB0_3:
        r1 += 40
      LBB0_4:
        r3 = *(u64 *)(r1 + 0)
    The verifier will reject the above code since the last load is not in the
    "ctx + field_offset" form.
    
    The responsible llvm optimization pass is CFGSimplifyPass, whose main implementation
    lives in llvm/lib/Transforms/Utils/SimplifyCFG.cpp. The routine that performs the
    optimization is SinkThenElseCodeToEnd, and the routine canSinkInstructions determines
    whether an instruction is a candidate for sinking.
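
    As an illustration of the sinking transformation, consider the minimal sketch below.
    The struct and function names are hypothetical, not the actual bcc-generated code:

      struct regs_sketch { long bx, r12; };   /* hypothetical, trimmed-down register layout */

      long pick(struct regs_sketch *ctx, int which, long *dest)
      {
          if (which)
              *dest = ctx->r12;   /* load at ctx + offsetof(struct regs_sketch, r12) */
          else
              *dest = ctx->bx;    /* load at ctx + offsetof(struct regs_sketch, bx) */
          /* After sinking, the two loads collapse into a single load through a
           * pointer that is ctx plus a branch-dependent offset, which is exactly
           * the pattern the BPF verifier rejects for ctx accesses. */
          return 0;
      }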
    
    Unfortunately, a volatile load/store does not prevent the optimization, but an inline
    assembly statement does prevent further optimization.
    
    In this patch, instead of using volatile to annotate the ctx->#field access, we do a
    normal ctx->#field access but put a compiler inline assembly memory barrier
       __asm__ __volatile__("": : :"memory");
    after the field access.
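
    For illustration, the earlier switch would then be written roughly as follows (a sketch
    mirroring the example above; the exact code emitted into usdt.h may differ in its casts
    and naming):

      switch(ctx->ip) {
        case 0x7fdf2ede9820ULL:
          *((int64_t *)dest) = ctx->r12;
          __asm__ __volatile__("": : :"memory");
          return 0;
        case 0x7fdf2edecd9cULL:
          *((int64_t *)dest) = ctx->bx;
          __asm__ __volatile__("": : :"memory");
          return 0;
      }

    The empty asm statement compiles to no instructions, but because inline assembly blocks
    the sinking, each load stays in its own branch as a direct ctx + constant-offset access
    that the verifier accepts.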
    
    Tested with the usdt unit test case, the usdt_samples example, and a couple of usdt
    unit tests developed in the past.
    Signed-off-by: Yonghong Song <yhs@fb.com>