    generate proper usdt code to prevent llvm meddling with ctx->#fields · 8206f547
    Yonghong Song authored
    Qin reported a test case where llvm still meddles with ctx->#field accesses.
    For code like the following:
      switch(ctx->ip) {
        case 0x7fdf2ede9820ULL: *((int64_t *)dest) = *(volatile int64_t *)&ctx->r12; return 0;
        case 0x7fdf2edecd9cULL: *((int64_t *)dest) = *(volatile int64_t *)&ctx->bx; return 0;
      }
    The compiler still generates:
        # r1 is the pointer to the ctx
        r1 += 24
        goto LBB0_4
      LBB0_3:
        r1 += 40
      LBB0_4:
        r3 = *(u64 *)(r1 + 0)
    The verifier will reject the above code since the last load is not in the
    "ctx + field_offset" form.
    
    The responsible llvm optimization pass is CFGSimplifyPass, whose main implementation
    lives in llvm/lib/Transforms/Utils/SimplifyCFG.cpp. The routine that performs the
    optimization is SinkThenElseCodeToEnd, and the routine canSinkInstructions determines
    whether an instruction is a candidate for sinking.
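
    As an illustration of the sinking transformation, consider the minimal sketch below.
    The struct and function names are hypothetical, not the actual bcc-generated code:

      struct regs_sketch { long bx, r12; };   /* hypothetical, trimmed-down register layout */

      long pick(struct regs_sketch *ctx, int which, long *dest)
      {
          if (which)
              *dest = ctx->r12;   /* load at ctx + offsetof(struct regs_sketch, r12) */
          else
              *dest = ctx->bx;    /* load at ctx + offsetof(struct regs_sketch, bx) */
          /* After sinking, the two loads collapse into a single load through a
           * pointer that is ctx plus a branch-dependent offset, which is exactly
           * the pattern the BPF verifier rejects for ctx accesses. */
          return 0;
      }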
    
    Unfortunately, a volatile load/store does not prevent the optimization, but an inline
    assembly statement does prevent further optimization.
    
    In this patch, instead of using volatile to annotate the ctx->#field access, we do a
    normal ctx->#field access but put a compiler inline assembly memory barrier
       __asm__ __volatile__("": : :"memory");
    after the field access.
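
    For illustration, the earlier switch would then be written roughly as follows (a sketch
    mirroring the example above; the exact code emitted into usdt.h may differ in its casts
    and naming):

      switch(ctx->ip) {
        case 0x7fdf2ede9820ULL:
          *((int64_t *)dest) = ctx->r12;
          __asm__ __volatile__("": : :"memory");
          return 0;
        case 0x7fdf2edecd9cULL:
          *((int64_t *)dest) = ctx->bx;
          __asm__ __volatile__("": : :"memory");
          return 0;
      }

    The empty asm statement compiles to no instructions, but because inline assembly blocks
    the sinking, each load stays in its own branch as a direct ctx + constant-offset access
    that the verifier accepts.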
    
    Tested with the usdt unit test case, the usdt_samples example, and a couple of usdt
    unit tests developed in the past.
    Signed-off-by: Yonghong Song <yhs@fb.com>