    selftests/bpf: Fix selftest test_global_funcs/global_func1 failure with latest clang · f1f5553d
    Yonghong Song authored
    The selftest test_global_funcs/global_func1 failed with the latest clang17.
    The cause is the upstream ArgumentPromotionPass ([1]), which may
    rewrite the parameters of a static function and thereby cause inlining
    even though the function is marked as noinline.
    
    The original code:
      static __attribute__ ((noinline))
      int f0(int var, struct __sk_buff *skb)
      {
            return skb->len;
      }
    
      __attribute__ ((noinline))
      int f1(struct __sk_buff *skb)
      {
    	...
            return f0(0, skb) + skb->len;
      }
    
      ...
    
      SEC("tc")
      __failure __msg("combined stack size of 4 calls is 544")
      int global_func1(struct __sk_buff *skb)
      {
            return f0(1, skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
      }
    
    After ArgumentPromotionPass, the code is translated to
      static __attribute__ ((noinline))
      int f0(int var, int skb_len)
      {
            return skb_len;
      }
    
      __attribute__ ((noinline))
      int f1(struct __sk_buff *skb)
      {
    	...
            return f0(0, skb->len) + skb->len;
      }
    
      ...
    
      SEC("tc")
      __failure __msg("combined stack size of 4 calls is 544")
      int global_func1(struct __sk_buff *skb)
      {
            return f0(1, skb->len) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
      }
    
    Later, the LLVM InstCombine pass recognized that f0() simply
    returns the value of its second argument and removed f0()
    completely. The final code looks like:
      __attribute__ ((noinline))
      int f1(struct __sk_buff *skb)
      {
    	...
            return skb->len + skb->len;
      }
    
      ...
    
      SEC("tc")
      __failure __msg("combined stack size of 4 calls is 544")
      int global_func1(struct __sk_buff *skb)
      {
            return skb->len + f1(skb) + f2(2, skb) + f3(3, skb, 4);
      }
    
    If f0() is not inlined, verification fails with a stack size of
    544 for a particular call chain, which is what the test expects.
    With f0() inlined, the maximum stack size is 512, which is within
    the limit, so the expected verifier failure no longer occurs.
    
    Add an `asm volatile ("")` to f0() to prevent ArgumentPromotionPass
    from hoisting the code into its caller. This fixes the test failure.
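
    A minimal sketch of what f0() might look like with the empty asm
    statement added. The struct skb_stub type and the check_f1() helper are
    hypothetical stand-ins so that the snippet compiles outside the kernel
    tree, and the placement of the asm statement before the return is an
    assumption. The sketch uses the `__asm__ __volatile__` spelling, which is
    the GNU alternate form of `asm volatile` and also compiles in strict ISO
    C modes.

```c
#include <assert.h>

/* Hypothetical stand-in for struct __sk_buff; only the field that the
 * example reads is modeled here. */
struct skb_stub {
	unsigned int len;
};

static __attribute__((noinline))
int f0(int var, struct skb_stub *skb)
{
	(void)var; /* unused, kept to mirror the signature in the test */

	/* Empty volatile asm: emits no instructions. Per the commit
	 * message, its presence keeps ArgumentPromotionPass from hoisting
	 * the skb->len load into the callers. Placement is an assumption. */
	__asm__ __volatile__ ("");

	return skb->len;
}

__attribute__((noinline))
int f1(struct skb_stub *skb)
{
	return f0(0, skb) + skb->len;
}

/* Hypothetical helper: confirms the asm statement does not change the
 * computed values. */
int check_f1(void)
{
	struct skb_stub skb = { .len = 100 };

	assert(f1(&skb) == 200);
	return 0;
}
```

    The sketch only shows that the added statement leaves the result of
    f0() unchanged. Whether f0() survives as a separate function still
    depends on the compiler version and optimization level in use.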
    
      [1] https://reviews.llvm.org/D148269

    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20230425174744.1758515-1-yhs@fb.com
test_global_func1.c 984 Bytes