• Daniel Borkmann's avatar
    kbuild: disable clang's default use of -fmerge-all-constants · cbb5420a
    Daniel Borkmann authored
    commit 87e0d4f0 upstream.
    
    Prasad reported that he has seen crashes in BPF subsystem with netd
    on Android with arm64 in the form of (note, the taint is unrelated):
    
      [ 4134.721483] Unable to handle kernel paging request at virtual address 800000001
      [ 4134.820925] Mem abort info:
      [ 4134.901283]   Exception class = DABT (current EL), IL = 32 bits
      [ 4135.016736]   SET = 0, FnV = 0
      [ 4135.119820]   EA = 0, S1PTW = 0
      [ 4135.201431] Data abort info:
      [ 4135.301388]   ISV = 0, ISS = 0x00000021
      [ 4135.359599]   CM = 0, WnR = 0
      [ 4135.470873] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffe39b946000
      [ 4135.499757] [0000000800000001] *pgd=0000000000000000, *pud=0000000000000000
      [ 4135.660725] Internal error: Oops: 96000021 [#1] PREEMPT SMP
      [ 4135.674610] Modules linked in:
      [ 4135.682883] CPU: 5 PID: 1260 Comm: netd Tainted: G S      W       4.14.19+ #1
      [ 4135.716188] task: ffffffe39f4aa380 task.stack: ffffff801d4e0000
      [ 4135.731599] PC is at bpf_prog_add+0x20/0x68
      [ 4135.741746] LR is at bpf_prog_inc+0x20/0x2c
      [ 4135.751788] pc : [<ffffff94ab7ad584>] lr : [<ffffff94ab7ad638>] pstate: 60400145
      [ 4135.769062] sp : ffffff801d4e3ce0
      [...]
      [ 4136.258315] Process netd (pid: 1260, stack limit = 0xffffff801d4e0000)
      [ 4136.273746] Call trace:
      [...]
      [ 4136.442494] 3ca0: ffffff94ab7ad584 0000000060400145 ffffffe3a01bf8f8 0000000000000006
      [ 4136.460936] 3cc0: 0000008000000000 ffffff94ab844204 ffffff801d4e3cf0 ffffff94ab7ad584
      [ 4136.479241] [<ffffff94ab7ad584>] bpf_prog_add+0x20/0x68
      [ 4136.491767] [<ffffff94ab7ad638>] bpf_prog_inc+0x20/0x2c
      [ 4136.504536] [<ffffff94ab7b5d08>] bpf_obj_get_user+0x204/0x22c
      [ 4136.518746] [<ffffff94ab7ade68>] SyS_bpf+0x5a8/0x1a88
    
    Android's netd was basically pinning the uid cookie BPF map in BPF
    fs (/sys/fs/bpf/traffic_cookie_uid_map) and later on retrieving it
    again resulting in above panic. Issue is that the map was wrongly
    identified as a prog! Above kernel was compiled with clang 4.0,
    and it turns out that clang decided to merge the bpf_prog_iops and
    bpf_map_iops into a single memory location, such that the two i_ops
    could then not be distinguished anymore.
    
    Reason for this miscompilation is that clang has the more aggressive
    -fmerge-all-constants enabled by default. In fact, clang source code
    has a comment about it in lib/AST/ExprConstant.cpp on why it is okay
    to do so:
    
      Pointers with different bases cannot represent the same object.
      (Note that clang defaults to -fmerge-all-constants, which can
      lead to inconsistent results for comparisons involving the address
      of a constant; this generally doesn't matter in practice.)
    
    The issue never appeared with gcc however, since gcc does not enable
    -fmerge-all-constants by default and even *explicitly* states in
    it's option description that using this flag results in non-conforming
    behavior, quote from man gcc:
    
      Languages like C or C++ require each variable, including multiple
      instances of the same variable in recursive calls, to have distinct
      locations, so using this option results in non-conforming behavior.
    
    There are also various clang bug reports open on that matter [1],
    where clang developers acknowledge the non-conforming behavior,
    and refer to disabling it with -fno-merge-all-constants. But even
    if this gets fixed in clang today, there are already users out there
    that triggered this. Thus, fix this issue by explicitly adding
    -fno-merge-all-constants to the kernel's Makefile to generically
    disable this optimization, since potentially other places in the
    kernel could subtly break as well.
    
    Note, there is also a flag called -fmerge-constants (not supported
    by clang), which is more conservative and only applies to strings
    and it's enabled in gcc's -O/-O2/-O3/-Os optimization levels. In
    gcc's code, the two flags -fmerge-{all-,}constants share the same
    variable internally, so when disabling it via -fno-merge-all-constants,
    then we really don't merge any const data (e.g. strings), and text
    size increases with gcc (14,927,214 -> 14,942,646 for vmlinux.o).
    
      $ gcc -fverbose-asm -O2 foo.c -S -o foo.S
        -> foo.S lists -fmerge-constants under options enabled
      $ gcc -fverbose-asm -O2 -fno-merge-all-constants foo.c -S -o foo.S
        -> foo.S doesn't list -fmerge-constants under options enabled
      $ gcc -fverbose-asm -O2 -fno-merge-all-constants -fmerge-constants foo.c -S -o foo.S
        -> foo.S lists -fmerge-constants under options enabled
    
    Thus, as a workaround we need to set both -fno-merge-all-constants
    *and* -fmerge-constants in the Makefile in order for text size to
    stay as is.
    
      [1] https://bugs.llvm.org/show_bug.cgi?id=18538Reported-by: default avatarPrasad Sodagudi <psodagud@codeaurora.org>
    Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Chenbo Feng <fengc@google.com>
    Cc: Richard Smith <richard-llvm@metafoo.co.uk>
    Cc: Chandler Carruth <chandlerc@gmail.com>
    Cc: linux-kernel@vger.kernel.org
    Tested-by: default avatarPrasad Sodagudi <psodagud@codeaurora.org>
    Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    cbb5420a
Makefile 54.9 KB