    libbpf: Add btf__distill_base() creating split BTF with distilled base BTF · 58e185a0
    Alan Maguire authored
    To make split BTF more robust, it needs supplemental context about the
    base BTF type ids it refers to.  Without such context, a simple
    shuffling of base BTF type ids (without any other significant change)
    invalidates the split BTF.  Here we store that additional context to
    make split BTF more robust.
    
    This context comes in the form of distilled base BTF providing minimal
    information (name and - in some cases - size) for base INTs, FLOATs,
    STRUCTs, UNIONs, ENUMs and ENUM64s along with modified split BTF that
    points at that base and contains any additional types needed (such as
    TYPEDEF, PTR and anonymous STRUCT/UNION declarations).  This
    information constitutes the minimal BTF representation needed to
    disambiguate or remove split BTF references to base BTF.  The rules
    are as follows:
    
    - INT, FLOAT, FWD are recorded in full.
    - if a named base BTF STRUCT or UNION is referred to from split BTF, it
      will be encoded as a zero-member sized STRUCT/UNION (preserving
      size for later relocation checks).  Only base BTF STRUCT/UNIONs
      that are either embedded in split BTF STRUCT/UNIONs or that have
      multiple STRUCT/UNION instances of the same name will _need_ size
      checks at relocation time, but as it is possible a different set of
      types will be duplicates in the later to-be-resolved base BTF,
      we preserve size information for all named STRUCT/UNIONs.
    - if an ENUM[64] is named, an ENUM forward representation (an ENUM
      with no values) of the same size is used.
    - in all other cases, the type is added to the new split BTF.
    
    Avoiding struct/union/enum/enum64 expansion is important to keep the
    distilled base BTF representation to a minimum size.
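
    The rules above can be sketched as a small decision function.  This is
    an illustrative model only, not the libbpf implementation: the enum
    names and distill_action() are hypothetical, and the function assumes
    its input is a base BTF type that split BTF refers to.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the BTF kinds named in the rules above. */
enum kind {
	K_INT, K_FLOAT, K_FWD, K_STRUCT, K_UNION,
	K_ENUM, K_ENUM64, K_PTR, K_TYPEDEF,
};

/* Where a referenced base type ends up, and in what form. */
enum action {
	IN_BASE_FULL,		/* recorded in full in distilled base BTF */
	IN_BASE_SIZED_STUB,	/* zero-member STRUCT/UNION, size preserved */
	IN_BASE_ENUM_FWD,	/* ENUM with no values, same size */
	IN_SPLIT,		/* added to the new split BTF instead */
};

static enum action distill_action(enum kind kind, bool named)
{
	assert((int)kind >= K_INT && (int)kind <= K_TYPEDEF);

	switch (kind) {
	case K_INT:
	case K_FLOAT:
	case K_FWD:
		return IN_BASE_FULL;
	case K_STRUCT:
	case K_UNION:
		/* Anonymous STRUCT/UNIONs fall under "all other cases". */
		return named ? IN_BASE_SIZED_STUB : IN_SPLIT;
	case K_ENUM:
	case K_ENUM64:
		return named ? IN_BASE_ENUM_FWD : IN_SPLIT;
	default:
		/* TYPEDEF, PTR and everything else. */
		return IN_SPLIT;
	}
}
```

    Under this model, a named "struct sk_buff" maps to IN_BASE_SIZED_STUB,
    while a PTR or TYPEDEF that refers to it maps to IN_SPLIT.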
    
    When successful, new representations of the distilled base BTF and new
    split BTF that refers to it are returned.  Both need to be freed by the
    caller.
    
    To take a simple example: given split BTF with a type referring to
    "struct sk_buff", we will generate distilled base BTF with a
    0-member STRUCT sk_buff of the appropriate size, and the split BTF
    will refer to it instead.
    
    Tools like pahole can utilize such split BTF to populate the .BTF
    section (split BTF) and an additional .BTF.base section.  Then
    when the split BTF is loaded, the distilled base BTF can be used
    to relocate split BTF to reference the current (and possibly changed)
    base BTF.
    
    So for example if "struct sk_buff" was id 502 when the split BTF was
    originally generated, we can use the distilled base BTF to see that
    id 502 refers to a "struct sk_buff" and replace instances of id 502
    with the current (relocated) base BTF sk_buff type id.
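
    The id replacement described above amounts to a lookup by name.  The
    following is a toy model under loud assumptions: struct named_type,
    find_by_name() and relocate_id() are hypothetical, types are matched
    by name only (kind and size checks are omitted), and array index
    stands in for BTF type id with id 0 reserved for void.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, minimal view of a named type; array index == type id. */
struct named_type {
	const char *name;
};

/* Return the id of the first type called 'name', or 0 if none matches. */
static unsigned int find_by_name(const struct named_type *types,
				 unsigned int nr_types, const char *name)
{
	for (unsigned int id = 1; id < nr_types; id++) {
		if (types[id].name && strcmp(types[id].name, name) == 0)
			return id;
	}
	return 0;
}

/*
 * Map an id recorded at distill time to the id the same-named type has
 * in the current base.  Returns 0 when the current base lacks the type.
 */
static unsigned int relocate_id(const struct named_type *distilled,
				unsigned int nr_distilled,
				const struct named_type *base,
				unsigned int nr_base,
				unsigned int old_id)
{
	assert(old_id > 0 && old_id < nr_distilled);
	return find_by_name(base, nr_base, distilled[old_id].name);
}
```

    If the distilled table records "sk_buff" at some id and the current
    base holds "sk_buff" at a different id, relocate_id() returns the
    current one, and every reference to the old id is rewritten to it.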
    
    Distilled base BTF is small; when building a kernel with all modules
    using distilled base BTF as a test, overall module size grew by only
    5.3MB total across ~2700 modules.

    Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/bpf/20240613095014.357981-2-alan.maguire@oracle.com