• Daniel Borkmann's avatar
    bpf: Fix up bpf_skb_adjust_room helper's skb csum setting · 836e66c2
    Daniel Borkmann authored
    Lorenz recently reported:
    
      In our TC classifier cls_redirect [0], we use the following sequence of
      helper calls to decapsulate a GUE (basically IP + UDP + custom header)
      encapsulated packet:
    
        bpf_skb_adjust_room(skb, -encap_len, BPF_ADJ_ROOM_MAC, BPF_F_ADJ_ROOM_FIXED_GSO)
        bpf_redirect(skb->ifindex, BPF_F_INGRESS)
    
      It seems like some checksums of the inner headers are not validated in
      this case. For example, a TCP SYN packet with invalid TCP checksum is
      still accepted by the network stack and elicits a SYN ACK. [...]
    
      That is, we receive the following packet from the driver:
    
        | ETH | IP | UDP | GUE | IP | TCP |
        skb->ip_summed == CHECKSUM_UNNECESSARY
    
      ip_summed is CHECKSUM_UNNECESSARY because our NICs do rx checksum offloading.
      On this packet we run skb_adjust_room_mac(-encap_len), and get the following:
    
        | ETH | IP | TCP |
        skb->ip_summed == CHECKSUM_UNNECESSARY
    
      Note that ip_summed is still CHECKSUM_UNNECESSARY. After bpf_redirect()'ing
      into the ingress, we end up in tcp_v4_rcv(). There, skb_checksum_init() is
      turned into a no-op due to CHECKSUM_UNNECESSARY.
    
    The bpf_skb_adjust_room() helper is not aware of protocol specifics. Internally,
    it handles the CHECKSUM_COMPLETE case via skb_postpull_rcsum(), but that does
    not cover CHECKSUM_UNNECESSARY. In this case skb->csum_level of the original
    skb prior to bpf_skb_adjust_room() call was 0, that is, covering UDP. Right now
    there is no way to adjust the skb->csum_level. NICs that have checksum offload
    disabled (CHECKSUM_NONE) or that support CHECKSUM_COMPLETE are not affected.
    
    Use a safe default for CHECKSUM_UNNECESSARY by resetting to CHECKSUM_NONE and
    add a flag to the helper called BPF_F_ADJ_ROOM_NO_CSUM_RESET that allows users
    from opting out. Opting out is useful for the case where we don't remove/add
    full protocol headers, or for the case where a user wants to adjust the csum
    level manually e.g. through bpf_csum_level() helper that is added in subsequent
    patch.
    
    The bpf_skb_proto_{4_to_6,6_to_4}() for NAT64/46 translation from the BPF
    bpf_skb_change_proto() helper uses bpf_skb_net_hdr_{push,pop}() pair internally
    as well but doesn't change layers, only transitions between v4 to v6 and vice
    versa, therefore no adoption is required there.
    
      [0] https://lore.kernel.org/bpf/20200424185556.7358-1-lmb@cloudflare.com/
    
    Fixes: 2be7e212 ("bpf: add bpf_skb_adjust_room helper")
    Reported-by: default avatarLorenz Bauer <lmb@cloudflare.com>
    Reported-by: default avatarAlan Maguire <alan.maguire@oracle.com>
    Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: default avatarLorenz Bauer <lmb@cloudflare.com>
    Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Reviewed-by: default avatarAlan Maguire <alan.maguire@oracle.com>
    Link: https://lore.kernel.org/bpf/CACAyw9-uU_52esMd1JjuA80fRPHJv5vsSg8GnfW3t_qDU4aVKQ@mail.gmail.com/
    Link: https://lore.kernel.org/bpf/11a90472e7cce83e76ddbfce81fdfce7bfc68808.1591108731.git.daniel@iogearbox.net
    836e66c2
skbuff.h 130 KB