    veth: don’t modify ip_summed; doing so treats packets with bad checksums as good. · ce8c839b
    Vijay Pandurangan authored
    Packets that arrive from real hardware devices have ip_summed ==
    CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
    CHECKSUM_NONE if the packet is bad or the hardware was unable to verify
    it. The
    current version of veth will replace CHECKSUM_NONE with
    CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
    a veth device to be delivered to the application. This caused applications
    at Twitter to receive corrupt data when network hardware was corrupting
    packets.
    
    We believe this was added as an optimization to skip computing and
    verifying checksums for communication between containers. However, locally
    generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
    written does nothing for them. As far as we can tell, after removing this
    code, these packets are transmitted from one stack to another unmodified
    (tcpdump shows invalid checksums on both sides, as expected), and they are
    delivered correctly to applications. We didn’t test every possible network
    configuration, but we tried a few common ones such as bridging containers,
    using NAT between the host and a container, and routing from hardware
    devices to containers. We have effectively deployed this in production at
    Twitter (by disabling RX checksum offloading on veth devices).
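
    The effect described above can be illustrated with a small user-space
    model. This is only a sketch, not the driver code: the enum, the
    model_* function names, and the rx_csum parameter are hypothetical, and
    the dependence on RX checksum offload is an assumption inferred from the
    workaround mentioned above. The receive check is deliberately simplified
    to "verify in software only when the state is CHECKSUM_NONE".

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's ip_summed states. This is a
 * user-space model, not the definitions from <linux/skbuff.h>. */
enum model_ip_summed {
    MODEL_CHECKSUM_NONE,        /* not verified by hardware */
    MODEL_CHECKSUM_UNNECESSARY, /* hardware already verified the checksum */
    MODEL_CHECKSUM_PARTIAL      /* locally generated, checksum not filled in */
};

/* Model of the old behavior: CHECKSUM_NONE is promoted to
 * CHECKSUM_UNNECESSARY when the receiving side has RX checksum offload
 * enabled (assumed condition). Every other state passes through. */
static inline enum model_ip_summed
model_veth_old(enum model_ip_summed s, bool rx_csum)
{
    if (s == MODEL_CHECKSUM_NONE && rx_csum)
        return MODEL_CHECKSUM_UNNECESSARY;
    return s;
}

/* Model of the behavior after this change: ip_summed is left untouched. */
static inline enum model_ip_summed
model_veth_new(enum model_ip_summed s)
{
    return s;
}

/* Simplified receive path: the stack verifies the checksum in software
 * only when nothing upstream has vouched for the packet. */
static inline bool model_stack_verifies(enum model_ip_summed s)
{
    return s == MODEL_CHECKSUM_NONE;
}
```

    In this model, a packet that hardware could not verify skips software
    verification after passing through model_veth_old() with rx_csum set,
    but is verified after model_veth_new(). A CHECKSUM_PARTIAL packet is
    returned unchanged by both, matching the observation that the removed
    code did nothing for locally generated packets.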
    
    This code dates back to the first version of the driver, commit
    <e314dbdc> ("[NET]: Virtual ethernet device driver"), so I
    suspect this bug occurred mostly because the driver API has evolved
    significantly since then. Commit <0b796750> ("net/veth: Fix
    packet checksumming") (in December 2010) fixed this for packets that get
    created locally and sent to hardware devices, by not changing
    CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
    in from hardware devices.

    Co-authored-by: Evan Jones <ej@evanjones.ca>
    Signed-off-by: Evan Jones <ej@evanjones.ca>
    Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
    Cc: Phil Sutter <phil@nwl.cc>
    Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca>
    Acked-by: Cong Wang <cwang@twopensource.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
veth.c 11.7 KB