1. 17 Dec, 2015 4 commits
  2. 16 Dec, 2015 30 commits
  3. 15 Dec, 2015 6 commits
    • Merge branch 'end-of-ip-csum' · 93d085d2
      David S. Miller authored
      Tom Herbert says:
      
      ====================
      net: The beginning of the end for NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM
      
      Background:
      
      This patch set starts to address one front in the battle against
      protocol ossification. Protocol ossification describes the state
      that we have arrived at in the evolution of the Internet where we are
      materially limited to only using a very narrow range of protocols
      and protocol features. For instance, only TCP and UDP are sufficiently
      supported on the Internet, so deploying alternative protocols such as
      SCTP and DCCP is a non-starter. Similarly, IP options and IPv6
      extension headers are typically not considered feasible for wide
      deployment, so we have lost the extensibility of IP protocols.
      
      Protocol ossification is not only a problem on the Internet, but in
      the data center as well. A root cause of this seems to be narrow,
      protocol specific optimizations implemented in switches (for doing
      ECMP) and in NICs (NIC offloads). These tend to be performance
      optimizations around TCP and UDP packets, and these have become
      requirements to implement performant network solutions at scale.
      
      Attempts to deal with protocol ossification in the data center have
      yielded ad hoc, sub-optimal solutions. A main driver of foo-over-UDP (e.g.
      GRE/UDP, MPLS/UDP) is to leverage the existing ECMP and RSS support for
      UDP by setting the source port as an entropy value. This has seen some
      success, but the cost of additional overhead and layering limits its
      usefulness.  An even more extreme solution is STT where non-TCP packets
      are spoofed as TCP to leverage NIC offloads.
      
      This patch set endeavours to address protocol ossification caused by
      techniques used in transmit checksum offload for NICs. Future work
      will address protocol ossification in the other primary NIC offloads--
      namely receive checksum offload, LSO, LRO, and RSS.
      
      NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM:
      
      NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM exemplify the problem of protocol
      ossification. These features are relics from a simpler time in the
      Internet, before encapsulation, before GRE and IPIP. Many hardware
      vendors only saw the need to provide checksum offload for simple UDP and
      TCP packets over IPv4 (with IPv6 support as an afterthought). In today's
      Internet and data centers, checksum offload is well established as a
      valuable feature, but we can no longer afford to be constrained to
      use a handful of protocols and features that are supported at the
      discretion of NIC vendors. Generic and protocol agnostic methods are
      needed.
      
      The actual interface that the stack uses with drivers for checksum
      offload is CHECKSUM_PARTIAL. This is a generic and protocol agnostic
      interface. A driver for a device that supports this generic
      interface advertises NETIF_F_HW_CSUM.
      
      Goals of this patch set:
      
      We propose that drivers advertise NETIF_F_HW_CSUM instead of protocol
      specific values of NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM.  If the
      driver's device is constrained (for instance it can only offload simple
      IPv4 and IPv6 packets) then these constraints can be checked in the
      transmit path and skb_checksum_help would be called for packets that the
      driver is unable to offload. In order to facilitate this, we add some
      helper functions that take a specification argument indicating the
      type of packets a device is able to offload. If a packet does not match
      the specification, the helper function calls skb_checksum_help.
      
      Benefits of this approach are:
        - Simplify the stack and clarify the interface for checksum offload
        - Encourage NIC vendors to implement the generic, protocol agnostic
          checksum offload methods in hardware
        - Encourage feature parity in NIC offloads for IPv4 and IPv6
      
      Many drivers advertise NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM and it
      probably isn't feasible to convert them all in a given time frame
      (although if we could this would be a great simplification to the
      stack). A reasonable direction may be to declare that new drivers must
      use NETIF_F_HW_CSUM as NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are
      considered deprecated.
      
      There is a class of drivers that should now be converted to advertise
      NETIF_F_HW_CSUM, namely those that support offload of encapsulated
      checksums. These drivers have to date been using skb->encapsulation
      to infer that checksum offload is being performed for an encapsulated
      checksum. This is not strictly correct. skb->encapsulation
      indicates that the inner headers are valid in the skbuff, whereas
      the stack indicates checksum offload arguments exclusively in csum_start
      and csum_offset. At some point we may want to set the inner headers for
      an skbuff but offload the outer transport checksum, so this needs to be
      fixed.
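
The csum_start/csum_offset contract described above can be illustrated
outside the kernel. The following is a minimal userspace sketch, assuming
a flat byte buffer in place of an skbuff; `struct pkt`, `ones_sum16` and
`resolve_partial_csum` are hypothetical names chosen for illustration, not
kernel APIs. It sums everything from csum_start to the end of the packet
and stores the complemented result at csum_start + csum_offset, without
looking at any protocol header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the skbuff fields that matter for this sketch. */
struct pkt {
    uint8_t *data;        /* first byte of the packet */
    size_t   len;         /* total packet length in bytes */
    size_t   csum_start;  /* offset at which checksumming begins */
    size_t   csum_offset; /* offset of the 16-bit checksum field from csum_start */
};

/*
 * Ones-complement sum of n bytes read as big-endian 16-bit words,
 * folded to 16 bits. The 32-bit accumulator is adequate for
 * packet-sized buffers (up to 64 KiB).
 */
uint16_t ones_sum16(const uint8_t *p, size_t n)
{
    uint32_t sum = 0;

    while (n > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        n -= 2;
    }
    if (n)
        sum += (uint32_t)p[0] << 8;   /* pad an odd trailing byte with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * Protocol-agnostic checksum resolution: the only inputs are the two
 * offsets. Whatever seed the caller left in the checksum field (for
 * example a pseudo-header sum) is included in the summed range.
 */
void resolve_partial_csum(struct pkt *pkt)
{
    size_t field = pkt->csum_start + pkt->csum_offset;
    uint16_t csum;

    assert(pkt->csum_start <= pkt->len);
    assert(field + 2 <= pkt->len);

    csum = (uint16_t)~ones_sum16(pkt->data + pkt->csum_start,
                                 pkt->len - pkt->csum_start);
    pkt->data[field]     = (uint8_t)(csum >> 8);
    pkt->data[field + 1] = (uint8_t)(csum & 0xff);
}
```

Because the routine never inspects headers, the same code covers an inner
or an outer transport checksum: only the two offsets change. Bytes before
csum_start (outer headers, in the encapsulated case) are left untouched.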
      
      In this patch set:
      
        - Rename some of the constants involved in checksum offload to be more
          reflective of their function
        - Eliminate NETIF_F_GEN_CSUM and NETIF_F_V[46]_CSUM entirely as
          unnecessary convolutions
        - Fix conditions in tcp_sendpage and tcp_sendmsg to take IP protocol
          into account when determining if checksum offload can be done
        - Add driver helper functions for determining if a checksum can
          be offloaded to a device. If not, the helper function can call
          skb_checksum_help
        - Document the checksum offload interface between the stack and
          drivers with detail and specifics
      
      Testing:
      
      Have been testing ixgbe and mlx4. No noticeable regressions seen yet.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Elaborate on checksum offload interface description · 7a6ae71b
      Tom Herbert authored
      Add specifics and details to the description of the interface between
      the stack and drivers for doing checksum offload. This description
      is meant to be as specific and complete as possible.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Add driver helper functions to determine checksum offloadability · 6ae23ad3
      Tom Herbert authored
      Add skb_csum_offload_chk driver helper function to determine if a
      device with limited checksum offload capabilities is able to offload the
      checksum for a given packet.
      
      This patch includes:
        - The skb_csum_offload_chk function. Returns true if checksum is
          offloadable, else false. Optionally, in the case that the checksum
          is not offloadable, the function can call skb_checksum_help to resolve
          the checksum. skb_csum_offload_chk also returns whether the checksum
          refers to an encapsulated checksum.
        - Definition of skb_csum_offl_spec structure that caller uses to
          indicate rules about what it can offload (e.g. IPv4/v6, TCP/UDP only,
          whether encapsulated checksums can be offloaded, whether checksum with
          IPv6 extension headers can be offloaded).
        - Ancillary functions called skb_csum_offload_chk_help,
          skb_csum_off_chk_help_cmn, skb_csum_off_chk_help_cmn_v4_only.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
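
The specification-driven check described in this commit can be sketched
in userspace as follows. This is a simplified model under stated
assumptions, not the kernel implementation: `struct pkt_desc`,
`struct offl_spec`, `sw_csum_fallback` and `csum_offload_chk` are
hypothetical names, the packet is assumed to be already parsed, and the
spec fields only cover the rules the commit message lists (IPv4/IPv6,
TCP/UDP, encapsulation, IPv6 extension headers).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum l3_proto { L3_IPV4, L3_IPV6, L3_OTHER };
enum l4_proto { L4_TCP, L4_UDP, L4_OTHER };

/* Hypothetical, already-parsed view of a packet awaiting checksum offload. */
struct pkt_desc {
    enum l3_proto l3;
    enum l4_proto l4;
    bool encapsulated;   /* checksum belongs to an inner transport header */
    bool has_ext_hdrs;   /* IPv6 extension headers are present */
    bool sw_csum_done;   /* set once the software fallback has run */
};

/* Hypothetical capability description supplied by the driver. */
struct offl_spec {
    bool ipv4_okay;
    bool ipv6_okay;
    bool tcp_okay;
    bool udp_okay;
    bool encap_okay;
    bool ext_hdrs_okay;
};

/* Stand-in for the software fallback (skb_checksum_help in the kernel). */
void sw_csum_fallback(struct pkt_desc *p)
{
    p->sw_csum_done = true;
}

/*
 * Returns true if the device described by spec can offload the checksum.
 * Reports through csum_encapped whether the checksum is an encapsulated
 * one, and optionally resolves the checksum in software on a mismatch.
 */
bool csum_offload_chk(struct pkt_desc *p, const struct offl_spec *spec,
                      bool *csum_encapped, bool csum_help)
{
    bool ok = true;

    assert(p != NULL && spec != NULL && csum_encapped != NULL);

    *csum_encapped = p->encapsulated;
    if (p->encapsulated && !spec->encap_okay)
        ok = false;

    switch (p->l3) {
    case L3_IPV4:
        ok = ok && spec->ipv4_okay;
        break;
    case L3_IPV6:
        ok = ok && spec->ipv6_okay &&
             (!p->has_ext_hdrs || spec->ext_hdrs_okay);
        break;
    default:
        ok = false;
        break;
    }

    switch (p->l4) {
    case L4_TCP:
        ok = ok && spec->tcp_okay;
        break;
    case L4_UDP:
        ok = ok && spec->udp_okay;
        break;
    default:
        ok = false;
        break;
    }

    if (!ok && csum_help)
        sw_csum_fallback(p);
    return ok;
}
```

A driver for an IPv4-only device would fill in a spec with ipv4_okay,
tcp_okay and udp_okay set, call the check in its transmit path with
csum_help enabled, and program the hardware only when the call returns
true. Packets that fail the check leave the function with their checksum
already resolved in software.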
    • tcp: Fix conditions to determine checksum offload · 9a49850d
      Tom Herbert authored
      In tcp_sendpage and tcp_sendmsg we check the route capabilities to
      determine if checksum offload can be performed. This check currently
      does not take the IP protocol into account for devices that advertise
      only one of NETIF_F_IPV6_CSUM or NETIF_F_IP_CSUM. This patch adds a
      function to check capabilities for checksum offload with a socket
      called sk_check_csum_caps. This function checks for specific IPv4 or
      IPv6 offload support based on the family of the socket.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
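
The difference between the old and the new condition can be modelled in a
few lines. This is a hedged userspace sketch, not the kernel code: the
flag bit values, `struct sock_model`, `any_csum_cap` and
`family_aware_csum_cap` are assumptions made for illustration.
`any_csum_cap` models a check that ignores the IP protocol, while
`family_aware_csum_cap` models the kind of test the commit describes for
sk_check_csum_caps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignments; the kernel defines its own values. */
enum {
    F_IP_CSUM   = 1 << 0,  /* device offloads TCP/UDP over IPv4 only */
    F_HW_CSUM   = 1 << 1,  /* device offloads any checksum generically */
    F_IPV6_CSUM = 1 << 2,  /* device offloads TCP/UDP over IPv6 only */
};

enum sock_family { FAM_INET, FAM_INET6 };

/* Hypothetical stand-in for the socket fields consulted by the check. */
struct sock_model {
    enum sock_family family;
    uint32_t route_caps;
};

/* Protocol-blind check: true if any checksum offload bit is set. */
bool any_csum_cap(const struct sock_model *sk)
{
    assert(sk != NULL);
    return (sk->route_caps & (F_IP_CSUM | F_HW_CSUM | F_IPV6_CSUM)) != 0;
}

/* Family-aware check: a protocol-specific bit only counts for its family. */
bool family_aware_csum_cap(const struct sock_model *sk)
{
    assert(sk != NULL);
    if (sk->route_caps & F_HW_CSUM)
        return true;
    if (sk->family == FAM_INET)
        return (sk->route_caps & F_IP_CSUM) != 0;
    return (sk->route_caps & F_IPV6_CSUM) != 0;
}
```

For an IPv6 socket routed over a device that sets only the IPv4 bit, the
protocol-blind check reports that offload is possible while the
family-aware check does not, which is the case this commit fixes.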
    • net: Eliminate NETIF_F_GEN_CSUM and NETIF_F_V[46]_CSUM · c8cd0989
      Tom Herbert authored
      These netif flags are unnecessary convolutions. It is more
      straightforward to just use NETIF_F_HW_CSUM, NETIF_F_IP_CSUM,
      and NETIF_F_IPV6_CSUM directly.
      
      This patch also:
          - Cleans up can_checksum_protocol
          - Simplifies netdev_intersect_features
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK · a188222b
      Tom Herbert authored
      The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the
      set of features for offloading all checksums. This is a mask of the
      checksum offload related feature bits. It is incorrect to set both
      NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM at the same
      time for features of a device.
      
      This patch:
        - Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where
          NETIF_F_ALL_CSUM is being used as a mask).
        - Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to
          use NETIF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
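
The distinction between a mask and a feature set can be shown with a
small userspace model. The bit values and the names `F_SG`,
`csum_features_valid` and `use_generic_csum` below are hypothetical and
chosen for illustration; only the rule being modelled (the generic bit
must not be combined with a protocol-specific one) comes from the commit
message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments; the kernel defines its own values. */
enum {
    F_SG        = 1 << 0,  /* an unrelated, non-checksum feature */
    F_IP_CSUM   = 1 << 1,
    F_HW_CSUM   = 1 << 2,
    F_IPV6_CSUM = 1 << 3,
    /* A mask for selecting checksum bits, not a feature set to advertise. */
    F_CSUM_MASK = F_IP_CSUM | F_HW_CSUM | F_IPV6_CSUM,
};

/* A feature set is inconsistent if it mixes generic and specific bits. */
bool csum_features_valid(uint32_t features)
{
    uint32_t csum = features & F_CSUM_MASK;

    if ((csum & F_HW_CSUM) && (csum & (F_IP_CSUM | F_IPV6_CSUM)))
        return false;
    return true;
}

/* Replace whatever checksum bits are set with the generic one. */
uint32_t use_generic_csum(uint32_t features)
{
    uint32_t out = (features & ~(uint32_t)F_CSUM_MASK) | F_HW_CSUM;

    assert(csum_features_valid(out));
    return out;
}
```

Advertising the whole mask as a device's features fails the validity
check, which is why the name "ALL_CSUM" was misleading. The mask remains
useful for its actual purpose, clearing or testing the checksum bits as a
group.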