1. 22 Feb, 2016 16 commits
    • Merge tag 'linux-can-next-for-4.6-20160220' of... · 86310cc4
      David S. Miller authored
      Merge tag 'linux-can-next-for-4.6-20160220' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can-next 2016-02-20
      
      this is a pull request of 9 patches for net-next/master.

      The first 3 patches are from Damien Riegel, they add support for
      Technologic Systems IP core to the sja1000 driver. The next 6 patches by
      Marek Vasut (including one by me) first clean up and sort the CAN drivers'
      Kconfig and Makefiles and then add support for the IFI CANFD IP core.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf-helper-improvements' · 9c572dc4
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      BPF updates
      
      This set contains various updates for eBPF, i.e. the addition of a
      generic csum helper function and other misc bits that mostly improve
      existing helpers and ease programming with eBPF on cls_bpf. For more
      details, please see individual patches.
      
      Set is rebased on top of http://patchwork.ozlabs.org/patch/584465/.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: don't emit mov A,A on return · 6205b9cf
      Daniel Borkmann authored
      While debugging with bpf_jit_disasm I noticed emissions of 'mov %eax,%eax',
      and found that this comes from BPF_RET | BPF_A translations from classic
      BPF. Emitting this is unnecessary as BPF_REG_A is mapped into BPF_REG_0
      already, therefore only emit a mov when immediates are used as return value.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: fix csum update in bpf_l4_csum_replace helper for udp · 2f72959a
      Daniel Borkmann authored
      When using this helper for updating UDP checksums, we need to extend
      it to write CSUM_MANGLED_0 for csum computations that result in 0 as
      the sum. We need this because a packet with a checksum could otherwise
      become incorrectly marked as a packet without a checksum.
      Likewise, if the user indicates BPF_F_MARK_MANGLED_0, then we should
      not turn packets without a checksum into ones with a checksum.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
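The two UDP rules described in the commit above can be sketched in user space roughly as follows. This is a model under stated assumptions, not the kernel code: udp_csum_replace2() and its mark_mangled_0 parameter are hypothetical names standing in for the helper and the BPF_F_MARK_MANGLED_0 flag, and the update uses the standard incremental one's-complement formula for a single 16-bit field.

```c
#include <assert.h>
#include <stdint.h>

/* The kernel's CSUM_MANGLED_0: 0xffff stands in for a computed sum of 0. */
#define MANGLED_0 0xffffu

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffffu) + (sum >> 16);
	sum = (sum & 0xffffu) + (sum >> 16);
	assert(sum <= 0xffffu);
	return (uint16_t)sum;
}

/*
 * Hypothetical model: update the UDP checksum `check` after a 16-bit
 * field changed from `from` to `to`. mark_mangled_0 models the
 * BPF_F_MARK_MANGLED_0 flag.
 */
static uint16_t udp_csum_replace2(uint16_t check, uint16_t from,
				  uint16_t to, int mark_mangled_0)
{
	uint32_t sum;
	uint16_t res;

	/* A stored 0 means "no checksum": do not create one. */
	if (mark_mangled_0 && check == 0)
		return 0;

	/* Incremental update: ~(~check + ~from + to). */
	sum = (uint32_t)(uint16_t)~check + (uint32_t)(uint16_t)~from + to;
	res = (uint16_t)~fold16(sum);

	/* A computed 0 must not read as "no checksum": write 0xffff. */
	if (mark_mangled_0 && res == 0)
		res = MANGLED_0;
	return res;
}
```

Without the flag, changing a field from 0 to 1 under a stored checksum of 0x0001 yields 0, which a UDP receiver would read as "no checksum"; with the flag the same update yields 0xffff.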
    • bpf: try harder on clones when writing into skb · 3697649f
      Daniel Borkmann authored
      When we're dealing with clones and the area is not writeable, try
      harder and get a copy via pskb_expand_head(). Also replace other
      occurrences in tc actions with the new skb_try_make_writable().
      Reported-by: Ashhad Sheikh <ashhadsheikh394@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: remove artificial bpf_skb_{load, store}_bytes buffer limitation · 21cafc1d
      Daniel Borkmann authored
      We currently limit bpf_skb_store_bytes() and bpf_skb_load_bytes()
      helpers to only store or load a maximum buffer of 16 bytes. Thus,
      loading, rewriting and storing headers require several bpf_skb_load_bytes()
      and bpf_skb_store_bytes() calls.
      
      Also here we can use a per-cpu scratch buffer instead in order to not
      pressure stack space any further. I do suspect that this limit was mainly
      set in place for this particular reason. So, ease program development
      by removing this limitation and make the scratchpad generic, so it can
      be reused.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: add generic bpf_csum_diff helper · 7d672345
      Daniel Borkmann authored
      For L4 checksums, we currently have bpf_l4_csum_replace() helper. It's
      currently limited to handle 2 and 4 byte changes in a header and feeds the
      from/to into inet_proto_csum_replace{2,4}() helpers of the kernel. When
      working with IPv6, for example, this makes it rather cumbersome to deal
      with, similarly when editing larger parts of a header.
      
      Instead, extend the API in a more generic way: For bpf_l4_csum_replace(),
      add a case for header field mask of 0 to change the checksum at a given
      offset through inet_proto_csum_replace_by_diff(), and provide a helper
      bpf_csum_diff() that can generically calculate a from/to diff for arbitrary
      amounts of data.
      
      This can be used in multiple ways: for the bpf_l4_csum_replace() only
      part, this even provides us with the option to insert precalculated diffs
      from user space f.e. from a map, or from bpf_csum_diff() during runtime.
      
      bpf_csum_diff() has an optional from/to stack buffer input, so we can
      calculate a diff by using a scratchbuffer for scenarios where we're
      inserting (from is NULL), removing (to is NULL) or diffing (from/to buffers
      don't need to be of equal size) data. Also, bpf_csum_diff() allows
      feeding a previous csum into csum_partial(), so the function can also be
      cascaded.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
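The from/to diff idea from the commit above can be illustrated with a small user-space model. Everything here is an assumption made for illustration: csum_diff(), csum_apply_diff() and csum_full() are hypothetical names, the buffers are summed as big-endian 16-bit words, and the sketch requires even lengths. The diff is the one's-complement sum of the complemented "from" words plus the "to" words, seeded with a previous sum so calls can be cascaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement add of two 32-bit partial sums (end-around carry). */
static uint32_t csum_add32(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;
	return (uint32_t)((s & 0xffffffffu) + (s >> 32));
}

/* Fold a 32-bit partial sum into 16 bits. */
static uint16_t csum_fold(uint32_t sum)
{
	sum = (sum & 0xffffu) + (sum >> 16);
	sum = (sum & 0xffffu) + (sum >> 16);
	return (uint16_t)sum;
}

/* Add the 16-bit words of buf (optionally complemented) onto seed. */
static uint32_t csum_words(const uint8_t *buf, size_t len, uint32_t seed,
			   int complement)
{
	uint32_t sum = seed;

	assert(len % 2 == 0); /* restriction of this sketch */
	for (size_t i = 0; i < len; i += 2) {
		uint16_t w = (uint16_t)((buf[i] << 8) | buf[i + 1]);

		if (complement)
			w = (uint16_t)~w;
		sum = csum_add32(sum, w);
	}
	return sum;
}

/* Model of the diff: seed + sum(~from) + sum(to); either side may be NULL. */
static uint32_t csum_diff(const uint8_t *from, size_t from_len,
			  const uint8_t *to, size_t to_len, uint32_t seed)
{
	uint32_t sum = seed;

	if (from)
		sum = csum_words(from, from_len, sum, 1);
	if (to)
		sum = csum_words(to, to_len, sum, 0);
	return sum;
}

/* Apply a diff to a stored (complemented) checksum field. */
static uint16_t csum_apply_diff(uint16_t check, uint32_t diff)
{
	return (uint16_t)~csum_fold(csum_add32((uint16_t)~check, diff));
}

/* Full recomputation, used only as a reference to compare against. */
static uint16_t csum_full(const uint8_t *buf, size_t len)
{
	return (uint16_t)~csum_fold(csum_words(buf, len, 0, 0));
}
```

For a rewritten 4-byte field, csum_apply_diff(old, csum_diff(old_bytes, 4, new_bytes, 4, 0)) should match csum_full() over the modified buffer, without touching the unchanged bytes.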
    • bpf: add new arg_type that allows for 0 sized stack buffer · 8e2fe1d9
      Daniel Borkmann authored
      Currently, when we pass a buffer from the eBPF stack into a helper
      function, the function proto indicates argument types as ARG_PTR_TO_STACK
      and ARG_CONST_STACK_SIZE pair. If R<X> contains the former, then R<X+1>
      must be of the latter type. Then, verifier checks whether the buffer
      points into eBPF stack, is initialized, etc. The verifier also guarantees
      that the constant value passed in R<X+1> is greater than 0, so helper
      functions don't need to test for it and can always assume a non-NULL
      initialized buffer as well as non-0 buffer size.
      
      This patch adds a new argument type ARG_CONST_STACK_SIZE_OR_ZERO that
      also allows passing NULL as R<X> and 0 as R<X+1> into the helper function.
      Such helper functions, of course, need to be able to handle these cases
      internally then. Verifier guarantees that either R<X> == NULL && R<X+1> == 0
      or R<X> != NULL && R<X+1> != 0 (like the case of ARG_CONST_STACK_SIZE); any
      other combination is rejected at load time.
      
      I went through various options of extending the verifier, and introducing
      the type ARG_CONST_STACK_SIZE_OR_ZERO seems to need the fewest changes
      to the verifier.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
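The pointer/size rule stated in the commit above can be written down as a small predicate. This is only a sketch of the rule as the message describes it, not the verifier's code: the enum, stack_arg_pair_ok() and helper_bytes_to_read() are hypothetical names, and checks such as "points into the eBPF stack" and "is initialized" are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two size argument types. */
enum size_arg_type {
	SIZE_NONZERO,	/* models ARG_CONST_STACK_SIZE */
	SIZE_OR_ZERO	/* models ARG_CONST_STACK_SIZE_OR_ZERO */
};

/* True if the (buffer, size) pair would be accepted under the rule. */
static bool stack_arg_pair_ok(const void *buf, size_t size,
			      enum size_arg_type type)
{
	/* Both types accept a non-NULL buffer with a non-zero size. */
	if (buf != NULL && size > 0)
		return true;
	/* Only the new type additionally accepts NULL together with 0. */
	if (type == SIZE_OR_ZERO && buf == NULL && size == 0)
		return true;
	/* Mixed combinations (NULL with size, buffer with 0) are rejected. */
	return false;
}

/*
 * A helper using the new type has to handle the NULL/0 case itself,
 * but it may rely on the two values being consistent.
 */
static size_t helper_bytes_to_read(const void *buf, size_t size)
{
	assert((buf == NULL) == (size == 0)); /* guaranteed by the rule */
	return buf ? size : 0;
}
```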
    • Merge branch 'geneve-vxlan-outer-checksum' · 8b393f83
      David S. Miller authored
      Alexander Duyck says:
      
      ====================
      GENEVE/VXLAN: Enable outer Tx checksum by default
      
      This patch series makes it so that we enable the outer Tx checksum for IPv4
      tunnels by default.  This makes the behavior consistent with how we were
      handling this for IPv6.  In addition I have updated the internal flags for
      these tunnels so that we use a ZERO_CSUM_TX flag for IPv4 which should
      match up well with the ZERO_CSUM6_TX flag which was already in use for
      IPv6.
      
      For most network devices this should be a net gain in terms of performance
      as having the outer header checksum present allows for devices to report
      CHECKSUM_UNNECESSARY which we can then convert to CHECKSUM_COMPLETE in order
      to determine if the inner header checksum is valid.
      
      Below is some data I collected with ixgbe with an X540 that demonstrates
      this.  I placed two PFs connected back to back in two different
      namespaces and then set up a pair of tunnels on each, one with checksum
      enabled and one without.
      
      Recv   Send    Send                          Utilization
      Socket Socket  Message  Elapsed              Send
      Size   Size    Size     Time     Throughput  local
      bytes  bytes   bytes    secs.    10^6bits/s  % S
      
      noudpcsum:
       87380  16384  16384    30.00      8898.67   12.80
      udpcsum:
       87380  16384  16384    30.00      9088.47   5.69
      
      The one spot where this may cause a performance regression is if the
      environment contains devices that can parse the inner headers and a device
      supports NETIF_F_GSO_UDP_TUNNEL but not NETIF_F_GSO_UDP_TUNNEL_CSUM.  In
      the case of such a device we have to fall back to using GSO to segment the
      tunnel instead of TSO and as a result we may take a performance hit as seen
      below with i40e.
      
      Recv   Send    Send                          Utilization
      Socket Socket  Message  Elapsed              Send
      Size   Size    Size     Time     Throughput  local
      bytes  bytes   bytes    secs.    10^6bits/s  % S
      
      noudpcsum:
       87380  16384  16384    30.00      9085.21   3.32
      udpcsum:
       87380  16384  16384    30.00      9089.23   5.54
      
      In addition it will be necessary to update iproute2 so that we don't
      provide the checksum attribute unless specified.  This way on older kernels
      which don't have local checksum offload we will default to disabling the
      outer checksum, and on newer kernels that have LCO we can default to
      enabling it.
      
      I also haven't investigated the effect this will have on OVS.  However I
      suspect the impact should be minimal as the worst case scenario should be
      that Tx checksumming will become enabled by default which should be
      consistent with the existing behavior for IPv6.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • VXLAN: Support outer IPv4 Tx checksums by default · 6ceb31ca
      Alexander Duyck authored
      This change makes it so that if UDP CSUM is not specified we will default
      to enabling it.  The main motivation behind this is the fact that with the
      use of outer checksum we can greatly improve the performance for VXLAN
      tunnels on devices that don't know how to parse tunnel headers.
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Acked-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • GENEVE: Support outer IPv4 Tx checksums by default · 14f1f724
      Alexander Duyck authored
      This change makes it so that if UDP CSUM is not specified we will default
      to enabling it.  The main motivation behind this is the fact that with the
      use of outer checksum we can greatly improve the performance for GENEVE
      tunnels on hardware that doesn't know how to parse them.
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Acked-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'lwt-autoload' · 417b7ca4
      David S. Miller authored
      Robert Shearman says:
      
      ====================
      lwtunnel: autoload of lwt modules
      
      Changes since v1:
       - remove "LWTUNNEL_ENCAP_" prefix for the string form of the encaps
         used when requesting the module to reduce duplication, and don't
         bother returning strings for lwt modules using netdevices, both
         suggested by Jiri.
       - update commit message of first patch to clarify security
         implications, in response to Eric's comments.
      
      The lwt implementations using net devices can autoload using the
      existing mechanism using IFLA_INFO_KIND. However, there's no mechanism
      that lwt modules not using net devices can use.
      
      Therefore, these patches add the ability to autoload modules
      registering lwt operations for lwt implementations not using a net
      device so that users don't have to manually load the modules.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ila: autoload module · 84a8cbe4
      Robert Shearman authored
      Avoid users having to manually load the module by adding a module
      alias allowing it to be autoloaded by the lwt infra.
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mpls: autoload lwt module · b2b04edc
      Robert Shearman authored
      Avoid users having to manually load the module by adding a module
      alias allowing it to be autoloaded by the lwt infra.
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lwtunnel: autoload of lwt modules · 745041e2
      Robert Shearman authored
      The lwt implementations using net devices can autoload using the
      existing mechanism using IFLA_INFO_KIND. However, there's no mechanism
      that lwt modules not using net devices can use.
      
      Therefore, add the ability to autoload modules registering lwt
      operations for lwt implementations not using a net device so that
      users don't have to manually load the modules.
      
      Only users with the CAP_NET_ADMIN capability can cause modules to be
      loaded, which is ensured by rtnetlink_rcv_msg rejecting non-RTM_GETxxx
      messages for users without this capability, and by
      lwtunnel_build_state not being called in response to RTM_GETxxx
      messages.
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
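The autoload lookup described above (a short string per encap type, and no string for types that are backed by net devices) might look roughly like this in a user-space model. The enum values, function names, and in particular the "rtnl-lwt-" alias prefix are assumptions made for illustration; the commit message does not spell out the alias format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical encap types for this sketch. */
enum encap_type { ENCAP_NONE, ENCAP_MPLS, ENCAP_IP, ENCAP_ILA, ENCAP_IP6 };

/*
 * String form without the "LWTUNNEL_ENCAP_" prefix. Types backed by
 * net devices return NULL because they can already autoload through
 * IFLA_INFO_KIND.
 */
static const char *encap_str(enum encap_type type)
{
	switch (type) {
	case ENCAP_MPLS:
		return "MPLS";
	case ENCAP_ILA:
		return "ILA";
	default:
		return NULL;
	}
}

/*
 * Build the module alias to request. Returns 0 on success, -1 when
 * the type has no autoload string or the buffer is too small.
 */
static int lwt_module_alias(enum encap_type type, char *out, size_t len)
{
	const char *str = encap_str(type);
	int n;

	assert(out != NULL && len > 0);
	if (!str)
		return -1;
	/* The "rtnl-lwt-" prefix is an assumption of this sketch. */
	n = snprintf(out, len, "rtnl-lwt-%s", str);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

In the kernel, the capability check described in the commit message happens before such a lookup is reached; this model leaves it out.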
    • vlan: turn on unicast filtering on vlan device · e817af27
      Zhang Shengju authored
      Currently vlan device inherits unicast filtering flag from underlying
      device. If underlying device doesn't support unicast filter, this will
      put vlan device into promiscuous mode when it's stacked.
      
      Turn on IFF_UNICAST_FLT on the vlan device in any case so that it does
      not go into promiscuous mode needlessly. If underlying device does not
      support unicast filtering, that device will enter promiscuous mode.
      Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 20 Feb, 2016 24 commits