1. 21 Mar, 2015 16 commits
  2. 20 Mar, 2015 24 commits
    • Merge branch 'ebpf-next' · f6877fcf
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      This set adds native eBPF support also to act_bpf and thus covers tc
      with eBPF in the classifier *and* action part.
      
      A link to iproute2 preview has been provided in patch 2 and the code
      will be pushed out after Stephen has processed the classifier part
      and helper bits for tc.
      
      This set depends on ced585c8 ("act_bpf: allow non-default TC_ACT
      opcodes as BPF exec outcome"), so a net into net-next merge would be
      required first. Hope that's fine by you, Dave. ;)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f6877fcf
    • act_bpf: add initial eBPF support for actions · a8cb5f55
      Daniel Borkmann authored
      This work extends the "classic" BPF programmable tc action by extending
      its scope also to native eBPF code!
      
      Together with commit e2e9b654 ("cls_bpf: add initial eBPF support
      for programmable classifiers"), this makes it possible to implement fully
      flexible classifiers and actions for tc that can be written in a C
      subset in user space, "safely" loaded into the kernel, and run at
      native speed when JITed.
      
      Also, since eBPF maps can be shared between eBPF programs, it offers the
      possibility that cls_bpf and act_bpf can share data 1) between themselves
      and 2) with user space applications. That means that, e.g., customized
      runtime statistics can be collected in user space, but also more importantly
      classifier and action behaviour could be altered based on map input from
      the user space application.
      
      For the remaining details on the workflow and integration, see the cls_bpf
      commit e2e9b654. Preliminary iproute2 part can be found under [1].
      
        [1] http://git.breakpoint.cc/cgit/dborkman/iproute2.git/log/?h=ebpf-act
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Acked-by: Jiri Pirko <jiri@resnulli.us>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a8cb5f55
    • ebpf: add sched_act_type and map it to sk_filter's verifier ops · 94caee8c
      Daniel Borkmann authored
      In order to prepare eBPF support for tc action, we need to add
      sched_act_type, so that the eBPF verifier knows which helper
      functions act_bpf may use, and that it may load skb data and read
      the currently available skb fields.
      
      This is basically analogous to 96be4325 ("ebpf: add sched_cls_type
      and map it to sk_filter's verifier ops").
      
      BPF_PROG_TYPE_SCHED_CLS and BPF_PROG_TYPE_SCHED_ACT need to be
      separate since both will have a different set of functionality in
      future (classifier vs action), thus we won't run into ABI troubles
      when the point in time comes to diverge functionality from the
      classifier.
      
      The future plan for act_bpf would be that it will be able to write
      into skb->data and alter selected fields mirrored in struct __sk_buff.
      
      For an initial support, it's sufficient to map it to sk_filter_ops.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Reviewed-by: Jiri Pirko <jiri@resnulli.us>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      94caee8c
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 0fa74a4b
      David S. Miller authored
      Conflicts:
      	drivers/net/ethernet/emulex/benet/be_main.c
      	net/core/sysctl_net_core.c
      	net/ipv4/inet_diag.c
      
      The be_main.c conflict resolution was really tricky.  The conflict
      hunks generated by GIT were very unhelpful, to say the least.  It
      split functions in half and moved them around, when the actual
      conflict existed inside only one function, that being
      be_map_pci_bars().
      
      So instead, to resolve this, I checked out be_main.c from the top
      of net-next, then I applied the be_main.c changes from 'net' since
      the last time I merged.  And this worked beautifully.
      
      The inet_diag.c and sysctl_net_core.c conflicts were simple
      overlapping changes, and were easy to resolve.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0fa74a4b
    • rhashtable: Fix undeclared EEXIST build error on ia64 · 6626af69
      Herbert Xu authored
      We need to include linux/errno.h in rhashtable.h since it doesn't
      always get included otherwise.
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6626af69
    • net: validate the range we feed to iov_iter_init() in sys_sendto/sys_recvfrom · 4de930ef
      Al Viro authored
      Cc: stable@vger.kernel.org # v3.19
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4de930ef
    • Merge branch 'amd-xgbe-next' · b4c11cb4
      David S. Miller authored
      Tom Lendacky says:
      
      ====================
      amd-xgbe: AMD XGBE driver updates 2015-03-19
      
      The following series of patches includes functional updates and changes
      to the driver.
      
      - Use the phydev->advertising field instead of the phydev->supported
        field when configuring for auto-negotiation, etc.
      - Use the phy_driver flags field for setting the transceiver type
        instead of hardcoding it in the ethtool support.
      - Provide an auto-negotiation timeout check
      - Clarify the Tx/Rx queue information messages
      - Use the new DMA memory barrier operations
      - Set the device DMA mask based on what the hardware reports
      - Remove the software implementation of Tx coalescing
      - Fix the reporting of the Rx coalescing value
      - Use napi_alloc_skb when allocating an SKB in softirq
      
      This patch series is based on net-next.
      
      Changes from v2:
      - Use jiffies instead of timespec for the auto-negotiation timeout check
      - Remove the Rx path SKB allocation re-work patch since we should only
        inline the headers and the current code guards better against any
        hardware bugs
      
      Changes from v1:
      - Default to 32-bit DMA width (minimum supported) if hardware returns
        an unexpected DMA width value
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b4c11cb4
    • amd-xgbe: Use napi_alloc_skb when allocating skb in softirq · 385565a1
      Lendacky, Thomas authored
      Use the napi_alloc_skb function to allocate an skb when running within
      the softirq context to avoid calls to local_irq_save/restore.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      385565a1
    • amd-xgbe: Fix Rx coalescing reporting · 4a57ebcc
      Lendacky, Thomas authored
      The Rx coalescing value is internally converted from usecs to a value
      that the hardware can use. When reporting the Rx coalescing value, this
      internal value is converted back to usecs. During the conversion from
      and back to usecs some rounding occurs. So, for example, when setting an
      Rx usec of 30, it will be reported as 29. Fix this reporting issue by
      keeping the original usec value and using that during reporting.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4a57ebcc
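The rounding described in the commit above can be reproduced in a small user-space sketch. The clock rate (250 MHz) and the 256-cycles-per-unit scale are assumptions chosen so that 30 usecs reads back as 29, and the names `rx_coalesce`, `usec_to_hw` and `hw_to_usec` are hypothetical, not the driver's own.

```c
#include <stdint.h>

/* Hypothetical clock rate in MHz; the real driver obtains its rate elsewhere. */
#define SYSCLK_MHZ      250u
/* Assumption: one hardware timer unit spans 256 clock cycles. */
#define CYCLES_PER_UNIT 256u

/* Convert microseconds to hardware timer units (integer division truncates). */
static uint32_t usec_to_hw(uint32_t usec)
{
	return (usec * SYSCLK_MHZ) / CYCLES_PER_UNIT;
}

/* Convert hardware timer units back to microseconds (truncates again). */
static uint32_t hw_to_usec(uint32_t hw)
{
	return (hw * CYCLES_PER_UNIT) / SYSCLK_MHZ;
}

struct rx_coalesce {
	uint32_t hw_units; /* value that would be programmed into the device */
	uint32_t usecs;    /* value the user asked for, kept only for reporting */
};

/* Store both the converted value and the original request. */
static void rx_coalesce_set(struct rx_coalesce *c, uint32_t usec)
{
	c->hw_units = usec_to_hw(usec);
	c->usecs = usec;
}

/* Report the original request instead of converting hw_units back. */
static uint32_t rx_coalesce_get(const struct rx_coalesce *c)
{
	return c->usecs;
}
```

Under these assumed constants, `hw_to_usec(usec_to_hw(30))` yields 29, while `rx_coalesce_get()` returns the 30 that was originally set.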
    • amd-xgbe: Remove Tx coalescing · c635eaac
      Lendacky, Thomas authored
      The Tx coalescing support in the driver was a software implementation
      for something lacking in the hardware. Using hrtimers, the idea was to
      trigger a timer interrupt after having queued a packet for transmit.
      Unfortunately, as the timer value was lowered, the timer expired before
      the hardware actually did the transmit and so it was racy and resulted
      in unnecessary interrupts.
      
      Remove the Tx coalescing support and hrtimer and replace with a Tx timer
      that is used as a reclaim timer in case of inactivity.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c635eaac
    • amd-xgbe: Set DMA mask based on hardware register value · 386d325d
      Lendacky, Thomas authored
      The hardware supplies a value that indicates the DMA range that it
      is capable of using. Use this value rather than hard-coding it in
      the driver.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      386d325d
    • amd-xgbe: Use the new DMA memory barriers where appropriate · ceb8f6be
      Lendacky, Thomas authored
      Use the new lighter weight memory barriers when working with the device
      descriptors.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ceb8f6be
    • amd-xgbe: Clarify output message about queues · 600c8811
      Lendacky, Thomas authored
      Clarify that the queues referred to in a message when the device is
      brought up are hardware queues and not necessarily related to the
      Linux network queues.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      600c8811
    • amd-xgbe-phy: Provide support for auto-negotiation timeout · 9ae5eecd
      Lendacky, Thomas authored
      Currently, there is no interrupt code that indicates auto-negotiation
      has timed out. If the auto-negotiation has timed out then the start of
      a new auto-negotiation will begin again with a new base page being
      received. The state machine could be in a state that is not expecting
      this interrupt code which results in an error during auto-negotiation.
      
      Update the code to timestamp when the auto-negotiation starts.  Should
      another page received interrupt code occur before auto-negotiation has
      completed but after the auto-negotiation timeout, then reset the state
      machine to allow the auto-negotiation to continue.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9ae5eecd
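The timeout check described in the commit above might look roughly like the following user-space sketch. It uses a free-running tick counter in place of jiffies; the timeout value and the names `an_ctx`, `an_page_received` and `tick_after` are hypothetical, and the real state machine has more states than the page counter used here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timeout in ticks; the driver's actual value is not given here. */
#define AN_TIMEOUT_TICKS 500u

struct an_ctx {
	bool in_progress;   /* an auto-negotiation attempt is under way */
	uint32_t start;     /* tick count when the current attempt began */
	unsigned int pages; /* pages seen in the current attempt */
};

/* Wraparound-safe "a is later than b", the same idea as the kernel's time_after(). */
static bool tick_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/*
 * Handle a page-received interrupt at tick `now`.
 * Returns true if the state was reset because the previous attempt timed out.
 */
static bool an_page_received(struct an_ctx *an, uint32_t now)
{
	bool reset = false;

	if (!an->in_progress) {
		/* First page of a new attempt: take the timestamp. */
		an->in_progress = true;
		an->start = now;
		an->pages = 0;
	} else if (tick_after(now, an->start + AN_TIMEOUT_TICKS)) {
		/* Attempt not completed in time: treat this page as a new base page. */
		an->start = now;
		an->pages = 0;
		reset = true;
	}

	an->pages++;
	return reset;
}

/* Mark the attempt as finished so the next page starts a fresh one. */
static void an_complete(struct an_ctx *an)
{
	an->in_progress = false;
}
```

The signed-difference comparison keeps the check correct when the tick counter wraps, which is one reason a jiffies-style counter is convenient for this kind of timeout.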
    • amd-xgbe-phy: Use the phy_driver flags field · 65f57cb1
      Lendacky, Thomas authored
      Remove the setting of the transceiver type when retrieving the device
      settings using ethtool and instead set the transceiver type in the
      phy_driver structure flags field. Change the transceiver type to be
      internal, also.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      65f57cb1
    • amd-xgbe-phy: Use phydev advertising field vs supported · d9663c8c
      Lendacky, Thomas authored
      With ethtool being able to control what is advertised, the advertising
      field is what should be used for priming the auto-negotiation registers
      and for various other checks, instead of the supported field.
      
      Also, move the initial setting of the supported and advertising fields
      into the probe function so that they are not reset each time the device
      is brought up, thus allowing the user to set as desired before bringing
      the device up.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d9663c8c
    • net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour · 91edd096
      Catalin Marinas authored
      Commit db31c55a (net: clamp ->msg_namelen instead of returning an
      error) introduced the clamping of msg_namelen when the unsigned value
      was larger than sizeof(struct sockaddr_storage). This caused a
      msg_namelen of -1 to be valid. The native code was subsequently fixed by
      commit dbb490b9 (net: socket: error on a negative msg_namelen).
      
      In addition, the native code sets msg_namelen to 0 when msg_name is
      NULL. This was done in commit 6a2a2b3a (net:socket: set msg_namelen
      to 0 if msg_name is passed as NULL in msghdr struct from userland) and
      subsequently updated by 08adb7da (fold verify_iovec() into
      copy_msghdr_from_user()).
      
      This patch brings the get_compat_msghdr() in line with
      copy_msghdr_from_user().
      
      Fixes: db31c55a (net: clamp ->msg_namelen instead of returning an error)
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      91edd096
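The three rules named in the commit above (zero the length when the name is NULL, reject negative lengths, clamp oversized ones) can be sketched in user space as follows. The helper name `normalize_msg_namelen` is hypothetical and the ordering of the checks is an assumption; this is an illustration of the behaviour, not the kernel function itself.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: normalise a user-supplied name length.
 * Returns 0 on success (possibly after adjusting *namelen), or -EINVAL.
 */
static int normalize_msg_namelen(const void *name, int *namelen)
{
	/* No address buffer supplied: the length is meaningless, force it to 0. */
	if (name == NULL)
		*namelen = 0;

	/*
	 * Check the sign before the clamp. If the value were compared as
	 * unsigned first, -1 would look huge, get clamped, and become valid.
	 */
	if (*namelen < 0)
		return -EINVAL;

	/* Clamp oversized lengths instead of returning an error. */
	if ((size_t)*namelen > sizeof(struct sockaddr_storage))
		*namelen = (int)sizeof(struct sockaddr_storage);

	return 0;
}
```

The comment on the sign check mirrors the problem the commit message describes: clamping an unsigned comparison alone let a msg_namelen of -1 through.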
    • Merge branch 'rhashtable-inlined-interface' · ebd6af09
      David S. Miller authored
      Herbert Xu says:
      
      ====================
      rhashtable: Introduce inlined interface
      
      This series of patches introduces the inlined rhashtable interface.
      
      The idea is to make all the function pointers visible to the compiler
      by providing the rhashtable_params structure explicitly to each
      inline rhashtable function.  For example, instead of doing
      
      	obj = rhashtable_lookup(ht, key);
      
      you would now do
      
      	obj = rhashtable_lookup_fast(ht, key, params);
      
      Where params is the same data that you would give to rhashtable_init.
      In particular, within rhashtable.c itself we would simply supply
      ht->p.
      
      So to convert users over, you simply have to make params globally
      accessible, e.g., by placing it in a static const variable, which
      can then be used at each inlined call site, as well as by the
      rhashtable_init call.
      
      The only tricky bit is that some users (e.g., netfilter) have a
      dynamic key length.  This is dealt with by using params.key_len
      in the inline functions when it is non-zero, and otherwise falling
      back on ht->p.key_len.
      
      Note that I've only tested this on one compiler, gcc 4.7.2.  So
      please test this with your compilers as well and make sure that
      the code is actually inlined without indirect function calls.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ebd6af09
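The calling convention described in the cover letter above can be illustrated with a toy user-space table. Everything here (`toy_params`, `toy_table`, `toy_lookup_fast`, the linear-probing layout) is a hypothetical stand-in rather than the rhashtable API; the sketch only shows params being passed by value to inline functions, with a fallback to the table's stored key length when `params.key_len` is zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SLOTS 8u

struct toy_params {
	size_t key_len;    /* 0 means "use the length stored in the table" */
	size_t key_offset; /* where the key lives inside each object */
	uint32_t (*hashfn)(const void *key, size_t len);
};

struct toy_table {
	struct toy_params p;          /* copy taken at init time, like ht->p */
	const void *slots[TOY_SLOTS]; /* linear-probing slots, NULL when empty */
};

/* FNV-1a over the key bytes; any hash function would do for the sketch. */
static inline uint32_t toy_hash_bytes(const void *key, size_t len)
{
	const unsigned char *b = key;
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= b[i];
		h *= 16777619u;
	}
	return h;
}

static void toy_init(struct toy_table *ht, const struct toy_params *params)
{
	memset(ht, 0, sizeof(*ht));
	ht->p = *params;
}

/* Prefer the call-site constant; fall back to the run-time value. */
static inline size_t toy_key_len(const struct toy_table *ht,
				 const struct toy_params params)
{
	return params.key_len ? params.key_len : ht->p.key_len;
}

static inline int toy_insert_fast(struct toy_table *ht, const void *obj,
				  const struct toy_params params)
{
	size_t len = toy_key_len(ht, params);
	uint32_t h = params.hashfn((const char *)obj + params.key_offset, len);

	for (size_t i = 0; i < TOY_SLOTS; i++) {
		size_t s = (h + i) % TOY_SLOTS;

		if (!ht->slots[s]) {
			ht->slots[s] = obj;
			return 0;
		}
	}
	return -1; /* table full */
}

static inline const void *toy_lookup_fast(const struct toy_table *ht,
					  const void *key,
					  const struct toy_params params)
{
	size_t len = toy_key_len(ht, params);
	uint32_t h = params.hashfn(key, len);

	for (size_t i = 0; i < TOY_SLOTS; i++) {
		const void *obj = ht->slots[(h + i) % TOY_SLOTS];

		if (!obj)
			return NULL;
		if (!memcmp((const char *)obj + params.key_offset, key, len))
			return obj;
	}
	return NULL;
}

struct item {
	int value;
	uint32_t id; /* the key */
};

/* One static const params object shared by init and every call site. */
static const struct toy_params item_params = {
	.key_len = sizeof(uint32_t),
	.key_offset = offsetof(struct item, id),
	.hashfn = toy_hash_bytes,
};
```

Because `item_params` is a compile-time constant visible at each call site, an optimizing compiler has the opportunity to resolve `params.hashfn` directly; whether it actually does so depends on the compiler and flags, which is why the cover letter asks for testing.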
    • rhashtable: Rip out obsolete out-of-line interface · dc0ee268
      Herbert Xu authored
      Now that all rhashtable users have been converted over to the
      inline interface, this patch removes the unused out-of-line
      interface.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dc0ee268
    • tipc: Use inlined rhashtable interface · 6cca7289
      Herbert Xu authored
      This patch converts tipc to the inlined rhashtable interface.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6cca7289
    • test_rhashtable: Use inlined rhashtable interface · b182aa6e
      Herbert Xu authored
      This patch converts test_rhashtable to the inlined rhashtable
      interface.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b182aa6e
    • netfilter: Convert nft_hash to inlined rhashtable · fa377321
      Herbert Xu authored
      This patch converts nft_hash to the inlined rhashtable interface.
      
      This patch also replaces the call to rhashtable_lookup_compare with
      a straight rhashtable_lookup_fast because it's simply doing a memcmp
      (in fact nft_hash_lookup already uses memcmp instead of nft_data_cmp).
      
      Furthermore, the compare function is only meant to compare, it is not
      supposed to have side-effects.  The current side-effect code can
      simply be moved into the nft_hash_get.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fa377321
    • netlink: Move namespace into hash key · c428ecd1
      Herbert Xu authored
      Currently the name space is a de facto key because it has to match
      before we find an object in the hash table.  However, it isn't in
      the hash value so all objects from different name spaces with the
      same port ID hash to the same bucket.
      
      This is bad as the number of name spaces is unbounded.
      
      This patch fixes this by using the namespace when doing the hash.
      
      Because the namespace field doesn't lie next to the portid field
      in the netlink socket, this patch switches over to the rhashtable
      interface without a fixed key.
      
      This patch also uses the new inlined rhashtable interface where
      possible.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c428ecd1
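The bucket-collision problem described in the commit above can be shown with a toy model. The struct layout, the numeric `netns_id`, and the deliberately simple hash arithmetic are all assumptions for illustration; the kernel uses a real hash function and a namespace pointer.

```c
#include <stdbool.h>
#include <stdint.h>

#define NBUCKETS 8u

/* Stand-in for a netlink socket: the two key fields are not adjacent. */
struct toy_sock {
	uint32_t portid;
	char other_state[32]; /* unrelated fields sitting between the keys */
	uint32_t netns_id;    /* hypothetical numeric stand-in for the namespace */
};

/* Lookup key carrying both parts explicitly. */
struct toy_key {
	uint32_t portid;
	uint32_t netns_id;
};

/* Old scheme: only the port ID feeds the hash. */
static uint32_t hash_portid_only(const struct toy_sock *sk)
{
	return (sk->portid * 31u) % NBUCKETS;
}

/* New scheme: an object hash function reads both scattered fields. */
static uint32_t hash_portid_and_ns(const struct toy_sock *sk)
{
	return (sk->portid * 31u + sk->netns_id) % NBUCKETS;
}

/* Compare a lookup key against an object, field by field. */
static bool toy_sock_matches(const struct toy_key *key,
			     const struct toy_sock *sk)
{
	return sk->portid == key->portid && sk->netns_id == key->netns_id;
}

/* Count how many distinct buckets n sockets land in under a hash function. */
static unsigned int buckets_used(const struct toy_sock *socks, unsigned int n,
				 uint32_t (*hashfn)(const struct toy_sock *))
{
	bool seen[NBUCKETS] = { false };
	unsigned int used = 0;

	for (unsigned int i = 0; i < n; i++) {
		uint32_t b = hashfn(&socks[i]);

		if (!seen[b]) {
			seen[b] = true;
			used++;
		}
	}
	return used;
}
```

With eight sockets sharing one port ID across eight namespaces, `hash_portid_only` puts them all in a single bucket, while `hash_portid_and_ns` spreads them out. Since the key is scattered across the object, a per-object hash function and a separate compare function take the place of a fixed key offset and length.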
    • rhashtable: Allow hash/comparison functions to be inlined · 02fd97c3
      Herbert Xu authored
      This patch deals with the complaint that we make indirect function
      calls on the fast paths unnecessarily in rhashtable.  We resolve
      it by moving the fast paths into inline functions that take struct
      rhashtable_params (which obviously must be the same set of parameters
      supplied to rhashtable_init) as an argument.
      
      The only remaining indirect call is to obj_hashfn (or key_hashfn if
      obj_hashfn is unset) on the rehash as well as the insert-during-
      rehash slow path.
      
      This patch also extends the support of variable-length keys to
      include those where the key is fixed but scattered in the object.
      For example, in netlink we want to key off the namespace and the
      portid but they're not next to each other.
      
      This patch does this by directly using the object hash function
      as the indicator of whether the key is accessible or not.  It
      also adds a new function obj_cmpfn to compare a key against an
      object.  This means that the caller no longer needs to supply
      explicit compare functions.
      
      All this is done in a backwards compatible manner so no existing
      users are affected until they convert to the new interface.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      02fd97c3