  1. 22 Nov, 2011 1 commit
    • net: add network priority cgroup infrastructure (v4) · 5bc1421e
      Neil Horman authored
      This patch adds the infrastructure code to create the network priority
      cgroup.  In addition to the standard processes file, the cgroup creates two
      control files:
      
      1) prioidx - This is a read-only file that exports the index of this cgroup.
      This value is both arbitrary and unique to a cgroup in this subsystem,
      and is used to index the per-device priority map.
      
      2) priomap - This is a writeable file.  On read it reports a table of 2-tuples
      <name:priority>, where name is the name of a network interface and priority
      indicates the priority assigned to frames egressing on the named interface and
      originating from a pid in this cgroup.
      
      This cgroup allows for skb priority to be set prior to a root qdisc getting
      selected. This is beneficial for DCB enabled systems, in that it allows any
      application to use dcb configured priorities without application modification.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      CC: Robert Love <robert.w.love@intel.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
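The priomap table described in the commit above can be handled from user space with very little code. The following is a minimal sketch only: it assumes each table entry is exposed as one whitespace-separated "<name> <priority>" pair per line, which the commit message does not spell out, and the helper names `parse_priomap` and `format_priomap_entry` are hypothetical.

```python
def parse_priomap(text):
    """Parse priomap-style text into {interface_name: priority}.

    Assumption: one whitespace-separated "<name> <priority>" pair per line.
    """
    priomap = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, prio = line.split()
        priomap[name] = int(prio)
    return priomap


def format_priomap_entry(name, priority):
    """Build the string a user might write to set one interface's priority."""
    return f"{name} {priority}"


# Hypothetical file contents, not taken from a real system.
sample = "lo 0\neth0 5\neth1 3\n"
table = parse_priomap(sample)
print(table["eth0"])
print(format_priomap_entry("eth1", 4))
```

Keeping the parser tolerant of blank lines is a convenience choice; the real file layout should be checked against the kernel documentation for the running kernel.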
  2. 17 Nov, 2011 1 commit
    • net: use jump_label to shortcut RPS if not setup · adc9300e
      Eric Dumazet authored
      Most machines don't use RPS/RFS, and pay a fair amount of instructions in
      netif_receive_skb() / netif_rx() / get_rps_cpu() just to discover
      RPS/RFS is not set up.
      
      Add a jump_label named rps_needed.
      
      If no device rps_map or global rps_sock_flow_table is set up,
      netif_receive_skb() / netif_rx() do a single instruction instead of many
      ones, including conditional jumps.
      
      jmp +0    (if CONFIG_JUMP_LABEL=y)
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 16 Nov, 2011 4 commits
  4. 30 Oct, 2011 1 commit
    • vlan: allow nested vlan_do_receive() · 6a32e4f9
      Eric Dumazet authored
      commit 2425717b (net: allow vlan traffic to be received under bond)
      broke ARP processing on vlan on top of bonding.
      
             +-------+
      eth0 --| bond0 |---bond0.103
      eth1 --|       |
             +-------+
      
      52870.115435: skb_gro_reset_offset <-napi_gro_receive
      52870.115435: dev_gro_receive <-napi_gro_receive
      52870.115435: napi_skb_finish <-napi_gro_receive
      52870.115435: netif_receive_skb <-napi_skb_finish
      52870.115435: get_rps_cpu <-netif_receive_skb
      52870.115435: __netif_receive_skb <-netif_receive_skb
      52870.115436: vlan_do_receive <-__netif_receive_skb
      52870.115436: bond_handle_frame <-__netif_receive_skb
      52870.115436: vlan_do_receive <-__netif_receive_skb
      52870.115436: arp_rcv <-__netif_receive_skb
      52870.115436: kfree_skb <-arp_rcv
      
      Packet is dropped in arp_rcv() because its pkt_type was set to
      PACKET_OTHERHOST in the first vlan_do_receive() call, since no eth0.103
      exists.
      
      We really need to change pkt_type only if no more rx_handler is about to
      be called for the packet.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Reviewed-by: Jiri Pirko <jpirko@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
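The pkt_type problem described in the commit above can be modelled in a few lines. This is a simplified sketch, not the kernel code: the dictionaries `MASTER` and `VLANS`, the dict-based skb, and the `defer_otherhost` switch are all hypothetical stand-ins, used only to contrast marking the frame PACKET_OTHERHOST on the first failed VLAN lookup with marking it only when no further rx_handler will run.

```python
PACKET_HOST = "PACKET_HOST"
PACKET_OTHERHOST = "PACKET_OTHERHOST"

# Hypothetical topology from the diagram: eth0 and eth1 are bond0 slaves,
# and the only VLAN device is bond0.103.
MASTER = {"eth0": "bond0", "eth1": "bond0"}
VLANS = {("bond0", 103): "bond0.103"}


def vlan_do_receive(skb, last_handler):
    """Return True if a VLAN device claimed the frame."""
    vlan_dev = VLANS.get((skb["dev"], skb["vid"]))
    if vlan_dev is None:
        # Give up on the frame only when nothing else will look at it.
        if last_handler:
            skb["pkt_type"] = PACKET_OTHERHOST
        return False
    skb["dev"] = vlan_dev
    skb["vid"] = None  # tag consumed
    return True


def receive(dev, vid, defer_otherhost=True):
    """Model the receive loop; returns the final skb state."""
    skb = {"dev": dev, "vid": vid, "pkt_type": PACKET_HOST}
    while True:
        has_rx_handler = skb["dev"] in MASTER
        # Old behaviour: every failed lookup is treated as final.
        last = (not has_rx_handler) if defer_otherhost else True
        if skb["vid"] is not None and vlan_do_receive(skb, last):
            continue  # another round on the VLAN device
        if has_rx_handler:
            skb["dev"] = MASTER[skb["dev"]]  # bonding rx_handler runs
            continue
        return skb


fixed = receive("eth0", 103, defer_otherhost=True)
broken = receive("eth0", 103, defer_otherhost=False)
print(fixed["dev"], fixed["pkt_type"])
print(broken["dev"], broken["pkt_type"])
```

In both runs the frame ends up on bond0.103, but only the deferred variant leaves pkt_type untouched, which is what a protocol handler such as arp_rcv() needs in order not to drop the packet.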
  5. 24 Oct, 2011 1 commit
  6. 21 Oct, 2011 1 commit
  7. 19 Oct, 2011 4 commits
    • net: validate HWTSTAMP ioctl parameters · 4dc360c5
      Richard Cochran authored
      This patch adds a sanity check on the values provided by user space for
      the hardware time stamping configuration. If the values lie outside of
      the absolute limits, then the ioctl request will be denied.
      Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
      Signed-off-by: David S. Miller <davem@davemloft.net>
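A range check of the kind described in the commit above might look like the sketch below. The numeric limits are assumptions that mirror the enum ordering of the time stamping constants as the editor understands them, the choice of ERANGE as the error is likewise an assumption, and `validate_hwtstamp` is a hypothetical name; none of this is taken from the commit itself.

```python
import errno

# Assumed absolute limits; treat these values as illustrative only.
HWTSTAMP_TX_OFF = 0
HWTSTAMP_TX_ONESTEP_SYNC = 2
HWTSTAMP_FILTER_NONE = 0
HWTSTAMP_FILTER_PTP_V2_DELAY_REQ = 14


def validate_hwtstamp(tx_type, rx_filter):
    """Return 0 if both values lie inside the limits, else a negative errno."""
    if not HWTSTAMP_TX_OFF <= tx_type <= HWTSTAMP_TX_ONESTEP_SYNC:
        return -errno.ERANGE
    if not HWTSTAMP_FILTER_NONE <= rx_filter <= HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
        return -errno.ERANGE
    return 0


print(validate_hwtstamp(1, 1))
print(validate_hwtstamp(3, 0))
```

Rejecting out-of-range values centrally means individual drivers only ever see values that some driver could in principle support.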
    • net: Move rcu_barrier from rollback_registered_many to netdev_run_todo. · 850a545b
      Eric W. Biederman authored
      This patch moves the rcu_barrier from rollback_registered_many
      (inside the rtnl_lock) into netdev_run_todo (just outside the rtnl_lock).
      This allows us to gain the full benefit of synchronize_net calling
      synchronize_rcu_expedited when the rtnl_lock is held.
      
      The rcu_barrier in rollback_registered_many was originally a synchronize_net
      but was promoted to be a rcu_barrier() when it was found that people were
      unnecessarily hitting the 250ms wait in netdev_wait_allrefs().  Changing
      the rcu_barrier back to a synchronize_net is therefore safe.
      
      Since we only care about waiting for the rcu callbacks before we get
      to netdev_wait_allrefs() it is also safe to move the wait into
      netdev_run_todo.
      
      This was tested by creating and destroying 1000 tap devices and observing
      /proc/lock_stat.  /proc/lock_stat reports this change reduces the hold
      times of the rtnl_lock by a factor of 10.  There was no observable
      difference in the amount of time it takes to destroy a network device.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add skb frag size accessors · 9e903e08
      Eric Dumazet authored
      To ease skb->truesize sanitization, it's better to be able to localize
      all references to skb frags size.
      
      Define accessors: skb_frag_size() to fetch frag size, and
      skb_frag_size_{set|add|sub}() to manipulate it.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
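The accessor pattern named in the commit above is easy to illustrate outside the kernel. The sketch below is a toy model: `SkbFrag` is a hypothetical stand-in for the real fragment structure, and only the four accessor names come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class SkbFrag:
    """Hypothetical stand-in for a fragment descriptor."""
    size: int = 0


def skb_frag_size(frag):
    """Fetch the fragment size."""
    return frag.size


def skb_frag_size_set(frag, size):
    """Overwrite the fragment size."""
    frag.size = size


def skb_frag_size_add(frag, delta):
    """Grow the fragment by delta bytes."""
    frag.size += delta


def skb_frag_size_sub(frag, delta):
    """Shrink the fragment by delta bytes."""
    frag.size -= delta


frag = SkbFrag()
skb_frag_size_set(frag, 1500)
skb_frag_size_add(frag, 100)
skb_frag_size_sub(frag, 600)
print(skb_frag_size(frag))
```

Because callers no longer touch the size field directly, a later change to how the size is stored or accounted for only needs to update these four functions.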
    • net: allow vlan traffic to be received under bond · 2425717b
      John Fastabend authored
      The following configuration used to work as I expected. At least
      we could use the fcoe interfaces to do MPIO and the bond0 iface
      to do load balancing or failover.
      
             ---eth2.228-fcoe
             |
      eth2 -----|
                |
                |---- bond0
                |
      eth3 -----|
             |
             ---eth3.228-fcoe
      
      This worked because of a change we added to allow inactive slaves
      to rx 'exact' matches. This functionality was kept intact with the
      rx_handler mechanism. However now the vlan interface attached to the
      active slave never receives traffic because the bonding rx_handler
      updates the skb->dev and goto's another_round. Previously, the
      vlan_do_receive() logic was called before the bonding rx_handler.
      
      Now by the time vlan_do_receive calls vlan_find_dev() the
      skb->dev is set to bond0 and it is clear no vlan is attached
      to this iface. The vlan lookup fails.
      
      This patch moves the VLAN check above the rx_handler. A VLAN
      tagged frame is now routed to the eth2.228-fcoe iface in the
      above schematic. Untagged frames continue to the bond0 as
      normal. This case also remains intact,
      
      eth2 --> bond0 --> vlan.228
      
      Here the skb is VLAN tagged but the vlan lookup fails on eth2
      causing the bonding rx_handler to be called. On the second
      pass the vlan lookup is on the bond0 iface and completes as
      expected.
      
      Putting a VLAN.228 on both the bond0 and eth2 device will
      result in eth2.228 receiving the skb. I don't think this is
      completely unexpected and was the result prior to the rx_handler
      result.
      
      Note, the same setup is also used for other storage traffic that
      MPIO is used with eg. iSCSI and similar setups can be contrived
      without storage protocols.
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Acked-by: Jesse Gross <jesse@nicira.com>
      Reviewed-by: Jiri Pirko <jpirko@redhat.com>
      Tested-by: Hans Schillstrom <hams.schillstrom@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
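The lookup order introduced by the commit above (VLAN check first, rx_handler second, then another round) can be sketched as a small loop. This is a simplified model under stated assumptions: the function `receive` and the `vlans` and `master` dictionaries are hypothetical, and a tagged frame that matches no VLAN device is simply reported on the last device reached rather than being marked or dropped.

```python
def receive(dev, vid, vlans, master):
    """Return the name of the device that finally gets the frame.

    vlans:  {(real_dev, vlan_id): vlan_dev_name}
    master: {slave_dev: master_dev}, standing in for the bonding rx_handler
    """
    while True:
        # Step 1: VLAN check on the current device, before any rx_handler.
        if vid is not None:
            vlan_dev = vlans.get((dev, vid))
            if vlan_dev is not None:
                return vlan_dev
        # Step 2: rx_handler retargets the frame and we go another round.
        if dev in master:
            dev = master[dev]
            continue
        return dev


master = {"eth2": "bond0", "eth3": "bond0"}

fcoe = {("eth2", 228): "eth2.228-fcoe", ("eth3", 228): "eth3.228-fcoe"}
print(receive("eth2", 228, fcoe, master))   # tagged frame
print(receive("eth2", None, fcoe, master))  # untagged frame

on_bond = {("bond0", 228): "vlan.228"}
print(receive("eth2", 228, on_bond, master))
```

With a VLAN 228 device on both eth2 and bond0, the first lookup already succeeds on eth2, which matches the behaviour the commit message describes for that configuration.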
  8. 03 Oct, 2011 1 commit
  9. 28 Sep, 2011 1 commit
  10. 15 Sep, 2011 2 commits
    • net: consolidate and fix ethtool_ops->get_settings calling · 4bc71cb9
      Jiri Pirko authored
      This patch does several things:
      - introduces __ethtool_get_settings which is called from ethtool code and
        from drivers as well. Put ASSERT_RTNL there.
      - dev_ethtool_get_settings() is replaced by __ethtool_get_settings()
      - changes calling in drivers so rtnl locking is respected. Previously,
        iboe_get_rate called ->get_settings() unlocked; this fixes it.
        prb_calc_retire_blk_tmo() in af_packet.c had the same problem, which
        is also fixed by calling __dev_get_by_index() instead of
        dev_get_by_index() and holding rtnl_lock for both calls.
      - introduces rtnl_lock in bnx2fc_vport_create() and fcoe_vport_create()
        so bnx2fc_if_create() and fcoe_if_create() are called locked as they
        are from other places.
      - use __ethtool_get_settings() in bonding code
      Signed-off-by: Jiri Pirko <jpirko@redhat.com>
      
      v2->v3:
      	-removed dev_ethtool_get_settings()
      	-added ASSERT_RTNL into __ethtool_get_settings()
      	-prb_calc_retire_blk_tmo - use __dev_get_by_index() and lock
      	 around it and __ethtool_get_settings() call
      v1->v2:
              add missing export_symbol
      Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> [except FCoE bits]
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: copy userspace buffers on device forwarding · 48c83012
      Michael S. Tsirkin authored
      dev_forward_skb loops an skb back into host networking
      stack which might hang on the memory indefinitely.
      In particular, this can happen in macvtap in bridged mode.
      Copy the userspace fragments to avoid blocking the
      sender in that case.
      
      As this patch makes skb_copy_ubufs extern now,
      I also added some documentation and made it clear
      the SKBTX_DEV_ZEROCOPY flag automatically instead
      of doing it in all callers. This can be made into a separate
      patch if people feel it's worth it.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 25 Aug, 2011 1 commit
  12. 24 Aug, 2011 1 commit
  13. 23 Aug, 2011 1 commit
  14. 22 Aug, 2011 1 commit
  15. 19 Aug, 2011 2 commits
  16. 18 Aug, 2011 6 commits
  17. 12 Aug, 2011 1 commit
  18. 02 Aug, 2011 1 commit
  19. 25 Jul, 2011 1 commit
  20. 14 Jul, 2011 2 commits
  21. 06 Jul, 2011 1 commit
  22. 01 Jul, 2011 1 commit
    • rtnl: provide link dump consistency info · 4e985ada
      Thomas Graf authored
      This patch adds a change sequence counter to each net namespace
      which is bumped whenever a netdevice is added or removed from
      the list. If such a change occurred while a link dump took place,
      the dump will have the NLM_F_DUMP_INTR flag set in the first
      message which has been interrupted and in all subsequent messages
      of the same dump.
      
      Note that links may still be modified or renamed while a dump is
      taking place but we can guarantee for userspace to receive a
      complete list of links and not miss any.
      
      Testing:
      I have added 500 VLAN netdevices to make sure the dump is split
      over multiple messages. Then while continuously dumping links in
      one process I also continuously deleted and re-added a dummy
      netdevice in another process. Multiple dumps per second have
      had the NLM_F_DUMP_INTR flag set.
      
      I guess we can wait for Johannes' patch to hit net-next via the
      wireless tree.  I just wanted to give this some testing right away.
      Signed-off-by: Thomas Graf <tgraf@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
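The change sequence counter described in the commit above can be modelled as follows. This is a rough sketch under stated assumptions: the `NetNamespace` class, its `dev_base_seq` attribute, `dump_links`, and the `between_chunks` hook are hypothetical names for illustration, and the model compares against the sequence number seen at the start of the dump so that the flag stays set for all later messages, as the commit message describes.

```python
NLM_F_DUMP_INTR = 0x10  # netlink flag: dump was inconsistent due to a change


class NetNamespace:
    """Toy namespace with a link list and a change sequence counter."""

    def __init__(self, links):
        self.links = list(links)
        self.dev_base_seq = 1

    def add_link(self, name):
        self.links.append(name)
        self.dev_base_seq += 1  # bumped on every list change

    def del_link(self, name):
        self.links.remove(name)
        self.dev_base_seq += 1


def dump_links(ns, chunk_size, between_chunks=None):
    """Dump links in chunks; returns a list of (flags, names) messages."""
    start_seq = ns.dev_base_seq
    messages = []
    pos = 0
    while pos < len(ns.links):
        chunk = ns.links[pos:pos + chunk_size]
        flags = NLM_F_DUMP_INTR if ns.dev_base_seq != start_seq else 0
        messages.append((flags, chunk))
        pos += len(chunk)
        if between_chunks is not None:
            between_chunks(ns)  # simulate concurrent changes
    return messages


def churn(ns):
    """Add a dummy device once, while the dump is in progress."""
    if "dummy0" not in ns.links:
        ns.add_link("dummy0")


ns = NetNamespace(["lo", "eth0", "eth1", "eth2"])
msgs = dump_links(ns, 2, churn)
print([flags for flags, _ in msgs])
```

A user-space consumer that sees NLM_F_DUMP_INTR on any message would typically discard the partial result and restart the dump.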
  23. 22 Jun, 2011 1 commit
  24. 11 Jun, 2011 1 commit
    • vlan: Fix the ingress VLAN_FLAG_REORDER_HDR check · 0b5c9db1
      Jiri Pirko authored
      Testing of VLAN_FLAG_REORDER_HDR does not belong in vlan_untag
      but rather in vlan_do_receive.  Otherwise the vlan header
      will not be properly put on the packet in the case of
      vlan header acceleration.
      
      As we remove the check from vlan_check_reorder_header
      rename it vlan_reorder_header to keep the naming clean.
      
      Fix up the skb->pkt_type early so we don't look at the packet
      after adding the vlan tag, which guarantees we don't goof
      and look at the wrong field.
      
      Use a simple if statement instead of a complicated switch
      statement to decide that we need to increment rx_stats
      for a multicast packet.
      
      Hopefully at some point we will just declare the case where
      VLAN_FLAG_REORDER_HDR is cleared as unsupported and remove
      the code.  Until then this keeps it working correctly.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Jiri Pirko <jpirko@redhat.com>
      Acked-by: Changli Gao <xiaosuo@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 08 Jun, 2011 1 commit
    • v2 ethtool: remove support for ETHTOOL_GRXNTUPLE · bff55273
      Alexander Duyck authored
      This change is meant to remove all support for displaying an ntuple as
      strings via ETHTOOL_GRXNTUPLE.  The reason for this change is that
      multiple issues have been found, including:
       - Multiple buffer overruns for strings being displayed.
       - Incorrect filters displayed; cleared filters with ring of -2 are displayed.
       - Setting get_rx_ntuple displays no rules if defined.
       - Endianness wrong on displayed values.
       - Hard limit of 1024 filters makes display functionality extremely limited.
      
      The only driver that had supported this interface was ixgbe.  Since it no
      longer uses the interface and due to the issues mentioned above I am
      submitting this patch to remove it.
      
      v2:
      Updated based on comments from Ben Hutchings
       - Left ETH_SS_NTUPLE_FILTERS in code but commented on it being deprecated
       - Removed ethtool_rx_ntuple_list and ethtool_rx_ntuple_flow_spec_container
       - Left ETHTOOL_GRXNTUPLE but commented it as deprecated
      
      Also cleaned up set_rx_ntuple since there is no flow spec container to
      maintain we can drop all the code for the alloc and free of it and just
      return ops->set_rx_ntuple().
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Acked-by: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  26. 07 Jun, 2011 1 commit
    • net: cpu offline cause napi stall · 264524d5
      Heiko Carstens authored
      Frank Blaschka reported :
      <quote>
        During heavy network load we turn off/on cpus.
        Sometimes this causes a stall on the network device.
        Digging into the dump I found out following:
      
        napi is scheduled but does not run. From the I/O buffers
        and the napi state I see napi/rx_softirq processing has stopped
        because the budget was reached. napi stays in the
        softnet_data poll_list and the rx_softirq was raised again.
      
        I assume at this time the cpu offline comes in,
        the rx softirq is raised/moved to another cpu but napi stays in the
        poll_list of the softnet_data of the now offline cpu.
      
        Reviewing dev_cpu_callback (net/core/dev.c) I did not find the
        poll_list is transfered to the new cpu.
      </quote>
      
      This patch is a straightforward implementation of Frank's suggestion:
      
      Transfer poll_list and trigger NET_RX_SOFTIRQ on new cpu.
      Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
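The fix described in the commit above, moving pending NAPI work off a CPU that has gone offline, can be sketched with a toy per-CPU structure. The `SoftnetData` class, the `rx_softirq_pending` flag, and the `cpu_dead` function are hypothetical simplifications for illustration and do not reproduce the kernel's actual data structures.

```python
class SoftnetData:
    """Toy per-CPU state: pending NAPI instances and a softirq marker."""

    def __init__(self):
        self.poll_list = []
        self.rx_softirq_pending = False


def cpu_dead(softnet, dead_cpu, this_cpu):
    """Move the dead CPU's poll_list to this CPU and raise NET_RX there."""
    old = softnet[dead_cpu]
    new = softnet[this_cpu]
    if old.poll_list:
        new.poll_list.extend(old.poll_list)  # keep scheduling order
        old.poll_list.clear()
        new.rx_softirq_pending = True  # stands in for raising NET_RX_SOFTIRQ


softnet = {0: SoftnetData(), 1: SoftnetData()}
softnet[1].poll_list.append("napi-eth0")  # scheduled, budget exhausted
cpu_dead(softnet, dead_cpu=1, this_cpu=0)
print(softnet[0].poll_list, softnet[0].rx_softirq_pending)
```

Without the transfer step the scheduled NAPI instance would remain on a list that no CPU ever polls again, which is the stall reported above.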