1. 24 Aug, 2010 23 commits
    • mlx4_en: Fixed incorrect unmapping on RX flow. · 69351a29
      Yevgeny Petrilin authored
      When allocating new fragments to replace the ones that would be passed to the stack,
      the fragments that should be replaced are the ones that were already used.
      Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tc: add meta match on receive hash · c2e3143e
      Stephen Hemminger authored
      Trivial extension to existing meta data match rules to allow
      matching on skb receive hash value.
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • include/linux/if_ether.h: Remove unused #define MAC_FMT · 5a46790c
      Joe Perches authored
      Last use was removed, so remove the #define.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip_append_data() optim · ec550d24
      Eric Dumazet authored
      Compiler is not smart enough to avoid a conditional branch.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bcm63xx_enet: use netdev stats · c32d83c0
      Eric Dumazet authored
      Use integrated net_device stats instead of a private one

      Get rid of bcm_enet_get_stats()
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethoc: get rid of ethoc_stats() · 6abc2376
      Eric Dumazet authored
      drivers can avoid implementing ndo_get_stats method if using netdevice
      stats structure.
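
A minimal user-space sketch of the idea, not kernel code: the structures below are heavily reduced, hypothetical stand-ins for `struct net_device` and `struct net_device_ops`, and `get_stats()` and `driver_rx_one()` are illustrative names. It shows why a driver that counts directly into the integrated stats structure does not need its own `ndo_get_stats` hook: the lookup falls back to `dev->stats` when the hook is absent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct net_device_stats {
    unsigned long rx_packets;
    unsigned long tx_packets;
};

struct net_device;

struct net_device_ops {
    /* Optional hook; a driver may leave it NULL. */
    struct net_device_stats *(*ndo_get_stats)(struct net_device *dev);
};

struct net_device {
    const struct net_device_ops *netdev_ops;
    struct net_device_stats stats;   /* integrated counters */
};

/* Core-side lookup: use the driver hook if present, else dev->stats. */
static const struct net_device_stats *get_stats(struct net_device *dev)
{
    assert(dev != NULL);
    if (dev->netdev_ops && dev->netdev_ops->ndo_get_stats)
        return dev->netdev_ops->ndo_get_stats(dev);
    return &dev->stats;
}

/* Driver receive path: bump the integrated counter directly. */
static void driver_rx_one(struct net_device *dev)
{
    dev->stats.rx_packets++;
}
```

With the hook left NULL, a caller of `get_stats()` sees exactly the counters the driver updated, so a private copy and a custom accessor add nothing.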
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: get rid of be_get_stats() · b4ddf4b3
      Eric Dumazet authored
      drivers can avoid implementing ndo_get_stats method if using netdevice
      stats structure.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: increase the size of priv_flags and add IFF_OVS_DATAPATH · 1726442e
      Simon Horman authored
      IFF_OVS_DATAPATH is a place-holder for the Open vSwitch datapath
      which I am preparing to submit for merging.
      
      As all 16 bits of priv_flags are already assigned flags, also increase
      the size of priv_flags to 32 bits.
      
      Unfortunately, by my calculations this increases the size of
      struct net_device by 4 bytes on 32bit architectures and
      8 bytes on 64 bit architectures. I couldn't see an obvious
      way to avoid that.
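
One way such growth can arise is alignment padding. The sketch below uses a hypothetical four-member layout, not the real `struct net_device`, and assumes a common ABI with 2-byte `short`, 4-byte `int` and naturally aligned pointers. Widening the 16-bit field pushes it out of the slot it shared with its neighbour, and the following pointer-aligned member moves up by one pointer size.

```c
#include <stddef.h>

/* Hypothetical excerpt, NOT the real struct net_device layout. */
struct flags_before {
    unsigned int   flags;
    unsigned short gflags;
    unsigned short priv_flags;   /* 16 bits: shares a 4-byte slot with gflags */
    void          *next_member;  /* stands in for a following pointer-aligned field */
};

struct flags_after {
    unsigned int   flags;
    unsigned short gflags;       /* now followed by 2 bytes of padding */
    unsigned int   priv_flags;   /* 32 bits: needs its own aligned 4-byte slot */
    void          *next_member;
};

/* Growth in bytes caused by widening priv_flags in this toy layout. */
static size_t priv_flags_growth(void)
{
    return sizeof(struct flags_after) - sizeof(struct flags_before);
}
```

Under those assumptions the toy structure grows by `sizeof(void *)`, that is 4 bytes on a typical 32-bit ABI and 8 bytes on a typical 64-bit one, the same pattern the message describes.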
      
      Cc: Jesse Gross <jesse@nicira.com>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: allow non-netadmin to query settings · 0fdc100b
      stephen hemminger authored
      The SNMP daemon uses ethtool to determine the speed of
      network interfaces. This fails on Debian (and probably elsewhere)
      because for security SNMP daemon runs as non-root user (snmp).
      
      Note: A similar patch was rejected previously because of a concern about
      the possibility that on some hardware querying the ethtool settings
      requires access to the PHY and could slow the machine down.  But the
      security risk of requiring SNMP daemon (and related services)
      to run as root far outweighs the risk of denial-of-service.
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: copy_rtnl_link_stats64() simplification · afdcba37
      Eric Dumazet authored
      No need to use a temporary struct rtnl_link_stats64 variable,
      just copy the source to skb buffer.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Changli Gao's avatar
      0eec32ff
    • pkt_sched: Make act_csum depend upon INET. · 7abac686
      David S. Miller authored
      It uses ip_send_check() and stuff like that.
      Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: update PCI ids · ccea790e
      Dimitris Michailidis authored
      Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Dimitris Michailidis's avatar
    • cxgb4: support eeprom read/write on functions other than 0 · 1478b3ee
      Dimitris Michailidis authored
      Extend the address translation for eeprom read/write (code used by
      ethtool -[eE]) to functions other than 0.
      Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: handle Rx/Tx queue ranges not starting at 0 · e46dab4d
      Dimitris Michailidis authored
      Currently the driver assumes that queue IDs start at 0 but that's true
      only for function 0.  To support operation on other functions get the
      start of the queue ranges from FW and offset accordingly.
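
The offsetting described here can be sketched as follows. The `queue_range` structure and `qid_to_index()` are hypothetical names for illustration, not the cxgb4 driver's own; the sketch only assumes that firmware reports the first queue ID and the number of queues owned by a function.

```c
#include <assert.h>

/* Hypothetical description of a queue range handed out by firmware. */
struct queue_range {
    unsigned int start;   /* first absolute queue ID owned by this function */
    unsigned int count;   /* number of queues in the range */
};

/*
 * Convert an absolute queue ID into an index into a per-function array.
 * Returns -1 if the ID lies outside the range.
 */
static int qid_to_index(const struct queue_range *r, unsigned int abs_qid)
{
    assert(r);
    if (abs_qid < r->start || abs_qid - r->start >= r->count)
        return -1;
    return (int)(abs_qid - r->start);
}
```

For function 0 the range would start at 0 and the subtraction is a no-op, which is why code that assumes zero-based IDs only works there.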
      Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bna: Delete get_flags and set_flags ethtool methods. · f04b4dd2
      David S. Miller authored
      This driver doesn't support LRO, NTUPLE, or the RXHASH
      features.  So it should not set these ethtool operations.
      
      This also fixes the warning:
      
      drivers/net/bna/bnad_ethtool.c:1272: warning: initialization from incompatible pointer type
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bna: Brocade 10Gb Ethernet device driver · 8b230ed8
      Rasesh Mody authored
      This is patch 1/6 which contains linux driver source for
      Brocade's BR1010/BR1020 10Gb CEE capable ethernet adapter.
      Signed-off-by: Debashis Dutt <ddutt@brocade.com>
      Signed-off-by: Rasesh Mody <rmody@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp ccid-2: Replace broken RTT estimator with better algorithm · 231cc2aa
      Gerrit Renker authored
      The current CCID-2 RTT estimator code is in parts broken and lags behind the
      suggestions in RFC2988 of using scaled variants for SRTT/RTTVAR.
      
      That code is replaced by the present patch, which reuses the Linux TCP RTT
      estimator code.
      
      Further details:
      ----------------
       1. The minimum RTO of previously one second has been replaced with TCP's, since
          RFC4341, sec. 5 says that the minimum of 1 sec. (suggested in RFC2988, 2.4)
          is not necessary. Instead, the TCP_RTO_MIN is used, which agrees with DCCP's
          concept of a default RTT (RFC 4340, 3.4).
       2. The maximum RTO has been set to DCCP_RTO_MAX (64 sec), which agrees with
          RFC2988, (2.5).
       3. De-inlined the function ccid2_new_ack().
       4. Added a FIXME: the RTT is sampled several times per Ack Vector, which will
          give the wrong estimate. It should be replaced with one sample per Ack.
          However, at the moment this can not be resolved easily, since
          - it depends on TX history code (which also needs some work),
          - the cleanest solution is not to use the `sent' time at all (saves 4 bytes
            per entry) and use DCCP timestamps / elapsed time to estimate the RTT,
            which however is non-trivial to get right (but needs to be done).
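
For orientation, the scaled SRTT/RTTVAR bookkeeping that RFC 2988 suggests can be sketched as below. This is a plain integer version of the RFC's update rules (gains 1/8 and 1/4), not the Linux TCP estimator the patch reuses, and it omits the clock-granularity term. All names are hypothetical. The bounds are assumptions in milliseconds: 64000 mirrors the 64 sec maximum mentioned above, and 200 is an assumed stand-in for TCP_RTO_MIN.

```c
#include <assert.h>

/* Assumed bounds in milliseconds (see the lead-in); not taken from the patch. */
#define SKETCH_RTO_MIN_MS 200L
#define SKETCH_RTO_MAX_MS 64000L

struct rtt_est {
    long srtt8;     /* smoothed RTT, stored scaled by 8 */
    long rttvar4;   /* RTT variance, stored scaled by 4 */
};

/* Feed one RTT sample m (in ms) into the estimator. */
static void rtt_sample(struct rtt_est *e, long m)
{
    assert(m > 0);
    if (e->srtt8 == 0) {
        /* First sample: SRTT = m, RTTVAR = m / 2. */
        e->srtt8 = m * 8;
        e->rttvar4 = m * 2;
    } else {
        long err = m - e->srtt8 / 8;   /* sample minus old SRTT */

        e->srtt8 += err;               /* SRTT += err / 8, in scaled form */
        if (err < 0)
            err = -err;
        err -= e->rttvar4 / 4;         /* |err| minus old RTTVAR */
        e->rttvar4 += err;             /* RTTVAR += (|err| - RTTVAR) / 4 */
    }
}

/* RTO = SRTT + 4 * RTTVAR, clamped to the assumed bounds. */
static long rtt_rto(const struct rtt_est *e)
{
    long rto = e->srtt8 / 8 + e->rttvar4;

    if (rto < SKETCH_RTO_MIN_MS)
        rto = SKETCH_RTO_MIN_MS;
    if (rto > SKETCH_RTO_MAX_MS)
        rto = SKETCH_RTO_MAX_MS;
    return rto;
}
```

Keeping the state pre-multiplied by 8 and 4 lets both updates run with integer adds and divisions by powers of two, which is the point of the scaled variants.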
      
      Reasons for reusing the Linux TCP estimator algorithm:
      ------------------------------------------------------
      Some time was spent to find a better alternative, using basic RFC2988 as a first
      step. Further analysis and experimentation showed that the Linux TCP RTO
      estimator is superior to a basic RFC2988 implementation. A summary is on
      http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/ccid2/rto_estimator/
      
      In addition, this estimator fared well in a recent empirical evaluation:
      
          Rewaskar, Sushant, Jasleen Kaur and F. Donelson Smith.
          A Performance Study of Loss Detection/Recovery in Real-world TCP
          Implementations. Proceedings of 15th IEEE International
          Conference on Network Protocols (ICNP-07), 2007.
      
      Thus there is significant benefit in reusing the existing TCP code.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp ccid-2: Simplify dec_pipe and rearming of RTO timer · c38c92a8
      Gerrit Renker authored
      This removes the dec_pipe function and improves the way the RTO timer is rearmed
      when a new acknowledgment comes in.
      
      Details and justification for removal:
      --------------------------------------
       1) The BUG_ON in dec_pipe is never triggered: pipe is only decremented for TX
          history entries between tail and head, for which it had previously been
          incremented in tx_packet_sent; and it is not decremented twice for the same
          entry, since it is
          - either decremented when a corresponding Ack Vector cell in state 0 or 1
            was received (and then ccid2s_acked==1),
          - or it is decremented when ccid2s_acked==0, as part of the loss detection
            in tx_packet_recv (and hence it can not have been decremented earlier).
      
       2) Restarting the RTO timer happens for every single entry in each Ack Vector
          parsed by tx_packet_recv (according to RFC 4340, 11.4 this can happen up to
          16192 times per Ack Vector).
      
       3) The RTO timer should not be restarted when all outstanding data has been
          acknowledged. This is currently done similar to (2), in dec_pipe, when
          pipe has reached 0.
      
      The patch consolidates the code which rearms the RTO timer, combining the
      segments from new_ack and dec_pipe. As a result, the code becomes clearer
      (compare with tcp_rearm_rto()).
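
The consolidated rearming logic can be pictured roughly as follows. This is a user-space sketch under assumptions: `struct tx_sketch` and `rearm_rto()` are hypothetical names, and a boolean plus an expiry value stand in for a real kernel timer. The function is meant to run once after an acknowledgment has been fully processed, rather than once per Ack Vector entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced TX socket state. */
struct tx_sketch {
    unsigned int  pipe;           /* packets currently in flight */
    unsigned long rto;            /* current retransmission timeout */
    bool          timer_armed;    /* stands in for a pending kernel timer */
    unsigned long timer_expires;  /* absolute expiry time when armed */
};

/* Call once per processed Ack: stop the timer if nothing is outstanding,
 * otherwise restart it one RTO from now. */
static void rearm_rto(struct tx_sketch *hc, unsigned long now)
{
    assert(hc);
    if (hc->pipe == 0) {
        hc->timer_armed = false;
    } else {
        hc->timer_armed = true;
        hc->timer_expires = now + hc->rto;
    }
}
```

Putting both decisions in one place covers points (2) and (3): the timer is touched once per Ack, and it is not left running when `pipe` has dropped to 0.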
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp ccid-2: Remove redundant sanity tests · 30564e35
      Gerrit Renker authored
      This removes the ccid2_hc_tx_check_sanity function: it is redundant.
      
      Details:
      
      The tx_check_sanity function performs three tests:
       1) it checks that the circular TX list is sorted
          - in ascending order of sequence number (ccid2s_seq)
          - and time (ccid2s_sent),
          - in the direction from `tail' (hctx_seqt) to `head' (hctx_seqh);
       2) it ensures that the entire list has the length seqbufc * CCID2_SEQBUF_LEN;
       3) it ensures that pipe equals the number of packets that were not
          marked `acked' (ccid2s_acked) between `tail' and `head'.
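
To make the three tests concrete, here is a user-space sketch of what such checks compute on a circular list. It is not the removed kernel function: `struct seq_entry` and the helper names are hypothetical, the fields only mirror ccid2s_seq, ccid2s_sent and ccid2s_acked, sequence-number wraparound is ignored, and "ascending" time is taken to mean non-decreasing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical TX history entry on a circular singly linked list. */
struct seq_entry {
    unsigned long long seq;    /* cf. ccid2s_seq */
    unsigned long      sent;   /* cf. ccid2s_sent */
    bool               acked;  /* cf. ccid2s_acked */
    struct seq_entry  *next;
};

/* Test 1: live entries from tail up to (not including) head are sorted. */
static bool history_sorted(const struct seq_entry *tail,
                           const struct seq_entry *head)
{
    const struct seq_entry *p;

    for (p = tail; p != head && p->next != head; p = p->next)
        if (p->seq >= p->next->seq || p->sent > p->next->sent)
            return false;
    return true;
}

/* Test 2: total ring length, to compare with seqbufc * CCID2_SEQBUF_LEN. */
static size_t ring_length(const struct seq_entry *start)
{
    const struct seq_entry *p = start;
    size_t n = 0;

    assert(start);
    do {
        n++;
        p = p->next;
    } while (p != start);
    return n;
}

/* Test 3: number of live entries not marked acked, to compare with pipe. */
static unsigned int count_unacked(const struct seq_entry *tail,
                                  const struct seq_entry *head)
{
    const struct seq_entry *p;
    unsigned int n = 0;

    for (p = tail; p != head; p = p->next)
        if (!p->acked)
            n++;
    return n;
}
```

Each helper walks the list on every call, which is the cost the redundancy argument makes unnecessary.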
      
      The following argues that each of these tests is redundant; this can be verified
      by going through the code.
      
      (1) is not necessary, since both time and GSS increase from one packet to the
      next, so that subsequent insertions in tx_packet_sent (which advance the `head'
      pointer) will be in ascending order of time and sequence number.
      
      In (2), the length of the list is always equal to seqbufc times CCID2_SEQBUF_LEN
      (set to 1024) unless allocation caused an earlier failure, because:
       * at initialisation (tx_init), there is one chunk of size 1024 and seqbufc=1;
       * subsequent calls to tx_alloc_seq take place whenever head->next == tail in
         tx_packet_sent; then a new chunk of size 1024 is inserted between head and
         tail, and seqbufc is incremented by one.
      
      To show that (3) is redundant requires looking at two cases.
      
      The `pipe' variable of the TX socket is incremented only in tx_packet_sent, and
      decremented in tx_packet_recv.  When head == tail (TX history empty) then pipe
      should be 0, which is the case directly after initialisation and after a
      retransmission timeout has occurred (ccid2_hc_tx_rto_expire).
      
      The first case involves parsing Ack Vectors for packets recorded in the live
      portion of the buffer, between tail and head. For each packet marked by the
      receiver as received (state 0) or ECN-marked (state 1), pipe is decremented by
      one, so for all such packets the BUG_ON in tx_check_sanity will not trigger.
      
      The second case is the loss detection in the second half of tx_packet_recv,
      below the comment "Check for NUMDUPACK".
      
      The first while-loop here ensures that the sequence number of `seqp' is either
      above or equal to `high_ack', or otherwise equal to the highest sequence number
      sent so far (of the entry head->prev, as head points to the next unsent entry).
      The next while-loop ("while (1)") counts the number of acked packets starting
      from that position of seqp, going backwards in the direction from head->prev to
      tail. If NUMDUPACK=3 such packets were counted within this loop, `seqp' points
      to the last acknowledged packet of these, and the "if (done == NUMDUPACK)" block
      is entered next.
      The while-loop contained within that block in turn traverses the list backwards,
      from head to tail; the position of `seqp' is saved in the variable `last_acked'.
      For each packet not marked as `acked', a congestion event is triggered within
      the loop, and pipe is decremented. The loop terminates when `seqp' has reached
      `tail', whereupon tail is set to the position previously stored in `last_acked'.
      Thus, between `last_acked' and the previous position of `tail',
       - pipe has been decremented earlier if the packet was marked as state 0 or 1;
       - pipe was decremented if the packet was not marked as acked.
      That is, pipe has been decremented by the number of packets between `last_acked'
      and the previous position of `tail'. As a consequence, pipe now again reflects
      the number of packets which have not (yet) been acked between the new position
      of tail (at `last_acked') and head->prev, or 0 if head==tail. The result is that
      the BUG_ON condition in check_sanity will also not be triggered, hence the test
      (3) is also redundant.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp ccid-3: No more CCID control blocks in LISTEN state · 51c22bb5
      Gerrit Renker authored
      The CCIDs are activated as the last of the features, at the end of the handshake,
      where the LISTEN state of the master socket is inherited into the server
      state of the child socket. Thus, the only states visible to CCIDs now are
      OPEN/PARTOPEN, and the closing states.
      
      This allows removing tests which were previously necessary to protect
      against referencing a socket in the listening state (in CCID-3), but which
      now have become redundant.
      
      As a further byproduct of enabling the CCIDs only after the connection has been
      fully established, several typecast-initialisations of ccid3_hc_{rx,tx}_sock
      can now be eliminated:
       * the CCID is loaded, so it is not necessary to test if it is NULL,
       * if it is possible to load a CCID and leave the private area NULL, then this
          is a bug, which should crash loudly - and earlier,
       * the test for state==OPEN || state==PARTOPEN now reduces only to the closing
         phase (e.g. when the node has received an unexpected Reset).
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ccid: ccid-2/3 code cosmetics · 67b67e36
      Gerrit Renker authored
      This patch collects cosmetics-only changes to separate these from
      code changes:
       * update with regard to CodingStyle and whitespace changes,
       * documentation:
         - adding/revising comments,
         - remove CCID-3 RX socket documentation which is either
           duplicate or refers to fields that no longer exist,
       * expand embedded tfrc_tx_info struct inline for consistency,
         removing indirections via #define.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 23 Aug, 2010 13 commits
  3. 22 Aug, 2010 4 commits