1. 04 Sep, 2008 26 commits
    • dccp: Insert feature-negotiation options into skb · 0ef118a0
      Gerrit Renker authored
      This patch replaces the earlier insertion routine from options.c, so that
      code specific to feature negotiation can remain in feat.c. This is possible
      by calling a function already existing in options.c.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Header option insertion routine for feature-negotiation · cf9ddf73
      Gerrit Renker authored
      The patch extends existing code:
       * Confirm options divide into the confirmed value plus an optional preference
         list for SP values. Previously only the preference list was echoed for SP
         values, now the confirmed value is added as per RFC 4340, 6.1;
       * length and sanity checks are added to avoid illegal memory (or NULL) access.
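
      As a rough illustration of the Confirm layout described above, the following
      userspace sketch builds such an option for an SP feature. The function name and
      signature are hypothetical and not taken from the patch; the option type
      numbers are the ones assigned in RFC 4340, 5.8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { OPT_CONFIRM_L = 33, OPT_CONFIRM_R = 35 };	/* RFC 4340, 5.8 */

/*
 * Build a Confirm option for an SP feature into `out` (capacity `cap`).
 * Layout: type, length, feature number, confirmed value, [preference list].
 * Returns the option length in bytes, or 0 if the input does not fit.
 */
size_t sketch_build_confirm(uint8_t *out, size_t cap, uint8_t type,
			    uint8_t feat, uint8_t confirmed,
			    const uint8_t *pref, size_t pref_len)
{
	size_t len = 4 + pref_len;	/* type + length + feature + value */

	assert(out != NULL);
	if (pref == NULL && pref_len > 0)
		return 0;		/* avoid a NULL access */
	if (len > 255 || len > cap)
		return 0;		/* length byte or buffer too small */

	out[0] = type;
	out[1] = (uint8_t)len;
	out[2] = feat;
	out[3] = confirmed;
	if (pref_len > 0)
		memcpy(out + 4, pref, pref_len);
	return len;
}
```

      Called with type OPT_CONFIRM_R, feature 1 (CCID), confirmed value 3 and the
      preference list {3, 2}, the sketch writes the six bytes 35, 6, 1, 3, 3, 2.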
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Support for Mandatory options · d0440ee6
      Gerrit Renker authored
      Support for Mandatory options is provided by this patch, which will
      be used by subsequent feature-negotiation patches.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • dccp: Increase the scope of variable-length htonl/ntohl functions · b9aaac1c
      Gerrit Renker authored
      This extends the scope of two available functions, encode|decode_value_var,
      to work up to 6 (8) bytes, to match maximum requirements in the RFC.
      
      These functions are going to be used both by general option processing and 
      feature negotiation code, hence declarations have been put into feat.h.
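
      A minimal userspace sketch of such variable-length conversion follows. It
      assumes network (big-endian) byte order and a limit of 6 bytes; the names
      sketch_encode_var/sketch_decode_var are hypothetical and the code is not the
      kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the low `len` bytes of `value` to `to`, most significant byte first. */
void sketch_encode_var(uint64_t value, uint8_t *to, size_t len)
{
	assert(to != NULL && len >= 1 && len <= 6);
	for (size_t i = 0; i < len; i++)
		to[i] = (uint8_t)(value >> (8 * (len - 1 - i)));
}

/* Read `len` big-endian bytes from `from` into a 64-bit host value. */
uint64_t sketch_decode_var(const uint8_t *from, size_t len)
{
	uint64_t value = 0;

	assert(from != NULL && len >= 1 && len <= 6);
	for (size_t i = 0; i < len; i++)
		value = (value << 8) | from[i];
	return value;
}
```

      Encoding and then decoding with the same length returns the original value,
      provided the value fits into `len` bytes.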
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • dccp: API to query the current TX/RX CCID · c8041e26
      Gerrit Renker authored
      This provides a function to query the current TX/RX CCID dynamically, without
      reliance on the minisock value, using dynamic information available in the
      currently loaded CCID module.
      
      This query function is then used to 
       (a) provide the getsockopt part for getting/setting CCIDs via sockopts;
       (b) replace the current test for "which CCID is in use" in probe.c.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Set per-connection CCIDs via socket options · fade756f
      Gerrit Renker authored
      With this patch, TX/RX CCIDs can now be changed on a per-connection basis, which
      overrides the defaults set by the global sysctl variables for TX/RX CCIDs.
      
      To make full use of this facility, the remaining patches of this patch set are
      needed, which track dependencies and activate negotiated feature values.
      
      Note on the maximum number of CCIDs that can be registered:
      -----------------------------------------------------------
      The maximum number of CCIDs that can be registered on the socket is constrained
      by the space in a Confirm/Change feature negotiation option. 
      
      The space in these in turn depends on the size of header options as defined
      in RFC 4340, 5.8. Since this is a recurring constant, it has been moved from
      ackvec.h into linux/dccp.h, clarifying its purpose.
      
      Relative to this size, the maximum number of CCID identifiers that can be 
      present in a Confirm option (which always consumes 1 byte more than a Change
      option, cf. 6.1) is 2 bytes less than the maximum TLV size: one for the
      CCID-feature-type and one for the selected value.
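
      The arithmetic can be spelled out as follows. This is a sketch under two
      assumptions: that the one-byte length field of RFC 4340, 5.8 limits an option
      to 255 bytes, and that "maximum TLV size" above refers to the option data
      left after the type and length bytes. The constant and function names are
      made up for this sketch; the names used in linux/dccp.h may differ.

```c
#include <assert.h>
#include <stddef.h>

enum {
	SKETCH_OPT_TOTAL_MAX = 255,				/* 8-bit length field */
	SKETCH_OPT_DATA_MAX  = SKETCH_OPT_TOTAL_MAX - 2,	/* minus type, length */
	/* minus the feature-type byte and the selected (confirmed) value */
	SKETCH_MAX_SP_VALS   = SKETCH_OPT_DATA_MAX - 2
};

/* Return 1 if a preference list of `n` CCIDs still fits a Confirm option. */
int sketch_sp_list_fits(size_t n)
{
	return n >= 1 && n <= SKETCH_MAX_SP_VALS;
}
```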
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Tidy up setsockopt calls · 73bbe095
      Gerrit Renker authored
      This splits the setsockopt calls into two groups, depending on whether an
      integer argument (val) is required and whether routines being called do
      their own locking.
      
      Some options (such as setting the CCID) use u8 rather than int, so that for
      these the test with regard to integer-sizeof cannot be used.
      
      The second switch-case statement now only has those statements which need
      locking and which make use of `val'.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Reviewed-by: Eugene Teo <eugeneteo@kernel.sg>
    • dccp: Deprecate Ack Ratio sysctl · 17c30b40
      Gerrit Renker authored
      This patch deprecates the Ack Ratio sysctl, since
       * Ack Ratio is entirely ignored by CCID-3 and CCID-4,
       * Ack Ratio currently doesn't work in CCID-2 (i.e. is always set to 1);
       * even if it worked in CCID-2, there would be no point in a user changing it:
         - Ack Ratio is constrained by cwnd (RFC 4341, 6.1.2),
         - if Ack Ratio > cwnd, the system resorts to spurious RTO timeouts
           (since it waits for Acks which will never arrive in this window),
         - cwnd is not a user-configurable value.
      
      The only reasonable place for Ack Ratio is to print it for debugging. It is
      planned to do this later on, as part of e.g. dccp_probe.
      
      With this patch Ack Ratio is now under full control of feature negotiation:
       * Ack Ratio is resolved as a dependency of the selected CCID;
       * if the chosen CCID supports it (i.e. CCID == CCID-2), Ack Ratio is set to
         the default of 2, following RFC 4340, 11.3 - "New connections start with Ack
         Ratio 2 for both endpoints";
       * what happens then is part of another patch set, since it concerns the 
         dynamic update of Ack Ratio while the connection is in full flight.
      
      Thanks to Tomasz Grobelny for discussion leading up to this patch.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • dccp: Feature negotiation for minimum-checksum-coverage · 20f41eee
      Gerrit Renker authored
      This provides feature negotiation for server minimum checksum coverage
      which so far has been missing.
      
      Since sender/receiver coverage values range only from 0...15, their
      type has also been reduced in size from u16 to u4.
      
      Feature-negotiation options are now generated for both sender and receiver
      coverage, i.e. when the peer has `forgotten' to enable partial coverage
      then feature negotiation will automatically enable (negotiate) the partial
      coverage value for this connection.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Deprecate old setsockopt framework · 668144f7
      Gerrit Renker authored
      The previous setsockopt interface, which passed socket options via struct 
      dccp_so_feat, is complicated and difficult to use. Continuing to support it leads to
      ugly code since the old approach did not distinguish between NN and SP values.
      
      This patch removes the old setsockopt interface and replaces it with two new
      functions to register NN/SP values for feature negotiation. These are 
      essentially wrappers around the internal __feat_register functions, with 
      checking added to avoid
       * wrong usage (type);
       * changing values while the connection is in progress.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Mechanism to resolve CCID dependencies · d4c8741c
      Gerrit Renker authored
      This adds a hook to resolve features whose value depends on the choice of
      CCID. It is done at the server since it can only be done after the CCID
      values have been negotiated; i.e. the client will add its CCID preference
      list on the Change options sent in the Request, which will be reconciled
      with the local preference list of the server.
      
      The concept is documented on 
      http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/\
      				implementation_notes.html#ccid_dependencies
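
      The reconciliation of the two preference lists mentioned above can be
      sketched as follows, assuming the server-priority rule of RFC 4340, 6.3.1
      (the first entry of the server's list that also occurs in the client's list
      wins). The function name is hypothetical and the code is a userspace
      illustration, not the hook added by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the first value in the server's list that also occurs in the
 * client's list, or -1 if the two lists have no value in common.
 */
int sketch_reconcile_sp(const uint8_t *server, size_t slen,
			const uint8_t *client, size_t clen)
{
	assert(server != NULL || slen == 0);
	assert(client != NULL || clen == 0);

	for (size_t s = 0; s < slen; s++)
		for (size_t c = 0; c < clen; c++)
			if (server[s] == client[c])
				return server[s];
	return -1;
}
```

      Only once this value is known can CCID-dependent features be resolved, which
      is why the step is performed at the server.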
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Resolve dependencies of features on choice of CCID · 093e1f46
      Gerrit Renker authored
      This provides a missing link in the code chain, as several features implicitly
      depend on the choice of CCID. Most notably, this is the Send Ack Vector
      feature, but also Ack Ratio and Send Loss Event Rate (also taken care of).
      
      For Send Ack Vector, the situation is as follows:
       * since CCID2 mandates the use of Ack Vectors, there is no point in allowing
         endpoints which use CCID2 to disable Ack Vector features on such a connection;
      
       * a peer with a TX CCID of CCID2 will always expect Ack Vectors, and a peer
         with a RX CCID of CCID2 must always send Ack Vectors (RFC 4341, sec. 4);
      
       * for all other CCIDs, the use of (Send) Ack Vector is optional and thus
         negotiable. However, this implies that the code negotiating the use of Ack
         Vectors also supports it (i.e. is able to supply and to either parse or
         ignore received Ack Vectors). Since this is not the case (CCID-3 has no Ack
         Vector support), the use of Ack Vectors is here disabled, with a comment
         in the source code.
      
      An analogous consideration arises for the Send Loss Event Rate feature,
      since the CCID-3 implementation does not support the loss interval options
      of RFC 4342. To make such use explicit, corresponding feature-negotiation
      options are inserted which signal the use of the loss event rate option,
      as it is used by the CCID3 code.
      
      Lastly, the values of the Ack Ratio feature are matched to the choice of CCID.
      
      The patch implements this as a function which is called after the user has
      made all other registrations for changing default values of features.
      
      The table is variable-length; the reserved (and hence, for feature negotiation,
      invalid; cf. section 19.4 of RFC 4340) feature number `0'
      is used to mark the end of the table.
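
      A zero-terminated table of this kind might look like the sketch below. The
      struct, the table and the function names are hypothetical; the two entries
      only restate what is said above for CCID-2 (Send Ack Vector, feature number
      6, is required; Ack Ratio, feature number 5, starts at 2).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry type; the struct used by the patch may differ. */
struct sketch_dep {
	uint8_t  feat;	/* feature number; 0 is reserved and ends the table */
	uint64_t val;	/* value the feature takes for this CCID */
};

/* Illustrative dependency table for CCID-2. */
const struct sketch_dep sketch_ccid2_deps[] = {
	{ 6, 1 },	/* Send Ack Vector enabled */
	{ 5, 2 },	/* Ack Ratio default */
	{ 0, 0 },	/* end marker: reserved feature number 0 */
};

/* Count the entries by walking until the end marker is reached. */
size_t sketch_dep_count(const struct sketch_dep *table)
{
	size_t n = 0;

	assert(table != NULL);
	while (table[n].feat != 0)
		n++;
	return n;
}

/* Store the value for `feat` in `*val`; return 0 if found, -1 otherwise. */
int sketch_dep_lookup(const struct sketch_dep *table, uint8_t feat,
		      uint64_t *val)
{
	assert(table != NULL && val != NULL);
	for (; table->feat != 0; table++) {
		if (table->feat == feat) {
			*val = table->val;
			return 0;
		}
	}
	return -1;
}
```

      Using the reserved number as end marker means no separate length field has
      to be kept in step with the table.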
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Query supported CCIDs · 71bb4959
      Gerrit Renker authored
      This provides a data structure to record which CCIDs are locally supported
      and three accessor functions:
       - a test function for internal use which is used to validate CCID requests
         made by the user;
       - a copy function so that the list can be used for feature-negotiation;   
       - documented getsockopt() support so that the user can query capabilities.
      
      The data structure is a table which is filled in at compile-time with the
      list of available CCIDs (which in turn depends on the Kconfig choices).
      
      Using the copy function for cloning the list of supported CCIDs is useful for
      feature negotiation, since the negotiation is now with the full list of available
      CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation 
      will not fail if the peer requests to use CCID3 instead of CCID2. 
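
      The test and copy accessors could be modelled in userspace roughly as below.
      The table contents {2, 3} are an assumption standing in for whatever the
      Kconfig choices enable, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed build-time list; in the kernel it depends on the Kconfig choices. */
static const uint8_t sketch_ccids[] = { 2, 3 };

/* Test function: return 1 if `id` is a locally supported CCID, else 0. */
int sketch_ccid_is_supported(uint8_t id)
{
	for (size_t i = 0; i < sizeof(sketch_ccids); i++)
		if (sketch_ccids[i] == id)
			return 1;
	return 0;
}

/*
 * Copy function: copy the whole list into `dst` (capacity `cap`).
 * Returns the number of CCIDs copied, or 0 if `dst` is too small.
 */
size_t sketch_ccid_copy(uint8_t *dst, size_t cap)
{
	assert(dst != NULL);
	if (cap < sizeof(sketch_ccids))
		return 0;
	memcpy(dst, sketch_ccids, sizeof(sketch_ccids));
	return sizeof(sketch_ccids);
}
```

      The copied list is what would then be registered as the local preference
      list for negotiation.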
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Registration routines for changing feature values · 86349c8d
      Gerrit Renker authored
      Two registration routines, for SP and NN features, are provided by this patch,
      replacing a previous routine which was used for both feature types.
      
      These are internal-only routines and therefore start with `__feat_register'.
      
      It further exports the known limits of Sequence Window and Ack Ratio as symbolic
      constants.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Limit feature negotiation to connection setup phase · 5591d286
      Gerrit Renker authored
      This patch starts the new implementation of feature negotiation:
       1. Although it is theoretically possible to perform feature negotiation at any
          time (and RFC 4340 supports this), in practice this is prohibitively complex,
          as it requires putting traffic on hold for each new negotiation.
       2. As a byproduct of restricting feature negotiation to connection setup, the
          feature-negotiation retransmit timer is no longer required. This part is now
          mapped onto the protocol-level retransmission.
          Details indicating why timers are no longer needed can be found on
          http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/\
      	                                      implementation_notes.html
      
      This patch disables anytime negotiation; subsequent patches work out full
      feature negotiation support for connection setup.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Cleanup routines for feature negotiation · 70208383
      Gerrit Renker authored
      This inserts the required de-allocation routines for memory allocated by 
      feature negotiation in the socket destructors, replacing dccp_feat_clean()
      in one instance.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Per-socket initialisation of feature negotiation · 828755ce
      Gerrit Renker authored
      This provides feature-negotiation initialisation for both DCCP sockets and
      DCCP request_sockets, to support feature negotiation during connection setup.
      
      It also resolves a FIXME regarding the congestion control initialisation.
      
      Thanks to Wei Yongjun for help with the IPv6 side of this patch.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: List management for new feature negotiation · 3001fc05
      Gerrit Renker authored
      This adds list fields and list management functions for the new feature
      negotiation implementation. The new code is kept in parallel to the old
      code, until removed at the end of the patch set.
      
      Thanks to Arnaldo for suggestions to improve the code.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Implement lookup table for feature-negotiation information · b4eec206
      Gerrit Renker authored
      A lookup table for feature-negotiation information, extracted from RFC 4340/42,
      is provided by this patch. All currently known features can be found in this 
      table, along with their feature location, their default value, and type.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Basic data structure for feature negotiation · 5c7c9451
      Gerrit Renker authored
      This patch prepares for the new and extended feature-negotiation routines.
      
      The following feature-negotiation data structures are provided:
      	* a container for the various (SP or NN) values,
      	* symbolic state names to track feature states,
      	* an entry struct which holds all current information together,
      	* elementary functions to fill in and process these structures.
      
      Entry structs are arranged as FIFO for the following reason: RFC 4340 specifies
      that if multiple options of the same type are present, they are processed in the
      order of their appearance in the packet; which means that this order needs to be
      preserved in the local data structure (the later insertion code also respects
      this order).
      
      The struct list_head has been chosen for the following reasons: the most 
      frequent operations are
       * add new entry at tail (when receiving Change or setting socket options);
       * delete entry (when Confirm has been received);
       * deep copy of entire list (cloning from listening socket onto request socket).
      
      The NN value has been set to 64 bit, which is a currently sufficient upper limit
      (Sequence Window feature has 48 bit).
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp ccid-3: Replace lazy BUG_ON with condition · 959fd992
      Gerrit Renker authored
      The BUG_ON(w_tot == 0) only holds if there is no more than 1 loss interval in
      the loss history. If there is only a single loss interval, the calc_i_mean()
      routine need in fact not be called (RFC 3448, 6.3.1). 
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Toggle debug output without module unloading · 43264991
      Gerrit Renker authored
      This sets the sysfs permissions so that root can toggle the `debug'
      parameter available for nearly every DCCP module. This is useful 
      since there are various module inter-dependencies. The debug flag
      can now be toggled at runtime using
      
        echo 1 > /sys/module/dccp/parameters/dccp_debug
        echo 1 > /sys/module/dccp_ccid2/parameters/ccid2_debug
        echo 1 > /sys/module/dccp_ccid3/parameters/ccid3_debug
        echo 1 > /sys/module/dccp_tfrc_lib/parameters/tfrc_debug
      
      The last is not very useful yet, since no code at the moment calls
      the tfrc_debug() macro.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Empty the write queue when disconnecting · 48816322
      Gerrit Renker authored
      dccp_disconnect() can be called for several reasons:
      
       1. when the connection setup failed (inet_stream_connect());
       2. when shutting down (inet_shutdown(), inet_csk_listen_stop());
       3. when aborting the connection (dccp_close() with 0 linger time).
      
      In case (1) the write queue is empty. This patch empties the write queue,
      if in case (2) or (3) it was not yet empty.
      
      This avoids triggering the write-queue BUG_TRAP in sk_stream_kill_queues()
      later on.
      
      It also seems natural, when breaking an association, to delete all packets
      that were originally intended for the soon-disconnected end (compare with
      the call to tcp_write_queue_purge() in tcp_disconnect()).
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Fill in the Data fields for "Option Error" Resets · eac7726b
      Gerrit Renker authored
      This updates the use of the `out_invalid_option' label, which produces a 
      Reset (code 5, "Option Error"), to fill in the Data1...Data3 fields as
      specified in RFC 4340, 5.6.
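
      How the three Data fields might be filled is sketched below. It assumes the
      reading of RFC 4340, 5.6 in which Data 1 carries the offending option type
      and Data 2 and Data 3 carry the first two bytes of the option data, or zero
      where the option had fewer bytes. The struct and function names are
      hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the Reset code and its three Data bytes. */
struct sketch_reset {
	uint8_t code;
	uint8_t data[3];
};

/* Build the Reset fields for an option error on an option of `opt_type`. */
struct sketch_reset sketch_option_error(uint8_t opt_type,
					const uint8_t *val, size_t val_len)
{
	struct sketch_reset r = { .code = 5, .data = { opt_type, 0, 0 } };

	assert(val != NULL || val_len == 0);
	if (val_len > 0)
		r.data[1] = val[0];	/* first byte of option data */
	if (val_len > 1)
		r.data[2] = val[1];	/* second byte of option data */
	return r;
}
```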
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Silently ignore options with nonsensical lengths · faf61c33
      Gerrit Renker authored
      This updates the option-parsing code with regard to RFC 4340, 5.8:
       "[..] options with nonsensical lengths (length byte less than two or more
        than the remaining space in the options portion of the header) MUST be
        ignored, and any option space following an option with nonsensical length
        MUST likewise be ignored."
      
      Hence in the following cases malformed options will be ignored:
       1. The type byte of a multi-byte option is the last byte of the header
          options (i.e. effective option length of 1).
       2. The value of the length byte is less than the minimum 2. This has been 
          changed from previously 3: although no multi-byte option with a length
          less than 3 yet exists (cf. table 3 in 5.8), a length of 2 is valid.
          (The switch-statement in dccp_parse has further per-option length checks.)
       3. The option length exceeds the length of the remaining option space.
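
      The three cases can be sketched as a userspace walk over the option bytes.
      The function name is hypothetical, per-option processing is left out, and
      the only rule taken from RFC 4340, 5.8 besides the quoted text is that
      option types 0 to 31 are single-byte options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk `len` bytes of header options and return how many leading bytes
 * consist of well-formed options. Everything from the first option with a
 * nonsensical length onwards is ignored.
 */
size_t sketch_valid_option_bytes(const uint8_t *opt, size_t len)
{
	size_t pos = 0;

	assert(opt != NULL || len == 0);
	while (pos < len) {
		size_t optlen;

		if (opt[pos] < 32) {		/* single-byte option */
			pos++;
			continue;
		}
		if (pos + 1 >= len)		/* case 1: no room for length byte */
			break;
		optlen = opt[pos + 1];
		if (optlen < 2)			/* case 2: below the minimum of 2 */
			break;
		if (optlen > len - pos)		/* case 3: exceeds remaining space */
			break;
		pos += optlen;
	}
	return pos;
}
```

      For the bytes 0, 32, 5, 1, 2, 3 the whole space is accepted; for 1, 32, 9, 1
      only the first byte is, since the claimed length 9 exceeds the 3 bytes left.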
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Always generate a Reset in response to option errors · ba1a6c7b
      Wei Yongjun authored
      RFC4340 states that if a packet is received with an option error (such as a
      Mandatory Option as the last byte of the option list), the endpoint should
      respond with a Reset.

      In the LISTEN and RESPOND states, the endpoint correctly responds with Reset,
      while in the REQUEST/OPEN states, packets with option errors are just ignored.
      
      The packet sequence is as follows:
      
      Case 1:
      
        Endpoint A                           Endpoint B
        (CLOSED)                             (CLOSED)
      
                     <----------------       REQUEST
      
        RESPONSE     ----------------->      (*1)
        (with invalid option)
                     <----------------       RESET
                                             (with Reset Code 5, "Option Error")
      
        (*1) currently just ignored, no Reset is sent
      
      Case 2:
      
        Endpoint A                           Endpoint B
        (OPEN)                               (OPEN)
      
        DATA-ACK     ----------------->      (*2)
        (with invalid option)
                     <----------------       RESET
                                             (with Reset Code 5, "Option Error")
      
        (*2) currently just ignored, no Reset is sent
      
      This patch fixes the problem, by generating a Reset instead of silently
      ignoring option errors.
      Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
  2. 03 Sep, 2008 14 commits