• Jon Paul Maloy's avatar
    tipc: guarantee peer bearer id exchange after reboot · 634696b1
    Jon Paul Maloy authored
    When a link endpoint is going down locally, e.g., because its interface
    is being stopped, it will spontaneously send out a RESET message to
    its peer, informing it about this fact. This saves the peer from
    detecting the failure via probing, and hence gives both speedier and
    less resource consuming failure detection on the peer side.
    
    According to the link FSM, a receiver of a RESET message, ignoring the
    reason for it, must now consider the sender ready to come back up, and
    starts periodically sending out ACTIVATE messages to the peer in order
    to re-establish the link. Also, according to the FSM, the receiver of
    an ACTIVATE message can now go directly to state ESTABLISHED and start
    sending regular traffic packets. This is a well-proven and robust FSM.
    
    However, in the case of a reboot, there is a small possibilty that link
    endpoint on the rebooted node may have been re-created with a new bearer
    identity between the moment it sent its (pre-boot) RESET and the moment
    it receives the ACTIVATE from the peer. The new bearer identity cannot
    be known by the peer according to this scenario, since traffic headers
    don't convey such information. This is a problem, because both endpoints
    need to know the correct value of the peer's bearer id at any moment in
    time in order to be able to produce correct link events for their users.
    
    The only way to guarantee this is to enforce a full setup message
    exchange (RESET + ACTIVATE) even after the reboot, since those messages
    carry the bearer idientity in their header.
    
    In this commit we do this by introducing and setting a "stopping" bit in
    the header of the spontaneously generated RESET messages, informing the
    peer that the sender will not be immediately ready to re-establish the
    link. A receiver seeing this bit must act as if this were a locally
    detected connectivity failure, and hence has to go through a full two-
    way setup message exchange before any link can be re-established.
    
    Although never reported, this problem seems to have always been around.
    
    This protocol addition is fully backwards compatible.
    Acked-by: default avatarYing Xue <ying.xue@windriver.com>
    Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    634696b1
msg.h 20 KB