• Gang He's avatar
    dlm: make sctp_connect_to_sock() return in specified time · f706d830
    Gang He authored
    When the user setup a two-ring cluster, DLM kernel module
    will automatically selects to use SCTP protocol to communicate
    between each node. There will be about 5 minute hang in DLM
    kernel module, in case one ring is broken before switching to
    another ring, this will potentially affect the dependent upper
    applications, e.g. ocfs2, gfs2, clvm and clustered-MD, etc.
    Unfortunately, if the user setup a two-ring cluster, we can not
    specify DLM communication protocol with TCP explicitly, since
    DLM kernel module only supports SCTP protocol for multiple
    ring cluster.
    Base on my investigation, the time is spent in sock->ops->connect()
    function before returns ETIMEDOUT(-110) error, since O_NONBLOCK
    argument in connect() function does not work here, then we should
    make sock->ops->connect() function return in specified time via
    setting socket SO_SNDTIMEO atrribute.
    Signed-off-by: default avatarGang He <ghe@suse.com>
    Signed-off-by: default avatarDavid Teigland <teigland@redhat.com>
    f706d830
lowcomms.c 43.6 KB