    WL#2347 - Load independent heartbeats · f4b56000
    unknown authored
    Reset missed heartbeat count on receipt of signal from node.
    
    This fixes a bug where, under high network load, heartbeat packets could be
    delayed, making a live node appear to have failed (due to missed heartbeats).
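
    A minimal sketch of the idea, not the actual NDB code: any data received
    from a node counts as proof of life and clears its missed-heartbeat count.
    Only m_heartbeat_cnt and transporter_recv_from are names from this commit;
    the signature, MAX_NODES, MISSED_HB_LIMIT, g_nodes and check_heartbeat are
    assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical constants; the real limits live in the cluster configuration.
constexpr unsigned MAX_NODES = 4;
constexpr std::uint32_t MISSED_HB_LIMIT = 4;

struct NodeInfo {
  // Heartbeat intervals that have elapsed with no traffic from this node.
  std::uint32_t m_heartbeat_cnt = 0;
};

// Hypothetical shared per-node table.
std::array<NodeInfo, MAX_NODES> g_nodes;

// Called by the transporter layer when any data arrives from nodeId,
// before the signal is unpacked. Assumed signature.
void transporter_recv_from(unsigned nodeId) {
  assert(nodeId < MAX_NODES);
  g_nodes[nodeId].m_heartbeat_cnt = 0;
}

// Hypothetical periodic check, run once per heartbeat interval.
// Returns true when the node should be presumed failed.
bool check_heartbeat(unsigned nodeId) {
  assert(nodeId < MAX_NODES);
  return ++g_nodes[nodeId].m_heartbeat_cnt >= MISSED_HB_LIMIT;
}
```

    With this shape, a node that keeps sending ordinary signals is never
    declared dead just because its dedicated heartbeat packets are queued
    behind other traffic.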
    
    
    ndb/include/kernel/NodeInfo.hpp:
      Add m_heartbeat_cnt to track missed heartbeats
    ndb/include/transporter/TransporterCallback.hpp:
      add prototype for transporter_recv_from()
      
      Called on receipt from a node.
    ndb/src/common/transporter/TransporterRegistry.cpp:
      Add calls to transporter_recv_from() when data is received (before unpack)
    ndb/src/kernel/blocks/qmgr/Qmgr.hpp:
      remove NodeRec::alarmCount. missed heartbeat count now kept in NodeInfo
    ndb/src/kernel/blocks/qmgr/QmgrMain.cpp:
      Use NodeInfo::m_heartbeat_cnt for missed heartbeat count
    ndb/src/kernel/vm/TransporterCallback.cpp:
      add transporter_recv_from(), which is called on receipt of signals.
      It resets missed heartbeat count for that node.
    ndb/src/ndbapi/ClusterMgr.cpp:
      Use NodeInfo::m_heartbeat_cnt for missed heartbeat count
    ndb/src/ndbapi/ClusterMgr.hpp:
      Use NodeInfo::m_heartbeat_cnt instead of ClusterMgr::Node::hbSent for missed
      heartbeat count.
      
      We now use the same storage for API and Kernel heartbeats.
      
      Add ClusterMgr::hb_received(nodeId) to reset hbSent (as if we received a heartbeat,
      but callable from elsewhere - e.g. when signal received)
    ndb/src/ndbapi/TransporterFacade.cpp:
      Implement transporter_recv_from for ndbapi - which resets hbSent
    ndb/src/ndbapi/TransporterFacade.hpp:
      Add hb_received(nodeId)
ClusterMgr.cpp 21.3 KB