    IB: Return "maybe missed event" hint from ib_req_notify_cq() · ed23a727
    Roland Dreier authored
    The semantics defined by the InfiniBand specification say that
    completion events are only generated when a completion is added to a
    completion queue (CQ) after completion notification is requested.
    This means that the following race is possible:
    
    	while (CQ is not empty)
    		ib_poll_cq(CQ);
    	// new completion is added after while loop is exited
    	ib_req_notify_cq(CQ);
    	// no event is generated for the existing completion
    
    To close this race, the IB spec recommends doing another poll of the
    CQ after requesting notification.
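
    As a rough illustration of that poll-again pattern, here is a
    minimal userspace sketch.  Everything in it (struct mock_cq,
    mock_poll_cq(), mock_req_notify_cq(), drain_arm_repoll()) is a
    hypothetical stand-in invented for this example, not the kernel
    API; the late_arrivals parameter simply simulates completions that
    land in the race window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CQ: only a count of queued completions. */
struct mock_cq {
	int pending;	/* completions waiting to be polled */
	bool armed;	/* notification has been requested */
};

/* Consume one completion; returns 1 if one was consumed, 0 if empty. */
static int mock_poll_cq(struct mock_cq *cq)
{
	if (cq->pending == 0)
		return 0;
	cq->pending--;
	return 1;
}

/*
 * Request notification.  Like the strict spec semantics, this says
 * nothing about completions that were already queued.
 */
static void mock_req_notify_cq(struct mock_cq *cq)
{
	cq->armed = true;
}

/*
 * Drain the CQ, request notification, then poll once more.
 * late_arrivals simulates completions added between the first drain
 * and the notification request.  Returns the number handled.
 */
static int drain_arm_repoll(struct mock_cq *cq, int late_arrivals)
{
	int handled = 0;

	assert(cq != NULL);

	while (mock_poll_cq(cq))
		handled++;

	cq->pending += late_arrivals;	/* the race window */

	mock_req_notify_cq(cq);

	/* Extra poll: picks up anything that arrived before arming. */
	while (mock_poll_cq(cq))
		handled++;

	return handled;
}
```

    Without the second loop, the late arrivals in this sketch would sit
    in the mock CQ with no event pending to report them.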
    
    However, it is not always possible to arrange code this way (for
    example, we have found that NAPI for IPoIB cannot poll after
    requesting notification).  Also, some hardware (e.g., Mellanox HCAs)
    actually will generate an event for completions added before the call
    to ib_req_notify_cq() -- which is allowed by the spec, since there's
    no way for any upper-layer consumer to know exactly when a completion
    was really added -- so the extra poll of the CQ is just a waste.
    
    Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for
    ib_req_notify_cq() so that it can return a hint about whether a
    completion may have been added before the request for notification.
    The return value of ib_req_notify_cq() is extended as follows:
    
    	 < 0	means an error occurred while requesting notification
    	== 0	means notification was requested successfully, and if
    		IB_CQ_REPORT_MISSED_EVENTS was passed in, then no
    		events were missed and it is safe to wait for another
    		event.
    	 > 0	is only returned if IB_CQ_REPORT_MISSED_EVENTS was
    		passed in.  It means that the consumer must poll the
    		CQ again to make sure it is empty to avoid the race
    		described above.
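
    A consumer could act on these three return cases roughly as in the
    sketch below.  This is a userspace mock under stated assumptions:
    struct mock_cq, mock_poll_cq(), mock_req_notify_cq(),
    poll_until_quiet() and the MOCK_REPORT_MISSED_EVENTS value are all
    hypothetical names made up for illustration, and the mock "driver"
    detects a missed event simply by checking whether anything is still
    queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_REPORT_MISSED_EVENTS 0x1	/* hypothetical flag value */

/* Hypothetical stand-in for a CQ: only a count of queued completions. */
struct mock_cq {
	int pending;	/* completions waiting to be polled */
	bool armed;	/* notification has been requested */
};

/* Consume one completion; returns 1 if one was consumed, 0 if empty. */
static int mock_poll_cq(struct mock_cq *cq)
{
	if (cq->pending == 0)
		return 0;
	cq->pending--;
	return 1;
}

/*
 * Mock notification request following the return convention above:
 *  < 0  error
 * == 0  armed; with the flag set, nothing was missed
 *  > 0  flag set and completions may already be queued
 */
static int mock_req_notify_cq(struct mock_cq *cq, unsigned int flags)
{
	if (cq == NULL)
		return -1;
	cq->armed = true;
	if ((flags & MOCK_REPORT_MISSED_EVENTS) && cq->pending > 0)
		return 1;
	return 0;
}

/*
 * Poll until the notification request reports that nothing was missed.
 * late_arrivals simulates completions added once, just before the
 * first notification request.  Returns the number of completions
 * handled, or a negative value on error.
 */
static int poll_until_quiet(struct mock_cq *cq, int late_arrivals)
{
	int handled = 0;
	int ret;

	assert(late_arrivals >= 0);

	do {
		while (mock_poll_cq(cq))
			handled++;

		cq->pending += late_arrivals;	/* the race window */
		late_arrivals = 0;

		ret = mock_req_notify_cq(cq, MOCK_REPORT_MISSED_EVENTS);
		if (ret < 0)
			return ret;
	} while (ret > 0);	/* hint says: poll again */

	return handled;
}
```

    The extra pass through the loop happens only when the hint is
    positive, so a consumer on hardware that never misses events would
    not pay for a redundant poll.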
    
    We add a flag to enable this behavior rather than turning it on
    unconditionally, because checking for missed events may incur
    significant overhead for some low-level drivers, and consumers that
    don't care about the results of this test shouldn't be forced to pay
    for the test.
    Signed-off-by: Roland Dreier <rolandd@cisco.com>
ib_verbs.h 53.2 KB