SlapPopen: Fix select-based timeout reads
Bug
Usage of slapos node instance shows that when a subprocess launched by SlapPopen deadlocks, the parent process also deadlocks (waiting in a blocking read) instead of timing out, despite the select.poll-based approach intended to ensure that reads only occur when they would not block.
Quite possibly this is because the current implementation does not discriminate on poll events before reading: the poll() call could be woken up by an event that does not guarantee that the following read will be non-blocking (i.e. essentially any event other than "there is data to read" or "there is an error, the next read will fail (but not block!)").
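To illustrate the hypothesis above, here is a minimal sketch of what discriminating on poll events could look like. It is not the SlapPopen code; the names `READ_WILL_NOT_BLOCK` and `read_ready` are hypothetical, and the choice of event mask is an assumption about which events make an immediate read safe on a POSIX pipe.

```python
import os
import select

# Assumption: after any of these events, os.read() on a pipe returns at once
# (with data, with b"" at end of file, or by raising an error).
READ_WILL_NOT_BLOCK = select.POLLIN | select.POLLHUP | select.POLLERR


def read_ready(poller, timeout_ms, bufsize=4096):
    """Poll once and read only from descriptors reporting a "safe" event.

    Returns a dict mapping each descriptor that was read to the bytes
    obtained (b"" means end of file). Descriptors woken up by any other
    event are skipped instead of being read blindly.
    """
    result = {}
    for fd, events in poller.poll(timeout_ms):
        if events & READ_WILL_NOT_BLOCK:
            result[fd] = os.read(fd, bufsize)
    return result
```

With this filter, a wake-up that carries no readable data simply produces an empty result, and the caller gets a chance to check its deadline before polling again.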
Fix
These changes switch to the higher-level selectors API instead of the low-level select.poll. Hopefully this ensures that reads from stdout or stderr only occur when guaranteed to be non-blocking. Another advantage is that this implementation is more portable.
I was unable to devise a test that forced a deadlock with the previous implementation, so the theory about not discriminating on events is not proven, and it is not proven that this change would fix it. But at the very least this change makes the code more portable.
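For readers unfamiliar with the selectors module, the following is a rough sketch of a selectors-based read loop with a deadline. It is an illustration only, not the code in this change: `run_with_timeout` and its `on_output` callback are hypothetical names, stderr is merged into stdout for brevity, and it assumes a POSIX platform where pipes can be registered with a selector.

```python
import os
import selectors
import subprocess
import time


def run_with_timeout(args, timeout, on_output=None):
    """Run args, passing output to on_output as it arrives.

    Returns (returncode, output). Raises subprocess.TimeoutExpired, after
    killing the child, if the deadline passes first.
    """
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    deadline = time.monotonic() + timeout
    chunks = []
    try:
        with selectors.DefaultSelector() as sel:
            sel.register(proc.stdout, selectors.EVENT_READ)
            while sel.get_map():
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise subprocess.TimeoutExpired(args, timeout, output=b"".join(chunks))
                # select() returns an empty list when the timeout expires,
                # so no read is attempted unless the pipe is reported ready.
                for key, _events in sel.select(remaining):
                    data = os.read(key.fd, 4096)
                    if data:
                        chunks.append(data)
                        if on_output is not None:
                            on_output(data)
                    else:
                        # End of file: the child closed its side of the pipe.
                        sel.unregister(key.fileobj)
        returncode = proc.wait(timeout=max(deadline - time.monotonic(), 0))
        return returncode, b"".join(chunks)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
        raise
    finally:
        proc.stdout.close()
```

Because the selector is only asked about read readiness, the loop never has to interpret raw poll event masks itself, and the deadline is re-checked on every iteration.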
Alternative
Switching to a Popen.communicate(timeout=...)-based approach should guarantee that timeouts are respected, but stdout would no longer be logged back in real time.
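A minimal sketch of this alternative, following the pattern shown in the Python subprocess documentation. The function name `run_communicate` is hypothetical, and merging stderr into stdout is an assumption made to keep the example short.

```python
import subprocess


def run_communicate(args, timeout):
    """Run args and return (returncode, output), or raise TimeoutExpired."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    try:
        output, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Kill the child, then collect whatever it wrote before dying.
        proc.kill()
        output, _ = proc.communicate()
        raise subprocess.TimeoutExpired(args, timeout, output=output)
    return proc.returncode, output
```

The output only becomes available once communicate() returns, which is the loss of real-time logging mentioned above.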