Detect if a test leaks processes and terminate them
For every TestCase nxdtest spawns a test process with stdout/stderr redirected to pipes that nxdtest reads. Nxdtest, in turn, tees those pipes to its own stdout/stderr until the pipes reach EOF. If the test process itself spawns other processes, those processes inherit the open pipes, and so the pipes do not reach EOF until all spawned test processes (the main test process + the processes it spawns) exit. Thus, if any process that the main test process spawned is not terminated when the main process exits, nxdtest gets stuck waiting for the pipes to reach EOF, which never happens as long as that leaked subprocess keeps running.
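The effect can be reproduced outside of nxdtest. The following is only a minimal sketch, not nxdtest code: `time_until_eof` is a hypothetical helper, and it assumes a POSIX system where a child inherits its parent's stdout by default. The "main" process spawns a sleeping child and exits right away, yet reading its stdout pipe returns EOF only after the child is gone:

```python
import subprocess
import sys
import time


def time_until_eof(leak_seconds):
    """Run a main process that leaks a child and report when its pipe hits EOF.

    Returns (output, t_exit, t_eof): the captured stdout, the time until the
    main process exited, and the time until its stdout pipe reached EOF.
    """
    # The leaked child only sleeps; it inherits the stdout pipe of its parent.
    child_code = "import time; time.sleep(%r)" % leak_seconds
    # The main process spawns the child and exits without waiting for it.
    main_code = (
        "import subprocess, sys\n"
        "subprocess.Popen([sys.executable, '-c', %r])\n"
        "print('main done')\n"
    ) % child_code

    t0 = time.monotonic()
    p = subprocess.Popen([sys.executable, "-c", main_code],
                         stdout=subprocess.PIPE)
    p.wait()                    # main exits quickly (its output is tiny)
    t_exit = time.monotonic() - t0
    out = p.stdout.read()       # blocks until every holder of the pipe exits
    t_eof = time.monotonic() - t0
    return out, t_exit, t_eof


if __name__ == "__main__":
    out, t_exit, t_eof = time_until_eof(1.5)
    print("main exited after %.2fs, EOF after %.2fs" % (t_exit, t_eof))
```

With a child that never exits, the `read()` above would block forever, which is the hang described here.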
I hit this problem for real on a Wendelin.core 2 test: there the main test process was segfaulting and so did not instruct the other spawned processes (ZEO, WCFS, ...) to terminate. As a result the whole test got stuck instead of being promptly reported as failed:
runTestSuite: Makefile:175: recipe for target 'test.wcfs' failed
runTestSuite: make: *** [test.wcfs] Segmentation fault
runTestSuite: wcfs: 2021/08/09 17:32:09 zlink [::1]:52052 - [::1]:23386: recvPkt: EOF
runTestSuite: E0809 17:32:09.376800 38082 wcfs.go:2574] zwatch zeo://localhost:23386: zlink [::1]:52052 - [::1]:23386: recvPkt: EOF
runTestSuite: E0809 17:32:09.377431 38082 wcfs.go:2575] zwatcher failed -> switching filesystem to EIO mode (TODO)
<LONG WAIT>
runTestSuite: PROCESS TOO LONG OR DEAD, GOING TO BE TERMINATED
-> Fix it.
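For illustration only, one generic way to detect and terminate leftovers is to run the test in its own process group and signal that group once the main process has exited. This is a sketch under assumptions and not necessarily how nxdtest does it: `run_and_kill_leftovers` is a hypothetical name, the approach is POSIX-only, and it misses processes that move themselves into another session or process group.

```python
import os
import signal
import subprocess
import sys


def run_and_kill_leftovers(argv):
    """Run argv in its own process group; kill whatever it leaves behind.

    Returns (returncode, leaked, output), where leaked tells whether any
    process was still present in the group after the main process exited.
    """
    # start_new_session=True makes the child a session and process-group
    # leader, so the group id equals p.pid and descendants join it by default.
    p = subprocess.Popen(argv, stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT, start_new_session=True)
    # NOTE: waiting before reading is fine only for small outputs; a real
    # runner would drain the pipe concurrently to avoid a full-pipe deadlock.
    ret = p.wait()

    leaked = True
    try:
        os.killpg(p.pid, signal.SIGKILL)   # hits only remaining group members
    except ProcessLookupError:
        leaked = False                     # group is empty: nothing leaked

    out = p.stdout.read()  # reaches EOF now that no process holds the pipe
    return ret, leaked, out


if __name__ == "__main__":
    leaky = ("import subprocess, sys\n"
             "subprocess.Popen([sys.executable, '-c', "
             "'import time; time.sleep(60)'])\n"
             "print('main done')\n")
    print(run_and_kill_leftovers([sys.executable, "-c", leaky]))
```

A runner built this way can report the test as failed right after the main process exits, instead of waiting for an external timeout.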
/cc @jerome