    Detect if a test leaks processes and terminate them · 0ad45a9c
    Kirill Smelkov authored
    For every TestCase nxdtest spawns a test process with stdout/stderr
    redirected to pipes that nxdtest reads. Nxdtest, in turn, tees those
    pipes to its own stdout/stderr until the pipes reach EOF. If the test
    process itself spawns other processes, those processes inherit the
    opened pipes, and so the pipes won't reach EOF until _all_ spawned
    test processes (the main test process + the processes that it spawns)
    exit. Thus, if the main test process spawned a process but did not
    terminate it upon its own exit, nxdtest gets stuck waiting for the
    pipes to reach EOF, which never happens as long as the spawned
    subprocess keeps running.
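
    The pipe inheritance described above can be reproduced outside of
    nxdtest. The following is a minimal sketch, assuming a POSIX system;
    TEST_SCRIPT and run_and_time_eof are hypothetical names used only for
    illustration and are not part of nxdtest. The "test" starts a
    background process that inherits its stdout and then exits right
    away, yet the reader sees EOF only after the background process
    exits too:

```python
import subprocess
import sys
import time

# Hypothetical stand-in for a test: it starts a background process that
# inherits stdout, prints one line and exits without waiting for it.
TEST_SCRIPT = r"""
import subprocess, sys
subprocess.Popen([sys.executable, "-c", "import time; time.sleep(2)"])
print("main test process done", flush=True)
"""

def run_and_time_eof():
    """Return (output, seconds until main exit, seconds until pipe EOF)."""
    t0 = time.monotonic()
    proc = subprocess.Popen([sys.executable, "-c", TEST_SCRIPT],
                            stdout=subprocess.PIPE)
    proc.wait()                      # the main test process exits quickly
    t_exit = time.monotonic() - t0
    # read() returns only when every holder of the pipe's write end has
    # closed it, i.e. after the background process exits as well.
    out = proc.stdout.read()
    t_eof = time.monotonic() - t0
    return out, t_exit, t_eof
```

    Here the background process sleeps for only 2 seconds; with a server
    that never exits on its own, the read would block indefinitely.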
    
    I hit this problem for real on a Wendelin.core 2 test: there the main
    test process was segfaulting and so did not instruct the other spawned
    processes (ZEO, WCFS, ...) to terminate. As a result the whole test
    was getting stuck instead of being promptly reported as failed:
    
        runTestSuite: Makefile:175: recipe for target 'test.wcfs' failed
        runTestSuite: make: *** [test.wcfs] Segmentation fault
        runTestSuite: wcfs: 2021/08/09 17:32:09 zlink [::1]:52052 - [::1]:23386: recvPkt: EOF
        runTestSuite: E0809 17:32:09.376800   38082 wcfs.go:2574] zwatch zeo://localhost:23386: zlink [::1]:52052 - [::1]:23386: recvPkt: EOF
        runTestSuite: E0809 17:32:09.377431   38082 wcfs.go:2575] zwatcher failed -> switching filesystem to EIO mode (TODO)
        <LONG WAIT>
        runTestSuite: PROCESS TOO LONG OR DEAD, GOING TO BE TERMINATED
    
    -> Fix it.
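
    One possible way to detect and terminate leaked processes is
    sketched below. It is an assumption for illustration and not
    necessarily the mechanism this commit implements: the test runs in
    its own session, so that its process group can be probed and killed
    once the main process has exited. run_test is a hypothetical helper;
    the sketch assumes Linux, and a process that moves itself into
    another session or process group would escape it.

```python
import os
import signal
import subprocess
import threading

def run_test(argv):
    """Run argv; return (returncode, leaked, output).

    leaked is True if processes of the test's process group were still
    alive after the main test process exited (they are then killed).
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            start_new_session=True)  # new session => pgid == pid
    chunks = []

    def drain():
        # Keep reading so the test never blocks on a full pipe.
        for line in proc.stdout:
            chunks.append(line)

    reader = threading.Thread(target=drain)
    reader.start()
    proc.wait()                      # only the main test process is awaited
    try:
        # Succeeds only if some process of the group is still around.
        os.killpg(proc.pid, signal.SIGKILL)
        leaked = True
    except ProcessLookupError:
        leaked = False
    reader.join()                    # pipe reaches EOF once the leaks are gone
    return proc.returncode, leaked, b"".join(chunks)
```

    With this arrangement a test that leaves a server running returns
    promptly with leaked=True, and the caller can decide to report the
    test as failed.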
    
    /reviewed-by @jerome
    /reviewed-on nexedi/nxdtest!9