Propagate cancellation to spawned test jobs

A user might cancel a test result in the ERP5 UI if e.g. some misbehaviour is
detected and a new revision is ready to be tested. This works by
test_result.start() returning None - indicating that there are no more
test_result_lines to exercise. Master also indicates this cancellation via
test_result.isAlive() returning False, but until now we were not using that
information and were always waiting for completion of the current test job
that is already spawned.

This works well in practice if individual tests are not long, but e.g. for
SlapOS.SoftwareReleases.IntegrationTest-* it is not good, because there an
individual test might take _hours_ to execute.

-> Fix it by first setting up a global context to which we propagate
cancellation from test_result.isAlive, and by using that context as the base
for all other activities. This should terminate the spawned test process if
test_result is canceled.

The check interval is chosen to be 5 minutes so as not to overload master.
@jerome says that

    We now have 341 active test nodes, but sometimes we are using more, we did in the past to stress test some new machines.
    For the developer, if we reduce the waiting time from a few hours to 1 minutes or 5 minutes seems more or less equivalent.

For 350 testnodes and each nxdtest checking its test_result status via an
isAlive query to master every 5 minutes, this results in ~ 1 isAlive
request/second to master on average (350 requests / 300 s ≈ 1.2).

Had to change time to golang.time to use time.after(). Due to that, time()
and sleep() are changed to time.now() and time.sleep() correspondingly.

/helped-and-reviewed-by @jerome
/reviewed-on nexedi/nxdtest!14
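
For illustration only, the idea of polling isAlive and terminating the already spawned test process can be sketched as follows. This is not the code of this change: it uses the standard library's threading.Event and subprocess in place of the golang context and golang.time mentioned above, and the names watch_is_alive, run_cancellable and demo, as well as the faked master replies, are hypothetical.

```python
import subprocess
import sys
import threading


def watch_is_alive(is_alive, cancel, interval):
    """Poll is_alive() every `interval` seconds; set `cancel` once it returns False."""
    # Event.wait() returns True as soon as `cancel` is set, which ends the loop.
    while not cancel.wait(interval):
        if not is_alive():
            cancel.set()


def run_cancellable(argv, cancel, poll=0.05):
    """Run argv; terminate it if `cancel` gets set. Returns (returncode, cancelled)."""
    proc = subprocess.Popen(argv)
    while True:
        try:
            # Normal completion: the process exited on its own.
            return proc.wait(timeout=poll), False
        except subprocess.TimeoutExpired:
            pass
        if cancel.is_set():
            proc.terminate()   # ask the spawned test process to stop
            proc.wait()        # reap it so no zombie is left behind
            return proc.returncode, True


def demo():
    cancel = threading.Event()
    # Hypothetical master replies: alive twice, then canceled.
    answers = iter([True, True, False])
    watcher = threading.Thread(
        target=watch_is_alive,
        args=(lambda: next(answers, False), cancel, 0.1),
        daemon=True,
    )
    watcher.start()
    # Stand-in for a test that would otherwise run for a long time.
    long_test = [sys.executable, "-c", "import time; time.sleep(60)"]
    _, cancelled = run_cancellable(long_test, cancel)
    cancel.set()       # make sure the watcher stops in every case
    watcher.join()
    return cancelled
```

In this sketch demo() returns once the faked isAlive reply turns False, instead of waiting the full 60 seconds. A real check interval would be on the order of the 5 minutes discussed above rather than 0.1 s.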