tests/slapgrid: make test more deterministic
test_one_failing_daemon_in_service_will_bang_with_watchdog often fails with:
======================================================================
FAIL: test_one_failing_daemon_in_service_will_bang_with_watchdog (slapos.tests.slapgrid.TestSlapgridCPWithMasterWatchdog)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/srv/slapgrid/slappart9/srv/testnode/bpy/inst/test0-0/parts/slapos.core/slapos/tests/slapgrid.py", line 907, in test_one_failing_daemon_in_service_will_bang_with_watchdog
    'etc', 'software_release', 'worked', '.slapos-retention-lock-delay'])
  File "/srv/slapgrid/slappart9/srv/testnode/bpy/soft/5082e1741ad09c0910ec59bf9feae300/eggs/six-1.11.0-py2.7.egg/six.py", line 673, in assertCountEqual
    return getattr(self, _assertCountEqual)(*args, **kwargs)
AssertionError: Element counts were not equal:
First has 1, Second has 0: 'crashed'
First has 1, Second has 0: 'launched'
This test uses a service that creates files when it runs. Right after
telling supervisor to start the service, the test inspects the directory
content.
If the service had time to start, the files exist; otherwise they do not.
Change the service to wait for a delay before creating the files, to reduce the chance of a race condition here.
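The race can be reproduced outside of slapgrid with a minimal sketch. Everything here is an assumption made for illustration: `listing_right_after_start` is a hypothetical helper, and the "service" is a small Python child process standing in for etc/service/daemon, not the actual test code.

```python
import os
import subprocess
import sys
import tempfile


def listing_right_after_start(delay):
    """Start a stand-in service that waits `delay` seconds before creating
    a file, and list its directory immediately after starting it."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical stand-in for the daemon: sleep, then "touch launched".
        code = "import time; time.sleep(%r); open('launched', 'w').close()" % delay
        proc = subprocess.Popen([sys.executable, "-c", code], cwd=workdir)
        try:
            # This is the racy step: the child may or may not have run yet.
            return sorted(os.listdir(workdir))
        finally:
            proc.kill()
            proc.wait()


# Without a delay the listing depends on scheduling: [] or ['launched'].
print(listing_right_after_start(0))
# With a delay the file is very unlikely to exist yet when we look.
print(listing_right_after_start(2.0))
```

The delay does not remove the race, it only makes the "files not yet created" outcome far more likely, which matches the stated goal of reducing the chance of a race condition.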
The changes were merged into master. The source branch has been removed.
Maybe a bit more explanation about this test, because it was not easy.
This creates a partition using this script instead of buildout:
#!/bin/sh
mkdir -p etc/service &&
echo "#!/bin/sh" > etc/service/daemon &&
echo "touch launched
if [ -f ./crashed ]; then
while true; do
  echo Working; sleep 0.1;
done
else
touch ./crashed; echo Failing; sleep 1; exit 111;
fi" >> etc/service/daemon &&
chmod 755 etc/service/daemon &&
touch worked
So when this "buildout" runs, it will:
- create etc/service/daemon with content (reformatted for readability):
    #!/bin/sh
    sleep 1 # (the fix was to add a delay here)
    touch launched
    if [ -f ./crashed ]; then
      while true; do
        echo Working
        sleep 0.1
      done
    else
      touch ./crashed
      echo Failing
      sleep 1
      exit 111
    fi
- make it executable
- touch a file worked, which is a "proof" that buildout ran.
When the daemon runs, it will:
- create a file launched, which is a "proof" that the daemon was started
- if a file crashed exists, print "Working" in an infinite loop
- otherwise create this crashed file and exit with an error code.
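The "crash on first start, work on restart" behaviour described above can be modelled with a small sketch. This is only an illustration, not the code used by the test: `run_daemon_once` is a hypothetical name, and it returns `None` where the real script would loop forever.

```python
import os
import tempfile


def run_daemon_once(workdir):
    """Model one start of the daemon. Returns its exit code, or None
    where the real script would keep printing "Working" forever."""
    # touch launched: proof that the daemon was started
    open(os.path.join(workdir, "launched"), "a").close()
    crashed = os.path.join(workdir, "crashed")
    if os.path.isfile(crashed):
        # Real script: echo Working; sleep 0.1; in an infinite loop.
        return None
    # First start: leave a marker, then fail (real script sleeps 1 s first).
    open(crashed, "a").close()
    return 111


with tempfile.TemporaryDirectory() as workdir:
    first = run_daemon_once(workdir)
    second = run_daemon_once(workdir)
    files = sorted(os.listdir(workdir))

print(first, second, files)  # 111 None ['crashed', 'launched']
```

The crashed marker file is what carries state between starts, so the first start fails and any restart by the process manager stays up.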
Seems fine to merge I think.
Yes, it's very unlikely that this would have side effects: it's "test only" and this test is already failing.