WIP: Fix/rapid cdn promise relax
Possible blocker: The failing promises for slave instance preparation are a real problem: they indicate that partitions need to be reprocessed until everything is correctly set up. The failing tests on this branch simply expose that problem.
Attention: Do not simply silence the promises, as that will lead to problems. One has to rethink how to react to the promise state, and when a failure should be reported as a problem. Silencing tickets on master is a no-go.
Outcome: The promises promise-key-download-url-ready.py and publish-failsafe-error.py shall have a grace period, so that on a real cluster they do not react too fast. Distributing the information about a slave requires a lot of processing on each partition, and with a large number of slaves this can take quite some time (up to 2 hours). The idea is that such a promise shall be allowed to fail up to 5 times before an anomaly is detected, lowering the number of tickets generated on live clusters after adding a slave.
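The grace period idea can be sketched in plain Python. This is only an illustration, not the slapos.toolbox implementation: the helper names consecutive_failures and is_anomaly, the tolerated_failures parameter, and the reading of "up to 5 times" as "the 6th consecutive failure triggers the anomaly" are all assumptions made for the example.

```python
from typing import Sequence


def consecutive_failures(history: Sequence[bool]) -> int:
    """Count the failures at the end of the history.

    history is ordered oldest to newest; True means the promise passed.
    (Hypothetical helper, not part of slapos.toolbox.)
    """
    count = 0
    for passed in reversed(history):
        if passed:
            break
        count += 1
    return count


def is_anomaly(history: Sequence[bool], tolerated_failures: int = 5) -> bool:
    """Report an anomaly only after the grace period is exhausted.

    Assumption: the promise may fail tolerated_failures times in a row
    without raising a ticket; one more consecutive failure is an anomaly.
    """
    return consecutive_failures(history) > tolerated_failures


if __name__ == "__main__":
    # Five failures in a row are still within the grace period.
    print(is_anomaly([False] * 5))
    # A sixth consecutive failure exhausts the grace period.
    print(is_anomaly([False] * 6))
    # A single success resets the count of consecutive failures.
    print(is_anomaly([False] * 6 + [True]))
```

Counting only the trailing failures means a promise that recovers while the slave information is still being distributed never produces a ticket, which matches the goal of reducing noise after adding a slave.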
Tasks:
- slapos.toolbox!133 (merged)
- slapos.toolbox!134
- configure check_file_state with proper TestLess, AnomalyResult and TestResult
- configure proper grace period (failure_amount)
- assert that the grace period really works depending on the promise configuration; if needed, improve the promise code