Record counters for request apdex
This emits two new counters from web processes that can be used to monitor apdex.

The `gitlab_sli:rails_request_apdex:total` counter is incremented for every successful request (one that did not result in a 500) that is not to a health endpoint. The `gitlab_sli:rails_request_apdex:success_total` counter is incremented when the request took less than 1 second. We intend to customize this threshold per endpoint in the future.

Both counters are labelled with `feature_category` and `endpoint_id` from the context.

The metrics are also initialized on the first scrape, so a 0 is available for every set of labels, which avoids bugs in calculations with these metrics. To get all of the `feature_category`s and `endpoint_id`s for the initialization, we had to move some code that iterates over all endpoints, previously used only in tests, into the application.

We know this will initialize about 2 * 2500 metrics per pod running a web server, so we'd like to roll this out in a controlled fashion to make sure it doesn't impact our monitoring. That is why this is behind a feature flag.

Initialization of these metrics is also limited to web processes, so they don't get generated for consoles or runner processes.

This also includes a developer API to define SLIs, which encourages initializing them with the known label sets.
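The counting rules described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the code in `lib/gitlab/metrics/sli.rb`: `SketchCounter` and `ApdexSketch` are hypothetical names, the counters are in-memory stand-ins for Prometheus counters, the label values are made up, and treating every 5xx status as an error is an assumption (the message only mentions 500).

```ruby
# Hypothetical in-memory stand-in for a Prometheus counter, keyed by label set.
class SketchCounter
  def initialize
    @values = Hash.new(0)
  end

  # Adding 0 still stores the key, which is how a label set gets initialized.
  def increment(labels, by = 1)
    @values[labels] += by
  end

  # Returns nil for a label set that was never initialized or incremented.
  def get(labels)
    @values.fetch(labels, nil)
  end
end

# Hypothetical sketch of the two apdex counters; not the real SLI API.
class ApdexSketch
  THRESHOLD_SECONDS = 1.0 # fixed for now; per-endpoint values are future work

  attr_reader :total, :success_total

  def initialize
    @total = SketchCounter.new
    @success_total = SketchCounter.new
  end

  # Make a 0 available for every known label set before any request arrives.
  def initialize_counters(label_sets)
    label_sets.each do |labels|
      @total.increment(labels, 0)
      @success_total.increment(labels, 0)
    end
  end

  # Errors and health checks are excluded from both counters.
  # Assumption: any 5xx status counts as an error.
  def record(labels:, status:, duration_s:, health_check: false)
    return if health_check || status >= 500

    @total.increment(labels)
    @success_total.increment(labels) if duration_s < THRESHOLD_SECONDS
  end
end

# Made-up label values, for illustration only.
labels = { feature_category: "example_category", endpoint_id: "ExampleController#show" }
idle   = { feature_category: "example_category", endpoint_id: "ExampleController#index" }

sli = ApdexSketch.new
sli.initialize_counters([labels, idle])

sli.record(labels: labels, status: 200, duration_s: 0.2)  # counted, satisfied
sli.record(labels: labels, status: 200, duration_s: 1.5)  # counted, too slow
sli.record(labels: labels, status: 500, duration_s: 0.1)  # ignored: error
sli.record(labels: labels, status: 200, duration_s: 0.1, health_check: true) # ignored
```

Because `idle` was initialized, both of its counters read 0 rather than being absent, so a ratio of `success_total` to `total` computed over all label sets does not silently skip endpoints that have not yet received traffic.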
Showing lib/gitlab/metrics/sli.rb (new file, mode 0 → 100644)