lib/gitlab/sidekiq_middleware/server_metrics.rb · b4600f0041a8a1a5bdc9dc12295d19a9d6b26d82 · nexedi / gitlab-ce

Reduce the number of buckets in Sidekiq histograms · 7c912e14

Bob Van Landuyt authored Mar 10, 2022

Because of the wide range of buckets used for these metrics and the
large number of pods running, the cardinality of these series made it
hard to query the Prometheus instance serving them.

As a result, some of the metrics that are used for service monitoring
and alerting were failing to record in Thanos. By reducing the number
of buckets, we hope to improve rule evaluation and prevent missing
series for Sidekiq.

This brings the number of series for
`sidekiq_jobs_completion_seconds` and
`sidekiq_jobs_queue_duration_seconds` down from more than 8k to about
1.5k each.
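
As a rough illustration of why the bucket count drives cardinality: a
classic Prometheus histogram exposes, per label combination, one
`_bucket` series for each configured bucket, one for the implicit
`+Inf` bucket, plus a `_sum` and a `_count` series. The sketch below
only does that arithmetic. The helper name, the bucket lists and the
number of label combinations are all hypothetical and are not taken
from `server_metrics.rb` or from the production numbers quoted above.

```ruby
# Hypothetical helper: estimates how many time series one classic
# Prometheus histogram produces.
#
# Per label combination there is one `_bucket` series per configured
# bucket, one `_bucket` series for the implicit +Inf bucket, and one
# series each for `_sum` and `_count`.
def histogram_series_count(label_combinations, buckets)
  label_combinations * (buckets.size + 3)
end

# Illustrative values only (assumptions, not the real configuration).
wide_buckets   = [0.1, 0.25, 0.5, 1, 2.5, 5, 10, 30, 60, 300, 600]
narrow_buckets = [10, 300]
label_combinations = 600 # e.g. workers x queues x pods, made up here

wide_series   = histogram_series_count(label_combinations, wide_buckets)
narrow_series = histogram_series_count(label_combinations, narrow_buckets)

puts "wide: #{wide_series} series, narrow: #{narrow_series} series"
```

The label combinations are untouched in this sketch; only the
per-combination multiplier shrinks, which is the lever this change
pulls.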

This also reduces the number of buckets used for measuring the total
time a job spends per resource: cpu, db, gitaly or elasticsearch.
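
The trade-off of fewer buckets is coarser resolution. A minimal sketch
of cumulative bucketing, assuming made-up durations and bucket
boundaries (none of them taken from the actual change), shows that the
same observations are still counted but can only be told apart at the
remaining boundaries:

```ruby
# Hypothetical helper: counts observations the way a cumulative
# histogram does. Each bucket holds every observation less than or
# equal to its upper bound, and +Inf holds all of them.
def cumulative_bucket_counts(observations, upper_bounds)
  bounds = upper_bounds.sort + [Float::INFINITY]
  bounds.to_h { |le| [le, observations.count { |value| value <= le }] }
end

# Illustrative durations in seconds (assumed values).
durations = [0.2, 0.8, 3.0, 12.0, 45.0, 400.0]

wide   = cumulative_bucket_counts(durations, [0.1, 0.5, 1, 5, 10, 60, 300])
narrow = cumulative_bucket_counts(durations, [10, 300])

puts wide.inspect
puts narrow.inspect
```

With only two boundaries, jobs finishing in 0.2s and in 3s land in the
same bucket, so quantile estimates below the first boundary become
less precise.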

Changelog: changed


server_metrics.rb 8.19 KB
