    Measure Sidekiq enqueue latency for scheduled jobs · c4c42d29
    Stan Hu authored
    When Sidekiq schedules a job to be run later, it sets the `at` field
    with the desired time and inserts the payload into the `scheduled`
    queue. Sidekiq runs a separate thread that polls this queue via a Redis
    `zrangebyscore` command and enqueues jobs whose scheduled time has
    passed. Between polls, Sidekiq waits for a random delay that is based
    on the number of processes running at a given time.
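
    As a rough illustration of the polling step, here is a minimal sketch
    that uses an in-memory array in place of the Redis sorted set. The
    helper name `pop_due_jobs` and the worker class names are hypothetical;
    this is not Sidekiq's actual implementation, which runs against Redis.

```ruby
# In-memory stand-in for the Redis sorted set: each entry is [score, payload],
# where the score is the job's `at` timestamp in epoch seconds.
# `pop_due_jobs` is a hypothetical helper, not part of Sidekiq.
def pop_due_jobs(scheduled, now)
  # Split entries into those whose time has passed and those still pending,
  # roughly what a ZRANGEBYSCORE from -inf to `now` would select.
  due, pending = scheduled.partition { |score, _payload| score <= now }
  scheduled.replace(pending)
  # Return due payloads, oldest first.
  due.sort_by(&:first).map { |_score, payload| payload }
end

scheduled = [
  [200.0, { 'class' => 'LateWorker', 'at' => 200.0 }],
  [100.0, { 'class' => 'EarlyWorker', 'at' => 100.0 }]
]

due = pop_due_jobs(scheduled, 150.0)
```

    Only the job with `at` at or before the poll time is returned; the
    other one stays in the set until a later poll.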
    
    We observed that this enqueuing of scheduled jobs can add an unexpected,
    significant delay and increase CPU utilization on Redis as more Sidekiq
    processes connect. Before we can make any tweaks to this configuration,
    we should first measure how much delay we have.
    
    This commit adds an `enqueue_latency_s` field for scheduled jobs to record how
    long this delay was. In the Sidekiq client middleware, we have access to
    the `at` field before it is deleted by Sidekiq, so we save it as the
    `scheduled_at` field. We then subtract `scheduled_at` from `enqueued_at`
    and log the result as `enqueue_latency_s`.
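
    The calculation can be sketched as follows. This is a minimal
    illustration that assumes the job payload is a hash carrying both
    timestamps as epoch seconds; the method name `enqueue_latency_s` and
    the `nil` return for non-scheduled jobs are assumptions for this
    example, not necessarily how the commit implements it.

```ruby
# Hypothetical helper: seconds between when a job was due to run
# (`scheduled_at`) and when it was actually enqueued (`enqueued_at`).
def enqueue_latency_s(job)
  scheduled_at = job['scheduled_at']
  enqueued_at = job['enqueued_at']

  # Jobs that were never scheduled have no `scheduled_at`, so there is
  # nothing to measure.
  return nil unless scheduled_at && enqueued_at

  enqueued_at.to_f - scheduled_at.to_f
end

job = { 'class' => 'ExampleWorker', 'scheduled_at' => 100.0, 'enqueued_at' => 105.5 }
latency = enqueue_latency_s(job)
```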
    
    Relates to
    https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/1179
    
    Changelog: added
client_metrics_spec.rb 3 KB