Commit 07a85323 authored by Matthias Kaeppler

Add Cop that enforces worker data_consistency

This ensures developers consider scalability and immediacy
trade-offs by either using or not using the DB primary.
parent e7a075f7
@@ -484,14 +484,22 @@ under several strategies outlined below.
## Trading immediacy for reduced primary load

We require Sidekiq workers to make an explicit decision about whether they need to use the
primary database node for all reads and writes, or whether reads can be served from replicas. This is
enforced by a RuboCop rule, which ensures that the `data_consistency` field is set.

When setting this field, consider the following trade-off:

- Ensure immediately consistent reads, but increase load on the primary database.
- Prefer read replicas to reduce load on the primary, but increase the likelihood of stale reads that have to be retried.

To maintain the same behavior as before this field was introduced, set it to `:always`, so
database operations only target the primary. Reasons for doing so include workers
that mostly or exclusively perform writes, or workers that read their own writes and might run
into data consistency issues should a stale record be read back from a replica. **Try to avoid
these scenarios, since `:always` should be considered the exception, not the rule.**

To allow reads to be served from replicas, we added two additional consistency modes: `:sticky` and `:delayed`.

When you declare either `:sticky` or `:delayed` consistency, workers become eligible for database
load balancing. In both cases, jobs are enqueued with a short delay.
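A minimal, hypothetical sketch of what declaring the field looks like. The worker names are made up, and `ApplicationWorker` is stubbed here only so the snippet runs standalone; in GitLab it is the real mixin that provides the `data_consistency` class macro.

```ruby
# Stand-in for the real ApplicationWorker mixin (illustration only).
module ApplicationWorker
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Minimal stand-in for the class macro: records the chosen mode.
    def data_consistency(mode)
      @data_consistency = mode
    end

    def get_data_consistency
      @data_consistency || :always # :always is the default
    end
  end
end

class CacheExpirationWorker
  include ApplicationWorker

  # Stale reads are tolerable and the job can be retried, so prefer replicas.
  data_consistency :delayed

  def perform; end
end

class WriteHeavyWorker
  include ApplicationWorker

  # No declaration: falls back to the :always default.
  def perform; end
end

CacheExpirationWorker.get_data_consistency # => :delayed
WriteHeavyWorker.get_data_consistency      # => :always
```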
@@ -508,7 +516,7 @@ they prefer read replicas and will wait for replicas to catch up:
| **Data Consistency** | **Description** |
|----------------------|-----------------|
| `:always` | The job is required to use the primary database (default). It should be used for workers that primarily perform writes or that have strict requirements around data consistency when reading their own writes. |
| `:sticky` | The job prefers replicas, but switches to the primary for writes or when encountering replication lag. It should be used for jobs that need to be executed as fast as possible but can sustain a small initial queuing delay. |
| `:delayed` | The job prefers replicas, but switches to the primary for writes. When encountering replication lag before the job starts, the job is retried once. If the replica is still not up to date on the next retry, it switches to the primary. It should be used for jobs where delaying execution further typically does not matter, such as cache expiration or web hooks execution. |
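The behavior in the table can be sketched as a single decision function. This is an illustrative simplification based on the descriptions above, not GitLab's actual load-balancing code; the function name and return symbols are invented for the example.

```ruby
# Hypothetical sketch of how each mode reacts to replication lag at job start.
def pick_database(mode:, replica_up_to_date:, first_attempt:)
  case mode
  when :always
    # Always read and write against the primary.
    :primary
  when :sticky
    # Use a replica when it has caught up; otherwise fall back to the primary.
    replica_up_to_date ? :replica : :primary
  when :delayed
    if replica_up_to_date
      :replica
    elsif first_attempt
      :retry_later # re-enqueue once to give the replica time to catch up
    else
      # Still lagging on the retry: give up waiting and use the primary.
      :primary
    end
  end
end
```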
......
# frozen_string_literal: true

require_relative '../code_reuse_helpers'

module RuboCop
  module Cop
    # This cop checks for a call to `data_consistency` to exist in Sidekiq workers.
    #
    # @example
    #
    #   # bad
    #   class BadWorker
    #     def perform
    #     end
    #   end
    #
    #   # good
    #   class GoodWorker
    #     data_consistency :delayed
    #
    #     def perform
    #     end
    #   end
    #
    class WorkerDataConsistency < RuboCop::Cop::Cop
      include CodeReuseHelpers

      HELP_LINK = 'https://docs.gitlab.com/ee/development/sidekiq_style_guide.html#job-data-consistency-strategies'

      MSG = <<~MSG
        Should define data_consistency expectation.

        It is encouraged for workers to use database replicas as much as possible by declaring
        data_consistency to use the :delayed or :sticky modes. Mode :always will result in the
        worker always hitting the primary database node for both reads and writes, which limits
        scalability.

        Some guidelines:

        1. If your worker mostly writes or reads its own writes, use mode :always. TRY TO AVOID THIS.
        2. If your worker performs mostly reads and can tolerate small delays, use mode :delayed.
        3. If your worker performs mostly reads but cannot tolerate any delays, use mode :sticky.

        See #{HELP_LINK} for a more detailed explanation of these settings.
      MSG

      def_node_search :application_worker?, <<~PATTERN
        `(send nil? :include (const nil? :ApplicationWorker))
      PATTERN

      def_node_search :data_consistency_defined?, <<~PATTERN
        `(send nil? :data_consistency ...)
      PATTERN

      def on_class(node)
        return unless in_worker?(node) && application_worker?(node)
        return if data_consistency_defined?(node)

        add_offense(node, location: :expression)
      end
    end
  end
end
# frozen_string_literal: true

require 'fast_spec_helper'
require_relative '../../../rubocop/cop/worker_data_consistency'

RSpec.describe RuboCop::Cop::WorkerDataConsistency do
  subject(:cop) { described_class.new }

  before do
    allow(cop)
      .to receive(:in_worker?)
      .and_return(true)
  end

  it 'adds an offense when not defining data_consistency' do
    expect_offense(<<~CODE)
      class SomeWorker
      ^^^^^^^^^^^^^^^^ Should define data_consistency expectation.[...]
        include ApplicationWorker

        queue_namespace :pipeline_hooks
        feature_category :continuous_integration
        urgency :high
      end
    CODE
  end

  it 'adds no offense when defining data_consistency' do
    expect_no_offenses(<<~CODE)
      class SomeWorker
        include ApplicationWorker

        queue_namespace :pipeline_hooks
        feature_category :continuous_integration
        data_consistency :delayed
        urgency :high
      end
    CODE
  end

  it 'adds no offense when worker is not an ApplicationWorker' do
    expect_no_offenses(<<~CODE)
      class SomeWorker
        queue_namespace :pipeline_hooks
        feature_category :continuous_integration
        urgency :high
      end
    CODE
  end
end