Commit 6ec84d45 authored by Stan Hu's avatar Stan Hu

Merge branch 'allow-minimum-processes-in-sidekiq-cluster' into 'master'

Add --min-concurrency to sidekiq-cluster

See merge request gitlab-org/gitlab!23408
parents 4b529f64 8e6c0c0e
---
title: Allow setting minimum concurrency for sidekiq-cluster processes
merge_request: 23408
author:
type: added
...@@ -124,9 +124,18 @@ number of threads that equals the number of queues, plus one spare thread. ...@@ -124,9 +124,18 @@ number of threads that equals the number of queues, plus one spare thread.
For example, a process that handles the `process_commit` and `post_receive` For example, a process that handles the `process_commit` and `post_receive`
queues will use three threads in total. queues will use three threads in total.
## Limiting concurrency ## Managing concurrency
To limit the concurrency of the Sidekiq process: When setting the maximum concurrency, keep in mind this normally should
not exceed the number of CPU cores available. The values in the examples
below are arbitrary and not particular recommendations.
Each thread requires a Redis connection, so adding threads may increase Redis
latency and potentially cause client timeouts. See the [Sidekiq documentation
about Redis](https://github.com/mperham/sidekiq/wiki/Using-Redis) for more
details.
### When running a single Sidekiq process (default)
1. Edit `/etc/gitlab/gitlab.rb` and add: 1. Edit `/etc/gitlab/gitlab.rb` and add:
...@@ -140,11 +149,14 @@ To limit the concurrency of the Sidekiq process: ...@@ -140,11 +149,14 @@ To limit the concurrency of the Sidekiq process:
sudo gitlab-ctl reconfigure sudo gitlab-ctl reconfigure
``` ```
To limit the max concurrency of the Sidekiq cluster processes: This will set the concurrency (number of threads) for the Sidekiq process.
### When running Sidekiq cluster
1. Edit `/etc/gitlab/gitlab.rb` and add: 1. Edit `/etc/gitlab/gitlab.rb` and add:
```ruby ```ruby
sidekiq_cluster['min_concurrency'] = 15
sidekiq_cluster['max_concurrency'] = 25 sidekiq_cluster['max_concurrency'] = 25
``` ```
...@@ -154,14 +166,21 @@ To limit the max concurrency of the Sidekiq cluster processes: ...@@ -154,14 +166,21 @@ To limit the max concurrency of the Sidekiq cluster processes:
sudo gitlab-ctl reconfigure sudo gitlab-ctl reconfigure
``` ```
For each queue group, the concurrency factor will be set to `min(number of queues, N)`. `min_concurrency` and `max_concurrency` are independent; one can be set without
Setting the value to 0 will disable the limit. Keep in mind this normally would the other. Setting `min_concurrency` to 0 will disable the limit.
not exceed the number of CPU cores available.
For each queue group, let N be one more than the number of queues. The
concurrency factor will be set to:
1. `N`, if it's between `min_concurrency` and `max_concurrency`.
1. `max_concurrency`, if `N` exceeds this value.
1. `min_concurrency`, if `N` is less than this value.
If `min_concurrency` is equal to `max_concurrency`, then this value will be used
regardless of the number of queues.
Each thread requires a Redis connection, so adding threads may When `min_concurrency` is greater than `max_concurrency`, it is treated as
increase Redis latency and potentially cause client timeouts. See the [Sidekiq being equal to `max_concurrency`.
documentation about Redis](https://github.com/mperham/sidekiq/wiki/Using-Redis)
for more details.
## Modifying the check interval ## Modifying the check interval
......
...@@ -64,20 +64,20 @@ module Gitlab ...@@ -64,20 +64,20 @@ module Gitlab
# directory - The directory of the Rails application. # directory - The directory of the Rails application.
# #
# Returns an Array containing the PIDs of the started processes. # Returns an Array containing the PIDs of the started processes.
def self.start(queues, env: :development, directory: Dir.pwd, max_concurrency: 50, dryrun: false) def self.start(queues, env: :development, directory: Dir.pwd, max_concurrency: 50, min_concurrency: 0, dryrun: false)
queues.map.with_index do |pair, index| queues.map.with_index do |pair, index|
start_sidekiq(pair, env: env, directory: directory, max_concurrency: max_concurrency, worker_id: index, dryrun: dryrun) start_sidekiq(pair, env: env, directory: directory, max_concurrency: max_concurrency, min_concurrency: min_concurrency, worker_id: index, dryrun: dryrun)
end end
end end
# Starts a Sidekiq process that processes _only_ the given queues. # Starts a Sidekiq process that processes _only_ the given queues.
# #
# Returns the PID of the started process. # Returns the PID of the started process.
def self.start_sidekiq(queues, env:, directory:, max_concurrency:, worker_id:, dryrun:) def self.start_sidekiq(queues, env:, directory:, max_concurrency:, min_concurrency:, worker_id:, dryrun:)
counts = count_by_queue(queues) counts = count_by_queue(queues)
cmd = %w[bundle exec sidekiq] cmd = %w[bundle exec sidekiq]
cmd << "-c #{self.concurrency(queues, max_concurrency)}" cmd << "-c #{self.concurrency(queues, min_concurrency, max_concurrency)}"
cmd << "-e#{env}" cmd << "-e#{env}"
cmd << "-gqueues: #{proc_details(counts)}" cmd << "-gqueues: #{proc_details(counts)}"
cmd << "-r#{directory}" cmd << "-r#{directory}"
...@@ -119,12 +119,12 @@ module Gitlab ...@@ -119,12 +119,12 @@ module Gitlab
end.join(', ') end.join(', ')
end end
def self.concurrency(queues, max_concurrency) def self.concurrency(queues, min_concurrency, max_concurrency)
if max_concurrency.positive? concurrency_from_queues = queues.length + 1
[queues.length + 1, max_concurrency].min max = max_concurrency.positive? ? max_concurrency : concurrency_from_queues
else min = [min_concurrency, max].min
queues.length + 1
end concurrency_from_queues.clamp(min, max)
end end
# Waits for the given process to complete using a separate thread. # Waits for the given process to complete using a separate thread.
......
...@@ -17,6 +17,7 @@ module Gitlab ...@@ -17,6 +17,7 @@ module Gitlab
def initialize(log_output = STDERR) def initialize(log_output = STDERR)
# As recommended by https://github.com/mperham/sidekiq/wiki/Advanced-Options#concurrency # As recommended by https://github.com/mperham/sidekiq/wiki/Advanced-Options#concurrency
@max_concurrency = 50 @max_concurrency = 50
@min_concurrency = 0
@environment = ENV['RAILS_ENV'] || 'development' @environment = ENV['RAILS_ENV'] || 'development'
@pid = nil @pid = nil
@interval = 5 @interval = 5
...@@ -54,8 +55,14 @@ module Gitlab ...@@ -54,8 +55,14 @@ module Gitlab
@logger.info("Starting cluster with #{queue_groups.length} processes") @logger.info("Starting cluster with #{queue_groups.length} processes")
@processes = SidekiqCluster.start(queue_groups, env: @environment, directory: @rails_path, @processes = SidekiqCluster.start(
max_concurrency: @max_concurrency, dryrun: @dryrun) queue_groups,
env: @environment,
directory: @rails_path,
max_concurrency: @max_concurrency,
min_concurrency: @min_concurrency,
dryrun: @dryrun
)
return if @dryrun return if @dryrun
...@@ -128,6 +135,10 @@ module Gitlab ...@@ -128,6 +135,10 @@ module Gitlab
@max_concurrency = int.to_i @max_concurrency = int.to_i
end end
opt.on('--min-concurrency INT', 'Minimum threads to use with Sidekiq (default: 0)') do |int|
@min_concurrency = int.to_i
end
opt.on('-e', '--environment ENV', 'The application environment') do |env| opt.on('-e', '--environment ENV', 'The application environment') do |env|
@environment = env @environment = env
end end
......
# frozen_string_literal: true # frozen_string_literal: true
require 'spec_helper' require 'fast_spec_helper'
describe Gitlab::SidekiqCluster::CLI do describe Gitlab::SidekiqCluster::CLI do
let(:cli) { described_class.new('/dev/null') } let(:cli) { described_class.new('/dev/null') }
let(:default_options) do let(:default_options) do
{ env: 'test', directory: Dir.pwd, max_concurrency: 50, dryrun: false } { env: 'test', directory: Dir.pwd, max_concurrency: 50, min_concurrency: 0, dryrun: false }
end
before do
stub_env('RAILS_ENV', 'test')
end end
describe '#run' do describe '#run' do
...@@ -52,6 +56,17 @@ describe Gitlab::SidekiqCluster::CLI do ...@@ -52,6 +56,17 @@ describe Gitlab::SidekiqCluster::CLI do
end end
end end
context 'with --min-concurrency flag' do
it 'starts Sidekiq workers for specified queues with a min concurrency' do
expect(Gitlab::SidekiqConfig::CliMethods).to receive(:worker_queues).and_return(%w(foo bar baz))
expect(Gitlab::SidekiqCluster).to receive(:start)
.with([%w(foo bar baz), %w(solo)], default_options.merge(min_concurrency: 2))
.and_return([])
cli.run(%w(foo,bar,baz solo --min-concurrency 2))
end
end
context 'queue namespace expansion' do context 'queue namespace expansion' do
it 'starts Sidekiq workers for all queues in all_queues.yml with a namespace in argv' do it 'starts Sidekiq workers for all queues in all_queues.yml with a namespace in argv' do
expect(Gitlab::SidekiqConfig::CliMethods).to receive(:worker_queues).and_return(['cronjob:foo', 'cronjob:bar']) expect(Gitlab::SidekiqConfig::CliMethods).to receive(:worker_queues).and_return(['cronjob:foo', 'cronjob:bar'])
......
# frozen_string_literal: true # frozen_string_literal: true
require 'spec_helper' require 'fast_spec_helper'
require 'rspec-parameterized'
describe Gitlab::SidekiqCluster do describe Gitlab::SidekiqCluster do
describe '.trap_signals' do describe '.trap_signals' do
...@@ -59,12 +60,18 @@ describe Gitlab::SidekiqCluster do ...@@ -59,12 +60,18 @@ describe Gitlab::SidekiqCluster do
describe '.start' do describe '.start' do
it 'starts Sidekiq with the given queues, environment and options' do it 'starts Sidekiq with the given queues, environment and options' do
expected_options = { env: :production, directory: 'foo/bar', max_concurrency: 20, dryrun: true } expected_options = {
env: :production,
directory: 'foo/bar',
max_concurrency: 20,
min_concurrency: 10,
dryrun: true
}
expect(described_class).to receive(:start_sidekiq).ordered.with(%w(foo), expected_options.merge(worker_id: 0)) expect(described_class).to receive(:start_sidekiq).ordered.with(%w(foo), expected_options.merge(worker_id: 0))
expect(described_class).to receive(:start_sidekiq).ordered.with(%w(bar baz), expected_options.merge(worker_id: 1)) expect(described_class).to receive(:start_sidekiq).ordered.with(%w(bar baz), expected_options.merge(worker_id: 1))
described_class.start([%w(foo), %w(bar baz)], env: :production, directory: 'foo/bar', max_concurrency: 20, dryrun: true) described_class.start([%w(foo), %w(bar baz)], env: :production, directory: 'foo/bar', max_concurrency: 20, min_concurrency: 10, dryrun: true)
end end
it 'starts Sidekiq with the given queues and sensible default options' do it 'starts Sidekiq with the given queues and sensible default options' do
...@@ -72,6 +79,7 @@ describe Gitlab::SidekiqCluster do ...@@ -72,6 +79,7 @@ describe Gitlab::SidekiqCluster do
env: :development, env: :development,
directory: an_instance_of(String), directory: an_instance_of(String),
max_concurrency: 50, max_concurrency: 50,
min_concurrency: 0,
worker_id: an_instance_of(Integer), worker_id: an_instance_of(Integer),
dryrun: false dryrun: false
} }
...@@ -86,7 +94,7 @@ describe Gitlab::SidekiqCluster do ...@@ -86,7 +94,7 @@ describe Gitlab::SidekiqCluster do
describe '.start_sidekiq' do describe '.start_sidekiq' do
let(:first_worker_id) { 0 } let(:first_worker_id) { 0 }
let(:options) do let(:options) do
{ env: :production, directory: 'foo/bar', max_concurrency: 20, worker_id: first_worker_id, dryrun: false } { env: :production, directory: 'foo/bar', max_concurrency: 20, min_concurrency: 0, worker_id: first_worker_id, dryrun: false }
end end
let(:env) { { "ENABLE_SIDEKIQ_CLUSTER" => "1", "SIDEKIQ_WORKER_ID" => first_worker_id.to_s } } let(:env) { { "ENABLE_SIDEKIQ_CLUSTER" => "1", "SIDEKIQ_WORKER_ID" => first_worker_id.to_s } }
let(:args) { ['bundle', 'exec', 'sidekiq', anything, '-eproduction', *([anything] * 5)] } let(:args) { ['bundle', 'exec', 'sidekiq', anything, '-eproduction', *([anything] * 5)] }
...@@ -119,6 +127,27 @@ describe Gitlab::SidekiqCluster do ...@@ -119,6 +127,27 @@ describe Gitlab::SidekiqCluster do
end end
end end
describe '.concurrency' do
using RSpec::Parameterized::TableSyntax
where(:queue_count, :min, :max, :expected) do
2 | 0 | 0 | 3 # No min or max specified
2 | 0 | 9 | 3 # No min specified, value < max
2 | 1 | 4 | 3 # Value between min and max
2 | 4 | 5 | 4 # Value below range
5 | 2 | 3 | 3 # Value above range
2 | 1 | 1 | 1 # Value above explicit setting (min == max)
0 | 3 | 3 | 3 # Value below explicit setting (min == max)
1 | 4 | 3 | 3 # Min greater than max
end
with_them do
let(:queues) { Array.new(queue_count) }
it { expect(described_class.concurrency(queues, min, max)).to eq(expected) }
end
end
describe '.wait_async' do describe '.wait_async' do
it 'waits for a process in a separate thread' do it 'waits for a process in a separate thread' do
thread = described_class.wait_async(Process.spawn('true')) thread = described_class.wait_async(Process.spawn('true'))
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment