Commit 61b3860e authored by Dylan Griffith

Merge branch '221177-estimate-es-cluster-size' into 'master'

Advanced Search: Estimate Elasticsearch cluster size

See merge request gitlab-org/gitlab!54430
parents c589c593 bbde8d09
@@ -56,6 +56,12 @@ A few notes on CPU and storage:
to any spinning media for Elasticsearch. In testing, nodes that use SSD storage
see boosts in both query and indexing performance.
- We've introduced the [`estimate_cluster_size`](#gitlab-advanced-search-rake-tasks)
Rake task to estimate the Advanced Search storage requirements in advance, which
- The [`estimate_cluster_size`](#gitlab-advanced-search-rake-tasks) Rake task estimates the
Advanced Search storage requirements in advance. The Rake task uses total repository size
for the calculation. [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/221177) in GitLab 13.10.
Keep in mind, these are **minimum requirements** for Elasticsearch.
Heavily-used Elasticsearch clusters will likely require considerably more
resources.
@@ -421,8 +427,9 @@ The following are some available Rake tasks:
| [`sudo gitlab-rake gitlab:elastic:index_snippets`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Performs an Elasticsearch import that indexes the snippets data. |
| [`sudo gitlab-rake gitlab:elastic:projects_not_indexed`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Displays which projects are not indexed. |
| [`sudo gitlab-rake gitlab:elastic:reindex_cluster`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Schedules a zero-downtime cluster reindexing task. This feature should be used with an index that was created after GitLab 13.0. |
| [`sudo gitlab-rake gitlab:elastic:mark_reindex_failed`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)`] | Mark the most recent re-index job as failed. | | [`sudo gitlab-rake gitlab:elastic:mark_reindex_failed`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Mark the most recent re-index job as failed. |
| [`sudo gitlab-rake gitlab:elastic:list_pending_migrations`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)`] | List pending migrations. Pending migrations include those that have not yet started, have started but not finished, and those that are halted. | | [`sudo gitlab-rake gitlab:elastic:list_pending_migrations`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | List pending migrations. Pending migrations include those that have not yet started, have started but not finished, and those that are halted. |
| [`sudo gitlab-rake gitlab:elastic:estimate_cluster_size`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Get an estimate of cluster size based on the total repository size. |
### Environment variables
......
---
title: 'Advanced Search: Estimate Elasticsearch cluster size'
merge_request: 54430
author:
type: added
@@ -163,6 +163,21 @@ namespace :gitlab do
      end
    end

    desc "GitLab | Elasticsearch | Estimate Cluster size"
    task estimate_cluster_size: :environment do
      include ActionView::Helpers::NumberHelper

      total_size = Namespace::RootStorageStatistics.sum(:repository_size).to_i
      total_size_human = number_to_human_size(total_size, delimiter: ',', precision: 1, significant: false)

      estimated_cluster_size = total_size * 0.5
      estimated_cluster_size_human = number_to_human_size(estimated_cluster_size, delimiter: ',', precision: 1, significant: false)

      puts "This GitLab instance repository size is #{total_size_human}."
      puts "By our estimates for such repository size, your cluster size should be at least #{estimated_cluster_size_human}.".color(:green)
      puts 'Please note that it is possible to index only selected namespaces/projects by using Elasticsearch indexing restrictions.'
    end

    def project_id_batches(&blk)
      relation = Project.all
......
@@ -231,4 +231,18 @@ RSpec.describe 'gitlab:elastic namespace rake tasks', :elastic do
      end
    end
  end

  describe 'estimate_cluster_size' do
    subject { run_rake_task('gitlab:elastic:estimate_cluster_size') }

    before do
      create(:namespace_root_storage_statistics, repository_size: 1.megabyte)
      create(:namespace_root_storage_statistics, repository_size: 10.megabyte)
      create(:namespace_root_storage_statistics, repository_size: 30.megabyte)
    end

    it 'outputs estimates' do
      expect { subject }.to output(/your cluster size should be at least 20.5 MB/).to_stdout
    end
  end
end
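The arithmetic behind the spec's expected `20.5 MB` can be checked outside Rails. The following is only a minimal sketch, not the task's implementation: `estimate_cluster_size_bytes` and `megabytes` are hypothetical helpers introduced here for illustration, and `megabytes` is a simplified stand-in for `number_to_human_size`. The only value taken from the commit is the `0.5` multiplier applied to the total repository size.

```ruby
# Standalone sketch of the estimate; plain Ruby, no Rails required.
# ESTIMATE_RATIO mirrors the 0.5 multiplier used by the Rake task.
ESTIMATE_RATIO = 0.5
MEGABYTE = 1024 * 1024

# Hypothetical helper: sums repository sizes (in bytes) and applies the ratio.
def estimate_cluster_size_bytes(repository_sizes)
  repository_sizes.sum * ESTIMATE_RATIO
end

# Hypothetical formatter: a simplified stand-in for number_to_human_size
# that always reports megabytes with one decimal place.
def megabytes(bytes)
  format('%.1f MB', bytes / MEGABYTE.to_f)
end

# Same fixture sizes as the spec: 1 MB, 10 MB, and 30 MB.
sizes = [1, 10, 30].map { |mb| mb * MEGABYTE }
estimate = estimate_cluster_size_bytes(sizes)

puts "Repository size: #{megabytes(sizes.sum)}"
puts "Estimated cluster size: at least #{megabytes(estimate)}"
```

With the spec's fixtures the total is 41 MB, so half of it gives the 20.5 MB that the test matches against.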