Commit f73f0c7b authored by Patrick Steinhardt's avatar Patrick Steinhardt

git_access: Use batched new blobs check

The Git access checks support checking the size of pushed objects. If we
have a quarantine environment (which is always the case when access
checks are executed via `git receive-pack`'s hooks), then we simply
derive the size from the quarantined objects. Otherwise, we use
`get_new_blobs`, which does a `git rev-list $REVISION --not --all`.

Using git-rev-list(1) with `--not --all` can be a very expensive
operation depending on the repository's shape. Especially when it's a
biggish monorepo with lots of references, this query can easily take
tens of seconds. Given that we call the RPC once per change, this thus
roughly scales `O(len(changes) * len(existing_refs))`.

To improve this situation, Gitaly has implemented a new `list_blobs()`
RPC which takes a set of revisions. Like this, we can batch all calls
into a single one and thus avoid some of the overhead if there are
multiple changes at once. The new code which does that is currently
implemented behind a feature flag.

Changelog: performance
parent ab35492c
---
name: git_access_batched_changes_size
introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/64503
rollout_issue_url: https://gitlab.com/gitlab-org/gitlab/-/issues/334130
milestone: '14.1'
type: development
group: group::gitaly
default_enabled: false
...@@ -498,13 +498,23 @@ module Gitlab ...@@ -498,13 +498,23 @@ module Gitlab
end end
def check_changes_size def check_changes_size
changes_size = 0 changes_size =
if Feature.enabled?(:git_access_batched_changes_size, project, default_enabled: :yaml)
revs = ['--not', '--all', '--not']
revs += changes_list.map { |change| change[:newrev] }
changes_list.each do |change| repository.blobs(revs).sum(&:size)
changes_size += repository.new_blobs(change[:newrev]).sum(&:size) else
changes_size = 0
check_size_against_limit(changes_size) changes_list.each do |change|
end changes_size += repository.new_blobs(change[:newrev]).sum(&:size)
end
changes_size
end
check_size_against_limit(changes_size)
end end
def check_size_against_limit(size) def check_size_against_limit(size)
......
...@@ -384,11 +384,12 @@ RSpec.describe Gitlab::GitAccessSnippet do ...@@ -384,11 +384,12 @@ RSpec.describe Gitlab::GitAccessSnippet do
it_behaves_like 'a push to repository to make it over the limit' it_behaves_like 'a push to repository to make it over the limit'
end end
context 'when GIT_OBJECT_DIRECTORY_RELATIVE env var is not set' do shared_examples_for 'a change with GIT_OBJECT_DIRECTORY_RELATIVE env var unset' do
let(:change_size) { 200 } let(:change_size) { 200 }
before do before do
allow(snippet.repository).to receive(:new_blobs).and_return( stub_feature_flags(git_access_batched_changes_size: batched)
allow(snippet.repository).to receive(expected_call).and_return(
[double(:blob, size: change_size)] [double(:blob, size: change_size)]
) )
end end
...@@ -397,6 +398,20 @@ RSpec.describe Gitlab::GitAccessSnippet do ...@@ -397,6 +398,20 @@ RSpec.describe Gitlab::GitAccessSnippet do
it_behaves_like 'a push to repository below the limit' it_behaves_like 'a push to repository below the limit'
it_behaves_like 'a push to repository to make it over the limit' it_behaves_like 'a push to repository to make it over the limit'
end end
context 'when batched computation is enabled' do
let(:batched) { true }
let(:expected_call) { :blobs }
it_behaves_like 'a change with GIT_OBJECT_DIRECTORY_RELATIVE env var unset'
end
context 'when batched computation is disabled' do
let(:batched) { false }
let(:expected_call) { :new_blobs }
it_behaves_like 'a change with GIT_OBJECT_DIRECTORY_RELATIVE env var unset'
end
end end
describe 'HEAD realignment' do describe 'HEAD realignment' do
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment