Commit af2a926a authored by Mike Kozono's avatar Mike Kozono

Geo: Treat missing files as sync failures

This is a small but significant design change to files/blobs replicated
by the Geo Self-Service Framework.

- Before: When a file is missing on the primary, and a secondary
attempts to sync it, the secondary considers it "synced", since its
state matches the primary.
- After: With this change, the secondary considers the sync "failed",
since it was unable to sync the file and this is an undesirable state.

Other implications:

For blob types which have Geo verification enabled, this change
short-circuits a logical loop in which sync succeeds but verification
fails and then sync becomes failed and then sync gets retried.

This loop affects blobs replicated *and* verified by the Geo
Self-Service Framework, including:

- Package Files
- Terraform State Versions
- Pipeline Artifacts

And soon to include:

- LFS Objects
- Pages Deployments
- Uploads
- CI Job Artifacts

This change is feature flagged behind
`geo_treat_missing_files_as_sync_failed` so we can test in staging. Also
I intend to enable it by default before removing it. Therefore customers
will be able to easily switch back to old behavior for a whole
milestone, in case of any unforeseen problems with this design change.

Changelog: changed
EE: true
parent 4831b598
......@@ -31,7 +31,11 @@ module Geo
download_result = ::Gitlab::Geo::Replication::BlobDownloader.new(replicator: @replicator).execute
mark_as_synced = download_result.success || download_result.primary_missing_file
mark_as_synced = download_result.success
if download_result.primary_missing_file && Feature.disabled?(:geo_treat_missing_files_as_sync_failed, default_enabled: :yaml)
mark_as_synced = true
end
if mark_as_synced
registry.synced!
......
---
name: geo_treat_missing_files_as_sync_failed
introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/76801
rollout_issue_url: https://gitlab.com/gitlab-org/gitlab/-/issues/348590
milestone: '14.6'
type: development
group: group::geo
default_enabled: false
......@@ -46,18 +46,58 @@ RSpec.describe Geo::BlobDownloadService do
end
context "when the downloader returns failure" do
let(:result) { double(:result, success: false, primary_missing_file: false, bytes_downloaded: 123, reason: "foo", extra_details: nil) }
context "when the file is not missing on the primary" do
let(:result) { double(:result, success: false, primary_missing_file: false, bytes_downloaded: 123, reason: "foo", extra_details: nil) }
it "creates the registry" do
expect do
it "creates the registry" do
expect do
subject.execute
end.to change { registry_class.count }.by(1)
end
it "sets sync state to failed" do
subject.execute
end.to change { registry_class.count }.by(1)
expect(registry_class.last).to be_failed
end
end
it "sets sync state to failed" do
subject.execute
context "when the file is missing on the primary" do
context "when the feature flag geo_treat_missing_files_as_sync_failed is enabled" do
let(:result) { double(:result, success: false, primary_missing_file: true, bytes_downloaded: 123, reason: "foo", extra_details: nil) }
it "creates the registry" do
expect do
subject.execute
end.to change { registry_class.count }.by(1)
end
it "sets sync state to failed" do
subject.execute
expect(registry_class.last).to be_failed
end
end
context "when the feature flag geo_treat_missing_files_as_sync_failed is disabled" do
let(:result) { double(:result, success: false, primary_missing_file: true, bytes_downloaded: 123, reason: "foo", extra_details: nil) }
before do
stub_feature_flags(geo_treat_missing_files_as_sync_failed: false)
end
it "creates the registry" do
expect do
subject.execute
end.to change { registry_class.count }.by(1)
end
it "sets sync state to synced" do
subject.execute
expect(registry_class.last).to be_failed
expect(registry_class.last).to be_synced
end
end
end
end
end
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment