Commit f4eb7f77 authored by Gabriel Mazetto's avatar Gabriel Mazetto

Geo: Document how our Git push proxy works

parent aa672175
......@@ -2,10 +2,48 @@
Geo connects GitLab instances together. One GitLab instance is
designated as a **primary** node and can be run with multiple
**secondary** nodes. Geo orchestrates quite a few components that are
described in more detail below.
**secondary** nodes. Geo orchestrates quite a few components that can be seen on
the diagram below and are described in more detail within this document.
## Database replication
![Geo Architecture Diagram](../administration/geo/replication/img/geo_architecture.png)
## Replication layer
Geo handles replication for different components:
- [Database](#database-replication): includes the entire application, except cache and jobs.
- [Git repositories](#repository-replication): includes both projects and wikis.
- [Uploaded blobs](#uploads-replication): includes anything from images attached on issues
to raw logs and assets from CI.
With the exception of the Database replication, on a *secondary* node, everything is coordinated
by the [Geo Log Cursor](#geo-log-cursor).
### Geo Log Cursor daemon
The [Geo Log Cursor daemon](#geo-log-cursor-daemon) is a separate process running on
each **secondary** node. It monitors the [Geo Event Log](#geo-event-log)
for new events and creates background jobs for each specific event type.
For example when a repository is updated, the Geo **primary** node creates
a Geo event with an associated repository updated event. The Geo Log Cursor daemon
picks the event up and schedules a `Geo::ProjectSyncWorker` job which will
use the `Geo::RepositorySyncService` and `Geo::WikiSyncService` classes
to update the repository and the wiki respectively.
The Geo Log Cursor daemon can operate in High Availability mode automatically.
The daemon will try to acquire a lock from time to time and once acquired, it
will behave as the *active* daemon.
Any additional running daemons on the same node, will be in standby
mode, ready to resume work if the *active* daemon releases its lock.
We use the [`ExclusiveLease`](https://www.rubydoc.info/github/gitlabhq/gitlabhq/Gitlab/ExclusiveLease) lock type with a small TTL, that is renewed at every
pooling cycle. That allows us to implement this global lock with a timeout.
At the end of the pooling cycle, if the daemon can't renew and/or reacquire
the lock, it switches to standby mode.
### Database replication
Geo uses [streaming replication](#streaming-replication) to replicate
the database from the **primary** to the **secondary** nodes. This
......@@ -13,7 +51,7 @@ replication gives the **secondary** nodes access to all the data saved
in the database. So users can log in on the **secondary** and read all
the issues, merge requests, etc. on the **secondary** node.
## Repository replication
### Repository replication
Geo also replicates repositories. Each **secondary** node keeps track of
the state of every repository in the [tracking database](#tracking-database).
......@@ -23,7 +61,7 @@ There are a few ways a repository gets replicated by the:
- [Repository Sync worker](#repository-sync-worker).
- [Geo Log Cursor](#geo-log-cursor).
### Project Registry
#### Project Registry
The `Geo::ProjectRegistry` class defines the model used to track the
state of repository replication. For each project in the main
......@@ -32,15 +70,15 @@ database, one record in the tracking database is kept.
It records the following about repositories:
- The last time they were synced.
- The last time they were synced successfully.
- The last time they were successfully synced.
- If they need to be resynced.
- When retry should be attempted.
- When a retry should be attempted.
- The number of retries.
- If and when the they were verified.
- If and when they were verified.
It also stores these attributes for project wikis in dedicated columns.
### Repository Sync worker
#### Repository Sync worker
The `Geo::RepositorySyncWorker` class runs periodically in the
background and it searches the `Geo::ProjectRegistry` model for
......@@ -59,26 +97,12 @@ times, Geo does a so-called _redownload_. It will do a clean clone
into the `@geo-temporary` directory in the root of the storage. When
it's successful, we replace the main repo with the newly cloned one.
### Geo Log Cursor
The [Geo Log Cursor](#geo-log-cursor) is a separate process running on
each **secondary** node. It monitors the [Geo Event Log](#geo-event-log)
and handles all of the events. When it sees an unhandled event, it
starts a background worker to handle that event, depending on the type
of event.
When a repository receives an update, the Geo **primary** node creates
a Geo event with an associated repository updated event. The cursor
picks that up, and schedules a `Geo::ProjectSyncWorker` job which will
use the `Geo::RepositorySyncService` class and `Geo::WikiSyncService`
class to update the repository and the wiki.
## Uploads replication
### Uploads replication
File uploads are also being replicated to the **secondary** node. To
track the state of syncing, the `Geo::FileRegistry` model is used.
### File Registry
#### File Registry
Similar to the [Project Registry](#project-registry), there is a
`Geo::FileRegistry` model that tracks the synced uploads.
......@@ -86,7 +110,7 @@ Similar to the [Project Registry](#project-registry), there is a
CI Job Artifacts are synced in a similar way as uploads or LFS
objects, but they are tracked by `Geo::JobArtifactRegistry` model.
### File Download Dispatch worker
#### File Download Dispatch worker
Also similar to the [Repository Sync worker](#repository-sync-worker),
there is a `Geo::FileDownloadDispatchWorker` class that is run
......@@ -113,7 +137,7 @@ Authorization: GL-Geo <access_key>:<JWT payload>
```
The **primary** node uses the `access_key` field to look up the
corresponding Geo **secondary** node and decrypts the JWT payload,
corresponding **secondary** node and decrypts the JWT payload,
which contains additional information to identify the file
request. This ensures that the **secondary** node downloads the right
file for the right database ID. For example, for an LFS object, the
......@@ -133,6 +157,28 @@ NOTE: **Note:**
JWT requires synchronized clocks between the machines
involved, otherwise it may fail with an encryption error.
## Git Push to Geo secondary
The Git Push Proxy exists as a functionality built inside the `gitlab-shell` component.
It is active on a **secondary** node only. It allows the user that has cloned a repository
from the secondary node to push to the same URL.
Git `push` requests directed to a **secondary** node will be sent over to the **primary** node,
while `pull` requests will continue to be served by the **secondary** node for maximum efficiency.
HTTPS and SSH requests are handled differently:
- With HTTPS, we will give the user a `HTTP 302 Redirect` pointing to the project on the **primary** node.
The git client is wise enough to understand that status code and process the redirection.
- With SSH, because there is no equivalent way to perform a redirect, we have to proxy the request.
This is done inside [`gitlab-shell`](https://gitlab.com/gitlab-org/gitlab-shell), by first translating the request
to the HTTP protocol, and then proxying it to the **primary** node.
The [`gitlab-shell`](https://gitlab.com/gitlab-org/gitlab-shell) daemon knows when to proxy based on the response
from `/api/v4/allowed`. A special `HTTP 300` status code is returned and we execute a "custom action",
specified in the response body. The response contains additional data that allows the proxied `push` operation
to happen on the **primary** node.
## Using the Tracking Database
Along with the main database that is replicated, a Geo **secondary**
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment