-[Database](#database-replication): includes the entire application, except cache and jobs.
-[Git repositories](#repository-replication): includes both projects and wikis.
-[Uploaded blobs](#uploads-replication): includes anything from images attached on issues
to raw logs and assets from CI.
With the exception of the Database replication, on a *secondary* node, everything is coordinated
by the [Geo Log Cursor](#geo-log-cursor).
### Geo Log Cursor daemon
The [Geo Log Cursor daemon](#geo-log-cursor-daemon) is a separate process running on
each **secondary** node. It monitors the [Geo Event Log](#geo-event-log)
for new events and creates background jobs for each specific event type.
For example when a repository is updated, the Geo **primary** node creates
a Geo event with an associated repository updated event. The Geo Log Cursor daemon
picks the event up and schedules a `Geo::ProjectSyncWorker` job which will
use the `Geo::RepositorySyncService` and `Geo::WikiSyncService` classes
to update the repository and the wiki respectively.
The Geo Log Cursor daemon can operate in High Availability mode automatically.
The daemon will try to acquire a lock from time to time and once acquired, it
will behave as the *active* daemon.
Any additional running daemons on the same node, will be in standby
mode, ready to resume work if the *active* daemon releases its lock.
We use the [`ExclusiveLease`](https://www.rubydoc.info/github/gitlabhq/gitlabhq/Gitlab/ExclusiveLease) lock type with a small TTL, that is renewed at every
pooling cycle. That allows us to implement this global lock with a timeout.
At the end of the pooling cycle, if the daemon can't renew and/or reacquire
the lock, it switches to standby mode.
### Database replication
Geo uses [streaming replication](#streaming-replication) to replicate
Geo uses [streaming replication](#streaming-replication) to replicate
the database from the **primary** to the **secondary** nodes. This
the database from the **primary** to the **secondary** nodes. This
...
@@ -13,7 +51,7 @@ replication gives the **secondary** nodes access to all the data saved
...
@@ -13,7 +51,7 @@ replication gives the **secondary** nodes access to all the data saved
in the database. So users can log in on the **secondary** and read all
in the database. So users can log in on the **secondary** and read all
the issues, merge requests, etc. on the **secondary** node.
the issues, merge requests, etc. on the **secondary** node.
## Repository replication
### Repository replication
Geo also replicates repositories. Each **secondary** node keeps track of
Geo also replicates repositories. Each **secondary** node keeps track of
the state of every repository in the [tracking database](#tracking-database).
the state of every repository in the [tracking database](#tracking-database).
...
@@ -23,7 +61,7 @@ There are a few ways a repository gets replicated by the:
...
@@ -23,7 +61,7 @@ There are a few ways a repository gets replicated by the: