Commit 3d50086e authored by Mek Stittri

Merge branch 'weimeng/move-to-availability-page' into 'master'

Rename HA page to Availability

See merge request gitlab-org/gitlab!29186
parents 09ddb099 e2df0142
---
type: reference, concepts
---
# Availability
GitLab offers high availability options for organizations that require
the fault tolerance and redundancy necessary to maintain high-uptime operations.
Please consult our [scaling documentation](../scaling) if you want to resolve
performance bottlenecks you encounter in individual GitLab components without
incurring the additional complexity costs associated with maintaining a
highly-available architecture.
On this page, we present examples of self-managed instances which demonstrate
how GitLab can be scaled out and made highly available. These examples progress
from simple to complex as scaling or highly-available components are added.
For larger setups serving 2,000 or more users, we provide
[reference architectures](../scaling/index.md#reference-architectures) based on GitLab's
experience with GitLab.com and internal scale testing that aim to achieve the
right balance of scalability and availability.
For detailed insight into how GitLab scales and configures GitLab.com, you can
watch [this 1 hour Q&A](https://www.youtube.com/watch?v=uCU8jdYzpac)
with [John Northrup](https://gitlab.com/northrup), and live questions coming
in from some of our customers.
## High availability
### Omnibus installation with automatic database failover
By adding automatic failover for database systems, we can enable higher uptime with an additional layer of complexity.
- For PostgreSQL, we provide repmgr for server cluster management and failover
and a combination of [PgBouncer](../high_availability/pgbouncer.md) and [Consul](../high_availability/consul.md) for
database client cutover.
- For Redis, we use [Redis Sentinel](../high_availability/redis.md) for server failover and client cutover.
You can also optionally run [additional Sidekiq processes on dedicated hardware](../high_availability/sidekiq.md)
and configure individual Sidekiq processes to
[process specific background job queues](../operations/extra_sidekiq_processes.md)
if you need to scale out background job processing.
### GitLab Geo
GitLab Geo allows you to replicate your GitLab instance to other geographical locations as a read-only fully operational instance that can also be promoted in case of disaster.
This configuration is supported in [GitLab Premium and Ultimate](https://about.gitlab.com/pricing/).
References:
- [Geo Documentation](../geo/replication/index.md)
- [GitLab Geo with a highly available configuration](../geo/replication/high_availability.md)
## GitLab components and configuration instructions
The GitLab application depends on the following [components](../../development/architecture.md#component-diagram).
It can also depend on several third party services depending on
your environment setup. Here we'll detail both in the order in which
you would typically configure them along with our recommendations for
their use and configuration.
### Third party services
Here are some details of several third party services a typical environment
will depend on. The services can be provided by numerous applications
or providers and further advice can be given on how best to select.
These should be configured first, before the [GitLab components](#gitlab-components).
| Component | Description | Configuration instructions |
|--------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------|
| [Load Balancer(s)](../high_availability/load_balancer.md)[^6] | Handles load balancing for the GitLab nodes where required | [Load balancer HA configuration](../high_availability/load_balancer.md) |
| [Cloud Object Storage service](../high_availability/object_storage.md)[^4] | Recommended store for shared data objects | [Cloud Object Storage configuration](../high_availability/object_storage.md) |
| [NFS](../high_availability/nfs.md)[^5] [^7] | Shared disk storage service. Can be used as an alternative for Gitaly or Object Storage. Required for GitLab Pages | [NFS configuration](../high_availability/nfs.md) |
### GitLab components
Next are all of the components provided directly by GitLab. As mentioned
earlier, they are presented in the typical order you would configure
them.
| Component | Description | Configuration instructions |
|---------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------|---------------------------------------------------------------|
| [Consul](../../development/architecture.md#consul)[^3] | Service discovery and health checks/failover | [Consul HA configuration](../high_availability/consul.md) **(PREMIUM ONLY)** |
| [PostgreSQL](../../development/architecture.md#postgresql) | Database | [Database HA configuration](../high_availability/database.md) |
| [PgBouncer](../../development/architecture.md#pgbouncer) | Database Pool Manager | [PgBouncer HA configuration](../high_availability/pgbouncer.md) **(PREMIUM ONLY)** |
| [Redis](../../development/architecture.md#redis)[^3] with Redis Sentinel | Key/Value store for shared data with HA watcher service | [Redis HA configuration](../high_availability/redis.md) |
| [Gitaly](../../development/architecture.md#gitaly)[^2] [^5] [^7] | Recommended high-level storage for Git repository data | [Gitaly HA configuration](../high_availability/gitaly.md) |
| [Sidekiq](../../development/architecture.md#sidekiq) | Asynchronous/Background jobs | [Sidekiq configuration](../high_availability/sidekiq.md) |
| [GitLab application nodes](../../development/architecture.md#unicorn)[^1] | (Unicorn / Puma, Workhorse) - Web-requests (UI, API, Git over HTTP) | [GitLab app HA/scaling configuration](../high_availability/gitlab.md) |
| [Prometheus](../../development/architecture.md#prometheus) and [Grafana](../../development/architecture.md#grafana) | GitLab environment monitoring | [Monitoring node for scaling/HA](../high_availability/monitoring_node.md) |
In some cases, components can be combined on the same nodes to reduce complexity as well.
[^1]: In our architectures we run each GitLab Rails node using the Puma webserver
and have its number of workers set to 90% of available CPUs along with 4 threads.
[^2]: Gitaly node requirements are dependent on customer data, specifically the number of
projects and their sizes. We recommend 2 nodes as an absolute minimum for HA environments
and at least 4 nodes should be used when supporting 50,000 or more users.
We also recommend that each Gitaly node should store no more than 5TB of data
and have the number of [`gitaly-ruby` workers](../gitaly/index.md#gitaly-ruby)
set to 20% of available CPUs. Additional nodes should be considered in conjunction
with a review of expected data size and spread based on the recommendations above.
[^3]: Recommended Redis setup differs depending on the size of the architecture.
For smaller architectures (up to 5,000 users) we suggest one Redis cluster for all
classes and that Redis Sentinel is hosted alongside Consul.
For larger architectures (10,000 users or more) we suggest running a separate
[Redis Cluster](../high_availability/redis.md#running-multiple-redis-clusters) for the Cache class
and another for the Queues and Shared State classes respectively. We also recommend
that you run the Redis Sentinel clusters separately as well for each Redis Cluster.
[^4]: For data objects such as LFS, Uploads, and Artifacts, we recommend a [Cloud Object Storage service](../object_storage.md)
over NFS where possible, due to better performance and availability.
[^5]: NFS can be used as an alternative for both repository data (replacing Gitaly) and
object storage but this isn't typically recommended for performance reasons. Note however it is required for
[GitLab Pages](https://gitlab.com/gitlab-org/gitlab-pages/issues/196).
[^6]: Our architectures have been tested and validated with [HAProxy](https://www.haproxy.org/)
as the load balancer. Other reputable load balancers with similar feature sets
should also work, but be aware that these haven't been validated.
[^7]: We strongly recommend that any Gitaly and / or NFS nodes are set up with SSD disks over
HDD with a throughput of at least 8,000 IOPS for read operations and 2,000 IOPS for write
as these components have heavy I/O. These IOPS values are recommended only as a starter
as with time they may be adjusted higher or lower depending on the scale of your
environment's workload. If you're running the environment on a Cloud provider
you may need to refer to their documentation on how to configure IOPS correctly.
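As a rough illustration of the sizing guidance in footnotes 1 to 3, the following Python sketch turns those percentages and thresholds into numbers. The function names are hypothetical, rounding down (with a floor of one worker) is an assumption that the footnotes do not specify, and the range between 5,000 and 10,000 users is left undecided for Redis because the text gives no recommendation for it.

```python
import math
from typing import Optional


def puma_settings(cpus: int) -> dict:
    """Footnote 1: Puma workers at 90% of available CPUs, with 4 threads.

    Rounding down and never going below 1 worker are assumptions.
    """
    return {"workers": max(1, cpus * 9 // 10), "threads": 4}


def gitaly_ruby_workers(cpus: int) -> int:
    """Footnote 2: gitaly-ruby workers at 20% of available CPUs.

    Rounding down and never going below 1 worker are assumptions.
    """
    return max(1, cpus * 20 // 100)


def min_gitaly_nodes(users: int, repo_data_tb: float) -> int:
    """Footnote 2: at least 2 nodes for HA, at least 4 for 50,000 or more
    users, and no more than 5 TB of data stored per node."""
    baseline = 4 if users >= 50_000 else 2
    by_capacity = math.ceil(repo_data_tb / 5)
    return max(baseline, by_capacity)


def redis_layout(users: int) -> Optional[str]:
    """Footnote 3: suggested Redis layout by architecture size.

    Returns None between 5,000 and 10,000 users, where the footnote
    makes no recommendation.
    """
    if users <= 5_000:
        return "one Redis cluster for all classes, Sentinel alongside Consul"
    if users >= 10_000:
        return "separate clusters for Cache and for Queues/Shared State, each with its own Sentinel cluster"
    return None


if __name__ == "__main__":
    print(puma_settings(8))
    print(gitaly_ruby_workers(16))
    print(min_gitaly_nodes(users=10_000, repo_data_tb=12))
    print(redis_layout(2_000))
```

Under these assumptions, `puma_settings(8)` yields 7 workers with 4 threads. Treat the results as starting points to review against your own data size and workload, not as validated values.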
......@@ -113,4 +113,4 @@ Validating file hooks from /plugins directory
[system hooks]: ../system_hooks/system_hooks.md
[webhooks]: ../user/project/integrations/webhooks.md
[highly available]: ./high_availability/README.md
[highly available]: ./availability/index.md
......@@ -129,7 +129,7 @@ To configure the connection to the external read-replica database and enable Log
database to keep track of replication status and automatically recover from
potential replication issues. Omnibus automatically configures a tracking database
when `roles ['geo_secondary_role']` is set. For high availability,
refer to [Geo High Availability](../../high_availability/README.md).
refer to [Geo High Availability](../../availability/index.md).
If you want to run this database external to Omnibus, please follow the instructions below.
The tracking database requires an [FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html)
......
......@@ -47,12 +47,12 @@ It is possible to use cloud hosted services for PostgreSQL and Redis, but this i
## Prerequisites: Two working GitLab HA clusters
One cluster will serve as the **primary** node. Use the
[GitLab HA documentation](../../high_availability/README.md) to set this up. If
[GitLab HA documentation](../../availability/index.md) to set this up. If
you already have a working GitLab instance that is in-use, it can be used as a
**primary**.
The second cluster will serve as the **secondary** node. Again, use the
[GitLab HA documentation](../../high_availability/README.md) to set this up.
[GitLab HA documentation](../../availability/index.md) to set this up.
It's a good idea to log in and test it, however, note that its data will be
wiped out as part of the process of replicating from the **primary**.
......@@ -371,7 +371,7 @@ more information.
The minimal reference architecture diagram above shows all application services
running together on the same machines. However, for high availability we
[strongly recommend running all services separately](../../high_availability/README.md).
[strongly recommend running all services separately](../../availability/index.md).
For example, a Sidekiq server could be configured similarly to the frontend
application servers above, with some changes to run only the `sidekiq` service:
......
......@@ -2,7 +2,7 @@
> - Introduced in GitLab Enterprise Edition 8.9.
> - Using Geo in combination with
> [High Availability](../../high_availability/README.md)
> [High Availability](../../availability/index.md)
> is considered **Generally Available** (GA) in
> [GitLab Premium](https://about.gitlab.com/pricing/) 10.4.
......
......@@ -4,137 +4,4 @@ type: reference, concepts
# High Availability
GitLab offers high availability options for organizations that require
the fault tolerance and redundancy necessary to maintain high-uptime operations.
Please consult our [scaling documentation](../scaling) if you want to resolve
performance bottlenecks you encounter in individual GitLab components without
incurring the additional complexity costs associated with maintaining a
highly-available architecture.
On this page, we present examples of self-managed instances which demonstrate
how GitLab can be scaled out and made highly available. These examples progress
from simple to complex as scaling or highly-available components are added.
For larger setups serving 2,000 or more users, we provide
[reference architectures](../scaling/index.md#reference-architectures) based on GitLab's
experience with GitLab.com and internal scale testing that aim to achieve the
right balance of scalability and availability.
For detailed insight into how GitLab scales and configures GitLab.com, you can
watch [this 1 hour Q&A](https://www.youtube.com/watch?v=uCU8jdYzpac)
with [John Northrup](https://gitlab.com/northrup), and live questions coming
in from some of our customers.
## Examples
### Omnibus installation with automatic database failover
By adding automatic failover for database systems, we can enable higher uptime with an additional layer of complexity.
- For PostgreSQL, we provide repmgr for server cluster management and failover
and a combination of [PgBouncer](pgbouncer.md) and [Consul](consul.md) for
database client cutover.
- For Redis, we use [Redis Sentinel](redis.md) for server failover and client cutover.
You can also optionally run [additional Sidekiq processes on dedicated hardware](sidekiq.md)
and configure individual Sidekiq processes to
[process specific background job queues](../operations/extra_sidekiq_processes.md)
if you need to scale out background job processing.
### GitLab Geo
GitLab Geo allows you to replicate your GitLab instance to other geographical locations as a read-only fully operational instance that can also be promoted in case of disaster.
This configuration is supported in [GitLab Premium and Ultimate](https://about.gitlab.com/pricing/).
References:
- [Geo Documentation](../geo/replication/index.md)
- [GitLab Geo with a highly available configuration](../geo/replication/high_availability.md)
## GitLab components and configuration instructions
The GitLab application depends on the following [components](../../development/architecture.md#component-diagram).
It can also depend on several third party services depending on
your environment setup. Here we'll detail both in the order in which
you would typically configure them along with our recommendations for
their use and configuration.
### Third party services
Here are some details of several third party services a typical environment
will depend on. The services can be provided by numerous applications
or providers and further advice can be given on how best to select.
These should be configured first, before the [GitLab components](#gitlab-components).
| Component | Description | Configuration instructions |
|--------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------|
| [Load Balancer(s)](load_balancer.md)[^6] | Handles load balancing for the GitLab nodes where required | [Load balancer HA configuration](load_balancer.md) |
| [Cloud Object Storage service](object_storage.md)[^4] | Recommended store for shared data objects | [Cloud Object Storage configuration](object_storage.md) |
| [NFS](nfs.md)[^5] [^7] | Shared disk storage service. Can be used as an alternative for Gitaly or Object Storage. Required for GitLab Pages | [NFS configuration](nfs.md) |
### GitLab components
Next are all of the components provided directly by GitLab. As mentioned
earlier, they are presented in the typical order you would configure
them.
| Component | Description | Configuration instructions |
|---------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------|---------------------------------------------------------------|
| [Consul](../../development/architecture.md#consul)[^3] | Service discovery and health checks/failover | [Consul HA configuration](consul.md) **(PREMIUM ONLY)** |
| [PostgreSQL](../../development/architecture.md#postgresql) | Database | [Database HA configuration](database.md) |
| [PgBouncer](../../development/architecture.md#pgbouncer) | Database Pool Manager | [PgBouncer HA configuration](pgbouncer.md) **(PREMIUM ONLY)** |
| [Redis](../../development/architecture.md#redis)[^3] with Redis Sentinel | Key/Value store for shared data with HA watcher service | [Redis HA configuration](redis.md) |
| [Gitaly](../../development/architecture.md#gitaly)[^2] [^5] [^7] | Recommended high-level storage for Git repository data | [Gitaly HA configuration](gitaly.md) |
| [Sidekiq](../../development/architecture.md#sidekiq) | Asynchronous/Background jobs | [Sidekiq configuration](sidekiq.md) |
| [GitLab application nodes](../../development/architecture.md#unicorn)[^1] | (Unicorn / Puma, Workhorse) - Web-requests (UI, API, Git over HTTP) | [GitLab app HA/scaling configuration](gitlab.md) |
| [Prometheus](../../development/architecture.md#prometheus) and [Grafana](../../development/architecture.md#grafana) | GitLab environment monitoring | [Monitoring node for scaling/HA](monitoring_node.md) |
In some cases, components can be combined on the same nodes to reduce complexity as well.
[^1]: In our architectures we run each GitLab Rails node using the Puma webserver
and have its number of workers set to 90% of available CPUs along with 4 threads.
[^2]: Gitaly node requirements are dependent on customer data, specifically the number of
projects and their sizes. We recommend 2 nodes as an absolute minimum for HA environments
and at least 4 nodes should be used when supporting 50,000 or more users.
We also recommend that each Gitaly node should store no more than 5TB of data
and have the number of [`gitaly-ruby` workers](../gitaly/index.md#gitaly-ruby)
set to 20% of available CPUs. Additional nodes should be considered in conjunction
with a review of expected data size and spread based on the recommendations above.
[^3]: Recommended Redis setup differs depending on the size of the architecture.
For smaller architectures (up to 5,000 users) we suggest one Redis cluster for all
classes and that Redis Sentinel is hosted alongside Consul.
For larger architectures (10,000 users or more) we suggest running a separate
[Redis Cluster](redis.md#running-multiple-redis-clusters) for the Cache class
and another for the Queues and Shared State classes respectively. We also recommend
that you run the Redis Sentinel clusters separately as well for each Redis Cluster.
[^4]: For data objects such as LFS, Uploads, and Artifacts, we recommend a [Cloud Object Storage service](object_storage.md)
over NFS where possible, due to better performance and availability.
[^5]: NFS can be used as an alternative for both repository data (replacing Gitaly) and
object storage but this isn't typically recommended for performance reasons. Note however it is required for
[GitLab Pages](https://gitlab.com/gitlab-org/gitlab-pages/issues/196).
[^6]: Our architectures have been tested and validated with [HAProxy](https://www.haproxy.org/)
as the load balancer. Other reputable load balancers with similar feature sets
should also work, but be aware that these haven't been validated.
[^7]: We strongly recommend that any Gitaly and / or NFS nodes are set up with SSD disks over
HDD with a throughput of at least 8,000 IOPS for read operations and 2,000 IOPS for write
as these components have heavy I/O. These IOPS values are recommended only as a starter
as with time they may be adjusted higher or lower depending on the scale of your
environment's workload. If you're running the environment on a Cloud provider
you may need to refer to their documentation on how to configure IOPS correctly.
[^8]: The architectures were built and tested with the [Intel Xeon E5 v3 (Haswell)](https://cloud.google.com/compute/docs/cpu-platforms)
CPU platform on GCP. On different hardware you may find that adjustments, either lower
or higher, are required for your CPU or Node counts accordingly. For more information, a
[Sysbench](https://github.com/akopytov/sysbench) benchmark of the CPU can be found
[here](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Reference-Architectures/GCP-CPU-Benchmarks).
[^9]: AWS-equivalent configurations are rough suggestions and may change in the
future. They have not yet been tested and validated.
This content has been moved to the [availability page](../availability/index.md).
......@@ -24,7 +24,7 @@ If you use a cloud-managed service, or provide your own PostgreSQL:
## PostgreSQL in a Scaled and Highly Available Environment
This section is relevant for [Scalable and Highly Available Setups](README.md).
This section is relevant for [Scalable and Highly Available Setups](../scaling/index.md).
### Provide your own PostgreSQL instance **(CORE ONLY)**
......
......@@ -11,7 +11,7 @@ should consider using Gitaly on a separate node.
See the [Gitaly HA Epic](https://gitlab.com/groups/gitlab-org/-/epics/289) to
track plans and progress toward high availability support.
This document is relevant for [Scalable and Highly Available Setups](README.md).
This document is relevant for [Scalable and Highly Available Setups](../scaling/index.md).
## Running Gitaly on its own server
......@@ -19,7 +19,7 @@ See [Running Gitaly on its own server](../gitaly/index.md#running-gitaly-on-its-
in Gitaly documentation.
Continue configuration of other components by going back to the
[Scaling and High Availability](README.md#gitlab-components-and-configuration-instructions) page.
[High Availability](../availability/index.md#gitlab-components-and-configuration-instructions) page.
## Enable Monitoring
......
......@@ -11,7 +11,7 @@ You can configure a Prometheus node to monitor GitLab.
## Standalone Monitoring node using GitLab Omnibus
The GitLab Omnibus package can be used to configure a standalone Monitoring node running [Prometheus](../monitoring/prometheus/index.md) and [Grafana](../monitoring/performance/grafana_configuration.md).
The monitoring node is not highly available. See [Scaling and High Availability](README.md)
The monitoring node is not highly available. See [Scaling and High Availability](../scaling/index.md)
for an overview of GitLab scaling and high availability options.
The steps below are the minimum necessary to configure a Monitoring node running Prometheus and Grafana with
......
......@@ -22,7 +22,7 @@ These will be necessary when configuring the GitLab application servers later.
## Redis in a Scaled and Highly Available Environment
This section is relevant for [Scalable and Highly Available Setups](README.md).
This section is relevant for [Scalable and Highly Available Setups](../scaling/index.md).
### Provide your own Redis instance **(CORE ONLY)**
......@@ -38,7 +38,7 @@ In this configuration Redis is not highly available, and represents a single
point of failure. However, in a scaled environment the objective is to allow
the environment to handle more users or to increase throughput. Redis itself
is generally stable and can handle many requests so it is an acceptable
trade off to have only a single instance. See [Scaling and High Availability](README.md)
trade off to have only a single instance. See [High Availability](../availability/index.md)
for an overview of GitLab scaling and high availability options.
The steps below are the minimum necessary to configure a Redis server with
......@@ -84,7 +84,7 @@ Advanced configuration options are supported and can be added if
needed.
Continue configuration of other components by going back to the
[Scaling and High Availability](README.md#gitlab-components-and-configuration-instructions) page.
[High Availability](../availability/index.md#gitlab-components-and-configuration-instructions) page.
### High Availability with GitLab Omnibus **(PREMIUM ONLY)**
......
......@@ -34,7 +34,7 @@ Learn how to install, configure, update, and maintain your GitLab instance.
- [Install](../install/README.md): Requirements, directory structures, and installation methods.
- [Database load balancing](database_load_balancing.md): Distribute database queries among multiple database servers. **(STARTER ONLY)**
- [Omnibus support for log forwarding](https://docs.gitlab.com/omnibus/settings/logs.html#udp-log-shipping-gitlab-enterprise-edition-only) **(STARTER ONLY)**
- [High Availability](high_availability/README.md): Configure multiple servers for scaling or high availability.
- [High Availability](availability/index.md): Configure multiple servers for scaling or high availability.
- [Installing GitLab HA on Amazon Web Services (AWS)](../install/aws/index.md): Set up GitLab High Availability on Amazon AWS.
- [Geo](geo/replication/index.md): Replicate your GitLab instance to other geographic locations as a read-only fully operational version. **(PREMIUM ONLY)**
- [Disaster Recovery](geo/disaster_recovery/index.md): Quickly fail-over to a different site with minimal effort in a disaster situation. **(PREMIUM ONLY)**
......
......@@ -77,9 +77,8 @@ with the Fog library that GitLab uses. Symptoms include:
### GitLab Pages requires NFS
If you're working to add more GitLab servers for [scaling](scaling/index.md) or
[fault tolerance](high_availability/README.md) and one of your requirements
is [GitLab Pages](../user/project/pages/index.md) this currently requires
If you're working to add more GitLab servers for [scaling or fault tolerance](scaling/index.md)
and one of your requirements is [GitLab Pages](../user/project/pages/index.md) this currently requires
NFS. There is [work in progress](https://gitlab.com/gitlab-org/gitlab-pages/issues/196)
to remove this dependency. In the future, GitLab Pages may use
[object storage](https://gitlab.com/gitlab-org/gitlab/-/issues/208135).
......
......@@ -8,7 +8,7 @@ GitLab supports a number of scaling options to ensure that your self-managed
instance is able to scale out to meet your organization's needs when scaling up
a single-box GitLab installation is no longer practical or feasible.
Please consult our [high availability documentation](../high_availability/README.md)
Please consult our [high availability documentation](../availability/index.md)
if your organization requires fault tolerance and redundancy features, such as
automatic database system failover.
......
......@@ -22,7 +22,7 @@ There are many ways you can install GitLab depending on your platform:
TIP: **If in doubt, choose Omnibus:**
The Omnibus GitLab packages are mature, scalable, support
[high availability](../administration/high_availability/README.md) and are used
[high availability](../administration/availability/index.md) and are used
today on GitLab.com. The Helm charts are recommended for those who are familiar
with Kubernetes.
......@@ -36,7 +36,7 @@ The Omnibus GitLab package uses our official deb/rpm repositories. This is
recommended for most users.
If you need additional flexibility and resilience, we recommend deploying
GitLab as described in our [High Availability documentation](../administration/high_availability/README.md).
GitLab as described in our [High Availability documentation](../administration/availability/index.md).
[**> Install GitLab using the Omnibus GitLab package.**](https://about.gitlab.com/install/)
......
......@@ -719,7 +719,7 @@ Have a read through these other resources and feel free to
[open an issue](https://gitlab.com/gitlab-org/gitlab/issues/new)
to request additional material:
- [GitLab High Availability](../../administration/high_availability/README.md):
- [Scaling GitLab](../../administration/scaling/index.md):
GitLab supports several different types of clustering and high-availability.
- [Geo replication](../../administration/geo/replication/index.md):
Geo is the solution for widely distributed development teams.
......