Merge branch 'weimeng-master-patch-79445' into 'master'

Remove legacy content from Availability page

See merge request gitlab-org/gitlab!29878

Commit d60e8d68 authored Apr 21, 2020 by Achilleas Pipinellis
4 changed files
--- a/doc/administration/availability/index.md
+++ b/doc/administration/availability/index.md
@@ -4,136 +4,10 @@ type: reference, concepts
 # Availability
-GitLab offers high availability options for organizations that require
-the fault tolerance and redundancy necessary to maintain high-uptime operations.
-Please consult our [scaling documentation](../scaling) if you want to resolve
-performance bottlenecks you encounter in individual GitLab components without
-incurring the additional complexity costs associated with maintaining a
-highly-available architecture.
-On this page, we present examples of self-managed instances which demonstrate
-how GitLab can be scaled out and made highly available. These examples progress
-from simple to complex as scaling or highly-available components are added.
-For larger setups serving 2,000 or more users, we provide
-[reference architectures](../scaling/index.md#reference-architectures) based on GitLab's
-experience with GitLab.com and internal scale testing that aim to achieve the
-right balance of scalability and availability.
-For detailed insight into how GitLab scales and configures GitLab.com, you can
-watch [this 1 hour Q&A](https://www.youtube.com/watch?v=uCU8jdYzpac)
-with [John Northrup](https://gitlab.com/northrup), and live questions coming
-in from some of our customers.
 GitLab offers a number of options to manage availability and resiliency. Below are the options to consider with trade-offs.
 | Event | GitLab Feature | Recovery Point Objective (RPO) | Recovery Time Objective (RTO) | Cost |
 | ----- | -------------- | --- | --- | ---- |
 | Availability Zone failure | "GitLab HA" | No loss | No loss | 2x Git storage, multiple nodes balanced across AZ's |
-| Region failure | "GitLab Disaster Recovery" | 5-10 minutes | 30 minutes | 2x primary cost |
+| Region failure | [GitLab Geo Disaster Recovery](../geo/disaster_recovery/index.md) | 5-10 minutes | 30 minutes | 2x primary cost |
 | All failures | Backup/Restore | Last backup | Hours to Days | Cost of storing the backups |
-## High availability
-### Omnibus installation with automatic database failover
-By adding automatic failover for database systems, we can enable higher uptime with an additional layer of complexity.
- For PostgreSQL, we provide repmgr for server cluster management and failover
-  and a combination of [PgBouncer](../high_availability/pgbouncer.md) and [Consul](../high_availability/consul.md) for
-  database client cutover.
- For Redis, we use [Redis Sentinel](../high_availability/redis.md) for server failover and client cutover.
-You can also optionally run [additional Sidekiq processes on dedicated hardware](../high_availability/sidekiq.md)
-and configure individual Sidekiq processes to
-[process specific background job queues](../operations/extra_sidekiq_processes.md)
-if you need to scale out background job processing.
-### GitLab Geo
-GitLab Geo allows you to replicate your GitLab instance to other geographical locations as a read-only fully operational instance that can also be promoted in case of disaster.
-This configuration is supported in [GitLab Premium and Ultimate](https://about.gitlab.com/pricing/).
-References:
- [Geo Documentation](../geo/replication/index.md)
- [GitLab Geo with a highly available configuration](../geo/replication/high_availability.md)
-## GitLab components and configuration instructions
-The GitLab application depends on the following [components](../../development/architecture.md#component-diagram).
-It can also depend on several third party services depending on
-your environment setup. Here we'll detail both in the order in which
-you would typically configure them along with our recommendations for
-their use and configuration.
-### Third party services
-Here's some details of several third party services a typical environment
-will depend on. The services can be provided by numerous applications
-or providers and further advice can be given on how best to select.
-These should be configured first, before the [GitLab components](#gitlab-components).
-| Component                                              | Description                                                                                                         | Configuration instructions                              |
-|--------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------|
-| [Load Balancer(s)](../high_availability/load_balancer.md)[^6]               | Handles load balancing for the GitLab nodes where required                                                          | [Load balancer HA configuration](../high_availability/load_balancer.md)      |
-| [Cloud Object Storage service](../high_availability/object_storage.md)[^4]  | Recommended store for shared data objects                                                                           | [Cloud Object Storage configuration](../high_availability/object_storage.md) |
-| [NFS](../high_availability/nfs.md)[^5] [^7]                                 | Shared disk storage service. Can be used as an alternative for Gitaly or Object Storage. Required for GitLab Pages  | [NFS configuration](../high_availability/nfs.md)                             |
-### GitLab components
-Next are all of the components provided directly by GitLab. As mentioned
-earlier, they are presented in the typical order you would configure
-them.
-| Component                                                                                                           | Description                                                         | Configuration instructions                                    |
-|---------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------|---------------------------------------------------------------|
-| [Consul](../../development/architecture.md#consul)[^3]                                                              | Service discovery and health checks/failover                        | [Consul HA configuration](../high_availability/consul.md) **(PREMIUM ONLY)**       |
-| [PostgreSQL](../../development/architecture.md#postgresql)                                                          | Database                                                            | [Database HA configuration](../high_availability/database.md)                      |
-| [PgBouncer](../../development/architecture.md#pgbouncer)                                                            | Database Pool Manager                                               | [PgBouncer HA configuration](../high_availability/pgbouncer.md) **(PREMIUM ONLY)** |
-| [Redis](../../development/architecture.md#redis)[^3] with Redis Sentinel                                            | Key/Value store for shared data with HA watcher service             | [Redis HA configuration](../high_availability/redis.md)                            |
-| [Gitaly](../../development/architecture.md#gitaly)[^2] [^5] [^7]                                                    | Recommended high-level storage for Git repository data              | [Gitaly HA configuration](../high_availability/gitaly.md)                          |
-| [Sidekiq](../../development/architecture.md#sidekiq)                                                                | Asynchronous/Background jobs                                        | [Sidekiq configuration](../high_availability/sidekiq.md)                           |
-| [GitLab application nodes](../../development/architecture.md#unicorn)[^1]                                           | (Unicorn / Puma, Workhorse) - Web-requests (UI, API, Git over HTTP) | [GitLab app HA/scaling configuration](../high_availability/gitlab.md)              |
-| [Prometheus](../../development/architecture.md#prometheus) and [Grafana](../../development/architecture.md#grafana) | GitLab environment monitoring                                       | [Monitoring node for scaling/HA](../high_availability/monitoring_node.md)          |
-In some cases, components can be combined on the same nodes to reduce complexity as well.
-[^1]: In our architectures we run each GitLab Rails node using the Puma webserver
-      and have its number of workers set to 90% of available CPUs along with 4 threads.
-[^2]: Gitaly node requirements are dependent on customer data, specifically the number of
-      projects and their sizes. We recommend 2 nodes as an absolute minimum for HA environments
-      and at least 4 nodes should be used when supporting 50,000 or more users.
-      We also recommend that each Gitaly node should store no more than 5TB of data
-      and have the number of [`gitaly-ruby` workers](../gitaly/index.md#gitaly-ruby)
-      set to 20% of available CPUs. Additional nodes should be considered in conjunction
-      with a review of expected data size and spread based on the recommendations above.
-[^3]: Recommended Redis setup differs depending on the size of the architecture.
-      For smaller architectures (up to 5,000 users) we suggest one Redis cluster for all
-      classes and that Redis Sentinel is hosted alongside Consul.
-      For larger architectures (10,000 users or more) we suggest running a separate
-      [Redis Cluster](../high_availability/redis.md#running-multiple-redis-clusters) for the Cache class
-      and another for the Queues and Shared State classes respectively. We also recommend
-      that you run the Redis Sentinel clusters separately as well for each Redis Cluster.
-[^4]: For data objects such as LFS, Uploads, Artifacts, etc. We recommend a [Cloud Object Storage service](../object_storage.md)
-      over NFS where possible, due to better performance and availability.
-[^5]: NFS can be used as an alternative for both repository data (replacing Gitaly) and
-      object storage but this isn't typically recommended for performance reasons. Note however it is required for
-      [GitLab Pages](https://gitlab.com/gitlab-org/gitlab-pages/issues/196).
-[^6]: Our architectures have been tested and validated with [HAProxy](https://www.haproxy.org/)
-      as the load balancer. However other reputable load balancers with similar feature sets
-      should also work instead but be aware these aren't validated.
-[^7]: We strongly recommend that any Gitaly and / or NFS nodes are set up with SSD disks over
-      HDD with a throughput of at least 8,000 IOPS for read operations and 2,000 IOPS for write
-      as these components have heavy I/O. These IOPS values are recommended only as a starter
-      as with time they may be adjusted higher or lower depending on the scale of your
-      environment's workload. If you're running the environment on a Cloud provider
-      you may need to refer to their documentation on how configure IOPS correctly.
--- a/doc/administration/high_availability/gitaly.md
+++ b/doc/administration/high_availability/gitaly.md
@@ -19,7 +19,7 @@ See [Running Gitaly on its own server](../gitaly/index.md#running-gitaly-on-its-
 in Gitaly documentation.
 Continue configuration of other components by going back to the
-[High Availability](../availability/index.md#gitlab-components-and-configuration-instructions) page.
+[Scaling](../scaling/index.md#components-provided-by-omnibus-gitlab) page.
 ## Enable Monitoring

--- a/doc/administration/high_availability/redis.md
+++ b/doc/administration/high_availability/redis.md
@@ -89,7 +89,7 @@ Advanced configuration options are supported and can be added if
 needed.
 Continue configuration of other components by going back to the
-[High Availability](../availability/index.md#gitlab-components-and-configuration-instructions) page.
+[Scaling](../scaling/index.md#components-provided-by-omnibus-gitlab) page.
 ### High Availability with GitLab Omnibus **(PREMIUM ONLY)**

--- a/doc/administration/scaling/index.md
+++ b/doc/administration/scaling/index.md
@@ -5,40 +5,17 @@ type: reference, concepts
 # Scaling
 GitLab supports a number of scaling options to ensure that your self-managed
-instance is able to scale out to meet your organization's needs when scaling up
-a single-box GitLab installation is no longer practical or feasible.
+instance is able to scale to meet your organization's needs.
-Please consult our [high availability documentation](../availability/index.md)
-if your organization requires fault tolerance and redundancy features, such as
-automatic database system failover.
+On this page, we present examples of self-managed instances which demonstrate
+how GitLab can be scaled up, scaled out or made highly available. These
+examples progress from simple to complex as scaling or highly-available
+components are added.
-## GitLab components and scaling instructions
-Here's a list of components directly provided by Omnibus GitLab or installed as
-part of a source installation and their configuration instructions for scaling.
+For detailed insight into how GitLab scales and configures GitLab.com, you can
+watch [this 1 hour Q&A](https://www.youtube.com/watch?v=uCU8jdYzpac)
+with [John Northrup](https://gitlab.com/northrup), and live questions coming
+in from some of our customers.
-| Component | Description | Configuration instructions |
-|-----------|-------------|----------------------------|
-| [PostgreSQL](../../development/architecture.md#postgresql) | Database | [PostgreSQL configuration](https://docs.gitlab.com/omnibus/settings/database.html) |
-| [Redis](../../development/architecture.md#redis)  | Key/value store for fast data lookup and caching | [Redis configuration](../high_availability/redis.md) |
-| [GitLab application services](../../development/architecture.md#unicorn) | Unicorn/Puma, Workhorse, GitLab Shell - serves front-end requests (UI, API, Git over HTTP/SSH) | [GitLab app scaling configuration](../high_availability/gitlab.md) |
-| [PgBouncer](../../development/architecture.md#pgbouncer) | Database connection pooler | [PgBouncer configuration](../high_availability/pgbouncer.md#running-pgbouncer-as-part-of-a-non-ha-gitlab-installation) **(PREMIUM ONLY)** |
-| [Sidekiq](../../development/architecture.md#sidekiq) | Asynchronous/background jobs | [Sidekiq configuration](../high_availability/sidekiq.md) |
-| [Gitaly](../../development/architecture.md#gitaly) | Provides access to Git repositories | [Gitaly configuration](../gitaly/index.md#running-gitaly-on-its-own-server) |
-| [Prometheus](../../development/architecture.md#prometheus) and [Grafana](../../development/architecture.md#grafana) | GitLab environment monitoring | [Monitoring node for scaling](../high_availability/monitoring_node.md) |
-## Third-party services used for scaling
-Here's a list of third-party services you may require as part of scaling GitLab.
-The services can be provided by numerous applications or vendors and further
-advice is given on how best to select the right choice for your organization's
-needs.
-| Component | Description | Configuration instructions |
-|-----------|-------------|----------------------------|
-| Load balancer(s) | Handles load balancing, typically when you have multiple GitLab application services nodes | [Load balancer configuration](../high_availability/load_balancer.md)      |
-| Object storage service | Recommended store for shared data objects | [Cloud Object Storage configuration](../high_availability/object_storage.md) |
-| NFS | Shared disk storage service. Can be used as an alternative for Gitaly or Object Storage. Required for GitLab Pages | [NFS configuration](../high_availability/nfs.md) |
 ## Reference architectures
@@ -76,7 +53,7 @@ how much automation you use, mirroring, and repo/change size. Additionally the
 shown memory values are given directly by [GCP machine types](https://cloud.google.com/compute/docs/machine-types).
 On different cloud vendors a best effort like for like can be used.
-### Under 1,000 users
+### Up to 1,000 users
 From 1 to 1,000 users, a single-node [Omnibus](https://docs.gitlab.com/omnibus/) setup with frequent backups is adequate.
 Please refer to the [installation documentation](../../install/README.md) and [backup/restore documentation](https://docs.gitlab.com/omnibus/settings/backups.html#backup-and-restore-omnibus-gitlab-configuration).
@@ -85,11 +62,15 @@ This solution is appropriate for many teams that have a single server at their d
 You can also optionally configure GitLab to use an [external PostgreSQL service](../external_database.md) or an [external object storage service](../high_availability/object_storage.md) for added performance and reliability at a relatively low complexity cost.
-### 1,000 to 1,999 users
+### Up to 2,000 users
-For 1,000 to 1,999 users, defining a reference architecture for this scale is [being worked on](https://gitlab.com/gitlab-org/quality/performance/-/issues/223).
-### 2,000 users
+NOTE: **Note:** The 2,000-user reference architecture documented below is
+designed to help your organization achieve a highly-available GitLab deployment.
+If you do not have the expertise or need to maintain a highly-available
+environment, you can have a simpler and less costly-to-operate environment by
+deploying two or more GitLab Rails servers, external load balancing, an NFS
+server, a PostgreSQL server and a Redis server. A reference architecture with
+this alternative in mind is [being worked on](https://gitlab.com/gitlab-org/quality/performance/-/issues/223).
 - **Supported users (approximate):** 2,000
 - **Test RPS rates:** API: 40 RPS, Web: 4 RPS, Git: 4 RPS
@@ -110,7 +91,7 @@ For 1,000 to 1,999 users, defining a reference architecture for this scale is [b
 | External load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory | n1-highcpu-2  | c5.large     |
 | Internal load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory | n1-highcpu-2  | c5.large     |
-### 5,000 users
+### Up to 5,000 users
 - **Supported users (approximate):** 5,000
 - **Test RPS rates:** API: 100 RPS, Web: 10 RPS, Git: 10 RPS
@@ -131,7 +112,7 @@ For 1,000 to 1,999 users, defining a reference architecture for this scale is [b
 | External load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory  | n1-highcpu-2  | c5.large     |
 | Internal load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory  | n1-highcpu-2  | c5.large     |
-### 10,000 users
+### Up to 10,000 users
 - **Supported users (approximate):** 10,000
 - **Test RPS rates:** API: 200 RPS, Web: 20 RPS, Git: 20 RPS
@@ -155,7 +136,7 @@ For 1,000 to 1,999 users, defining a reference architecture for this scale is [b
 | External load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory  | n1-highcpu-2   | c5.large     |
 | Internal load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory  | n1-highcpu-2   | c5.large     |
-### 25,000 users
+### Up to 25,000 users
 - **Supported users (approximate):** 25,000
 - **Test RPS rates:** API: 500 RPS, Web: 50 RPS, Git: 50 RPS
@@ -179,7 +160,7 @@ For 1,000 to 1,999 users, defining a reference architecture for this scale is [b
 | External load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory  | n1-highcpu-2   | c5.large     |
 | Internal load balancing node[^6] | 1 | 4 vCPU, 3.6GB Memory  | n1-highcpu-4   | c5.xlarge    |
-### 50,000 users
+### Up to 50,000 users
 - **Supported users (approximate):** 50,000
 - **Test RPS rates:** API: 1000 RPS, Web: 100 RPS, Git: 100 RPS
@@ -203,6 +184,43 @@ For 1,000 to 1,999 users, defining a reference architecture for this scale is [b
 | External load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory  | n1-highcpu-2   | c5.large     |
 | Internal load balancing node[^6] | 1 | 8 vCPU, 7.2GB Memory  | n1-highcpu-8   | c5.2xlarge   |
+## Configuring GitLab to scale
+### Components not provided by Omnibus GitLab
+Depending on your system architecture, you may require some components which are
+not provided in Omnibus GitLab. If required, these should be configured before
+setting up components provided by GitLab. Advice on how to select the right
+solution for your organization is provided in the configuration instructions
+listed below.
+| Component | Description | Configuration instructions |
+|-----------|-------------|----------------------------|
+| Load balancer(s)[^6] | Handles load balancing, typically when you have multiple GitLab application services nodes | [Load balancer configuration](../high_availability/load_balancer.md)[^6]      |
+| Object storage service[^4] | Recommended store for shared data objects | [Cloud Object Storage configuration](../object_storage.md) |
+| NFS[^5] [^7] | Shared disk storage service. Can be used as an alternative for Gitaly or Object Storage. Required for GitLab Pages | [NFS configuration](../high_availability/nfs.md) |
+### Components provided by Omnibus GitLab
+The following components are provided by Omnibus GitLab. They are listed in the
+order you'll typically configure them if they are required by your
+[reference architecture](#reference-architectures) of choice.
+| Component | Description | Configuration instructions |
+|-----------|-------------|----------------------------|
+| [Consul](../../development/architecture.md#consul)[^3] | Service discovery and health checks/failover | [Consul HA configuration](../high_availability/consul.md) **(PREMIUM ONLY)** |
+| [PostgreSQL](../../development/architecture.md#postgresql) | Database | [PostgreSQL configuration](https://docs.gitlab.com/omnibus/settings/database.html) |
+| [PgBouncer](../../development/architecture.md#pgbouncer) | Database connection pooler | [PgBouncer configuration](../high_availability/pgbouncer.md#running-pgbouncer-as-part-of-a-non-ha-gitlab-installation) **(PREMIUM ONLY)** |
+| Repmgr | PostgreSQL cluster management and failover | [PostgreSQL and Repmgr configuration](../high_availability/database.md) |
+| [Redis](../../development/architecture.md#redis)[^3]  | Key/value store for fast data lookup and caching | [Redis configuration](../high_availability/redis.md) |
+| Redis Sentinel | High availability for Redis | [Redis Sentinel configuration](../high_availability/redis.md) |
+| [Gitaly](../../development/architecture.md#gitaly)[^2] [^5] [^7]  | Provides access to Git repositories | [Gitaly configuration](../gitaly/index.md#running-gitaly-on-its-own-server) |
+| [Sidekiq](../../development/architecture.md#sidekiq) | Asynchronous/background jobs | [Sidekiq configuration](../high_availability/sidekiq.md) |
+| [GitLab application services](../../development/architecture.md#unicorn)[^1] | Unicorn/Puma, Workhorse, GitLab Shell - serves front-end requests (UI, API, Git over HTTP/SSH) | [GitLab app scaling configuration](../high_availability/gitlab.md) |
+| [Prometheus](../../development/architecture.md#prometheus) and [Grafana](../../development/architecture.md#grafana) | GitLab environment monitoring | [Monitoring node for scaling](../high_availability/monitoring_node.md) |
+## Footnotes
 [^1]: In our architectures we run each GitLab Rails node using the Puma webserver
      and have its number of workers set to 90% of available CPUs along with 4 threads.