Commit 027c60bd authored by James Ramsay

Separate Disaster Recovery docs from Geo docs

Move the disaster recovery documentation into the administration docs
to separate it from the geo documentation because they are separate
features.
parent 40ae4b80
## Bring a demoted primary back online
After a fail-over, it is possible to fail back to the demoted primary to restore your original configuration.
This process consists of two steps: making the old primary a secondary, and promoting that secondary back to a primary.
### Configure the former primary to be a secondary
Since the former primary will be out of sync with the current primary, the first
step is to bring the former primary up to date. There is one downside: any uploads and repositories
that were deleted on the current primary while the former primary was offline will not be deleted from its disk, but the overall sync will be much faster. As an alternative, you can set up a [GitLab instance from scratch](https://docs.gitlab.com/ee/gitlab-geo/#setup-instructions) to work around this downside.
1. SSH into the former primary that has fallen behind.
1. Make sure all the services are up by running the command:
```bash
sudo gitlab-ctl start
```
Note: If you [disabled the primary permanently](index.md#step-2-permanently-disable-the-primary), you need to undo those steps now. For Debian/Ubuntu you just need to run `sudo systemctl enable gitlab-runsvdir`. For CentOS 6, you need to install a GitLab instance from scratch and set it up as a secondary node by following the [setup instructions](../../gitlab-geo/README.md#setup-instructions). In this case you don't need the step below.
1. [Set up database replication](../../gitlab-geo/database.md). In this documentation, primary
refers to the current primary, and secondary refers to the former primary.
If you have lost your original primary, follow the
[setup instructions](../../gitlab-geo/README.md#setup-instructions) to set up a new secondary.
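Once replication is set up, you can verify that the former primary is working correctly as a secondary before planning the fail-over. A minimal sketch, assuming an Omnibus installation that ships the Geo check Rake task:
```bash
# Run on the former primary (now a secondary) to verify its Geo
# configuration and replication health; output lists any failing checks.
sudo gitlab-rake gitlab:geo:check
```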
### Promote the secondary to primary
When initial replication is complete and the primary and secondary are closely in sync, you can perform a [planned fail-over](planned-fail-over.md).
### Restore the secondary node
If your objective is to have two nodes again, you need to bring your secondary node back online as well by repeating the first step ([Configure the former primary to be a secondary](#configure-the-former-primary-to-be-a-secondary)) for the secondary node.
# Disaster Recovery
> **Note:** Disaster Recovery for multi-secondary configurations is in
> **Alpha** development. Do not use this as your only Disaster Recovery
> strategy as you may lose data.
GitLab Geo replicates your database and your Git repositories. We will
support and replicate more data in the future, which will enable you to
fail over with minimal effort in a disaster situation.
See [Geo current limitations](../../gitlab-geo/README.md#current-limitations)
for more information.
## Promoting a secondary Geo replica in a single-secondary configuration
We don't currently provide an automated way to promote a Geo replica and do a
fail-over, but you can do it manually if you have `root` access to the machine.
This process promotes a secondary Geo replica to a primary. To regain
geographical redundancy as quickly as possible, you should add a new secondary
immediately after following these instructions.
### Step 1. Allow replication to finish if possible
If the secondary is still replicating data from the primary, follow
[the Planned Failover doc](planned-fail-over.md) as closely as possible in
order to avoid unnecessary data loss.
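To gauge how close the secondary is to being fully synced, you can check its replication status from the command line. A sketch, assuming your GitLab version provides the `geo:status` Rake task; otherwise use Admin Area > Geo Nodes:
```bash
# Run on the secondary to print replication progress for repositories,
# LFS objects, attachments, and other replicated data.
sudo gitlab-rake geo:status
```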
### Step 2. Permanently disable the primary
**Warning: If a primary goes offline, there may be data saved on the primary
that has not been replicated to the secondary. This data should be treated
as lost if you proceed.**
If an outage happens on your primary, you should do everything possible to
avoid a split-brain situation where writes can occur to two different GitLab
instances, complicating recovery efforts. To prepare for the fail-over, you
must first disable the primary.
1. SSH into your **primary** to stop and disable GitLab, if possible.
```bash
sudo gitlab-ctl stop
```
Prevent GitLab from starting up again if the server unexpectedly reboots:
```bash
sudo systemctl disable gitlab-runsvdir
```
On some operating systems, such as CentOS 6, there is no easy way to prevent
GitLab from starting if the machine reboots
(see [Omnibus issue #3058](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/3058)).
It may be safest to uninstall the GitLab package completely:
```bash
yum remove gitlab-ee
```
1. If you do not have SSH access to your primary, take the machine offline and
prevent it from rebooting by any means at your disposal.
Since there are many ways you may prefer to accomplish this, we will avoid a
single recommendation (one possible approach is sketched after this list). You may need to:
* Reconfigure load balancers
* Change DNS records (e.g. point the primary DNS record to the secondary node in order to stop usage of the primary)
* Stop virtual servers
* Block traffic through a firewall
* Revoke object storage permissions from the primary
* Physically disconnect a machine
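For example, if the primary runs on AWS, one option is to revoke the inbound rules on its security group. A sketch, assuming the AWS CLI is configured and that `sg-0123456789abcdef0` is a placeholder for the primary's security group:
```bash
# Revoke public inbound HTTPS and SSH on the old primary's security
# group so no reads or writes can reach it (group ID is hypothetical).
aws ec2 revoke-security-group-ingress --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 443 --cidr 0.0.0.0/0
aws ec2 revoke-security-group-ingress --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 22 --cidr 0.0.0.0/0
```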
### Step 3. Promoting a secondary Geo replica
1. SSH into your **secondary** and log in as root:
```bash
sudo -i
```
1. Edit `/etc/gitlab/gitlab.rb` to reflect its new status as primary.
Remove the following line:
```ruby
## REMOVE THIS LINE
geo_secondary_role['enable'] = true
```
A new secondary should not be added at this time. If you want to add a new
secondary, do this after you have completed the entire process of promoting
the secondary to the primary.
1. Promote the secondary to primary. Execute:
```bash
gitlab-ctl promote-to-primary-node
```
1. Verify you can connect to the newly promoted primary using the URL used
previously for the secondary (a connectivity sketch follows this list).
1. Success! The secondary has now been promoted to primary.
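As a quick connectivity check, you can request the sign-in page and confirm a successful response. A minimal sketch; the hostname is a placeholder for the URL previously used by the secondary:
```bash
# Expect "200" from the newly promoted primary (replace the hostname
# with the URL previously used for the secondary).
curl --silent --output /dev/null --write-out "%{http_code}\n" \
  https://secondary.example.com/users/sign_in
```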
### Step 4. (Optional) Updating the primary domain's DNS record
Updating the DNS records for the primary domain to point to the secondary
avoids having to update every reference to the primary domain, such as Git
remotes and API URLs, to the secondary domain.
1. SSH into your **secondary** and log in as root:
```bash
sudo -i
```
1. Update the primary domain's DNS record.
After updating the primary domain's DNS records to point to the secondary,
edit `/etc/gitlab/gitlab.rb` on the secondary to reflect the new URL:
```ruby
# Change the existing external_url configuration
external_url 'https://gitlab.example.com'
```
1. Reconfigure the secondary node for the change to take effect:
```bash
gitlab-ctl reconfigure
```
1. Execute the command below to update the newly promoted primary node URL:
```bash
gitlab-rake geo:update_primary_node_url
```
This command will use the changed `external_url` configuration defined
in `/etc/gitlab/gitlab.rb`.
1. Verify you can connect to the newly promoted primary using the primary URL.
If you updated the DNS records for the primary domain, these changes may
not have propagated yet, depending on the previous DNS records' TTL.
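You can check what a resolver currently returns for the primary domain with `dig`. A sketch, assuming `gitlab.example.com` is the primary domain:
```bash
# Show the address the primary domain currently resolves to, plus the
# remaining TTL; compare the address against the secondary's.
dig +noall +answer gitlab.example.com
```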
### Step 5. (Optional) Add secondary Geo replicas to a promoted primary
Promoting a secondary to primary using the process above does not enable
GitLab Geo on the new primary.
To bring a new secondary online, follow the
[Geo setup instructions](../../gitlab-geo/README.md#setup-instructions).
## Promoting a secondary Geo replica in multi-secondary configurations
Disaster Recovery does not yet support systems with multiple
secondary Geo replicas (e.g. one primary and two or more secondaries). We are
working on it, see [#4284](https://gitlab.com/gitlab-org/gitlab-ee/issues/4284)
for details.
## Troubleshooting
### I followed the disaster recovery instructions and now two-factor auth is broken!
The setup instructions for Geo prior to 10.5 failed to replicate the
`otp_key_base` secret, which is used to encrypt the two-factor authentication
secrets stored in the database. If it differs between primary and secondary
nodes, users with two-factor authentication enabled won't be able to log in
after a fail-over.
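To confirm whether this is the problem, compare the secret on both nodes before deciding how to proceed. A sketch, assuming Omnibus installations, where secrets are stored in `/etc/gitlab/gitlab-secrets.json`:
```bash
# Run on both nodes and compare the output; differing values confirm
# the otp_key_base mismatch described above.
sudo grep otp_key_base /etc/gitlab/gitlab-secrets.json
```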
If you still have access to the old primary node, you can follow the
instructions in the
[Upgrading to GitLab 10.5](../../gitlab-geo/updating_the_geo_nodes.md#upgrading-to-gitlab-105)
section to resolve the error. Otherwise, the secret is lost and you'll need to
[reset two-factor authentication for all users](../../security/two_factor_authentication.md#disabling-2fa-for-everyone).
# Disaster Recovery for Planned Fail-Over
A planned fail-over is similar to a disaster recovery scenario, except that you
are able to notify users of the maintenance window and allow data to finish
replicating to secondaries.
Please read this entire document as well as [Disaster Recovery](index.md)
before proceeding.
### Notify users of scheduled maintenance
1. On the primary, in Admin Area > Messages, add a broadcast message.
Check Admin Area > Geo Nodes to estimate how long it will take to finish syncing.
```
We are doing scheduled maintenance at XX:XX UTC, expected to take less than 1 hour.
```
1. On the secondary, you may need to clear the cache for the broadcast message to show up.
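A hedged sketch of clearing the cache on the secondary, using the standard Omnibus Rake task:
```bash
# Clear the Rails cache so the new broadcast message is picked up.
sudo gitlab-rake cache:clear
```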
### Block primary traffic
1. At the scheduled time, using your cloud provider or your node's firewall, block HTTP and SSH traffic to/from the primary except for your IP and the secondary's IP.
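With `iptables` on the primary node, this might look like the following sketch; both addresses are placeholders for your own IP and the secondary's IP:
```bash
# Allow the administrator's IP and the secondary's IP (placeholders),
# then drop all other inbound SSH, HTTP, and HTTPS traffic.
iptables -A INPUT -p tcp -s 203.0.113.5 -m multiport --dports 22,80,443 -j ACCEPT
iptables -A INPUT -p tcp -s 203.0.113.6 -m multiport --dports 22,80,443 -j ACCEPT
iptables -A INPUT -p tcp -m multiport --dports 22,80,443 -j DROP
```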
### Allow replication to finish as much as possible
1. On the secondary, navigate to Admin Area > Geo Nodes and wait until all replication progress is 100% on the secondary "Current node".
1. Navigate to Admin Area > Monitoring > Background Jobs > Queues and wait until the "geo" queues drop, ideally to 0.
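You can also inspect the queues from the command line on the secondary. A sketch using Sidekiq's public API via `gitlab-rails runner`; exact queue names vary between GitLab versions:
```bash
# Print the number of jobs remaining in each Sidekiq queue whose
# name starts with "geo".
sudo gitlab-rails runner '
  Sidekiq::Queue.all
    .select { |q| q.name.start_with?("geo") }
    .each { |q| puts "#{q.name}: #{q.size}" }
'
```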
### Promote the secondary
1. Finally, follow [Disaster Recovery](index.md) to promote the secondary to a primary.
......@@ -20,7 +20,8 @@ Learn how to install, configure, update, and maintain your GitLab instance.
- **(Starter/Premium)** [Omnibus support for log forwarding](https://docs.gitlab.com/omnibus/settings/logs.html#udp-log-shipping-gitlab-enterprise-edition-only)
- [High Availability](high_availability/README.md): Configure multiple servers for scaling or high availability.
- [High Availability on AWS](../university/high-availability/aws/README.md): Set up GitLab HA on Amazon AWS.
- **(Premium)** [GitLab Geo](../gitlab-geo/README.md): Replicate your GitLab instance to other geographical locations as a read-only fully operational version.
- **(Premium)** [Geo](../gitlab-geo/README.md): Replicate your GitLab instance to other geographical locations as a read-only fully operational version.
- **(Premium)** [Disaster Recovery](disaster_recovery/index.md): Quickly fail-over to a different site with minimal effort in a disaster situation.
- **(Premium)** [Pivotal Tile](../install/pivotal/index.md): Deploy GitLab as a pre-configured appliance using Ops Manager (BOSH) for Pivotal Cloud Foundry.
### Configuring GitLab
......
......@@ -51,7 +51,9 @@ to reading any data available in the GitLab web interface (see [current limitati
improving speed for distributed teams
- Helps reduce the loading time for automated tasks,
custom integrations and internal workflows
- A Geo secondary can be promoted to become the primary in a [Disaster Recovery](disaster-recovery.md) scenario
- Quickly fail-over to a Geo secondary in a
[Disaster Recovery](../administration/disaster_recovery/index.md) scenario
- Allows [planned fail-over](../administration/disaster_recovery/planned-fail-over.md) to a Geo secondary
## Architecture
......@@ -191,10 +193,6 @@ Read through the [Geo High Availability documentation](ha.md).
When you have object storage enabled, please consult the
[Geo with Object Storage](object_storage.md) documentation.
## Restore demoted primary geo node
Read how to [Bring a demoted primary back](bring-primary-back.md)
## Replicating the Container Registry
Read how to [replicate the Container Registry](docker_registry.md).
......
## Bring a demoted primary back online
After a failover, it is possible to fail back to the demoted primary to restore your original configuration.
This process consists of two steps: making the old primary a secondary, and promoting that secondary back to a primary.
### Configure the former primary to be a secondary
Since the former primary will be out of sync with the current primary, the first
step is to bring the former primary up to date. There is one downside: any uploads and repositories
that were deleted on the current primary while the former primary was offline will not be deleted from its disk, but the overall sync will be much faster. As an alternative, you can set up a [GitLab instance from scratch](https://docs.gitlab.com/ee/gitlab-geo/#setup-instructions) to work around this downside.
1. SSH into the former primary that has fallen behind.
1. Make sure all the services are up by running the command:
```bash
sudo gitlab-ctl start
```
Note: If you [disabled the primary permanently](https://docs.gitlab.com/ee/gitlab-geo/disaster-recovery.html#step-2-permanently-disable-the-primary), you need to undo those steps now. For Debian/Ubuntu you just need to run `sudo systemctl enable gitlab-runsvdir`. For CentOS 6, you need to install a GitLab instance from scratch and set it up as a secondary node by following the [setup instructions](https://docs.gitlab.com/ee/gitlab-geo/#setup-instructions). In this case you don't need the step below.
1. [Set up the database replication](database.md). In this documentation, primary
refers to the current primary, and secondary refers to the former primary.
If you have lost your original primary, follow the
[setup instructions](README.md#setup-instructions) to set up a new secondary.
### Promote the secondary to primary
When initial replication is complete and the primary and secondary are closely in sync, you can perform a [planned failover](planned-failover.md).
### Restore the secondary node
If your objective is to have two nodes again, you need to bring your secondary node back online as well by repeating the first step ([Configure the former primary to be a secondary](#configure-the-former-primary-to-be-a-secondary)) for the secondary node.
This document was moved to [another location](../administration/disaster_recovery/bring-primary-back.md).
......@@ -90,7 +90,7 @@ with the same credentials as used in the primary.
GitLab integrates with the system-installed SSH daemon, designating a user
(typically named git) through which all access requests are handled.
In a [Disaster Recovery](disaster-recovery.md) situation, GitLab system
In a [Disaster Recovery](../administration/disaster_recovery/index.md) situation, GitLab system
administrators will promote a secondary Geo replica to a primary and they can
update the DNS records for the primary domain to point to the secondary to prevent
the need to update all references to the primary domain to the secondary domain,
......
# GitLab Geo Disaster Recovery
> **Note:** Disaster Recovery is in **Alpha** development. Do not use this as
> your only Disaster Recovery strategy as you may lose data.
GitLab Geo replicates your database and your Git repositories. We will
support and replicate more data in the future, which will enable you to
fail over with minimal effort in a disaster situation.
See [current limitations](README.md#current-limitations) for more information.
## Promoting a secondary Geo replica in a single-secondary configuration
We don't currently provide an automated way to promote a Geo replica and do a
fail-over, but you can do it manually if you have `root` access to the machine.
This process promotes a secondary Geo replica to a primary. To regain
geographical redundancy as quickly as possible, you should add a new secondary
immediately after following these instructions.
### Step 1. Allow replication to finish if possible
If the secondary is still replicating data from the primary, follow
[the Planned Failover doc](planned-failover.md) as closely as possible in
order to avoid unnecessary data loss.
### Step 2. Permanently disable the primary
**Warning: If a primary goes offline, there may be data saved on the primary
that has not been replicated to the secondary. This data should be treated
as lost if you proceed.**
If an outage happens on your primary, you should do everything possible to
avoid a split-brain situation where writes can occur to two different GitLab
instances, complicating recovery efforts. To prepare for the failover, you
must first disable the primary.
1. SSH into your **primary** to stop and disable GitLab, if possible.
```bash
sudo gitlab-ctl stop
```
Prevent GitLab from starting up again if the server unexpectedly reboots:
```bash
sudo systemctl disable gitlab-runsvdir
```
On some operating systems, such as CentOS 6, there is no easy way to prevent
GitLab from starting if the machine reboots
(see [Omnibus issue #3058](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/3058)).
It may be safest to uninstall the GitLab package completely:
```bash
yum remove gitlab-ee
```
1. If you do not have SSH access to your primary, take the machine offline and
prevent it from rebooting by any means at your disposal.
Since there are many ways you may prefer to accomplish this, we will avoid a
single recommendation. You may need to:
* Reconfigure load balancers
* Change DNS records (e.g. point the primary DNS record to the secondary node in order to stop usage of the primary)
* Stop virtual servers
* Block traffic through a firewall
* Revoke object storage permissions from the primary
* Physically disconnect a machine
### Step 3. Promoting a secondary Geo replica
1. SSH into your **secondary** and log in as root:
```bash
sudo -i
```
1. Edit `/etc/gitlab/gitlab.rb` to reflect its new status as primary.
Remove the following line:
```ruby
## REMOVE THIS LINE
geo_secondary_role['enable'] = true
```
A new secondary should not be added at this time. If you want to add a new
secondary, do this after you have completed the entire process of promoting
the secondary to the primary.
1. Promote the secondary to primary. Execute:
```bash
gitlab-ctl promote-to-primary-node
```
1. Verify you can connect to the newly promoted primary using the URL used
previously for the secondary.
1. Success! The secondary has now been promoted to primary.
### Step 4. (Optional) Updating the primary domain's DNS record
Updating the DNS records for the primary domain to point to the secondary
avoids having to update every reference to the primary domain, such as Git
remotes and API URLs, to the secondary domain.
1. SSH into your **secondary** and log in as root:
```bash
sudo -i
```
1. Update the primary domain's DNS record.
After updating the primary domain's DNS records to point to the secondary,
edit `/etc/gitlab/gitlab.rb` on the secondary to reflect the new URL:
```ruby
# Change the existing external_url configuration
external_url 'https://gitlab.example.com'
```
1. Reconfigure the secondary node for the change to take effect:
```bash
gitlab-ctl reconfigure
```
1. Execute the command below to update the newly promoted primary node URL:
```bash
gitlab-rake geo:update_primary_node_url
```
This command will use the changed `external_url` configuration defined
in `/etc/gitlab/gitlab.rb`.
1. Verify you can connect to the newly promoted primary using the primary URL.
If you updated the DNS records for the primary domain, these changes may
not have propagated yet, depending on the previous DNS records' TTL.
### Step 5. (Optional) Add secondary Geo replicas to a promoted primary
Promoting a secondary to primary using the process above does not enable
GitLab Geo on the new primary.
To bring a new secondary online, follow the [GitLab Geo setup instructions](README.md#setup-instructions).
## Promoting a secondary Geo replica in multi-secondary configurations
Disaster Recovery does not yet support systems with multiple
secondary Geo replicas (e.g. one primary and two or more secondaries). We are
working on it, see [#4284](https://gitlab.com/gitlab-org/gitlab-ee/issues/4284)
for details.
This document was moved to [another location](../administration/disaster_recovery/index.md).
......@@ -2,26 +2,10 @@
## Can I use Geo in a disaster recovery situation?
There are limitations to what we replicate (see
Yes, but there are limitations to what we replicate (see
[What data is replicated to a secondary node?](#what-data-is-replicated-to-a-secondary-node)).
In an extreme data-loss situation you can make a secondary Geo into your
primary, but this is not officially supported yet.
If you still want to proceed, see our step-by-step instructions on how to
manually [promote a secondary node](disaster-recovery.md) into primary.
## I followed the disaster recovery instructions and now two-factor auth is broken!
The setup instructions for GitLab Geo prior to 10.5 failed to replicate the
`otp_key_base` secret, which is used to encrypt the two-factor authentication
secrets stored in the database. If it differs between primary and secondary
nodes, users with two-factor authentication enabled won't be able to log in
after a DR failover.
If you still have access to the old primary node, you can follow the
instructions in the [Upgrading to GitLab 10.5](updating_the_geo_nodes.md#upgrading-to-gitlab-105)
section to resolve the error. Otherwise, the secret is lost and you'll need to
[reset two-factor authentication for all users](../security/two_factor_authentication.md#disabling-2fa-for-everyone).
Read the documentation for [Disaster Recovery](../administration/disaster_recovery/index.md).
## What data is replicated to a secondary node?
......
# GitLab Geo Planned Failover
A planned failover is similar to a disaster recovery scenario, except that you
are able to notify users of the maintenance window and allow data to finish
replicating to secondaries.
Please read this entire document as well as
[GitLab Geo Disaster Recovery](disaster-recovery.md) before proceeding.
### Notify users of scheduled maintenance
1. On the primary, in Admin Area > Messages, add a broadcast message.
Check Admin Area > Geo Nodes to estimate how long it will take to finish syncing.
```
We are doing scheduled maintenance at XX:XX UTC, expected to take less than 1 hour.
```
1. On the secondary, you may need to clear the cache for the broadcast message to show up.
### Block primary traffic
1. At the scheduled time, using your cloud provider or your node's firewall, block HTTP and SSH traffic to/from the primary except for your IP and the secondary's IP.
### Allow replication to finish as much as possible
1. On the secondary, navigate to Admin Area > Geo Nodes and wait until all replication progress is 100% on the secondary "Current node".
1. Navigate to Admin Area > Monitoring > Background Jobs > Queues and wait until the "geo" queues drop, ideally to 0.
### Promote the secondary
1. Finally, follow [GitLab Geo Disaster Recovery](disaster-recovery.md) to promote the secondary to a primary.
This document was moved to [another location](../administration/disaster_recovery/planned-fail-over.md).