Commit f8853569, authored Apr 11, 2018 by Michael Kozono, committed by Nick Thomas on Apr 11, 2018
Geo: Minor improvements to Disaster Recovery and Planned Failover docs
parent e61cc7c4
Showing 2 changed files with 50 additions and 34 deletions (+50 -34)
doc/administration/geo/disaster_recovery/index.md (+8 -0)
doc/administration/geo/disaster_recovery/planned_failover.md (+42 -34)
doc/administration/geo/disaster_recovery/index.md
@@ -79,6 +79,10 @@ must disable the primary.
   - Revoke object storage permissions from the primary
   - Physically disconnect a machine
+1. If you plan to [update the primary domain DNS record](#step-4-optional-updating-the-primary-domain-dns-record),
+   you may wish to lower the TTL now to speed up propagation.
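One way to check the record's current TTL before and after lowering it — the hostname below is a placeholder:

```bash
# The second field of each answer line is the remaining TTL in seconds;
# after lowering it at your DNS provider, re-run until the new value appears.
dig +noall +answer gitlab.example.com A
```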
### Step 3. Promoting a secondary Geo replica

1. SSH in to your **secondary** and log in as root:
@@ -146,6 +150,10 @@ secondary domain, like changing Git remotes and API URLs.
     external_url 'https://gitlab.example.com'
     ```

+   NOTE: **Note**
+   Changing `external_url` won't prevent access via the old secondary URL, as
+   long as the secondary DNS records are still intact.

 1. Reconfigure the secondary node for the change to take effect:

     ```bash
     gitlab-ctl reconfigure
     ```
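One way to spot-check that the new `external_url` is being served after the reconfigure — the hostname is illustrative, and this assumes DNS already points at the promoted node:

```bash
# A 200 or 302 response from the sign-in page indicates the node is
# answering on the primary's URL; -I requests headers only.
curl -I https://gitlab.example.com/users/sign_in
```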
doc/administration/geo/disaster_recovery/planned_failover.md
@@ -105,7 +105,7 @@ Visit the **Admin Area ➔ Geo nodes** dashboard on the **secondary** node to
review status. Replicated objects (shown in green) should be close to 100%,
and there should be no failures (shown in red). If a large proportion of
objects aren't yet replicated (shown in grey), consider giving the node more
-time to complete
+time to complete

![Replication status](img/replication-status.png)
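The same counters are also exposed through the Geo Nodes API on GitLab EE, which can be handy for polling from a script; the node ID, token, and hostname below are placeholders, and availability of the endpoint on the release in question is an assumption:

```bash
# Fetch replication/verification counters for Geo node 2 as JSON,
# authenticating with an admin personal access token.
curl --header "PRIVATE-TOKEN: <your_access_token>" \
  "https://gitlab.example.com/api/v4/geo_nodes/2/status"
```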
@@ -187,38 +187,45 @@ Until a [read-only mode][ce-19739] is implemented, updates must be prevented
from happening manually. Note that your **secondary** still needs read-only
access to the primary for the duration of the maintenance window.
-At the scheduled time, using your cloud provider or your node's firewall, block
-all HTTP, HTTPS and SSH traffic to/from the primary, **except** for your IP and
-the secondary's IP.
-
-For instance, if your secondary originates all its traffic from `5.6.7.8` and
-your IP is `100.0.0.1`, you might run the following commands on the server(s)
-making up your primary node:
-
-```
-sudo iptables -A INPUT -p tcp -s 5.6.7.8 --destination-port 22 -j ACCEPT
-sudo iptables -A INPUT -p tcp -s 100.0.0.1 --destination-port 22 -j ACCEPT
-sudo iptables -A INPUT -p tcp --destination-port 22 -j REJECT
-
-sudo iptables -A INPUT -p tcp -s 5.6.7.8 --destination-port 80 -j ACCEPT
-sudo iptables -A INPUT -p tcp -s 100.0.0.1 --destination-port 80 -j ACCEPT
-sudo iptables -A INPUT -p tcp --destination-port 80 -j REJECT
-
-sudo iptables -A INPUT -p tcp -s 5.6.7.8 --destination-port 443 -j ACCEPT
-sudo iptables -A INPUT -p tcp -s 100.0.0.1 --destination-port 443 -j ACCEPT
-sudo iptables -A INPUT -p tcp --destination-port 443 -j REJECT
-```
-
-From this point, users will be unable to view their data or make changes on the
-**primary** node. They will also be unable to log in to the **secondary** node,
-but existing sessions will work for the remainder of the maintenance period, and
-public data will be accessible throughout.
-
-Next, disable non-Geo periodic background jobs on the primary node by navigating
-to **Admin Area ➔ Monitoring ➔ Background Jobs ➔ Cron**, pressing `Disable All`,
-and then pressing `Enable` for the `geo_sidekiq_cron_config_worker` cron job.
-This job will re-enable several other cron jobs that are essential for planned
-failover to complete successfully.
+1. At the scheduled time, using your cloud provider or your node's firewall, block
+   all HTTP, HTTPS and SSH traffic to/from the primary, **except** for your IP and
+   the secondary's IP.
+
+   For instance, if your secondary originates all its traffic from `5.6.7.8` and
+   your IP is `100.0.0.1`, you might run the following commands on the server(s)
+   making up your primary node:
+
+   ```
+   sudo iptables -A INPUT -p tcp -s 5.6.7.8 --destination-port 22 -j ACCEPT
+   sudo iptables -A INPUT -p tcp -s 100.0.0.1 --destination-port 22 -j ACCEPT
+   sudo iptables -A INPUT -p tcp --destination-port 22 -j REJECT
+
+   sudo iptables -A INPUT -p tcp -s 5.6.7.8 --destination-port 80 -j ACCEPT
+   sudo iptables -A INPUT -p tcp -s 100.0.0.1 --destination-port 80 -j ACCEPT
+   sudo iptables -A INPUT -p tcp --destination-port 80 -j REJECT
+
+   sudo iptables -A INPUT -p tcp -s 5.6.7.8 --destination-port 443 -j ACCEPT
+   sudo iptables -A INPUT -p tcp -s 100.0.0.1 --destination-port 443 -j ACCEPT
+   sudo iptables -A INPUT -p tcp --destination-port 443 -j REJECT
+   ```
+
+   From this point, users will be unable to view their data or make changes on the
+   **primary** node. They will also be unable to log in to the **secondary** node,
+   but existing sessions will work for the remainder of the maintenance period, and
+   public data will be accessible throughout.
+
+1. Verify the primary is blocked to HTTP traffic by visiting it in a browser via
+   another IP. The server should refuse the connection.
+
+1. Verify the primary is blocked to Git over SSH traffic by attempting to pull an
+   existing Git repository with an SSH remote URL. The server should refuse the
+   connection.
+
+1. Disable non-Geo periodic background jobs on the primary node by navigating to
+   **Admin Area ➔ Monitoring ➔ Background Jobs ➔ Cron**, pressing `Disable All`,
+   and then pressing `Enable` for the `geo_sidekiq_cron_config_worker` cron job.
+   This job will re-enable several other cron jobs that are essential for planned
+   failover to complete successfully.
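The two verification steps above can also be scripted from a machine whose IP is not on the allow list; the hostname and repository path are placeholders:

```bash
# Both commands should fail once the firewall rules are active:
# the HTTP(S) check should be refused or time out...
curl --connect-timeout 10 -sSI https://gitlab.example.com/ && echo "primary still reachable over HTTPS!"

# ...and the SSH check should fail to connect rather than list refs.
git ls-remote git@gitlab.example.com:group/project.git
```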
## Finish replicating and verifying all data
@@ -230,7 +237,6 @@ failover to complete successfully.
   before it is completed will cause the work to be lost!
1. On the **primary**, navigate to **Admin Area ➔ Geo Nodes** and wait for the
   following conditions to be true of the **secondary** you are failing over to:
   * All replication meters reach 100% replicated, 0% failures
   * All verification meters reach 100% verified, 0% failures
   * Database replication lag is 0ms
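These conditions can also be watched from a terminal on the **secondary**; this is a minimal sketch, assuming the `geo:status` Rake task is available in the release being used:

```bash
# Summarizes replication and verification progress for this Geo node.
sudo gitlab-rake geo:status
```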
@@ -256,6 +262,8 @@ begin to diverge from the old one. If problems do arise at this point, failing
back to the old primary [is possible][bring-primary-back], but likely to result
in the loss of any data uploaded to the new primary in the meantime.

+Don't forget to remove the broadcast message after failover is complete.

[bring-primary-back]: bring_primary_back.md
[ce-19739]: https://gitlab.com/gitlab-org/gitlab-ce/issues/19739
[container-registry]: ../replication/container_registry.md