Commit 9ee6aa20 authored by Marin Jankovski

Merge branch 'repmgr-automation-docs'

parents 8e03b701 521fcf4f
## Configure GitLab using an external PostgreSQL service
If you're hosting GitLab on a cloud provider, you can optionally use a
managed service for PostgreSQL. For example, AWS offers a managed Relational
Database Service (RDS) that runs PostgreSQL.
Alternatively, you may opt to manage your own PostgreSQL instance or cluster
separate from the GitLab Omnibus package.
If you use a cloud-managed service, or provide your own PostgreSQL instance:
1. Set up PostgreSQL according to the
[database requirements document](../install/requirements.md#database).
1. Set up a `gitlab` username with a password of your choice. The `gitlab` user
needs privileges to create the `gitlabhq_production` database.
1. Configure the GitLab application servers with the appropriate details.
This step is covered in [Configuring GitLab for HA](high_availability/gitlab.md).
# Configuring a Database for GitLab HA
# Configuring Databases for GitLab HA
> Note: GitLab HA requires an Enterprise Edition Premium license
**Warning**
This functionality should be considered beta; use with caution.
**Warning**
You can choose to install and manage a database server (PostgreSQL/MySQL)
yourself, or you can use GitLab Omnibus packages to help. GitLab recommends
PostgreSQL. This is the database that will be installed if you use the
Omnibus package to manage your database.
## Configure your own database server
If you're hosting GitLab on a cloud provider, you can optionally use a
managed service for PostgreSQL. For example, AWS offers a managed Relational
Database Service (RDS) that runs PostgreSQL.
Alternatively, you may opt to manage your own PostgreSQL instance or cluster
separate from the GitLab Omnibus package.
If you use a cloud-managed service, or provide your own PostgreSQL instance:
1. Set up PostgreSQL according to the
[database requirements document](../../install/requirements.md#database).
1. Set up a `gitlab` username with a password of your choice. The `gitlab` user
needs privileges to create the `gitlabhq_production` database.
1. Configure the GitLab application servers with the appropriate details.
This step is covered in [Configuring GitLab for HA](gitlab.md).
## Configure using Omnibus
Following these steps should leave you with a database cluster consisting of at least 2 nodes,
using [repmgr](http://www.repmgr.org/) to handle standby synchronization and failover.
### On each database node
## Overview
GitLab supports multiple options for its database backend:
* Using the Omnibus GitLab package to configure PostgreSQL in an HA setup (EEP only). This document contains directions for EEP users.
* Using GitLab with an [externally managed PostgreSQL service](../external_database.md). This could be a cloud provider, or your own service.
Or, for a non-HA option:
* Using the Omnibus GitLab CE/EES package with a [single PostgreSQL instance](http://docs.gitlab.com/omnibus/settings/database.html).
## Configure Omnibus GitLab package database HA (Enterprise Edition Premium)
### Preparation
The recommended configuration for a PostgreSQL HA setup requires:
* A minimum of three consul server nodes
* A minimum of two database nodes
* Each node will run the following services
* postgresql -- The database itself
* repmgrd -- A service to monitor, and handle failover in case of a master failure
* consul -- Used for service discovery, to alert other nodes when failover occurs
* At least one separate node for running the `pgbouncer` service.
#### Required information
* Network information for all nodes
* DNS names -- By default, `repmgr` and `pgbouncer` use DNS to locate nodes
* IP address -- PostgreSQL does not listen on any network interface by default. It needs to know which IP address to listen on in order to use the network interface. It can be set to `0.0.0.0` to listen on all interfaces. It cannot be set to the loopback address `127.0.0.1`.
* Network Address -- PostgreSQL access is controlled based on the network source. This can be in subnet (e.g. 192.168.0.0/255.255.255.0) or CIDR (e.g. 192.168.0.0/24) form.
* User information for `pgbouncer` service
* The service runs as the same user as the database, default of `gitlab-psql`
* The service will have a regular database user account generated for it
* Default username is `pgbouncer`. In the rest of the documentation we will refer to this username as `PGBOUNCER_USERNAME`
* Password for `pgbouncer` service. In the rest of the documentation we will refer to this password as `PGBOUNCER_PASSWORD`
* Password hash for `pgbouncer` service
* This should be generated from `pgbouncer` username and password pair
* Generate the hash with:
```
$ echo -n 'PASSWORD+USERNAME' | md5sum
```
* In the rest of the documentation we will refer to this hash as `PGBOUNCER_PASSWORD_HASH`
* This password will be stored in the following locations
* `/etc/gitlab/gitlab.rb`: hashed, and in plain text
* `/var/opt/gitlab/pgbouncer/pg_auth`: hashed
* User information for the Repmgr service
* The service runs under the same system account as the database by default.
* The service requires a superuser database account be generated for it. This defaults to `gitlab_repmgr`
* User information for the Consul service
* The consul service runs under a dedicated system account by default, `gitlab-consul`. In the rest of the documentation we will refer to this username as `CONSUL_USERNAME`
* There will be a database user created with read only access to the repmgr database
* Password for the database user. In the rest of the documentation we will refer to this password as `CONSUL_DATABASE_PASSWORD`
* Password hash for `gitlab-consul` service
* This should be generated from `gitlab-consul` username and password pair
* Generate the hash with:
```
$ echo -n 'PASSWORD+USERNAME' | md5sum
```
* In the rest of the documentation we will refer to this hash as `CONSUL_PASSWORD_HASH`
* This password will be stored in the following locations
* `/etc/gitlab/gitlab.rb`: hashed
* `/var/opt/gitlab/pgbouncer/pg_auth`: hashed
* `/var/opt/gitlab/gitlab-consul/.pgpass`: plaintext
* The number of nodes in the database cluster.
* When configuring PostgreSQL, we will set `max_wal_senders` to one more than this number. This is used to prevent replication from using up all of the available database connections.
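The hash and `max_wal_senders` values described above can be computed with two small helpers. This is a minimal sketch, assuming the `+` in `PASSWORD+USERNAME` denotes concatenation (the password immediately followed by the username, with no literal plus sign), which is how PostgreSQL builds md5 password hashes. The function names `pg_md5_hash` and `max_wal_senders_for` are hypothetical and not part of GitLab.

```shell
#!/usr/bin/env bash

# pg_md5_hash PASSWORD USERNAME
# Prints the md5 hex digest of PASSWORD immediately followed by USERNAME.
# Assumption: '+' in the documentation means concatenation, not a literal '+'.
pg_md5_hash() {
  local password="$1" username="$2"
  # printf avoids the trailing newline, like `echo -n` in the steps above.
  printf '%s%s' "$password" "$username" | md5sum | cut -d ' ' -f 1
}

# max_wal_senders_for NODE_COUNT
# Prints one more than the number of database nodes in the cluster.
max_wal_senders_for() {
  local node_count="$1"
  echo $(( node_count + 1 ))
}

# Example usage (hypothetical values):
#   pg_md5_hash 'examplepassword' 'pgbouncer'
#   max_wal_senders_for 3
```

Passing the password as a command-line argument can expose it in shell history, so a variant that prompts with `read -s` may be preferable on shared hosts.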
### Installation
#### On each node
1. Download/install GitLab Omnibus using **steps 1 and 2** from
[GitLab downloads](https://about.gitlab.com/downloads). Do not complete other
steps on the download page.
1. Create a password hash for the sql user (the default username is `gitlab`)
```
$ echo -n 'PASSWORD+USERNAME' | md5sum
```
1. Create/edit `/etc/gitlab/gitlab.rb` and use the following configuration.
If there is a directive listed below that you do not see in the configuration, be sure to add it.
```ruby
# Disable all components except PostgreSQL
postgresql['enable'] = true
bootstrap['enable'] = false
nginx['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
redis['enable'] = false
prometheus['enable'] = false
gitaly['enable'] = false
gitlab_workhorse['enable'] = false
mailroom['enable'] = false
# PostgreSQL configuration
postgresql['md5_auth_cidr_addresses'] = ['0.0.0.0/0']
postgresql['listen_address'] = '0.0.0.0'
postgresql['sql_user_password'] = 'PASSWORD_HASH' # This is the hash generated in the previous step
postgresql['trust_auth_cidr_addresses'] = ['127.0.0.0/24']
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['max_wal_senders'] = X # Should be set to at least 1 more than the number of nodes in the cluster
postgresql['shared_preload_libraries'] = 'repmgr_funcs' # If this attribute is already defined, append the new value as a comma separated list
postgresql['custom_pg_hba_entries']['repmgr'] = [
{
type: 'local',
database: 'replication',
user: 'gitlab_replicator',
method: 'trust',
},
{
type: 'host',
database: 'replication',
user: 'gitlab_replicator',
cidr: '127.0.0.1/32',
method: 'trust'
},
{
type: 'host',
database: 'replication',
user: 'gitlab_replicator',
cidr: 'XXX.XXX.XXX.XXX/YY', # This should be the CIDR of the network your database nodes are on
method: 'trust'
},
{
type: 'local',
database: 'repmgr',
user: 'gitlab_replicator',
method: 'trust',
},
{
type: 'host',
database: 'repmgr',
user: 'gitlab_replicator',
cidr: '127.0.0.1/32',
method: 'trust'
},
{
type: 'host',
database: 'repmgr',
user: 'gitlab_replicator',
cidr: 'XXX.XXX.XXX.XXX/YY', # This should be the CIDR of the network your database nodes are on
method: 'trust'
}
]
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
```
1. Reconfigure GitLab for the new settings to take effect
```
# gitlab-ctl reconfigure
```
1. Create `/var/opt/gitlab/postgresql/repmgr.conf` with the following content. Use a unique integer for the value of node.
```
cluster=gitlab_cluster
node=X
node_name=HOSTNAME
conninfo='host=HOSTNAME user=gitlab_replicator dbname=repmgr'
pg_bindir='/opt/gitlab/embedded/bin'
service_start_command = '/opt/gitlab/bin/gitlab-ctl start postgresql'
service_stop_command = '/opt/gitlab/bin/gitlab-ctl stop postgresql'
service_restart_command = '/opt/gitlab/bin/gitlab-ctl restart postgresql'
promote_command = '/opt/gitlab/embedded/bin/repmgr standby promote -f /var/opt/gitlab/postgresql/repmgr.conf'
follow_command = '/opt/gitlab/embedded/bin/repmgr standby follow -f /var/opt/gitlab/postgresql/repmgr.conf'
```
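Because `node` must be a unique integer on every database node, it can help to render this file from two inputs rather than editing it by hand. The following is a sketch only: `render_repmgr_conf` is a hypothetical helper, and it simply prints the same settings shown above with the node ID and hostname filled in.

```shell
#!/usr/bin/env bash

# render_repmgr_conf NODE_ID HOSTNAME
# Prints a repmgr.conf matching the template above to stdout.
# Hypothetical helper; rejects a NODE_ID that is not a non-negative integer.
render_repmgr_conf() {
  local node_id="$1" node_host="$2"
  case "$node_id" in
    ''|*[!0-9]*)
      echo "node ID must be an integer: '$node_id'" >&2
      return 1
      ;;
  esac
  if [ -z "$node_host" ]; then
    echo "hostname must not be empty" >&2
    return 1
  fi
  cat <<EOF
cluster=gitlab_cluster
node=${node_id}
node_name=${node_host}
conninfo='host=${node_host} user=gitlab_replicator dbname=repmgr'
pg_bindir='/opt/gitlab/embedded/bin'
service_start_command = '/opt/gitlab/bin/gitlab-ctl start postgresql'
service_stop_command = '/opt/gitlab/bin/gitlab-ctl stop postgresql'
service_restart_command = '/opt/gitlab/bin/gitlab-ctl restart postgresql'
promote_command = '/opt/gitlab/embedded/bin/repmgr standby promote -f /var/opt/gitlab/postgresql/repmgr.conf'
follow_command = '/opt/gitlab/embedded/bin/repmgr standby follow -f /var/opt/gitlab/postgresql/repmgr.conf'
EOF
}

# Example usage (hypothetical node ID and hostname):
#   render_repmgr_conf 2 db2.example.com > /var/opt/gitlab/postgresql/repmgr.conf
```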
### On the primary database node
#### Configuration
Each node needs to be configured to run only the services it needs. Create an `/etc/gitlab/gitlab.rb` on each node that looks like the following, then run `gitlab-ctl reconfigure`.
##### On each consul server node
```ruby
# Disable all components except Consul
bootstrap['enable'] = false
gitaly['enable'] = false
gitlab_workhorse['enable'] = false
mailroom['enable'] = false
nginx['enable'] = false
postgresql['enable'] = false
redis['enable'] = false
sidekiq['enable'] = false
unicorn['enable'] = false
consul['enable'] = true
# START user configuration
# Please set the real values as explained in Required Information section
#
consul['configuration'] = {
server: true,
retry_join: %w(NAMES OR IPS OF ALL CONSUL NODES)
}
#
# END user configuration
```
##### On each database node
```ruby
# Disable all components except PostgreSQL
postgresql['enable'] = true
bootstrap['enable'] = false
nginx['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
redis['enable'] = false
gitaly['enable'] = false
gitlab_workhorse['enable'] = false
mailroom['enable'] = false
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['trust_auth_cidr_addresses'] = %w(127.0.0.0/24)
postgresql['md5_auth_cidr_addresses'] = %w(0.0.0.0/0)
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
# repmgr configuration
repmgr['enable'] = true
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
# Enable the consul agent
consul['enable'] = true
consul['services'] = %w(postgresql)
# START user configuration
# Please set the real values as explained in Required Information section
#
postgresql['pgbouncer_user'] = 'PGBOUNCER_USERNAME'
postgresql['pgbouncer_user_password'] = 'PGBOUNCER_PASSWORD_HASH' # This is the hash generated in the preparation section
postgresql['max_wal_senders'] = X
repmgr['trust_auth_cidr_addresses'] = %w(XXX.XXX.XXX.XXX/YY) # This should be the CIDR of the network(s) your database nodes are on
consul['configuration'] = {
retry_join: %w(NAMES OR IPS OF ALL CONSUL NODES)
}
#
# END user configuration
```
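The `repmgr['trust_auth_cidr_addresses']` value above is written in CIDR form. If you only know your network in subnet-mask form (for example `192.168.0.0/255.255.255.0`), the prefix length can be derived from the mask. The helper below is a sketch under that assumption; `mask_to_prefix` is a hypothetical name, and it does not check that the mask bits are contiguous.

```shell
#!/usr/bin/env bash

# mask_to_prefix NETMASK
# Prints the CIDR prefix length for a dotted-quad netmask.
# Hypothetical helper; does not verify that the mask bits are contiguous.
mask_to_prefix() {
  local mask="$1" prefix=0 count=0 octet
  local IFS=.
  for octet in $mask; do
    count=$(( count + 1 ))
    case "$octet" in
      255) prefix=$(( prefix + 8 )) ;;
      254) prefix=$(( prefix + 7 )) ;;
      252) prefix=$(( prefix + 6 )) ;;
      248) prefix=$(( prefix + 5 )) ;;
      240) prefix=$(( prefix + 4 )) ;;
      224) prefix=$(( prefix + 3 )) ;;
      192) prefix=$(( prefix + 2 )) ;;
      128) prefix=$(( prefix + 1 )) ;;
      0) ;;
      *)
        echo "invalid netmask octet: $octet" >&2
        return 1
        ;;
    esac
  done
  if [ "$count" -ne 4 ]; then
    echo "netmask must have four octets: $mask" >&2
    return 1
  fi
  echo "$prefix"
}

# Example usage: build the CIDR string for a network given as address and mask.
#   echo "192.168.0.0/$(mask_to_prefix 255.255.255.0)"
```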
##### On the pgbouncer node
Ensure the following attributes are set
```ruby
# Disable all components except Pgbouncer
postgresql['enable'] = false
bootstrap['enable'] = false
nginx['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
redis['enable'] = false
gitaly['enable'] = false
gitlab_workhorse['enable'] = false
mailroom['enable'] = false
pgbouncer['enable'] = true
# Configure pgbouncer
pgbouncer['listen_address'] = '0.0.0.0'
# Enable the consul agent
consul['enable'] = true
consul['watchers'] = %w(postgresql)
# START user configuration
# Please set the real values as explained in Required Information section
#
consul['configuration'] = {
retry_join: %w(NAMES OR IPS OF ALL CONSUL NODES)
}
#
# END user configuration
```
##### Application node(s)
These will be the nodes running the gitlab-rails service. You may have other attributes set, but the following need to be set:
```ruby
gitlab_rails['db_host'] = 'PGBOUNCER_NODE'
gitlab_rails['db_port'] = 6432
```
#### Post-configuration
After reconfigure successfully runs, the following steps must be completed to get the cluster up and running
#### Consul server nodes
1. Verify the nodes are all communicating
```
# consul members
Node Address Status Type Build Protocol DC
NODE_ONE XXX.XXX.XXX.YYY:8301 alive server 0.9.2 2 gitlab_cluster
NODE_TWO XXX.XXX.XXX.YYY:8301 alive server 0.9.2 2 gitlab_cluster
NODE_THREE XXX.XXX.XXX.YYY:8301 alive server 0.9.2 2 gitlab_cluster
```
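The check above can be scripted by reading the `consul members` output and confirming that every row reports `alive`. This is a rough sketch that assumes the column layout shown above (node name first, status third); `check_consul_members` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# check_consul_members
# Reads `consul members` output on stdin. Prints any member whose status
# is not "alive" and returns non-zero if one is found or no members are listed.
# Assumes the status is the third whitespace-separated column.
check_consul_members() {
  awk '
    NR > 1 && NF >= 3 {
      total++
      if ($3 != "alive") {
        bad++
        print $1 " is " $3
      }
    }
    END {
      if (total == 0 || bad > 0) exit 1
    }
  '
}

# Example usage:
#   consul members | check_consul_members && echo "all consul members alive"
```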
##### On the primary database node
1. Open a database prompt:
```
$ gitlab-psql -d template1
$ gitlab-psql -d gitlabhq_production
# Output:
psql (DB_VERSION)
Type "help" for help.
template1=#
```
1. Run the following command at the database prompt and you will be asked to
enter the new password for the PostgreSQL superuser.
gitlabhq_production=#
```
template1=# \password
# Output:
Enter new password:
Enter it again:
1. Enable the `pg_trgm` extension:
```
1. Create the repmgr database:
```
template1=# ALTER USER gitlab_replicator WITH SUPERUSER;
template1=# CREATE DATABASE repmgr WITH OWNER gitlab_replicator;
```
1. Switch to the GitLab database and Enable the `pg_trgm` extension:
```
template1=# \c gitlabhq_production
gitlabhq_production=# CREATE EXTENSION pg_trgm;
# Output:
......@@ -175,121 +228,104 @@ using [repmgr](http://www.repmgr.org/) to handle standby synchronization, and fa
1. Exit the database prompt by typing `\q` and Enter.
1. Register the node as the initial master node for the repmgr cluster
```
# su - gitlab-psql
$ repmgr -f /var/opt/gitlab/postgresql/repmgr.conf master register
NOTICE: master node correctly registered for cluster 'gitlab_cluster' with id X (conninfo: host=HOSTNAME user=gitlab_replicator dbname=repmgr)
```
1. Verify the cluster is initialized with one node
```
$ repmgr -f /var/opt/gitlab/postgresql/repmgr.conf cluster show
# gitlab-ctl repmgr cluster show
Role | Name | Upstream | Connection String
----------+-------------|----------|----------------------------------------
* master | HOSTNAME | | host=HOSTNAME user=gitlab_replicator dbname=repmgr
* master | HOSTNAME | | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
```
### On each standby node
1. Stop postgresql
```
# gitlab-ctl stop postgresql
```
1. Clear out the current data directory
##### On each standby node
1. Setup the repmgr standby
```
# rm -rf /var/opt/gitlab/postgresql/data/*
```
1. Synchronize the data from the primary node:
```
# su - gitlab-psql
$ repmgr -h PRIMARY_HOSTNAME -U gitlab_replicator -d repmgr -D /var/opt/gitlab/postgresql/data/ -f /var/opt/gitlab/postgresql/repmgr.conf standby clone
```
1. Start the database
```
$ gitlab-ctl start postgresql
```
1. Register the node with the cluster
```
$ repmgr -f /var/opt/gitlab/postgresql/repmgr.conf standby register
NOTICE: standby node correctly registered for cluster gitlab_cluster with id X (conninfo: host=HOSTNAME user=gitlab_replicator dbname=repmgr)
# gitlab-ctl repmgr standby setup MASTER_NODE
```
1. Verify the node now appears in the cluster
```
$ repmgr -f /var/opt/gitlab/postgresql/repmgr.conf cluster show
# gitlab-ctl repmgr cluster show
Role | Name | Upstream | Connection String
----------+------------|------------|------------------------------------------------
* master | MASTER | | host=MASTER_HOSTNAME user=gitlab_replicator dbname=repmgr
standby | STANDBY | MASTER | host=STANDBY_HOSTNAME user=gitlab_replicator dbname=repmgr
* master | MASTER | | host=MASTER_HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
standby | STANDBY | MASTER | host=STANDBY_HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
```
### (Optional) Enable repmgrd
You can use repmgrd to monitor the database and automatically fail over if it detects that the current master is unreachable.
Currently, there is no method of telling the application to automatically fail over to the new master; it must be done
manually, so this step is not required.
##### On the pgbouncer node
1. Create a `.pgpass` file for the `CONSUL_USERNAME` account so that it is able to reload pgbouncer
```
# gitlab-ctl write-pgpass --host PGBOUNCER_HOST --database pgbouncer --user gitlab-consul
Please enter password: ****
Confirm password: ****
```
If you still want to enable this feature, do the following on each database node
1. Add the following line to `/var/opt/gitlab/postgresql/repmgr.conf`
```
failover=automatic
```
1. Ensure the node is talking to the current master
```
# /opt/gitlab/embedded/bin/psql -h 127.0.0.1 -p 6432 -d pgbouncer pgbouncer # You will be prompted for PGBOUNCER_PASSWORD
pgbouncer=# show databases ; show clients ;
name | host | port | database | force_user | pool_size | reserve_pool | pool_mode | max_connections | current_connections
---------------------+-------------+------+---------------------+------------+-----------+--------------+-----------+-----------------+---------------------
gitlabhq_production | MASTER_HOST | 5432 | gitlabhq_production | | 20 | 0 | | 0 | 0
pgbouncer | | 6432 | pgbouncer | pgbouncer | 2 | 0 | statement | 0 | 0
(2 rows)
type | user | database | state | addr | port | local_addr | local_port | connect_time | request_time | ptr | link
| remote_pid | tls
------+-----------+---------------------+---------+----------------+-------+------------+------------+---------------------+---------------------+-----------+-----
-+------------+-----
C | (nouser) | gitlabhq_production | waiting | IP_OF_APP_NODE | 56512 | 127.0.0.1 | 6432 | 2017-08-21 18:08:51 | 2017-08-21 18:08:51 | 0x22b3700 |
| 0 |
C | pgbouncer | pgbouncer | active | 127.0.0.1 | 56846 | 127.0.0.1 | 6432 | 2017-08-21 18:09:59 | 2017-08-21 18:10:48 | 0x22b3880 |
| 0 |
(2 rows)
1. It may be necessary to manually run migrations.
```
# gitlab-rake db:migrate
```
1. Create the log directory
```
install -o gitlab-psql -d /var/log/gitlab/repmgr
```
#### Server running
At this point, your GitLab instance should be up and running. Verify that you are able to log in and create issues and merge requests.
1. Start repmgrd
```
# su - gitlab-psql -c '/opt/gitlab/embedded/bin/repmgrd -f /var/opt/gitlab/postgresql/repmgr.conf --verbose -d >> /var/log/gitlab/repmgr/repmgr.log 2>&1'
```
### Failover procedure
By default, if the master database fails, repmgrd should promote one of the standby nodes to master automatically, and consul will update pgbouncer with the new master.
### Operations
If your master node is experiencing an issue, you can manually fail over.
1. If the master database is still running, shut it down first
If you need to fail over manually, you have two options:
1. Shut down the current master database
```
# gitlab-ctl stop postgresql
```
The automated failover process will see this and failover to one of the standby nodes.
1. Log in to the server that should become the new master and run the following
```
# su - gitlab-psql
$ repmgr -f /var/opt/gitlab/postgresql/repmgr.conf standby promote
```
1. Manually failover
1. Ensure the old master node is not still active.
1. If there are any other standby servers in the cluster, have them follow the new master server
```
# su - gitlab-psql
# repmgr -f /var/opt/gitlab/postgresql/repmgr.conf -h NEW_MASTER -U gitlab_replicator -d repmgr -d /var/opt/gitlab/postgresql/data standby follow
```
1. Log in to the server that should become the new master and run the following
```
# gitlab-ctl repmgr standby promote
```
1. On the servers that run `gitlab-rails`, set the `gitlab_rails['db_host']` attribute to the new master, and run `gitlab-ctl reconfigure`
1. If there are any other standby servers in the cluster, have them follow the new master server
```
# gitlab-ctl repmgr standby follow NEW_MASTER
```
1. At this point, you should have a functioning cluster with database writes going to the new master. Now you can recover the failed master server, or remove it from the cluster
### Restore procedure
If a node fails, it can be removed from the cluster, or added back as a standby after it has been restored to service.
1. If you want to remove the node from the cluster, on any other node in the cluster, run:
* If you want to remove the node from the cluster, on any other node in the cluster, run:
```
# su - gitlab-psql
$ repmgr -f /var/opt/gitlab/postgresql/repmgr.conf standby unregister --node=X # X should be the value of node in repmgr.conf on the old server
# gitlab-ctl repmgr standby unregister --node=X # X should be the value of node in repmgr.conf on the old server
```
1. If the failed master has been recovered, it can be converted to a standby server and follow the new master server[^1]
* To add the node as a standby server[^1]
```
# su - gitlab-psql
# repmgr -f /var/opt/gitlab/postgresql/repmgr.conf -h NEW_MASTER -U gitlab_replicator -d repmgr -d /var/opt/gitlab/postgresql/data standby follow
# gitlab-ctl repmgr standby follow NEW_MASTER
# gitlab-ctl restart repmgrd
```
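Before re-adding a recovered node, it may be worth checking how many masters the cluster currently reports, since repmgr lists two masters while the old master is back online but not yet converted to a standby. The sketch below counts master rows in `cluster show` output read from stdin. It assumes the table layout shown earlier in this document, and `count_masters` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# count_masters
# Reads `gitlab-ctl repmgr cluster show` output on stdin and prints the
# number of rows whose Role column is "master".
# Assumes the Role column comes first and columns are separated by '|'.
count_masters() {
  awk -F'|' '
    {
      role = $1
      gsub(/[ *]/, "", role)   # drop padding and the leading "*" marker
      if (role == "master") n++
    }
    END { print n + 0 }
  '
}

# Example usage:
#   masters=$(gitlab-ctl repmgr cluster show | count_masters)
#   [ "$masters" -eq 1 ] || echo "expected exactly one master, found $masters"
```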
[^1]: When the server is back online, and before you switch it to a standby node, repmgr will report that there are two masters.
If there are any clients that are still writing to the old master, this will cause a split, and the old master will need to be resynced from scratch by performing a `standby clone` before you run `standby follow`
## Configuring the Application
After database setup is complete, the next step is to configure the GitLab application servers with the appropriate details.
When prompted for `gitlab_rails['db_host']`, this should be set to the master node in your cluster.
This step is covered in [Configuring GitLab for HA](gitlab.md).
[^1]: **Warning**: When the server is brought back online, and before you switch it to a standby node, repmgr will report that there are two masters.
If there are any clients that are still attempting to write to the old master, this will cause a split, and the old master will need to be resynced from scratch by performing a `standby setup NEW_MASTER`.