Commit 56f66e4c authored by Drew Blessing's avatar Drew Blessing

Merge branch 'ha_docs' into 'master'

Updated HA docs and open to public

Closes https://dev.gitlab.org/gitlab/organization/issues/624

We still need to go through the old gitlab.com/standard/high-availability and move relevant docs over, plus do some docs on DRBD. However, I think this is a good volume for the first pass. Can we get this out there and then iterate?

cc/ @axil @patricio For your review, please.

See merge request !3579
parents c38f2645 34ea1d2c
......@@ -42,6 +42,7 @@
- [Housekeeping](administration/housekeeping.md) Keep your Git repository tidy and fast.
- [GitLab Performance Monitoring](monitoring/performance/introduction.md) Configure GitLab and InfluxDB for measuring performance metrics
- [Sidekiq Troubleshooting](administration/troubleshooting/sidekiq.md) Debug when Sidekiq appears hung and is not processing jobs
- [High Availability](administration/high-availability/README.md) Configure multiple servers for scaling or high availability
## Contributor documentation
......
# High Availability
GitLab supports several different types of clustering and high-availability.
The solution you choose will be based on the level of scalability and
availability you require. The easiest solutions are scalable, but not necessarily
highly available.
## Architecture
### Active/Passive
For pure high-availability/failover with no scaling you can use an
active/passive configuration. This utilizes DRBD (Distributed Replicated
Block Device) to keep all data in sync. DRBD requires a low latency link to
remain in sync. It is not advisable to attempt to run DRBD between data centers
or in different cloud availability zones.
Components/Servers Required:
- 2 servers/virtual machines (one active/one passive)
### Active/Active
This architecture scales easily because all application servers handle
user requests simultaneously. The database, Redis, and GitLab application are
all deployed on separate servers. The configuration is **only** highly-available
if the database, Redis and storage are also configured as such.
**Steps to configure active/active:**
1. [Configure the database](database.md)
1. [Configure Redis](redis.md)
1. [Configure NFS](nfs.md)
1. [Configure the GitLab application servers](gitlab.md)
1. [Configure the load balancers](load_balancer.md)
# Configuring a Database for GitLab HA
You can choose to install and manage a database server (PostgreSQL/MySQL)
yourself, or you can use GitLab Omnibus packages to help. GitLab recommends
PostgreSQL. This is the database that will be installed if you use the
Omnibus package to manage your database.
## Configure your own database server
If you're hosting GitLab on a cloud provider, you can optionally use a
managed service for PostgreSQL. For example, AWS offers a managed Relational
Database Service (RDS) that runs PostgreSQL.
If you use a cloud-managed service, or provide your own PostgreSQL:
1. Set up a `gitlab` username with a password of your choice. The `gitlab` user
needs privileges to create the `gitlabhq_production` database.
1. Configure the GitLab application servers with the appropriate details.
This step is covered in [Configuring GitLab for HA](gitlab.md)
## Configure using Omnibus
1. Download/install GitLab Omnibus using **steps 1 and 2** from
[GitLab downloads](https://about.gitlab.com/downloads). Do not complete other
steps on the download page.
1. Create/edit `/etc/gitlab/gitlab.rb` and use the following configuration.
Be sure to change the `external_url` to match your eventual GitLab front-end
URL.
```ruby
external_url 'https://gitlab.example.com'
# Disable all components except PostgreSQL
postgresql['enable'] = true
bootstrap['enable'] = false
nginx['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
redis['enable'] = false
gitlab_workhorse['enable'] = false
mailroom['enable'] = false
# PostgreSQL configuration
postgresql['sql_password'] = 'DB password'
postgresql['md5_auth_cidr_addresses'] = ['0.0.0.0/0']
postgresql['listen_address'] = '0.0.0.0'
```
1. Run `sudo gitlab-ctl reconfigure` to install and configure PostgreSQL.
> **Note**: This `reconfigure` step will result in some errors.
That's OK - don't be alarmed.
1. Open a database prompt:
```
su - gitlab-psql
/bin/bash
psql -h /var/opt/gitlab/postgresql -d template1
# Output:
psql (9.2.15)
Type "help" for help.
template1=#
```
1. Run the following command at the database prompt and you will be asked to
enter the new password for the PostgreSQL superuser.
```
\password
# Output:
Enter new password:
Enter it again:
```
1. Similarly, set the password for the `gitlab` database user. Use the same
password that you specified in the `/etc/gitlab/gitlab.rb` file for
`postgresql['sql_password']`.
```
\password gitlab
# Output:
Enter new password:
Enter it again:
```
1. Enable the `pg_trgm` extension:
```
CREATE EXTENSION pg_trgm;
# Output:
CREATE EXTENSION
```
1. Exit the database prompt by typing `\q` and Enter.
1. Exit the `gitlab-psql` user by running `exit` twice.
1. Run `sudo gitlab-ctl reconfigure` a final time.
1. Run `touch /etc/gitlab/skip-auto-migrations` to prevent database migrations
from running on upgrade. Only the primary GitLab application server should
handle migrations.
---
Read more on high-availability configuration:
1. [Configure Redis](redis.md)
1. [Configure NFS](nfs.md)
1. [Configure the GitLab application servers](gitlab.md)
1. [Configure the load balancers](load_balancer.md)
# Configuring GitLab for HA
Assuming you have already configured a database, Redis, and NFS, you can
configure the GitLab application server(s) now. Complete the steps below
for each GitLab application server in your environment.
> **Note:** There is some additional configuration near the bottom for
secondary GitLab application servers. It's important to read and understand
these additional steps before proceeding with GitLab installation.
1. If necessary, install the NFS client utility packages using the following
commands:
```
# Ubuntu/Debian
apt-get install nfs-common
# CentOS/Red Hat
yum install nfs-utils nfs-utils-lib
```
1. Specify the necessary NFS shares. Mounts are specified in
`/etc/fstab`. The exact contents of `/etc/fstab` will depend on how you chose
to configure your NFS server. See [NFS documentation](nfs.md) for the various
options. Here is an example snippet to add to `/etc/fstab`:
```
10.1.0.1:/var/opt/gitlab/.ssh /var/opt/gitlab/.ssh nfs defaults,soft,rsize=1048576,wsize=1048576,noatime,nobootwait,lookupcache=positive 0 2
10.1.0.1:/var/opt/gitlab/gitlab-rails/uploads /var/opt/gitlab/gitlab-rails/uploads nfs defaults,soft,rsize=1048576,wsize=1048576,noatime,nobootwait,lookupcache=positive 0 2
10.1.0.1:/var/opt/gitlab/gitlab-rails/shared /var/opt/gitlab/gitlab-rails/shared nfs defaults,soft,rsize=1048576,wsize=1048576,noatime,nobootwait,lookupcache=positive 0 2
10.1.0.1:/var/opt/gitlab/gitlab-ci/builds /var/opt/gitlab/gitlab-ci/builds nfs defaults,soft,rsize=1048576,wsize=1048576,noatime,nobootwait,lookupcache=positive 0 2
10.1.1.1:/var/opt/gitlab/git-data /var/opt/gitlab/git-data nfs defaults,soft,rsize=1048576,wsize=1048576,noatime,nobootwait,lookupcache=positive 0 2
```
1. Create the shared directories. These may be different depending on your NFS
mount locations.
```
mkdir -p /var/opt/gitlab/.ssh /var/opt/gitlab/gitlab-rails/uploads /var/opt/gitlab/gitlab-rails/shared /var/opt/gitlab/gitlab-ci/builds /var/opt/gitlab/git-data
```
1. Download/install GitLab Omnibus using **steps 1 and 2** from
[GitLab downloads](https://about.gitlab.com/downloads). Do not complete other
steps on the download page.
1. Create/edit `/etc/gitlab/gitlab.rb` and use the following configuration.
Be sure to change the `external_url` to match your eventual GitLab front-end
URL. Depending your the NFS configuration, you may need to change some GitLab
data locations. See [NFS documentation](nfs.md) for `/etc/gitlab/gitlab.rb`
configuration values for various scenarios. The example below assumes you've
added NFS mounts in the default data locations.
```ruby
external_url 'https://gitlab.example.com'
# Prevent GitLab from starting if NFS data mounts are not available
high_availability['mountpoint'] = '/var/opt/gitlab/git-data'
# Disable components that will not be on the GitLab application server
postgresql['enable'] = false
redis['enable'] = false
# PostgreSQL connection details
gitlab_rails['db_adapter'] = 'postgresql'
gitlab_rails['db_encoding'] = 'unicode'
gitlab_rails['db_host'] = '10.1.0.5' # IP/hostname of database server
gitlab_rails['db_password'] = 'DB password'
# Redis connection details
gitlab_rails['redis_port'] = '6379'
gitlab_rails['redis_host'] = '10.1.0.6' # IP/hostname of Redis server
gitlab_rails['redis_password'] = 'Redis Password'
```
1. Run `sudo gitlab-ctl reconfigure` to compile the configuration.
## Primary GitLab application server
As a final step, run the setup rake task on the first GitLab application server.
It is not necessary to run this on additional application servers.
1. Initialize the database by running `sudo gitlab-rake gitlab:setup`.
> **WARNING:** Only run this setup task on **NEW** GitLab instances because it
will wipe any existing data.
> **Note:** When you specify `https` in the `external_url`, as in the example
above, GitLab assumes you have SSL certificates in `/etc/gitlab/ssl/`. If
certificates are not present, Nginx will fail to start. See
[Nginx documentation](http://docs.gitlab.com/omnibus/settings/nginx.html#enable-https)
for more information.
## Additional configuration for secondary GitLab application servers
Secondary GitLab servers (servers configured **after** the first GitLab server)
need some additional configuration.
1. Configure shared secrets. These values can be obtained from the primary
GitLab server in `/etc/gitlab/gitlab-secrets.json`. Add these to
`/etc/gitlab/gitlab.rb` **prior to** running the first `reconfigure` in
the steps above.
```ruby
gitlab_shell['secret_token'] = 'fbfb19c355066a9afb030992231c4a363357f77345edd0f2e772359e5be59b02538e1fa6cae8f93f7d23355341cea2b93600dab6d6c3edcdced558fc6d739860'
gitlab_rails['secret_token'] = 'b719fe119132c7810908bba18315259ed12888d4f5ee5430c42a776d840a396799b0a5ef0a801348c8a357f07aa72bbd58e25a84b8f247a25c72f539c7a6c5fa'
gitlab_ci['secret_key_base'] = '6e657410d57c71b4fc3ed0d694e7842b1895a8b401d812c17fe61caf95b48a6d703cb53c112bc01ebd197a85da81b18e29682040e99b4f26594772a4a2c98c6d'
gitlab_ci['db_key_base'] = 'bf2e47b68d6cafaef1d767e628b619365becf27571e10f196f98dc85e7771042b9203199d39aff91fcb6837c8ed83f2a912b278da50999bb11a2fbc0fba52964'
```
1. Run `touch /etc/gitlab/skip-auto-migrations` to prevent database migrations
from running on upgrade. Only the primary GitLab application server should
handle migrations.
## Troubleshooting
- `mount: wrong fs type, bad option, bad superblock on`
You have not installed the necessary NFS client utilities. See step 1 above.
- `mount: mount point /var/opt/gitlab/... does not exist`
This particular directory does not exist on the NFS server. Ensure
the share is exported and exists on the NFS server and try to remount.
---
Read more on high-availability configuration:
1. [Configure the database](database.md)
1. [Configure Redis](redis.md)
1. [Configure NFS](nfs.md)
1. [Configure the load balancers](load_balancer.md)
# Load Balancer for GitLab HA
In an active/active GitLab configuration, you will need a load balancer to route
traffic to the application servers. The specifics on which load balancer to use
or the exact configuration is beyond the scope of GitLab documentation. We hope
that if you're managing HA systems like GitLab you have a load balancer of
choice already. Some examples including HAProxy (open-source), F5 Big-IP LTM,
and Citrix Net Scaler. This documentation will outline what ports and protocols
you need to use with GitLab.
## Basic ports
| LB Port | Backend Port | Protocol |
| ------- | ------------ | -------- |
| 80 | 80 | HTTP |
| 443 | 443 | HTTPS [^1] |
| 22 | 22 | TCP |
## GitLab Pages Ports
If you're using GitLab Pages you will need some additional port configurations.
GitLab Pages requires a separate VIP. Configure DNS to point the
`pages_external_url` from `/etc/gitlab/gitlab.rb` at the new VIP. See the
[GitLab Pages documentation][gitlab-pages] for more information.
| LB Port | Backend Port | Protocol |
| ------- | ------------ | -------- |
| 80 | Varies [^2] | HTTP |
| 443 | Varies [^2] | TCP [^3] |
## Alternate SSH Port
Some organizations have policies against opening SSH port 22. In this case,
it may be helpful to configure an alternate SSH hostname that allows users
to use SSH on port 443. An alternate SSH hostname will require a new VIP
compared to the other GitLab HTTP configuration above.
Configure DNS for an alternate SSH hostname such as altssh.gitlab.example.com.
| LB Port | Backend Port | Protocol |
| ------- | ------------ | -------- |
| 443 | 22 | TCP |
---
Read more on high-availability configuration:
1. [Configure the database](database.md)
1. [Configure Redis](redis.md)
1. [Configure NFS](nfs.md)
1. [Configure the GitLab application servers](gitlab.md)
[^1]: When using HTTPS protocol for port 443, you will need to add an SSL
certificate to the load balancers. If you wish to terminate SSL at the
GitLab application server instead, use TCP protocol.
[^2]: The backend port for GitLab Pages depends on the
`gitlab_pages['external_http']` and `gitlab_pages['external_https']`
setting. See [GitLab Pages documentation][gitlab-pages] for more details.
[^3]: Port 443 for GitLab Pages should always use the TCP protocol. Users can
configure custom domains with custom SSL, which would not be possible
if SSL was terminated at the load balancer.
[gitlab-pages]: http://doc.gitlab.com/ee/pages/administration.html
# NFS
## Required NFS Server features
**File locking**: GitLab **requires** file locking which is only supported
natively in NFS version 4. NFSv3 also supports locking as long as
Linux Kernel 2.6.5+ is used. We recommend using version 4 and do not
specifically test NFSv3.
**no_root_squash**: NFS normally changes the `root` user to `nobody`. This is
a good security measure when NFS shares will be accessed by many different
users. However, in this case only GitLab will use the NFS share so it
is safe. GitLab requires the `no_root_squash` setting because we need to
manage file permissions automatically. Without the setting you will receive
errors when the Omnibus package tries to alter permissions. Note that GitLab
and other bundled components do **not** run as `root` but as non-privileged
users. The requirement for `no_root_squash` is to allow the Omnibus package to
set ownership and permissions on files, as needed.
### Recommended options
When you define your NFS exports, we recommend you also add the following
options:
- `sync` - Force synchronous behavior. Default is asynchronous and under certain
circumstances it could lead to data loss if a failure occurs before data has
synced.
## Client mount options
Below is an example of an NFS mount point we use on GitLab.com:
```
10.1.1.1:/var/opt/gitlab/git-data /var/opt/gitlab/git-data nfs4 defaults,soft,rsize=1048576,wsize=1048576,noatime,nobootwait,lookupcache=positive 0 2
```
Notice several options that you should consider using:
| Setting | Description |
| ------- | ----------- |
| `nobootwait` | Don't halt boot process waiting for this mount to become available
| `lookupcache=positive` | Tells the NFS client to honor `positive` cache results but invalidates any `negative` cache results. Negative cache results cause problems with Git. Specifically, a `git push` can fail to register uniformly across all NFS clients. The negative cache causes the clients to 'remember' that the files did not exist previously.
## Mount locations
When using default Omnibus configuration you will need to share 5 data locations
between all GitLab cluster nodes. No other locations should be shared. The
following are the 5 locations you need to mount:
| Location | Description |
| -------- | ----------- |
| `/var/opt/gitlab/git-data` | Git repository data. This will account for a large portion of your data
| `/var/opt/gitlab/.ssh` | SSH `authorized_keys` file and keys used to import repositories from some other Git services
| `/var/opt/gitlab/gitlab-rails/uploads` | User uploaded attachments
| `/var/opt/gitlab/gitlab-rails/shared` | Build artifacts, GitLab Pages, LFS objects, temp files, etc. If you're using LFS this may also account for a large portion of your data
| `/var/opt/gitlab/gitlab-ci/builds` | GitLab CI build traces
Other GitLab directories should not be shared between nodes. They contain
node-specific files and GitLab code that does not need to be shared. To ship
logs to a central location consider using remote syslog. GitLab Omnibus packages
provide configuration for [UDP log shipping][udp-log-shipping].
### Consolidating mount points
If you don't want to configure 5-6 different NFS mount points, you have a few
alternative options.
#### Change default file locations
Omnibus allows you to configure the file locations. With custom configuration
you can specify just one main mountpoint and have all of these locations
as subdirectories. Mount `/gitlab-data` then use the following Omnibus
configuration to move each data location to a subdirectory:
```ruby
user['home'] = '/gitlab-data/home'
git_data_dir '/gitlab-data/git-data'
gitlab_rails['shared_path'] = '/gitlab-data/shared'
gitlab_rails['uploads_directory'] = "/gitlab-data/uploads"
gitlab_ci['builds_directory'] = '/gitlab-data/builds'
```
To move the `git` home directory, all GitLab services must be stopped. Run
`gitlab-ctl stop && initctl stop gitlab-runsvdir`. Then continue with the
reconfigure.
Run `sudo gitlab-ctl reconfigure` to start using the central location. Please
be aware that if you had existing data you will need to manually copy/rsync it
to these new locations and then restart GitLab.
#### Bind mounts
Bind mounts provide a way to specify just one NFS mount and then
bind the default GitLab data locations to the NFS mount. Start by defining your
single NFS mount point as you normally would in `/etc/fstab`. Let's assume your
NFS mount point is `/gitlab-data`. Then, add the following bind mounts in
`/etc/fstab`:
```bash
/gitlab-data/git-data /var/opt/gitlab/git-data none bind 0 0
/gitlab-data/.ssh /var/opt/gitlab/.ssh none bind 0 0
/gitlab-data/uploads /var/opt/gitlab/gitlab-rails/uploads none bind 0 0
/gitlab-data/shared /var/opt/gitlab/gitlab-rails/shared none bind 0 0
/gitlab-data/builds /var/opt/gitlab/gitlab-ci/builds none bind 0 0
```
---
Read more on high-availability configuration:
1. [Configure the database](database.md)
1. [Configure Redis](redis.md)
1. [Configure the GitLab application servers](gitlab.md)
1. [Configure the load balancers](load_balancer.md)
[udp-log-shipping]: http://doc.gitlab.com/omnibus/settings/logs.html#udp-log-shipping-gitlab-enterprise-edition-only "UDP log shipping"
# Configuring Redis for GitLab HA
You can choose to install and manage Redis yourself, or you can use GitLab
Omnibus packages to help.
## Configure your own Redis server
If you're hosting GitLab on a cloud provider, you can optionally use a
managed service for Redis. For example, AWS offers a managed ElastiCache service
that runs Redis.
> **Note:** Redis does not require authentication by default. See
[Redis Security](http://redis.io/topics/security) documentation for more
information. We recommend using a combination of a Redis password and tight
firewall rules to secure your Redis service.
## Configure using Omnibus
1. Download/install GitLab Omnibus using **steps 1 and 2** from
[GitLab downloads](https://about.gitlab.com/downloads). Do not complete other
steps on the download page.
1. Create/edit `/etc/gitlab/gitlab.rb` and use the following configuration.
Be sure to change the `external_url` to match your eventual GitLab front-end
URL.
```ruby
external_url 'https://gitlab.example.com'
# Disable all components except PostgreSQL
redis['enable'] = true
bootstrap['enable'] = false
nginx['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
postgresql['enable'] = false
gitlab_workhorse['enable'] = false
mailroom['enable'] = false
# Redis configuration
redis['port'] = 6379
redis['bind'] = '0.0.0.0'
# If you wish to use Redis authentication (recommended)
redis['password'] = 'Redis Password'
```
1. Run `sudo gitlab-ctl reconfigure` to install and configure PostgreSQL.
> **Note**: This `reconfigure` step will result in some errors.
That's OK - don't be alarmed.
1. Run `touch /etc/gitlab/skip-auto-migrations` to prevent database migrations
from running on upgrade. Only the primary GitLab application server should
handle migrations.
---
Read more on high-availability configuration:
1. [Configure the database](database.md)
1. [Configure NFS](nfs.md)
1. [Configure the GitLab application servers](gitlab.md)
1. [Configure the load balancers](load_balancer.md)
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment