Commit b2f994ed authored by Achilleas Pipinellis, committed by Suzanne Selhorn

Refactor Puma settings page to conform to CTRT

parent 87256c6f
...@@ -4,45 +4,19 @@ group: Distribution ...@@ -4,45 +4,19 @@ group: Distribution
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
--- ---
# Puma **(FREE SELF)** # Configure the bundled Puma instance of the GitLab package **(FREE SELF)**
Puma is a simple, fast, multi-threaded, and highly concurrent HTTP 1.1 server for Puma is a fast, multi-threaded, and highly concurrent HTTP 1.1 server for
Ruby applications. It's the default GitLab web server since GitLab 13.0 Ruby applications. It runs the core Rails application that provides the user-facing
and has replaced Unicorn. From GitLab 14.0, Unicorn is no longer supported. features of GitLab.
NOTE: ## Reducing memory use
Starting with GitLab 13.0, Puma is the default web server and Unicorn has been disabled.
In GitLab 14.0, Unicorn was removed from the Linux package and only Puma is available.
## Configure Puma
To configure Puma:
1. Determine suitable Puma worker and thread [settings](../../install/requirements.md#puma-settings).
1. If you're switching from Unicorn, [convert any custom settings to Puma](#convert-unicorn-settings-to-puma).
1. For multi-node deployments, configure the load balancer to use the
[readiness check](../load_balancer.md#readiness-check).
1. Reconfigure GitLab so the above changes take effect:
```shell
sudo gitlab-ctl reconfigure
```
For Helm-based deployments, see the
[`webservice` chart documentation](https://docs.gitlab.com/charts/charts/gitlab/webservice/index.html).
For more details about the Puma configuration, see the
[Puma documentation](https://github.com/puma/puma#configuration).
## Puma Worker Killer To reduce memory use, Puma forks worker processes. Each time a worker is created,
it shares memory with the primary process. The worker uses additional memory only
when it changes or adds to its memory pages.
Puma forks worker processes as part of a strategy to reduce memory use. Memory use increases over time, but you can use Puma Worker Killer to recover memory.
Each time a worker is created, it shares memory with the primary process and
only uses additional memory when it makes changes or additions to its memory pages.
Memory use by workers therefore increases over time, and Puma Worker Killer is the
mechanism that recovers this memory.
By default: By default:
...@@ -50,6 +24,8 @@ By default: ...@@ -50,6 +24,8 @@ By default:
exceeds a [memory limit](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/cluster/puma_worker_killer_initializer.rb). exceeds a [memory limit](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/cluster/puma_worker_killer_initializer.rb).
- Rolling restarts of Puma workers are performed every 12 hours. - Rolling restarts of Puma workers are performed every 12 hours.
### Change the memory limit setting
To change the memory limit setting: To change the memory limit setting:
1. Edit `/etc/gitlab/gitlab.rb`: 1. Edit `/etc/gitlab/gitlab.rb`:
...@@ -58,26 +34,28 @@ To change the memory limit setting: ...@@ -58,26 +34,28 @@ To change the memory limit setting:
puma['per_worker_max_memory_mb'] = 1024 puma['per_worker_max_memory_mb'] = 1024
``` ```
1. Reconfigure GitLab for the changes to take effect: 1. Reconfigure GitLab:
```shell ```shell
sudo gitlab-ctl reconfigure sudo gitlab-ctl reconfigure
``` ```
There are costs associated with killing and replacing workers including When workers are killed and replaced, capacity to run GitLab is reduced,
reduced capacity to run GitLab, and CPU that is consumed and CPU is consumed. Set `per_worker_max_memory_mb` to a higher value if the worker killer
restarting the workers. `per_worker_max_memory_mb` should be set to a is replacing workers too often.
higher value if the worker killer is replacing workers too often.
Worker count is calculated based on CPU cores, so a small GitLab deployment Worker count is calculated based on CPU cores. A small GitLab deployment
with 4-8 workers may experience performance issues if workers are being restarted with 4-8 workers may experience performance issues if workers are being restarted
frequently, once or more per minute. This is too often. too often (once or more per minute).
A higher value of `1200` or more would be beneficial if the server has free memory. A higher value of `1200` or more would be beneficial if the server has free memory.
The worker killer checks every 20 seconds, and can be monitored using ### Monitor worker memory
[the Puma log](../logs.md#puma_stdoutlog) `/var/log/gitlab/puma/puma_stdout.log`.
For example, for GitLab 13.5: The worker killer checks memory every 20 seconds.
To monitor the worker killer, use [the Puma log](../logs.md#puma_stdoutlog) `/var/log/gitlab/puma/puma_stdout.log`.
For example:
```plaintext ```plaintext
PumaWorkerKiller: Out of memory. 4 workers consuming total: 4871.23828125 MB PumaWorkerKiller: Out of memory. 4 workers consuming total: 4871.23828125 MB
...@@ -88,9 +66,9 @@ From this output: ...@@ -88,9 +66,9 @@ From this output:
- The formula that calculates the maximum memory value results in workers - The formula that calculates the maximum memory value results in workers
being killed before they reach the `per_worker_max_memory_mb` value. being killed before they reach the `per_worker_max_memory_mb` value.
- The default values for the formula before GitLab 13.5 were 550MB for the primary - In GitLab 13.4 and earlier, the default values for the formula were 550MB for the primary
and `per_worker_max_memory_mb` specified 850MB for each worker. and 850MB for each worker.
- As of GitLab 13.5 the values are primary: 800MB, worker: 1024MB. - In GitLab 13.5 and later, the values are primary: 800MB, worker: 1024MB.
- The threshold for workers to be killed is set at 98% of the limit: - The threshold for workers to be killed is set at 98% of the limit:
```plaintext ```plaintext
...@@ -102,16 +80,15 @@ From this output: ...@@ -102,16 +80,15 @@ From this output:
Increasing the maximum to `1200`, for example, would set a `max: 5488 MB` value. Increasing the maximum to `1200`, for example, would set a `max: 5488 MB` value.
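
The `max` figures quoted here can be reproduced with a little arithmetic. The following is a minimal sketch, not GitLab's actual implementation: the helper name `worker_killer_max_mb` is hypothetical, and the formula (primary limit plus one per-worker limit for each worker, multiplied by the 98% threshold) is an assumption inferred from the values in this section. It also assumes four workers, as in the sample log line.

```ruby
# Hypothetical helper, not part of GitLab or puma_worker_killer.
# Assumed formula: (primary + workers * per_worker) * 98%, using the
# defaults this page lists for GitLab 13.5 and later.
def worker_killer_max_mb(workers:, per_worker_max_memory_mb: 1024,
                         primary_mb: 800, threshold_percent: 98)
  limit_mb = primary_mb + workers * per_worker_max_memory_mb
  # Multiply before dividing so the integer part stays exact.
  limit_mb * threshold_percent / 100.0
end

puts worker_killer_max_mb(workers: 4)                                 # 4798.08
puts worker_killer_max_mb(workers: 4, per_worker_max_memory_mb: 1200) # 5488.0
```

With four workers, raising `per_worker_max_memory_mb` to `1200` gives the `max: 5488 MB` value mentioned above, which suggests the assumed formula is at least consistent with the documented numbers.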
Workers use additional memory on top of the shared memory, how much Workers use additional memory on top of the shared memory. The amount of memory
depends on a site's use of GitLab. depends on a site's use of GitLab.
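
To see how often the worker killer fires, the `PumaWorkerKiller` line from the Puma log can be parsed. This is a rough sketch that assumes only the part of the line shown in the example earlier in this section; the method name `parse_worker_killer_line` is hypothetical, and any fields that follow `MB` on the real log line are ignored.

```ruby
# Matches only the portion of the log line shown in the example above.
KILL_LINE = /PumaWorkerKiller: Out of memory\. (\d+) workers consuming total: ([\d.]+) MB/

# Hypothetical helper: returns the worker count and total memory for a
# matching log line, or nil for any other line.
def parse_worker_killer_line(line)
  match = KILL_LINE.match(line)
  return nil unless match

  { workers: match[1].to_i, total_mb: match[2].to_f }
end

sample = 'PumaWorkerKiller: Out of memory. 4 workers consuming total: 4871.23828125 MB'
p parse_worker_killer_line(sample)
```

Applying the same method to each line of `/var/log/gitlab/puma/puma_stdout.log` and counting the non-`nil` results per minute gives a rough indication of whether workers are being replaced too often.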
## Worker timeout ## Change the worker timeout
A [timeout of 60 seconds](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/initializers/rack_timeout.rb) The default Puma [timeout is 60 seconds](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/initializers/rack_timeout.rb).
is used when Puma is enabled.
NOTE: NOTE:
Unlike Unicorn, the `puma['worker_timeout']` setting does not set the maximum request duration. The `puma['worker_timeout']` setting does not set the maximum request duration.
To change the worker timeout to 600 seconds: To change the worker timeout to 600 seconds:
...@@ -123,26 +100,38 @@ To change the worker timeout to 600 seconds: ...@@ -123,26 +100,38 @@ To change the worker timeout to 600 seconds:
} }
``` ```
1. Reconfigure GitLab for the changes to take effect: 1. Reconfigure GitLab:
```shell ```shell
sudo gitlab-ctl reconfigure sudo gitlab-ctl reconfigure
``` ```
## Memory-constrained environments ## Disable Puma clustered mode in memory-constrained environments
In a memory-constrained environment with less than 4GB of RAM available, consider disabling Puma In a memory-constrained environment with less than 4GB of RAM available, consider disabling Puma
[Clustered mode](https://github.com/puma/puma#clustered-mode). [clustered mode](https://github.com/puma/puma#clustered-mode).
Configuring Puma by setting the amount of `workers` to `0` could reduce memory usage by hundreds of MB. Set the number of `workers` to `0` to reduce memory usage by hundreds of MB:
For details on Puma worker and thread settings, see the [Puma requirements](../../install/requirements.md#puma-settings).
1. Edit `/etc/gitlab/gitlab.rb`:
Unlike in a Clustered mode, which is set up by default, only a single Puma process would serve the application. ```ruby
puma['worker_processes'] = 0
```
The downside of running Puma with such configuration is the reduced throughput, which could be 1. Reconfigure GitLab:
considered as a fair tradeoff in a memory-constraint environment.
When running Puma in Single mode, some features are not supported: ```shell
sudo gitlab-ctl reconfigure
```
Unlike in a clustered mode, which is set up by default, only a single Puma process would serve the application.
For details on Puma worker and thread settings, see the [Puma requirements](../../install/requirements.md#puma-settings).
The downside of running Puma in this configuration is the reduced throughput, which can be
considered a fair tradeoff in a memory-constrained environment.
When running Puma in single mode, some features are not supported:
- [Phased restart](https://gitlab.com/gitlab-org/gitlab/-/issues/300665) - [Phased restart](https://gitlab.com/gitlab-org/gitlab/-/issues/300665)
- [Puma Worker Killer](https://gitlab.com/gitlab-org/gitlab/-/issues/300664) - [Puma Worker Killer](https://gitlab.com/gitlab-org/gitlab/-/issues/300664)
...@@ -151,22 +140,23 @@ To learn more, visit [epic 5303](https://gitlab.com/groups/gitlab-org/-/epics/53 ...@@ -151,22 +140,23 @@ To learn more, visit [epic 5303](https://gitlab.com/groups/gitlab-org/-/epics/53
## Performance caveat when using Puma with Rugged ## Performance caveat when using Puma with Rugged
For deployments where NFS is used to store Git repository, we allow GitLab to use For deployments where NFS is used to store Git repositories, GitLab uses
[direct Git access](../gitaly/index.md#direct-access-to-git-in-gitlab) to improve performance using [direct Git access](../gitaly/index.md#direct-access-to-git-in-gitlab) to improve performance by using
[Rugged](https://github.com/libgit2/rugged). [Rugged](https://github.com/libgit2/rugged).
Rugged usage is automatically enabled if direct Git access Rugged usage is automatically enabled if direct Git access
[is available](../gitaly/index.md#how-it-works) [is available](../gitaly/index.md#how-it-works)
and Puma is running single threaded, unless it is disabled by and Puma is running single threaded, unless it is disabled by a
[feature flags](../../development/gitaly.md#legacy-rugged-code). [feature flag](../../development/gitaly.md#legacy-rugged-code).
MRI Ruby uses a GVL. This allows MRI Ruby to be multi-threaded, but running at MRI Ruby uses a Global VM Lock (GVL). GVL allows MRI Ruby to be multi-threaded, but running at
most on a single core. Since Rugged can use a thread for long periods of most on a single core.
time (due to intensive I/O operations of Git access), this can starve other threads
that might be processing requests. This is not a case for Unicorn or Puma running
in a single thread mode, as concurrently at most one request is being processed.
We are actively working on removing Rugged usage. Even though performance without Rugged Git includes intensive I/O operations. When Rugged uses a thread for a long period of time,
other threads that might be processing requests can starve. Puma running in single thread mode
does not have this issue, because concurrently at most one request is being processed.
GitLab is working to remove Rugged usage. Even though performance without Rugged
is acceptable today, in some cases it might still be beneficial to run with it. is acceptable today, in some cases it might still be beneficial to run with it.
Given the caveat of running Rugged with multi-threaded Puma, and acceptable Given the caveat of running Rugged with multi-threaded Puma, and acceptable
...@@ -177,55 +167,70 @@ This default behavior may not be the optimal configuration in some situations. I ...@@ -177,55 +167,70 @@ This default behavior may not be the optimal configuration in some situations. I
plays an important role in your deployment, we suggest you benchmark to find the plays an important role in your deployment, we suggest you benchmark to find the
optimal configuration: optimal configuration:
- The safest option is to start with single-threaded Puma. When working with - The safest option is to start with single-threaded Puma.
Rugged, single-threaded Puma works the same as Unicorn. - To force Rugged to be used with multi-threaded Puma, you can use a
- To force Rugged to be used with multi-threaded Puma, you can use [feature flag](../../development/gitaly.md#legacy-rugged-code).
[feature flags](../../development/gitaly.md#legacy-rugged-code).
## Convert Unicorn settings to Puma ## Switch from Unicorn to Puma
NOTE: NOTE:
Starting with GitLab 13.0, Puma is the default web server and Unicorn has been For Helm-based deployments, see the
disabled by default. In GitLab 14.0, Unicorn was removed from the Linux package [`webservice` chart documentation](https://docs.gitlab.com/charts/charts/gitlab/webservice/index.html).
and only Puma is available.
Starting with GitLab 13.0, Puma is the default web server and Unicorn has been disabled.
In GitLab 14.0, [Unicorn was removed](../../update/removals.md#unicorn-in-gitlab-self-managed)
from the Linux package and is no longer supported.
Puma has a multi-thread architecture which uses less memory than a multi-process Puma has a multi-thread architecture that uses less memory than a multi-process
application server like Unicorn. On GitLab.com, we saw a 40% reduction in memory application server like Unicorn. On GitLab.com, we saw a 40% reduction in memory
consumption. Most Rails applications requests normally include a proportion of I/O wait time. consumption. Most Rails application requests normally include a proportion of I/O wait time.
During I/O wait time MRI Ruby releases the GVL (Global VM Lock) to other threads. During I/O wait time, MRI Ruby releases the GVL to other threads.
Multi-threaded Puma can therefore still serve more requests than a single process. Multi-threaded Puma can therefore still serve more requests than a single process.
When switching to Puma, any Unicorn server configuration will _not_ carry over When switching to Puma, any Unicorn server configuration will _not_ carry over
automatically, due to differences between the two application servers. automatically, due to differences between the two application servers.
The table below summarizes which Unicorn configuration keys correspond to those To switch from Unicorn to Puma:
in Puma when using the Linux package, and which ones have no corresponding counterpart.
1. Determine suitable Puma [worker and thread settings](../../install/requirements.md#puma-settings).
| Unicorn | Puma | 1. Convert any custom Unicorn settings to Puma.
| ------------------------------------ | ---------------------------------- |
| `unicorn['enable']` | `puma['enable']` | The table below summarizes which Unicorn configuration keys correspond to those
| `unicorn['worker_timeout']` | `puma['worker_timeout']` | in Puma when using the Linux package, and which ones have no corresponding counterpart.
| `unicorn['worker_processes']` | `puma['worker_processes']` |
| n/a | `puma['ha']` | | Unicorn | Puma |
| n/a | `puma['min_threads']` | | ------------------------------------ | ---------------------------------- |
| n/a | `puma['max_threads']` | | `unicorn['enable']` | `puma['enable']` |
| `unicorn['listen']` | `puma['listen']` | | `unicorn['worker_timeout']` | `puma['worker_timeout']` |
| `unicorn['port']` | `puma['port']` | | `unicorn['worker_processes']` | `puma['worker_processes']` |
| `unicorn['socket']` | `puma['socket']` | | n/a | `puma['ha']` |
| `unicorn['pidfile']` | `puma['pidfile']` | | n/a | `puma['min_threads']` |
| `unicorn['tcp_nopush']` | n/a | | n/a | `puma['max_threads']` |
| `unicorn['backlog_socket']` | n/a | | `unicorn['listen']` | `puma['listen']` |
| `unicorn['somaxconn']` | `puma['somaxconn']` | | `unicorn['port']` | `puma['port']` |
| n/a | `puma['state_path']` | | `unicorn['socket']` | `puma['socket']` |
| `unicorn['log_directory']` | `puma['log_directory']` | | `unicorn['pidfile']` | `puma['pidfile']` |
| `unicorn['worker_memory_limit_min']` | n/a | | `unicorn['tcp_nopush']` | n/a |
| `unicorn['worker_memory_limit_max']` | `puma['per_worker_max_memory_mb']` | | `unicorn['backlog_socket']` | n/a |
| `unicorn['exporter_enabled']` | `puma['exporter_enabled']` | | `unicorn['somaxconn']` | `puma['somaxconn']` |
| `unicorn['exporter_address']` | `puma['exporter_address']` | | n/a | `puma['state_path']` |
| `unicorn['exporter_port']` | `puma['exporter_port']` | | `unicorn['log_directory']` | `puma['log_directory']` |
| `unicorn['worker_memory_limit_min']` | n/a |
## Puma exporter | `unicorn['worker_memory_limit_max']` | `puma['per_worker_max_memory_mb']` |
| `unicorn['exporter_enabled']` | `puma['exporter_enabled']` |
You can use the Puma exporter to measure various Puma metrics. For more information, see | `unicorn['exporter_address']` | `puma['exporter_address']` |
[Puma exporter](../monitoring/prometheus/puma_exporter.md). | `unicorn['exporter_port']` | `puma['exporter_port']` |
1. Reconfigure GitLab:
```shell
sudo gitlab-ctl reconfigure
```
1. Optional. For multi-node deployments, configure the load balancer to use the
[readiness check](../load_balancer.md#readiness-check).
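
The key mapping in the table above can be expressed as a lookup. The sketch below is illustrative only: `UNICORN_TO_PUMA` and `convert_unicorn_settings` are hypothetical names, and the code only renames keys. It does not check whether a value means the same thing in Puma (for example, units or timeout semantics may differ), so review each converted value before adding it to `/etc/gitlab/gitlab.rb`.

```ruby
# Unicorn key => Puma key, taken from the table above.
# nil marks Unicorn keys that have no Puma counterpart.
UNICORN_TO_PUMA = {
  'enable' => 'enable',
  'worker_timeout' => 'worker_timeout',
  'worker_processes' => 'worker_processes',
  'listen' => 'listen',
  'port' => 'port',
  'socket' => 'socket',
  'pidfile' => 'pidfile',
  'tcp_nopush' => nil,
  'backlog_socket' => nil,
  'somaxconn' => 'somaxconn',
  'log_directory' => 'log_directory',
  'worker_memory_limit_min' => nil,
  'worker_memory_limit_max' => 'per_worker_max_memory_mb',
  'exporter_enabled' => 'exporter_enabled',
  'exporter_address' => 'exporter_address',
  'exporter_port' => 'exporter_port'
}.freeze

# Hypothetical helper: returns [converted, dropped], where `converted` holds
# the settings under their Puma names and `dropped` lists Unicorn keys that
# have no counterpart or are not in the table.
def convert_unicorn_settings(unicorn_settings)
  converted = {}
  dropped = []
  unicorn_settings.each do |key, value|
    puma_key = UNICORN_TO_PUMA[key]
    if puma_key
      converted[puma_key] = value
    else
      dropped << key
    end
  end
  [converted, dropped]
end

converted, dropped = convert_unicorn_settings('worker_processes' => 4, 'tcp_nopush' => true)
p converted # {"worker_processes"=>4}
p dropped   # ["tcp_nopush"]
```

Keys reported in `dropped` need a manual decision, because the table lists no Puma equivalent for them.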
## Related topics
- [Use the Puma exporter to measure various Puma metrics](../monitoring/prometheus/puma_exporter.md)
...@@ -225,7 +225,7 @@ gitlab_rails['env'] = { ...@@ -225,7 +225,7 @@ gitlab_rails['env'] = {
``` ```
For source installations, set the environment variable. For source installations, set the environment variable.
Refer to [Puma Worker timeout](../operations/puma.md#worker-timeout). Refer to [Puma Worker timeout](../operations/puma.md#change-the-worker-timeout).
[Reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure) GitLab for the changes to take effect. [Reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure) GitLab for the changes to take effect.
......
...@@ -258,7 +258,7 @@ works. ...@@ -258,7 +258,7 @@ works.
### Puma per worker maximum memory ### Puma per worker maximum memory
By default, each Puma worker will be limited to 1024 MB of memory. By default, each Puma worker will be limited to 1024 MB of memory.
This setting [can be adjusted](../administration/operations/puma.md#puma-worker-killer) and should be considered This setting [can be adjusted](../administration/operations/puma.md#change-the-memory-limit-setting) and should be considered
if you need to increase the number of Puma workers. if you need to increase the number of Puma workers.
## Redis and Sidekiq ## Redis and Sidekiq
......
...@@ -22,6 +22,6 @@ The following timeouts are available. ...@@ -22,6 +22,6 @@ The following timeouts are available.
| Timeout | Default | Description | | Timeout | Default | Description |
|:--------|:-----------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| |:--------|:-----------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Default | 55 seconds | Timeout for most Gitaly calls (not enforced for `git` `fetch` and `push` operations, or Sidekiq jobs). For example, checking if a repository exists on disk. Makes sure that Gitaly calls made within a web request cannot exceed the entire request timeout. It should be shorter than the [worker timeout](../../../administration/operations/puma.md#worker-timeout) that can be configured for [Puma](../../../install/requirements.md#puma-settings). If a Gitaly call timeout exceeds the worker timeout, the remaining time from the worker timeout is used to avoid having to terminate the worker. | | Default | 55 seconds | Timeout for most Gitaly calls (not enforced for `git` `fetch` and `push` operations, or Sidekiq jobs). For example, checking if a repository exists on disk. Makes sure that Gitaly calls made within a web request cannot exceed the entire request timeout. It should be shorter than the [worker timeout](../../../administration/operations/puma.md#change-the-worker-timeout) that can be configured for [Puma](../../../install/requirements.md#puma-settings). If a Gitaly call timeout exceeds the worker timeout, the remaining time from the worker timeout is used to avoid having to terminate the worker. |
| Fast | 10 seconds | Timeout for fast Gitaly operations used within requests, sometimes multiple times. For example, checking if a repository exists on disk. If fast operations exceed this threshold, there may be a problem with a storage shard. Failing fast can help maintain the stability of the GitLab instance. | | Fast | 10 seconds | Timeout for fast Gitaly operations used within requests, sometimes multiple times. For example, checking if a repository exists on disk. If fast operations exceed this threshold, there may be a problem with a storage shard. Failing fast can help maintain the stability of the GitLab instance. |
| Medium | 30 seconds | Timeout for Gitaly operations that should be fast (possibly within requests) but preferably not used multiple times within a request. For example, loading blobs. Timeout that should be set between Default and Fast. | | Medium | 30 seconds | Timeout for Gitaly operations that should be fast (possibly within requests) but preferably not used multiple times within a request. For example, loading blobs. Timeout that should be set between Default and Fast. |
...@@ -299,7 +299,7 @@ for `shared_buffers` is quite high, and we are ...@@ -299,7 +299,7 @@ for `shared_buffers` is quite high, and we are
## Puma ## Puma
GitLab.com uses the default of 60 seconds for [Puma request timeouts](../../administration/operations/puma.md#worker-timeout). GitLab.com uses the default of 60 seconds for [Puma request timeouts](../../administration/operations/puma.md#change-the-worker-timeout).
## GitLab.com-specific rate limits ## GitLab.com-specific rate limits
......