Merge branch 'bvl-client-side-connectionpool-docs' into 'master'

Add client side connection-pool documentation

See merge request gitlab-org/gitlab!42861

Commit 1372ea0a authored Oct 07, 2020 by Marcia Ramos
3 changed files
--- a/config/initializers/database_config.rb
+++ b/config/initializers/database_config.rb
@@ -20,21 +20,12 @@ Gitlab.ee do
  end
 end

-# Because of the way Ruby on Rails manages database connections, it is
-# important that we have at least as many connections as we have
-# threads. While there is a 'pool' setting in database.yml, it is not
-# very practical because you need to maintain it in tandem with the
-# number of application threads. Because of this we override the number
-# of allowed connections in the database connection pool based on the
-# configured number of application threads.
+# We configure the database connection pool size automatically based on the
+# configured concurrency. We also add some headroom, to make sure we don't run
+# out of connections when more threads besides the 'user-facing' ones are
+# running.
 #
-# Gitlab::Runtime.max_threads is the number of "user facing" application
-# threads the process has been configured with. We also have auxiliary
-# threads that use database connections. Because it is not practical to
-# keep an accurate count of the number auxiliary threads as the
-# application evolves over time, we just add a fixed headroom to the
-# number of user-facing threads. It is OK if this number is too large
-# because connections are instantiated lazily.
+# Read more about this in doc/development/database/client_side_connection_pool.md

 headroom = (ENV["DB_POOL_HEADROOM"].presence || 10).to_i
 calculated_pool_size = Gitlab::Runtime.max_threads + headroom
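The calculation in the context lines above can be tried outside the application with a small sketch. It assumes plain Ruby without ActiveSupport, so `.presence` is approximated with `strip`/`empty?`. `pool_size_for` and `DEFAULT_HEADROOM` are hypothetical names used only for illustration, and `max_threads` stands in for `Gitlab::Runtime.max_threads`.

```ruby
# Hypothetical constant mirroring the fallback of 10 used in the initializer.
DEFAULT_HEADROOM = 10

# Returns the pool size for a process with `max_threads` user-facing threads.
# `env` defaults to ENV but accepts any Hash-like object, which makes the
# sketch easy to exercise without touching the real environment.
def pool_size_for(max_threads, env = ENV)
  raw = env["DB_POOL_HEADROOM"].to_s.strip

  # A missing or blank variable falls back to the default headroom.
  headroom = raw.empty? ? DEFAULT_HEADROOM : raw.to_i

  max_threads + headroom
end

puts pool_size_for(4, {})                             # default headroom
puts pool_size_for(4, { "DB_POOL_HEADROOM" => "20" }) # overridden headroom
```

As in the original expression, a non-numeric value is converted with `to_i` and therefore yields a headroom of 0.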

--- a/doc/development/database/client_side_connection_pool.md
+++ b/doc/development/database/client_side_connection_pool.md
+---
+type: dev, reference
+stage: none
+group: Development
+info: "See the Technical Writers assigned to Development Guidelines: https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments-to-development-guidelines"
+---
+
+# Client-side connection-pool
+
+Ruby processes that access the database through ActiveRecord
+automatically calculate the connection-pool size for the
+process based on its concurrency.
+
+Because of the way [Ruby on Rails manages database
+connections](#connection-lifecycle), it is important that we have at
+least as many connections as we have threads. While there is a 'pool'
+setting in [`database.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/database.yml.postgresql), it is not very practical because you need to
+maintain it in tandem with the number of application threads. For this
+reason, we override the number of allowed connections in the database
+connection-pool based on the configured number of application threads.
+
+`Gitlab::Runtime.max_threads` is the number of user-facing
+application threads the process has been configured with. We also have
+auxiliary threads that use database connections. As it isn't
+straightforward to keep an accurate count of the number of auxiliary threads as
+the application evolves over time, we just add a fixed headroom to the
+number of user-facing threads. It is OK if this number is too large
+because connections are instantiated lazily.
+
+## Troubleshooting connection-pool issues
+
+The connection-pool usage can be seen per environment in the [connection-pool
+saturation
+dashboard](https://dashboards.gitlab.net/d/alerts-sat_rails_db_connection_pool/alerts-rails_db_connection_pool-saturation-detail?orgId=1).
+
+If the connection-pool is too small, this manifests as
+`ActiveRecord::ConnectionTimeoutError`s from the application. Because we alert
+when almost all connections are used, we should know this before
+timeouts occur. If this happens, we can remediate by setting the
+`DB_POOL_HEADROOM` environment variable to something bigger than the
+default value (10).
+
+At this point, we need to investigate what is using more connections
+than we anticipated. To do that, we can use the
+`gitlab_ruby_threads_running_threads` metric. For example, [this
+graph](https://thanos-query.ops.gitlab.net/graph?g0.range_input=1h&g0.max_source_resolution=0s&g0.expr=sum%20by%20(thread_name)%20(%20gitlab_ruby_threads_running_threads%7Buses_db_connection%3D%22yes%22%7D%20)&g0.tab=0)
+shows all running threads that connect to the database by their
+name. Threads labeled `puma worker` or `sidekiq_worker_thread` are
+the threads that define `Gitlab::Runtime.max_threads`, so those are
+accounted for. If there are more than 10 other threads running, we could
+consider raising the default headroom.
+
+## Connection lifecycle
+
+For web requests, a connection is obtained from the pool the first
+time a database query is made. The connection is returned to the pool
+after the request completes.
+
+For background jobs, the behavior is very similar. The thread obtains
+a connection for the first query, and returns it after the job is
+finished.
+
+This is managed by Rails internally.
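The lifecycle described in the new document (lazy instantiation, checkout on first use, return when the work finishes, and a timeout when the pool is exhausted) can be illustrated with a toy model. This is a sketch only and is not how ActiveRecord implements its pool: `TinyPool`, `TinyPool::TimeoutError`, and `with_connection` here are hypothetical names, and a plain `Object` stands in for a database connection.

```ruby
# Toy connection pool, for illustration only. Not ActiveRecord's implementation.
class TinyPool
  class TimeoutError < StandardError; end

  # Number of connections actually instantiated so far.
  attr_reader :created

  def initialize(size:, checkout_timeout: 0.1)
    @size = size
    @checkout_timeout = checkout_timeout
    @idle = []
    @created = 0
    @lock = Mutex.new
    @returned = ConditionVariable.new
  end

  # Checks a connection out for the duration of the block, then returns it,
  # similar in spirit to a request or job holding one connection.
  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  def checkout
    deadline = now + @checkout_timeout
    @lock.synchronize do
      loop do
        # Reuse an idle connection when one is available.
        return @idle.pop unless @idle.empty?

        # Otherwise create one lazily, up to the configured pool size.
        if @created < @size
          @created += 1
          return Object.new # stand-in for a real connection
        end

        # Pool exhausted: wait for a checkin or give up at the deadline.
        remaining = deadline - now
        raise TimeoutError, "no connection available" if remaining <= 0

        @returned.wait(@lock, remaining)
      end
    end
  end

  def checkin(conn)
    @lock.synchronize do
      @idle.push(conn)
      @returned.signal
    end
  end
end

pool = TinyPool.new(size: 2)
pool.with_connection { |conn| conn } # first use creates a connection
pool.with_connection { |conn| conn } # second use reuses the idle one
puts pool.created
```

In this model, a pool sized larger than needed costs nothing until the extra connections are requested, while a pool smaller than the number of concurrent threads makes the surplus threads wait and eventually time out.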
--- a/doc/development/database/index.md
+++ b/doc/development/database/index.md
@@ -56,6 +56,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
 - [Database Debugging and Troubleshooting](../database_debugging.md)
 - [Query Count Limits](../query_count_limits.md)
 - [Creating enums](../creating_enums.md)
+- [Client-side connection-pool](client_side_connection_pool.md)

 ## Case studies