Commit 6e03208e authored by Steve Abrams's avatar Steve Abrams

Add query performance page to the docs

Add a page for query performance guidelines with
timing suggestions and a section on cold and warm
cache.

Link to the new doc from other database doc locations
mentioning query times.
parent 1ece0011
...@@ -59,6 +59,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w ...@@ -59,6 +59,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
- [Client-side connection-pool](client_side_connection_pool.md) - [Client-side connection-pool](client_side_connection_pool.md)
- [Updating multiple values](setting_multiple_values.md) - [Updating multiple values](setting_multiple_values.md)
- [Constraints naming conventions](constraint_naming_convention.md) - [Constraints naming conventions](constraint_naming_convention.md)
- [Query performance guidelines](../query_performance.md)
## Case studies ## Case studies
......
...@@ -158,8 +158,8 @@ test its execution using `CREATE INDEX CONCURRENTLY` in the `#database-lab` Slac ...@@ -158,8 +158,8 @@ test its execution using `CREATE INDEX CONCURRENTLY` in the `#database-lab` Slac
- Maintainer: After the merge request is merged, notify Release Managers about it on `#f_upcoming_release` Slack channel. - Maintainer: After the merge request is merged, notify Release Managers about it on `#f_upcoming_release` Slack channel.
- Check consistency with `db/structure.sql` and that migrations are [reversible](migration_style_guide.md#reversibility) - Check consistency with `db/structure.sql` and that migrations are [reversible](migration_style_guide.md#reversibility)
- Check that the relevant version files under `db/schema_migrations` were added or removed. - Check that the relevant version files under `db/schema_migrations` were added or removed.
- Check queries timing (If any): Queries executed in a migration - Check queries timing (If any): In a single transaction, cumulative query time executed in a migration
need to fit comfortably within `15s` - preferably much less than that - on GitLab.com. needs to fit comfortably within `15s` - preferably much less than that - on GitLab.com.
- For column removals, make sure the column has been [ignored in a previous release](what_requires_downtime.md#dropping-columns) - For column removals, make sure the column has been [ignored in a previous release](what_requires_downtime.md#dropping-columns)
- Check [background migrations](background_migrations.md): - Check [background migrations](background_migrations.md):
- Establish a time estimate for execution on GitLab.com. For historical purposes, - Establish a time estimate for execution on GitLab.com. For historical purposes,
...@@ -190,7 +190,7 @@ test its execution using `CREATE INDEX CONCURRENTLY` in the `#database-lab` Slac ...@@ -190,7 +190,7 @@ test its execution using `CREATE INDEX CONCURRENTLY` in the `#database-lab` Slac
- For given queries, review parameters regarding data distribution - For given queries, review parameters regarding data distribution
- [Check query plans](understanding_explain_plans.md) and suggest improvements - [Check query plans](understanding_explain_plans.md) and suggest improvements
to queries (changing the query, schema or adding indexes and similar) to queries (changing the query, schema or adding indexes and similar)
- General guideline is for queries to come in below 100ms execution time - General guideline is for queries to come in below [100ms execution time](query_performance.md#timing-guidelines-for-queries)
- Avoid N+1 problems and minimalize the [query count](merge_request_performance_guidelines.md#query-counts). - Avoid N+1 problems and minimalize the [query count](merge_request_performance_guidelines.md#query-counts).
### Timing guidelines for migrations ### Timing guidelines for migrations
...@@ -206,4 +206,4 @@ Keep in mind that all runtimes should be measured against GitLab.com. ...@@ -206,4 +206,4 @@ Keep in mind that all runtimes should be measured against GitLab.com.
|----|----|---| |----|----|---|
| Regular migrations on `db/migrate` | `3 minutes` | A valid exception are index creation as this can take a long time. | | Regular migrations on `db/migrate` | `3 minutes` | A valid exception are index creation as this can take a long time. |
| Post migrations on `db/post_migrate` | `10 minutes` | | | Post migrations on `db/post_migrate` | `10 minutes` | |
| Background migrations | --- | Since these are suitable for larger tables, it's not possible to set a precise timing guideline, however, any single query must stay below `1 second` execution time with cold caches. | | Background migrations | --- | Since these are suitable for larger tables, it's not possible to set a precise timing guideline, however, any single query must stay below [`1 second` execution time](query_performance.md#timing-guidelines-for-queries) with cold caches. |
...@@ -153,8 +153,9 @@ and therefore it does not have any records yet. ...@@ -153,8 +153,9 @@ and therefore it does not have any records yet.
When using a single-transaction migration, a transaction will hold on a database connection When using a single-transaction migration, a transaction will hold on a database connection
for the duration of the migration, so you must make sure the actions in the migration for the duration of the migration, so you must make sure the actions in the migration
do not take too much time: In general, queries executed in a migration need to fit comfortably do not take too much time: GitLab.com’s production database has a `15s` timeout, so
within `15s` on GitLab.com. in general, the cumulative execution time in a migration should aim to fit comfortably
in that limit. Singular query timings should fit within the [standard limit](query_performance.md#timing-guidelines-for-queries)
In case you need to insert, update, or delete a significant amount of data, you: In case you need to insert, update, or delete a significant amount of data, you:
......
...@@ -631,7 +631,7 @@ Paste the SQL query into `#database-lab` to see how the query performs at scale. ...@@ -631,7 +631,7 @@ Paste the SQL query into `#database-lab` to see how the query performs at scale.
- `#database-lab` is a Slack channel which uses a production-sized environment to test your queries. - `#database-lab` is a Slack channel which uses a production-sized environment to test your queries.
- GitLab.com’s production database has a 15 second timeout. - GitLab.com’s production database has a 15 second timeout.
- Any single query must stay below 1 second execution time with cold caches. - Any single query must stay below [1 second execution time](../query_performance.md#timing-guidelines-for-queries) with cold caches.
- Add a specialized index on columns involved to reduce the execution time. - Add a specialized index on columns involved to reduce the execution time.
In order to have an understanding of the query's execution we add in the MR description the following information: In order to have an understanding of the query's execution we add in the MR description the following information:
......
---
stage: Enablement
group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
---
# Query performance guidelines
This document describes various guidelines to follow when optimizing SQL queries.
When you are optimizing your SQL queries, there are two dimensions to pay attention to:
1. The query execution time. This is paramount as it reflects how the user experiences GitLab.
1. The query plan. Optimizing the query plan is important in allowing queries to independently scale over time. Realizing that an index will keep a query performing well as the table grows before the query degrades is an example of why we analyze these plans.
## Timing guidelines for queries
| Query Type | Maximum Query Time | Notes |
|----|----|---|
| General queries | `100ms` | This is not a hard limit, but if a query is getting above it, it is important to spend time understanding why it can or cannot be optimized. |
| Queries in a migration | `100ms` | This is different than the total [migration time](database_review.md#timing-guidelines-for-migrations). |
| Concurrent operations in a migration | `5min` | Concurrent operations do not block the database, but they block the GitLab update. This includes operations such as `add_concurrent_index` and `add_concurrent_foreign_key`. |
| Background migrations | `1s` | |
| Usage Ping | `1s` | See the [usage ping docs](product_analytics/usage_ping.md#developing-and-testing-usage-ping) for more details. |
- When analyzing your query's performance, pay attention to if the time you are seeing is on a [cold or warm cache](#cold-and-warm-cache). These guidelines apply for both cache types.
- When working with batched queries, change the range and batch size to see how it effects the query timing and caching.
- If an existing query is already underperforming, make an effort to improve it. If it is too complex or would stall development, create a follow-up so it can be addressed in a timely manner. You can always ask the database reviewer or maintainer for help and guidance.
## Cold and warm cache
When evaluating query performance it is important to understand the difference between
cold and warm cached queries.
The first time a query is made, it is made on a "cold cache". Meaning it needs
to read from disk. If you run the query again, the data can be read from the
cache, or what PostgreSQL calls shared buffers. This is the "warm cache" query.
When analyzing an [`EXPLAIN` plan](understanding_explain_plans.md), you can see
the difference not only in the timing, but by looking at the output for `Buffers`
by running your explain with `EXPLAIN(analyze, buffers)`. The [#database-lab](understanding_explain_plans.md#database-lab)
tool will automatically include these options.
If you are making a warm cache query, you will only see the `shared hits`.
For example in #database-lab:
```plaintext
Shared buffers:
- hits: 36467 (~284.90 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
```
Or in the explain plan from `psql`:
```sql
Buffers: shared hit=7323
```
If the cache is cold, you will also see `reads`.
In #database-lab:
```plaintext
Shared buffers:
- hits: 17204 (~134.40 MiB) from the buffer pool
- reads: 15229 (~119.00 MiB) from the OS file cache, including disk I/O
```
In `psql`:
```sql
Buffers: shared hit=7202 read=121
```
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment