Commit 21d655ee authored by Craig Norris

Merge branch '284446-unassigned-future-tense-2' into 'master'

Convert future tense to present tense

See merge request gitlab-org/gitlab!48525
parents 0ba15986 d5d153d9
...@@ -268,8 +268,8 @@ query($project_path: ID!) { ...@@ -268,8 +268,8 @@ query($project_path: ID!) {
} }
``` ```
To ensure that we get consistent ordering, we will append an ordering on the primary To ensure that we get consistent ordering, we append an ordering on the primary
key, in descending order. This is usually `id`, so basically we will add `order(id: :desc)` key, in descending order. This is usually `id`, so we add `order(id: :desc)`
to the end of the relation. A primary key _must_ be available on the underlying table. to the end of the relation. A primary key _must_ be available on the underlying table.
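
A minimal plain-Ruby sketch of why the primary-key tiebreaker matters. It assumes hypothetical in-memory rows (`Row`, `consistently_ordered`) rather than a real relation: when two rows share the same sort value, sorting on the pair of sort value and descending `id` makes the result deterministic.

```ruby
# Hypothetical rows; in the application these would come from a relation
# with `order(id: :desc)` appended, not from an in-memory array.
Row = Struct.new(:id, :created_at)

# Sort by the requested column first, then by primary key descending, so
# rows with equal `created_at` values always come back in the same order.
def consistently_ordered(rows)
  rows.sort_by { |row| [row.created_at, -row.id] }
end

rows = [Row.new(1, 10), Row.new(3, 10), Row.new(2, 5)]
ordered_ids = consistently_ordered(rows).map(&:id)
puts ordered_ids.inspect
```

Without the second sort key, the relative order of rows 1 and 3 would depend on the input order, which is what breaks pagination across requests.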
#### Shortcut fields #### Shortcut fields
...@@ -315,7 +315,7 @@ class MergeRequestPermissionsType < BasePermissionType ...@@ -315,7 +315,7 @@ class MergeRequestPermissionsType < BasePermissionType
end end
``` ```
- **`permission_field`**: Will act the same as `graphql-ruby`'s - **`permission_field`**: Acts the same as `graphql-ruby`'s
`field` method but setting a default description and type and making `field` method but setting a default description and type and making
them non-nullable. These options can still be overridden by adding them non-nullable. These options can still be overridden by adding
them as arguments. them as arguments.
...@@ -323,7 +323,7 @@ end ...@@ -323,7 +323,7 @@ end
behaves the same way as `permission_field` and the same behaves the same way as `permission_field` and the same
arguments can be overridden. arguments can be overridden.
- **`abilities`**: Allows exposing several abilities defined in our - **`abilities`**: Allows exposing several abilities defined in our
policies at once. The fields for these will all have be non-nullable policies at once. The fields for these must all be non-nullable
booleans with a default description. booleans with a default description.
## Feature flags ## Feature flags
...@@ -331,7 +331,7 @@ end ...@@ -331,7 +331,7 @@ end
Developers can add [feature flags](../development/feature_flags/index.md) to GraphQL Developers can add [feature flags](../development/feature_flags/index.md) to GraphQL
fields in the following ways: fields in the following ways:
- Add the `feature_flag` property to a field. This will allow the field to be _hidden_ - Add the `feature_flag` property to a field. This allows the field to be _hidden_
from the GraphQL schema when the flag is disabled. from the GraphQL schema when the flag is disabled.
- Toggle the return value when resolving the field. - Toggle the return value when resolving the field.
...@@ -339,7 +339,7 @@ You can refer to these guidelines to decide which approach to use: ...@@ -339,7 +339,7 @@ You can refer to these guidelines to decide which approach to use:
- If your field is experimental, and its name or type is subject to - If your field is experimental, and its name or type is subject to
change, use the `feature_flag` property. change, use the `feature_flag` property.
- If your field is stable and its definition will not change, even after the flag is - If your field is stable and its definition doesn't change, even after the flag is
removed, toggle the return value of the field instead. Note that removed, toggle the return value of the field instead. Note that
[all fields should be nullable](#nullable-fields) anyway. [all fields should be nullable](#nullable-fields) anyway.
...@@ -347,15 +347,15 @@ You can refer to these guidelines to decide which approach to use: ...@@ -347,15 +347,15 @@ You can refer to these guidelines to decide which approach to use:
The `feature_flag` property allows you to toggle the field's The `feature_flag` property allows you to toggle the field's
[visibility](https://graphql-ruby.org/authorization/visibility.html) [visibility](https://graphql-ruby.org/authorization/visibility.html)
within the GraphQL schema. This will remove the field from the schema within the GraphQL schema. This removes the field from the schema
when the flag is disabled. when the flag is disabled.
A description is [appended](https://gitlab.com/gitlab-org/gitlab/-/blob/497b556/app/graphql/types/base_field.rb#L44-53) A description is [appended](https://gitlab.com/gitlab-org/gitlab/-/blob/497b556/app/graphql/types/base_field.rb#L44-53)
to the field indicating that it is behind a feature flag. to the field indicating that it is behind a feature flag.
CAUTION: **Caution:** CAUTION: **Caution:**
If a client queries for the field when the feature flag is disabled, the query will If a client queries for the field when the feature flag is disabled, the query
fail. Consider this when toggling the visibility of the feature on or off on fails. Consider this when toggling the visibility of the feature on or off on
production. production.
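
The caution above can be illustrated with a small simulation. This is a sketch only: `SketchSchema` and its methods are hypothetical stand-ins and are not `graphql-ruby` or GitLab APIs. It shows that a field hidden by a disabled flag is unknown to the schema, so a query that names it fails rather than returning `null`.

```ruby
# Hypothetical stand-in for a schema; none of these names are real APIs.
class SketchSchema
  def initialize(enabled_flags)
    @enabled_flags = enabled_flags
    @fields = {} # field name => feature flag name (nil when unflagged)
  end

  def field(name, feature_flag: nil)
    @fields[name] = feature_flag
  end

  # A field is visible when it has no flag or its flag is enabled.
  def visible_fields
    @fields.select { |_name, flag| flag.nil? || @enabled_flags.include?(flag) }.keys
  end

  # Querying a hidden field fails the whole query.
  def query(*names)
    hidden = names - visible_fields
    raise ArgumentError, "Unknown fields: #{hidden.join(', ')}" unless hidden.empty?

    names
  end
end

schema = SketchSchema.new([]) # no feature flags enabled
schema.field(:title)
schema.field(:token, feature_flag: :expose_token)
puts schema.visible_fields.inspect
```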
The `feature_flag` property does not allow the use of The `feature_flag` property does not allow the use of
...@@ -385,7 +385,7 @@ When applying a feature flag to toggle the value of a field, the ...@@ -385,7 +385,7 @@ When applying a feature flag to toggle the value of a field, the
- State that the value of the field can be toggled by a feature flag. - State that the value of the field can be toggled by a feature flag.
- Name the feature flag. - Name the feature flag.
- State what the field will return when the feature flag is disabled (or - State what the field returns when the feature flag is disabled (or
enabled, if more appropriate). enabled, if more appropriate).
Example: Example:
...@@ -424,8 +424,8 @@ field :token, GraphQL::STRING_TYPE, null: true, ...@@ -424,8 +424,8 @@ field :token, GraphQL::STRING_TYPE, null: true,
``` ```
The original `description` of the things being deprecated should be maintained, The original `description` of the things being deprecated should be maintained,
and should _not_ be updated to mention the deprecation. Instead, the `reason` will and should _not_ be updated to mention the deprecation. Instead, the `reason`
be appended to the `description`. is appended to the `description`.
### Deprecation reason style guide ### Deprecation reason style guide
...@@ -484,13 +484,13 @@ module Types ...@@ -484,13 +484,13 @@ module Types
end end
``` ```
If the enum will be used for a class property in Ruby that is not an uppercase string, If the enum is used for a class property in Ruby that is not an uppercase string,
you can provide a `value:` option that will adapt the uppercase value. you can provide a `value:` option that adapts the uppercase value.
In the following example: In the following example:
- GraphQL inputs of `OPENED` will be converted to `'opened'`. - GraphQL inputs of `OPENED` are converted to `'opened'`.
- Ruby values of `'opened'` will be converted to `"OPENED"` in GraphQL responses. - Ruby values of `'opened'` are converted to `"OPENED"` in GraphQL responses.
```ruby ```ruby
module Types module Types
...@@ -536,7 +536,7 @@ When data to be returned by GraphQL is stored as ...@@ -536,7 +536,7 @@ When data to be returned by GraphQL is stored as
GraphQL types whenever possible. Avoid using the `GraphQL::Types::JSON` type unless GraphQL types whenever possible. Avoid using the `GraphQL::Types::JSON` type unless
the JSON data returned is _truly_ unstructured. the JSON data returned is _truly_ unstructured.
If the structure of the JSON data varies, but will be one of a set of known possible If the structure of the JSON data varies, but is one of a set of known possible
structures, use a structures, use a
[union](https://graphql-ruby.org/type_definitions/unions.html). [union](https://graphql-ruby.org/type_definitions/unions.html).
An example of the use of a union for this purpose is An example of the use of a union for this purpose is
...@@ -605,7 +605,7 @@ descriptions: ...@@ -605,7 +605,7 @@ descriptions:
this field do?". Example: `'Indicates project has a Git repository'`. this field do?". Example: `'Indicates project has a Git repository'`.
- Always include the word `"timestamp"` when describing an argument or - Always include the word `"timestamp"` when describing an argument or
field of type `Types::TimeType`. This lets the reader know that the field of type `Types::TimeType`. This lets the reader know that the
format of the property will be `Time`, rather than just `Date`. format of the property is `Time`, rather than just `Date`.
- No `.` at end of strings. - No `.` at end of strings.
Example: Example:
...@@ -618,7 +618,7 @@ field :closed_at, Types::TimeType, description: 'Timestamp of when the issue was ...@@ -618,7 +618,7 @@ field :closed_at, Types::TimeType, description: 'Timestamp of when the issue was
### `copy_field_description` helper ### `copy_field_description` helper
Sometimes we want to ensure that two descriptions will always be identical. Sometimes we want to ensure that two descriptions are always identical.
For example, to keep a type field description the same as a mutation argument For example, to keep a type field description the same as a mutation argument
when they both represent the same property. when they both represent the same property.
...@@ -641,8 +641,8 @@ abilities as in the Rails app. ...@@ -641,8 +641,8 @@ abilities as in the Rails app.
If the: If the:
- Currently authenticated user fails the authorization, the authorized - Currently authenticated user fails the authorization, the authorized
resource will be returned as `null`. resource is returned as `null`.
- Resource is part of a collection, the collection will be filtered to - Resource is part of a collection, the collection is filtered to
exclude the objects that the user's authorization checks failed against. exclude the objects that the user's authorization checks failed against.
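
The two outcomes above can be sketched in plain Ruby. The helper names and the `can_read` lambda are hypothetical; in the application the check is an ability defined in a policy, not a lambda.

```ruby
# Hypothetical helpers; `allowed` stands in for a policy check such as
# "can the current user read this object?".
def authorized_resource(resource, allowed)
  allowed.call(resource) ? resource : nil
end

def authorized_collection(collection, allowed)
  collection.select { |resource| allowed.call(resource) }
end

can_read = ->(issue) { !issue[:confidential] }
issues = [{ id: 1, confidential: false }, { id: 2, confidential: true }]

single = authorized_resource(issues[1], can_read)     # failed check => nil
visible = authorized_collection(issues, can_read)     # failed items dropped
puts single.inspect
puts visible.map { |issue| issue[:id] }.inspect
```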
Also see [authorizing resources in a mutation](#authorizing-resources). Also see [authorizing resources in a mutation](#authorizing-resources).
...@@ -656,7 +656,7 @@ authorization checks of the loaded records. ...@@ -656,7 +656,7 @@ authorization checks of the loaded records.
### Type authorization ### Type authorization
Authorize a type by passing an ability to the `authorize` method. All Authorize a type by passing an ability to the `authorize` method. All
fields with the same type will be authorized by checking that the fields with the same type are authorized by checking that the
currently authenticated user has the required ability. currently authenticated user has the required ability.
For example, the following authorization ensures that the currently For example, the following authorization ensures that the currently
...@@ -922,7 +922,7 @@ before calling `resolve`! An example can be seen in our [`GraphQLHelpers`](https ...@@ -922,7 +922,7 @@ before calling `resolve`! An example can be seen in our [`GraphQLHelpers`](https
The full query is known in advance during execution, which means we can make use The full query is known in advance during execution, which means we can make use
of [lookahead](https://graphql-ruby.org/queries/lookahead.html) to optimize our of [lookahead](https://graphql-ruby.org/queries/lookahead.html) to optimize our
queries, and batch load associations we know we will need. Consider adding queries, and batch load associations we know we need. Consider adding
lookahead support in your resolvers to avoid `N+1` performance issues. lookahead support in your resolvers to avoid `N+1` performance issues.
To enable support for common lookahead use-cases (pre-loading associations when To enable support for common lookahead use-cases (pre-loading associations when
...@@ -996,7 +996,7 @@ When using resolvers, they can and should serve as the SSoT for field metadata. ...@@ -996,7 +996,7 @@ When using resolvers, they can and should serve as the SSoT for field metadata.
All field options (apart from the field name) can be declared on the resolver. All field options (apart from the field name) can be declared on the resolver.
These include: These include:
- `type` (this is particularly important, and will soon be mandatory) - `type` (this is particularly important, and is planned to be mandatory)
- `extras` - `extras`
- `description` - `description`
...@@ -1164,7 +1164,7 @@ argument :my_arg, GraphQL::STRING_TYPE, ...@@ -1164,7 +1164,7 @@ argument :my_arg, GraphQL::STRING_TYPE,
description: "A description of the argument" description: "A description of the argument"
``` ```
Each GraphQL `argument` defined will be passed to the `#resolve` method Each GraphQL `argument` defined is passed to the `#resolve` method
of a mutation as keyword arguments. of a mutation as keyword arguments.
Example: Example:
...@@ -1175,7 +1175,7 @@ def resolve(my_arg:) ...@@ -1175,7 +1175,7 @@ def resolve(my_arg:)
end end
``` ```
`graphql-ruby` will automatically wrap up arguments into an `graphql-ruby` wraps up arguments into an
[input type](https://graphql.org/learn/schema/#input-types). [input type](https://graphql.org/learn/schema/#input-types).
For example, the For example, the
...@@ -1231,7 +1231,7 @@ single mutation when multiple are performed within a single request. ...@@ -1231,7 +1231,7 @@ single mutation when multiple are performed within a single request.
### The `resolve` method ### The `resolve` method
The `resolve` method receives the mutation's arguments as keyword arguments. The `resolve` method receives the mutation's arguments as keyword arguments.
From here, we can call the service that will modify the resource. From here, we can call the service that modifies the resource.
The `resolve` method should then return a hash with the same field The `resolve` method should then return a hash with the same field
names as defined on the mutation including an `errors` array. For example, names as defined on the mutation including an `errors` array. For example,
...@@ -1263,7 +1263,7 @@ should look like this: ...@@ -1263,7 +1263,7 @@ should look like this:
To make the mutation available it must be defined on the mutation To make the mutation available it must be defined on the mutation
type that lives in `graphql/types/mutation_types`. The type that lives in `graphql/types/mutation_types`. The
`mount_mutation` helper method will define a field based on the `mount_mutation` helper method defines a field based on the
GraphQL-name of the mutation: GraphQL-name of the mutation:
```ruby ```ruby
...@@ -1278,7 +1278,7 @@ module Types ...@@ -1278,7 +1278,7 @@ module Types
end end
``` ```
Will generate a field called `mergeRequestSetWip` that Generates a field called `mergeRequestSetWip` that
`Mutations::MergeRequests::SetWip` to be resolved. `Mutations::MergeRequests::SetWip` to be resolved.
### Authorizing resources ### Authorizing resources
...@@ -1301,13 +1301,13 @@ end ...@@ -1301,13 +1301,13 @@ end
We can then call `authorize!` in the `resolve` method, passing in the resource we We can then call `authorize!` in the `resolve` method, passing in the resource we
want to validate the abilities for. want to validate the abilities for.
Alternatively, we can add a `find_object` method that will load the Alternatively, we can add a `find_object` method that loads the
object on the mutation. This would allow you to use the object on the mutation. This would allow you to use the
`authorized_find!` helper method. `authorized_find!` helper method.
When a user is not allowed to perform the action, or an object is not When a user is not allowed to perform the action, or an object is not
found, we should raise a found, we should raise a
`Gitlab::Graphql::Errors::ResourceNotAvailable` error. Which will be `Gitlab::Graphql::Errors::ResourceNotAvailable` error which is
correctly rendered to the clients. correctly rendered to the clients.
### Errors in mutations ### Errors in mutations
...@@ -1418,8 +1418,8 @@ of errors should be treated as internal, and not shown to the user in specific ...@@ -1418,8 +1418,8 @@ of errors should be treated as internal, and not shown to the user in specific
detail. detail.
We need to inform the user when the mutation fails, but we do not need to We need to inform the user when the mutation fails, but we do not need to
tell them why, since they cannot have caused it, and nothing they can do will tell them why, since they cannot have caused it, and nothing they can do
fix it, although we may offer to retry the mutation. fixes it, although we may offer to retry the mutation.
#### Categorizing errors #### Categorizing errors
...@@ -1483,7 +1483,7 @@ Sometimes a mutation or resolver may accept a number of optional ...@@ -1483,7 +1483,7 @@ Sometimes a mutation or resolver may accept a number of optional
arguments, but we still want to validate that at least one of the optional arguments, but we still want to validate that at least one of the optional
arguments is provided. In this situation, consider using the `#ready?` arguments is provided. In this situation, consider using the `#ready?`
method within your mutation or resolver to provide the validation. The method within your mutation or resolver to provide the validation. The
`#ready?` method will be called before any work is done within the `#ready?` method is called before any work is done within the
`#resolve` method. `#resolve` method.
Example: Example:
...@@ -1547,11 +1547,11 @@ visit [Testing](graphql_guide/pagination.md#testing) for details. ...@@ -1547,11 +1547,11 @@ visit [Testing](graphql_guide/pagination.md#testing) for details.
To test GraphQL mutation requests, `GraphqlHelpers` provides 2 To test GraphQL mutation requests, `GraphqlHelpers` provides 2
helpers: `graphql_mutation` which takes the name of the mutation, and helpers: `graphql_mutation` which takes the name of the mutation, and
a hash with the input for the mutation. This will return a struct with a hash with the input for the mutation. This returns a struct with
a mutation query, and prepared variables. a mutation query, and prepared variables.
This struct can then be passed to the `post_graphql_mutation` helper, This struct can then be passed to the `post_graphql_mutation` helper,
that will post the request with the correct parameters, like a GraphQL that posts the request with the correct parameters, like a GraphQL
client would do. client would do.
To access the response of a mutation, the `graphql_mutation_response` To access the response of a mutation, the `graphql_mutation_response`
......
...@@ -11,7 +11,7 @@ GitLab creates a new Project with an associated Git repository that is a ...@@ -11,7 +11,7 @@ GitLab creates a new Project with an associated Git repository that is a
copy of the original project at the time of the fork. If a large project copy of the original project at the time of the fork. If a large project
gets forked often, this can lead to a quick increase in Git repository gets forked often, this can lead to a quick increase in Git repository
storage disk use. To counteract this problem, we are adding Git object storage disk use. To counteract this problem, we are adding Git object
deduplication for forks to GitLab. In this document, we will describe how deduplication for forks to GitLab. In this document, we describe how
GitLab implements Git object deduplication. GitLab implements Git object deduplication.
## Pool repositories ## Pool repositories
...@@ -27,9 +27,9 @@ If we want repository A to borrow from repository B, we first write a ...@@ -27,9 +27,9 @@ If we want repository A to borrow from repository B, we first write a
path that resolves to `B.git/objects` in the special file path that resolves to `B.git/objects` in the special file
`A.git/objects/info/alternates`. This establishes the alternates link. `A.git/objects/info/alternates`. This establishes the alternates link.
Next, we must perform a Git repack in A. After the repack, any objects Next, we must perform a Git repack in A. After the repack, any objects
that are duplicated between A and B will get deleted from A. Repository that are duplicated between A and B are deleted from A. Repository
A is now no longer self-contained, but it still has its own refs and A is now no longer self-contained, but it still has its own refs and
configuration. Objects in A that are not in B will remain in A. For this configuration. Objects in A that are not in B remain in A. For this
to work, it is of course critical that **no objects ever get deleted from to work, it is of course critical that **no objects ever get deleted from
B** because A might need them. B** because A might need them.
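
As a sketch of the first step only, the following Ruby writes the alternates file for two placeholder directories. The helper name `link_alternates` is hypothetical, the directories are not real Git repositories, and no Git command is run here.

```ruby
require 'tmpdir'
require 'fileutils'

# Writes the alternates file that makes repository `borrower` look up
# missing objects in `lender`. Both arguments are repository paths.
def link_alternates(borrower, lender)
  info_dir = File.join(borrower, 'objects', 'info')
  FileUtils.mkdir_p(info_dir)

  alternates_file = File.join(info_dir, 'alternates')
  File.write(alternates_file, File.join(lender, 'objects') + "\n")
  alternates_file
end

Dir.mktmpdir do |root|
  a = File.join(root, 'A.git')
  b = File.join(root, 'B.git')
  FileUtils.mkdir_p(File.join(b, 'objects'))

  path = link_alternates(a, b)
  puts File.read(path) # path that resolves to B.git/objects

  # A subsequent Git repack in A (not run here) is the step that removes
  # objects from A that are also available through B.
end
```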
...@@ -49,7 +49,7 @@ repositories** which are hidden from the user. We then use Git ...@@ -49,7 +49,7 @@ repositories** which are hidden from the user. We then use Git
alternates to let a collection of project repositories borrow from a alternates to let a collection of project repositories borrow from a
single pool repository. We call such a collection of project single pool repository. We call such a collection of project
repositories a pool. Pools form star-shaped networks of repositories repositories a pool. Pools form star-shaped networks of repositories
that borrow from a single pool, which will resemble (but not be that borrow from a single pool, which resemble (but are not
identical to) the fork networks that get formed when users fork identical to) the fork networks that get formed when users fork
projects. projects.
...@@ -72,9 +72,9 @@ across a collection of GitLab project repositories at the Git level: ...@@ -72,9 +72,9 @@ across a collection of GitLab project repositories at the Git level:
The effectiveness of Git object deduplication in GitLab depends on the The effectiveness of Git object deduplication in GitLab depends on the
amount of overlap between the pool repository and each of its amount of overlap between the pool repository and each of its
participants. Each time garbage collection runs on the source project, participants. Each time garbage collection runs on the source project,
Git objects from the source project will get migrated to the pool Git objects from the source project are migrated to the pool
repository. One by one, as garbage collection runs, other member repository. One by one, as garbage collection runs, other member
projects will benefit from the new objects that got added to the pool. projects benefit from the new objects that got added to the pool.
## SQL model ## SQL model
...@@ -123,19 +123,19 @@ are as follows: ...@@ -123,19 +123,19 @@ are as follows:
the fork parent and the fork child project become members of the new the fork parent and the fork child project become members of the new
pool. pool.
- Once project A has become the source project of a pool, all future - Once project A has become the source project of a pool, all future
eligible forks of A will become pool members. eligible forks of A become pool members.
- If the fork source is itself a fork, the resulting repository will - If the fork source is itself a fork, the resulting repository
neither join the repository nor will a new pool repository be neither joins the repository nor is a new pool repository
seeded. seeded.
eg: Such as:
Suppose fork A is part of a pool repository, any forks created off Suppose fork A is part of a pool repository, any forks created off
of fork A *will not* be a part of the pool repository that fork A is of fork A *are not* a part of the pool repository that fork A is
a part of. a part of.
Suppose B is a fork of A, and A does not belong to an object pool. Suppose B is a fork of A, and A does not belong to an object pool.
Now C gets created as a fork of B. C will not be part of a pool Now C gets created as a fork of B. C is not part of a pool
repository. repository.
> TODO should forks of forks be deduplicated? > TODO should forks of forks be deduplicated?
...@@ -146,11 +146,11 @@ are as follows: ...@@ -146,11 +146,11 @@ are as follows:
- If a normal Project participating in a pool gets moved to another - If a normal Project participating in a pool gets moved to another
Gitaly storage shard, its "belongs to PoolRepository" relation will Gitaly storage shard, its "belongs to PoolRepository" relation will
be broken. Because of the way moving repositories between shard is be broken. Because of the way moving repositories between shard is
implemented, we will automatically get a fresh self-contained copy implemented, we get a fresh self-contained copy
of the project's repository on the new storage shard. of the project's repository on the new storage shard.
- If the source project of a pool gets moved to another Gitaly storage - If the source project of a pool gets moved to another Gitaly storage
shard or is deleted the "source project" relation is not broken. shard or is deleted the "source project" relation is not broken.
However, as of GitLab 12.0 a pool will not fetch from a source However, as of GitLab 12.0 a pool does not fetch from a source
unless the source is on the same Gitaly shard. unless the source is on the same Gitaly shard.
## Consistency between the SQL pool relation and Gitaly ## Consistency between the SQL pool relation and Gitaly
...@@ -163,7 +163,7 @@ repository and a pool. ...@@ -163,7 +163,7 @@ repository and a pool.
### Pool existence ### Pool existence
If GitLab thinks a pool repository exists (i.e. it exists according to If GitLab thinks a pool repository exists (i.e. it exists according to
SQL), but it does not on the Gitaly server, then it will be created on SQL), but it does not on the Gitaly server, then it is created on
the fly by Gitaly. the fly by Gitaly.
### Pool relation existence ### Pool relation existence
...@@ -173,27 +173,27 @@ There are three different things that can go wrong here. ...@@ -173,27 +173,27 @@ There are three different things that can go wrong here.
#### 1. SQL says repo A belongs to pool P but Gitaly says A has no alternate objects #### 1. SQL says repo A belongs to pool P but Gitaly says A has no alternate objects
In this case, we miss out on disk space savings but all RPC's on A In this case, we miss out on disk space savings but all RPC's on A
itself will function fine. The next time garbage collection runs on A, itself function fine. The next time garbage collection runs on A,
the alternates connection gets established in Gitaly. This is done by the alternates connection gets established in Gitaly. This is done by
`Projects::GitDeduplicationService` in GitLab Rails. `Projects::GitDeduplicationService` in GitLab Rails.
#### 2. SQL says repo A belongs to pool P1 but Gitaly says A has alternate objects in pool P2 #### 2. SQL says repo A belongs to pool P1 but Gitaly says A has alternate objects in pool P2
In this case `Projects::GitDeduplicationService` will throw an exception. In this case `Projects::GitDeduplicationService` throws an exception.
#### 3. SQL says repo A does not belong to any pool but Gitaly says A belongs to P #### 3. SQL says repo A does not belong to any pool but Gitaly says A belongs to P
In this case `Projects::GitDeduplicationService` will try to In this case `Projects::GitDeduplicationService` tries to
"re-duplicate" the repository A using the DisconnectGitAlternates RPC. "re-duplicate" the repository A using the DisconnectGitAlternates RPC.
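
The three mismatch cases above can be summarized as a small classifier. This is an illustrative sketch: `pool_relation_state` and its return symbols are hypothetical and do not describe the API of `Projects::GitDeduplicationService`.

```ruby
# Hypothetical classifier. `sql_pool` is the pool the SQL relation names and
# `gitaly_pool` is the pool the alternates link points at (nil when absent).
def pool_relation_state(sql_pool:, gitaly_pool:)
  return :consistent if sql_pool == gitaly_pool

  if sql_pool && gitaly_pool.nil?
    :link_missing_in_gitaly # case 1: established on the next garbage collection
  elsif sql_pool && gitaly_pool
    :linked_to_wrong_pool   # case 2: an exception is thrown
  else
    :unexpected_link        # case 3: the repository is "re-duplicated"
  end
end

puts pool_relation_state(sql_pool: 'P', gitaly_pool: nil)
```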
## Git object deduplication and GitLab Geo ## Git object deduplication and GitLab Geo
When a pool repository record is created in SQL on a Geo primary, this When a pool repository record is created in SQL on a Geo primary, this
will eventually trigger an event on the Geo secondary. The Geo secondary eventually triggers an event on the Geo secondary. The Geo secondary
will then create the pool repository in Gitaly. This leads to an then creates the pool repository in Gitaly. This leads to an
"eventually consistent" situation because as each pool participant gets "eventually consistent" situation because as each pool participant gets
synchronized, Geo will eventually trigger garbage collection in Gitaly on synchronized, Geo eventually triggers garbage collection in Gitaly on
the secondary, at which stage Git objects will get deduplicated. the secondary, at which stage Git objects are deduplicated.
> TODO How do we handle the edge case where at the time the Geo > TODO How do we handle the edge case where at the time the Geo
> secondary tries to create the pool repository, the source project does > secondary tries to create the pool repository, the source project does
......
...@@ -46,8 +46,8 @@ and running. ...@@ -46,8 +46,8 @@ and running.
Can the queries used potentially take down any critical services and result in Can the queries used potentially take down any critical services and result in
engineers being woken up in the night? Can a malicious user abuse the code to engineers being woken up in the night? Can a malicious user abuse the code to
take down a GitLab instance? Will my changes simply make loading a certain page take down a GitLab instance? Do my changes simply make loading a certain page
slower? Will execution time grow exponentially given enough load or data in the slower? Does execution time grow exponentially given enough load or data in the
database? database?
These are all questions one should ask themselves before submitting a merge These are all questions one should ask themselves before submitting a merge
...@@ -67,14 +67,14 @@ in turn can request a performance specialist to review the changes. ...@@ -67,14 +67,14 @@ in turn can request a performance specialist to review the changes.
## Think outside of the box ## Think outside of the box
Everyone has their own perception how the new feature is going to be used. Everyone has their own perception of how to use the new feature.
Always consider how users might be using the feature instead. Usually, Always consider how users might be using the feature instead. Usually,
users test our features in a very unconventional way, users test our features in a very unconventional way,
like by brute forcing or abusing edge conditions that we have. like by brute forcing or abusing edge conditions that we have.
## Data set ## Data set
The data set that will be processed by the merge request should be known The data set the merge request processes should be known
and documented. The feature should clearly document what the expected and documented. The feature should clearly document what the expected
data set is for this feature to process, and what problems it might cause. data set is for this feature to process, and what problems it might cause.
...@@ -86,8 +86,8 @@ from the repository and perform search for the set of files. ...@@ -86,8 +86,8 @@ from the repository and perform search for the set of files.
As an author you should in context of that problem consider As an author you should in context of that problem consider
the following: the following:
1. What repositories are going to be supported? 1. What repositories are planned to be supported?
1. How long it will take for big repositories like Linux kernel? 1. How long do big repositories like Linux kernel take?
1. Is there something that we can do differently to not process such a 1. Is there something that we can do differently to not process such a
big data set? big data set?
1. Should we build some fail-safe mechanism to contain 1. Should we build some fail-safe mechanism to contain
...@@ -96,28 +96,28 @@ the following: ...@@ -96,28 +96,28 @@ the following:
## Query plans and database structure ## Query plans and database structure
The query plan can tell us if we will need additional The query plan can tell us if we need additional
indexes, or expensive filtering (such as using sequential scans). indexes, or expensive filtering (such as using sequential scans).
Each query plan should be run against substantial size of data set. Each query plan should be run against substantial size of data set.
For example, if you look for issues with specific conditions, For example, if you look for issues with specific conditions,
you should consider validating a query against you should consider validating a query against
a small number (a few hundred) and a big number (100_000) of issues. a small number (a few hundred) and a big number (100_000) of issues.
See how the query will behave if the result will be a few See how the query behaves if the result is a few
and a few thousand. and a few thousand.
This is needed as we have users using GitLab for very big projects and This is needed as we have users using GitLab for very big projects and
in a very unconventional way. Even if it seems that it's unlikely in a very unconventional way. Even if it seems that it's unlikely
that such a big data set will be used, it's still plausible that one that such a big data set is used, it's still plausible that one
of our customers will encounter a problem with the feature. of our customers could encounter a problem with the feature.
Understanding ahead of time how it's going to behave at scale, even if we accept it, Understanding ahead of time how it behaves at scale, even if we accept it,
is the desired outcome. We should always have a plan or understanding of what it will take is the desired outcome. We should always have a plan or understanding of what is needed
to optimize the feature for higher usage patterns. to optimize the feature for higher usage patterns.
Every database structure should be optimized and sometimes even over-described Every database structure should be optimized and sometimes even over-described
in preparation for easy extension. The hardest part after some point is in preparation for easy extension. The hardest part after some point is
data migration. Migrating millions of rows will always be troublesome and data migration. Migrating millions of rows is always troublesome and
can have a negative impact on the application. can have a negative impact on the application.
To better understand how to get help with the query plan reviews To better understand how to get help with the query plan reviews
...@@ -130,7 +130,7 @@ queries unless absolutely necessary. ...@@ -130,7 +130,7 @@ queries unless absolutely necessary.
The total number of queries executed by the code modified or added by a merge request The total number of queries executed by the code modified or added by a merge request
must not increase unless absolutely necessary. When building features it's must not increase unless absolutely necessary. When building features it's
entirely possible you will need some extra queries, but you should try to keep entirely possible you need some extra queries, but you should try to keep
this at a minimum. this at a minimum.
As an example, say you introduce a feature that updates a number of database As an example, say you introduce a feature that updates a number of database
...@@ -144,7 +144,7 @@ objects_to_update.each do |object| ...@@ -144,7 +144,7 @@ objects_to_update.each do |object|
end end
``` ```
This will end up running one query for every object to update. This code can This means running one query for every object to update. This code can
easily overload a database given enough rows to update or many instances of this easily overload a database given enough rows to update or many instances of this
code running in parallel. This particular problem is known as the code running in parallel. This particular problem is known as the
["N+1 query problem"](https://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations). You can write a test with [QueryRecorder](query_recorder.md) to detect this and prevent regressions. ["N+1 query problem"](https://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations). You can write a test with [QueryRecorder](query_recorder.md) to detect this and prevent regressions.
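The difference between per-object and bulk updates can be illustrated with a minimal sketch. It does not use Rails or a real database: `FakeConnection`, `update_each`, and `update_in_bulk` are hypothetical names introduced here only to count how many statements each approach would issue, and the `objects` table and `some_field` column are assumptions.

```ruby
# Hypothetical stand-in for a database connection. It only counts
# the statements it receives and never talks to a real database.
class FakeConnection
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  def execute(_sql)
    @query_count += 1
  end
end

# N+1 style: one UPDATE statement per object.
def update_each(connection, ids)
  ids.each do |id|
    connection.execute("UPDATE objects SET some_field = 'value' WHERE id = #{id}")
  end
end

# Bulk style: a single UPDATE statement covering every object.
def update_in_bulk(connection, ids)
  return if ids.empty?

  connection.execute("UPDATE objects SET some_field = 'value' WHERE id IN (#{ids.join(', ')})")
end

ids = (1..100).to_a

per_object = FakeConnection.new
update_each(per_object, ids)

bulk = FakeConnection.new
update_in_bulk(bulk, ids)

puts "per-object queries: #{per_object.query_count}"
puts "bulk queries: #{bulk.query_count}"
```

The per-object count grows linearly with the number of rows, while the bulk count stays constant. That growth is what a QueryRecorder-based test is meant to catch.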
...@@ -189,8 +189,8 @@ build.project == pipeline_project ...@@ -189,8 +189,8 @@ build.project == pipeline_project
# => true # => true
``` ```
When we call `build.project`, it will not hit the database, it will use the cached result, but it will re-instantiate When we call `build.project`, it doesn't hit the database, it uses the cached result, but it re-instantiates
same pipeline project object. It turns out that associated objects do not point to the same in-memory object. the same pipeline project object. It turns out that associated objects do not point to the same in-memory object.
If we try to serialize each build: If we try to serialize each build:
...@@ -200,7 +200,7 @@ pipeline.builds.each do |build| ...@@ -200,7 +200,7 @@ pipeline.builds.each do |build|
end end
``` ```
It will re-instantiate project object for each build, instead of using the same in-memory object. It re-instantiates the project object for each build, instead of using the same in-memory object.
In this particular case the workaround is fairly easy: In this particular case the workaround is fairly easy:
...@@ -212,7 +212,7 @@ end ...@@ -212,7 +212,7 @@ end
``` ```
We can assign `pipeline.project` to each `build.project`, since we know it should point to the same project. We can assign `pipeline.project` to each `build.project`, since we know it should point to the same project.
This will allow us that each build point to the same in-memory project, This allows each build to point to the same in-memory project,
avoiding the cached SQL query and re-instantiation of the project object for each build. avoiding the cached SQL query and re-instantiation of the project object for each build.
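The distinction between "equal" and "the same in-memory object" can be shown with a small plain-Ruby sketch. The `Project`, `Build`, and `Pipeline` structs below are hypothetical stand-ins, not the real ActiveRecord models; they only mimic the situation where each build ends up with its own instance of the same project.

```ruby
# Hypothetical plain-Ruby stand-ins for the models discussed above.
Project = Struct.new(:id)
Build = Struct.new(:id, :project)
Pipeline = Struct.new(:project, :builds)

# Simulate loading: every build receives its own Project instance
# that represents the same row (id 1).
pipeline = Pipeline.new(Project.new(1), [])
pipeline.builds = (1..3).map { |id| Build.new(id, Project.new(1)) }

# `==` compares by value, `equal?` compares object identity.
equal_by_value = pipeline.builds.all? { |build| build.project == pipeline.project }
same_object_before = pipeline.builds.all? { |build| build.project.equal?(pipeline.project) }

# The workaround: point every build at the pipeline's project instance.
pipeline.builds.each { |build| build.project = pipeline.project }
same_object_after = pipeline.builds.all? { |build| build.project.equal?(pipeline.project) }

puts "equal by value: #{equal_by_value}"
puts "same object before assignment: #{same_object_before}"
puts "same object after assignment: #{same_object_after}"
```

Before the assignment the projects compare as equal but are distinct objects. After the assignment every build shares one instance, so any per-project work done during serialization happens once.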
## Executing Queries in Loops ## Executing Queries in Loops
...@@ -323,7 +323,7 @@ Certain UI elements may not always be needed. For example, when hovering over a ...@@ -323,7 +323,7 @@ Certain UI elements may not always be needed. For example, when hovering over a
diff line there's a small icon displayed that can be used to create a new diff line there's a small icon displayed that can be used to create a new
comment. Instead of always rendering these kind of elements they should only be comment. Instead of always rendering these kind of elements they should only be
rendered when actually needed. This ensures we don't spend time generating rendered when actually needed. This ensures we don't spend time generating
Haml/HTML when it's not going to be used. Haml/HTML when it's not used.
## Instrumenting New Code ## Instrumenting New Code
...@@ -411,8 +411,8 @@ The quota should be optimised to a level that we consider the feature to ...@@ -411,8 +411,8 @@ The quota should be optimised to a level that we consider the feature to
be performant and usable for the user, but **not limiting**. be performant and usable for the user, but **not limiting**.
**We want the features to be fully usable for the users.** **We want the features to be fully usable for the users.**
**However, we want to ensure that the feature will continue to perform well if used at its limit** **However, we want to ensure that the feature continues to perform well if used at its limit**
**and it won't cause availability issues.** **and it doesn't cause availability issues.**
Consider that it's always better to start with some kind of limitation, Consider that it's always better to start with some kind of limitation,
instead of later introducing a breaking change that would result in some instead of later introducing a breaking change that would result in some
...@@ -433,11 +433,11 @@ The intent of quotas could be different: ...@@ -433,11 +433,11 @@ The intent of quotas could be different:
Examples: Examples:
1. Pipeline Schedules: It's very unlikely that user will want to create 1. Pipeline Schedules: It's very unlikely that user wants to create
more than 50 schedules. more than 50 schedules.
In such cases it's rather expected that this is either misuse In such cases it's rather expected that this is either misuse
or abuse of the feature. Lack of the upper limit can result or abuse of the feature. Lack of the upper limit can result
in service degradation as the system will try to process all schedules in service degradation as the system tries to process all schedules
assigned the project. assigned the project.
1. GitLab CI/CD includes: We started with the limit of maximum of 50 nested includes. 1. GitLab CI/CD includes: We started with the limit of maximum of 50 nested includes.
...@@ -477,7 +477,7 @@ We can consider the following types of storages: ...@@ -477,7 +477,7 @@ We can consider the following types of storages:
for most of our implementations. Even though this allows the above limit to be significantly larger, for most of our implementations. Even though this allows the above limit to be significantly larger,
it does not really mean that you can use more. The shared temporary storage is shared by it does not really mean that you can use more. The shared temporary storage is shared by
all nodes. Thus, the job that uses significant amount of that space or performs a lot all nodes. Thus, the job that uses significant amount of that space or performs a lot
of operations will create a contention on execution of all other jobs and request of operations creates a contention on execution of all other jobs and request
across the whole application, this can easily impact stability of the whole GitLab. across the whole application, this can easily impact stability of the whole GitLab.
Be respectful of that. Be respectful of that.
......
...@@ -467,7 +467,7 @@ of the `gitlab-org/gitlab-foss` project. These jobs are only created in the foll ...@@ -467,7 +467,7 @@ of the `gitlab-org/gitlab-foss` project. These jobs are only created in the foll
The `* as-if-foss` jobs are run in addition to the regular EE-context jobs. They have the `FOSS_ONLY='1'` variable The `* as-if-foss` jobs are run in addition to the regular EE-context jobs. They have the `FOSS_ONLY='1'` variable
set and get their EE-specific folders removed before the tests start running. set and get their EE-specific folders removed before the tests start running.
The intent is to ensure that a change won't introduce a failure once the `gitlab-org/gitlab` project will be synced to The intent is to ensure that a change doesn't introduce a failure after the `gitlab-org/gitlab` project is synced to
the `gitlab-org/gitlab-foss` project. the `gitlab-org/gitlab-foss` project.
## Performance ## Performance
...@@ -502,7 +502,7 @@ request, be sure to start the `dont-interrupt-me` job before pushing. ...@@ -502,7 +502,7 @@ request, be sure to start the `dont-interrupt-me` job before pushing.
- `update-assets-compile-production-cache`, defined in [`.gitlab/ci/frontend.gitlab-ci.yml`](https://gitlab.com/gitlab-org/gitlab/blob/master/.gitlab/ci/frontend.gitlab-ci.yml). - `update-assets-compile-production-cache`, defined in [`.gitlab/ci/frontend.gitlab-ci.yml`](https://gitlab.com/gitlab-org/gitlab/blob/master/.gitlab/ci/frontend.gitlab-ci.yml).
- `update-assets-compile-test-cache`, defined in [`.gitlab/ci/frontend.gitlab-ci.yml`](https://gitlab.com/gitlab-org/gitlab/blob/master/.gitlab/ci/frontend.gitlab-ci.yml). - `update-assets-compile-test-cache`, defined in [`.gitlab/ci/frontend.gitlab-ci.yml`](https://gitlab.com/gitlab-org/gitlab/blob/master/.gitlab/ci/frontend.gitlab-ci.yml).
- `update-yarn-cache`, defined in [`.gitlab/ci/frontend.gitlab-ci.yml`](https://gitlab.com/gitlab-org/gitlab/blob/master/.gitlab/ci/frontend.gitlab-ci.yml). - `update-yarn-cache`, defined in [`.gitlab/ci/frontend.gitlab-ci.yml`](https://gitlab.com/gitlab-org/gitlab/blob/master/.gitlab/ci/frontend.gitlab-ci.yml).
1. These jobs will run in merge requests whose title include `UPDATE CACHE`. 1. These jobs run in merge requests whose title includes `UPDATE CACHE`.
### Pre-clone step ### Pre-clone step
...@@ -546,8 +546,7 @@ on a scheduled pipeline, it does the following: ...@@ -546,8 +546,7 @@ on a scheduled pipeline, it does the following:
1. Saves the data as a `.tar.gz`. 1. Saves the data as a `.tar.gz`.
1. Uploads it into the Google Cloud Storage bucket. 1. Uploads it into the Google Cloud Storage bucket.
When a CI job runs with this configuration, you'll see something like When a CI job runs with this configuration, the output looks something like this:
this:
```shell ```shell
$ eval "$CI_PRE_CLONE_SCRIPT" $ eval "$CI_PRE_CLONE_SCRIPT"
...@@ -568,7 +567,7 @@ GitLab Team Member, find credentials in the ...@@ -568,7 +567,7 @@ GitLab Team Member, find credentials in the
[GitLab shared 1Password account](https://about.gitlab.com/handbook/security/#1password-for-teams). [GitLab shared 1Password account](https://about.gitlab.com/handbook/security/#1password-for-teams).
Note that this bucket should be located in the same continent as the Note that this bucket should be located in the same continent as the
runner, or [network egress charges will apply](https://cloud.google.com/storage/pricing). runner, or [you can incur network egress charges](https://cloud.google.com/storage/pricing).
## CI configuration internals ## CI configuration internals
...@@ -662,7 +661,7 @@ and included in `rules` definitions via [YAML anchors](../ci/yaml/README.md#anch ...@@ -662,7 +661,7 @@ and included in `rules` definitions via [YAML anchors](../ci/yaml/README.md#anch
| `if-not-canonical-namespace` | Matches if the project isn't in the canonical (`gitlab-org/`) or security (`gitlab-org/security`) namespace. | Use to create a job for forks (by using `when: on_success\|manual`), or **not** create a job for forks (by using `when: never`). | | `if-not-canonical-namespace` | Matches if the project isn't in the canonical (`gitlab-org/`) or security (`gitlab-org/security`) namespace. | Use to create a job for forks (by using `when: on_success\|manual`), or **not** create a job for forks (by using `when: never`). |
| `if-not-ee` | Matches if the project isn't EE (i.e. project name isn't `gitlab` or `gitlab-ee`). | Use to create a job only in the FOSS project (by using `when: on_success|manual`), or **not** create a job if the project is EE (by using `when: never`). | | `if-not-ee` | Matches if the project isn't EE (i.e. project name isn't `gitlab` or `gitlab-ee`). | Use to create a job only in the FOSS project (by using `when: on_success|manual`), or **not** create a job if the project is EE (by using `when: never`). |
| `if-not-foss` | Matches if the project isn't FOSS (i.e. project name isn't `gitlab-foss`, `gitlab-ce`, or `gitlabhq`). | Use to create a job only in the EE project (by using `when: on_success|manual`), or **not** create a job if the project is FOSS (by using `when: never`). | | `if-not-foss` | Matches if the project isn't FOSS (i.e. project name isn't `gitlab-foss`, `gitlab-ce`, or `gitlabhq`). | Use to create a job only in the EE project (by using `when: on_success|manual`), or **not** create a job if the project is FOSS (by using `when: never`). |
| `if-default-refs` | Matches if the pipeline is for `master`, `/^[\d-]+-stable(-ee)?$/` (stable branches), `/^\d+-\d+-auto-deploy-\d+$/` (auto-deploy branches), `/^security\//` (security branches), merge requests, and tags. | Note that jobs won't be created for branches with this default configuration. | | `if-default-refs` | Matches if the pipeline is for `master`, `/^[\d-]+-stable(-ee)?$/` (stable branches), `/^\d+-\d+-auto-deploy-\d+$/` (auto-deploy branches), `/^security\//` (security branches), merge requests, and tags. | Note that jobs aren't created for branches with this default configuration. |
| `if-master-refs` | Matches if the current branch is `master`. | | | `if-master-refs` | Matches if the current branch is `master`. | |
| `if-master-or-tag` | Matches if the pipeline is for the `master` branch or for a tag. | | | `if-master-or-tag` | Matches if the pipeline is for the `master` branch or for a tag. | |
| `if-merge-request` | Matches if the pipeline is for a merge request. | | | `if-merge-request` | Matches if the pipeline is for a merge request. | |
......