Commit 3c9766d6 authored by Jacob Vosmaer's avatar Jacob Vosmaer

Document that gitlab-org/gitlab no longer uses CI_PRE_CLONE_SCRIPT

This updates the developer documentation to reflect the fact that we
stopped using CI_PRE_CLONE_SCRIPT to make the gitlab-org/gitlab CI Git
fetch workload manageable.
parent a09edca1
......@@ -250,12 +250,15 @@ concurrent = 4
This makes the cloning configuration to be part of the given runner
and does not require us to update each `.gitlab-ci.yml`.
## Pre-clone step
> [An issue exists](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/463) to remove the need for this optimization.
For very active repositories with a large number of references and files, you can also
optimize your CI jobs by seeding repository data with GitLab Runner's [`pre_clone_script`](https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runners-section).
See [our development documentation](../../development/pipelines.md#pre-clone-step) for
an overview of how we implemented this approach on GitLab.com for the main GitLab repository.
## Git fetch caching or pre-clone step
For very active repositories with a large number of references and files, you can either (or both):
- Consider using the [Gitaly pack-objects cache](../../administration/gitaly/configure_gitaly.md#pack-objects-cache) instead of a
pre-clone step. This is easier to set up and it benefits all repositories on your GitLab server, unlike the pre-clone step that
must be configured per-repository. The pack-objects cache also automatically works for forks. For `gitlab-org/gitlab` development
on GitLab.com, we stopped using a pre-clone step.
- Optimize your CI/CD jobs by seeding repository data in a pre-clone step with the
[`pre_clone_script`](https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runners-section) of GitLab Runner. See our
[development documentation](../../development/pipelines.md#pre-clone-step) for an overview of how we used to implement this approach on
GitLab.com for the main GitLab repository.
......@@ -791,19 +791,30 @@ request, be sure to start the `dont-interrupt-me` job before pushing.
We limit the artifacts that are saved and retrieved by jobs to the minimum in order to reduce the upload/download time and costs, as well as the artifacts storage.
### Pre-clone step
### Git fetch caching
Because GitLab.com uses the [pack-objects cache](../administration/gitaly/configure_gitaly.md#pack-objects-cache),
concurrent Git fetches of the same pipeline ref are deduplicated on
the Gitaly server (always) and served from cache (when available).
This works well for the following reasons:
The `gitlab-org/gitlab` project on GitLab.com uses a [pre-clone step](https://gitlab.com/gitlab-org/gitlab/-/issues/39134)
to seed the project with a recent archive of the repository. This is done for
several reasons:
- The pack-objects cache is enabled on all Gitaly servers on GitLab.com.
- The CI/CD [Git strategy setting](../ci/pipelines/settings.md#choose-the-default-git-strategy) for `gitlab-org/gitlab` is **Git clone**,
causing all jobs to fetch the same data, which maximizes the cache hit ratio.
- We use [shallow clone](../ci/pipelines/settings.md#limit-the-number-of-changes-fetched-during-clone) to avoid downloading the full Git
history for every job.
### Pre-clone step
- It speeds up builds because a 800 MB download only takes seconds, as opposed to a full Git clone.
- It significantly reduces load on the file server, as smaller deltas mean less time spent in `git pack-objects`.
NOTE:
We no longer use this optimization for `gitlab-org/gitlab` because the [pack-objects cache](../administration/gitaly/configure_gitaly.md#pack-objects-cache)
allows Gitaly to serve the full CI/CD fetch traffic now. See [Git fetch caching](#git-fetch-caching).
The pre-clone step works by using the `CI_PRE_CLONE_SCRIPT` variable
[defined by GitLab.com shared runners](../ci/runners/build_cloud/linux_build_cloud.md#pre-clone-script).
The `CI_PRE_CLONE_SCRIPT` is currently defined as a project CI/CD variable:
The `CI_PRE_CLONE_SCRIPT` is defined as a project CI/CD variable:
```shell
(
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment