<!-- Editable diagram available at https://app.diagrams.net/#G1LFl-KW4fgpBPzz8VIH9rsOlAH4t0xwKj -->
On the left side we have the events that can trigger a pipeline (initiated by a user or by automation):
- A `git push` is the most common event that triggers a pipeline.
- The [Web API](../../api/pipelines.md#create-a-new-pipeline) (see the sketch after this list).
- A user clicking the "Run Pipeline" button in the UI.
- When a [merge request is created or updated](../../ci/merge_request_pipelines/index.md#pipelines-for-merge-requests).
- When an MR is added to a [Merge Train](../../ci/merge_request_pipelines/pipelines_for_merged_results/merge_trains/index.md#merge-trains-premium).
- A [scheduled pipeline](../../ci/pipelines/schedules.md#pipeline-schedules).
- When a project is [subscribed to an upstream project](../../ci/multi_project_pipelines.md#trigger-a-pipeline-when-an-upstream-project-is-rebuilt).
- When [Auto DevOps](../../topics/autodevops/index.md) is enabled.
- When GitHub integration is used with [external pull requests](../../ci/ci_cd_for_external_repos/index.md#pipelines-for-external-pull-requests).
- When an upstream pipeline contains a [bridge job](../../ci/yaml/README.md#trigger) which triggers a downstream pipeline.
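
Most of these flows are internal, but the Web API path can be exercised directly. A hedged sketch, assuming a placeholder host, project ID, and token (the documented endpoint is `POST /projects/:id/pipeline`):

```ruby
require 'net/http'

# Trigger a pipeline for `master` on project 42 via the pipelines API.
# Host, project ID, and token below are placeholders.
uri = URI('https://gitlab.example.com/api/v4/projects/42/pipeline?ref=master')
request = Net::HTTP::Post.new(uri)
request['PRIVATE-TOKEN'] = '<your_access_token>'

response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
  http.request(request)
end
response.code # => "201" when the pipeline was created
```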
Triggering any of these events invokes the [`CreatePipelineService`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/ci/create_pipeline_service.rb),
which takes as input the event data and the user triggering it, then attempts to create a pipeline.
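
As a simplified sketch (from a Rails console, given a `project` and `current_user`; the exact arguments depend on the triggering event):

```ruby
# `:push` is the pipeline "source" and mirrors the triggering event
# (:web, :schedule, :api, and so on for the other triggers above).
Ci::CreatePipelineService
  .new(project, current_user, ref: 'master')
  .execute(:push)
```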
The `CreatePipelineService` relies heavily on the [`YAML Processor`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/yaml_processor.rb)
component, which takes a YAML blob as input and returns the abstract data structure of a
pipeline (including stages and all jobs). This component also validates the structure of the YAML while
processing it, and returns any syntax or semantic errors. The `YAML Processor` component is where we define
[all the keywords](../../ci/yaml/README.md) available to structure a pipeline.
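
As a rough sketch (this internal API has evolved over time, so treat the method names as indicative only):

```ruby
# A minimal, hypothetical CI configuration fed to the processor.
config = <<~YAML
  rubocop:
    stage: test
    script: bundle exec rubocop
YAML

result = Gitlab::Ci::YamlProcessor.new(config).execute
result.valid? # => true; syntax or semantic errors would be in result.errors
```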
The `CreatePipelineService` receives the abstract data structure returned by the `YAML Processor`
and converts it into persisted models (pipeline, stages, jobs, etc.). After that, the pipeline is ready
to be processed. Processing a pipeline means running its jobs in order of execution (stage or DAG)
until one of the following occurs:
- All expected jobs have been executed.
- Failures interrupt the pipeline execution.
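
The difference between the two execution orders can be seen in a hypothetical configuration (shown as a Ruby heredoc, matching the other sketches on this page):

```ruby
config = <<~YAML
  build:
    stage: build
    script: make build

  lint:
    stage: test
    needs: []           # DAG order: starts immediately, ignoring stages
    script: make lint

  deploy:
    stage: deploy
    script: make deploy # stage order: waits for all earlier stages
YAML
```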
The component that processes a pipeline is [`ProcessPipelineService`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/ci/process_pipeline_service.rb),
which is responsible for moving all the pipeline's jobs to a completed state. When a pipeline is created, all its
jobs are initially in the `created` state. This service looks at which jobs in the `created` state are eligible
to be processed based on the pipeline structure. Then it moves them into the `pending` state, which means
they can now [be picked up by a Runner](#job-scheduling). After a job has been executed it can complete
successfully or fail. Each status transition for a job within a pipeline triggers this service again, which
looks for the next jobs to be transitioned towards completion. While doing that, `ProcessPipelineService`
updates the status of jobs, stages and the overall pipeline.
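
A conceptual model of that loop (not the actual implementation, which lives in `ProcessPipelineService`) could look like this:

```ruby
# Toy model: on every status change, promote eligible `created` jobs to
# `pending` and recompute the overall status.
Job = Struct.new(:name, :stage_index, :status)

COMPLETED = %i[success failed].freeze

def process(jobs)
  jobs.select { |job| job.status == :created }.each do |job|
    earlier = jobs.select { |other| other.stage_index < job.stage_index }
    # Stage ordering: eligible once every earlier job has completed.
    job.status = :pending if earlier.all? { |other| COMPLETED.include?(other.status) }
  end

  jobs.all? { |job| COMPLETED.include?(job.status) } ? :complete : :running
end

jobs = [Job.new('build', 0, :created), Job.new('test', 1, :created)]
process(jobs) # => :running; 'build' is now :pending, 'test' still :created
```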
On the right side of the diagram we have a list of [Runners](../../ci/runners/README.md#configuring-gitlab-runners)
connected to the GitLab instance. These can be Shared Runners, Group Runners or Project-specific Runners.
The communication between Runners and the Rails server occurs through a set of API endpoints, grouped as
the `Runner API Gateway`.
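
For instance, the endpoint that drives the job polling described below is `POST /api/v4/jobs/request`; a simplified client-side sketch with placeholder values:

```ruby
require 'net/http'
require 'json'

uri = URI('https://gitlab.example.com/api/v4/jobs/request')
token = '<runner_authentication_token>' # placeholder

loop do
  response = Net::HTTP.post(uri, { token: token }.to_json,
                            'Content-Type' => 'application/json')

  case response.code
  when '201' then break JSON.parse(response.body) # a job was assigned
  when '204' then sleep 3                         # no job available, retry
  end
end
```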
We can register, delete and verify Runners, which also causes read/write queries to the database. After a Runner is connected,
it keeps asking for the next job to execute. This invokes the [`RegisterJobService`](https://gitlab.com/gitlab-org/gitlab/blob/master/app/services/ci/register_job_service.rb)
which will pick the next job and assign it to the Runner. At this point the job will transition to a
`running` state, which again triggers `ProcessPipelineService` due to the status change.
For more details, read [Job scheduling](#job-scheduling).
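
From the Rails side, a hedged sketch of that hand-off (given a connected `runner` in a Rails console):

```ruby
# Indicative only: pick the next eligible job for this runner, if any.
result = Ci::RegisterJobService.new(runner).execute

result.build # => the job now assigned to the runner (and moving to
             #    `running`), or nil when no eligible job is available
```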
While a job is being executed, the Runner sends logs back to the server, as well as any artifacts
that need to be stored. A job may also depend on artifacts from previous jobs in order to run. In that
case the Runner downloads them using a dedicated API endpoint.
Artifacts are stored in object storage, while metadata is kept in the database. Reports (JUnit, SAST,
DAST, etc.) are an important example of artifacts: they are parsed and rendered in the merge request.
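
A hypothetical job showing both kinds of artifacts, again as a heredoc:

```ruby
config = <<~YAML
  rspec:
    stage: test
    script: bundle exec rspec
    artifacts:
      paths:
        - coverage/      # regular artifact, stored in object storage
      reports:
        junit: rspec.xml # parsed and rendered in the merge request
YAML
```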
Job status transitions are not all automated. A user may run [manual jobs](../../ci/yaml/README.md#whenmanual), cancel a pipeline, retry
specific failed jobs or the entire pipeline. Anything that
causes a job to change status will trigger `ProcessPipelineService`, as it's responsible for
tracking the status of the entire pipeline.
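
For example, a hypothetical manual deployment job waits for a user to start it:

```ruby
config = <<~YAML
  deploy_production:
    stage: deploy
    script: make deploy
    when: manual # waits for a user to run it, rather than starting automatically
YAML
```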
A special type of job is the [bridge job](../../ci/yaml/README.md#trigger), which is executed server-side
when transitioning to the `pending` state. This job is responsible for creating a downstream pipeline, such as
a multi-project or child pipeline. The workflow loop starts again
from the `CreatePipelineService` every time a downstream pipeline is triggered.
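
Two hypothetical bridge jobs, one for each kind of downstream pipeline:

```ruby
config = <<~YAML
  deploy_docs:
    stage: deploy
    trigger:
      project: my-group/docs-site # multi-project downstream pipeline

  component_tests:
    stage: test
    trigger:
      include: child-pipeline.yml # child pipeline in the same project
YAML
```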
## Job scheduling
When a pipeline is created, all its jobs are created at once for all stages, with an initial state of `created`. This makes it possible to visualize the full content of a pipeline.