Commit 06c9f530 authored by Nick Gaskill

Merge branch 'docs-recovery-alert-refresh' into 'master'

Better document recovery alert behavior and incident management settings

See merge request gitlab-org/gitlab!58514
parents a87bd40a 3e3323ed
......@@ -267,3 +267,19 @@ any other Markdown text field in GitLab by
You can embed both [GitLab-hosted metrics](../metrics/embed.md) and
[Grafana metrics](../metrics/embed_grafana.md) in incidents and issue
templates.
### Automatically close incidents via recovery alerts
> - [Introduced for Prometheus Integrations](https://gitlab.com/gitlab-org/gitlab/-/issues/13401) in GitLab 12.5.
> - [Introduced for HTTP Integrations](https://gitlab.com/gitlab-org/gitlab/-/issues/13402) in GitLab 13.4.
With Maintainer or higher [permissions](../../user/permissions.md), you can enable
GitLab to close an incident automatically when a **Recovery Alert** is received:
1. Navigate to **Settings > Operations > Incidents** and expand **Incidents**.
1. Check the **Automatically close associated Incident** checkbox.
1. Click **Save changes**.
When GitLab receives a **Recovery Alert**, it closes the associated incident.
This action is recorded as a system message on the incident indicating that it
was closed automatically by the GitLab Alert bot.
......@@ -97,17 +97,17 @@ to configure alerts for this integration.
## Customize the alert payload outside of GitLab
For all integration types, you can customize the payload by sending the following
For HTTP Endpoints without [custom mappings](#map-fields-in-custom-alerts), you can customize the payload by sending the following
parameters. All fields are optional. If the incoming alert does not contain a value for the `title` field, a default value of `New: Alert` is applied.
| Property | Type | Description |
| ------------------------- | --------------- | ----------- |
| `title` | String | The title of the incident. |
| `title` | String | The title of the alert.|
| `description` | String | A high-level summary of the problem. |
| `start_time` | DateTime | The time of the incident. If none is provided, a timestamp of the issue is used. |
| `end_time` | DateTime | For existing alerts only. When provided, the alert is resolved and the associated incident is closed. |
| `start_time` | DateTime | The time of the alert. If none is provided, the current time is used. |
| `end_time` | DateTime | The resolution time of the alert. If provided, the alert is resolved. |
| `service` | String | The affected service. |
| `monitoring_tool` | String | The name of the associated monitoring tool. |
| `monitoring_tool` | String | The name of the associated monitoring tool. |
| `hosts` | String or Array | One or more hosts where this incident occurred. |
| `severity` | String | The severity of the alert. Case-insensitive. Can be one of: `critical`, `high`, `medium`, `low`, `info`, `unknown`. Defaults to `critical` if missing or if the value is not in this list. |
| `fingerprint` | String or Array | The unique identifier of the alert. This can be used to group occurrences of the same alert. |
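
As a rough illustration of the table above, the following Python sketch assembles a payload from the documented properties and mirrors the documented `severity` fallback locally. The helper names (`normalize_severity`, `build_alert_payload`) are hypothetical and not part of GitLab. GitLab applies its defaults on the server, so this only shows what to expect.

```python
import json

# Severity values listed in the table above.
ALLOWED_SEVERITIES = {"critical", "high", "medium", "low", "info", "unknown"}


def normalize_severity(value):
    """Mirror the documented rule: case-insensitive match, else `critical`."""
    if isinstance(value, str) and value.lower() in ALLOWED_SEVERITIES:
        return value.lower()
    return "critical"


def build_alert_payload(**fields):
    """Build a payload dict, dropping unset fields (all fields are optional)."""
    payload = {key: value for key, value in fields.items() if value is not None}
    if "severity" in payload:
        payload["severity"] = normalize_severity(payload["severity"])
    return payload


if __name__ == "__main__":
    example = build_alert_payload(
        title="Disk usage above 90%",   # illustrative values only
        description="Root volume is nearly full.",
        service="storage",
        hosts=["db-1.example.com"],
        severity="HIGH",
        fingerprint="disk-usage-db-1",
    )
    print(json.dumps(example, indent=2))
```

The resulting JSON would be sent as the request body to the HTTP Endpoint configured for the integration. The endpoint URL and authorization details are specific to each project and are not shown here.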
......@@ -189,6 +189,17 @@ If the existing alert is already `resolved`, GitLab creates a new alert instead.
![Alert Management List](img/alert_list_v13_1.png)
## Recovery alerts
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/13402) in GitLab 13.4.
GitLab automatically resolves the alert when an HTTP Endpoint
receives a payload with the end time of the alert set. For HTTP Endpoints
without [custom mappings](#map-fields-in-custom-alerts), the expected
field is `end_time`. With custom mappings, you can select the expected field.
You can also configure the associated [incident to be closed automatically](../incident_management/incidents.md#automatically-close-incidents-via-recovery-alerts) when the alert resolves.
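
A minimal sketch of a recovery payload for an HTTP Endpoint without custom mappings is shown below. It assumes that the existing alert is matched by its `fingerprint` and that an ISO 8601 timestamp is an acceptable `DateTime` value. Both are assumptions, and `build_recovery_payload` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def build_recovery_payload(fingerprint, end_time=None):
    """Return a payload that sets `end_time` so the matching alert resolves.

    Assumption: the alert is identified by `fingerprint` and the timestamp
    is serialized as ISO 8601.
    """
    if end_time is None:
        end_time = datetime.now(timezone.utc)
    return {
        "fingerprint": fingerprint,
        "end_time": end_time.isoformat(),
    }


if __name__ == "__main__":
    resolved_at = datetime(2021, 4, 1, 12, 0, tzinfo=timezone.utc)
    print(build_recovery_payload("disk-usage-db-1", resolved_at))
```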
## Link to your Opsgenie Alerts
> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/3066) in GitLab Premium 13.2.
......
......@@ -96,7 +96,6 @@ Prometheus server to use the
## Trigger actions from alerts **(ULTIMATE)**
> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/4925) in [GitLab Ultimate](https://about.gitlab.com/pricing/) 11.11.
> - [From GitLab Ultimate 12.5](https://gitlab.com/gitlab-org/gitlab/-/issues/13401), when GitLab receives a recovery alert, it automatically closes the associated issue.
Alerts can be used to trigger actions, like opening an issue automatically
(disabled by default since `13.1`). To configure the actions:
......@@ -127,10 +126,6 @@ values extracted from the [`alerts` field in webhook payload](https://prometheus
- **Low**: `low`, `s4`, `p4`, `warn`, `warning`
- **Info**: `info`, `s5`, `p5`, `debug`, `information`, `notice`
When GitLab receives a **Recovery Alert**, it closes the associated issue.
This action is recorded as a system message on the issue indicating that it
was closed automatically by the GitLab Alert bot.
To further customize the issue, you can add labels, mentions, or any other supported
[quick action](../../user/project/quick_actions.md) in the selected issue template,
which applies to all incidents. To limit quick actions or other information to
......@@ -143,3 +138,12 @@ does not yet exist, it is also created automatically.
If the metric exceeds the threshold of the alert for over 5 minutes, GitLab sends
an email to all [Maintainers and Owners](../../user/permissions.md#project-members-permissions)
of the project.
### Recovery alerts
> - [From GitLab Ultimate 12.5](https://gitlab.com/gitlab-org/gitlab/-/issues/13401), when GitLab receives a recovery alert, it automatically closes the associated issue.
GitLab automatically resolves the alert when Prometheus
sends a payload with the field `status` set to `resolved`.
You can also configure the associated [incident to be closed automatically](../incident_management/incidents.md#automatically-close-incidents-via-recovery-alerts) when the alert resolves.
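
To see which entries in a Prometheus Alertmanager webhook payload would count as recovery alerts, a check along these lines can help. It is a sketch that assumes the standard Alertmanager webhook shape, with an `alerts` list whose entries each carry a `status`. `resolved_alerts` is a hypothetical helper, not a GitLab or Prometheus API.

```python
def resolved_alerts(webhook_payload):
    """Return the alerts in the payload whose `status` is `resolved`."""
    return [
        alert
        for alert in webhook_payload.get("alerts", [])
        if alert.get("status") == "resolved"
    ]


if __name__ == "__main__":
    # Illustrative payload; label values are made up.
    payload = {
        "status": "resolved",
        "alerts": [
            {"status": "resolved", "labels": {"alertname": "HighLatency"}},
            {"status": "firing", "labels": {"alertname": "DiskFull"}},
        ],
    }
    for alert in resolved_alerts(payload):
        print(alert["labels"]["alertname"])
```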
......@@ -356,6 +356,24 @@ to remove a fork relationship.
## Operations settings
### Alerts
Configure [alert integrations](../../../operations/incident_management/integrations.md#configuration) to triage and manage critical problems in your application as [alerts](../../../operations/incident_management/alerts.md).
### Incidents
#### Alert integration
Automatically [create](../../../operations/incident_management/incidents.md#create-incidents-automatically), [notify on](../../../operations/incident_management/paging.md#email-notifications), and [resolve](../../../operations/incident_management/incidents.md#automatically-close-incidents-via-recovery-alerts) incidents based on GitLab alerts.
#### PagerDuty integration
[Create incidents in GitLab for each PagerDuty incident](../../../operations/incident_management/incidents.md#create-incidents-via-the-pagerduty-webhook).
#### Incident settings
[Manage Service Level Agreements for incidents](../../../operations/incident_management/incidents.md#service-level-agreement-countdown-timer) with an SLA countdown timer.
### Error Tracking
Configure Error Tracking to discover and view [Sentry errors within GitLab](../../../operations/error_tracking.md).
......