Commit a3986a81 authored by Fabian Reinartz's avatar Fabian Reinartz

Document new alerting rule syntax

parent 82c10c74
......@@ -15,53 +15,63 @@ Alerting rules are configured in Prometheus in the same way as [recording
rules](../../querying/rules).
### Defining alerting rules
Alerting rules are defined in the following syntax:
ALERT <alert name>
IF <expression>
[FOR <duration>]
[WITH <label set>]
SUMMARY "<summary template>"
DESCRIPTION "<description template>"
[LABELS <label set>]
[ANNOTATIONS <label set>]
The optional `FOR` clause causes Prometheus to wait for a certain duration
between first encountering a new expression output vector element (like an
instance with a high HTTP error rate) and counting an alert as firing for this
element. Elements that are active, but not firing yet, are in pending state.
The `WITH` clause allows specifying a set of additional labels to be attached
to the alert. Any existing conflicting labels will be overwritten.
The `LABELS` clause allows specifying a set of additional labels to be attached
to the alert. Any existing conflicting labels will be overwritten. The label
values can be templated.
The `ANNOTATIONS` clause specifies another set of labels that are not
identifying for an alert instance. They are used to store longer additional
information such as alert descriptions or runbook links. The annotation values
can be templated.
#### Templating
The `SUMMARY` should be a short, human-readable summary of the alert (suitable
for e.g. an email subject line), while the `DESCRIPTION` clause should provide
a longer description. Both string fields allow the inclusion of template
variables derived from the firing vector elements of the alert:
Label and annotation values can be templated using Go's template language.
The `$labels` variable holds the label key/value pairs of an alert instance
and `$value` holds the evaluated value of an alert instance.
# To insert a firing element's label values:
{{$labels.<labelname>}}
{{ $labels.<labelname> }}
# To insert the numeric expression value of the firing element:
{{$value}}
{{ $value }}
Examples:
# Alert for any instance that is unreachable for >5 minutes.
ALERT InstanceDown
IF up == 0
IF up == 0
FOR 5m
WITH {
severity="page"
LABELS { severity = "page" }
ANNOTATIONS {
summary = "Instance {{ $labels.instance }} down",
description = "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.",
}
SUMMARY "Instance {{$labels.instance}} down"
DESCRIPTION "{{$labels.instance}} of job {{$labels.job}} has been down for more than 5 minutes."
# Alert for any instance that have a median request latency >1s.
ALERT ApiHighRequestLatency
IF api_http_request_latencies_ms{quantile="0.5"} > 1000
ALERT APIHighRequestLatency
IF api_http_request_latencies_second{quantile="0.5"} > 1
FOR 1m
SUMMARY "High request latency on {{$labels.instance}}"
DESCRIPTION "{{$labels.instance}} has a median request latency above 1s (current value: {{$value}})"
ANNOTATIONS {
summary = "High request latency on {{ $labels.instance }}",
description = "{{ $labels.instance }} has a median request latency above 1s (current value: {{ $value }}s)",
}
### Inspecting alerts during runtime
To manually inspect which alerts are active (pending or firing), navigate to
the "Alerts" tab of your Prometheus instance. This will show you the exact
label sets for which each defined alert is currently active.
......@@ -74,6 +84,7 @@ transitions from active to inactive state. Once inactive, the time series does
not get further updates.
### Sending alert notifications
Prometheus's alerting rules are good at figuring what is broken *right now*,
but they are not a fully-fledged notification solution. Another layer is needed
to add summarization, notification rate limiting, silencing and alert
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment