Commit cb0080d4 authored Dec 23, 2015 by Fabian Reinartz
Document and overview for the new AM
parent a3986a81
Showing 1 changed file with 98 additions and 193 deletions.

content/docs/alerting/alertmanager.md (view file @ cb0080d4)
# Alertmanager
The Alertmanager handles alerts sent by client applications such as the
Prometheus server. It takes care of deduplicating, grouping, and routing
them to the correct receiver integration such as email, PagerDuty, or OpsGenie.
It also takes care of silencing and inhibition of alerts.
**WARNING: The Alertmanager is still considered to be very experimental.**
The following describes the core concepts the Alertmanager implements. Consult
the [configuration documentation](../configuration) to learn how to use them
in more detail.
## Grouping
Grouping categorizes alerts of similar nature into a single notification. This
is especially useful during larger outages when many systems fail at once and
hundreds to thousands of alerts may be firing simultaneously.
**Example:** Dozens or hundreds of instances of a service are running in your
cluster when a network partition occurs. Half of your service instances can no
longer reach the database. Alerting rules in Prometheus were configured to send
an alert for each service instance if it cannot communicate with the database.
As a result hundreds of alerts are sent to Alertmanager.
As a user one only wants to get a single page while still being able to see
exactly which service instances were affected. Thus one can configure
Alertmanager to group alerts by their cluster and alertname so it sends a
single compact notification.
Grouping of alerts, timing for the grouped notifications, and the receivers
of those notifications are configured by a routing tree in the configuration
file.
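
To make this concrete, here is a minimal sketch of what such a routing tree
could look like. It assumes the new YAML-based configuration format; the
receiver name and email address are invented for illustration, and the
authoritative schema is in the [configuration documentation](../configuration).

```yaml
# Sketch only: group alerts by cluster and alert name and send each
# group to a single (hypothetical) email receiver.
route:
  # Alerts whose cluster and alertname labels match are batched into
  # one compact notification.
  group_by: ['cluster', 'alertname']
  # Wait briefly so alerts arriving close together end up in the
  # same notification.
  group_wait: 30s
  receiver: 'team-pager'

receivers:
  - name: 'team-pager'
    email_configs:
      - to: 'team@example.org'
```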
## Inhibition
Inhibition is a concept of suppressing notifications for certain alerts if
certain other alerts are already firing.
**Example:** An alert is firing that informs that an entire cluster is not
reachable. Alertmanager can be configured to mute all other alerts concerning
this cluster if that particular alert is firing. This prevents notifications
for hundreds or thousands of firing alerts that are unrelated to the actual
issue.
Inhibitions are configured through the Alertmanager's configuration file.
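
For the cluster example above, a hypothetical inhibition rule might look like
the following sketch. The alert and label names are invented; consult the
[configuration documentation](../configuration) for the actual schema.

```yaml
inhibit_rules:
  # If an alert fires that marks an entire cluster as unreachable...
  - source_match:
      alertname: 'ClusterUnreachable'
    # ...mute other paging alerts...
    target_match:
      severity: 'page'
    # ...but only those that carry the same cluster label.
    equal: ['cluster']
```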
## Silences
Silences are a straightforward way to simply mute alerts for a given time.
A silence is configured based on matchers, just like the routing tree. Incoming
alerts are checked whether they match all the equality or regular expression
matchers of an active silence. If they do, no notifications will be sent out
for that alert.
Silences are configured in the web interface of the Alertmanager.
## Sending alerts
__Prometheus automatically takes care of sending alerts generated by its
configured [alerting rules](../rules). The following is general documentation
for clients.__
The Alertmanager listens for alerts on an API endpoint at `/api/v1/alerts`.
Clients are expected to continuously re-send alerts as long as they are still
active (usually on the order of 30 seconds to 3 minutes). Clients can push a
list of alerts to that endpoint via a POST request of the following format:
```json
[
  {
    "labels": {
      "<labelname>": "<labelvalue>",
      ...
    },
    "annotations": {
      "<labelname>": "<labelvalue>",
    },
    "startsAt": "<rfc3339>",
    "endsAt": "<rfc3339>",
    "generatorURL": "<generator_url>"
  },
  ...
]
```
This format is subject to change.
The labels are used to identify identical instances of an alert and to perform
deduplication. The annotations are always set to those received most recently
and do not identify an alert.
Both timestamps are optional. If `startsAt` is omitted, the current time is
assigned by the Alertmanager. `endsAt` is only set if the end time of an alert
is known. Otherwise it will be set to a configurable timeout period after the
time the alert was last received.
The `generatorURL` field is a unique back-link which identifies the causing
entity of this alert in the client.
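
As an illustration of the client behavior described above, here is a minimal
Python sketch that pushes one alert and re-sends it while it remains active.
The labels, annotations, URLs, port, and the `alert_is_active` helper are all
invented for the example; only the `/api/v1/alerts` endpoint and the payload
fields come from the documentation above.

```python
import json
import time
import urllib.request

# Hypothetical Alertmanager address; port 9093 is assumed here.
ALERTMANAGER_URL = "http://localhost:9093/api/v1/alerts"

alert = {
    "labels": {
        "alertname": "InstanceDown",
        "instance": "example1.example.org",
        "severity": "page",
    },
    "annotations": {
        "description": "example1.example.org is unreachable",
    },
    # startsAt is omitted, so the Alertmanager assigns the current time.
    # endsAt is omitted while the alert is still active.
    "generatorURL": "http://prometheus.example.org:9090/graph",
}

def alert_is_active():
    """Placeholder: replace with the client's own firing condition."""
    return True

# Continuously re-send the alert while it is active, as the API expects
# (here every 30 seconds).
while alert_is_active():
    request = urllib.request.Request(
        ALERTMANAGER_URL,
        data=json.dumps([alert]).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(request)
    time.sleep(30)
```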
Alertmanager also supports a legacy endpoint on `/api/alerts` which is
compatible with Prometheus versions 0.16 and lower.