Commit 24adaf62 authored by beorn7

Merge remote-tracking branch 'origin/master' into beorn7/doc-improve

Conflicts:
	content/docs/instrumenting/exporters.md
parents aa9e3f07 923feb9a
...@@ -11,3 +11,5 @@ crash.log ...@@ -11,3 +11,5 @@ crash.log
# OS X file # OS X file
static/.DS_Store static/.DS_Store
prometheus_rsa
language: ruby
branches:
only:
- master
script: make deploy
before_install:
- eval "$(ssh-agent -s)"
- openssl aes-256-cbc -K $encrypted_2ba894bc7c2f_key -iv $encrypted_2ba894bc7c2f_iv -in prometheus_rsa.enc -out prometheus_rsa -d
- chmod 600 prometheus_rsa
- ssh-add prometheus_rsa
...@@ -7,3 +7,4 @@ gem 'guard-nanoc' ...@@ -7,3 +7,4 @@ gem 'guard-nanoc'
gem 'nokogiri' gem 'nokogiri'
gem 'redcarpet' gem 'redcarpet'
gem 'pygments.rb' gem 'pygments.rb'
gem 'builder'
...@@ -3,6 +3,7 @@ GEM ...@@ -3,6 +3,7 @@ GEM
specs: specs:
adsf (1.2.0) adsf (1.2.0)
rack (>= 1.0.0) rack (>= 1.0.0)
builder (3.2.2)
celluloid (0.16.0) celluloid (0.16.0)
timers (~> 4.0.0) timers (~> 4.0.0)
coderay (1.1.0) coderay (1.1.0)
...@@ -57,6 +58,7 @@ PLATFORMS ...@@ -57,6 +58,7 @@ PLATFORMS
DEPENDENCIES DEPENDENCIES
adsf adsf
builder
guard-nanoc guard-nanoc
kramdown kramdown
nanoc nanoc
......
compile:
rm -rf output
bundle exec nanoc
deploy: github_pages_export github_pages_push
github_pages_export: compile
cd output && \
echo prometheus.io > CNAME && \
git init && \
git config user.name "Travis CI" && \
git config user.email "travis@prometheus.io" && \
git add . && \
git commit --message="Static site builder output"
github_pages_push:
cd output && \
git push -f git@github.com:prometheus/prometheus.github.io master
.PHONY: compile deploy github_pages_export github_pages_push
...@@ -25,6 +25,15 @@ route '/README/' do ...@@ -25,6 +25,15 @@ route '/README/' do
'/README.md' '/README.md'
end end
# RSS Feed
compile '/blog/feed/' do
filter :erb
end
route '/blog/feed/' do
'/blog/feed.xml'
end
compile '*' do compile '*' do
filter :erb filter :erb
......
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
title: Prometheus Monitoring Spreads through the Internet title: Prometheus Monitoring Spreads through the Internet
created_at: 2015-04-24 created_at: 2015-04-24
kind: article kind: article
author: Brian Brazil author_name: Brian Brazil
--- ---
It has been almost three months since we publicly announced Prometheus version It has been almost three months since we publicly announced Prometheus version
......
<%= atom_feed :title => 'Prometheus Blog', :author_name => '© Prometheus Authors 2015',
:author_uri => 'http://prometheus.io/blog/', :limit => 10,
:logo => 'http://prometheus.io/assets/prometheus_logo.png',
:icon => 'http://prometheus.io/assets/favicons/favicon.ico' %>
...@@ -5,7 +5,7 @@ title: Blog ...@@ -5,7 +5,7 @@ title: Blog
<% sorted_articles.each do |post| %> <% sorted_articles.each do |post| %>
<div class="blog doc-content"> <div class="blog doc-content">
<h1><%= link_to post[:title], post.path %></h1> <h1><%= link_to post[:title], post.path %></h1>
<aside>Posted at: <%= get_pretty_date(post) %> by <%= post[:author]%></aside> <aside>Posted at: <%= get_pretty_date(post) %> by <%= post[:author_name]%></aside>
<article class="doc-content"> <article class="doc-content">
<%= get_post_start(post) %> <%= get_post_start(post) %>
</article> </article>
......
---
title: Alertmanager
sort_rank: 2
nav_icon: sliders
---
# Alertmanager
The Alertmanager receives alerts from one or more Prometheus servers.
It manages those alerts, including silencing, inhibition, aggregation and
sending out notifications via methods such as email, PagerDuty and HipChat.
**WARNING: The Alertmanager is still considered to be very experimental.**
## Configuration
The Alertmanager is configured via command-line flags and a configuration file.
The configuration file is an ASCII protocol buffer. To specify which
configuration file to load, use the `-config.file` flag.
```
./alertmanager -config.file alertmanager.conf
```
To send all alerts to email, set the `-notification.smtp.smarthost` flag to
an SMTP smarthost (such as a [Postfix null client](http://www.postfix.org/STANDARD_CONFIGURATION_README.html#null_client))
and use the following configuration:
```
notification_config {
name: "alertmanager_test"
email_config {
email: "test@example.org"
}
}
aggregation_rule {
notification_config_name: "alertmanager_test"
}
```
### Filtering
An aggregation rule can be made to apply to only some alerts using a filter.
For example, to apply a rule only to alerts with a `severity` label with the value `page`:
```
aggregation_rule {
filter {
name_re: "severity"
value_re: "page"
}
notification_config_name: "alertmanager_test"
}
```
Multiple filters can be provided.
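As a rough illustration of how such filters could select alerts, the Ruby sketch below matches each filter's `name_re` and `value_re` against an alert's labels. It assumes that both patterns must match a label name and value in full, and that every filter of a rule must match. These semantics and the helper names `filter_matches?` and `rule_applies?` are assumptions made for illustration, not the Alertmanager's actual implementation.

```ruby
# Hypothetical sketch: decide whether an aggregation rule's filters select an alert.
# Assumption: name_re and value_re must each match the whole label name/value.

# A filter matches if at least one label satisfies both patterns.
def filter_matches?(filter, labels)
  name_re  = /\A(?:#{filter[:name_re]})\z/
  value_re = /\A(?:#{filter[:value_re]})\z/
  labels.any? { |name, value| name_re.match?(name) && value_re.match?(value) }
end

# Assumption: a rule applies only if all of its filters match.
# A rule without filters therefore applies to every alert.
def rule_applies?(filters, labels)
  filters.all? { |filter| filter_matches?(filter, labels) }
end

filters    = [{ name_re: "severity", value_re: "page" }]
page_alert = { "alertname" => "InstanceDown", "severity" => "page" }
info_alert = { "alertname" => "InstanceDown", "severity" => "info" }

puts rule_applies?(filters, page_alert) # true
puts rule_applies?(filters, info_alert) # false
```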
### Repeat Rate
By default an aggregation rule will repeat notifications every 2 hours. This can be changed using `repeat_rate_seconds`.
```
aggregation_rule {
repeat_rate_seconds: 3600
notification_config_name: "alertmanager_test"
}
```
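The effect of the repeat rate can be pictured with the following Ruby sketch. It only illustrates the timing rule described above; the function name `notify_again?` and the exact boundary behavior (notifying once the full interval has elapsed) are assumptions.

```ruby
# Hypothetical sketch: should a still-firing alert be notified again?
DEFAULT_REPEAT_RATE_SECONDS = 2 * 60 * 60 # the 2-hour default described above

# last_sent and now are Time objects; last_sent is nil if nothing was sent yet.
def notify_again?(last_sent, now, repeat_rate_seconds = DEFAULT_REPEAT_RATE_SECONDS)
  return true if last_sent.nil?

  now - last_sent >= repeat_rate_seconds
end

sent = Time.at(0)
puts notify_again?(sent, Time.at(3600))       # false (default of 2 hours)
puts notify_again?(sent, Time.at(3600), 3600) # true  (repeat_rate_seconds: 3600)
```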
### Notifications
The Alertmanager has support for a growing number of notification methods.
Multiple notification methods of one or more types can be used in the same
notification config.
The `send_resolved` field can be used with all notification methods to enable or disable
sending notifications that an alert has stopped firing.
#### Email
The `-notification.smtp.smarthost` flag must be set to an SMTP smarthost.
The `-notification.smtp.sender` flag may be set to change the default From address.
```
notification_config {
name: "alertmanager_email"
email_config {
email: "test@example.org"
}
email_config {
email: "foo@example.org"
}
}
```
Plain and CRAM-MD5 SMTP authentication methods are supported.
The `SMTP_AUTH_USERNAME`, `SMTP_AUTH_SECRET`, `SMTP_AUTH_PASSWORD` and
`SMTP_AUTH_IDENTITY` environment variables are used to configure them.
#### PagerDuty
The Alertmanager integrates as a [Generic API
Service](https://support.pagerduty.com/hc/en-us/articles/202830340-Creating-a-Generic-API-Service)
with PagerDuty.
```
notification_config {
name: "alertmanager_pagerduty"
pagerduty_config {
service_key: "supersecretapikey"
}
}
```
#### Pushover
```
notification_config {
name: "alertmanager_pushover"
pushover_config {
token: "mypushovertoken"
user_key: "mypushoverkey"
}
}
```
#### HipChat
```
notification_config {
name: "alertmanager_hipchat"
hipchat_config {
auth_token: "hipchatauthtoken"
room_id: 123456
}
}
```
#### Slack
```
notification_config {
name: "alertmanager_slack"
slack_config {
webhook_url: "webhookurl"
channel: "channelname"
}
}
```
#### Flowdock
```
notification_config {
name: "alertmanager_flowdock"
flowdock_config {
api_token: "4c7234902348234902384234234cdb59"
from_address: "aliaswithgravatar@example.com"
tag: "monitoring"
}
}
```
#### Generic Webhook
The Alertmanager supports sending notifications as JSON to arbitrary
URLs. This could be used to perform automated actions when an
alert fires or integrate with a system that the Alertmanager does not support.
```
notification_config {
name: "alertmanager_webhook"
webhook_config {
url: "http://example.org/my/hook"
}
}
```
An example of the JSON message it sends is shown below.
```json
{
"version": "1",
"status": "firing",
"alert": [
{
"summary": "summary",
"description": "description",
"labels": {
"alertname": "TestAlert"
},
"payload": {
"activeSince": "2015-06-01T12:55:47.356+01:00",
"alertingRule": "ALERT TestAlert IF absent(metric_name) FOR 0y WITH ",
"generatorURL": "http://localhost:9090/graph#%5B%7B%22expr%22%3A%22absent%28metric_name%29%22%2C%22tab%22%3A0%7D%5D",
"value": "1"
}
}
]
}
```
This format is subject to change.
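A receiver for such messages might process them along the following lines. This Ruby sketch only parses a message body and produces one summary line per alert; the function name `summarize_webhook` is hypothetical, and the field names are taken from the example message above. Because the format may change, missing fields fall back to defaults instead of raising errors.

```ruby
require "json"

# Hypothetical handler: turn a webhook body into one summary line per alert.
def summarize_webhook(body)
  message = JSON.parse(body)
  status  = message["status"]
  message.fetch("alert", []).map do |alert|
    name = alert.fetch("labels", {}).fetch("alertname", "unknown")
    "#{status}: #{name} - #{alert['summary']}"
  end
end

body = <<~PAYLOAD
  {
    "version": "1",
    "status": "firing",
    "alert": [
      {
        "summary": "summary",
        "description": "description",
        "labels": { "alertname": "TestAlert" },
        "payload": { "value": "1" }
      }
    ]
  }
PAYLOAD

puts summarize_webhook(body) # firing: TestAlert - summary
```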
---
title: Alerting
sort_rank: 7
nav_icon: bell-o
---
---
title: Alerting Overview
sort_rank: 1
nav_icon: sliders
---
# Alerting Overview
Alerting with Prometheus is separated into two parts. Alerting rules in
Prometheus servers send alerts to an Alertmanager. The Alertmanager then
manages those alerts, including silencing, inhibition, aggregation and sending
out notifications via methods such as email, PagerDuty and HipChat.
**WARNING: The Alertmanager is still considered to be very experimental.**
The main steps to setting up alerting and notifications are:
* Set up and configure the Alertmanager
* Configure Prometheus to talk to the Alertmanager with the `-alertmanager.url` flag
* Create alerting rules in Prometheus
---
title: Alerting rules
sort_rank: 3
---
# Alerting rules
Alerting rules allow you to define alert conditions based on Prometheus
expression language expressions and to send notifications about firing alerts
to an external service. Whenever the alert expression results in one or more
vector elements at a given point in time, the alert counts as active for these
elements' label sets.
Alerting rules are configured in Prometheus in the same way as [recording
rules](../../querying/rules).
### Defining alerting rules
Alerting rules are defined in the following syntax:
ALERT <alert name>
IF <expression>
[FOR <duration>]
[WITH <label set>]
SUMMARY "<summary template>"
DESCRIPTION "<description template>"
The optional `FOR` clause causes Prometheus to wait for a certain duration
between first encountering a new expression output vector element (like an
instance with a high HTTP error rate) and counting an alert as firing for this
element. Elements that are active, but not firing yet, are in pending state.
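The resulting states can be sketched in Ruby as follows. This is a simplified model, not Prometheus's implementation: the function name `alert_state` is hypothetical, and treating an element as firing exactly when the `FOR` duration has fully elapsed is an assumption.

```ruby
# Hypothetical sketch of the alert states described above.
# active_since: Time at which the expression first returned the element,
#               or nil if the element is not currently returned.
# for_seconds:  the FOR duration in seconds (0 when the clause is omitted).
def alert_state(active_since, now, for_seconds = 0)
  return :inactive if active_since.nil?

  now - active_since >= for_seconds ? :firing : :pending
end

start = Time.at(0)
puts alert_state(nil, Time.at(60), 300)   # inactive
puts alert_state(start, Time.at(60), 300) # pending
puts alert_state(start, Time.at(300), 300) # firing
```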
The `WITH` clause allows specifying a set of additional labels to be attached
to the alert. Any existing conflicting labels will be overwritten.
The `SUMMARY` should be a short, human-readable summary of the alert (suitable
for e.g. an email subject line), while the `DESCRIPTION` clause should provide
a longer description. Both string fields allow the inclusion of template
variables derived from the firing vector elements of the alert:
# To insert a firing element's label values:
{{$labels.<labelname>}}
# To insert the numeric expression value of the firing element:
{{$value}}
Examples:
# Alert for any instance that is unreachable for >5 minutes.
ALERT InstanceDown
IF up == 0
FOR 5m
WITH {
severity="page"
}
SUMMARY "Instance {{$labels.instance}} down"
DESCRIPTION "{{$labels.instance}} of job {{$labels.job}} has been down for more than 5 minutes."
# Alert for any instance that has a median request latency >1s.
ALERT ApiHighRequestLatency
IF api_http_request_latencies_ms{quantile="0.5"} > 1000
FOR 1m
SUMMARY "High request latency on {{$labels.instance}}"
DESCRIPTION "{{$labels.instance}} has a median request latency above 1s (current value: {{$value}})"
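To make the effect of the template variables concrete, the following Ruby sketch substitutes `{{$labels.<labelname>}}` and `{{$value}}` with plain string replacement. It is a deliberately simplified stand-in and not the template engine Prometheus uses; the function name `expand_template` and the sample label values are hypothetical.

```ruby
# Simplified stand-in for template expansion; not Prometheus's template engine.
# Unknown labels expand to an empty string in this sketch.
def expand_template(template, labels, value)
  template
    .gsub(/\{\{\s*\$labels\.(\w+)\s*\}\}/) { labels.fetch(Regexp.last_match(1), "") }
    .gsub(/\{\{\s*\$value\s*\}\}/) { value.to_s }
end

labels = { "instance" => "host1:9100", "job" => "node" }

puts expand_template('Instance {{$labels.instance}} down', labels, 0)
# Instance host1:9100 down
puts expand_template('{{$labels.instance}} of job {{$labels.job}} (current value: {{$value}})', labels, 1500)
# host1:9100 of job node (current value: 1500)
```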
### Inspecting alerts during runtime
To manually inspect which alerts are active (pending or firing), navigate to
the "Alerts" tab of your Prometheus instance. This will show you the exact
label sets for which each defined alert is currently active.
For pending and firing alerts, Prometheus also stores synthetic time series of
the form `ALERTS{alertname="<alert name>", alertstate="pending|firing", <additional alert labels>}`.
The sample value is set to `1` as long as the alert is in the indicated active
(pending or firing) state, and a single `0` value gets written out when an alert
transitions from active to inactive state. Once inactive, the time series does
not get further updates.
### Sending alert notifications
Prometheus's alerting rules are good at figuring out what is broken *right now*,
but they are not a fully-fledged notification solution. Another layer is needed
to add summarization, notification rate limiting, silencing and alert
dependencies on top of the simple alert definitions. In Prometheus's ecosystem,
the [Alertmanager](../alertmanager) takes on this
role. Thus, Prometheus may be configured to periodically send information about
alert states to an Alertmanager instance, which then takes care of dispatching
the right notifications. The Alertmanager instance may be configured via the
`-alertmanager.url` command line flag.
...@@ -5,11 +5,12 @@ sort_rank: 2 ...@@ -5,11 +5,12 @@ sort_rank: 2
# Metric types # Metric types
The Prometheus client libraries offer three core metric types: The Prometheus client libraries offer four core metric types:
* Counters * Counter
* Gauges * Gauge
* Summaries * Histogram
* Summary
These metric types are currently only differentiated in the client libraries These metric types are currently only differentiated in the client libraries
(to enable APIs tailored to the usage of the specific types) and in the wire (to enable APIs tailored to the usage of the specific types) and in the wire
......
...@@ -46,6 +46,7 @@ hosted outside of the Prometheus GitHub organization. ...@@ -46,6 +46,7 @@ hosted outside of the Prometheus GitHub organization.
* [Minecraft exporter module](https://github.com/Baughn/PrometheusIntegration) * [Minecraft exporter module](https://github.com/Baughn/PrometheusIntegration)
* [MongoDB exporter](https://github.com/dcu/mongodb_exporter) * [MongoDB exporter](https://github.com/dcu/mongodb_exporter)
* [Munin exporter](https://github.com/pvdh/munin_exporter) * [Munin exporter](https://github.com/pvdh/munin_exporter)
* [New Relic exporter](https://github.com/jfindley/newrelic_exporter)
* [Redis exporter](https://github.com/oliver006/redis_exporter) * [Redis exporter](https://github.com/oliver006/redis_exporter)
* [RethinkDB exporter](https://github.com/oliver006/rethinkdb_exporter) * [RethinkDB exporter](https://github.com/oliver006/rethinkdb_exporter)
* [scollector exporter](https://github.com/tgulacsi/prometheus_scollector) * [scollector exporter](https://github.com/tgulacsi/prometheus_scollector)
...@@ -61,3 +62,4 @@ separate exporters are needed: ...@@ -61,3 +62,4 @@ separate exporters are needed:
* [gokit](https://github.com/peterbourgon/gokit) * [gokit](https://github.com/peterbourgon/gokit)
* [Kubernetes-Mesos](https://github.com/mesosphere/kubernetes-mesos) * [Kubernetes-Mesos](https://github.com/mesosphere/kubernetes-mesos)
* [Kubernetes](https://github.com/GoogleCloudPlatform/kubernetes) * [Kubernetes](https://github.com/GoogleCloudPlatform/kubernetes)
* [RobustIRC](http://robustirc.net/)
...@@ -79,15 +79,29 @@ Prometheus is released under the ...@@ -79,15 +79,29 @@ Prometheus is released under the
After extensive research it has been determined that the correct plural of After extensive research it has been determined that the correct plural of
'Prometheus' is 'Prometheis'. 'Prometheus' is 'Prometheis'.
### Can I reload Prometheus's configuration?
Yes, sending SIGHUP to the Prometheus process will reload
and apply the configuration file. The different components attempt
to handle failing changes gracefully.
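One way to picture graceful handling of a failing change is to keep the previous configuration whenever the new one cannot be loaded. The Ruby sketch below illustrates that idea only; it is not how Prometheus is implemented, and the function name `reload_config` and the requirement that the file be a YAML mapping are assumptions.

```ruby
require "yaml"

# Hypothetical sketch: return the new configuration if it loads, else keep the old one.
def reload_config(current, new_text)
  candidate = YAML.safe_load(new_text)
  raise ArgumentError, "configuration must be a mapping" unless candidate.is_a?(Hash)

  candidate
rescue Psych::SyntaxError, ArgumentError => e
  warn "reload failed, keeping previous configuration: #{e.message}"
  current
end

# In a long-running process, a SIGHUP handler (for example via Signal.trap("HUP"))
# could set a flag that makes the main loop call reload_config with the file contents.

config = { "global" => { "scrape_interval" => "15s" } }
config = reload_config(config, "global:\n  scrape_interval: 30s\n")
puts config["global"]["scrape_interval"] # 30s

config = reload_config(config, "global: [unclosed")
puts config["global"]["scrape_interval"] # 30s (broken file was ignored)
```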
### Can I send alerts? ### Can I send alerts?
Yes, with the experimental [Alertmanager](https://github.com/prometheus/alertmanager). Yes, with the experimental [Alertmanager](https://github.com/prometheus/alertmanager).
[PagerDuty](http://www.pagerduty.com/) and email are supported.
Currently, the following external systems are supported:
* Email
* Generic Webhooks
* [PagerDuty](http://www.pagerduty.com/)
* [HipChat](https://www.hipchat.com/)
* [Slack](https://slack.com/)
* [Pushover](https://pushover.net/)
* [Flowdock](https://www.flowdock.com/)
### Can I create dashboards? ### Can I create dashboards?
Yes, with [PromDash](/docs/visualization/promdash/) and [Console Yes, with [PromDash](/docs/visualization/promdash/) and [Console templates](/docs/visualization/consoles/). There is also early support for querying Prometheus servers from [Grafana](/docs/visualization/grafana/).
templates](/docs/visualization/consoles/).
### Can I change the timezone? Why is everything in UTC? ### Can I change the timezone? Why is everything in UTC?
...@@ -103,7 +117,7 @@ for the current state of this effort. ...@@ -103,7 +117,7 @@ for the current state of this effort.
### Which languages have instrumentation libraries? ### Which languages have instrumentation libraries?
Currently there are client libraries for: Currently, there are client libraries for:
* [Go](https://github.com/prometheus/client_golang) * [Go](https://github.com/prometheus/client_golang)
* [Java or Scala](https://github.com/prometheus/client_java) * [Java or Scala](https://github.com/prometheus/client_java)
......
...@@ -47,10 +47,10 @@ two examples. ...@@ -47,10 +47,10 @@ two examples.
### Volumes & bind-mount ### Volumes & bind-mount
Bind-mount your prometheus.conf from the host by running: Bind-mount your prometheus.yml from the host by running:
``` ```
docker run -p 9090:9090 -v /tmp/prometheus.conf:/etc/prometheus/prometheus.conf \ docker run -p 9090:9090 -v /tmp/prometheus.yml:/etc/prometheus/prometheus.yml \
prom/prometheus prom/prometheus
``` ```
...@@ -58,7 +58,7 @@ Or use an additional volume for the config: ...@@ -58,7 +58,7 @@ Or use an additional volume for the config:
``` ```
docker run -p 9090:9090 -v /prometheus-data \ docker run -p 9090:9090 -v /prometheus-data \
prom/prometheus -config.file=/prometheus-data/prometheus.conf prom/prometheus -config.file=/prometheus-data/prometheus.yml
``` ```
### Custom image ### Custom image
...@@ -73,7 +73,7 @@ Dockerfile like this: ...@@ -73,7 +73,7 @@ Dockerfile like this:
``` ```
FROM prom/prometheus FROM prom/prometheus
ADD prometheus.conf /etc/prometheus/ ADD prometheus.yml /etc/prometheus/
``` ```
Now build and run it: Now build and run it:
......
...@@ -38,7 +38,7 @@ optional: ...@@ -38,7 +38,7 @@ optional:
* a [push gateway](https://github.com/prometheus/pushgateway) for supporting short-lived jobs * a [push gateway](https://github.com/prometheus/pushgateway) for supporting short-lived jobs
* a [GUI-based dashboard builder](/docs/visualization/promdash/) based on Rails/SQL * a [GUI-based dashboard builder](/docs/visualization/promdash/) based on Rails/SQL
* special-purpose [exporters](/docs/instrumenting/exporters/) (for HAProxy, StatsD, Ganglia, etc.) * special-purpose [exporters](/docs/instrumenting/exporters/) (for HAProxy, StatsD, Ganglia, etc.)
* an (experimental) [alert manager](https://github.com/prometheus/alertmanager) * an (experimental) [alertmanager](https://github.com/prometheus/alertmanager)
* a [command-line querying tool](https://github.com/prometheus/prometheus_cli) * a [command-line querying tool](https://github.com/prometheus/prometheus_cli)
* various support tools * various support tools
......
...@@ -27,19 +27,8 @@ GitHub issue: [#9](https://github.com/prometheus/prometheus/issues/9) ...@@ -27,19 +27,8 @@ GitHub issue: [#9](https://github.com/prometheus/prometheus/issues/9)
Currently Prometheus supports configuring static HTTP targets, as well as Currently Prometheus supports configuring static HTTP targets, as well as
discovering targets dynamically via [DNS SRV discovering targets dynamically via [DNS SRV
records](http://en.wikipedia.org/wiki/SRV_record). We plan to support more records](http://en.wikipedia.org/wiki/SRV_record) and [Consul](https://www.consul.io/). There is also a file-based interface that allows you to connect your own discovery mechanisms. We plan to natively support more
types of service discovery (e.g. Consul or Zookeeper) in the future. Some will types of service discovery (e.g. Zookeeper) in the future.
be implemented natively, but we may also add a plugin system for arbitrary
discovery mechanisms.
### Restartless configuration changes
Currently Prometheus requires a restart after any configuration or rule file
change. This can mean monitoring interruptions for short periods of time. In
the future, we want to support reloading configuration changes without having
to restart Prometheus.
GitHub issue: [#108](https://github.com/prometheus/prometheus/issues/108)
### Long-term storage ### Long-term storage
......
...@@ -110,6 +110,10 @@ dns_sd_configs: ...@@ -110,6 +110,10 @@ dns_sd_configs:
consul_sd_configs: consul_sd_configs:
[ - <consul_sd_config> ... ] [ - <consul_sd_config> ... ]
# List of Zookeeper Serverset service discovery configurations.
serverset_sd_configs:
[ - <serverset_sd_config> ... ]
# List of file service discovery configurations. # List of file service discovery configurations.
file_sd_configs: file_sd_configs:
[ - <file_sd_config> ... ] [ - <file_sd_config> ... ]
...@@ -118,9 +122,13 @@ file_sd_configs: ...@@ -118,9 +122,13 @@ file_sd_configs:
target_groups: target_groups:
[ - <target_group> ... ] [ - <target_group> ... ]
# List of relabel configurations. # List of target relabel configurations.
relabel_configs: relabel_configs:
[ - <relabel_config> ... ] [ - <relabel_config> ... ]
# List of metric relabel configurations.
metric_relabel_configs:
[ - <relabel_config> ... ]
``` ```
Where `<scheme>` may be `http` or `https` and `<path>` is a valid URL path. Where `<scheme>` may be `http` or `https` and `<path>` is a valid URL path.
...@@ -198,6 +206,34 @@ services: ...@@ -198,6 +206,34 @@ services:
[ tag_separator: <string> | default = , ] [ tag_separator: <string> | default = , ]
``` ```
### Zookeeper Serverset SD configurations `<serverset_sd_config>`
Serverset SD configurations allow retrieving scrape targets from
[Serversets](https://github.com/twitter/finagle/tree/master/finagle-serversets), which are
stored in [Zookeeper](https://zookeeper.apache.org/). Serversets are commonly
used by [Finagle](https://twitter.github.io/finagle/) and
[Aurora](http://aurora.apache.org/).
The following meta labels are available on targets during relabeling:
* `__meta_serverset_path`: the full path to the serverset member node in Zookeeper
* `__meta_serverset_endpoint_host`: the host of the default endpoint
* `__meta_serverset_endpoint_port`: the port of the default endpoint
* `__meta_serverset_endpoint_host_<endpoint>`: the host of the given endpoint
* `__meta_serverset_endpoint_port_<endpoint>`: the port of the given endpoint
* `__meta_serverset_status`: the status of the member
```
# The Zookeeper servers.
servers:
- <host>
# Paths can point to a single serverset, or the root of a tree of serversets.
paths:
- <string>
[ timeout: <duration> | default = 10s ]
```
Serverset data must be in the JSON format; the Thrift format is not currently supported.
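The mapping from a serverset member to the meta labels listed above can be sketched in Ruby as follows. The JSON field names (`serviceEndpoint`, `additionalEndpoints`, `status`) reflect the commonly used serverset member layout and are an assumption here, as are the function name `serverset_meta_labels` and the sample path and addresses.

```ruby
require "json"

# Hypothetical sketch: derive meta labels from one serverset member node.
# Assumption: the member JSON uses "serviceEndpoint", "additionalEndpoints" and "status".
def serverset_meta_labels(path, member_json)
  member  = JSON.parse(member_json)
  default = member.fetch("serviceEndpoint")
  labels = {
    "__meta_serverset_path" => path,
    "__meta_serverset_endpoint_host" => default["host"].to_s,
    "__meta_serverset_endpoint_port" => default["port"].to_s,
    "__meta_serverset_status" => member["status"].to_s
  }
  # Every named endpoint contributes its own host and port label.
  member.fetch("additionalEndpoints", {}).each do |name, endpoint|
    labels["__meta_serverset_endpoint_host_#{name}"] = endpoint["host"].to_s
    labels["__meta_serverset_endpoint_port_#{name}"] = endpoint["port"].to_s
  end
  labels
end

member = '{"serviceEndpoint":{"host":"10.0.0.1","port":31000},' \
         '"additionalEndpoints":{"http":{"host":"10.0.0.1","port":31001}},' \
         '"status":"ALIVE"}'

serverset_meta_labels("/example/serverset/member_0000000001", member).each do |name, value|
  puts "#{name}=#{value}"
end
```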
### File-based SD configurations `<file_sd_config>` ### File-based SD configurations `<file_sd_config>`
...@@ -239,7 +275,7 @@ Where `<filename_pattern>` may be a path ending in `.json`, `.yml` or `.yaml`. T ...@@ -239,7 +275,7 @@ Where `<filename_pattern>` may be a path ending in `.json`, `.yml` or `.yaml`. T
may contain a single `*` that matches any character sequence, e.g. `my/path/tg_*.json`. may contain a single `*` that matches any character sequence, e.g. `my/path/tg_*.json`.
### Relabeling `<relabel_config>` ### Target relabeling `<relabel_config>`
Relabeling is a powerful tool to dynamically rewrite the label set of a target before Relabeling is a powerful tool to dynamically rewrite the label set of a target before
it gets scraped. Multiple relabeling steps can be configured per scrape configuration. it gets scraped. Multiple relabeling steps can be configured per scrape configuration.
...@@ -290,3 +326,11 @@ regex: <regex> ...@@ -290,3 +326,11 @@ regex: <regex>
(`${1}`, `${2}`, ...) in `replacement` substituted by their value. (`${1}`, `${2}`, ...) in `replacement` substituted by their value.
* `keep`: Drop targets for which `regex` does not match the concatenated `source_labels`. * `keep`: Drop targets for which `regex` does not match the concatenated `source_labels`.
* `drop`: Drop targets for which `regex` matches the concatenated `source_labels`. * `drop`: Drop targets for which `regex` matches the concatenated `source_labels`.
### Metric relabeling `<metric_relabel_configs>`
Metric relabeling is applied to samples as the last step before ingestion. It
has the same configuration format and actions as target relabeling. Metric
relabeling does not apply to automatically generated time series such as `up`.
One use for this is to blacklist time series that are too expensive to ingest.
--- ---
title: Best practices title: Best practices
sort_rank: 7 sort_rank: 8
nav_icon: thumbs-o-up nav_icon: thumbs-o-up
--- ---
...@@ -149,6 +149,23 @@ for quantiles located in the lowest bucket. ...@@ -149,6 +149,23 @@ for quantiles located in the lowest bucket.
If `b` contains fewer than two buckets, `NaN` is returned. For φ < 0, `-Inf` is If `b` contains fewer than two buckets, `NaN` is returned. For φ < 0, `-Inf` is
returned. For φ > 1, `+Inf` is returned. returned. For φ > 1, `+Inf` is returned.
## `increase()`
`increase(v range-vector)` calculates the increase in the
time series in the range vector. Breaks in monotonicity (such as counter
resets due to target restarts) are automatically adjusted for.
The following example expression returns the number of HTTP requests as measured
over the last 5 minutes, per time series in the range vector:
```
increase(http_requests_total{job="api-server"}[5m])
```
`increase` should only be used with counters. It should be used primarily for
human readability. Use `rate` in recording rules so that increases are tracked
consistently on a per-second basis.
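The relationship between `increase` and `rate`, including the adjustment for counter resets, can be sketched in Ruby as follows. This is a simplified model that works on raw sample values only and performs no extrapolation to the boundaries of the range; the sample data is made up, and a drop in value is assumed to mean the counter restarted from zero.

```ruby
# Simplified sketch of increase(): sum the growth between consecutive samples,
# treating any decrease as a counter reset (the counter restarted from zero).
def increase(samples)
  samples.each_cons(2).sum do |previous, current|
    current >= previous ? current - previous : current
  end
end

# rate() expresses the same increase on a per-second basis.
def rate(samples, range_seconds)
  increase(samples) / range_seconds.to_f
end

samples = [100, 130, 160, 10, 40] # the counter resets between 160 and 10

puts increase(samples) # 100
puts rate(samples, 300)
```

Here the increase is 30 + 30 + 10 + 30 = 100 over the 5-minute range, which corresponds to a rate of roughly 0.33 per second.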
## `ln()` ## `ln()`
`ln(v instant-vector)` calculates the natural logarithm for all elements in `v`. `ln(v instant-vector)` calculates the natural logarithm for all elements in `v`.
...@@ -171,7 +188,7 @@ The special cases are equivalent to those in `ln`. ...@@ -171,7 +188,7 @@ The special cases are equivalent to those in `ln`.
## `rate()` ## `rate()`
`rate(v range-vector)` calculate the per-second average rate of increase of the `rate(v range-vector)` calculates the per-second average rate of increase of the
time series in the range vector. Breaks in monotonicity (such as counter time series in the range vector. Breaks in monotonicity (such as counter
resets due to target restarts) are automatically adjusted for. resets due to target restarts) are automatically adjusted for.
......
--- ---
title: Recording and alerting rules title: Recording rules
sort_rank: 6 sort_rank: 6
--- ---
# Defining recording and alerting rules # Defining recording rules
## Configuring rules ## Configuring rules
Prometheus supports two types of rules which may be configured and then Prometheus supports two types of rules which may be configured and then
evaluated at regular intervals: recording rules and alerting rules. To include evaluated at regular intervals: recording rules and [alerting
rules in Prometheus, create a file containing the necessary rule statements and rules](../../alerting/rules). To include rules in Prometheus, create a file
have Prometheus load the file via the `rule_files` field in the [Prometheus containing the necessary rule statements and have Prometheus load the file via
configuration](/docs/operating/configuration). the `rule_files` field in the [Prometheus configuration](/docs/operating/configuration).
The rule files can be reloaded at runtime by sending `SIGHUP` to the Prometheus The rule files can be reloaded at runtime by sending `SIGHUP` to the Prometheus
process. The changes are only applied if all rule files are well-formatted. process. The changes are only applied if all rule files are well-formatted.
...@@ -62,81 +62,3 @@ evaluation cycle, the right-hand-side expression of the rule statement is ...@@ -62,81 +62,3 @@ evaluation cycle, the right-hand-side expression of the rule statement is
evaluated at the current instant in time and the resulting sample vector is evaluated at the current instant in time and the resulting sample vector is
stored as a new set of time series with the current timestamp and a new metric stored as a new set of time series with the current timestamp and a new metric
name (and perhaps an overridden set of labels). name (and perhaps an overridden set of labels).
## Alerting rules
Alerting rules allow you to define alert conditions based on Prometheus
expression language expressions and to send notifications about firing alerts
to an external service. Whenever the alert expression results in one or more
vector elements at a given point in time, the alert counts as active for these
elements' label sets.
### Defining alerting rules
Alerting rules are defined in the following syntax:
ALERT <alert name>
IF <expression>
[FOR <duration>]
WITH <label set>
SUMMARY "<summary template>"
DESCRIPTION "<description template>"
The optional `FOR` clause causes Prometheus to wait for a certain duration
between first encountering a new expression output vector element (like an
instance with a high HTTP error rate) and counting an alert as firing for this
element. Elements that are active, but not firing yet, are in pending state.
The `WITH` clause allows specifying a set of additional labels to be attached
to the alert. Any existing conflicting labels will be overwritten.
The `SUMMARY` should be a short, human-readable summary of the alert (suitable
for e.g. an email subject line), while the `DESCRIPTION` clause should provide
a longer description. Both string fields allow the inclusion of template
variables derived from the firing vector elements of the alert:
# To insert a firing element's label values:
{{$labels.<labelname>}}
# To insert the numeric expression value of the firing element:
{{$value}}
Examples:
# Alert for any instance that is unreachable for >5 minutes.
ALERT InstanceDown
IF up == 0
FOR 5m
WITH {
severity="page"
}
SUMMARY "Instance {{$labels.instance}} down"
DESCRIPTION "{{$labels.instance}} of job {{$labels.job}} has been down for more than 5 minutes."
# Alert for any instance that have a median request latency >1s.
ALERT ApiHighRequestLatency
IF api_http_request_latencies_ms{quantile="0.5"} > 1000
FOR 1m
WITH {}
SUMMARY "High request latency on {{$labels.instance}}"
DESCRIPTION "{{$labels.instance}} has a median request latency above 1s (current value: {{$value}})"
### Inspecting alerts during runtime
To manually inspect which alerts are active (pending or firing), navigate to
the "Alerts" tab of your Prometheus instance. This will show you the exact
label sets for which each defined alert is currently active.
For pending and firing alerts, Prometheus also stores synthetic time series of
the form `ALERTS{alertname="<alert name>", alertstate="pending|firing", <additional alert labels>}`.
The sample value is set to `1` as long as the alert is in the indicated active
(pending or firing) state, and a single `0` value gets written out when an alert
transitions from active to inactive state. Once inactive, the time series does
not get further updates.
### Sending alert notifications
Prometheus's alerting rules are good at figuring what is broken *right now*,
but they are not a fully-fledged notification solution. Another layer is needed
to add summarization, notification rate limiting, silencing and alert
dependencies on top of the simple alert definitions. In Prometheus's ecosystem,
the [Alert Manager](https://github.com/prometheus/alertmanager) takes on this
role. Thus, Prometheus may be configured to periodically send information about
alert states to an Alert Manager instance, which then takes care of dispatching
the right notifications. The Alert Manager instance may be configured via the
`-alertmanager.url` command line flag.
...@@ -129,6 +129,9 @@ interpolated version of the given format string. To reference specific label ...@@ -129,6 +129,9 @@ interpolated version of the given format string. To reference specific label
values in the format string, use double curly braces: `{{label-name}}`. For values in the format string, use double curly braces: `{{label-name}}`. For
example: `{{host}} - cluster {{cluster}}`. example: `{{host}} - cluster {{cluster}}`.
Format strings support filters. See the Filters section below for a list of
currently available filters, expected inputs, and outputs.
### Link to graph ### Link to graph
The "Link to this graph" menu tab allows you to generate a link to a specific The "Link to this graph" menu tab allows you to generate a link to a specific
graph. This link will show the graph in a single-widget fullscreen view as it graph. This link will show the graph in a single-widget fullscreen view as it
...@@ -189,6 +192,24 @@ In the example of the host dashboard, the URL could look like this: ...@@ -189,6 +192,24 @@ In the example of the host dashboard, the URL could look like this:
http://promdash.somedomain.int/hoststats#!?var.host=myhost http://promdash.somedomain.int/hoststats#!?var.host=myhost
Template variables support filters. See the Filters section below for a list of
currently available filters, expected inputs, and outputs.
## Filters
Filters can be used in all places where variable interpolation is supported,
e.g. in legend format strings or template variables. The format is `{{variable
| filter}}` and the following filters are currently available:
- `toPercent`: Input: `0.5`; Output: `50%`
- `toPercentile`: Input: `0.5`; Output: `50th`
- `hostnameFqdn`: Input: `http://your-prometheus-endpoint.net:1111/`; Output: `your-prometheus-endpoint.net:1111`
- `hostname`: Input: `http://your-prometheus-endpoint.net:1111/`; Output: `your-prometheus-endpoint`
- `regex`: If `job` == `prometheus`, `{{job | regex:"pro":"faux"}}` => `fauxmetheus`
Filters are chainable, so `{{label | filter1 | filter2}}` will apply `filter1`
to `label`, and then apply `filter2` to that result.
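The chaining behavior can be modeled with the following Ruby sketch. PromDash's own filters are not written in Ruby; the `FILTERS` table and the function `apply_filters` are hypothetical stand-ins that reproduce the inputs and outputs listed above, and replacing every occurrence in the `regex` filter is an assumption.

```ruby
require "uri"

# Hypothetical Ruby stand-ins for some of the filters listed above.
FILTERS = {
  "toPercent"    => ->(v) { "#{(v.to_f * 100).round}%" },
  "hostnameFqdn" => ->(v) { uri = URI.parse(v); "#{uri.host}:#{uri.port}" },
  "hostname"     => ->(v) { URI.parse(v).host.split(".").first },
  "regex"        => ->(v, pattern, replacement) { v.gsub(Regexp.new(pattern), replacement) }
}.freeze

# Apply each filter in order, feeding every result into the next filter.
# Each chain entry is a filter name followed by its optional arguments.
def apply_filters(value, chain)
  chain.reduce(value) { |result, (name, *args)| FILTERS.fetch(name).call(result, *args) }
end

url = "http://your-prometheus-endpoint.net:1111/"

puts apply_filters("0.5", [["toPercent"]])                   # 50%
puts apply_filters(url, [["hostnameFqdn"]])                  # your-prometheus-endpoint.net:1111
puts apply_filters("prometheus", [["regex", "pro", "faux"]]) # fauxmetheus
puts apply_filters(url, [["hostname"], ["regex", "your-", "my-"]]) # my-prometheus-endpoint
```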
## Annotations ## Annotations
PromDash allows you to load timestamped annotations from an external service PromDash allows you to load timestamped annotations from an external service
......
<% render 'default' do %> <% render 'default' do %>
<div class="col-md-9 blog doc-content"> <div class="col-md-9 blog doc-content">
<h1><%= item[:title] %></h1> <h1><%= item[:title] %></h1>
<aside>Posted at: <%= get_pretty_date(item) %></aside> <aside>Posted at: <%= get_pretty_date(item) %> by <%= item[:author_name]%></aside>
<article class="doc-content"> <article class="doc-content">
<%= yield %> <%= yield %>
<article> <article>
......
...@@ -8,6 +8,8 @@ ...@@ -8,6 +8,8 @@
<meta name="keywords" content="prometheus, monitoring, monitoring system, time series, time series database, alerting, metrics, telemetry"> <meta name="keywords" content="prometheus, monitoring, monitoring system, time series, time series database, alerting, metrics, telemetry">
<meta name="author" content="Prometheus"> <meta name="author" content="Prometheus">
<link rel="alternate" type="application/atom+xml" title="Prometheus Blog » Feed" href="/blog/feed.xml">
<link rel="shortcut icon" href="/assets/favicons/favicon.ico"> <link rel="shortcut icon" href="/assets/favicons/favicon.ico">
<link rel="apple-touch-icon" sizes="57x57" href="/assets/favicons/apple-touch-icon-57x57.png"> <link rel="apple-touch-icon" sizes="57x57" href="/assets/favicons/apple-touch-icon-57x57.png">
<link rel="apple-touch-icon" sizes="60x60" href="/assets/favicons/apple-touch-icon-60x60.png"> <link rel="apple-touch-icon" sizes="60x60" href="/assets/favicons/apple-touch-icon-60x60.png">
......
...@@ -75,3 +75,6 @@ checks: ...@@ -75,3 +75,6 @@ checks:
# E.g.: # E.g.:
# exclude: ['^/server_status'] # exclude: ['^/server_status']
exclude: [] exclude: []
# The base url required by atom_feed
base_url: "http://prometheus.io"