@@ -25,7 +25,7 @@ For more elaborate overviews of Prometheus, see the resources linked from the
...
@@ -25,7 +25,7 @@ For more elaborate overviews of Prometheus, see the resources linked from the
Prometheus's main features are:
Prometheus's main features are:
* a multi-dimensional [data model](/docs/concepts/data_model/) with time series data identified by metric name and key/value pairs
* a multi-dimensional [data model](/docs/concepts/data_model/) with time series data identified by metric name and key/value pairs
* a [flexible query language](/docs/querying/basics/)
* a [flexible query language](/docs/prometheus/latest/querying/basics/)
to leverage this dimensionality
to leverage this dimensionality
* no reliance on distributed storage; single server nodes are autonomous
* no reliance on distributed storage; single server nodes are autonomous
* time series collection happens via a pull model over HTTP
* time series collection happens via a pull model over HTTP
...
@@ -57,7 +57,9 @@ its ecosystem components:
...
@@ -57,7 +57,9 @@ its ecosystem components:
Prometheus scrapes metrics from instrumented jobs, either directly or via an
Prometheus scrapes metrics from instrumented jobs, either directly or via an
intermediary push gateway for short-lived jobs. It stores all scraped samples
intermediary push gateway for short-lived jobs. It stores all scraped samples
locally and runs rules over this data to either aggregate and record new time series from existing data or generate alerts. [Grafana](https://grafana.com/) or other API consumers can be used to visualize the collected data.
locally and runs rules over this data to either aggregate and record new time
series from existing data or generate alerts. [Grafana](https://grafana.com/) or
other API consumers can be used to visualize the collected data.
The two approaches have a number of different implications:
The two approaches have a number of different implications:
...
@@ -107,11 +107,11 @@ The two approaches have a number of different implications:
...
@@ -107,11 +107,11 @@ The two approaches have a number of different implications:
|---|-----------|---------
|---|-----------|---------
| Required configuration | Pick buckets suitable for the expected range of observed values. | Pick desired φ-quantiles and sliding window. Other φ-quantiles and sliding windows cannot be calculated later.
| Required configuration | Pick buckets suitable for the expected range of observed values. | Pick desired φ-quantiles and sliding window. Other φ-quantiles and sliding windows cannot be calculated later.
| Client performance | Observations are very cheap as they only need to increment counters. | Observations are expensive due to the streaming quantile calculation.
| Client performance | Observations are very cheap as they only need to increment counters. | Observations are expensive due to the streaming quantile calculation.
| Server performance | The server has to calculate quantiles. You can use [recording rules](/docs/querying/rules/#recording-rules) should the ad-hoc calculation take too long (e.g. in a large dashboard). | Low server-side cost.
| Server performance | The server has to calculate quantiles. You can use [recording rules](/docs/prometheus/latest/querying/rules/#recording-rules) should the ad-hoc calculation take too long (e.g. in a large dashboard). | Low server-side cost.
| Number of time series (in addition to the `_sum` and `_count` series) | One time series per configured bucket. | One time series per configured quantile.
| Number of time series (in addition to the `_sum` and `_count` series) | One time series per configured bucket. | One time series per configured quantile.
| Quantile error (see below for details) | Error is limited in the dimension of observed values by the width of the relevant bucket. | Error is limited in the dimension of φ by a configurable value.
| Quantile error (see below for details) | Error is limited in the dimension of observed values by the width of the relevant bucket. | Error is limited in the dimension of φ by a configurable value.
| Specification of φ-quantile and sliding time-window | Ad-hoc with [Prometheus expressions](/docs/querying/functions/#histogram_quantile). | Preconfigured by the client.
| Specification of φ-quantile and sliding time-window | Ad-hoc with [Prometheus expressions](/docs/prometheus/latest/querying/functions/#histogram_quantile). | Preconfigured by the client.
| Aggregation | Ad-hoc with [Prometheus expressions](/docs/querying/functions/#histogram_quantile). | In general [not aggregatable](http://latencytipoftheday.blogspot.de/2014/06/latencytipoftheday-you-cant-average.html).
| Aggregation | Ad-hoc with [Prometheus expressions](/docs/prometheus/latest/querying/functions/#histogram_quantile). | In general [not aggregatable](http://latencytipoftheday.blogspot.de/2014/06/latencytipoftheday-you-cant-average.html).
Note the importance of the last item in the table. Let us return to
Note the importance of the last item in the table. Let us return to
the SLA of serving 95% of requests within 300ms. This time, you do not
the SLA of serving 95% of requests within 300ms. This time, you do not
<p>Each server is independent for reliability, relying only on local storage. Written in Go, all binaries are statically linked and easy to deploy.</p>
<p>Each server is independent for reliability, relying only on local storage. Written in Go, all binaries are statically linked and easy to deploy.</p>