@@ -25,7 +25,7 @@ For more elaborate overviews of Prometheus, see the resources linked from the
Prometheus's main features are:
* a multi-dimensional [data model](/docs/concepts/data_model/) with time series data identified by metric name and key/value pairs
* a [flexible query language](/docs/querying/basics/)
* a [flexible query language](/docs/prometheus/latest/querying/basics/)
to leverage this dimensionality
* no reliance on distributed storage; single server nodes are autonomous
* time series collection happens via a pull model over HTTP
...
...
@@ -57,7 +57,9 @@ its ecosystem components:
Prometheus scrapes metrics from instrumented jobs, either directly or via an
intermediary push gateway for short-lived jobs. It stores all scraped samples
locally and runs rules over this data to either aggregate and record new time series from existing data or generate alerts. [Grafana](https://grafana.com/) or other API consumers can be used to visualize the collected data.
locally and runs rules over this data to either aggregate and record new time
series from existing data or generate alerts. [Grafana](https://grafana.com/) or
other API consumers can be used to visualize the collected data.
The two approaches have a number of different implications:
...
...
@@ -107,11 +107,11 @@ The two approaches have a number of different implications:
|---|-----------|---------
| Required configuration | Pick buckets suitable for the expected range of observed values. | Pick desired φ-quantiles and sliding window. Other φ-quantiles and sliding windows cannot be calculated later.
| Client performance | Observations are very cheap as they only need to increment counters. | Observations are expensive due to the streaming quantile calculation.
| Server performance | The server has to calculate quantiles. You can use [recording rules](/docs/querying/rules/#recording-rules) should the ad-hoc calculation take too long (e.g. in a large dashboard). | Low server-side cost.
| Server performance | The server has to calculate quantiles. You can use [recording rules](/docs/prometheus/latest/querying/rules/#recording-rules) should the ad-hoc calculation take too long (e.g. in a large dashboard). | Low server-side cost.
| Number of time series (in addition to the `_sum` and `_count` series) | One time series per configured bucket. | One time series per configured quantile.
| Quantile error (see below for details) | Error is limited in the dimension of observed values by the width of the relevant bucket. | Error is limited in the dimension of φ by a configurable value.
| Specification of φ-quantile and sliding time-window | Ad-hoc with [Prometheus expressions](/docs/querying/functions/#histogram_quantile). | Preconfigured by the client.
| Aggregation | Ad-hoc with [Prometheus expressions](/docs/querying/functions/#histogram_quantile). | In general [not aggregatable](http://latencytipoftheday.blogspot.de/2014/06/latencytipoftheday-you-cant-average.html).
| Specification of φ-quantile and sliding time-window | Ad-hoc with [Prometheus expressions](/docs/prometheus/latest/querying/functions/#histogram_quantile). | Preconfigured by the client.
| Aggregation | Ad-hoc with [Prometheus expressions](/docs/prometheus/latest/querying/functions/#histogram_quantile). | In general [not aggregatable](http://latencytipoftheday.blogspot.de/2014/06/latencytipoftheday-you-cant-average.html).
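
The aggregation row can be illustrated with a small Python sketch. It is a simplified model, not the server's implementation: the bucket bounds, the counts, and the helper name `bucket_quantile` are assumptions made for illustration, and the `+Inf` bucket is left out. Cumulative bucket counts from two instances are summed first, and the quantile is then estimated by linear interpolation inside the relevant bucket.

```python
import math
from bisect import bisect_left


def bucket_quantile(phi, upper_bounds, cumulative_counts):
    """Estimate the phi-quantile from cumulative bucket counts.

    Interpolates linearly inside the bucket that contains the target rank,
    so the error is bounded by the width of that bucket.
    """
    total = cumulative_counts[-1]
    if total == 0:
        return math.nan
    rank = phi * total
    # First bucket whose cumulative count reaches the target rank.
    i = bisect_left(cumulative_counts, rank)
    lower = upper_bounds[i - 1] if i > 0 else 0.0
    below = cumulative_counts[i - 1] if i > 0 else 0
    in_bucket = cumulative_counts[i] - below
    if in_bucket == 0:
        return lower
    return lower + (upper_bounds[i] - lower) * (rank - below) / in_bucket


# Hypothetical bucket upper bounds in seconds and cumulative counts
# reported by two instances of the same service.
BOUNDS = [0.1, 0.2, 0.3, 0.5]
instance_a = [50, 90, 100, 100]
instance_b = [10, 20, 60, 100]

# Histograms aggregate by summing bucket counts, then estimating once.
merged = [a + b for a, b in zip(instance_a, instance_b)]
aggregated_p95 = bucket_quantile(0.95, BOUNDS, merged)

# Averaging per-instance quantiles gives a different, misleading number.
averaged_p95 = (
    bucket_quantile(0.95, BOUNDS, instance_a)
    + bucket_quantile(0.95, BOUNDS, instance_b)
) / 2
```

With these made-up counts, the estimate from the merged buckets is about 0.45 seconds, while the average of the two per-instance estimates is about 0.36 seconds. Precomputed quantiles from a summary allow only the second calculation.
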
Note the importance of the last item in the table. Let us return to
the SLA of serving 95% of requests within 300ms. This time, you do not
or a function aggregating over time (any function ending in `_over_time`),
always apply `irate()` first, then aggregate. Otherwise `irate()` cannot detect
counter resets when your target restarts.
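
The effect of the ordering can be sketched in Python. This is a simplified model under stated assumptions, not how the server evaluates queries: the helper names, target names, sample values, and the 15-second interval are all hypothetical. Each counter is reduced to its last two samples, and a drop in value is read as a reset.

```python
def increase(prev, curr):
    """Increase of a counter between two samples.

    A drop in value is read as a counter reset, so the current value is
    taken as the whole increase since the restart.
    """
    return curr if curr < prev else curr - prev


def instant_rate(prev, curr, seconds):
    """Per-second rate computed from the last two samples of one series."""
    return increase(prev, curr) / seconds


INTERVAL = 15.0  # assumed seconds between the two samples

# Last two samples of the same counter on two targets; target-b restarted
# between the samples, so its counter dropped from 500 to 5.
previous = {"target-a": 1000.0, "target-b": 500.0}
current = {"target-a": 1600.0, "target-b": 5.0}

# Rate first, then aggregate: the reset is visible in target-b's own series.
rate_then_sum = sum(
    instant_rate(previous[t], current[t], INTERVAL) for t in previous
)

# Aggregate first, then rate: the summed series rises from 1500 to 1605,
# so the reset is hidden and most of target-a's increase is cancelled out.
sum_then_rate = instant_rate(
    sum(previous.values()), sum(current.values()), INTERVAL
)
```

In this model, `rate_then_sum` accounts for the full increase on both targets (605 over 15 seconds), while `sum_then_rate` sees an increase of only 105 because the drop on `target-b` is masked by the growth on `target-a`.
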
## `label_join()`
For each timeseries in `v`, `label_join(v instant-vector, dst_label string, separator string, src_label_1 string, src_label_2 string, ...)` joins all the values of all the `src_labels`
using `separator` and returns the timeseries with the label `dst_label` containing the joined value.
There can be any number of `src_labels` in this function.
This example will return a vector with each time series having a `foo` label with the value `a,b,c` added to it:
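
The query itself is not part of this excerpt. As a rough illustration of the joining behaviour described above, here is a minimal Python sketch that models one series' label set as a plain dictionary. The Python function, the label names `src1` to `src3`, and the choice to treat a missing source label as an empty string are assumptions of this sketch, not a description of the server's implementation.

```python
def label_join(labels, dst_label, separator, *src_labels):
    """Return a copy of `labels` with `dst_label` set to the joined values.

    Values are taken in the order the source labels are listed; a source
    label that is absent contributes an empty string (an assumption here).
    """
    joined = separator.join(labels.get(name, "") for name in src_labels)
    result = dict(labels)  # leave the input label set untouched
    result[dst_label] = joined
    return result


# Hypothetical label set of a single time series.
series_labels = {"job": "api-server", "src1": "a", "src2": "b", "src3": "c"}

with_foo = label_join(series_labels, "foo", ",", "src1", "src2", "src3")
```

Here `with_foo` keeps all original labels and gains `foo` with the value `a,b,c`. Any number of source labels can be passed, matching the variadic signature given above.
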
<p>Each server is independent for reliability, relying only on local storage. Written in Go, all binaries are statically linked and easy to deploy.</p>