Commit db247df0 authored by Fabian Reinartz

Merge pull request #445 from prometheus/next-release

Next release
parents c592965c 530aa19b
...@@ -59,6 +59,49 @@ not what you want for actual operations. The flag `storage.local.retention` ...@@ -59,6 +59,49 @@ not what you want for actual operations. The flag `storage.local.retention`
allows you to configure the retention time for samples. Adjust it to your needs allows you to configure the retention time for samples. Adjust it to your needs
and your available disk space. and your available disk space.
## Chunk encoding
Prometheus currently offers three different types of chunk encodings. The chunk
encoding for newly created chunks is determined by the
`-storage.local.chunk-encoding-version` flag. The valid values are 0, 1,
or 2.
Type 0 is the simple delta encoding implemented for Prometheus's first chunked
storage layer. Type 1 is the current default encoding, a double-delta encoding
with much better compression behavior than type 0. Both encodings feature a
fixed byte width per sample over the whole chunk, which allows fast random
access. While type 0 is the fastest encoding, the difference in encoding cost
compared to type 1 is tiny. Because type 1 compresses much better, there is
no reason to select type 0 except compatibility with very old Prometheus
versions.
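
To make the difference between a simple delta scheme and a double-delta scheme concrete, here is a minimal Python sketch. It only illustrates the general idea on a list of integers. It is not Prometheus's actual chunk format, and the function names and sample timestamps are made up for this illustration.

```python
def delta_encode(values):
    """Keep the first value, then store each difference to the previous value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def double_delta_encode(values):
    """Keep the first value and the first delta, then store the change between deltas."""
    deltas = delta_encode(values)
    if len(deltas) < 2:
        return deltas
    return deltas[:2] + [b - a for a, b in zip(deltas[1:], deltas[2:])]


# Hypothetical timestamps scraped roughly every 15 seconds.
timestamps = [1000, 1015, 1030, 1045, 1061]

print(delta_encode(timestamps))         # small but non-zero numbers
print(double_delta_encode(timestamps))  # mostly zeros for regular intervals
```

For regularly spaced samples the double-delta output is dominated by zeros, which is why such a scheme tends to compress better than plain deltas.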
Type 2 is a variable bit-width encoding, i.e. each sample in the chunk can use
a different number of bits. Timestamps are double-delta encoded, too, but with
a slightly different algorithm. A number of different encoding schemes are
available for sample values. The choice is made per chunk based on the nature
of the sample values (constant, integer, regularly increasing, random…). Major
parts of the type 2 encoding are inspired by a paper published by Facebook
engineers:
[_Gorilla: A Fast, Scalable, In-Memory Time Series Database_](http://www.vldb.org/pvldb/vol8/p1816-teller.pdf).
With type 2, access within a chunk has to happen sequentially, and the encoding
and decoding cost is a bit higher. Overall, type 2 will cause more CPU usage
and increased query latency compared to type 1 but offers a much improved
compression ratio. The exact numbers depend heavily on the data set and the
kind of queries. Below are results from a typical production server with a
fairly expensive set of recording rules.
Chunk type | bytes per sample | cores | rule evaluation duration
:------:|:-----:|:----:|:----:
1 | 3.3 | 1.6 | 2.9s
2 | 1.3 | 2.4 | 4.9s
You can change the chunk encoding each time you start the server, so
experimenting with your own use case is encouraged. Take into account, however,
that only newly created chunks will use the newly selected chunk encoding, so
it will take a while until you see the effects.
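
As a rough way to read the table above, the following Python sketch computes the relative change from type 1 to type 2 for each column. The numbers are copied from the table; the dictionary keys and the helper name are chosen here for illustration only.

```python
# Measurements copied from the table above (chunk type -> metrics).
results = {
    1: {"bytes_per_sample": 3.3, "cores": 1.6, "rule_eval_seconds": 2.9},
    2: {"bytes_per_sample": 1.3, "cores": 2.4, "rule_eval_seconds": 4.9},
}


def relative_change(old, new):
    """Return the change from old to new as a fraction of old."""
    return (new - old) / old


changes = {
    metric: relative_change(results[1][metric], results[2][metric])
    for metric in results[1]
}

for metric, change in changes.items():
    print(f"{metric}: {change:+.0%}")
```

On this particular server, type 2 needs roughly 61% fewer bytes per sample, at the price of about 50% more CPU cores and about 69% longer rule evaluation. Your own numbers may differ considerably.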
## Settings for high numbers of time series ## Settings for high numbers of time series
Prometheus can handle millions of time series. However, you have to adjust the Prometheus can handle millions of time series. However, you have to adjust the
......
...@@ -78,6 +78,7 @@ These logical/set binary operators are only defined between instant vectors: ...@@ -78,6 +78,7 @@ These logical/set binary operators are only defined between instant vectors:
* `and` (intersection) * `and` (intersection)
* `or` (union) * `or` (union)
* `unless` (complement)
`vector1 and vector2` results in a vector consisting of the elements of `vector1 and vector2` results in a vector consisting of the elements of
`vector1` for which there are elements in `vector2` with exactly matching `vector1` for which there are elements in `vector2` with exactly matching
...@@ -88,6 +89,10 @@ over from the left-hand-side vector. ...@@ -88,6 +89,10 @@ over from the left-hand-side vector.
(label sets + values) of `vector1` and additionally all elements of `vector2` (label sets + values) of `vector1` and additionally all elements of `vector2`
which do not have matching label sets in `vector1`. which do not have matching label sets in `vector1`.
`vector1 unless vector2` results in a vector consisting of the elements of
`vector1` for which there are no elements in `vector2` with exactly matching
label sets. All matching elements in both vectors are dropped.
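
The three set operators can be pictured with a small Python model. This is a simplified sketch, not the Prometheus implementation: a vector is modelled as a dictionary from a label set to a value, metric names are left out, and all names and sample values are hypothetical.

```python
def labels(**kv):
    """Build a hashable label set from keyword arguments."""
    return frozenset(kv.items())


def vector_and(v1, v2):
    """Elements of v1 whose label set also occurs in v2."""
    return {ls: val for ls, val in v1.items() if ls in v2}


def vector_or(v1, v2):
    """All elements of v1, plus elements of v2 with label sets not in v1."""
    result = dict(v1)
    for ls, val in v2.items():
        result.setdefault(ls, val)
    return result


def vector_unless(v1, v2):
    """Elements of v1 whose label set does not occur in v2."""
    return {ls: val for ls, val in v1.items() if ls not in v2}


v1 = {labels(method="get"): 1, labels(method="put"): 2}
v2 = {labels(method="get"): 10, labels(method="del"): 30}

print(vector_and(v1, v2))
print(vector_or(v1, v2))
print(vector_unless(v1, v2))
```

In this model `unless` is the exact complement of `and` with respect to `vector1`: every element of `vector1` ends up in exactly one of the two results.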
## Vector matching ## Vector matching
Operations between vectors attempt to find a matching element in the right-hand-side Operations between vectors attempt to find a matching element in the right-hand-side
...@@ -97,17 +102,20 @@ matching behavior: ...@@ -97,17 +102,20 @@ matching behavior:
**One-to-one** finds a unique pair of entries from each side of the operation. **One-to-one** finds a unique pair of entries from each side of the operation.
In the default case, that is an operation following the format `vector1 <operator> vector2`. In the default case, that is an operation following the format `vector1 <operator> vector2`.
Two entries match if they have the exact same set of labels and corresponding values. Two entries match if they have the exact same set of labels and corresponding values.
The `on` keyword allows reducing the set of considered labels to a provided list: The `ignoring` keyword allows ignoring certain labels when matching, while the
`on` keyword allows reducing the set of considered labels to a provided list:
<vector expr> <bin-op> ignoring(<label list>) <vector expr>
<vector expr> <bin-op> on(<label list>) <vector expr> <vector expr> <bin-op> on(<label list>) <vector expr>
Example input: Example input:
method:http_errors:rate5m{source="internal", method="get", code="500"} 24 method_code:http_errors:rate5m{method="get", code="500"} 24
method:http_errors:rate5m{source="external", method="get", code="404"} 30 method_code:http_errors:rate5m{method="get", code="404"} 30
method:http_errors:rate5m{source="internal", method="put", code="501"} 3 method_code:http_errors:rate5m{method="put", code="501"} 3
method:http_errors:rate5m{source="internal", method="post", code="500"} 6 method_code:http_errors:rate5m{method="post", code="500"} 6
method:http_errors:rate5m{source="external", method="post", code="404"} 21 method_code:http_errors:rate5m{method="post", code="404"} 21
method:http_requests:rate5m{method="get"} 600 method:http_requests:rate5m{method="get"} 600
method:http_requests:rate5m{method="del"} 34 method:http_requests:rate5m{method="del"} 34
...@@ -115,35 +123,41 @@ Example input: ...@@ -115,35 +123,41 @@ Example input:
Example query: Example query:
method:http_errors:rate5m{code="500"} / on(method) method:http_requests:rate5m method_code:http_errors:rate5m{code="500"} / ignoring(code) method:http_requests:rate5m
This returns a result vector containing the fraction of HTTP requests with status code This returns a result vector containing the fraction of HTTP requests with status code
of 500 for each method, as measured over the last 5 minutes. Without `on(method)` there of 500 for each method, as measured over the last 5 minutes. Without `ignoring(code)` there
would have been no match as the metrics do not share the same set of labels. would have been no match as the metrics do not share the same set of labels.
The entries with methods `put` and `del` have no match and will not show up in the result: The entries with methods `put` and `del` have no match and will not show up in the result:
{method="get"} 0.04 // 24 / 600 {method="get"} 0.04 // 24 / 600
{method="post"} 0.05 // 6 / 120 {method="post"} 0.05 // 6 / 120
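
The one-to-one matching with `ignoring(code)` can be sketched in Python as follows. This is a simplified model rather than the Prometheus implementation. The error rates are taken from the example input; the request rate of 120 for `post` is not visible in the input excerpt and is assumed here from the divisor in the result comment. The helper names are hypothetical.

```python
def match_key(labels, ignoring):
    """Label set used for matching, with the ignored labels removed."""
    return frozenset((k, v) for k, v in labels.items() if k not in ignoring)


def divide_one_to_one(left, right, ignoring=()):
    """Divide left by right for every uniquely matching pair of series."""
    right_by_key = {}
    for labels, value in right:
        key = match_key(labels, ignoring)
        if key in right_by_key:
            raise ValueError("duplicate series on the right-hand side")
        right_by_key[key] = value
    result = {}
    for labels, value in left:
        key = match_key(labels, ignoring)
        if key not in right_by_key:
            continue  # no partner on the right: dropped from the result
        if key in result:
            raise ValueError("many-to-one matching must be requested explicitly")
        result[key] = value / right_by_key[key]
    return result


errors = [
    ({"method": "get", "code": "500"}, 24),
    ({"method": "get", "code": "404"}, 30),
    ({"method": "put", "code": "501"}, 3),
    ({"method": "post", "code": "500"}, 6),
    ({"method": "post", "code": "404"}, 21),
]
requests = [
    ({"method": "get"}, 600),
    ({"method": "del"}, 34),
    ({"method": "post"}, 120),  # assumed value, not shown in the excerpt
]

# Equivalent of the {code="500"} selector on the left-hand side.
errors_500 = [(ls, v) for ls, v in errors if ls["code"] == "500"]
result = divide_one_to_one(errors_500, requests, ignoring=("code",))

for key, value in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(dict(key), value)
```

Passing the unfiltered `errors` list instead of `errors_500` raises an error in this sketch, because several left-hand series then share the same `method`.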
**Many-to-one** and **one-to-many** matchings refer to the case where each vector element on **Many-to-one** and **one-to-many** matchings refer to the case where each vector element on
the "one"-side can match with multiple elements on the "many"-side. This has to the "one"-side can match with multiple elements on the "many"-side. This has to
be explicitly requested using the `group_left` or `group_right` modifier, where be explicitly requested using the `group_left` or `group_right` modifier, where
left/right determines which vector has the higher cardinality. left/right determines which vector has the higher cardinality.
<vector expr> <bin-op> ignoring(<label list>) group_left(<label list>) <vector expr>
<vector expr> <bin-op> ignoring(<label list>) group_right(<label list>) <vector expr>
<vector expr> <bin-op> on(<label list>) group_left(<label list>) <vector expr> <vector expr> <bin-op> on(<label list>) group_left(<label list>) <vector expr>
<vector expr> <bin-op> on(<label list>) group_right(<label list>) <vector expr> <vector expr> <bin-op> on(<label list>) group_right(<label list>) <vector expr>
The label list provided with the group modifier contains additional labels from the "many"-side The label list provided with the group modifier contains additional labels from
to be included in the result metrics. A label can only appear in one of the lists. Every time the "one"-side to be included in the result metrics. For `on` a label can only
series of the result vector must be uniquely identifiable by the labels from both lists combined. appear in one of the lists. Every time series of the result vector must be
uniquely identifiable.
_Grouping modifiers can only be used for [comparison](#comparison-binary-operators) _Grouping modifiers can only be used for
and [arithmetic](#arithmetic-binary-operators) operations as `and` and `or` operations [comparison](#comparison-binary-operators) and
match with all possible entries in the right vector by default._ [arithmetic](#arithmetic-binary-operators). Operations such as `and`, `unless` and
`or` match with all possible entries in the right vector by
default._
Example query: Example query:
method:http_errors:rate5m / on(method) group_left(code,source) method:http_requests:rate5m method_code:http_errors:rate5m / ignoring(code) group_left method:http_requests:rate5m
In this case the left vector contains more than one entry per `method` label value. Thus, In this case the left vector contains more than one entry per `method` label value. Thus,
we indicate this using `group_left`. To ensure that the result vector entries are unique, additional we indicate this using `group_left`. To ensure that the result vector entries are unique, additional
...@@ -151,14 +165,13 @@ labels have to be provided. Either `code` or `source` satisfy this requirement, ...@@ -151,14 +165,13 @@ labels have to be provided. Either `code` or `source` satisfy this requirement,
can be added for a more detailed result. The elements from the right side can be added for a more detailed result. The elements from the right side
are now matched with multiple elements with the same `method` label on the left: are now matched with multiple elements with the same `method` label on the left:
{source="internal", method="get", code="500"} 0.04 // 24 / 600 {method="get", code="500"} 0.04 // 24 / 600
{source="external", method="get", code="404"} 0.05 // 30 / 600 {method="get", code="404"} 0.05 // 30 / 600
{source="internal", method="post", code="500"} 0.05 // 6 / 120 {method="post", code="500"} 0.05 // 6 / 120
{source="external", method="post", code="404"} 0.175 // 21 / 120 {method="post", code="404"} 0.175 // 21 / 120
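
A many-to-one match with `ignoring(code) group_left` can be modelled in Python roughly like this. It is a simplified sketch under stated assumptions, not the Prometheus implementation: result series keep the labels of the "many" side, the request rate of 120 for `post` is assumed from the divisor in the result comments, and the helper names are hypothetical.

```python
def match_key(labels, ignoring):
    """Label set used for matching, with the ignored labels removed."""
    return frozenset((k, v) for k, v in labels.items() if k not in ignoring)


def divide_group_left(many, one, ignoring=()):
    """Divide each series on the 'many' side by its single partner on the 'one' side."""
    one_by_key = {}
    for labels, value in one:
        key = match_key(labels, ignoring)
        if key in one_by_key:
            raise ValueError("the 'one' side must be unique per match group")
        one_by_key[key] = value
    result = []
    for labels, value in many:
        key = match_key(labels, ignoring)
        if key in one_by_key:
            # Several 'many' series may share the same partner.
            result.append((labels, value / one_by_key[key]))
    return result


errors = [
    ({"method": "get", "code": "500"}, 24),
    ({"method": "get", "code": "404"}, 30),
    ({"method": "put", "code": "501"}, 3),
    ({"method": "post", "code": "500"}, 6),
    ({"method": "post", "code": "404"}, 21),
]
requests = [
    ({"method": "get"}, 600),
    ({"method": "del"}, 34),
    ({"method": "post"}, 120),  # assumed value, not shown in the excerpt
]

ratios = divide_group_left(errors, requests, ignoring=("code",))
for labels, value in ratios:
    print(labels, value)
```

The `put` series is dropped because there is no request series for that method, and the `del` request series has no error series to pair with.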
_Many-to-one and one-to-many matching are advanced use cases that should be carefully considered. _Many-to-one and one-to-many matching are advanced use cases that should be carefully considered.
Often a proper use of `on(<labels>)` provides the desired outcome._ Often a proper use of `ignoring(<labels>)` provides the desired outcome._
## Aggregation operators ## Aggregation operators
...@@ -182,7 +195,7 @@ or preserve distinct dimensions by including a `without` or `by` clause. ...@@ -182,7 +195,7 @@ or preserve distinct dimensions by including a `without` or `by` clause.
`without` removes the listed labels from the result vector, while all other `without` removes the listed labels from the result vector, while all other
labels are preserved in the output. `by` does the opposite and drops labels that labels are preserved in the output. `by` does the opposite and drops labels that
are not listed in the `by` clause, even if their label values are identical are not listed in the `by` clause, even if their label values are identical
between all elements of the vector. The `keep_common` clause allows to keep between all elements of the vector. The `keep_common` clause allows keeping
those extra labels (labels that are identical between elements, but not in the those extra labels (labels that are identical between elements, but not in the
`by` clause). `by` clause).
...@@ -211,8 +224,8 @@ highest to lowest. ...@@ -211,8 +224,8 @@ highest to lowest.
1. `*`, `/`, `%` 1. `*`, `/`, `%`
2. `+`, `-` 2. `+`, `-`
3. `==`, `!=`, `<=`, `<`, `>=`, `>` 3. `==`, `!=`, `<=`, `<`, `>=`, `>`
4. `AND`, `UNLESS` 4. `and`, `unless`
5. `OR` 5. `or`
Operators on the same precedence level are left-associative. For example, Operators on the same precedence level are left-associative. For example,
`2 * 3 % 2` is equivalent to `(2 * 3) % 2`. `2 * 3 % 2` is equivalent to `(2 * 3) % 2`.
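
The grouping matters for the result. Python happens to treat `*` and `%` the same way (equal precedence, left-associative), so it can serve as a stand-in to check the arithmetic; the variable names below are only illustrative.

```python
left_assoc = 2 * 3 % 2    # parsed as (2 * 3) % 2
explicit = (2 * 3) % 2    # the equivalent explicit grouping
regrouped = 2 * (3 % 2)   # a different grouping gives a different value

print(left_assoc, explicit, regrouped)
```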