Merge pull request #445 from prometheus/next-release

Next release

Commit db247df0 authored May 24, 2016 by Fabian Reinartz

Showing with 80 additions and 24 deletions

storage.md content/docs/operating/storage.md +43 -0

operators.md content/docs/querying/operators.md +37 -24

--- a/content/docs/operating/storage.md
+++ b/content/docs/operating/storage.md
@@ -59,6 +59,49 @@ not what you want for actual operations. The flag `storage.local.retention`
 allows you to configure the retention time for samples. Adjust it to your needs
 and your available disk space.
+## Chunk encoding
+Prometheus currently offers three different types of chunk encodings. The chunk
+encoding for newly created chunks is determined by the
+`-storage.local.chunk-encoding-version` flag. The valid values are 0, 1,
+or 2.
+Type 0 is the simple delta encoding implemented for Prometheus's first chunked
+storage layer. Type 1 is the current default encoding, a double-delta encoding
+with much better compression behavior than type 0. Both encodings feature a
+fixed byte width per sample over the whole chunk, which allows fast random
+access. While type 0 is the fastest encoding, the difference in encoding cost
+compared to encoding 1 is tiny. Due to the better compression behavior of type
+1, there is really no reason to select type 0 except compatibility with very
+old Prometheus versions.
+Type 2 is a variable bit-width encoding, i.e. each sample in the chunk can use
+a different number of bits. Timestamps are double-delta encoded, too, but with
+a slightly different algorithm. A number of different encoding schemes are
+available for sample values. The choice is made per chunk based on the nature
+of the sample values (constant, integer, regularly increasing, random…). Major
+parts of the type 2 encoding are inspired by a paper published by Facebook
+engineers:
+[_Gorilla: A Fast, Scalable, In-Memory Time Series Database_](http://www.vldb.org/pvldb/vol8/p1816-teller.pdf).
+With type 2, access within a chunk has to happen sequentially, and the encoding
+and decoding cost is a bit higher. Overall, type 2 will cause more CPU usage
+and increased query latency compared to type 1 but offers a much improved
+compression ratio. The exact numbers depend heavily on the data set and the
+kind of queries. Below are results from a typical production server with a
+fairly expensive set of recording rules.
+Chunk type | bytes per sample | cores | rule evaluation duration
+:------:|:-----:|:----:|:----:
+1 | 3.3 | 1.6 | 2.9s
+2 | 1.3 | 2.4 | 4.9s
+You can change the chunk encoding each time you start the server, so
+experimenting with your own use case is encouraged. Take into account, however,
+that only newly created chunks will use the newly selected chunk encoding, so
+it will take a while until you see the effects.
 ## Settings for high numbers of time series
 Prometheus can handle millions of time series. However, you have to adjust the
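
Review note on the chunk encoding section: the text describes timestamps as double-delta encoded. The sketch below illustrates only that general idea on a list of integers. It is not Prometheus's chunk format; the function names are hypothetical, and the fixed byte widths of type 1 and the variable bit widths of type 2 are left out entirely.

```python
def double_delta_encode(values):
    """Encode integers as [first value, first delta, delta-of-deltas...]."""
    if not values:
        return []
    if len(values) == 1:
        return [values[0]]
    prev_delta = values[1] - values[0]
    encoded = [values[0], prev_delta]
    for prev, cur in zip(values[1:], values[2:]):
        delta = cur - prev
        # Store only how much the delta changed since the previous step.
        encoded.append(delta - prev_delta)
        prev_delta = delta
    return encoded


def double_delta_decode(encoded):
    """Invert double_delta_encode."""
    if not encoded:
        return []
    values = [encoded[0]]
    if len(encoded) == 1:
        return values
    delta = encoded[1]
    values.append(values[0] + delta)
    for delta_of_delta in encoded[2:]:
        delta += delta_of_delta
        values.append(values[-1] + delta)
    return values


# Hypothetical scrape timestamps, roughly 15 seconds apart.
timestamps = [1000, 1015, 1030, 1046, 1060]
print(double_delta_encode(timestamps))
```

With regularly spaced samples, most entries after the first two are zero or close to zero, which is why such a scheme can get by with very few bits per sample.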

--- a/content/docs/querying/operators.md
+++ b/content/docs/querying/operators.md
@@ -78,6 +78,7 @@ These logical/set binary operators are only defined between instant vectors:
 * `and` (intersection)
 * `or` (union)
+* `unless` (complement)
 `vector1 and vector2` results in a vector consisting of the elements of
 `vector1` for which there are elements in `vector2` with exactly matching
@@ -88,6 +89,10 @@ over from the left-hand-side vector.
 (label sets + values) of `vector1` and additionally all elements of `vector2`
 which do not have matching label sets in `vector1`.
+`vector1 unless vector2` results in a vector consisting of the elements of
+`vector1` for which there are no elements in `vector2` with exactly matching
+label sets. All matching elements in both vectors are dropped.
 ## Vector matching
 Operations between vectors attempt to find a matching element in the right-hand-side
@@ -97,17 +102,20 @@ matching behavior:
 **One-to-one** finds a unique pair of entries from each side of the operation.
 In the default case, that is an operation following the format `vector1 <operator> vector2`.
 Two entries match if they have the exact same set of labels and corresponding values.
-The `on` keyword allows reducing the set of considered labels to a provided list:
+The `ignoring` keyword allows ignoring certain labels when matching, while the
+`on` keyword allows reducing the set of considered labels to a provided list:
+    <vector expr> <bin-op> ignoring(<label list>) <vector expr>
    <vector expr> <bin-op> on(<label list>) <vector expr>
 Example input:
-    method:http_errors:rate5m{source="internal", method="get", code="500"}  24
+    method_code:http_errors:rate5m{method="get", code="500"}  24
-    method:http_errors:rate5m{source="external", method="get", code="404"}  30
+    method_code:http_errors:rate5m{method="get", code="404"}  30
-    method:http_errors:rate5m{source="internal", method="put", code="501"}  3
+    method_code:http_errors:rate5m{method="put", code="501"}  3
-    method:http_errors:rate5m{source="internal", method="post", code="500"} 6
+    method_code:http_errors:rate5m{method="post", code="500"} 6
-    method:http_errors:rate5m{source="external", method="post", code="404"} 21
+    method_code:http_errors:rate5m{method="post", code="404"} 21
    method:http_requests:rate5m{method="get"}  600
    method:http_requests:rate5m{method="del"}  34
@@ -115,35 +123,41 @@ Example input:
 Example query:
-    method:http_errors:rate5m{code="500"} / on(method) method:http_requests:rate5m
+    method_code:http_errors:rate5m{code="500"} / ignoring(code) method:http_requests:rate5m
 This returns a result vector containing the fraction of HTTP requests with status code
-of 500 for each method, as measured over the last 5 minutes. Without `on(method)` there
+of 500 for each method, as measured over the last 5 minutes. Without `ignoring(code)` there
 would have been no match as the metrics do not share the same set of labels.
 The entries with methods `put` and `del` have no match and will not show up in the result:
    {method="get"}  0.04            //  24 / 600
    {method="post"} 0.05            //   6 / 120
 **Many-to-one** and **one-to-many** matchings refer to the case where each vector element on
 the "one"-side can match with multiple elements on the "many"-side. This has to
 be explicitly requested using the `group_left` or `group_right` modifier, where
 left/right determines which vector has the higher cardinality.
+    <vector expr> <bin-op> ignoring(<label list>) group_left(<label list>) <vector expr>
+    <vector expr> <bin-op> ignoring(<label list>) group_right(<label list>) <vector expr>
    <vector expr> <bin-op> on(<label list>) group_left(<label list>) <vector expr>
    <vector expr> <bin-op> on(<label list>) group_right(<label list>) <vector expr>
-The label list provided with the group modifier contains additional labels from the "many"-side
+The label list provided with the group modifier contains additional labels from
-to be included in the result metrics. A label can only appear in one of the lists. Every time
+the "one"-side to be included in the result metrics. For `on` a label can only
-series of the result vector must be uniquely identifiable by the labels from both lists combined.
+appear in one of the lists. Every time series of the result vector must be
+uniquely identifiable.
-_Grouping modifiers can only be used for [comparison](#comparison-binary-operators)
+_Grouping modifiers can only be used for
-and [arithmetic](#arithmetic-binary-operators) operations as `and` and `or` operations
+[comparison](#comparison-binary-operators) and
-match with all possible entries in the right vector by default._
+[arithmetic](#arithmetic-binary-operators) operations. The `and`, `unless` and
+`or` operations match with all possible entries in the right vector by
+default._
 Example query:
-    method:http_errors:rate5m / on(method) group_left(code,source) method:http_requests:rate5m
+    method_code:http_errors:rate5m / ignoring(code) group_left method:http_requests:rate5m
 In this case the left vector contains more than one entry per `method` label value. Thus,
 we indicate this using `group_left`. To ensure that the result vector entries are unique, additional
@@ -151,14 +165,13 @@ labels have to be provided. Either `code` or `source` satisfy this requirement,
 can be added for a more detailed result. The elements from the right side
 are now matched with multiple elements with the same `method` label on the left:
-    {source="internal", method="get", code="500"}  0.04            //  24 / 600
+    {method="get", code="500"}  0.04            //  24 / 600
-    {source="external", method="get", code="404"}  0.05            //  30 / 600
+    {method="get", code="404"}  0.05            //  30 / 600
-    {source="internal", method="post", code="500"} 0.1             //  12 / 120
+    {method="post", code="500"} 0.05            //   6 / 120
-    {source="external", method="post", code="404"} 0.175           //  21 / 120
+    {method="post", code="404"} 0.175           //  21 / 120
 _Many-to-one and one-to-many matching are advanced use cases that should be carefully considered.
-Often a proper use of `on(<labels>)` provides the desired outcome._
+Often a proper use of `ignoring(<labels>)` provides the desired outcome._
 ## Aggregation operators
@@ -182,7 +195,7 @@ or preserve distinct dimensions by including a `without` or `by` clause.
 `without` removes the listed labels from the result vector, while all other
 labels are preserved in the output. `by` does the opposite and drops labels that
 are not listed in the `by` clause, even if their label values are identical
-between all elements of the vector. The `keep_common` clause allows to keep
+between all elements of the vector. The `keep_common` clause allows keeping
 those extra labels (labels that are identical between elements, but not in the
 `by` clause).
@@ -211,8 +224,8 @@ highest to lowest.
 1. `*`, `/`, `%`
 2. `+`, `-`
 3. `==`, `!=`, `<=`, `<`, `>=`, `>`
-4. `AND`, `UNLESS` 
+4. `and`, `unless`
-5. `OR`
+5. `or`
 Operators on the same precedence level are left-associative. For example,
 `2 * 3 % 2` is equivalent to `(2 * 3) % 2`.
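
Review note on the operators changes: the semantics of `and`, `or`, `unless` and of one-to-one matching with `ignoring` can be illustrated with a toy model. The sketch below is not the Prometheus query engine. It assumes an instant vector can be modelled as a Python dict from a frozen label set to a value, it leaves out metric names, and it does not enforce the uniqueness check that real one-to-one matching requires. All function names are hypothetical.

```python
def labels(**kv):
    """Build a hashable label set, e.g. labels(method="get")."""
    return frozenset(kv.items())


def vec_and(v1, v2):
    # Elements of v1 that have an exactly matching label set in v2.
    return {ls: val for ls, val in v1.items() if ls in v2}


def vec_or(v1, v2):
    # All of v1, plus elements of v2 whose label sets are not in v1.
    result = dict(v1)
    for ls, val in v2.items():
        result.setdefault(ls, val)
    return result


def vec_unless(v1, v2):
    # Elements of v1 that have no matching label set in v2.
    return {ls: val for ls, val in v1.items() if ls not in v2}


def divide_ignoring(left, right, ignored):
    """Toy `left / ignoring(<ignored>) right` without uniqueness checks."""
    def key(ls):
        return frozenset((k, v) for k, v in ls if k not in ignored)

    right_by_key = {key(ls): val for ls, val in right.items()}
    result = {}
    for ls, val in left.items():
        k = key(ls)
        if k in right_by_key:  # unmatched entries are dropped
            result[k] = val / right_by_key[k]
    return result


# A subset of the example input from the diff above.
errors = {
    labels(method="get", code="500"): 24,
    labels(method="put", code="501"): 3,
}
requests = {
    labels(method="get"): 600,
    labels(method="del"): 34,
}
print(divide_ignoring(errors, requests, {"code"}))
```

As in the documented example, only `get` appears in the result here, because `put` has no counterpart on the right and `del` has none on the left.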