Commit f98da7de authored Jan 19, 2015 by Julius Volz
Explain interpolation and staleness.
parent 602c380f

Showing 1 changed file with 32 additions and 14 deletions

content/docs/querying/basics.md (+32 -14)
@@ -119,24 +119,42 @@ in detail in the [expression language functions](/docs/querying/functions) page.

 ## Gotchas

-### Time series staleness
-
-TODO: TODO: explain staleness
-
-### Interpolation
-
-TODO: TODO: explain interpolation
+### Interpolation and staleness
+
+When queries are run, timestamps at which to sample data are selected
+independent of the actual present time series data. This is mainly to support
+cases like aggregation (`sum`, `avg`, and so on), where multiple aggregated
+time series do not exactly align in time. At every one of these predetermined
+sampling times, Prometheus searches for the closest surrounding samples and
+linearly interpolates a timestamp-value pair between the actual stored samples.
+That means that the generated sample has timestamp and sample values which are
+somewhere in between the timestamp and sample values of the surrounding samples
+(depending on how close the query timestamp is to either surrounding point).
+
+If no stored sample is found either (by default) 5 minutes before or after a
+sampling timestamp, no interpolated sample is generated for this time series at
+this point in time. This effectively means that time series "disappear" from
+graphs at times where their latest collected sample is older than 5 minutes.
+
+NOTE: <b>NOTE:</b> Staleness and interpolation handling might change. See
+https://github.com/prometheus/prometheus/issues/398 and
+https://github.com/prometheus/prometheus/issues/386.

 ### Avoiding slow queries and overloads

 If a query needs to operate on a very large amount of data, graphing it might
-time out or overload the server or browser. Thus, when gradually constructing
-queries over unknown data, always start building the query in the tabular view
-of Prometheus's expression browser until the result set seems reasonable. Only
-when you have filtered or aggregated your data sufficiently, switch to graph
-mode. If the expression still takes too long to graph ad-hoc, pre-record it via
-a [recording rule](/docs/operating/rules/#recording-rules).
+time out or overload the server or browser. Thus, when constructing queries
+over unknown data, always start building the query in the tabular view of
+Prometheus's expression browser until the result set seems reasonable
+(hundreds, not thousands, of time series at most). Only when you have filtered
+or aggregated your data sufficiently, switch to graph mode. If the expression
+still takes too long to graph ad-hoc, pre-record it via a [recording
+rule](/docs/operating/rules/#recording-rules).

 This is especially relevant for Prometheus's query language, where a bare
 metric name selector like `api_http_requests_total` could expand to thousands
-of time series with different labels.
+of time series with different labels. Also keep in mind that expressions which
+aggregate over many time series will generate load on the server even if the
+output is only a small number of time series. This is similar to how it would
+be slow to sum all values of a column in a relational database, even if the
+output value is only a single number.
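
The interpolation and staleness behavior described in the added "Interpolation and staleness" section can be illustrated with a small Python sketch. This is not Prometheus code: `sample_at`, `STALENESS_DELTA`, and the list-of-tuples sample format are hypothetical names chosen for illustration. The sketch also assumes the simplest reading of the text, namely that a value is produced only when a stored sample exists within the window on both sides of the query timestamp. How a real server treats a series with a neighbor on only one side is not specified in this change.

```python
from bisect import bisect_left

# Assumption: the "by default" 5-minute window from the text, in seconds.
STALENESS_DELTA = 5 * 60


def sample_at(samples, t, staleness=STALENESS_DELTA):
    """Return a linearly interpolated value at query time t, or None.

    samples: list of (timestamp_seconds, value) tuples sorted by timestamp.
    Returns None when a surrounding sample is missing or too old/new,
    which is how the series "disappears" in this simplified model.
    """
    timestamps = [ts for ts, _ in samples]
    i = bisect_left(timestamps, t)

    # A stored sample exactly at the query timestamp needs no interpolation.
    if i < len(samples) and timestamps[i] == t:
        return samples[i][1]

    # Closest stored samples before and after the query timestamp.
    before = samples[i - 1] if i > 0 else None
    after = samples[i] if i < len(samples) else None
    if before is None or after is None:
        return None

    t0, v0 = before
    t1, v1 = after
    # Staleness check: both neighbors must lie within the window.
    if t - t0 > staleness or t1 - t > staleness:
        return None

    # Linear interpolation between the two surrounding samples.
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
```

For example, with stored samples `(0, 10.0)` and `(60, 20.0)`, querying at `t=15` yields a value a quarter of the way between the two, while a query that falls more than 300 seconds from its nearest neighbors yields `None`.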