Commit 5f3f4709, authored Jan 25, 2015 by juliusv
Minor spelling / wording improvements.
Parent: fe36d07e
Showing 1 changed file with 9 additions and 9 deletions
content/docs/practices/instrumentation.md
@@ -55,7 +55,7 @@ but it is very localised information. A better approach is to send a heartbeat
 through the system: some dummy item that gets passed all the way through
 and includes the timestamp when it was inserted. Each stage can export the most
 recent heartbeat timestamp it has seen, letting you know how long items are
-taking to propogate through the system. For systems that do not have quiet
+taking to propagate through the system. For systems that do not have quiet
 periods where no processing occurs, an explicit heartbeat may not be needed.
 
 #### Batch jobs
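For readers of this hunk, the heartbeat idea it describes can be sketched in plain Python. This is only an illustration under assumptions: the `Stage` class, the `heartbeat` dictionary layout, and the stage names are hypothetical and do not come from the documentation or from any Prometheus client library.

```python
class Stage:
    """Hypothetical pipeline stage that remembers the newest heartbeat it has seen."""

    def __init__(self, name):
        self.name = name
        # Value a stage would export as a gauge; 0.0 means "no heartbeat seen yet".
        self.last_heartbeat_timestamp = 0.0

    def process(self, item):
        # Dummy heartbeat items carry the time at which they were inserted.
        if item.get("heartbeat"):
            self.last_heartbeat_timestamp = max(
                self.last_heartbeat_timestamp, item["inserted_at"]
            )
        return item


stages = [Stage("ingest"), Stage("transform"), Stage("store")]
heartbeat = {"heartbeat": True, "inserted_at": 1000.0}

# The heartbeat has passed the first two stages but has not reached "store" yet.
for stage in stages[:2]:
    heartbeat = stage.process(heartbeat)
```

Comparing each stage's exported timestamp with the current time shows how far the heartbeat has travelled and how long propagation is taking.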
@@ -97,7 +97,7 @@ Depending on how heavy the library is, track internal errors and
 latency within the library itself, and any general statistics you think may be
 useful.
 
-A library may be used by multiple independant parts of an application against
+A library may be used by multiple independent parts of an application against
 different resources, so take care to distinguish uses with labels where
 appropriate. For example, a database connection pool should distinguish the databases
 it is talking to, whereas there is no need to differentiate
@@ -109,9 +109,9 @@ As a general rule, for every line of logging code you should also have a
 counter that is incremented. If you find an interesting log message, you want to
 be able to see how often it has been happening and for how long.
 
-If there are multiple closely-related log messages in the same function (for example
+If there are multiple closely-related log messages in the same function (for example,
 different branches of an if or switch statement), it can sometimes make sense
-increment the same one counter for all of them.
+increment a single counter for all of them.
 
 It is also generally useful to export the total number of info/error/warning
 lines that were logged by the application as a whole, and check for significant
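As a rough illustration of the "total number of info/error/warning lines" advice in this hunk, the sketch below counts log records per level with the standard `logging` module. The `CountingHandler` class and the logger name are hypothetical; a real setup would presumably expose these counts through a metrics client rather than a plain `Counter`.

```python
import logging
from collections import Counter


class CountingHandler(logging.Handler):
    """Hypothetical handler that counts log records per level name."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def emit(self, record):
        # One increment per logged line, keyed by level ("info", "warning", "error").
        self.counts[record.levelname.lower()] += 1


logger = logging.getLogger("instrumentation_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example quiet; only our handler sees records
handler = CountingHandler()
logger.addHandler(handler)

logger.info("started")
logger.warning("slow request")
logger.error("request failed")
logger.error("request failed again")
```

Keying a single counter by level follows the same reasoning as using one counter for closely related log messages.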
@@ -144,7 +144,7 @@ gauge for how long the collection took in seconds and another for the number of
 errors encountered.
 
 This is one of the two cases when it is okay to export a duration as a gauge
-rather than a summary, the other being batch job durations. This is as both
+rather than a summary, the other being batch job durations. This is because both
 represent information about that particular push/scrape, rather than
 tracking multiple durations over time.
@@ -161,7 +161,7 @@ take advantage of them, so it takes a bit of getting used to.
 When you have multiple metrics that you want to add/average/sum, they should
 usually be one metric with labels rather than multiple metrics.
 
-For example, rather `http_responses_500_total` and `http_resonses_403_total`,
+For example, rather than `http_responses_500_total` and `http_resonses_403_total`,
 create a single metric called `http_responses_total` with a `code` label
 for the HTTP response code. You can then process the entire metric as one in
 rules and graphs.
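To make the advice in this hunk concrete, here is a minimal sketch in plain Python of one metric with a `code` label. The `LabeledCounter` class is a hypothetical stand-in written for this example, not the API of any Prometheus client library.

```python
from collections import defaultdict


class LabeledCounter:
    """Hypothetical counter with a single label; not a real client API."""

    def __init__(self, name, label):
        self.name = name
        self.label = label
        self._values = defaultdict(float)

    def inc(self, label_value, amount=1.0):
        # Counters only go up, so negative increments are rejected.
        if amount < 0:
            raise ValueError("counters can only be incremented")
        self._values[label_value] += amount

    def value(self, label_value):
        return self._values[label_value]

    def total(self):
        # Summing across label values is trivial because it is all one metric.
        return sum(self._values.values())


http_responses_total = LabeledCounter("http_responses_total", "code")
for code in ("200", "200", "403", "500", "500", "500"):
    http_responses_total.inc(code)
```

With separate per-code metrics, every new status code would need a new metric name and every sum would have to list them all by hand.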
@@ -209,15 +209,15 @@ never take a `rate()` of a gauge.
 Summaries are similar to having two counters. They track the number of events
 *and* the amount of something for each event, allowing you to calculate the
 average amount per event (useful for latency, for example). In addition,
-summaries can also export quantiles of the amounts, but note that quantiles are not
-aggregatable.
+summaries can also export quantiles of the amounts, but note that [quantiles are not
+aggregatable](http://latencytipoftheday.blogspot.de/2014/06/latencytipoftheday-you-cant-average.html).
 
 ### Timestamps, not time since
 
 If you want to track the amount of time since something happened, export the
 Unix timestamp at which it happened - not the time since it happened.
 
-With the timestamp exported, you can use `time() - my_timestamp_metric` to
+With the timestamp exported, you can use the expression `time() - my_timestamp_metric` to
 calculate the time since the event, removing the need for update logic and
 protecting you against the update logic getting stuck.
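The "timestamps, not time since" advice in this hunk can be sketched as follows. The `TimestampGauge` class and `seconds_since` helper are hypothetical names used only for illustration. In Prometheus itself the subtraction would be the query expression `time() - my_timestamp_metric`, not application code.

```python
import time


class TimestampGauge:
    """Hypothetical gauge that stores the Unix timestamp of the last event."""

    def __init__(self):
        self.value = 0.0

    def set_to_current_time(self, now=None):
        # The application only records *when* the event happened.
        self.value = time.time() if now is None else now


def seconds_since(gauge, now=None):
    # Mirrors the expression time() - my_timestamp_metric, evaluated at read time.
    current = time.time() if now is None else now
    return current - gauge.value


my_timestamp_metric = TimestampGauge()
my_timestamp_metric.set_to_current_time(now=1000.0)  # event happened at t=1000
age = seconds_since(my_timestamp_metric, now=1090.0)  # read 90 seconds later
```

Because the age is computed when the value is read, nothing in the application has to keep updating a "time since" value, so there is no update loop that can get stuck.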