Commit e1cbc218 authored by Julius Volz

Reorganize and improve documentation section.

parent 831d3761
<div class="row"> <div class="col-md-8 col-md-offset-2 doc-content">
<div class="col-md-4"> <h1>Community</h1>
TODO: Add some text about the community here. <p>
</div> Prometheus is developed in the open and has a growing community outside of
SoundCloud. Here are some of the channels we use to communicate and
contribute:
</p>
<p>
<strong>Mailing list:</strong>
<a href="https://groups.google.com/forum/#!forum/prometheus-developers">prometheus-developers</a> Google Group
</p>
<p>
<strong>IRC:</strong> <code>#prometheus</code> on <a href="http://freenode.net/">irc.freenode.net</a>
</p>
<p>
<strong>Issue tracker:</strong> We use the GitHub issue tracker for the various <a href="http://github.com/prometheus">Prometheus repositories</a>
</p>
<h1>Contributing</h1>
<p>
We welcome community contributions! Please see the
<code>CONTRIBUTING.md</code> file in the respective Prometheus repository
for instructions on how to submit changes. If you are planning on making
more elaborate or controversial changes, please discuss them on the mailing
list before sending a pull request.
</p>
<h1>Sponsorship</h1>
<p>
Prometheus was initially started privately by
<a href="https://github.com/matttproud">Matt Proud</a> and
<a href="https://github.com/juliusv">Julius Volz</a>. The majority of its
development has been sponsored by <a href="https://soundcloud.com">SoundCloud</a>.
</p>
</div> </div>
--- ---
title: Automatic labels and synthetic metrics title: Automatic labels and synthetic metrics
sort_rank: 2 sort_rank: 3
--- ---
# Automatic labels and synthetic metrics # Automatic labels and metrics
## Automatically attached labels ## Automatically attached labels
When Prometheus scrapes a target, it attaches some labels automatically to the When Prometheus scrapes a target, it attaches some labels automatically to the
scraped metrics timeseries which serve to identify the scraped target: scraped metrics timeseries which serve to identify the scraped target:
* `job`: The Prometheus job name from which the timeseries was scraped. * `job`: The configured Prometheus job name for which the target was scraped.
* `instance`: The specific instance/endpoint of the job which was scraped. * `instance`: The specific URL of the instance's endpoint that was scraped.
If either of these labels are already present in the scraped data, they are not If either of these labels are already present in the scraped data, they are not
replaced. Instead, Prometheus adds its own labels with `exporter_` prepended to replaced. Instead, Prometheus adds its own labels with `exporter_` prepended to
......
--- ---
title: Concepts title: Concepts
sort_rank: 4 sort_rank: 2
nav_icon: flask nav_icon: flask
--- ---
--- ---
title: Metric types title: Metric types
sort_rank: 1 sort_rank: 2
--- ---
# Metric Types # Metric Types
...@@ -16,8 +16,8 @@ right one for the right job. ...@@ -16,8 +16,8 @@ right one for the right job.
Metric types are currently only differentiated in the client libraries (to Metric types are currently only differentiated in the client libraries (to
enable APIs tailored to the usage of the specific types) and in the wire enable APIs tailored to the usage of the specific types) and in the wire
protocol. The Prometheus server does not yet persist and make use of the type protocol. The Prometheus server does not yet make use of the type information
information after ingesting samples. This may change in the future, however. after ingesting samples as timeseries. This may change in the future.
## Counter ## Counter
......
---
title: Timeseries
sort_rank: 1
---
# Timeseries
TODO: explain how timeseries are identified and stored.
--- ---
title: Instrumenting your code title: Start
sort_rank: 2 sort_rank: 1
--- ---
# Instrumenting your code # Instrumenting your code
...@@ -10,7 +10,7 @@ instrumentation, you will need to instrument your application's code via one of ...@@ -10,7 +10,7 @@ instrumentation, you will need to instrument your application's code via one of
the Prometheus client libraries. the Prometheus client libraries.
First, familiarize yourself with the Prometheus-supported First, familiarize yourself with the Prometheus-supported
[metrics types](/concepts/metric_types/). To use these types programmatically, see [metric types](/docs/concepts/metric_types/). To use these types programmatically, see
your specific client library's documentation. your specific client library's documentation.
Choose a Prometheus client library that matches the language in which your Choose a Prometheus client library that matches the language in which your
......
---
title: Instrumenting
sort_rank: 4
nav_icon: code
---
--- ---
title: Starter codelab title: Starter codelab
sort_rank: 1 sort_rank: 3
--- ---
# Intro Codelab # Intro codelab
This guide is a "Hello World"-style codelab which shows how to install, This guide is a "Hello World"-style codelab which shows how to install,
configure, and use Prometheus in a simple example setup. You'll build and run configure, and use Prometheus in a simple example setup. You'll build and run
...@@ -29,7 +29,7 @@ cd prometheus ...@@ -29,7 +29,7 @@ cd prometheus
make build make build
``` ```
## Configuring Prometheus to Monitor Itself ## Configuring Prometheus to monitor itself
Prometheus collects metrics from monitored targets by scraping metrics HTTP Prometheus collects metrics from monitored targets by scraping metrics HTTP
endpoints on these targets. Since Prometheus also exposes data in the same endpoints on these targets. Since Prometheus also exposes data in the same
...@@ -69,11 +69,10 @@ job: { ...@@ -69,11 +69,10 @@ job: {
} }
``` ```
As you might have noticed, Prometheus configuration is supplied in an ASCII Prometheus configuration is supplied in an ASCII form of [protocol
form of buffers](https://developers.google.com/protocol-buffers/docs/overview). The
[protocol buffers](https://developers.google.com/protocol-buffers/docs/overview). [schema definition](https://github.com/prometheus/prometheus/blob/master/config/config.proto)
The protocol buffer schema definition has a [complete documentation of all has a complete documentation of all available configuration options.
available configuration options](https://github.com/prometheus/prometheus/blob/master/config/config.proto).
## Starting Prometheus ## Starting Prometheus
...@@ -91,62 +90,60 @@ http://localhost:9090. Give it a couple of seconds to start collecting data ...@@ -91,62 +90,60 @@ http://localhost:9090. Give it a couple of seconds to start collecting data
about itself from its own HTTP metrics endpoint. about itself from its own HTTP metrics endpoint.
You can also verify that Prometheus is serving metrics about itself by You can also verify that Prometheus is serving metrics about itself by
navigating to its metrics exposure endpoint: [[http://localhost:9090/metrics]] navigating to its metrics exposure endpoint: http://localhost:9090/metrics
## Using the Expression Browser ## Using the expression browser
Let's try looking at some data that Prometheus has collected about itself. To Let's try looking at some data that Prometheus has collected about itself. To
use Prometheus' built-in expression browser, navigate to use Prometheus's built-in expression browser, navigate to
[[http://localhost:9090/]] and choose the "Tabular" from the "Graph & Console" http://localhost:9090/ and choose the "Tabular" view within the "Graph"
tab. tab.
As you can gather from [[http://localhost:9090/metrics]], one metric that As you can gather from http://localhost:9090/metrics, one metric that
Prometheus exports about itself is called Prometheus exports about itself is called
`prometheus_metric_disk_latency_microseconds`. Go ahead and enter this into the `prometheus_target_operation_latency_milliseconds`. Go ahead and enter this into the
expression console: expression console:
``` ```
prometheus_metric_disk_latency_microseconds prometheus_target_operation_latency_milliseconds
``` ```
This should return a lot of different timeseries (along with the latest value This should return a lot of different timeseries (along with the latest value
recorded for each), all with the metric name recorded for each), all with the metric name
`prometheus_metric_disk_latency_microseconds`, but with different labels. These `prometheus_target_operation_latency_milliseconds`, but with different labels. These
labels designate different latency percentiles, operation types, and operation labels designate different latency percentiles and operation outcomes.
results (success, failure).
To count the number of returned timeseries, you could write: To count the number of returned timeseries, you could write:
``` ```
count(prometheus_metric_disk_latency_microseconds) count(prometheus_target_operation_latency_milliseconds)
``` ```
If we were only interested in the 99th percentile latencies for e.g. If we were only interested in the 99th percentile latencies for scraping
`get_value_at_time` operations, we could use this query to retrieve that Prometheus itself, we could use this query to retrieve that information:
information:
``` ```
prometheus_metric_disk_latency_microseconds{operation="get_value_at_time", percentile="0.990000"} prometheus_target_operation_latency_milliseconds{instance="http://localhost:9090/metrics", quantile="0.99"}
``` ```
For further details about the expression language, see the [[Expression Language]] For further details about the expression language, see the
documentation. [expression language documentation](/docs/querying/basics).
## Using the Graphing Interface ## Using the graphing interface
To graph expressions, navigate to [[http://localhost:9090/]] and use the To graph expressions, navigate to http://localhost:9090/ and use the "Graph"
"Graph" tab. tab.
For example, enter the following expression to graph all latency percentiles For example, enter the following expression to graph all latency percentiles
for `get_value_at_time` operations in Prometheus: for scraping Prometheus itself operations:
``` ```
prometheus_metric_disk_latency_microseconds{operation="get_value_at_time"} prometheus_target_operation_latency_milliseconds{instance="http://localhost:9090/metrics"}
``` ```
Experiment with the graph range parameters and other settings. Experiment with the graph range parameters and other settings.
## Starting Up Some Sample Targets ## Starting up some sample targets
Let's make this more interesting and start some example targets for Prometheus Let's make this more interesting and start some example targets for Prometheus
to scrape. to scrape.
...@@ -168,14 +165,15 @@ go run main.go -listeningAddress=:8081 ...@@ -168,14 +165,15 @@ go run main.go -listeningAddress=:8081
go run main.go -listeningAddress=:8082 go run main.go -listeningAddress=:8082
``` ```
You should now have example targets listening on You should now have example targets listening on http://localhost:8080/metrics,
[[http://localhost:8080/metrics]], [[http://localhost:8081/metrics]], and http://localhost:8081/metrics, and http://localhost:8082/metrics.
[[http://localhost:8082/metrics]].
TODO: These examples don't exist anymore. Provide alternatives.
## Configuring Prometheus to Monitor the Sample Targets ## Configuring Prometheus to monitor the sample targets
Now we'll configure Prometheus to scrape these new targets. Let's group these Now we'll configure Prometheus to scrape these new targets. Let's group all
three endpoints into a job we call `random-example`. However, imagine that the three endpoints into one job called `random-example`. However, imagine that the
first two endpoints are production targets, while the third one represents a first two endpoints are production targets, while the third one represents a
canary instance. To model this in Prometheus, we can add several groups of canary instance. To model this in Prometheus, we can add several groups of
endpoints to a single job, adding extra labels to each group of targets. In endpoints to a single job, adding extra labels to each group of targets. In
...@@ -217,11 +215,11 @@ Go to the expression browser and verify that Prometheus now has information ...@@ -217,11 +215,11 @@ Go to the expression browser and verify that Prometheus now has information
about timeseries that these example endpoints expose, e.g. the about timeseries that these example endpoints expose, e.g. the
`rpc_calls_total` metric. `rpc_calls_total` metric.
## Configure Rules For Aggregating Scraped Data into New Timeseries ## Configure rules for aggregating scraped data into new timeseries
Manually entering expressions every you time you need them can get cumbersome Queries that aggregate over thousands of timeseries can get slow when computed
and might also be slow to compute in some cases. Prometheus allows you to ad-hoc. To make this more efficient, Prometheus allows you to prerecord
periodically record expressions into completely new timeseries via configured expressions into completely new persisted timeseries via configured recording
rules. Let's say we're interested in recording the per-second rate of rules. Let's say we're interested in recording the per-second rate of
`rpc_calls_total` averaged over all instances as measured over the last 5 `rpc_calls_total` averaged over all instances as measured over the last 5
minutes. We could write this as: minutes. We could write this as:
......
--- ---
title: Download and install title: Installing
sort_rank: 2 sort_rank: 2
--- ---
# Download and Install Prometheus # Installing
## Downloading ## Using pre-compiled binaries
## Installing We plan on providing precompiled binaries for various platforms and even
packages for common Linux distributions soon. Once those are offered, it
will be the recommended way of installing Prometheus.
## From source
For building Prometheus from source, see the relevant [`README.md` section](https://github.com/prometheus/prometheus/blob/master/README.md#use-make).
## Using Docker
TODO: Add docker instructions.
...@@ -3,16 +3,18 @@ title: Overview ...@@ -3,16 +3,18 @@ title: Overview
sort_rank: 1 sort_rank: 1
--- ---
# Overview
## What is Prometheus? ## What is Prometheus?
[Prometheus](https://github.com/prometheus) is an open-source systems [Prometheus](https://github.com/prometheus) is an open-source systems
monitoring and alerting toolkit built at [SoundCloud](http://soundcloud.com). monitoring and alerting toolkit built at [SoundCloud](http://soundcloud.com).
Since its inception in 2012, it has become the standard for instrumenting new Since its inception in 2012, it has become the standard for instrumenting new
services at SoundCloud. Prometheus' main distinguishing features as compared to services at SoundCloud and has seen growing external usage and contributions.
other monitoring systems are: Prometheus's main distinguishing features are:
- a **multi-dimensional** data model (via key/value pairs attached to timeseries) - a **multi-dimensional** data model (timeseries identified by metric name and key/value pairs)
- a [**flexible query language**](http://localhost:3000/using/querying/basics/) - a [**flexible query language**](/docs/using/querying/basics/)
to leverage this dimensionality to leverage this dimensionality
- no reliance on distributed storage; **single server nodes are autonomous** - no reliance on distributed storage; **single server nodes are autonomous**
- timeseries collection happens via a **pull model** over HTTP - timeseries collection happens via a **pull model** over HTTP
......
--- ---
title: Operating title: Operating
sort_rank: 3 sort_rank: 5
nav_icon: cog nav_icon: cog
--- ---
--- ---
title: Best practices title: Best practices
sort_rank: 5 sort_rank: 6
nav_icon: thumbs-o-up nav_icon: thumbs-o-up
--- ---
---
title: Metric and label naming
sort_rank: 1
---
# Metric and label naming
The metric and label conventions presented in this document are not required
for using Prometheus, but can serve as both a style guide and a collection of
best practices. Individual organizations might want to approach some of these,
e.g. naming conventions, differently.
## Metric Names
A metric name:
* should have a (single-word) application prefix relevant to the containing Prometheus domain
  * `prometheus_notifications_total`
  * `indexer_requests_latencies_milliseconds`
  * `processor_requests_total`
* must have a single unit (i.e. don't mix seconds with milliseconds)
* should have a units suffix
  * `api_http_request_latency_milliseconds`
  * `node_memory_usage_bytes`
  * `api_http_requests_total` (for an accumulating count)
* should represent the same logical thing-being-measured
  * request duration
  * bytes of data transfer
  * instantaneous resource usage as a percentage
As a rule of thumb, if you `sum()` or `avg()` over all dimensions of a given
metric, the result should be meaningful (though not necessarily useful). If it
isn't meaningful, split the data up into multiple metrics. For example, having
the capacity of various queues in one metric is good; mixing the capacity of a
queue with the number of elements in it is not.
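The naming conventions above could be checked mechanically. The following is a minimal sketch only, not part of Prometheus or any client library: the function name `check_metric_name` and the `UNIT_SUFFIXES` tuple are hypothetical, and the suffix list is an assumption drawn from the example names in this document.

```python
# Assumed unit suffixes, collected from the example metric names in this
# document. This list is illustrative, not an official Prometheus list.
UNIT_SUFFIXES = (
    "_total",
    "_bytes",
    "_seconds",
    "_milliseconds",
    "_microseconds",
    "_nanoseconds",
)


def check_metric_name(name):
    """Return a list of convention warnings for a metric name (empty if none)."""
    warnings = []

    # An application prefix means at least "<prefix>_<rest>".
    prefix, sep, rest = name.partition("_")
    if not (prefix and sep and rest):
        warnings.append("no application prefix")

    # str.endswith accepts a tuple and matches any of its entries.
    if not name.endswith(UNIT_SUFFIXES):
        warnings.append("no unit suffix")

    return warnings


if __name__ == "__main__":
    for candidate in ("prometheus_notifications_total", "api_http_requests", "requests"):
        print(candidate, check_metric_name(candidate))
```

A check like this only covers the syntactic conventions. Whether a metric represents a single logical thing-being-measured still needs human judgment.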
## Labels
Use labels to differentiate
* class of thing-being-measured
  * `api_http_requests_total` - differentiate request types: `type={create,update,delete}`
  * `api_request_duration_nanoseconds` - differentiate request stages: `stage={extract,transform,load}`
Remember that every unique (label, value) pair represents a new axis of
cardinality for the associated metric, which can dramatically increase the
amount of data stored.
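To get a feeling for how quickly labels multiply, here is a rough sketch that computes an upper bound on the number of timeseries for one metric as the product of the distinct values of each label. The helper name `max_series` and the `instance` values are hypothetical. The real number of stored timeseries depends on which label combinations actually occur.

```python
from math import prod


def max_series(label_values):
    """Upper bound on timeseries for one metric.

    label_values maps each label name to the list of values it can take.
    A metric without labels yields exactly one timeseries.
    """
    return prod(len(values) for values in label_values.values())


labels = {
    "type": ["create", "update", "delete"],
    "instance": ["host-a", "host-b"],  # hypothetical instances
}

if __name__ == "__main__":
    print(max_series(labels))

    # Adding one more label with three values triples the upper bound.
    labels["stage"] = ["extract", "transform", "load"]
    print(max_series(labels))
```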
--- ---
title: Pushing data title: Pushing data
sort_rank: 6 sort_rank: 2
--- ---
# Pushing Data # Pushing Data
......
--- ---
title: The basics title: Basics
sort_rank: 1 sort_rank: 1
--- ---
...@@ -15,7 +15,7 @@ consumed and further processed by external systems via the HTTP API. ...@@ -15,7 +15,7 @@ consumed and further processed by external systems via the HTTP API.
## Examples ## Examples
This document is meant as a reference. For learning, it might be easier to This document is meant as a reference. For learning, it might be easier to
start with a couple of examples. See the [Expression Language Examples](/using/querying/examples). start with a couple of [examples](/docs/using/querying/examples).
## Basic Concepts ## Basic Concepts
......
--- ---
title: Query language title: Query language
sort_rank: 3 sort_rank: 2
nav_icon: search
--- ---
---
title: Using
sort_rank: 2
nav_icon: line-chart
---
---
title: Expression browser
sort_rank: 1
---
# Expression browser
TODO: Add content.
---
title: Console templates
sort_rank: 3
---
# Console templates
TODO: Add content.
--- ---
title: Graphing and dashboards title: Visualization
sort_rank: 3 sort_rank: 3
nav_icon: search
--- ---
---
title: PromDash
sort_rank: 2
---
# Console templates
TODO: Add content.
...@@ -22,7 +22,7 @@ title: Home ...@@ -22,7 +22,7 @@ title: Home
<div class="row"> <div class="row">
<div class="col-md-4"> <div class="col-md-4">
<h2><i class="fa fa-cloud-download"></i> Pull Model</h2> <h2><i class="fa fa-cloud-download"></i> Pull Model</h2>
<p class="desc">Prometheus collects timeseries data by scraping instrumented services via HTTP. Short-lived jobs are supported via a push gateway.</p> <p class="desc">Prometheus collects timeseries data by scraping instrumented services. This allows Prometheus to detect when targets are down.</p>
<p><a class="btn btn-default" href="#" role="button">View details &raquo;</a></p> <p><a class="btn btn-default" href="#" role="button">View details &raquo;</a></p>
</div> </div>
<div class="col-md-4"> <div class="col-md-4">
...@@ -32,7 +32,7 @@ title: Home ...@@ -32,7 +32,7 @@ title: Home
</div> </div>
<div class="col-md-4"> <div class="col-md-4">
<h2><i class="fa fa-cog"></i> Operation</h2> <h2><i class="fa fa-cog"></i> Operation</h2>
<p class="desc">Prometheus servers are autonomous, with no dependency on distributed storage. Written in Go, all binaries are statically linked and easy to deploy.</p> <p class="desc">Each server is independent for reliability, relying only on local storage. Written in Go, all binaries are statically linked and easy to deploy.</p>
<p><a class="btn btn-default" href="#" role="button">View details &raquo;</a></p> <p><a class="btn btn-default" href="#" role="button">View details &raquo;</a></p>
</div> </div>
</div> </div>
......