Commit db530b9b authored Jan 06, 2015 by Julius Volz
Bring getting-started tutorial up-to-date.
parent fc7df310
Showing 1 changed file with 56 additions and 44 deletions

content/docs/introduction/getting_started.md  +56 -44
@@ -6,10 +6,10 @@ sort_rank: 3
 # Getting started

 This guide is a "Hello World"-style tutorial which shows how to install,
-configure, and use Prometheus in a simple example setup. You'll build and run
+configure, and use Prometheus in a simple example setup. You will build and run
 Prometheus locally, configure it to scrape itself and an example application,
 and then work with queries, rules, and graphs to make use of the collected time
 series data.

 ## Getting Prometheus
@@ -36,7 +36,7 @@ endpoints on these targets. Since Prometheus also exposes data in the same
 manner about itself, it may also be used to scrape and monitor its own health.

 While a Prometheus server which collects only data about itself is not very
-useful in practice, it's a good starting example. Save the following basic
+useful in practice, it is a good starting example. Save the following basic
 Prometheus configuration as a file named `prometheus.conf`:

 ```
@@ -61,7 +61,8 @@ job: {
   # Override the global default and scrape targets from this job every 5 seconds.
   scrape_interval: "5s"

-  # Let's define a group of targets to scrape for this job. In this case, only one.
+  # Let's define a group of static targets to scrape for this job. In this
+  # case, only one.
   target_group: {
     # These endpoints are scraped via HTTP.
     target: "http://localhost:9090/metrics"
@@ -96,34 +97,34 @@ navigating to its metrics exposure endpoint: http://localhost:9090/metrics
 Let's try looking at some data that Prometheus has collected about itself. To
 use Prometheus's built-in expression browser, navigate to
-http://localhost:9090/ and choose the "Tabular" view within the "Graph"
+http://localhost:9090/graph and choose the "Tabular" view within the "Graph"
 tab.

 As you can gather from http://localhost:9090/metrics, one metric that
 Prometheus exports about itself is called
-`prometheus_target_operation_latency_milliseconds`. Go ahead and enter this into the
+`prometheus_target_interval_length_seconds` (the actual amount of time between
+target scrapes). Go ahead and enter this into the
 expression console:

 ```
-prometheus_target_operation_latency_milliseconds
+prometheus_target_interval_length_seconds
 ```

 This should return a lot of different time series (along with the latest value
 recorded for each), all with the metric name
-`prometheus_target_operation_latency_milliseconds`, but with different labels. These
-labels designate different latency percentiles and operation outcomes.
+`prometheus_target_interval_length_seconds`, but with different labels. These
+labels designate different latency percentiles and target group intervals.

-To count the number of returned time series, you could write:
+If we were only interested in the 99th percentile latencies, we could use this
+query to retrieve that information:

 ```
-count(prometheus_target_operation_latency_milliseconds)
+prometheus_target_interval_length_seconds{quantile="0.99"}
 ```

-If we were only interested in the 99th percentile latencies for scraping
-Prometheus itself, we could use this query to retrieve that information:
+To count the number of returned time series, you could write:

 ```
-prometheus_target_operation_latency_milliseconds{instance="http://localhost:9090/metrics", quantile="0.99"}
+count(prometheus_target_interval_length_seconds)
 ```

 For further details about the expression language, see the
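Beyond filtering by the `quantile` label, the same metric can also be fed into an aggregation operator, for example to average the 99th-percentile interval length over all returned series. This is an illustrative query to try in the console, not something added by this commit:

```
avg(prometheus_target_interval_length_seconds{quantile="0.99"})
```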
@@ -131,14 +132,14 @@ For further details about the expression language, see the
 ## Using the graphing interface

-To graph expressions, navigate to http://localhost:9090/ and use the "Graph"
+To graph expressions, navigate to http://localhost:9090/graph and use the "Graph"
 tab.

-For example, enter the following expression to graph all latency percentiles
-for scraping Prometheus itself operations:
+For example, enter the following expression to graph the per-second rate of all
+storage chunk operations happening in the self-scraped Prometheus:

 ```
-prometheus_target_operation_latency_milliseconds{instance="http://localhost:9090/metrics"}
+rate(prometheus_local_storage_chunk_ops_total[1m])
 ```

 Experiment with the graph range parameters and other settings.
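Summing that per-second rate over all chunk-operation series gives a single total series, which can be another useful view in the graph tab. Again, this is an illustrative query rather than one introduced by this commit:

```
sum(rate(prometheus_local_storage_chunk_ops_total[1m]))
```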
@@ -148,28 +149,31 @@ Experiment with the graph range parameters and other settings.
 Let's make this more interesting and start some example targets for Prometheus
 to scrape.

-Download the Go client library for Prometheus, and run some random examples
-from it that export time series with random data:
+The Go client library includes an example which exports fictional RPC latencies
+for three services with different latency distributions.
+
+Download the Go client library for Prometheus and run three of these example
+processes:

 ```bash
-# Fetch the client library code:
+# Fetch the client library code.
 git clone git@github.com:/prometheus/client_golang

-# You might also want to do this if you didn't download the above repo into your Go package path already:
-go get github.com/prometheus/client_golang
+# Change to the random RPC example.
+cd client_golang/examples/random
+
+# Assuming a working Go setup, fetch necessary dependencies.
+go get -d

 # Start 3 example targets in screen sessions:
-cd client_golang/examples/random
-go run main.go -listeningAddress=:8080
-go run main.go -listeningAddress=:8081
-go run main.go -listeningAddress=:8082
+go run main.go -listen-address=:8080
+go run main.go -listen-address=:8081
+go run main.go -listen-address=:8082
 ```

 You should now have example targets listening on http://localhost:8080/metrics,
 http://localhost:8081/metrics, and http://localhost:8082/metrics.

-TODO: These examples don't exist anymore. Provide alternatives.
-
 ## Configuring Prometheus to monitor the sample targets

 Now we'll configure Prometheus to scrape these new targets. Let's group all
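Before wiring these targets into the Prometheus configuration, it can be worth confirming that the example processes are really exporting data. A quick check (a suggested command, not part of this commit) is to fetch one metrics page and look for the RPC duration series referenced later in this diff:

```bash
# Fetch the first example target's metrics page and show its RPC duration
# series; the metric name is the one referenced later in this tutorial.
curl -s http://localhost:8080/metrics | grep rpc_durations_microseconds
```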
@@ -186,6 +190,7 @@ restart your Prometheus instance:
 ```
 job: {
   name: "random-example"
+  scrape_interval: "5s"

   # The "production" targets for this job.
   target_group: {
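For orientation, the surrounding `random-example` job presumably continues roughly as sketched below. Only the job name, the new `scrape_interval`, the "production" comment, and (from the earlier section) the three target URLs are visible in this diff; splitting them into a "production" group of two targets plus a separate third group, and any per-group label settings, are assumptions.

```
job: {
  name: "random-example"
  scrape_interval: "5s"

  # The "production" targets for this job.
  target_group: {
    target: "http://localhost:8080/metrics"
    target: "http://localhost:8081/metrics"
  }

  # A second target group for the remaining example process (assumed; not
  # visible in this diff).
  target_group: {
    target: "http://localhost:8082/metrics"
  }
}
```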
@@ -212,27 +217,32 @@ job: {
 ```

 Go to the expression browser and verify that Prometheus now has information
-about time series that these example endpoints expose, e.g. the
-`rpc_calls_total` metric.
+about time series that these example endpoints expose, such as the
+`rpc_durations_microseconds` metric.

 ## Configure rules for aggregating scraped data into new time series

-Queries that aggregate over thousands of time series can get slow when computed
-ad-hoc. To make this more efficient, Prometheus allows you to prerecord
-expressions into completely new persisted time series via configured recording
-rules. Let's say we're interested in recording the per-second rate of
-`rpc_calls_total` averaged over all instances as measured over the last 5
+Though not a problem in our example, queries that aggregate over thousands of
+time series can get slow when computed ad-hoc. To make this more efficient,
+Prometheus allows you to prerecord expressions into completely new persisted
+time series via configured recording rules. Let's say we are interested in
+recording the per-second rate of example RPCs
+(`rpc_durations_microseconds_count`) averaged over all instances (but
+preserving the `job` and `service` dimensions) as measured over a window of 5
 minutes. We could write this as:

 ```
-avg(rate(rpc_calls_total[5m]))
+avg(rate(rpc_durations_microseconds_count[5m])) by (job, service)
 ```

-To record this expression as a new time series called `job:rpc_calls:avg_rate5m`, create a
-file with the following recording rule and save it as `prometheus.rules`:
+Try graphing this expression.
+
+To record the time series resulting from this expression into a new metric
+called `job_service:rpc_durations_microseconds_count:avg_rate5m`, create a file
+with the following recording rule and save it as `prometheus.rules`:

 ```
-job:rpc_calls:avg_rate5m = avg(rate(rpc_calls_total[5m]))
+job_service:rpc_durations_microseconds_count:avg_rate5m = avg(rate(rpc_durations_microseconds_count[5m])) by (job, service)
 ```

 To make Prometheus pick up this new rule, add a `rule_files` statement to the
@@ -255,8 +265,10 @@ global: {
   # Load and evaluate rules in this file every 'evaluation_interval' seconds. This field may be repeated.
   rule_file: "prometheus.rules"
 }
+[...]
 ```

 Restart Prometheus with the new configuration and verify that a new time series
-with the metric name `job:rpc_calls:avg_rate5m` is now available by querying it
+with the metric name `job_service:rpc_durations_microseconds_count:avg_rate5m`
+is now available by querying it
 through the expression browser or graphing it.
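For that verification step, entering the recorded metric's name in the expression browser is enough, for example:

```
job_service:rpc_durations_microseconds_count:avg_rate5m
```

Graphing it should show the same shape as the ad-hoc `avg(rate(...)) by (job, service)` expression from the earlier hunk, since the recording rule simply persists that expression's result under the new name.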