Commit e8fdc6e1 authored by Julius Volz

Improve SD blog post wording.

parent 60304257
---
title: Advanced Service Discovery Features in Prometheus 0.14.0
created_at: 2015-06-01
kind: article
author: Fabian Reinartz, Julius Volz
---
This week we released Prometheus v0.14.0 — a version with many long-awaited additions
and improvements.

On the user side, Prometheus now supports new service discovery mechanisms. In
addition to DNS-SRV records, it now supports [Consul](https://www.consul.io)
out of the box, and a file-based interface allows you to connect your own
discovery mechanisms. Over time, we plan to add other common service discovery
mechanisms to Prometheus.
Aside from many smaller fixes and improvements, you can now also reload your configuration during
runtime by sending a `SIGHUP` to the Prometheus process. For a full list of changes, check the
[changelog for this release](https://github.com/prometheus/prometheus/blob/master/CHANGELOG.md#0140--2015-06-01).
In this blog post, we will take a closer look at the built-in service discovery mechanisms and provide
some practical examples. As an additional resource, see
[Prometheus's configuration documentation](/docs/operating/configuration).
## Prometheus and targets
For a proper understanding of this blog post, we first need to take a look at how
Prometheus labels targets.

There are various places in the configuration file where target labels may be
set. They are applied in the following order, with later stages overwriting any
labels set by an earlier stage:

1. Global labels, which are assigned to every target scraped by the Prometheus instance.
2. The `job` label, which is configured as a default value for each scrape configuration.
3. Labels that are set per target group within a scrape configuration.
4. Advanced label manipulation via [_relabeling_](/docs/operating/configuration/#relabeling-relabel_config).

Each stage overwrites any colliding labels from the earlier stages. Eventually, we have a flat
set of labels that describe a single target. Those labels are then attached to every time series that
is scraped from this target.

Note: Internally, even the address of a target is stored in a special
`__address__` label. This can be useful during advanced label manipulation
(relabeling), as we will see later. Labels starting with `__` do not appear in
the final time series.
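As a quick sketch of the first stage, global labels in the new YAML configuration
might look as follows (the `monitor` label name and its value here are purely
illustrative):

```
global:
  scrape_interval: 15s
  # Attached to every target scraped by this Prometheus instance (stage 1 above).
  labels:
    monitor: 'eu-west-monitor'
```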
## Scrape configurations and relabeling
Aside from moving from an ASCII protocol buffer format to YAML, a fundamental change to
Prometheus's configuration is the switch from per-job configurations to more generalized scrape
configurations. While the two are almost equivalent for simple setups, scrape configurations
allow for greater flexibility in more advanced use cases.
Each scrape configuration defines a job name which serves as a default value for the
`job` label. The `job` label can then be redefined for entire target groups or individual targets.
For example, we can define two target groups, each of which defines targets for one job.
To scrape them with the same parameters, we can configure them as follows (the scrape
parameters and target addresses are illustrative):
```
scrape_configs:
- job_name: 'overwritten-default'

  scrape_interval: 10s
  scrape_timeout:  5s

  target_groups:
  - targets: ['10.1.200.130:5051', '10.1.200.134:5051']
    labels:
      job: 'job1'

  - targets: ['10.1.200.130:6220', '10.1.200.135:6221']
    labels:
      job: 'job2'
```
Through a mechanism named [_relabeling_](http://prometheus.io/docs/operating/configuration/#relabeling-relabel_config),
any label can be removed, created, or modified on a per-target level. This
enables fine-grained labeling that can also take into account metadata coming
from the service discovery. Relabeling is the last stage of label assignment
and overwrites any labels previously set.
Relabeling works as follows:

- A list of source labels is defined.
- For each target, the values of those labels are concatenated with a separator.
- A regular expression is matched against the resulting string.
- A new value based on those matches is assigned to another label.
Multiple relabeling rules can be defined for each scrape configuration. A simple rule
that squashes two labels into one could look as follows:
```
relabel_configs:
- source_labels: ['label_a', 'label_b']
  # The two source label values are joined with the separator before matching.
  separator:     ';'
  regex:         '(.*);(.*)'
  target_label:  'label_c'
  replacement:   '$1-$2'
```

This rule transforms a target with the label set:

```
{
  "job": "job1",
  "label_a": "foo",
  "label_b": "bar"
}
```
...into a target with the label set:
```
{
  "job": "job1",
  "label_a": "foo",
  "label_b": "bar",
  "label_c": "foo-bar"
}
```
You could then also remove the source labels in an additional relabeling step.
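A minimal sketch of such a cleanup step, assuming that a label relabeled to the
empty string is treated as unset and therefore dropped:

```
relabel_configs:
# Hypothetical follow-up rules: overwriting the source labels with empty
# values removes them from the target.
- source_labels: ['label_a']
  regex:         '(.*)'
  target_label:  'label_a'
  replacement:   ''
- source_labels: ['label_b']
  regex:         '(.*)'
  target_label:  'label_b'
  replacement:   ''
```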
You can read more about relabeling and how you can use it to filter targets in the
[configuration documentation](/docs/operating/configuration#relabeling-relabel_config).
Over the next sections, we will see how you can leverage relabeling when using service discovery.
## Discovery with DNS-SRV records
Since the beginning, Prometheus has supported target discovery via DNS-SRV records.
The respective configuration looked something like this:
```
job {
  name: "api-server"
  sd_name: "telemetry.eu-west.api.srv.example.org"
}
```
Prometheus 0.14.0 allows you to specify multiple SRV records to be queried in a
single scrape configuration, and also provides service-discovery-specific meta
information that is helpful during the relabeling phase.
When querying the DNS-SRV records, a label named `__meta_dns_srv_name` is
attached to each target. Its value is set to the SRV record name for which it was
returned. If we have structured SRV record names like `telemetry.<zone>.<job>.srv.example.org`,
we can extract relevant labels from those names:
```
scrape_configs:
- job_name: 'myjob'

  dns_sd_configs:
  - names:
    - 'telemetry.eu-west.api.srv.example.org'
    - 'telemetry.eu-west.auth.srv.example.org'
    - 'telemetry.us-east.api.srv.example.org'

  relabel_configs:
  # Extract the zone and job components from the record name. The additional
  # record names above and the exact regexes here are illustrative.
  - source_labels: ['__meta_dns_srv_name']
    regex:         'telemetry\.([^.]+)\.[^.]+\.srv\.example\.org'
    target_label:  'zone'
    replacement:   '$1'
  - source_labels: ['__meta_dns_srv_name']
    regex:         'telemetry\.[^.]+\.([^.]+)\.srv\.example\.org'
    target_label:  'job'
    replacement:   '$1'
```

This attaches `zone` and `job` labels to each target based on the SRV record
it came from.
## Discovery with Consul
Service discovery via Consul is now supported natively. It can be configured by defining
access parameters for our Consul agent and a list of Consul services for which we want
to query targets.
The tags of each Consul node are concatenated by a configurable separator and exposed
through the `__meta_consul_tags` label. Various other Consul-specific meta
labels are also provided.
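For example, a target discovered for the `api` service might carry a label set
along these lines before relabeling (the concrete values are made up for
illustration; note how the tag list is wrapped in the separator on both sides):

```
{
  "__address__": "10.1.10.12:8080",
  "__meta_consul_node": "node-1",
  "__meta_consul_service": "api",
  "__meta_consul_tags": ",production,primary,"
}
```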
Scraping all instances for a list of given services can be achieved with a simple
`consul_sd_config` and relabeling rules:
```
scrape_configs:
- job_name: 'overwritten-default'

  consul_sd_configs:
  - server:   '127.0.0.1:5361'
    services: ['auth', 'api', 'load-balancer', 'postgres']

  relabel_configs:
  # One way to achieve the behavior described below; the exact rules are
  # illustrative. Derive a 'group' label from the production/canary tags.
  - source_labels: ['__meta_consul_tags']
    regex:         '.*,(production|canary),.*'
    target_label:  'group'
    replacement:   '$1'
  # Use the Consul service name as the job label.
  - source_labels: ['__meta_consul_service']
    regex:         '(.*)'
    target_label:  'job'
    replacement:   '$1'
  # Use the Consul node name as the instance label.
  - source_labels: ['__meta_consul_node']
    regex:         '(.*)'
    target_label:  'instance'
    replacement:   '$1'
```
This discovers the given services from the local Consul agent.
As a result, we get metrics for four jobs (`auth`, `api`, `load-balancer`, and `postgres`). If a node
has the `production` or `canary` Consul tag, a respective `group` label is assigned to the target.
Each target's `instance` label is set to the node name provided by Consul.
Full documentation of all configuration parameters for service discovery via Consul
can be found on the [Prometheus website](/docs/operating/configuration).
## Custom service discovery
Finally, we added a file-based interface to integrate your custom service discovery or other common mechanisms
that are not yet supported out of the box.
With this mechanism, Prometheus watches a set of directories or files which define target groups.
Whenever any of those files changes, a list of target groups is read from the files and scrape targets
are extracted.
It's now our job to write a small bridge program that runs as Prometheus's sidekick.
It retrieves changes from an arbitrary service discovery mechanism and writes the target information
to the watched files as lists of target groups.
These files can either be in YAML:
```
# Example target groups; the concrete addresses are illustrative.
- targets: ['10.11.150.1:7870', '10.11.150.4:7870']
  labels:
    job: 'mysql'

- targets: ['10.11.122.11:6001', '10.11.122.15:6002']
  labels:
    job: 'postgres'
```
...or in JSON format (here mirroring the YAML example above):
```
[
  {
    "targets": ["10.11.150.1:7870", "10.11.150.4:7870"],
    "labels": {
      "job": "mysql"
    }
  },
  {
    "targets": ["10.11.122.11:6001", "10.11.122.15:6002"],
    "labels": {
      "job": "postgres"
    }
  }
]
```
We now configure Prometheus to watch the `tgroups/` directory in its working directory
for all `.json` files:
```
scrape_configs:
# The default job label is overwritten by the 'job' labels in the watched files.
- job_name: 'overwritten-default'

  file_sd_configs:
  - names: ['tgroups/*.json']
```
What's missing now is a program that writes files to this directory. For the sake of this example,
let's assume we have all our instances for different jobs in a single denormalized MySQL table.
(Hint: you probably don't want to do service discovery this way.)
Every 30 seconds, we read all instances from the MySQL table and write the
resulting target groups into a JSON file. Note that we do not have to keep
track of whether any targets or their labels have changed. Prometheus will
automatically detect changes and apply them to targets without interrupting
their scrape cycles.
```
import os, time, json

from itertools import groupby
from MySQLdb import connect


def refresh(cur):
    # Fetch all rows, ordered so that groupby() sees each (job, zone)
    # combination as one contiguous run.
    cur.execute("SELECT address, job, zone FROM instances ORDER BY job, zone")

    tgs = []
    # Group all instances by their job and zone values.
    for key, vals in groupby(cur.fetchall(), key=lambda r: (r[1], r[2])):
        tgs.append({
            'labels': dict(zip(['job', 'zone'], key)),
            'targets': [t[0] for t in vals],
        })

    # Persist the target groups to disk as a JSON file.
    with open('tgroups/target_groups.json.new', 'w') as f:
        json.dump(tgs, f)
        f.flush()
        os.fsync(f.fileno())

    # Rename atomically so that Prometheus never sees a partial file.
    os.rename('tgroups/target_groups.json.new', 'tgroups/target_groups.json')


if __name__ == '__main__':
    while True:
        with connect('localhost', 'root', '', 'test') as cur:
            refresh(cur)
        time.sleep(30)
```
While Prometheus will not apply any malformed changes to files, it is considered best practice to
update your files atomically via renaming, as we do in our example.
It is also recommended to split larger amounts of target groups into several files based on
logical grouping.
## Conclusion
With DNS-SRV records and Consul, two major service discovery methods are now
natively supported by Prometheus. We've seen that relabeling is a powerful
approach to make use of metadata provided by service discovery mechanisms.
Make sure to take a look at the new [configuration documentation](/docs/operating/configuration/)
to upgrade your Prometheus setup to the new release and find out about other configuration options,
such as basic HTTP authentication and target filtering via relabeling.
We provide a [migration tool](https://github.com/prometheus/migrate/releases) that upgrades
your existing configuration files to the new YAML format.