Commit 7dcef5fb authored by Björn Rabenstein

Merge pull request #27 from prometheus/beorn7/doc-improve

Improve documentation about storage.
parents 9ef72f6d af80d03c
@@ -126,3 +126,19 @@ Performance across client libraries and languages may vary. For Java,
indicate that incrementing a counter/gauge with the Java client will take
12-17ns, depending on contention. This is negligible for all but the most
latency-critical code.
## Troubleshooting
### My server takes a long time to start up and spams the log with copious information about crash recovery.
You are suffering from an unclean shutdown. Prometheus has to shut
down cleanly after a `SIGTERM`, which might take a while for heavily
used servers. If the server crashes or is killed hard (e.g. an OOM kill
by the kernel or because your runlevel system got impatient while waiting
for Prometheus to shut down), a crash recovery has to be performed, which
should take less than a minute under normal circumstances. See [crash recovery](/docs/operating/storage/#crash-recovery) for details.
### I am using ZFS on Linux, and the unit test `TestPersistLoadDropChunks` fails. If I run Prometheus despite the failing test, the weirdest things happen.
You have run into a bug in ZFS on Linux. See [issue #484](https://github.com/prometheus/prometheus/issues/484)
for details. Upgrading to ZFS on Linux v0.6.4 should fix the issue.
\ No newline at end of file
@@ -5,10 +5,83 @@ nav_icon: database
# Storage
Prometheus has a sophisticated local storage subsystem. For indexes,
it uses [LevelDB](https://github.com/google/leveldb). For the bulk
sample data, it has its own custom storage layer, which organizes
sample data in chunks of constant size (1024 bytes payload). These
chunks are then stored on disk in one file per time series.
## Memory usage
Prometheus keeps all the currently used chunks in memory. In addition,
it keeps the most recently used chunks in memory up to a threshold
configurable via the `storage.local.memory-chunks` flag. If you have a
lot of RAM available, you might want to increase it above the default
value of 1048576 (and vice versa, if you run into RAM problems, you
can try to decrease it). Note that the actual RAM usage of your server
will be much higher than what you would expect from multiplying
`storage.local.memory-chunks` by 1024 bytes. There is inevitable
overhead for managing the sample data in the storage layer. Also, your
server is doing many more things than just storing samples. The actual
overhead depends on your usage pattern. In extreme cases, Prometheus
has to keep more chunks in memory than configured because all those
chunks are in use at the same time. You have to experiment a bit. The
metrics `prometheus_local_storage_memory_chunks` and
`process_resident_memory_bytes`, exported by the Prometheus server,
will come in handy. As a rule of thumb, you should have at least three
times more RAM available than needed by the memory chunks alone.
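As a rough illustration, a server with plenty of RAM might be started with a higher chunk threshold along the following lines. This is only a sketch: the binary name, the single-dash flag syntax, and the value of 2097152 chunks are illustrative assumptions, not recommendations from this documentation.
```
# Illustrative only: double the default in-memory chunk threshold.
# 2097152 chunks * 1024 bytes payload each is roughly 2 GiB of chunk
# payload, before the management overhead described above.
./prometheus \
  -storage.local.path=/var/lib/prometheus \
  -storage.local.memory-chunks=2097152
```
While experimenting, keep an eye on the `prometheus_local_storage_memory_chunks` and `process_resident_memory_bytes` metrics mentioned above.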
LevelDB is essentially dealing with data on disk and relies on the
disk caches of the operating system for optimal performance. However,
it maintains in-memory caches, whose size you can configure for each
index via the following flags:
* `storage.local.index-cache-size.fingerprint-to-metric`
* `storage.local.index-cache-size.fingerprint-to-timerange`
* `storage.local.index-cache-size.label-name-to-label-values`
* `storage.local.index-cache-size.label-pair-to-fingerprints`
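As a hedged example, two of these caches might be enlarged as sketched below. The byte values and the assumption that these flags take plain byte counts are illustrative; check the server's help output for the exact semantics and defaults of your version.
```
# Illustrative only: enlarge two of the LevelDB index caches
# (values assumed to be in bytes: 20 MiB and 40 MiB here).
./prometheus \
  -storage.local.index-cache-size.fingerprint-to-metric=20971520 \
  -storage.local.index-cache-size.label-pair-to-fingerprints=41943040
```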
## Disk usage
Prometheus stores its on-disk time series data under the directory
specified by the flag `storage.local.path`. The default path is
`/tmp/metrics`, which is fine for trying things out quickly but most
likely not what you want for actual operations. The flag
`storage.local.retention` allows you to configure the retention time
for samples. Adjust it to your needs and your available disk space.
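For instance, a production-style setup might pin both down explicitly, as in the following sketch. The path, the 30-day retention written as `720h`, and the flag syntax are assumptions to verify against your version's help output.
```
# Illustrative only: dedicated data directory and 30 days of retention.
./prometheus \
  -storage.local.path=/var/lib/prometheus/metrics \
  -storage.local.retention=720h
```
Since samples are kept in one file per time series, disk usage grows with both the retention time and the number of time series stored.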
## Crash recovery
Prometheus saves chunks to disk as soon as possible after they are
complete. Incomplete chunks are saved to disk during regular
checkpoints. You can configure the checkpoint interval with the flag
`storage.local.checkpoint-interval`. Prometheus creates checkpoints
more frequently than that if too many time series are in a "dirty"
state, i.e. their current incomplete head chunk is not the one that is
contained in the most recent checkpoint. This limit is configurable
via the `storage.local.checkpoint-dirty-series-limit` flag.
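Both knobs can be set explicitly to trade checkpoint overhead against the amount of data at risk, for example as in this sketch. The five-minute interval and the limit of 5000 series are made-up values, and the flag syntax should be verified against your version.
```
# Illustrative only: checkpoint at least every 5 minutes, or earlier
# once too many series have a head chunk that is not yet checkpointed.
./prometheus \
  -storage.local.checkpoint-interval=5m \
  -storage.local.checkpoint-dirty-series-limit=5000
```
More frequent checkpoints mean less data at risk after a hard crash, at the cost of more disk writes.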
Nevertheless, should your server crash, you might still lose data, and
your storage might be left in an inconsistent state. Therefore,
Prometheus performs a crash recovery after an unclean shutdown,
similar to an `fsck` run for a file system. Details about the crash
recovery are logged, so you can use them for forensics if required. Data
that cannot be recovered is moved to a directory called `orphaned`
(located under `storage.local.path`). Remember to delete that data if
you do not need it anymore.
The crash recovery usually takes less than a minute. Should it take much
longer, consult the log to find out what has gone wrong.
## Data corruption
If you suspect problems caused by corruption in the database, you can
force a crash recovery by starting the server with the flag
`storage.local.dirty`.
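A forced recovery run might look like the following sketch. The assumption that the flag is a boolean that can be passed in this form should be checked against your version's help output.
```
# Illustrative only: force a crash recovery pass on the next start.
./prometheus -storage.local.dirty=true
```
Afterwards, check the log and the `orphaned` directory described above.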
If that does not help, or if you simply want to erase the existing
database, you can easily start fresh by deleting the contents of the
storage directory:
1. Stop Prometheus.
1. `rm -r <storage path>/*`