* [Java](https://github.com/prometheus/client_java/blob/master/simpleclient/src/main/java/io/prometheus/client/Histogram.java) (histograms are only supported by the simple client but not by the legacy client)

The two approaches have a number of different implications:

| | Histogram | Summary
|---|-----------|---------
| Required configuration | Pick buckets suitable for the expected range of observed values. | Pick desired φ-quantiles and sliding window. Other φ-quantiles and sliding windows cannot be calculated later.
| Client performance | Observations are very cheap as they only need to increment counters. | Observations are expensive due to the streaming quantile calculation.
| Server performance | The server has to calculate quantiles. You can use [recording rules](/docs/querying/rules/#recording-rules) should the ad-hoc calculation take too long (e.g. in a large dashboard). | Low server-side cost.
| Number of time series (in addition to the `_sum` and `_count` series) | One time series per configured bucket. | One time series per configured quantile.
| Quantile error (see below for details) | Error is limited in the dimension of observed values by the width of the relevant bucket. | Error is limited in the dimension of φ by a configurable value.
| Specification of φ-quantile and sliding time-window | Ad-hoc with [Prometheus expressions](/docs/querying/functions/#histogram_quantile()). | Preconfigured by the client.
| Aggregation | Ad-hoc with [Prometheus expressions](/docs/querying/functions/#histogram_quantile()). | In general [not aggregatable](http://latencytipoftheday.blogspot.de/2014/06/latencytipoftheday-you-cant-average.html).
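
To make the "Required configuration" row above concrete, here is a minimal sketch of what the two configurations might look like with the Java simple client linked above. The metric names, bucket boundaries, and quantile settings are illustrative examples, not prescriptions:

```java
import io.prometheus.client.Histogram;
import io.prometheus.client.Summary;

public class RequestMetrics {
    // Histogram: only the buckets have to be chosen up front. Quantiles and
    // aggregation are computed later on the server with histogram_quantile().
    static final Histogram requestDuration = Histogram.build()
        .name("http_request_duration_seconds")
        .help("Request duration in seconds.")
        .buckets(0.1, 0.2, 0.3, 0.45) // example buckets around a 300ms target
        .register();

    // Summary: the φ-quantiles (each with a tolerated error) and the sliding
    // time window must be preconfigured. Other quantiles or windows cannot be
    // derived from the exposed data later.
    static final Summary requestDurationSummary = Summary.build()
        .name("http_request_duration_seconds_summary")
        .help("Request duration in seconds.")
        .quantile(0.95, 0.01) // 0.95-quantile with 1% tolerated error
        .maxAgeSeconds(300)   // 5-minute sliding window...
        .ageBuckets(5)        // ...rotated over 5 age buckets
        .register();

    void handleRequest() {
        Histogram.Timer timer = requestDuration.startTimer();
        try {
            // ... handle the request ...
        } finally {
            timer.observeDuration();
        }
    }
}
```

With the histogram, the quantile is then calculated ad hoc on the server (e.g. via `histogram_quantile()`), while the summary directly exposes its preconfigured 0.95-quantile.
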
Note the importance of the last item in the table. Let us return to
the SLA of serving 95% of requests within 300ms. This time, you do not
want to display the percentage of requests served within 300ms, but
instead the 95th percentile, i.e. the request duration within which
you have served 95% of requests. To do that, you can either configure
a summary with a 0.95-quantile and (for example) a 5-minute decay
time, or configure a histogram with a few buckets around the 300ms
mark, e.g. `{le="0.1"}`, `{le="0.2"}`, `{le="0.3"}`, and
`{le="0.45"}`. If your service runs replicated with a number of
instances, you will collect request durations from every single one of
them, and then you want to aggregate everything into an overall 95th
percentile. However, aggregating the precomputed quantiles from a
summary rarely makes sense. In this particular case, averaging the