---
title: Getting started
sort_rank: 1
---
# Getting started
This guide is a "Hello World"-style tutorial which shows how to install,
configure, and use a simple Prometheus instance. You will download and run
Prometheus locally, configure it to scrape itself and an example application,
then work with queries, rules, and graphs to use collected time
series data.
## Downloading and running Prometheus
[Download the latest release](https://prometheus.io/download) of Prometheus for
your platform, then extract it and change into the resulting directory:
```bash
tar xvfz prometheus-*.tar.gz
cd prometheus-*
```
Before starting Prometheus, let's configure it.
## Configuring Prometheus to monitor itself
Prometheus collects metrics from _targets_ by scraping metrics HTTP
endpoints. Since Prometheus exposes data in the same
manner about itself, it can also scrape and monitor its own health.

While a Prometheus server that collects only data about itself is not very
useful, it is a good starting example. Save the following basic
Prometheus configuration as a file named `prometheus.yml`:
```yaml
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090']
```
For a complete specification of configuration options, see the
[configuration documentation](configuration/configuration.md).
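
Before starting the server, it can help to validate the file. The sketch below assumes that the `promtool` binary from the release archive sits next to `prometheus` in the current directory. The function name `check_config` and the `PROMTOOL` override variable are made up for this example; only `promtool check config` is an actual command.

```bash
# check_config [FILE]: validate a Prometheus configuration file (hypothetical helper).
check_config() {
  # Location of promtool; assumed to be the current directory unless PROMTOOL is set.
  tool="${PROMTOOL:-./promtool}"
  config_file="${1:-prometheus.yml}"

  if [ ! -x "$tool" ]; then
    echo "promtool not found at $tool" >&2
    return 1
  fi

  # promtool exits non-zero if the configuration is invalid.
  "$tool" check config "$config_file"
}
```

Calling `check_config prometheus.yml` from the extracted release directory should report whether the file parses cleanly.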
## Starting Prometheus
To start Prometheus with your newly created configuration file, change to the
directory containing the Prometheus binary and run:
```bash
# Start Prometheus.
# By default, Prometheus stores its database in ./data (flag --storage.tsdb.path).
./prometheus --config.file=prometheus.yml
```
Prometheus should start up. You should also be able to browse to a status page
about itself at [localhost:9090](http://localhost:9090). Give it a couple of
seconds to collect data about itself from its own HTTP metrics endpoint.

You can also verify that Prometheus is serving metrics about itself by
navigating to its metrics endpoint:
[localhost:9090/metrics](http://localhost:9090/metrics)
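
The metrics endpoint serves plain text, with one sample per line and `#`-prefixed comment lines carrying help and type information. As a rough sketch, a small shell filter can list the distinct metric names in that output. The function name `metric_names` is invented for this example.

```bash
# metric_names: read Prometheus text exposition format on stdin and
# print each distinct metric name once (hypothetical helper).
metric_names() {
  # 1. Drop comment lines and blank lines.
  # 2. Keep only the text before the label set or the sample value.
  # 3. Sort and de-duplicate.
  grep -v -e '^#' -e '^[[:space:]]*$' |
    sed 's/[{ ].*$//' |
    sort -u
}
```

With Prometheus running and `curl` installed, it could be used as `curl -s http://localhost:9090/metrics | metric_names`.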
## Using the expression browser
Let us explore data that Prometheus has collected about itself. To
use Prometheus's built-in expression browser, navigate to
http://localhost:9090/graph and choose the "Table" view within the "Graph" tab.

As you can gather from [localhost:9090/metrics](http://localhost:9090/metrics),
one metric that Prometheus exports about itself is named
`prometheus_target_interval_length_seconds` (the actual amount of time between
target scrapes). Enter the following into the expression console and then click "Execute":
```
prometheus_target_interval_length_seconds
```
This should return a number of different time series (along with the latest value
recorded for each), each with the metric name
`prometheus_target_interval_length_seconds`, but with different labels. These
labels designate different latency percentiles and target group intervals.

If we are interested only in 99th percentile latencies, we could use this
query:
```
prometheus_target_interval_length_seconds{quantile="0.99"}
```
To count the number of returned time series, you could write:
```
count(prometheus_target_interval_length_seconds)
```
For more about the expression language, see the
[expression language documentation](querying/basics.md).
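
The same expressions can also be evaluated outside the browser. As a sketch, assuming `curl` is installed and Prometheus listens on `localhost:9090`, the instant-query endpoint of the HTTP API (`/api/v1/query`) accepts the expression in a `query` parameter. The `prom_query` function and the `PROM_URL` variable are names chosen for this example.

```bash
# prom_query EXPR: evaluate a PromQL expression via the HTTP API (hypothetical helper).
prom_query() {
  # -G sends the URL-encoded data as a query string on a GET request;
  # -f makes curl return a non-zero status on HTTP errors.
  curl -fsS -G "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}
```

For instance, `prom_query 'count(prometheus_target_interval_length_seconds)'` should print a JSON document containing the query result.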
## Using the graphing interface
To graph expressions, navigate to http://localhost:9090/graph and use the "Graph"
tab.
For example, enter the following expression to graph the per-second rate of chunks
being created in the self-scraped Prometheus:
```
rate(prometheus_tsdb_head_chunks_created_total[1m])
```
Experiment with the graph range parameters and other settings.
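
To build intuition for what `rate()` plots, the per-second rate of a counter is roughly its increase over the window divided by the window length. The sketch below computes that simplified figure from two made-up samples. The real `rate()` function also accounts for counter resets and extrapolates to the window boundaries, so its results can differ slightly.

```bash
# simple_rate FIRST LAST SECONDS: simplified per-second rate between two
# counter samples (hypothetical helper, not how Prometheus computes rate()).
simple_rate() {
  awk -v first="$1" -v last="$2" -v secs="$3" \
    'BEGIN { printf "%.2f\n", (last - first) / secs }'
}

# A counter that went from 100 to 160 over a 1m window:
simple_rate 100 160 60   # prints 1.00
```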
## Starting up some sample targets
Let's add additional targets for Prometheus to scrape.

The Node Exporter is used as an example target. For more information on using it,
[see these instructions](https://prometheus.io/docs/guides/node-exporter/).
```bash
tar -xzvf node_exporter-*.*.tar.gz
cd node_exporter-*.*

# Start 3 example targets in separate terminals:
./node_exporter --web.listen-address 127.0.0.1:8080
./node_exporter --web.listen-address 127.0.0.1:8081
./node_exporter --web.listen-address 127.0.0.1:8082
```
You should now have example targets listening on http://localhost:8080/metrics,
http://localhost:8081/metrics, and http://localhost:8082/metrics.
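
A quick way to confirm that all three exporters respond is to request each metrics endpoint in turn. This is a sketch that assumes `curl` is available; `check_targets` is a name chosen for this example.

```bash
# check_targets: report whether each example target answers on /metrics
# (hypothetical helper).
check_targets() {
  for port in 8080 8081 8082; do
    url="http://localhost:${port}/metrics"
    # -f makes curl fail on HTTP errors; the body is discarded, only the status matters.
    if curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      echo "$url up"
    else
      echo "$url down"
    fi
  done
}
```

Running `check_targets` prints one line per endpoint, ending in `up` or `down`.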
## Configure Prometheus to monitor the sample targets
Now we will configure Prometheus to scrape these new targets. Let's group all
three endpoints into one job called `node`. We will imagine that the
first two endpoints are production targets, while the third one represents a
canary instance. To model this in Prometheus, we can add several groups of
endpoints to a single job, adding extra labels to each group of targets. In
this example, we will add the `group="production"` label to the first group of
targets, while adding `group="canary"` to the second.
To achieve this, add the following job definition to the `scrape_configs`
section in your `prometheus.yml` and restart your Prometheus instance:
```yaml
scrape_configs:
  - job_name: 'node'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:8080', 'localhost:8081']
        labels:
          group: 'production'

      - targets: ['localhost:8082']
        labels:
          group: 'canary'
```
Go to the expression browser and verify that Prometheus now has information
about time series that these example endpoints expose, such as `node_cpu_seconds_total`.
## Configure rules for aggregating scraped data into new time series
Though not a problem in our example, queries that aggregate over thousands of
time series can get slow when computed ad-hoc. To make this more efficient,
Prometheus can prerecord expressions into new persisted
time series via configured _recording rules_. Let's say we are interested in
recording the per-second rate of CPU time (`node_cpu_seconds_total`) averaged
over all CPUs per instance (but preserving the `job`, `instance` and `mode`
dimensions) as measured over a window of 5 minutes. We could write this as:
```
avg by (job, instance, mode) (rate(node_cpu_seconds_total[5m]))
```
Try graphing this expression.
To record the time series resulting from this expression into a new metric
called `job_instance_mode:node_cpu_seconds:avg_rate5m`, create a file
with the following recording rule and save it as `prometheus.rules.yml`:
```yaml
groups:
- name: cpu-node
  rules:
  - record: job_instance_mode:node_cpu_seconds:avg_rate5m
    expr: avg by (job, instance, mode) (rate(node_cpu_seconds_total[5m]))
```
To make Prometheus pick up this new rule, add a `rule_files` statement in your `prometheus.yml`. The config should now
look like this:
```yaml
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # Evaluate rules every 15 seconds.

  # Attach these extra labels to all timeseries collected by this Prometheus instance.
  external_labels:
    monitor: 'codelab-monitor'

rule_files:
  - 'prometheus.rules.yml'

scrape_configs:
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:8080', 'localhost:8081']
        labels:
          group: 'production'

      - targets: ['localhost:8082']
        labels:
          group: 'canary'
```
Restart Prometheus with the new configuration and verify that a new time series
with the metric name `job_instance_mode:node_cpu_seconds:avg_rate5m`
is now available by querying it through the expression browser or graphing it.
## Reloading configuration
As mentioned in the [configuration documentation](configuration/configuration.md), a
Prometheus instance can have its configuration reloaded without restarting the
process by using the `SIGHUP` signal. If you're running on Linux, this can be
performed by using `kill -s SIGHUP <PID>`, replacing `<PID>` with your Prometheus
process ID.
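
If the process ID is not at hand, it can be looked up by name. The following sketch assumes a Linux system with `pgrep` and a server process named `prometheus`; the `signal_prometheus` helper is hypothetical. It takes the signal name without the `SIG` prefix, a form that `kill -s` accepts in POSIX shells.

```bash
# signal_prometheus [SIGNAL]: send a signal (default HUP) to every running
# process named exactly "prometheus" (hypothetical helper).
signal_prometheus() {
  sig="${1:-HUP}"
  # -x matches the process name exactly.
  pids=$(pgrep -x prometheus 2>/dev/null)
  if [ -z "$pids" ]; then
    echo "no running prometheus process found" >&2
    return 1
  fi
  # Left unquoted on purpose so that several PIDs become separate arguments.
  kill -s "$sig" $pids
}
```

`signal_prometheus HUP` asks the server to reload its configuration, and `signal_prometheus TERM` requests a clean shutdown.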
## Shutting down your instance gracefully
While Prometheus does have recovery mechanisms in the case that there is an
abrupt process failure, it is recommended to use the `SIGTERM` signal to cleanly
shut down a Prometheus instance. If you're running on Linux, this can be performed
by using `kill -s SIGTERM <PID>`, replacing `<PID>` with your Prometheus process ID.