A quick introduction to Prometheus
Prometheus is an open-source systems monitoring and alerting toolkit, originally built at SoundCloud. It is one of the best monitoring toolkits for containers and microservices. The toolkit is highly customizable and designed to deliver rich metrics without creating a drag on system performance.
It is easy to set up the Prometheus server. You can install and run Prometheus on physical servers or VMs as a standalone service, or as a containerised setup. Prometheus is pull based, which means the Prometheus server originates the connection to the end servers and scrapes the metrics from them. This is one of the noticeable differences between Prometheus and other time series based monitoring systems, many of which have the monitored hosts push their data. Prometheus actively scrapes targets in order to retrieve metrics from them.
Prometheus retrieves metrics via HTTP calls to endpoints that are defined in the Prometheus configuration. The node exporter and application exporters listen on particular ports; the Prometheus server initiates an HTTP call to each exporter and fetches the system or application metrics from its endpoint. Multiple exporters are available for collecting system and application metrics and, as I mentioned, they are very easy to configure. Once the exporters are up and added to the Prometheus configuration file, Prometheus starts scraping the metrics at the intervals defined in its configuration.
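To make the pull model a bit more concrete, here is a minimal Python sketch of what a single scrape amounts to: an HTTP GET against an exporter endpoint, followed by parsing the plain-text response. The URL (http://localhost:9100/metrics, the node exporter's usual default) and the helper names scrape and parse_metrics are my assumptions for illustration. This is not how Prometheus itself is implemented, and the parser is deliberately simplified.

```python
from urllib.request import urlopen


def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # Label values may contain spaces, so cut at the last closing brace.
            end = line.rfind("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            series, *rest = line.split()
        samples[series] = float(rest[0])  # rest[1], if present, is a timestamp
    return samples


def scrape(url: str = "http://localhost:9100/metrics", timeout: float = 5.0) -> dict[str, float]:
    """Fetch one exporter endpoint over HTTP (the URL is an assumed example)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling scrape() against a running node exporter should return a dictionary keyed by series name. The point to notice is that the connection is opened from the monitoring side, never from the exporter.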
Why do we need to monitor exporters, and what’s the best way?
Of course, we need monitoring for every bit of our infrastructure, and this bit is a critical one: the exporters collect your infrastructure metrics, and you are going to use those metrics to analyse system performance. So the data in your Prometheus is very critical. Exporters don’t run on the servers and collect data on their own, without a request from Prometheus. So if there is any connectivity issue between the Prometheus server and your servers, you will lose data.
How long do node_exporter and other exporters hold data locally?
Worried about data loss? If there is any connectivity issue between the Prometheus server and the targets, you will lose metrics. Exporters don’t save any data locally, and they can’t be configured to do so. An exporter is a binary that runs and gathers the data when a request comes in on the socket it is listening on. If nobody fetches (scrapes) the data, no data is gathered, and the node_exporter instance idles, waiting for incoming requests. Refer to this discussion for more details.
So monitoring exporters is critical. Here I am going to explain the ways we can monitor them.
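The "nothing is stored locally" behaviour can be illustrated with a toy exporter in Python. This is only a sketch under my own assumptions: the metric names (demo_scrapes_served_total, demo_scrape_time_seconds), the port 9999, and the helpers collect and serve are hypothetical, and real exporters such as node_exporter are far more involved. What it shows is that metrics are gathered inside the request handler, so a scrape that never arrives leaves no sample behind.

```python
import itertools
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

_scrapes = itertools.count(1)  # incremented only when somebody asks for metrics


def collect() -> str:
    """Gather metrics right now; nothing is buffered between calls."""
    return (
        "# TYPE demo_scrapes_served_total counter\n"
        f"demo_scrapes_served_total {next(_scrapes)}\n"
        "# TYPE demo_scrape_time_seconds gauge\n"
        f"demo_scrape_time_seconds {time.time()}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = collect().encode("utf-8")  # collection happens per request
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


def serve(port: int = 9999) -> None:
    """Block forever, idling until a scrape request arrives (port is hypothetical)."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

If you run serve() and never request /metrics, collect() is never called. The time between two scrapes is simply not recorded anywhere, which is why a connectivity gap means lost data.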
Option 1. Monitor the exporter process from the remote server.
Yes, this is one way to monitor the exporter process on your servers. Here we can use any process monitoring script (like the check_procs plugin in Nagios) to monitor the exporter process. But this is not 100 percent useful, because in this case we are only monitoring the exporter process. A running exporter can’t deliver metrics without a connection from Prometheus, because it’s a pull based system and Prometheus initiates the traffic to the servers to scrape the metrics, as explained above.
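As a rough idea of what such a process check could look like, here is a small Python sketch. It is not the Nagios plugin itself: the function names process_running and check_exporter are hypothetical, it assumes a Linux host with /proc mounted, and it uses the common Nagios-style convention of exit code 0 for OK and 2 for CRITICAL.

```python
import os

OK, CRITICAL = 0, 2  # Nagios-style plugin exit codes


def process_running(name: str, proc_root: str = "/proc") -> bool:
    """Return True if any process under proc_root has the given command name.

    Linux only. The kernel truncates the comm value to 15 characters,
    so longer binary names would need a different lookup.
    """
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are process IDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                if handle.read().strip() == name:
                    return True
        except OSError:
            continue  # the process exited while we were looking
    return False


def check_exporter(name: str = "node_exporter", proc_root: str = "/proc") -> int:
    """Print a one-line status and return the matching exit code."""
    if process_running(name, proc_root):
        print(f"OK - {name} is running")
        return OK
    print(f"CRITICAL - {name} is not running")
    return CRITICAL
```

In a real check you would finish the script with sys.exit(check_exporter()) so the monitoring tool sees the exit code. Note that an OK result here only says the binary is alive, not that Prometheus can reach it.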
Option 2. Monitor the exporter process & connectivity from the Prometheus server
Yes, this is the first option plus a connectivity check from Prometheus. It is okay to set up this kind of thing if you have only a few servers/apps to monitor. For example, if you have to monitor 5 servers and each server runs 2 Java applications, you should have a minimum of 2 exporters per server: one NE (node exporter) for monitoring the system, and an application exporter.
In this case, you have to create two process checks per host (10 in total) and 10 connectivity tests from Prometheus. Now consider that you have 1000 servers and more applications. It’s not practical to run connectivity tests from Prometheus for such a large number of targets. It can cause high resource usage, and it’s not efficient.
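A connectivity test of this kind can be as simple as a TCP connect from the Prometheus host to each exporter port. The Python sketch below is one way to do it; the helper names port_reachable and unreachable_exporters, and any host names or ports you pass in, are assumptions for illustration only.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, timed out, or the name did not resolve


def unreachable_exporters(targets, timeout: float = 3.0):
    """Return the (host, port) pairs that could not be reached."""
    return [
        (host, port)
        for host, port in targets
        if not port_reachable(host, port, timeout)
    ]
```

For the 5 server example you would pass 10 (host, port) pairs, one per exporter. The list grows with every server and application you add, and each entry costs a connection attempt per check run, which is where this approach stops scaling.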
Option 3. The Best Way To Monitor Prometheus Exporters
Use the Prometheus HTTP API, specifically the targets endpoint (/api/v1/targets). This is the best option for monitoring exporter status plus connectivity, because Prometheus marks a target as down if either of the above checks would fail.
Both the active and dropped targets are part of the response by default. Read more on the Prometheus page. You can write a custom script that returns a proper exit code and integrate it with your monitoring tools to get alerts. Here you can find one sample script to monitor the Prometheus targets: Python script to monitor Prometheus exporters
Interested? Read more:
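To give a feel for such a script, here is a minimal Python sketch. It is not the sample script mentioned above, just an illustration, and it assumes the API call in question is the Prometheus targets endpoint, GET /api/v1/targets. The Prometheus address (http://localhost:9090) and the function names down_targets and check_targets are my assumptions; the field names activeTargets, health, scrapeUrl and lastError come from the API response.

```python
import json
from urllib.request import urlopen

OK, CRITICAL = 0, 2  # exit codes understood by Nagios-style monitoring tools


def down_targets(payload: dict) -> list[dict]:
    """Return the active targets whose health is anything other than "up".

    Targets that have not been scraped yet report "unknown" and are
    treated as failing here; relax this if that is too noisy for you.
    """
    active = payload.get("data", {}).get("activeTargets", [])
    return [target for target in active if target.get("health") != "up"]


def check_targets(prometheus_url: str = "http://localhost:9090", timeout: float = 10.0) -> int:
    """Query the targets API, print a short summary, and return an exit code."""
    with urlopen(f"{prometheus_url}/api/v1/targets", timeout=timeout) as response:
        payload = json.load(response)
    failed = down_targets(payload)
    if not failed:
        print("OK - all targets are up")
        return OK
    for target in failed:
        print(
            f"CRITICAL - {target.get('scrapeUrl')} is {target.get('health')}: "
            f"{target.get('lastError')}"
        )
    return CRITICAL
```

One call covers every exporter Prometheus knows about, whether the failure is a dead process or a broken network path, so the script stays the same at 10 targets or 1000. End it with sys.exit(check_targets()) to hand the exit code to your monitoring tool.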
- Advantages of Prometheus monitoring tool
- Modern Monitoring Concepts – An Introduction To Prometheus World
If you have any questions, suggestions, or other better ways, please add your comments here. That will help me and others.
Please follow my LinkedIn page for more updates. Thanks!