Munshi Hafizul Haque
- Jul 28, 2020
- 3 min read

How To Install Alertmanager to Alert Based on Metrics From Prometheus

Updated: Oct 3, 2021

Previously, we posted "How to install and configure Prometheus & grafana in Red Hat Enterprise Linux 7". Now let's install and configure Alertmanager to send alerts based on metrics from Prometheus. Let's start the configuration.

Setup Alertmanager:

Step:1 Download the Alertmanager binaries for Linux from the Prometheus download page.

# wget https://github.com/prometheus/alertmanager/releases/download/v0.21.0/alertmanager-0.21.0.linux-amd64.tar.gz 

::::::::::::: CUT SOME OUTPUT :::::::::::::

100%[========================================================================================================>] 25,710,888   174KB/s   in 2m 32s 

2020-07-27 13:31:05 (166 KB/s) - ‘alertmanager-0.21.0.linux-amd64.tar.gz’ saved [25710888/25710888]
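
Before installing, it can be worth checking the archive against its published checksum. The sketch below assumes the release page also offers a sha256sums.txt file from which the expected digest is copied, and that sha256sum and awk are available. verify_sha256 is a hypothetical helper name, not part of Alertmanager.

```shell
# Compare a file's SHA-256 digest with an expected value.
# Usage: verify_sha256 <file> <expected_hex_digest>
# Returns 0 when the digests match, non-zero otherwise.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}
```

For example, verify_sha256 alertmanager-0.21.0.linux-amd64.tar.gz <digest_from_sha256sums.txt> returns 0 only when the download is intact.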

Step:2 Create a dedicated alertmanager user with no home directory and no login shell.

# useradd --no-create-home --shell /bin/false alertmanager

Step:3 Extract the downloaded tar archive and install the alertmanager and amtool binaries.

# tar -xvf alertmanager-0.21.0.linux-amd64.tar.gz
alertmanager-0.21.0.linux-amd64/
alertmanager-0.21.0.linux-amd64/alertmanager
alertmanager-0.21.0.linux-amd64/amtool
alertmanager-0.21.0.linux-amd64/NOTICE
alertmanager-0.21.0.linux-amd64/LICENSE
alertmanager-0.21.0.linux-amd64/alertmanager.yml

# cp alertmanager-0.21.0.linux-amd64/alertmanager /usr/local/bin/
# cp alertmanager-0.21.0.linux-amd64/amtool /usr/local/bin/

# chown alertmanager:alertmanager /usr/local/bin/alertmanager
# chown alertmanager:alertmanager /usr/local/bin/amtool

Step:3 Create the alertmanager directory and write the global Alertmanager configuration:

# mkdir /etc/alertmanager
# vim /etc/alertmanager/alertmanager.yml
global:
  smtp_smarthost: 'localhost:25'
  smtp_from: 'AlertManager <root@example.com>'
  # My smtp server does not require authentication, 
  # but still we need to set below attributes.
  smtp_require_tls: false
  smtp_hello: 'alertmanager'
  smtp_auth_username: 'username'
  smtp_auth_password: 'password'
  
  slack_api_url: 'https://hooks.slack.com/services/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'

route:
  group_by: ['instance', 'alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h
  receiver: team-1

receivers:
  - name: 'team-1'
    email_configs:
      - to: 'root@localhost'
    slack_configs:
      # https://prometheus.io/docs/alerting/configuration/#slack_config
      # channel, username and icon_emoji belong to the same Slack config entry
      - channel: '#ansible'
        username: 'AlertManager'
        icon_emoji: ':joy:'

Note: In my previous post, we created the Slack workspace and the webhook. See the details at "How to create a new workspace and setup a Slack Webhook for Sending Messages From Applications".
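
Because a YAML indentation mistake can stop Alertmanager from starting, it may help to validate the file before creating the service. A minimal sketch, assuming amtool was copied to /usr/local/bin as shown earlier and that this directory is on PATH. check_am_config is a hypothetical wrapper name.

```shell
# Validate an Alertmanager configuration file with amtool.
# Usage: check_am_config [path]   (defaults to /etc/alertmanager/alertmanager.yml)
# Returns non-zero if amtool is missing, the file is unreadable, or the config is invalid.
check_am_config() {
    config=${1:-/etc/alertmanager/alertmanager.yml}
    command -v amtool >/dev/null 2>&1 || { echo "amtool not found in PATH" >&2; return 127; }
    [ -r "$config" ] || { echo "cannot read $config" >&2; return 1; }
    amtool check-config "$config"
}
```

Running check_am_config with no arguments checks the file created in this step.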

Step:4 Create a systemd unit file for alertmanager.

# vim /usr/lib/systemd/system/alertmanager.service
[Unit]
Description=Alertmanager
Wants=network-online.target
After=network-online.target

[Service]
User=alertmanager
Group=alertmanager
Type=simple
WorkingDirectory=/etc/alertmanager/
ExecStart=/usr/local/bin/alertmanager --config.file=/etc/alertmanager/alertmanager.yml --web.external-url http://192.168.122.80:9093

[Install]
WantedBy=multi-user.target

Note: 192.168.122.80 is the IP address of my Alertmanager host.

Step:5 Enable and start the alertmanager service.

# systemctl enable --now alertmanager.service
Created symlink from /etc/systemd/system/multi-user.target.wants/alertmanager.service to /usr/lib/systemd/system/alertmanager.service.
# systemctl status alertmanager.service

Step:6 Verify the running alertmanager service from the Alertmanager user interface (http://<alertmanager_ip_address>:9093).

Okay, alertmanager service is running...
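
The same check can be done without a browser, since Alertmanager exposes a /-/healthy endpoint on its web port. The sketch below assumes curl is installed. am_health_url and am_is_healthy are hypothetical helper names, and 9093 is the default port used in this post.

```shell
# Build the health-check URL for an Alertmanager instance.
# Usage: am_health_url <host> [port]
am_health_url() {
    host=$1
    port=${2:-9093}
    printf 'http://%s:%s/-/healthy\n' "$host" "$port"
}

# Return 0 if the instance answers its health endpoint successfully.
# Usage: am_is_healthy <host> [port]
am_is_healthy() {
    # -f makes curl fail on HTTP errors; the response body is discarded.
    curl -fsS --max-time 5 -o /dev/null "$(am_health_url "$@")"
}
```

For example, am_is_healthy 192.168.122.80 && echo "alertmanager is up".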

Change the required configuration in Prometheus:

Step:1 Enable the Alertmanager configuration in Prometheus:

# vim /etc/prometheus/prometheus.yml 
::::::::::::: CUT SOME OUTPUT :::::::::::::
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
       - 192.168.122.80:9093
#       - localhost:9093
#       - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
   - first_rules.yml
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
::::::::::::: CUT SOME OUTPUT :::::::::::::

Note: We are going to create first_rules.yml in the next steps.

Step:2 Verify the running instances from the Prometheus user interface (http://<prometheus_ip_address>:9090).

We need to create a rules file that specifies the conditions under which we would like to be alerted. Let's say we want an alert when an instance goes down.

As we have seen, the up metric has a value of 1 for every running instance and a value of 0 for every instance that is not currently running.
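
The up values can also be read from the command line through the Prometheus HTTP API. This is a sketch that assumes curl is installed and that Prometheus listens on port 9090 as above. prom_api_url and prom_query are hypothetical helper names.

```shell
# Build a Prometheus HTTP API URL from a base URL and an endpoint name.
# Usage: prom_api_url <base_url> <endpoint>
prom_api_url() {
    base=$1
    endpoint=$2
    # Strip one trailing slash from the base URL to avoid "//api".
    printf '%s/api/v1/%s\n' "${base%/}" "$endpoint"
}

# Run an instant query and print the JSON response.
# Usage: prom_query <base_url> <promql_expression>
prom_query() {
    # -G sends the URL-encoded expression as a query string on a GET request.
    curl -fsS -G --max-time 5 --data-urlencode "query=$2" "$(prom_api_url "$1" query)"
}
```

For example, prom_query http://192.168.122.144:9090 'up == 0' returns only the instances that are currently down.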

Step:3 Create first_rules.yml to specify the alerting condition.

# cat /etc/prometheus/first_rules.yml 
groups:
- name: AllInstances
  rules:
  - alert: InstanceDown
    # Condition for alerting
    expr: up == 0
    for: 1m
    # Annotations - additional informational labels to store more information
    annotations:
      title: 'Instance {{ $labels.instance }} down'
      description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute.'
    # Labels - additional labels to be attached to the alert
    labels:
      severity: 'critical'

Source: grafana blog
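
Both files can be checked before Prometheus is restarted. A minimal sketch, assuming promtool from the Prometheus release archive is installed and on PATH (its installation is not shown in this post). check_prom_files is a hypothetical wrapper name.

```shell
# Validate the rules file first, then the main configuration that references it.
# Usage: check_prom_files [config_path] [rules_path]
# Returns non-zero if promtool is missing or either check fails.
check_prom_files() {
    config=${1:-/etc/prometheus/prometheus.yml}
    rules=${2:-/etc/prometheus/first_rules.yml}
    command -v promtool >/dev/null 2>&1 || { echo "promtool not found in PATH" >&2; return 127; }
    promtool check rules "$rules" && promtool check config "$config"
}
```

Running check_prom_files with no arguments checks the two paths used in this post.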

Step:5 Update the Prometheus systemd unit file.

# cat /usr/lib/systemd/system/prometheus.service 
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus/ \
--web.console.templates=/var/lib/prometheus/consoles \
--web.console.libraries=/var/lib/prometheus/console_libraries \
--web.external-url=http://192.168.122.144

[Install]
WantedBy=multi-user.target

We set --web.external-url for Alertmanager to the Alertmanager base address, and we now do the same for Prometheus by setting --web.external-url to the Prometheus base address.

Step:6 Restart the Prometheus service.

# systemctl daemon-reload
# systemctl restart prometheus
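
After a restart, Prometheus may need a few seconds before it answers requests. The sketch below polls its /-/ready endpoint with curl. wait_for_url is a hypothetical helper name, and the attempt count and delay are arbitrary defaults.

```shell
# Poll a URL until it responds successfully or the attempts run out.
# Usage: wait_for_url <url> [attempts] [delay_seconds]
# Returns 0 on the first successful response, 1 if every attempt fails.
wait_for_url() {
    url=$1
    attempts=${2:-10}
    delay=${3:-2}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if curl -fsS --max-time 5 -o /dev/null "$url" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only when another attempt will follow.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, wait_for_url http://192.168.122.144:9090/-/ready waits for Prometheus to come back. Once it is ready, the /api/v1/alertmanagers endpoint on the same address lists the Alertmanager instances Prometheus is sending alerts to.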

To trigger the alert:

As we have integrated Ansible Tower with Prometheus, we can stop Ansible Tower to simulate the alert.

Note: See the details at "How to monitor Red Hat Ansible Tower Using Prometheus, Node Exporter Grafana".

# ansible-tower-service stop
Stopping Tower
Redirecting to /bin/systemctl stop postgresql-9.6.service
Redirecting to /bin/systemctl stop rabbitmq-server.service
Redirecting to /bin/systemctl stop nginx.service
Redirecting to /bin/systemctl stop supervisord.service

To verify the alert:

From the Prometheus user interface.
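
The alert state can also be checked from a shell. This is a sketch that assumes curl and amtool are available. prom_alerts, has_firing_alert and am_alerts are hypothetical helper names, and has_firing_alert is only a crude text match on the JSON, not a real parser.

```shell
# Print the alerts that Prometheus currently holds (pending or firing) as JSON.
# Usage: prom_alerts [base_url]
prom_alerts() {
    base=${1:-http://localhost:9090}
    curl -fsS --max-time 5 "${base%/}/api/v1/alerts"
}

# Read API JSON on stdin and return 0 if any alert is in the "firing" state.
has_firing_alert() {
    grep -Eq '"state"[[:space:]]*:[[:space:]]*"firing"'
}

# List the alerts that Alertmanager has received, using amtool.
# Usage: am_alerts [alertmanager_url]
am_alerts() {
    url=${1:-http://localhost:9093}
    amtool alert query --alertmanager.url="$url"
}
```

For example, prom_alerts http://192.168.122.144:9090 | has_firing_alert && echo "an alert is firing". Because the rule uses for: 1m, the alert stays pending for the first minute after the instance goes down, and am_alerts http://192.168.122.80:9093 shows it only once it is firing.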