Prometheus/Alertmanager: Slack notifications not showing full alert data


I run Prometheus and Alertmanager (0.26.0) with Docker Compose on a VM.

My problem is that, most of the time, alerts do not show all of their data in Slack.

Say I get one good alert; then, when I run docker-compose restart and point the probe at an instance that I know will fail and fire an alert, the notification does not contain the data I need.

I don't think my alert format is the problem, since as you can see the first notification shows all the data, but all the later ones are missing it.

Any idea or suggestion how I could get the full alert data showing in every Slack notification?
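
In case it helps with diagnosis, the raw alerts on both sides can be inspected like this (just a sketch of the checks; the ports and container name come from the compose file below):

# alerts currently firing in Prometheus
curl -s http://localhost:9090/api/v1/alerts

# alerts as Alertmanager sees them, with all labels and annotations
curl -s http://localhost:9093/api/v2/alerts

# the same check with amtool, which ships in the prom/alertmanager image
docker exec alertmanager amtool alert query --alertmanager.url=http://localhost:9093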

SLACK MESSAGES:

(screenshot: Slack notifications)

PROMETHEUS:

(screenshot: Prometheus UI)

ALERTMANAGER UI:

(screenshot: Alertmanager UI)

ALERTMANAGER DEBUG LOGS:

(screenshot: Alertmanager debug logs)

prometheus.yml

global:
  scrape_interval: 10s
  evaluation_interval: 5s

scrape_configs:
  - job_name: 'My Hosts'
    metrics_path: /probe
    params:
      module: [icmp]    # blackbox exporter ICMP probe module
    file_sd_configs:
      - files:
          - '/etc/prometheus/targets.yml'
    relabel_configs:
      # pass the scraped address to the blackbox exporter as the ?target= parameter
      - source_labels: [__address__]
        target_label: __param_target
      # keep the probed address as the instance label
      - source_labels: [__param_target]
        target_label: instance
      # actually scrape the blackbox exporter
      - target_label: __address__
        replacement: blackbox:9115

rule_files:
  - "/etc/prometheus/rules/my_rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - 'alertmanager:9093'

targets.yml

- targets: ['10.113.204.3']
  labels:
    instance_name: '44-base-ws03.dddemo3.com'
    cluster_name: '44-Test'

my_rules.yml

groups:
  - name: Instances
    rules:
    - alert: InstanceDown
      expr: probe_success == 0
      for: 10s
      labels:
        severity: CRITICAL
        instance: "{{ $labels.instance }}"
        job: "{{ $labels.job }}"
        instance_name: "{{ $labels.instance_name }}"
        cluster_name: "{{ $labels.cluster_name }}"
      annotations:
        description: '{{ $labels.instance }} has been down for more than 10 seconds.'
        summary: 'Instance: {{ $labels.instance }} is down'

alertmanager.yml

global:
  slack_api_url: 'my_slack_url'
route:
  receiver: 'slack-notifications'
  group_by: ['alertname']
  group_interval: 1m
  group_wait: 20s
  repeat_interval: 2m

receivers:
- name: 'slack-notifications'
  slack_configs:
  - channel: '#se_goes_alerts_dev'
    send_resolved: true
    title: ':bullhorn-slack: {{ .CommonLabels.severity | toUpper }}: {{ .CommonLabels.alertname }} {{ .CommonLabels.instance }} - {{ .CommonLabels.job }}'
    text: |
      Description:  {{ .CommonAnnotations.description }}
      Summary:  {{ .CommonAnnotations.summary }}
      Instance Name: {{ .CommonLabels.instance_name }}
      cluster_name: {{ .CommonLabels.cluster_name }}
      instance: {{ .CommonLabels.instance }}
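
(Note that everything in this text block comes from CommonLabels/CommonAnnotations, which Alertmanager only fills in for fields shared by every alert in the group. A per-alert variant of the same text template, iterating over .Alerts instead, would look roughly like the sketch below; it is not what I currently run:)

    text: |
      {{ range .Alerts }}
      Description: {{ .Annotations.description }}
      Summary: {{ .Annotations.summary }}
      Instance Name: {{ .Labels.instance_name }}
      cluster_name: {{ .Labels.cluster_name }}
      instance: {{ .Labels.instance }}
      {{ end }}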

docker-compose.yml

version: '3'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus:/etc/prometheus
    ports:
      - '9090:9090'
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--web.enable-remote-write-receiver'
    networks:
      - my-network
      
  alertmanager:
    image: prom/alertmanager:latest
    container_name: alertmanager
    command:
      - '--config.file=/etc/alertmanager/alertmanager.yml'
      - '--storage.path=/alertmanager'
      - '--log.level=debug'
    ports:
      - '9093:9093'
    volumes:
      - ./alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml
      - ./alertmanager:/etc/alertmanager
    networks:
      - my-network
    depends_on:
      - prometheus

I have tried debugging with --log.level=debug and checked various GitHub issues, but I couldn't find the answer. Using Alertmanager to send Slack alerts has been inconsistent for me in which notifications include the data.
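
For reference, this is roughly how the debug output can be followed while reproducing the problem (a sketch; the container name comes from the compose file above):

# follow Alertmanager's debug log and watch the notification pipeline
docker-compose logs -f alertmanager | grep -i notify

# or narrow it down to the Slack integration messages
docker logs -f alertmanager 2>&1 | grep -i slack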

Any help is highly appreciated.
