Monitoring Task: Prometheus and Grafana Health Checks

Description

Ensure the monitoring stack is up to date, reachable, and correctly collecting and visualizing metrics.

Checklist

  • Is Prometheus scraping targets?
  • Are Grafana dashboards loading?
  • Are alerts firing as expected?

Triage Steps

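  1. Check Prometheus status:
       • Visit the /targets page and confirm each scrape target is UP
       • Review the container logs:
           docker logs prometheus

  2. Check Grafana:
       • Open the dashboards and confirm panels load with data
       • Verify that each data source connects
       • Check grafana.log for errors

  3. Test alerts (trigger one manually, or add a silence and confirm it takes effect)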
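
The Prometheus and Grafana checks above can also be scripted. The following is a minimal sketch, not a definitive implementation. It assumes Prometheus listens on http://localhost:9090 and Grafana on http://localhost:3000 (both base URLs are assumptions, as are the function names), and it counts matches in the raw JSON with grep rather than parsing it, so treat the result as a quick signal only.

```shell
# Assumed base URLs; override them to match your deployment.
PROM_URL="${PROM_URL:-http://localhost:9090}"
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"

# Succeeds if Prometheus answers its health endpoint with a 2xx status.
prometheus_healthy() {
  curl -fsS "$PROM_URL/-/healthy" >/dev/null
}

# Prints the number of active scrape targets whose health is "down".
# Returns non-zero if the targets API cannot be reached.
count_down_targets() {
  local body
  body=$(curl -fsS "$PROM_URL/api/v1/targets?state=active") || return 1
  printf '%s\n' "$body" \
    | grep -oE '"health"[[:space:]]*:[[:space:]]*"down"' \
    | wc -l | tr -d ' '
}

# Succeeds if Grafana's health endpoint reports its database as "ok".
grafana_healthy() {
  local body
  body=$(curl -fsS "$GRAFANA_URL/api/health") || return 1
  printf '%s\n' "$body" | grep -qE '"database"[[:space:]]*:[[:space:]]*"ok"'
}
```

Example use: run prometheus_healthy && echo "Prometheus OK", then check that count_down_targets prints 0. Any non-zero count means the /targets page and the Prometheus logs deserve a closer look.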

Preventive Actions

  • Back up config files and dashboards
  • Set up external alert notifications (email, Slack)
  • Monitor the monitoring stack itself, so an outage of Prometheus or Grafana does not go unnoticed
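
As one way to handle the config backup, the sketch below archives a configuration directory into a timestamped tarball. The function name, the archive naming scheme, and the example paths are all hypothetical. It covers files on disk only, so dashboards stored in Grafana's database would need a separate export.

```shell
# backup_monitoring_config SRC_DIR DEST_DIR
# Archives SRC_DIR into DEST_DIR as a timestamped .tar.gz and prints the archive path.
backup_monitoring_config() {
  local src="$1" dest="$2" archive
  [ -d "$src" ] || { echo "missing source dir: $src" >&2; return 1; }
  mkdir -p "$dest" || return 1
  archive="$dest/monitoring-config-$(date +%Y%m%d-%H%M%S).tar.gz"
  # -C keeps paths inside the archive relative to the parent of SRC_DIR.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
  echo "$archive"
}
```

Example use, with placeholder paths: backup_monitoring_config /etc/prometheus /var/backups/monitoring. Running it from a scheduled job and copying the archive off the host keeps the backup usable if the monitoring host itself is lost.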

Tools & Commands

  • Prometheus UI, Grafana UI, docker logs, Alertmanager
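
To trigger a test alert manually, one option is to post a synthetic alert straight to Alertmanager's v2 API. This is a sketch under assumptions: the base URL http://localhost:9093, the alert name RunbookTestAlert, the severity label, and both function names are illustrative and not part of the original setup.

```shell
# Assumed Alertmanager base URL; override it to match your deployment.
ALERTMANAGER_URL="${ALERTMANAGER_URL:-http://localhost:9093}"

# Prints a JSON array holding one synthetic alert with the given name.
test_alert_payload() {
  local name="${1:-RunbookTestAlert}"
  printf '[{"labels":{"alertname":"%s","severity":"info"},"annotations":{"summary":"Manual test alert from the monitoring health check"}}]' "$name"
}

# Posts the synthetic alert to Alertmanager; fails on any non-2xx response.
send_test_alert() {
  curl -fsS -X POST "$ALERTMANAGER_URL/api/v2/alerts" \
    -H 'Content-Type: application/json' \
    -d "$(test_alert_payload "$1")"
}
```

After send_test_alert succeeds, the alert should show up in the Alertmanager UI and be routed like any other alert, which makes it a reasonable check of the email or Slack notification path. Because no end time is set, Alertmanager treats it as resolved once its configured resolve timeout passes.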