Monitoring Task: Prometheus and Grafana Health Checks
Description
Ensure the monitoring stack is up to date, reachable, and correctly collecting and visualizing metrics.
Checklist
- Is Prometheus scraping targets?
- Are Grafana dashboards loading?
- Are alerts firing as expected?
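The first two checklist questions can be partly automated by probing each service's health endpoint. The following is a minimal sketch, not a definitive check: it assumes Prometheus is reachable at localhost:9090 and Grafana at localhost:3000 (the default ports, but the hosts are an assumption), and the names `http_status` and `probe` are illustrative. It uses Prometheus's `/-/healthy` endpoint and Grafana's `/api/health` endpoint.

```python
import urllib.error
import urllib.request

# Assumed base URLs; adjust to wherever the stack actually runs.
PROMETHEUS_URL = "http://localhost:9090"
GRAFANA_URL = "http://localhost:3000"


def http_status(url, timeout=5):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout, ...


def probe(fetch=http_status):
    """Map each service name to True if its health endpoint returned 200."""
    endpoints = {
        "prometheus": PROMETHEUS_URL + "/-/healthy",
        "grafana": GRAFANA_URL + "/api/health",
    }
    return {name: fetch(url) == 200 for name, url in endpoints.items()}
```

Calling `probe()` returns a dictionary such as `{"prometheus": True, "grafana": False}`. A passing probe only shows that the process is answering; it does not prove that targets are scraped or that dashboards render, so the remaining checklist items still need a look.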
Triage Steps
- Check Prometheus status:
  - Visit the /targets page to check scrape status
  - Review logs: docker logs prometheus
- Check Grafana:
  - Access dashboards
  - Verify data sources
  - Check grafana.log
- Test alerts (silence or trigger manually)
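The information shown on the /targets page is also available from the Prometheus HTTP API, which makes the scrape and alert checks scriptable. The sketch below assumes Prometheus at localhost:9090; the helper names (`fetch_json`, `unhealthy_targets`, `firing_alerts`) are hypothetical. It reads the JSON returned by `/api/v1/targets` and `/api/v1/alerts`.

```python
import json
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # assumed address


def fetch_json(path, base=PROMETHEUS_URL, timeout=5):
    """GET base + path and decode the JSON response body."""
    with urllib.request.urlopen(base + path, timeout=timeout) as resp:
        return json.load(resp)


def unhealthy_targets(payload):
    """Return (job, scrape URL, last error) for each active target not 'up'."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (
            t.get("labels", {}).get("job", "?"),
            t.get("scrapeUrl", "?"),
            t.get("lastError", ""),
        )
        for t in targets
        if t.get("health") != "up"
    ]


def firing_alerts(payload):
    """Return the alertname of each alert currently in the 'firing' state."""
    alerts = payload.get("data", {}).get("alerts", [])
    return [
        a.get("labels", {}).get("alertname", "?")
        for a in alerts
        if a.get("state") == "firing"
    ]
```

For example, `unhealthy_targets(fetch_json("/api/v1/targets"))` lists targets that are down or in an unknown state, and `firing_alerts(fetch_json("/api/v1/alerts"))` lists alerts that are firing. Comparing that list with what is expected after a manual trigger is one way to confirm the alert path works.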
Preventive Actions
- Backup config files and dashboards
- Set up external alert notifications (email, Slack)
- Monitor the monitoring stack itself, so an outage of Prometheus or Grafana does not go unnoticed
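Dashboard backups can be scripted against the Grafana HTTP API. The sketch below is one possible approach under stated assumptions: Grafana at localhost:3000, a service account token with read access, and hypothetical helper names (`grafana_get`, `backup_dashboards`). It lists dashboards via `/api/search?type=dash-db` and saves each JSON model fetched from `/api/dashboards/uid/<uid>`.

```python
import json
import pathlib
import urllib.request

GRAFANA_URL = "http://localhost:3000"  # assumed address


def grafana_get(path, token, base=GRAFANA_URL, timeout=10):
    """GET a Grafana API path using a bearer token and decode the JSON body."""
    req = urllib.request.Request(
        base + path, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def backup_dashboards(out_dir, get):
    """Write each dashboard's JSON model to out_dir/<uid>.json.

    `get` is any callable that takes an API path and returns decoded JSON,
    which keeps the network call separate from the file handling.
    Returns the list of paths written.
    """
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for item in get("/api/search?type=dash-db"):
        uid = item["uid"]
        model = get(f"/api/dashboards/uid/{uid}")["dashboard"]
        path = out / f"{uid}.json"
        path.write_text(json.dumps(model, indent=2, sort_keys=True))
        written.append(path)
    return written
```

A call might look like `backup_dashboards("backups/grafana", lambda p: grafana_get(p, token=os.environ["GRAFANA_TOKEN"]))`, where `GRAFANA_TOKEN` is a hypothetical environment variable and `os` must be imported. Writing sorted, indented JSON keeps the files diff-friendly if they are committed to version control. Large installations may need to page through the search results.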
Tools & Commands
- Prometheus UI, Grafana UI, docker logs, alertmanager
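When reviewing container logs, it can help to narrow the output to warnings and errors. The following sketch wraps `docker logs --tail` and filters the result; it assumes the container is named prometheus as in the triage step, and that log lines carry logfmt-style `level=` fields. The function names are illustrative.

```python
import subprocess


def error_lines(log_text):
    """Return log lines that carry a warning or error level field."""
    markers = ("level=error", "level=warn")
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line.lower() for marker in markers)
    ]


def recent_container_logs(container, tail=200):
    """Return the last `tail` log lines of a container (stdout and stderr)."""
    result = subprocess.run(
        ["docker", "logs", "--tail", str(tail), container],
        capture_output=True,
        text=True,
        check=True,  # raise if the container does not exist
    )
    # docker logs replays the container's stderr on stderr, so keep both.
    return result.stdout + result.stderr
```

For instance, `error_lines(recent_container_logs("prometheus"))` returns only the lines worth reading first. If the logs use a different format, such as JSON, the markers would need to change accordingly.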