Monitoring Task: Prometheus and Grafana Health Checks
Description
Ensure the monitoring stack is up to date, reachable, and correctly collecting and visualizing metrics.
Checklist
- Is Prometheus scraping targets?
- Are Grafana dashboards loading?
- Are alerts firing as expected?
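The first checklist item can also be checked without opening the UI. The following is a minimal sketch, assuming Prometheus is reachable at `http://localhost:9090` (an assumed default, not stated in this runbook). It reads the `/api/v1/targets` HTTP API and lists every active target whose health is not `up`. The helper names `unhealthy_targets` and `fetch_targets` are illustrative only.

```python
import json
import urllib.request

# Assumption: Prometheus listens on its default port on this host.
PROMETHEUS_URL = "http://localhost:9090"


def unhealthy_targets(payload):
    """Return (job, scrape_url, last_error) for each active target not 'up'."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (
            t.get("labels", {}).get("job", "?"),
            t.get("scrapeUrl", "?"),
            t.get("lastError", ""),
        )
        for t in targets
        if t.get("health") != "up"
    ]


def fetch_targets(base_url=PROMETHEUS_URL, timeout=5):
    """Fetch the targets document from the Prometheus HTTP API."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    for job, url, err in unhealthy_targets(fetch_targets()):
        print(f"NOT UP job={job} url={url} error={err}")
```

Keeping the parsing separate from the HTTP call makes it possible to run the check against a saved API response as well.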
Triage Steps
- Check Prometheus status:
  - Visit the `/targets` page to check scrape status
  - Review logs: `docker logs prometheus`
- Check Grafana:
  - Access dashboards
  - Verify data sources
  - Check `grafana.log`
- Test alerts (silence or trigger manually)
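One way to trigger an alert manually is to post a synthetic alert to Alertmanager's v2 API. The sketch below assumes Alertmanager is reachable at `http://localhost:9093` (an assumed default). The alert name `ManualTestAlert`, the `severity: info` label, and the helper names are hypothetical; whether the alert reaches a receiver depends on your routing configuration.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Assumption: Alertmanager listens on its default port on this host.
ALERTMANAGER_URL = "http://localhost:9093"


def build_test_alert(name="ManualTestAlert", minutes=5, now=None):
    """Build a one-alert payload that expires after `minutes` minutes."""
    now = now or datetime.now(timezone.utc)
    return [
        {
            # Hypothetical labels; match them to a route you want to exercise.
            "labels": {"alertname": name, "severity": "info"},
            "annotations": {"summary": "Manual test alert, safe to ignore"},
            "startsAt": now.isoformat(),
            "endsAt": (now + timedelta(minutes=minutes)).isoformat(),
        }
    ]


def send_alert(alerts, base_url=ALERTMANAGER_URL, timeout=5):
    """POST the alerts to Alertmanager and return the HTTP status code."""
    req = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    print(send_alert(build_test_alert()))
```

Setting `endsAt` lets the test alert resolve by itself, so it does not linger if nobody cleans it up.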
Preventive Actions
- Back up config files and dashboards
- Set up external alert notifications (email, Slack)
- Monitor the monitoring stack itself, so that an outage of Prometheus, Grafana, or Alertmanager does not go unnoticed
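A simple way to monitor the monitoring stack is an external probe that polls each component's health endpoint. The sketch below is one possible approach, not a prescribed setup. It uses Prometheus's and Alertmanager's `/-/healthy` endpoints and Grafana's `/api/health` endpoint; the hosts and ports are assumed defaults, and the function names are illustrative.

```python
import urllib.error
import urllib.request

# Assumption: default ports on localhost; adjust to your deployment.
HEALTH_ENDPOINTS = {
    "prometheus": "http://localhost:9090/-/healthy",
    "grafana": "http://localhost:3000/api/health",
    "alertmanager": "http://localhost:9093/-/healthy",
}


def is_healthy(url, timeout=5, opener=urllib.request.urlopen):
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False


def check_stack(endpoints=HEALTH_ENDPOINTS, **kwargs):
    """Probe every endpoint and return a {name: healthy} mapping."""
    return {name: is_healthy(url, **kwargs) for name, url in endpoints.items()}


if __name__ == "__main__":
    for name, ok in check_stack().items():
        print(f"{name}: {'ok' if ok else 'UNREACHABLE'}")
```

Run a probe like this from a host outside the monitoring stack (for example, from cron), so that it still reports when the stack itself is down.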
Tools & Commands
- Prometheus UI, Grafana UI, `docker logs`, Alertmanager