Grafana Integration¶
Grafana connects to Victoria Metrics to visualize platform metrics and to Jaeger for trace exploration. Alerting uses Grafana's built-in unified alerting engine with provisioned contact points and alert rules — no custom backend endpoints involved.
Dashboards¶
Grafana is available at http://localhost:3000 when the stack is running (anonymous viewer access enabled by default).
Victoria Metrics serves as the Prometheus-compatible data source. See Metrics Reference for the
full metric catalog and example PromQL queries.
Alerting Architecture¶
flowchart LR
VM["Victoria Metrics"] --> Grafana
Grafana -->|"evaluate rules"| Grafana
Grafana -->|"notify"| Slack["Slack"]
Grafana -->|"notify"| Email["Email"]
Grafana -->|"notify"| PagerDuty["PagerDuty / etc."]
Grafana's unified alerting engine evaluates rules on a schedule, queries Victoria Metrics, and sends notifications directly to configured contact points (Slack, email, PagerDuty, OpsGenie, webhooks, etc.). This is the standard Grafana approach — no intermediate backend service is needed.
Provisioning¶
Alert configuration is managed via YAML files in backend/grafana/provisioning/alerting/. Grafana loads these on
startup, so alert rules, contact points, and notification policies are version-controlled and reproducible.
The provisioning file ships with commented-out examples for common setups:
Contact Points¶
Contact points define where notifications go. Uncomment and configure in alerting.yml:
contactPoints:
- orgId: 1
name: slack-notifications
receivers:
- uid: slack-receiver
type: slack
settings:
url: https://hooks.slack.com/services/YOUR/WEBHOOK/URL
recipient: "#alerts"
Grafana supports 20+ contact point types out of the box: Slack, email, PagerDuty, OpsGenie, Microsoft Teams, generic webhooks, and more. See the Grafana contact points documentation for the full list.
Notification Policies¶
Policies route alerts to the right contact point based on labels:
policies:
- orgId: 1
receiver: slack-notifications
group_by: ["alertname", "namespace"]
group_wait: 30s
group_interval: 5m
repeat_interval: 4h
Alert Rules¶
Rules query Victoria Metrics and fire when thresholds are breached. Example for HTTP 5xx error rate:
groups:
- orgId: 1
name: backend-alerts
folder: Integr8sCode
interval: 1m
rules:
- uid: high-error-rate
title: High HTTP 5xx Error Rate
condition: C
data:
- refId: A
relativeTimeRange:
from: 300
to: 0
datasourceUid: victoria-metrics
model:
expr: >
sum(rate(http_requests_total{status=~"5.."}[5m]))
/ sum(rate(http_requests_total[5m])) * 100
for: 5m
labels:
severity: warning
annotations:
summary: "HTTP 5xx error rate is above 5%"
Configuration¶
Unified alerting is enabled in backend/grafana/grafana.ini:
The [alerting] section controls the legacy alerting engine (Grafana < 9) and stays disabled. [unified_alerting]
is the modern engine used for all provisioned rules.
Key Files¶
| File | Purpose |
|---|---|
grafana/grafana.ini |
Grafana server configuration |
grafana/provisioning/alerting/alerting.yml |
Alert rules and contact points |
grafana/provisioning/datasources/ |
Victoria Metrics data source |