Grafana Integration
Grafana connects to Victoria Metrics to visualize platform metrics, to Jaeger for trace exploration, and to Loki for log aggregation. Alerting uses Grafana's built-in unified alerting engine with provisioned contact points and alert rules — no custom backend endpoints involved.
Dashboards
Grafana is available at http://localhost:3000 when the stack is running. Login is required — anonymous access is
disabled. Admin credentials are set via GRAFANA_ADMIN_USER and GRAFANA_ADMIN_PASSWORD environment variables in
docker-compose.yaml (defaults: admin / admin123).
Victoria Metrics serves as the Prometheus-compatible data source. See Metrics Reference for the
full metric catalog and example PromQL queries.
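Because Victoria Metrics exposes the Prometheus-compatible HTTP API, a PromQL query can also be checked outside Grafana. The sketch below only builds the request URL for the instant query endpoint (`/api/v1/query`). The base URL is an assumption (Victoria Metrics' default port 8428 on localhost) and `instant_query_url` is a hypothetical helper name; adjust both to match the actual stack.

```python
from urllib.parse import urlencode

# Assumption: Victoria Metrics is reachable on its default port 8428.
VM_BASE_URL = "http://localhost:8428"


def instant_query_url(promql: str, base_url: str = VM_BASE_URL) -> str:
    """Build a URL for the Prometheus-compatible instant query endpoint."""
    # urlencode percent-encodes brackets, braces, and quotes in the PromQL text.
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"
```

Opening the resulting URL with any HTTP client should return the JSON result for the query, provided the service is reachable at that address.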
Alerting Architecture
flowchart LR
VM["Victoria Metrics"] --> Grafana
Grafana -->|"evaluate rules"| Grafana
Grafana -->|"notify"| Slack["Slack"]
Grafana -->|"notify"| Email["Email"]
Grafana -->|"notify"| PagerDuty["PagerDuty / etc."]
Grafana's unified alerting engine evaluates rules on a schedule, queries Victoria Metrics, and sends notifications directly to configured contact points (Slack, email, PagerDuty, OpsGenie, webhooks, etc.). This is the standard Grafana approach — no intermediate backend service is needed.
Provisioning
Alert configuration is managed via YAML files in backend/grafana/provisioning/alerting/. Grafana loads these on
startup, so alert rules, contact points, and notification policies are version-controlled and reproducible.
The provisioning file (alerting.yml) defines the active infrastructure alert rules under groups and ships with empty contactPoints: [] and policies: []. Populate these two keys to route alerts to Slack, email, or other channels (see the examples below).
Built-in Alert Rules
Two host-memory alert rules are provisioned out of the box. They query the system_memory_utilization metric produced by the OTel Collector's hostmetrics receiver:
| Rule | Threshold | for | Severity |
|---|---|---|---|
| Host Memory > 85 % | 85 % | 5 m | warning |
| Host Memory > 95 % | 95 % | 2 m | critical |
These rules fire into Grafana's built-in alerting UI. To route them to Slack, email, or PagerDuty, add a contact point and notification policy (see below).
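The interplay between a threshold and the `for` duration in the table above can be illustrated with a small simulation. This is a simplified model, not Grafana's actual implementation: `RuleState` and its fields are hypothetical names, values are assumed to be percentages, and timestamps are plain seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleState:
    threshold: float                      # e.g. 85.0 (percent)
    hold_seconds: int                     # the rule's "for" duration, e.g. 300 for 5 m
    breached_since: Optional[int] = None  # time of the first breaching evaluation

    def evaluate(self, timestamp: int, value: float) -> str:
        """Return "normal", "pending", or "firing" for one evaluation."""
        if value <= self.threshold:
            # Any non-breaching evaluation resets the pending timer.
            self.breached_since = None
            return "normal"
        if self.breached_since is None:
            self.breached_since = timestamp
        # The alert fires only after the breach has lasted the full duration.
        if timestamp - self.breached_since >= self.hold_seconds:
            return "firing"
        return "pending"
```

In this model, a rule with a threshold of 85 and a 5 m duration that sees a value of 90 once per minute stays pending for the first five evaluations and starts firing on the sixth. A single evaluation at or below the threshold returns it to normal.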
Contact Points
Contact points define where notifications go. Configure in alerting.yml:
contactPoints:
  - orgId: 1
    name: slack-notifications
    receivers:
      - uid: slack-receiver
        type: slack
        settings:
          url: https://hooks.slack.com/services/YOUR/WEBHOOK/URL
          recipient: "#alerts"
Grafana supports 20+ contact point types out of the box: Slack, email, PagerDuty, OpsGenie, Microsoft Teams, generic webhooks, and more. See the Grafana contact points documentation for the full list.
Notification Policies
Policies route alerts to the right contact point based on labels:
policies:
  - orgId: 1
    receiver: slack-notifications
    group_by: ["alertname", "namespace"]
    group_wait: 30s
    group_interval: 5m
    repeat_interval: 4h
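The effect of `group_by` can be pictured as bucketing alerts by the values of the listed labels, so that each bucket produces one notification. The following is an illustrative sketch only: `group_alerts` is a hypothetical helper, and treating a missing label as an empty string is an assumption made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Alert = Dict[str, str]  # label name -> label value


def group_alerts(alerts: List[Alert], group_by: List[str]) -> Dict[Tuple[str, ...], List[Alert]]:
    """Bucket alerts by the values of the group_by labels."""
    groups: Dict[Tuple[str, ...], List[Alert]] = defaultdict(list)
    for labels in alerts:
        # Assumption: a label that is absent is treated as an empty value.
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)
```

With `group_by: ["alertname", "namespace"]`, two alerts that share an alert name and namespace but differ in severity fall into the same bucket, while the same alert name in another namespace forms its own group.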
Alert Rules
Rules query Victoria Metrics and fire when thresholds are breached. Example for HTTP 5xx error rate:
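groups:
  - orgId: 1
    name: backend-alerts
    folder: Integr8sCode
    interval: 1m
    rules:
      - uid: high-error-rate
        title: High HTTP 5xx Error Rate
        condition: C
        data:
          # A: share of requests that returned 5xx, as a percentage
          - refId: A
            relativeTimeRange:
              from: 300
              to: 0
            datasourceUid: victoria-metrics
            model:
              expr: >
                sum(rate(http_requests_total{status=~"5.."}[5m]))
                / sum(rate(http_requests_total[5m])) * 100
              # instant query, so the threshold below receives a single value
              instant: true
              refId: A
          # C: the condition; true when A is above 5 (percent)
          - refId: C
            datasourceUid: __expr__
            model:
              type: threshold
              expression: A
              conditions:
                - evaluator:
                    type: gt
                    params: [5]
              refId: C
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "HTTP 5xx error rate is above 5%"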
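The arithmetic behind the 5xx expression can be worked through by hand. The sketch below uses a deliberately simplified `rate()` (counter increase divided by window length, ignoring counter resets and extrapolation), so it approximates rather than reproduces what Victoria Metrics computes. The function names and sample numbers are hypothetical.

```python
from typing import Optional


def per_second_rate(first: float, last: float, window_seconds: float) -> float:
    """Simplified rate(): counter increase divided by the window length."""
    return (last - first) / window_seconds


def error_rate_percent(
    err_first: float,
    err_last: float,
    all_first: float,
    all_last: float,
    window_seconds: float = 300.0,  # matches the [5m] range in the expression
) -> Optional[float]:
    """Percentage of requests that returned 5xx over the window."""
    total = per_second_rate(all_first, all_last, window_seconds)
    if total == 0:
        # No traffic in the window: there is no meaningful ratio to report.
        return None
    return per_second_rate(err_first, err_last, window_seconds) / total * 100
```

For instance, if the 5xx counter rises from 100 to 160 while the total request counter rises from 1000 to 2000 within the same five minutes, the result is about 6 percent, which is above the 5 percent mentioned in the rule's summary.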
Configuration
The full Grafana server configuration lives in backend/grafana/grafana.ini:
[users]
allow_sign_up = false
allow_org_create = false
auto_assign_org = true
[smtp]
enabled = true
host = ${GF_SMTP_HOST}
user = ${GF_SMTP_USER}
password = ${GF_SMTP_PASSWORD}
from_address = ${GF_SMTP_FROM_ADDRESS}
from_name = Integr8sCode Alerts
starttls_policy = MandatoryStartTLS
[auth]
disable_login_form = false
[auth.anonymous]
enabled = false
[auth.basic]
enabled = true
[unified_alerting]
enabled = true
[alerting]
enabled = false
Anonymous access is disabled and basic auth is enabled. Admin credentials are injected via GF_SECURITY_ADMIN_USER
and GF_SECURITY_ADMIN_PASSWORD environment variables in docker-compose.yaml.
The [alerting] section controls the legacy alerting engine (Grafana < 9) and stays disabled. [unified_alerting]
is the modern engine used for all provisioned rules.
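Variable names such as GF_SECURITY_ADMIN_USER follow Grafana's documented convention for overriding ini settings from the environment: `GF_<SECTION>_<KEY>`, uppercased, with `.` and `-` replaced by `_`. The helper below is a hypothetical illustration of that naming rule, not part of Grafana or of this repository.

```python
import re


def grafana_env_var(section: str, key: str) -> str:
    """Map an ini section and key to the GF_<SECTION>_<KEY> override name."""
    # Dots and dashes are not valid in the variable name, so they become underscores.
    raw = f"{section}_{key}"
    return "GF_" + re.sub(r"[.\-]", "_", raw).upper()
```

Under this rule, the `enabled` key in `[auth.anonymous]` corresponds to `GF_AUTH_ANONYMOUS_ENABLED`, and `admin_user` in `[security]` corresponds to `GF_SECURITY_ADMIN_USER`.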
Key Files
| File | Purpose |
|---|---|
| grafana/grafana.ini | Grafana server configuration |
| grafana/provisioning/alerting/alerting.yml | Alert rules and contact points |
| grafana/provisioning/datasources/ | Victoria Metrics, Jaeger, and Loki data sources |