Dashboards
All dashboards are pre-provisioned and available in Grafana under the Project Sauron folder immediately after first launch.
Web Traffic & Uptime
UID: web-traffic-sauron |
Refresh: 1 minute | Default time range: Last 24 hours |
Monitors the availability and performance of all frontend and web properties.
Panels
| Panel | Description |
|---|---|
| Endpoint Status | Current UP/DOWN state for all monitored URLs |
| Response Time | HTTP probe duration over time, with mean and max |
| Uptime (5m avg) | Rolling 5-minute availability percentage per endpoint |
| SSL Certificate Expiry | Days until SSL certificate expires — turns red at < 30 days |
Key Metrics
probe_success— 1 if the HTTP probe succeeded, 0 if it failedprobe_duration_seconds— how long the probe tookprobe_ssl_earliest_cert_expiry— Unix timestamp of the earliest cert expiryendpoint:probe_success:rate5m— recording rule: 5-minute uptime ratio
API & Host Overview
UID: api-overview-sauron |
Refresh: 30 seconds | Default time range: Last 6 hours |
The default home dashboard. Shows API endpoint health alongside the EC2 host resource utilization.
Panels
| Panel | Description |
|---|---|
| API Status | UP/DOWN state for API endpoints specifically |
| Host CPU | Current CPU utilization % of the EC2 instance |
| Host Memory | Current memory utilization ratio |
| Host Disk | Current disk utilization ratio for / |
| API Response Time | Probe duration for API endpoints, with mean and P95 |
| Host CPU Over Time | CPU utilization trend |
Key Metrics
instance:node_cpu_utilisation:rate5m— recording rule: CPU %instance:node_memory_utilisation:ratio— recording rule: memory ratioinstance:node_disk_utilisation:ratio— recording rule: disk rationode_network_receive_bytes_total— network receive bytes
AWS Overview
UID: aws-overview-sauron |
Refresh: 5 minutes | Default time range: Last 6 hours |
CloudWatch metrics bridged via the CloudWatch Exporter. Covers EC2, Lambda, and S3.
Panels
| Panel | Description |
|---|---|
| EC2 CPU Utilization | CloudWatch EC2 CPU % per instance |
| EC2 Network I/O | Inbound and outbound network throughput |
| Lambda Invocations | Invocation count per function |
| Lambda Errors | Error count per function |
| Lambda Duration | Average execution duration per function |
Notes
- CloudWatch metrics have a 1-minute resolution (Prometheus scrapes every 60s)
- Lambda function names appear automatically as dimensions once data flows
- S3 metrics are daily — they won’t appear on sub-24h time ranges
Adding Custom Dashboards
To add a new dashboard:
- Build it in the Grafana UI
- Export as JSON (
Dashboard settings > JSON Model) - Save the JSON file to
monitoring/grafana/dashboards/ - Commit and push — the dashboard will auto-provision on next restart
Or use allowUiUpdates: true in the provisioning config to save changes directly from the UI (changes persist to the Docker volume but won’t be in git).