Architecture
Overview
Project Sauron uses a pull-based metrics architecture: Prometheus scrapes exporters on a 15-second interval, stores metrics in a local time-series database (30-day retention), and Grafana queries Prometheus to render dashboards.
All services run as Docker containers on a single EC2 t3.small instance, orchestrated by Docker Compose. This minimizes cost and operational complexity for a personal observability stack.
Component Descriptions
Prometheus
- Role: Metrics store and scrape engine
- Port: 9090 (internal only — not exposed to the internet)
- Config:
monitoring/prometheus/prometheus.yml - Retention: 30 days of metrics
- Access: Via SSH tunnel (
ssh -L 9090:localhost:9090 ec2-user@<ip>)
Grafana
- Role: Visualization, dashboarding, and alerting UI
- Port: 3000 (public, protected by admin password)
- Config: Provisioned automatically from
monitoring/grafana/provisioning/ - Datasource: Prometheus (auto-provisioned)
- Dashboards: Auto-provisioned from
monitoring/grafana/dashboards/
Node Exporter
- Role: Exposes EC2 host metrics (CPU, memory, disk, network) to Prometheus
- Port: 9100 (internal only)
- Metrics: All standard Linux host metrics
Blackbox Exporter
- Role: Probes HTTP/HTTPS endpoints from the outside, measuring uptime, response time, and SSL certificate health
- Port: 9115 (internal only)
- Modules:
http_2xx,tcp_connect(TLS cert expiry)
CloudWatch Exporter
- Role: Bridges AWS CloudWatch metrics into Prometheus
- Port: 9106 (internal only)
- Metrics: EC2 CPU, network; Lambda invocations, errors, duration; S3 bucket size
- Auth: IAM role attached to EC2 instance (no static credentials needed on EC2)
Data Flow
External Endpoints (HTTP)
│
▼
Blackbox Exporter ──────────────────────────┐
│
EC2 Host ──► Node Exporter ─────────────────┤
▼
AWS CloudWatch ──► CloudWatch Exporter ──► Prometheus ──► Grafana ──► Browser
│
Rules Engine
(Alert evaluation)
Security Model
| Resource | Access |
|---|---|
| Grafana (:3000) | Public internet (password protected) |
| Prometheus (:9090) | Internal only — SSH tunnel required |
| Node Exporter (:9100) | Internal only |
| Blackbox (:9115) | Internal only |
| CloudWatch (:9106) | Internal only |
| SSH (:22) | Restricted to your IP via security group |
Infrastructure
| Resource | Value |
|---|---|
| Provider | AWS |
| Region | us-east-1 (configurable) |
| Instance | EC2 t3.small |
| AMI | Amazon Linux 2023 (latest) |
| Storage | 20 GiB gp3 EBS (encrypted) |
| IP | Elastic IP (stable across reboots) |
| IAM | EC2 role with CloudWatch read-only |
Alerting
Alert rules are defined in monitoring/prometheus/rules/alerting.yml. Configured alerts include:
- Host CPU > 80% for 5 minutes
- Host memory < 15% available
- Host disk < 20% available
- Any monitored endpoint down for 2+ minutes
- Endpoint response time > 2s
- SSL certificate expiring within 30 days
- Prometheus target missing
Note: Alertmanager is not yet configured. Alerts are evaluated but not routed. To add notifications, add an Alertmanager service to docker-compose.yml and configure receivers (email, Slack, PagerDuty).