This project shows a full observability pipeline running on four servers: a Web Server that hosts the app and Alloy,
a Prometheus Server for metrics collection, a Loki Server for centralized logs, and a Grafana Server for dashboards,
visualization, alerting, and Slack notifications.
Web Server
💻 Web + Alloy
Hosts the Flask application, exposes app metrics, exposes node metrics through Node Exporter,
and uses Alloy to ship logs and forward metrics.
Main components
Flask / Titan application on port 5000
Node Exporter on port 9100
Alloy agent for metrics and logs
Load generation + fake traffic scripts
What it sends
Application metrics → Prometheus
System metrics → Prometheus
Application logs → Loki
Why it matters
Shows CPU, memory, disk, and process health
Shows request rate, endpoint traffic, and app behavior
Produces logs for troubleshooting and alert investigations
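To make "exposes app metrics" concrete, here is a minimal sketch of the Prometheus text exposition format that a metrics endpoint returns when scraped. It uses only the standard library; the metric name `http_requests_total`, the `endpoint` label, and the helper `render_counter` are assumptions for illustration, not names taken from the project, which may well rely on a client library instead.

```python
def render_counter(name, help_text, samples):
    """Render one counter in the Prometheus text exposition format.

    `samples` maps a tuple of (label, value) pairs to the counter value.
    Label values are assumed to need no escaping in this sketch.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in sorted(samples.items()):
        label_str = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(name + "{" + label_str + "} " + str(value))
    return "\n".join(lines) + "\n"


# Hypothetical request counts keyed by an "endpoint" label.
requests = {
    (("endpoint", "/"),): 12,
    (("endpoint", "/health"),): 3,
}
text = render_counter("http_requests_total", "Total HTTP requests.", requests)
print(text)
```

Each sample line pairs a metric name and its labels with the current value, which is what lets the dashboards later break traffic down by endpoint.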
Prometheus Server
🔥 Prometheus
Scrapes metrics from configured targets, stores them as time-series data,
and answers PromQL queries used by Grafana dashboards and alerts.
Main responsibilities
Collect metrics from targets at each scrape interval
Store metrics in a time-series database
Support PromQL queries for monitoring panels
Track target health and availability
Key targets
Prometheus self-monitoring
Web server system metrics job
Web server application metrics job
Example metrics
CPU utilization and load average
Memory available and disk usage
HTTP request totals and request rate
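As a rough illustration of how a dashboard panel asks Prometheus for data, the sketch below builds the URL for an instant query against the Prometheus HTTP API (`/api/v1/query`). The host `prometheus.example`, the metric name `http_requests_total`, and the `endpoint` label are placeholders assumed for the example; the project's real job and metric names may differ.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for a Prometheus instant query (GET /api/v1/query)."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


# Hypothetical host and metric name, for illustration only.
url = instant_query_url(
    "http://prometheus.example:9090",
    "sum by (endpoint) (rate(http_requests_total[5m]))",
)
print(url)
```

Grafana issues queries of this kind on the user's behalf; building the URL by hand is mainly useful for checking a PromQL expression outside a dashboard.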
Loki Server
🟢 Loki
Receives and stores logs sent by Alloy, keeps labels for efficient log filtering,
and makes application logs searchable inside Grafana.
Main responsibilities
Log aggregation from the web server
Label-based log indexing and querying
Centralized log storage for troubleshooting
Support for log exploration in Grafana
Incoming data
App logs from /var/log/titan
Error logs, access logs, info logs
Streams forwarded by Alloy
Why it matters
Correlates logs with metrics during incidents
Helps find root cause faster
Supports observability beyond dashboards
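In this setup Alloy ships the logs, so nothing below needs to be written by hand. Still, the shape of the data helps explain label-based querying. The sketch builds a JSON body in the format accepted by Loki's push endpoint (`/loki/api/v1/push`); the label names `job` and `level`, the sample log lines, and the helper `build_push_payload` are assumptions for illustration.

```python
import json
import time


def build_push_payload(labels, lines, now_ns=None):
    """Build a JSON body for Loki's push endpoint (POST /loki/api/v1/push)."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Each entry is [timestamp in nanoseconds as a string, log line].
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    # One stream is one unique label set; Loki indexes labels, not line text.
    return {"streams": [{"stream": labels, "values": values}]}


# Hypothetical labels and log lines.
payload = build_push_payload(
    {"job": "titan", "level": "error"},
    ["db connection refused", "retrying in 5s"],
    now_ns=1_700_000_000_000_000_000,
)
print(json.dumps(payload, indent=2))
```

Because only the label set is indexed, filtering by a label such as `level` stays cheap, while matching on the line text is applied to the selected streams afterwards.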
Grafana Server
📊 Grafana
Connects to Prometheus and Loki, builds dashboards, visualizes metrics and logs,
defines threshold-based alerts, and sends notifications to Slack.
Main responsibilities
Dashboards for system and application monitoring
Panels for CPU, memory, disk, and HTTP traffic
Alert rules, thresholds, and evaluation groups
Notification routing to Slack contact points
Data sources
Prometheus for metrics
Loki for logs
Slack webhook as the contact point for notifications
Example outcomes
Production dashboards by environment and service
Root disk usage alerts and resolved notifications
Live troubleshooting with metrics + logs together
Web Server exposes metrics and logs ➜ Prometheus scrapes metrics / Loki ingests logs ➜ Grafana builds dashboards and sends alerts
Dashboards & Observability Views
System Dashboard
CPU utilization
Memory available
Root disk usage
Application Dashboard
HTTP requests total
Request rate by endpoint
Dynamic variable filtering
Logs & Investigation
Search app logs by labels
Inspect errors and request traces
Correlate metrics with incidents
Metrics from Prometheus
Logs from Loki
App traffic and request rate
Threshold breaches and alerts
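The request rate shown on the application dashboard is derived from a counter that only ever increases. The sketch below approximates that derivation in plain Python: it sums the increases between consecutive samples and divides by the elapsed time. It is a simplification of what PromQL's `rate()` does (the extrapolation at window edges is left out), and the sample values are invented for illustration.

```python
def per_second_rate(samples):
    """Approximate a counter's per-second increase over a window.

    `samples` is a list of (unix_seconds, counter_value) pairs in time order.
    A drop in value is treated as a counter reset (for example, an app
    restart), so the new value counts as the increase since the reset.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        increase += (curr - prev) if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Invented samples 15 s apart; the counter resets between the 2nd and 3rd.
samples = [(0, 100), (15, 130), (30, 10), (45, 40)]
rate = per_second_rate(samples)
print(rate)
```

Handling the reset matters here: a naive "last minus first" would report a negative rate after a restart, whereas the summed increases keep the panel meaningful.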
Alerting & Notifications
🚨 Critical Alert
Root Disk Usage > 65%
Grafana rule enters Firing state
⚠️ Warning Alert
CPU or Memory threshold crossed
Evaluation group checks every minute
✅ Resolved Notification
Metric returned to normal range
Slack receives recovery message
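The firing and resolved flow above can be sketched as a small state machine. This is a simplified model under stated assumptions, not Grafana's actual evaluation logic: it ignores pending periods and evaluation groups, the message wording is invented, and the only Slack detail relied on is that an incoming webhook accepts a JSON body with a `text` field. The 65% threshold comes from the critical rule listed above.

```python
THRESHOLD_PERCENT = 65.0  # from the "Root Disk Usage > 65%" rule


def next_state(usage_percent, threshold=THRESHOLD_PERCENT):
    """Return the alert state for a single evaluation."""
    return "Firing" if usage_percent > threshold else "Normal"


def notifications(readings, threshold=THRESHOLD_PERCENT):
    """Yield a Slack-style message for each state change across readings."""
    state = "Normal"
    for usage in readings:
        new_state = next_state(usage, threshold)
        if new_state != state:
            if new_state == "Firing":
                text = f"[FIRING] Root disk usage {usage:.1f}% > {threshold:.0f}%"
            else:
                text = f"[RESOLVED] Root disk usage back to {usage:.1f}%"
            # Body shape accepted by a Slack incoming webhook.
            yield {"text": text}
        state = new_state


# Invented readings, one per evaluation (for example, once a minute).
messages = list(notifications([40.0, 70.2, 72.5, 58.0]))
for message in messages:
    print(message["text"])
```

Only state changes produce a message, so a disk that stays above the threshold sends a single firing notification and a single recovery message once usage drops back.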