Monitoring
Homedata runs two layers of monitoring: an uptime check that polls health endpoints every 5 minutes and sends Slack alerts on failure, and Sentry for application-level error tracking. Both are already wired — they just need environment variables to activate.
Already installed — just needs configuration
The monitor:uptime-check command and its scheduler entry are already in the codebase.
Sentry's Laravel SDK is installed and the bootstrap binding is wired.
Set the env vars below and both systems activate automatically on next deploy.
Sentry Error Tracking
Sentry captures unhandled exceptions and performance traces from both Thor (Laravel) and Loki (Django). The SDK is already installed — activation requires a single environment variable.
Create Sentry projects
Create two projects in Sentry — one for Thor (Laravel) and one for Loki (Django). Both projects will give you a DSN in the format:
```
https://PUBLIC_KEY@SENTRY_HOST/PROJECT_ID
```
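A DSN is just a URL, so it can be sanity-checked before deploying. A quick sketch using only Python's standard library (the example DSN is made up):

```python
from urllib.parse import urlparse

def split_dsn(dsn: str) -> dict:
    """Split a Sentry DSN into its public key, host, and project ID."""
    parts = urlparse(dsn)
    return {
        "public_key": parts.username,          # PUBLIC_KEY before the @
        "host": parts.hostname,                # SENTRY_HOST
        "project_id": parts.path.lstrip("/"),  # PROJECT_ID after the slash
    }

print(split_dsn("https://abc123@o0.ingest.sentry.io/42"))
# → {'public_key': 'abc123', 'host': 'o0.ingest.sentry.io', 'project_id': '42'}
```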
Set the DSN in production
Thor (Laravel) reads SENTRY_LARAVEL_DSN; Loki (Django) reads SENTRY_DSN.
Set each in its hosting platform's environment config — Laravel Forge for Thor, DO App Platform for Loki.
The sentry/sentry-laravel package (v4.21) is already installed, config/sentry.php is published, and bootstrap/app.php is wired. Set one env var and Sentry activates:

```shell
SENTRY_LARAVEL_DSN=https://your_dsn_here
```
PII scrubbing is on by default (send_default_pii is false): Sentry will not capture request bodies or user details unless you explicitly enable it.
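For Loki's side, Sentry's Django SDK is typically initialised in settings. A minimal sketch, assuming the sentry-sdk package is installed and the DSN arrives via the SENTRY_DSN env var (the sample rate is illustrative, not the project's actual setting):

```python
# settings.py (sketch)
import os

import sentry_sdk

# An empty DSN disables the SDK, mirroring the Laravel behaviour.
sentry_sdk.init(
    dsn=os.environ.get("SENTRY_DSN", ""),
    send_default_pii=False,   # keep request bodies / user details out of events
    traces_sample_rate=0.1,   # illustrative sampling rate for performance traces
)
```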
| Event type | Examples | Captured by default? |
|---|---|---|
| Unhandled exceptions | 500 errors, uncaught exceptions, fatal errors | Yes |
| Queue job failures | Failed registration jobs, email send errors | Yes |
| Slow DB queries | Performance traces for queries >100ms | Yes |
| User PII | Request bodies, emails, IP addresses | Off — scrubbed |
Uptime Check
The monitor:uptime-check command runs every 5 minutes via the Laravel scheduler. It hits /health on Thor and /api/health on Loki, then sends a Slack alert if either endpoint fails two consecutive checks.
How it works
1. Every 5 minutes the scheduler runs monitor:uptime-check (with withoutOverlapping() so long checks don't queue up).
2. Each endpoint is checked with a 10-second timeout (5-second connect timeout). A 200 { "status": "ok" } response counts as healthy; a degraded status or any non-200 counts as a failure.
3. Consecutive failures are tracked in the cache. Once failures reach MONITORING_FAILURE_THRESHOLD (default: 2), a Slack alert fires. The alert is sent once — not on every subsequent failure.
4. When the endpoint recovers, a recovery alert fires and the failure counter resets. The down state has a 24-hour cache TTL, so stale "down" flags self-clear after an outage.
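The failure-tracking logic above can be sketched in a few lines. Python for brevity — the real implementation is a Laravel artisan command, and the dict here stands in for the Laravel cache:

```python
THRESHOLD = 2   # MONITORING_FAILURE_THRESHOLD
cache = {}      # stands in for the Laravel cache (failure counts + down flags)

def record_check(service: str, healthy: bool):
    """Return 'down' or 'recovered' when an alert should fire, else None."""
    if healthy:
        was_down = cache.pop(f"{service}:down", None)
        cache.pop(f"{service}:failures", None)  # reset the failure counter
        if was_down:
            return "recovered"   # endpoint was down, now healthy again
        return None
    failures = cache.get(f"{service}:failures", 0) + 1
    cache[f"{service}:failures"] = failures
    if failures >= THRESHOLD and not cache.get(f"{service}:down"):
        cache[f"{service}:down"] = True  # real code caches this with a 24h TTL
        return "down"                    # alert once, not every 5 minutes
    return None
```

Two consecutive failures fire one "down" alert, a third failure is silent, and the next healthy check fires "recovered" and resets the counter.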
What alerts look like
```
🚨 Loki is DOWN
Endpoint: https://api.homedata.co.uk/api/health
Status: HTTP 503
Error: cURL error 28: Operation timed out
Failed checks: 2
2026-03-16 11:00:00 UTC
```

```
✅ Loki recovered
Endpoint https://api.homedata.co.uk/api/health is healthy again.
2026-03-16 11:10:00 UTC
```
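Slack incoming webhooks accept a JSON POST with a "text" field, so messages like the ones above can be assembled and sent along these lines (a sketch; the real command does this server-side, and the helper names are illustrative):

```python
import json
import urllib.request

def format_down_alert(service, endpoint, status, error, failed, ts):
    """Build the plain-text body of a DOWN alert."""
    return (f"🚨 {service} is DOWN\nEndpoint: {endpoint}\nStatus: {status}\n"
            f"Error: {error}\nFailed checks: {failed}\n{ts}")

def send_slack_alert(webhook_url: str, text: str) -> None:
    """POST a plain-text message to a Slack incoming webhook."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=10)
```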
Health endpoints
| Service | Endpoint | Healthy response |
|---|---|---|
| Thor (API gateway) | GET /health | 200 {"status":"ok"} |
| Loki (data API) | GET /api/health | 200 {"status":"ok"} |
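The "healthy response" rule in the table reduces to a small predicate (a sketch; the real check lives in the artisan command):

```python
def is_healthy(status_code: int, body: dict) -> bool:
    """Healthy only for 200 + {"status": "ok"}; degraded or non-200 fails."""
    return status_code == 200 and body.get("status") == "ok"
```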
Running manually
Trigger a check outside the scheduler — useful for testing your Slack webhook or verifying a deployment:
```shell
# Run a single uptime check
php artisan monitor:uptime-check

# Expected output (both healthy)
✓ Thor healthy (142ms)
✓ Loki healthy (213ms)
All endpoints healthy.
```
Environment Variables
All monitoring configuration is environment-driven. No code changes needed.
Sentry
| Variable | Service | Required | Description |
|---|---|---|---|
| SENTRY_LARAVEL_DSN | Thor | Yes | DSN from your Sentry project settings. Empty = Sentry disabled. |
| SENTRY_DSN | Loki | Yes | DSN for the Loki Django project. Set in DO App Platform env vars. |
Uptime Check
| Variable | Default | Description |
|---|---|---|
| MONITORING_SLACK_WEBHOOK | — | Incoming webhook URL from Slack. Without this, alerts are logged but not sent. |
| MONITORING_FAILURE_THRESHOLD | 2 | Consecutive failed checks before a Slack alert fires. Default 2 = 10 minutes of downtime before alerting. |
| MONITORING_THOR_URL | https://homedata.co.uk | Base URL for Thor. The command appends /health. |
| MONITORING_LOKI_URL | https://api.homedata.co.uk | Base URL for Loki. The command appends /api/health. |
```shell
# Sentry (Thor)
SENTRY_LARAVEL_DSN=https://your_dsn@sentry.io/project_id

# Uptime check
MONITORING_SLACK_WEBHOOK=https://hooks.slack.com/services/T.../B.../...
MONITORING_FAILURE_THRESHOLD=2

# Override URL targets if needed (optional)
# MONITORING_THOR_URL=https://homedata.co.uk
# MONITORING_LOKI_URL=https://api.homedata.co.uk
```
Flap Protection
The uptime checker is designed to avoid alert storms during intermittent issues.
The Slack alert only fires after MONITORING_FAILURE_THRESHOLD consecutive failures (default: 2). A single blip doesn't page anyone.
Once an alert fires, the "down" flag is cached for 24 hours. Subsequent failed checks won't re-alert — you get one notification, not one every 5 minutes.
When a previously-down endpoint passes a check, a recovery alert goes to Slack automatically. You always know when the incident is resolved.
GitHub Actions
An uptime workflow at .github/workflows/uptime.yml runs the same health checks on a cron schedule from GitHub's infrastructure — useful as an independent check outside your own servers.
For the schedule trigger to fire, the workflow must live on the default branch (main).
Three GitHub Actions secrets are required:
| Secret | Description |
|---|---|
| SLACK_MONITORING_WEBHOOK | Incoming webhook URL for the #monitoring channel |
| MONITORING_API_KEY | Internal API key used by the uptime check job |
| SENTRY_AUTH_TOKEN | Sentry auth token for source map uploads on deploy |
Set secrets at GitHub → Repository → Settings → Secrets and variables → Actions. Only repo admins (Louis, James) can create or update secrets.
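A minimal shape for such a workflow (a sketch only — the actual uptime.yml may differ; the curl targets and cron cadence are assumptions based on the endpoints and 5-minute interval above):

```yaml
name: uptime
on:
  schedule:
    - cron: "*/5 * * * *"   # every 5 minutes; GitHub may delay scheduled runs
  workflow_dispatch: {}      # allow manual runs for testing
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - name: Check Thor
        run: curl --fail --max-time 10 https://homedata.co.uk/health
      - name: Check Loki
        run: curl --fail --max-time 10 https://api.homedata.co.uk/api/health
      - name: Alert Slack on failure
        if: failure()
        run: |
          curl -X POST -H 'Content-Type: application/json' \
            --data '{"text":"🚨 Uptime workflow failed"}' \
            "$SLACK_WEBHOOK"
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_MONITORING_WEBHOOK }}
```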
Check API status
Real-time uptime for homedata.co.uk and api.homedata.co.uk — updated every 5 minutes.