Phase 4 of 7
Logging and Debugging
Make failures diagnosable by implementing structured JSON logs with request tracing and a queryable log aggregation service.
Overview
Move all service output from unstructured text to structured JSON with consistent fields: timestamp, level, service, requestId, and message. Add Loki and Promtail to the compose stack so logs from every service are queryable in one place. Implement log-level filtering and strip sensitive values before they reach the log stream. After this phase, you should be able to trace a single HTTP request across the API, worker, and database layers using a shared requestId.
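The field set and requestId propagation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: the phase does not fix a language, so Python's standard logging module stands in for a library such as pino or winston, and the names JsonFormatter, build_logger, and request_id_var are hypothetical.

```python
import contextvars
import json
import logging
import os
import sys
import uuid
from datetime import datetime, timezone

# Carries the current request's ID; "-" marks lines logged outside a request.
request_id_var = contextvars.ContextVar("request_id", default="-")

# LOG_LEVEL values named in this phase, mapped onto the stdlib levels.
LEVELS = {"debug": logging.DEBUG, "info": logging.INFO,
          "warn": logging.WARNING, "error": logging.ERROR}
LEVEL_LABELS = {number: name for name, number in LEVELS.items()}


class JsonFormatter(logging.Formatter):
    """Render every record as one JSON object per line."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc).isoformat(),
            "level": LEVEL_LABELS.get(record.levelno, record.levelname.lower()),
            "service": self.service,
            "requestId": request_id_var.get(),
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def build_logger(service, stream=None):
    """Create a logger whose threshold comes from LOG_LEVEL (default: info)."""
    logger = logging.getLogger(service)
    level_name = os.environ.get("LOG_LEVEL", "info").lower()
    logger.setLevel(LEVELS.get(level_name, logging.INFO))
    handler = logging.StreamHandler(stream or sys.stdout)
    handler.setFormatter(JsonFormatter(service))
    logger.handlers = [handler]   # replace handlers so repeated calls do not duplicate output
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("api")
    # In a real service the ID would be reused from the incoming request when present.
    token = request_id_var.set(uuid.uuid4().hex)
    log.info("task list requested")
    request_id_var.reset(token)
```

Because the ID lives in a context variable rather than being passed to each log call, every line written while handling a request carries the same requestId. Passing that ID along with any job handed to the worker is what makes the cross-service trace possible.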
What to build
Deliverables
Advance only when these outputs exist in your code or compose definitions.
- Adopt structured JSON logging in the API and worker (use a library like pino or winston)
- Include requestId, service name, log level, and timestamp in every log entry
- Add Loki (loki:2.9) and Promtail to docker-compose.dev.yml with Promtail scraping all service containers
- Implement log level configuration via LOG_LEVEL environment variable (debug, info, warn, error)
- Add a log filter that removes any field whose name contains 'password', 'secret', or 'token' (case-insensitive)
- Add a debug script (scripts/debug.sh) that opens a shell, tails logs, or checks resource usage for a named service
- Document how to query logs in Loki and how to trace a request by requestId
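One way to approach the sensitive-field filter from the list above is a recursive function applied to the structured payload before it is serialized. The sketch below is illustrative only: the function name redact and the sample payload are hypothetical, and it assumes the filter matches on field names rather than on values.

```python
from typing import Any

# Substrings that mark a field name as sensitive (taken from the deliverable).
SENSITIVE_MARKERS = ("password", "secret", "token")


def redact(value: Any) -> Any:
    """Return a copy of value with sensitive-looking keys removed at every depth."""
    if isinstance(value, dict):
        return {
            key: redact(item)
            for key, item in value.items()
            if not any(marker in str(key).lower() for marker in SENSITIVE_MARKERS)
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value  # scalars pass through unchanged


if __name__ == "__main__":
    payload = {
        "user": "ada",
        "password": "hunter2",
        "auth": {"accessToken": "abc", "scope": "read"},
    }
    print(redact(payload))
```

A name-based filter like this does not catch secrets interpolated into the message string itself, so it works best when sensitive values are only ever logged as separate fields.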
Done when
Success criteria
These are acceptance indicators, not a checklist to start from.
- Every API log line is valid JSON with timestamp, level, service, requestId, and message fields
- Loki is running and Promtail is successfully shipping logs from all containers
- You can query logs by service name and log level in the Loki HTTP API or UI
- A requestId set on an incoming API request appears in all related log entries for that request
- Running docker compose logs api | grep -i password returns no results
- Log retention is configured for at least 30 days in the Loki config
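The first criterion above can be checked mechanically rather than by eye. The following is a small sketch, assuming log lines are available as plain strings with no container-name prefix; find_invalid_lines is a hypothetical helper, not part of any library.

```python
import json
from typing import Iterable, List, Tuple

# Fields every log entry is expected to carry (from the success criteria).
REQUIRED_FIELDS = ("timestamp", "level", "service", "requestId", "message")


def find_invalid_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, problem) pairs for lines that fail the criteria."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        if not isinstance(entry, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in entry]
        if missing:
            problems.append((number, "missing: " + ", ".join(missing)))
    return problems
```

To run it against real output, pass sys.stdin to find_invalid_lines and pipe in docker compose logs --no-log-prefix api; the --no-log-prefix flag keeps Compose from prepending the container name, which would otherwise make every line fail JSON parsing.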
Verification
Testing and validation
Run these in order. Confirm each result before moving to the next step.
- docker compose -f docker-compose.dev.yml up -d (confirm all services, including Loki, start)
- for i in {1..10}; do curl -s http://localhost:8000/api/tasks > /dev/null; done (generates log traffic)
- curl -s http://localhost:3100/loki/api/v1/labels (expect a JSON response listing the available label names, including 'service')
- curl -s -G http://localhost:3100/loki/api/v1/query_range --data-urlencode 'query={service="api"}' | jq -r '.data.result[0].values[0][1]' (should print a structured JSON log entry; each item in values is a [timestamp, line] pair, so index 1 selects the log line)
- Pick a requestId from a recent API log entry and search for it across all services to confirm propagation
- docker compose logs api | grep -i password (should return no output)
- Set LOG_LEVEL=debug in .env, restart the api service, and confirm debug-level entries appear in docker compose logs api
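The requestId search in the steps above can be expressed as a single LogQL query against the same query_range endpoint. The sketch below makes several assumptions: every service stream carries a 'service' label (as the labels check suggests), Loki listens on localhost:3100, and the helper names build_trace_url and extract_entries are hypothetical.

```python
import json
from typing import List, Tuple
from urllib.parse import urlencode

LOKI_URL = "http://localhost:3100"  # port taken from the verification steps


def build_trace_url(request_id: str, base_url: str = LOKI_URL, limit: int = 100) -> str:
    """Build a query_range URL that finds one requestId in every service."""
    # {service=~".+"} selects all streams that have a service label;
    # |= keeps only lines containing the quoted requestId text.
    logql = '{service=~".+"} |= ' + json.dumps(request_id)
    params = urlencode({"query": logql, "limit": limit, "direction": "forward"})
    return f"{base_url}/loki/api/v1/query_range?{params}"


def extract_entries(response_body: str) -> List[Tuple[str, str]]:
    """Return (service, log line) pairs from a query_range response, oldest first."""
    payload = json.loads(response_body)
    entries = []
    for stream in payload["data"]["result"]:
        service = stream["stream"].get("service", "unknown")
        for timestamp_ns, line in stream["values"]:
            entries.append((int(timestamp_ns), service, line))
    entries.sort()  # merge the per-stream results into one timeline
    return [(service, line) for _, service, line in entries]
```

Fetching the URL with urllib.request.urlopen and passing the decoded body to extract_entries should yield one timeline for the request. Propagation is working if that timeline contains lines from more than one service. When no start and end are given, the query covers only a recent window, so it is best run shortly after generating traffic.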