Infrastructure
Monitoring
Observability that tells you what's actually happening in production — before your users become your monitoring system.
Application Performance Monitoring
Traces, spans, and error tracking with Sentry or Datadog, so you know what's slow, what's broken, and who's affected.
Structured Logging
Log pipelines with Axiom or Loki — structured JSON logs, queryable by service, user, request ID, and anything else you need to debug at 2am.
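As a rough illustration of what "structured" means here, the sketch below emits one JSON object per log line using only Python's standard library. The field names (`service`, `user_id`, `request_id`) and the `JsonFormatter` and `build_logger` helpers are assumptions made for this example, not a fixed schema; shipping the lines to Axiom or Loki would be handled separately by whatever collector sits in front of them.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    # Hypothetical context fields; pick whatever you need to query on.
    CONTEXT_FIELDS = ("service", "user_id", "request_id")

    def format(self, record):
        entry = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        for field in self.CONTEXT_FIELDS:
            value = getattr(record, field, None)
            if value is not None:
                entry[field] = value
        return json.dumps(entry)


def build_logger(name="app"):
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info(
        "checkout failed",
        extra={"service": "billing", "user_id": "u_123", "request_id": "req_abc"},
    )
```

Because every line is a flat JSON object, filtering by `request_id` or `service` becomes a field query in the log backend rather than a text search.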
Uptime & Alerting
Uptime monitoring, health checks, and PagerDuty/Slack alerts that fire on the right things and don't cry wolf on the rest.
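One common way to keep alerts from crying wolf is to page only after several consecutive failed health checks, and to send a single resolve when the service recovers. The sketch below shows that idea in isolation; the `FlapGuard` name and the default threshold of 3 are assumptions for illustration, and the actual PagerDuty or Slack call is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlapGuard:
    """Turn a stream of health-check results into alert/resolve events."""

    threshold: int = 3      # consecutive failures required before alerting
    failures: int = 0       # current run of consecutive failures
    alerting: bool = False  # True while an alert is open

    def observe(self, healthy: bool) -> Optional[str]:
        """Record one check result; return "alert", "resolve", or None."""
        if healthy:
            self.failures = 0
            if self.alerting:
                self.alerting = False
                return "resolve"
            return None

        self.failures += 1
        # Fire once when the threshold is reached, then stay quiet
        # until the service recovers.
        if self.failures >= self.threshold and not self.alerting:
            self.alerting = True
            return "alert"
        return None


if __name__ == "__main__":
    guard = FlapGuard(threshold=3)
    for healthy in [False, True, False, False, False, False, True]:
        event = guard.observe(healthy)
        if event is not None:
            print(event)
```

A single blip followed by a healthy check resets the counter, so only a sustained outage opens an alert.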
Infrastructure Metrics
CPU, memory, disk, and network metrics from your containers and databases. Dashboards that show trends and context, not just raw numbers.
Incident Response Setup
Runbooks, on-call rotations, and post-mortem templates. The boring operational work that pays off enormously when things go wrong.
Tech I use