Logging Sucks

Logging was designed for monoliths. Once a single request fans out across 15+ services, conventional logs turn into noise — 17 lines for one successful request, 130,000+ lines per second at scale — because they optimize for writing, not querying. OpenTelemetry doesn't fix this; it's a delivery mechanism, not a mental model.

Instead of logging what your code is doing, log what happened to this request.

The proposed fix is wide events (canonical log lines): emit one comprehensive, high-cardinality event per request per service, carrying everything you'd want at query time:

Infrastructure context — service, version, region, deployment
Request details — method, path, status, duration
User/business context — subscription tier, lifetime value, cart contents
Error details — type, code, whether it's retriable
Feature flags and contextual metadata
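
To make the shape concrete, here is a minimal TypeScript sketch of one wide event that is started when a request arrives, enriched along the way, and emitted once at the end. The article lists categories, not a schema, so every field name here (lifetimeValueCents, featureFlags, and so on), the helper names startEvent and finishEvent, and the sample values are assumptions made for illustration.

```typescript
// One wide event per request per service. Field names are hypothetical.
interface WideEvent {
  // Infrastructure context
  service: string;
  version: string;
  region: string;
  deployment: string;
  // Request details
  method: string;
  path: string;
  status?: number;
  durationMs?: number;
  // User/business context
  user?: { id: string; subscriptionTier: string; lifetimeValueCents: number };
  cart?: { itemCount: number; totalCents: number };
  // Error details
  error?: { type: string; code: string; retriable: boolean };
  // Feature flags and other contextual metadata
  featureFlags: Record<string, boolean>;
}

type Infra = Pick<WideEvent, "service" | "version" | "region" | "deployment">;

// Create the event when the request arrives; handlers add fields as they learn them.
function startEvent(infra: Infra, method: string, path: string): WideEvent {
  return { ...infra, method, path, featureFlags: {} };
}

// Called exactly once, when the request finishes: returns a single JSON line.
function finishEvent(event: WideEvent, status: number, durationMs: number): string {
  event.status = status;
  event.durationMs = durationMs;
  return JSON.stringify(event);
}

// Usage: a failed checkout request, described by one line instead of many.
const checkoutEvent = startEvent(
  { service: "checkout", version: "1.4.2", region: "eu-west-1", deployment: "deploy-123" },
  "POST",
  "/checkout",
);
checkoutEvent.user = { id: "user_42", subscriptionTier: "enterprise", lifetimeValueCents: 1250000 };
checkoutEvent.cart = { itemCount: 3, totalCents: 8997 };
checkoutEvent.featureFlags["new_checkout_flow"] = true;
checkoutEvent.error = { type: "PaymentDeclined", code: "card_declined", retriable: false };

const checkoutLine = finishEvent(checkoutEvent, 402, 183);
console.log(checkoutLine);
```

The point of the design is that nothing is written until the request is done, so the single emitted line carries the full context rather than a trail of partial ones.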

Tail sampling keeps cost sane by deciding what to retain after each request completes, in priority order: 100% of errors, 100% of slow requests (above p99 latency), all VIP/enterprise users, and a 1–5% random sample of healthy requests.
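
The retention rules above can be sketched as a single decision function that runs once the request has finished. This is a rough illustration, not the article's implementation: the names keepRequest, SamplingConfig and slowThresholdMs are hypothetical, treating status >= 500 as an error is an assumption, and the p99 threshold is assumed to be computed elsewhere and passed in.

```typescript
// What we know about a request once it has completed (hypothetical shape).
interface CompletedRequest {
  status: number;
  durationMs: number;
  subscriptionTier: string;
  hasError: boolean;
}

interface SamplingConfig {
  slowThresholdMs: number;        // assumed to come from an observed p99 latency
  vipTiers: ReadonlySet<string>;  // tiers whose events are always kept
  healthySampleRate: number;      // e.g. 0.01 to 0.05 for healthy traffic
}

// Tail sampling: decide after the request completes, in priority order.
// The random source is injectable so the decision can be tested deterministically.
function keepRequest(
  req: CompletedRequest,
  config: SamplingConfig,
  random: () => number = Math.random,
): boolean {
  if (req.hasError || req.status >= 500) return true;         // keep every error
  if (req.durationMs >= config.slowThresholdMs) return true;  // keep every slow request
  if (config.vipTiers.has(req.subscriptionTier)) return true; // keep every VIP user
  return random() < config.healthySampleRate;                 // sample the rest
}

// Usage
const samplingConfig: SamplingConfig = {
  slowThresholdMs: 2000,
  vipTiers: new Set(["enterprise"]),
  healthySampleRate: 0.05,
};
const healthyRequest: CompletedRequest = {
  status: 200,
  durationMs: 80,
  subscriptionTier: "free",
  hasError: false,
};

console.log(keepRequest({ ...healthyRequest, status: 500, hasError: true }, samplingConfig)); // always kept
console.log(keepRequest(healthyRequest, samplingConfig)); // kept roughly 5% of the time
```

Setting healthySampleRate to 1 keeps every event, which is the same as not sampling at all.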

The payoff: debugging shifts from archaeological text-searching to analytics-style queries over structured business events.
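
As a rough illustration of what an analytics-style query looks like, the sketch below groups events by an arbitrary dimension and computes an error rate per group. It is an in-memory stand-in for whatever query engine actually stores the events; the EventRow shape, the errorRateBy helper, the new_checkout_flow flag, and the sample rows are all made up for the example.

```typescript
// A trimmed-down wide event, as it might come back from storage (hypothetical shape).
interface EventRow {
  path: string;
  status: number;
  subscriptionTier: string;
  featureFlags: Record<string, boolean>;
}

// Group events by any dimension and compute the share of 5xx responses per group.
function errorRateBy(rows: EventRow[], key: (row: EventRow) => string): Map<string, number> {
  const buckets = new Map<string, { errors: number; total: number }>();
  for (const row of rows) {
    const groupKey = key(row);
    const bucket = buckets.get(groupKey) ?? { errors: 0, total: 0 };
    bucket.total += 1;
    if (row.status >= 500) bucket.errors += 1;
    buckets.set(groupKey, bucket);
  }
  const rates = new Map<string, number>();
  buckets.forEach((bucket, groupKey) => rates.set(groupKey, bucket.errors / bucket.total));
  return rates;
}

// Made-up sample data: four checkout requests, two behind a feature flag.
const sampleRows: EventRow[] = [
  { path: "/checkout", status: 500, subscriptionTier: "enterprise", featureFlags: { new_checkout_flow: true } },
  { path: "/checkout", status: 200, subscriptionTier: "free", featureFlags: { new_checkout_flow: true } },
  { path: "/checkout", status: 200, subscriptionTier: "free", featureFlags: { new_checkout_flow: false } },
  { path: "/checkout", status: 200, subscriptionTier: "enterprise", featureFlags: {} },
];

// "Is the new checkout flow failing more often than the old one?"
const ratesByFlag = errorRateBy(
  sampleRows,
  (row) => `new_checkout_flow=${String(row.featureFlags["new_checkout_flow"] ?? false)}`,
);
ratesByFlag.forEach((rate, groupKey) => console.log(groupKey, rate));
```

Because the flag, tier and status live on the same event, swapping the key function is enough to ask a different question (error rate by tier, by region, by version) without touching any log statements.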

Boris is building nominal.dev, an observability platform based on this idea.

How I actually use this

My usual move: hand this article to an LLM agent and have it build me an Effect layer for logging and observability, since most of my code is Effect anyway. I specify the sampling strategy up front: on small projects I keep 100% of events (no sampling, because I want all the data), and as the project grows I start dialing in tail sampling along the lines Boris describes.
