Andrey Markin

Mark Life Ltd


Article · Infrastructure · Backend · observability · tracing

Logging Sucks

Boris Tane's argument that traditional logging is broken for distributed systems and that "wide events" — one comprehensive, high-cardinality structured event per request per service — are the fix. Log what happened to the request, not what your code is doing.

Added July 4, 2026 · Boris Tane

Logging was designed for monoliths. Once a single request fans out across 15+ services, conventional logs turn into noise — 17 lines for one successful request, 130,000+ lines per second at scale — because they optimize for writing, not querying. OpenTelemetry doesn't fix this; it's a delivery mechanism, not a mental model.

Instead of logging what your code is doing, log what happened to this request.

The proposed fix is wide events (canonical log lines): emit one comprehensive, high-cardinality event per request per service, carrying everything you'd want at query time:

  • Infrastructure context — service, version, region, deployment
  • Request details — method, path, status, duration
  • User/business context — subscription tier, lifetime value, cart contents
  • Error details — type, code, whether it's retriable
  • Feature flags and contextual metadata
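To make the shape concrete, here is a minimal TypeScript sketch of one such event, grouped by the categories above. Every field name, plus the helpers `startWideEvent` and `finishWideEvent`, is an illustrative assumption; the article does not prescribe a schema.

```typescript
// Hypothetical schema: every field name here is an illustrative assumption.
interface WideEvent {
  // Infrastructure context
  service: string;
  version: string;
  region: string;
  deploymentId: string;
  // Request details
  requestId: string;
  method: string;
  path: string;
  statusCode?: number;
  durationMs?: number;
  // User/business context
  user?: { id: string; subscriptionTier: string; lifetimeValueCents: number };
  cart?: { itemCount: number; totalCents: number };
  // Error details
  error?: { type: string; code: string; retriable: boolean };
  // Feature flags and contextual metadata
  featureFlags: Record<string, boolean>;
}

type ServiceContext = Pick<WideEvent, "service" | "version" | "region" | "deploymentId">;
type RequestStart = Pick<WideEvent, "requestId" | "method" | "path">;

// Open one event when the request arrives; handlers enrich it as they go.
function startWideEvent(ctx: ServiceContext, req: RequestStart): WideEvent {
  return { ...ctx, ...req, featureFlags: {} };
}

// Close the event and serialize it as a single JSON line for this request.
function finishWideEvent(event: WideEvent, statusCode: number, durationMs: number): string {
  const completed: WideEvent = { ...event, statusCode, durationMs };
  return JSON.stringify(completed);
}
```

The point of the two-step shape is that the handler mutates a single object over the lifetime of the request (attaching `user`, `cart`, `error`, and flags as they become known) and emits exactly one line at the end, instead of many partial lines along the way.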

Tail sampling keeps cost sane by retaining events by priority: 100% of errors, 100% of slow requests (p99+), all VIP/enterprise users, and a 1–5% random sample of healthy requests.
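The retention rules above can be sketched as a single decision function that runs after the request completes. This is a simplified illustration, not the article's implementation: the names `CompletedRequest`, `SamplingConfig`, and `shouldKeep` are hypothetical, the p99 threshold is assumed to be computed elsewhere and passed in, and treating any 5xx status as an error is an assumption.

```typescript
// Hypothetical types: names and fields are assumptions for illustration.
interface CompletedRequest {
  statusCode: number;
  durationMs: number;
  hasError: boolean;
  userTier: string;
}

interface SamplingConfig {
  slowThresholdMs: number;   // assumed to be the service's p99 latency, computed elsewhere
  priorityTiers: string[];   // e.g. VIP or enterprise tiers that are always kept
  healthySampleRate: number; // fraction of healthy requests to keep, e.g. 0.01 to 0.05
}

// Decide, once the outcome is known, whether to retain this request's event.
// `random` is injectable so the decision can be tested deterministically.
function shouldKeep(
  req: CompletedRequest,
  cfg: SamplingConfig,
  random: () => number = Math.random
): boolean {
  // Keep 100% of errors (assumption: any 5xx status counts as an error).
  if (req.hasError || req.statusCode >= 500) return true;
  // Keep 100% of slow requests at or above the p99 threshold.
  if (req.durationMs >= cfg.slowThresholdMs) return true;
  // Keep every request from priority users.
  if (cfg.priorityTiers.indexOf(req.userTier) !== -1) return true;
  // Otherwise keep a small random share of healthy traffic.
  return random() < cfg.healthySampleRate;
}
```

Because the decision happens at the tail, after status and duration are known, the interesting requests are never lost to sampling. Setting `healthySampleRate` to 1 keeps every event, which amounts to no sampling at all.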

The payoff: debugging shifts from archaeological text-searching to analytics-style queries over structured business events.
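As a rough illustration of what an analytics-style query means here, the sketch below answers "what is the error rate on a given path, broken down by subscription tier?" over a set of wide events. In practice this would more likely be a query in an observability backend; plain in-memory TypeScript is used only to show the idea. `EventRow`, `TierStats`, and `errorRateByTier` are hypothetical names, and counting 5xx statuses as errors is an assumption.

```typescript
// Hypothetical flattened view of a wide event; field names are assumptions.
interface EventRow {
  path: string;
  statusCode: number;
  durationMs: number;
  subscriptionTier: string;
}

interface TierStats {
  total: number;
  errors: number;
  errorRate: number;
}

// Group events for one path by subscription tier and compute the error rate.
function errorRateByTier(events: EventRow[], path: string): Record<string, TierStats> {
  const stats: Record<string, TierStats> = {};
  for (const e of events) {
    if (e.path !== path) continue;
    const s = stats[e.subscriptionTier] || { total: 0, errors: 0, errorRate: 0 };
    s.total += 1;
    // Assumption: a 5xx status marks the request as failed.
    if (e.statusCode >= 500) s.errors += 1;
    s.errorRate = s.errors / s.total;
    stats[e.subscriptionTier] = s;
  }
  return stats;
}
```

This kind of question is only answerable because business context such as the subscription tier sits on the same event as the status code; with conventional logs it would require joining lines across services by hand.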

Boris is building nominal.dev, an observability platform based on this idea.

How I actually use this

My usual move: send this article to an LLM agent and have it build me an Effect layer for logging and observability from it — since most of my code is Effect anyway. I specify the sampling strategy up front: on small projects I sample at 100% (no sampling — I want all the data), and as the project grows I start dialing in tail sampling along the lines Boris describes.

Related

  • It's Time To Rethink Everything: Theo Browne's CascadiaJS 2026 talk arguing that AI is a "new cloud moment". Just as the cloud removed the cost of provisioning servers, agents remove the cost of building, so the sacred rules of software (file systems, codebases, packages, git, deployment) are worth tearing down and rebuilding from first principles.
  • Infisical: Open-source security platform for developers and AI agents, covering secrets management, certificate management (PKI), and privileged access (PAM) under a single identity model with unified auditing.
  • Agent Registration with auth.md: Open protocol for autonomous agents to self-register for services without human intervention. A standardized auth.md file plus HTTP endpoints replace OAuth and sign-up forms.
  • Learn Harness Engineering: A 12-lecture curriculum on building effective harnesses that enable AI agents to complete complex tasks reliably, covering architecture, state management, session continuity, and observability patterns.