Observability · February 5, 2026 · 7 min read

Why Your Cloud Logs Are Useless (And How to Fix Them)

CloudWatch and Logs Explorer collect everything and explain nothing. Here's what to do about it.

The logging anti-patterns we see everywhere

After auditing dozens of startup logging setups, the same problems appear over and over:

1. Log-and-pray: Services log everything at INFO level with no structure. "Processing request," "Request complete," "Error occurred." No request IDs, no user context, no timing information.

2. Console.log debugging left in production: Developers add logging during debugging and never remove it. The signal-to-noise ratio is abysmal.

3. No correlation across services: Each service logs independently. When a request flows through 4 services, there's no way to connect the dots without manually matching timestamps.

4. Logs as the only observability tool: No metrics, no traces, no structured events. Everything is a log line, and querying logs is the only way to understand system behavior.

The result: logs that are expensive to store, slow to query, and useless for debugging.

Structured logging: the minimum viable fix

The first step is embarrassingly simple: log in JSON with consistent fields.

Every log line should include: a timestamp (ISO 8601), a log level, a service name, a request/trace ID, and a structured message with relevant context. Instead of logging "User order failed," log a JSON object with the user ID, order ID, error code, error message, and the service that produced it.
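A minimal sketch of what this looks like in practice, using only Python's standard library (the service name, field names, and example values here are illustrative, not a prescribed schema):

```python
import json
import logging
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line with consistent fields."""

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, timezone.utc).isoformat(),
            "level": record.levelname,
            "service": "orders",  # hypothetical service name
            "trace_id": getattr(record, "trace_id", None),
            "message": record.getMessage(),
        }
        # Merge any structured context attached via logging's `extra=` mechanism.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)

logger = logging.getLogger("orders")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Instead of logging "User order failed", emit the IDs and error code as fields.
logger.error(
    "order failed",
    extra={
        "trace_id": "abc123",
        "context": {"user_id": 42, "order_id": 981, "error_code": "PAYMENT_DECLINED"},
    },
)
```

In a real setup you'd pull the formatter from a library (structlog, python-json-logger, or your framework's equivalent) rather than hand-rolling it, but the output contract is the same: one JSON object per line, same field names everywhere.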

This alone makes your existing CloudWatch/Logs Explorer setup dramatically more useful. You can filter by service, by error code, by user — queries that are impossible with unstructured log lines.
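For example, once log lines are JSON objects with `service`, `level`, and `error_code` fields, a CloudWatch Logs Insights query like this becomes possible (field names here assume that schema; adjust to your own):

```
fields @timestamp, service, error_code, message
| filter level = "ERROR" and service = "orders"
| sort @timestamp desc
| limit 50
```

The same filter against unstructured text would require fragile regex matching, if it worked at all.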

Most logging libraries support structured logging out of the box. It's a one-day change per service, and the ROI is immediate.

From logs to traces: the paradigm shift

Structured logs are better, but they still have a fundamental limitation: they're point-in-time events. A log line tells you what happened at a single moment in a single service. It doesn't tell you the story of a request.

Traces fill this gap. A trace is a structured representation of a request's entire journey through your system. Each operation (HTTP call, database query, cache lookup) is a span with timing data, status codes, and custom attributes. Spans are nested, so you can see that the 5-second API response was caused by a 4.8-second database query, which was caused by a missing index on the users table.
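The span model itself is simple enough to sketch in a few lines. This is a toy illustration, not a real tracing SDK: each span records its timing, its parent, and custom attributes, and nesting falls out of tracking the current span in a context variable:

```python
import contextvars
import time
import uuid
from contextlib import contextmanager

# The span currently in progress; children read this to find their parent.
_current_span = contextvars.ContextVar("current_span", default=None)
finished_spans = []

@contextmanager
def span(name, **attributes):
    """Toy span: records duration, parent linkage, and custom attributes."""
    parent = _current_span.get()
    s = {
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent["span_id"] if parent else None,
        "name": name,
        "attributes": attributes,
        "start": time.monotonic(),
    }
    token = _current_span.set(s)
    try:
        yield s
    finally:
        s["duration_ms"] = (time.monotonic() - s["start"]) * 1000
        _current_span.reset(token)
        finished_spans.append(s)

# A slow request: the trace shows the DB query dominates the handler's time.
with span("GET /orders", http_status=200):
    with span("db.query", statement="SELECT * FROM orders WHERE user_id = ?"):
        time.sleep(0.05)  # stand-in for a slow, unindexed query
```

A trace backend renders exactly this parent/child timing data as a waterfall, which is how the 4.8-second query inside the 5-second response becomes visible at a glance.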

OpenTelemetry gives you this for free (or close to it). The auto-instrumentation libraries for most frameworks and database drivers create spans automatically. You add a few lines of SDK setup, point it at a trace backend, and suddenly you can see the full story of every request.
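For a Python service, for example, the zero-code path looks roughly like this (the service name and backend endpoint are placeholders):

```
pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap -a install   # detects installed frameworks, installs matching instrumentation

export OTEL_SERVICE_NAME=orders-service               # placeholder
export OTEL_EXPORTER_OTLP_ENDPOINT=http://tempo:4317  # placeholder backend address
opentelemetry-instrument python app.py
```

Equivalent auto-instrumentation agents and env-var configuration exist for Java, Node.js, Go, and most other mainstream runtimes.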

The observability stack that actually works

Here's what we recommend for startups that want useful observability without enterprise complexity:

1. Structured JSON logging everywhere, with trace IDs attached to every log line.

2. OpenTelemetry SDK in every service, with auto-instrumentation for your framework and database.

3. A trace backend — Grafana Tempo (open source, cheap to run) or Honeycomb (SaaS, great query UX).

4. Dashboards for the 3-5 metrics that actually matter: request rate, error rate, latency percentiles, and resource utilization.

5. Alerts on symptoms (error rate spike, latency increase), not causes (CPU > 80%). Let the traces tell you the cause.
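The last point in practice, as a Prometheus alerting rule. This is a sketch that assumes the common `http_requests_total` counter convention with a `status` label; substitute your own metric names:

```yaml
groups:
  - name: symptoms
    rules:
      - alert: HighErrorRate
        # Alert when more than 2% of requests fail over 5 minutes,
        # a symptom users actually feel. No CPU thresholds here.
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.02
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "Error rate above 2%. Check recent traces for the cause."
```

Notice the alert names the symptom; the investigation that follows happens in the trace backend, not in the alert definition.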

This stack costs a fraction of what most teams spend on logging alone, and it actually helps you debug production issues instead of just recording them for posterity.

We set up observability stacks that turn your cloud logs from noise into signal. Most implementations take under two weeks.

Book a call