The Three Pillars of Observability
One-Liner
The three core types of telemetry data that provide a comprehensive view of a system’s health: Metrics, Logs, and Traces.
What It Is
- Metrics: Numeric measurements sampled or aggregated over time (e.g., CPU usage, request latency). Because each series stores numbers rather than per-event detail, metrics are cheap to store and well suited to dashboards and alerting.
- Logs: Immutable, timestamped records of discrete events. They provide detailed, contextual information about what occurred at a specific point in time.
- Traces: A representation of a single request as it flows through a distributed system. Traces are made up of spans, which represent individual operations.
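As a rough illustration of how the three data shapes differ, the sketch below models each one as a small Python dataclass. The class and field names (MetricPoint, LogRecord, Span, trace_id, parent_id) are hypothetical and chosen for this example; real telemetry libraries define their own schemas.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MetricPoint:
    """One numeric sample of a named measurement at a point in time."""
    name: str                      # e.g. "http_request_latency_ms"
    value: float
    timestamp: float               # seconds since epoch
    tags: dict = field(default_factory=dict)


@dataclass(frozen=True)            # frozen mirrors "immutable" in the definition
class LogRecord:
    """A timestamped record of one discrete event."""
    timestamp: float
    level: str
    message: str
    trace_id: Optional[str] = None  # optional link back to a trace


@dataclass
class Span:
    """One operation inside a trace; spans sharing a trace_id form the trace."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]       # None marks the root span
    operation: str
    start: float                   # seconds, relative to request start
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


# A trace is simply the set of spans that share one trace_id (made-up values).
trace = [
    Span("t1", "s1", None, "GET /checkout", 0.000, 0.250),
    Span("t1", "s2", "s1", "db.query", 0.010, 0.210),
]
root = next(s for s in trace if s.parent_id is None)
```

The parent_id field is what lets a viewer rebuild the request as a tree: here the database query is a child of the root HTTP span.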
Why It Exists
To provide a framework for understanding and debugging complex systems. Each pillar provides a different perspective, and together they give a more complete picture than any one pillar alone.
How It Works
- Metrics tell you what is happening.
- Logs tell you why it happened, by recording the details of a specific event.
- Traces tell you where in the system the problem is.
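The what / where / why split above can be sketched as a small triage flow over in-memory data. Everything here is made up for illustration: the service names, the numbers, the threshold, and the helper functions what, where, and why are assumptions, not part of any real tool.

```python
# Hypothetical telemetry for one incident.
error_rate_by_minute = {"12:00": 0.01, "12:01": 0.01, "12:02": 0.22}

spans = [  # spans of one slow request; self_ms is time spent in that operation
    {"trace_id": "t42", "operation": "api-gateway", "self_ms": 12},
    {"trace_id": "t42", "operation": "checkout-service", "self_ms": 35},
    {"trace_id": "t42", "operation": "payments-db", "self_ms": 1900},
]

logs = [
    {"trace_id": "t41", "service": "payments-db", "message": "query ok"},
    {"trace_id": "t42", "service": "payments-db",
     "message": "lock wait timeout exceeded"},
]

ALERT_THRESHOLD = 0.05  # assumed alerting threshold (5% errors)


def what(error_rates, threshold):
    """Metrics: which minutes breached the error-rate threshold?"""
    return [minute for minute, rate in error_rates.items() if rate > threshold]


def where(trace_spans):
    """Traces: which operation consumed the most time?"""
    return max(trace_spans, key=lambda s: s["self_ms"])["operation"]


def why(log_records, trace_id, service):
    """Logs: what did that service record for that specific request?"""
    return [r["message"] for r in log_records
            if r["trace_id"] == trace_id and r["service"] == service]


breached = what(error_rate_by_minute, ALERT_THRESHOLD)
slowest = where(spans)
reasons = why(logs, "t42", slowest)
```

Note that the last step only works because the log records carry the same trace_id as the spans.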
Tradeoffs
Metrics
- Pros: Cheap, efficient, good for aggregation.
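- Cons: Aggregation discards per-request context, so a metric can show that something changed but not which requests were affected or why.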
Logs
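- Pros: Rich, event-level context (error messages, parameters, stack traces).
- Cons: Expensive to store and query at high volume; unstructured text is hard to search and aggregate.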
Traces
- Pros: Great for debugging latency in distributed systems.
- Cons: Can be complex to set up, sampling may miss rare events.
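To make the sampling tradeoff concrete, here is a back-of-the-envelope sketch. It assumes uniform head-based sampling in which each trace is kept independently with the same probability; real samplers (tail-based or rule-based ones, for instance) behave differently. The function name miss_probability is hypothetical.

```python
def miss_probability(sample_rate: float, occurrences: int) -> float:
    """Probability that none of `occurrences` traces is kept.

    Assumes each trace is sampled independently with probability
    `sample_rate`, so each one is dropped with probability 1 - sample_rate.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    if occurrences < 0:
        raise ValueError("occurrences must be non-negative")
    return (1.0 - sample_rate) ** occurrences


# A bug that hits 10 requests, with 1% of traces kept:
p_miss = miss_probability(0.01, 10)   # roughly 0.90
```

Under these assumptions, a problem affecting only ten requests has about a 90% chance of leaving no trace at all at a 1% sample rate.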
Failure Modes
- Relying on only one pillar: For example, having metrics but no logs to explain why a metric has spiked.
- Uncorrelated data: Having all three pillars but no way to link them together (e.g., finding the logs for a specific trace).
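One common way to avoid uncorrelated data is to stamp every log line with the ID of the trace that was active when it was written. The sketch below does this with Python's standard logging and contextvars modules only. The names current_trace_id, TraceIdFilter, JsonFormatter, the "checkout" logger, and the trace ID value are all assumptions for the example; a real tracing library would supply the ID.

```python
import contextvars
import io
import json
import logging

# Holds the trace ID of the request currently being handled.
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)


class TraceIdFilter(logging.Filter):
    """Copy the active trace ID onto every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True  # never drop the record, only annotate it


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so it can be queried by field."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
        })


def build_logger(stream):
    logger = logging.getLogger("checkout")   # hypothetical service name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger


stream = io.StringIO()
log = build_logger(stream)

token = current_trace_id.set("4bf92f3577b34da6")   # made-up trace ID
log.info("payment declined")
current_trace_id.reset(token)

print(stream.getvalue().strip())
```

With the trace ID present as a field, finding the logs for a specific trace becomes a simple filter on trace_id rather than a text search.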
Interview Traps
- Not being able to explain the role of each pillar.
- Not understanding that they are complementary, not mutually exclusive.
Real-World Usage
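- Commercial platforms such as Datadog and New Relic offer metrics, logs, and traces side by side. Honeycomb takes a different approach, centering on wide structured events and traces rather than three separate pillars.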
Anti-Patterns
- Putting high-cardinality data (like user IDs) in metric tags. Each distinct combination of tag values becomes its own time series, so this can cause an “explosion” of time series.
- Logging unstructured text that is hard to parse and query.
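The time-series explosion is easy to estimate with a little arithmetic. The sketch below computes a worst-case upper bound, assuming every combination of tag values actually occurs; the tag names and counts are invented for illustration, and series_count is a hypothetical helper.

```python
from math import prod


def series_count(distinct_values_per_tag: dict) -> int:
    """Upper bound on time series for one metric.

    Each unique combination of tag values is a separate series, so the
    worst case is the product of the per-tag cardinalities.
    """
    return prod(distinct_values_per_tag.values())


bounded = {"endpoint": 20, "status_code": 5, "region": 4}
with_user_id = {**bounded, "user_id": 100_000}

low = series_count(bounded)         # 20 * 5 * 4 = 400
high = series_count(with_user_id)   # 400 * 100_000 = 40_000_000
```

Adding a single user_id tag multiplies the series count by the number of users, which is why per-user detail usually belongs in logs or trace attributes instead.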
Related Concepts
- The Four Golden Signals
- Structured Logging
- Distributed Tracing