The three pillars of observability — logs, metrics, and traces — exist because you cannot operate what you cannot see. When a service degrades, logs tell you what happened. Metrics tell you how the system is behaving. Traces tell you where the latency or failure occurred in a distributed call chain. Without observability, operating a cloud system at scale is guesswork.
The problem: observability works by recording what the system does. And what the system does involves processing user data. Logs capture request payloads, error messages with user identifiers, SQL queries with parameter values. Traces capture the path of a request through every service it touches, annotated with timing and contextual data. Metrics aggregate user behavior into statistical summaries that can reveal patterns about individuals or cohorts.
The industry treats this as a solvable tension — add some redaction, mask some fields, and you have “privacy-preserving observability.” The reality is more difficult. The features that make observability data useful for debugging are the same features that make it dangerous for privacy. A log entry that is detailed enough to diagnose a production issue is detailed enough to reveal user behavior. A trace that is complete enough to identify a performance bottleneck is complete enough to reconstruct a user’s session.
This tension does not have a simple resolution. But it does have architectural approaches that shift the balance — approaches that provide operational visibility without requiring the observability system to store user data.
The Exposure Surface
Logs: The Largest Privacy Liability
Application logs are the single largest source of accidental privacy exposure in cloud systems. The 2025 Elastic Observability Report surveyed 2,000 organizations and found that 89% had PII present in their production logs. Among those:
- 64% had email addresses in log entries
- 51% had physical addresses or phone numbers
- 43% had authentication tokens or session identifiers
- 28% had financial data (account numbers, transaction amounts)
- 12% had health-related information
This PII enters logs through multiple channels:
Error messages: Stack traces and exception messages often include the data that caused the error. A failed database insert might log the entire row, including customer PII. A failed API call might log the request body with user data.
Debug logging: Developers add debug-level logging during development that captures request and response payloads. This logging is supposed to be disabled in production. It frequently is not. A 2025 analysis by Datadog found that 31% of production services emit debug-level logs — a rate that has remained stubbornly constant since 2021.
Framework defaults: Many web frameworks log incoming requests by default, including query parameters and request bodies. Express.js with morgan middleware, Spring Boot’s default request logging, and Django’s debug toolbar all capture request data that may include PII.
Infrastructure logs: Load balancers log client IP addresses. DNS servers log query hostnames. CDNs log requested URLs (which may contain user identifiers in path parameters or query strings). This infrastructure-level logging is often outside application developers’ control.
Traces: The Session Reconstructor
Distributed traces correlate requests across services using trace IDs and span IDs. A single user action — loading a dashboard, submitting a form — generates a trace that touches every service in the request path.
The privacy risk: a complete trace for a user action reveals:
- Which services the user’s request touched (the data flow path)
- How long each service took (which can reveal the operation performed — a 2ms cache hit versus a 200ms database query)
- What errors occurred (which may include error details with user data)
- What downstream services were called (revealing the user’s data footprint across the system)
Correlated across multiple requests, traces can reconstruct a user’s session: what they did, in what order, how long each action took, and which services processed their data. This is a behavioral profile built from infrastructure telemetry.
Jaeger, Zipkin, and other tracing platforms store this data for days to weeks. The data is accessible to anyone with access to the tracing platform — typically the entire engineering organization.
Metrics: The Statistical Leaker
Metrics are aggregated by design, which provides some privacy protection. A counter tracking “requests per second” does not identify individual users. A histogram of response times does not reveal what any specific user experienced.
But metric cardinality creates privacy exposure. Metrics labeled with high-cardinality dimensions — user ID, session ID, customer ID, specific URL paths — can be disaggregated to individual-level data. A metric tracking “request latency by user_id” is not an aggregate; it is a per-user surveillance system with a monitoring label.
Prometheus, the most widely used metrics system, explicitly warns against high-cardinality labels. But the temptation to add user-level labels for debugging is constant, and the guard against it is human discipline — which is not an architecture.
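To make the disaggregation risk concrete, here is a minimal sketch (plain Python, not a real metrics client) of how label sets determine series cardinality. A bounded label like region stays an aggregate; a user_id label silently creates one time series per user:

```python
from collections import Counter

def observe(store, metric, labels):
    """Record one observation; each unique label set creates a new series."""
    store[(metric, tuple(sorted(labels.items())))] += 1

# Bounded labels: cardinality is limited by the label value space.
safe = Counter()
for user in range(10_000):
    observe(safe, "requests_total", {"region": "eu" if user % 2 else "us"})

# High-cardinality labels: one series per user -- per-user tracking in disguise.
unsafe = Counter()
for user in range(10_000):
    observe(unsafe, "requests_total", {"user_id": str(user)})

print(len(safe))    # 2 series
print(len(unsafe))  # 10,000 series, each identifying one user
```

The same query that answers "how is the system behaving?" against the first store answers "what did user 4217 do?" against the second.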
The Regulatory Dimension
Observability data falls under data protection regulations because it contains or can be linked to personal data.
GDPR: Log entries containing user identifiers, IP addresses, or behavioral data are personal data under GDPR. Storage requires a lawful basis (typically “legitimate interest” for operational purposes). Retention must be limited to what is necessary. Data subject access requests (DSARs) must include observability data if it contains the subject’s personal data.
The Article 29 Working Party (now the EDPB) clarified in its guidance on legitimate interest that “logging of user activities for security purposes is generally covered by legitimate interest, provided that the data collected is limited to what is necessary and the retention period is appropriate.” The qualifier — “limited to what is necessary” — is precisely the discipline that most observability implementations fail to achieve.
CCPA/CPRA: Log data that identifies or can be linked to California consumers falls under the CCPA. Consumers have the right to know what data is collected, the right to delete, and the right to opt out of sale. If observability data is shared with a third-party observability vendor (Datadog, Splunk, New Relic), that sharing may constitute a “sale” under the CCPA’s broad definition.
PCI DSS 4.0: Requires that authentication credentials are not logged. Requires that cardholder data is not logged. Requires that logs are protected from unauthorized access and modification. Most PCI-scoped organizations satisfy these requirements for their application logs but overlook the same data appearing in error messages, traces, and infrastructure logs.
Architectural Approaches
Approach 1: Log Redaction Pipelines
The most common approach. A pipeline component sits between the log source and the log store, scanning log entries for PII patterns and redacting or masking them before storage.
Tools: Fluentd/Fluent Bit plugins, Logstash filters, Vector transforms, custom middleware.
Limitations:
- Pattern-based detection misses PII that does not match predefined patterns (unusual name formats, domain-specific identifiers)
- Overly aggressive redaction makes logs useless for debugging (“Error processing [REDACTED] for [REDACTED]: [REDACTED]”)
- The redaction pipeline itself processes cleartext PII — if the pipeline is compromised, PII is exposed
- Redaction is a one-way operation applied after log generation — the application still generates and briefly holds unredacted log entries in memory
The redaction approach is necessary but insufficient. It catches the known patterns and misses the unknown ones. A 2025 audit by Bishop Fox found that log redaction pipelines caught an average of 78% of PII patterns in test datasets — leaving 22% unredacted.
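A minimal sketch of a pattern-based redaction stage makes both the mechanism and its failure mode visible. The patterns below are illustrative only; production pipelines ship far larger rule sets and still miss identifiers that match no known pattern:

```python
import re

# Illustrative patterns only -- real pipelines carry many more rules
# and still miss domain-specific identifiers.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"(?i)bearer\s+[a-z0-9._-]+"), "[TOKEN]"),
]

def redact(line: str) -> str:
    """Replace every known-pattern match before the line reaches the log store."""
    for pattern, replacement in PATTERNS:
        line = pattern.sub(replacement, line)
    return line

print(redact("login failed for alice@example.com, token Bearer abc.def"))
# -> login failed for [EMAIL], token [TOKEN]

# The failure mode: an internal customer ID matches no pattern and passes through.
print(redact("flagged customer CUST-88412"))
# -> flagged customer CUST-88412
```

Note that `redact` receives the cleartext line: the pipeline itself is a PII-processing component and must be secured accordingly.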
Approach 2: Structured Logging with Separation
Rather than logging arbitrary strings and trying to redact PII after the fact, structured logging separates contextual data (what happened, which service, what error code) from user data (who, what input, what output):
```json
{
  "timestamp": "2026-03-08T10:23:15Z",
  "service": "user-profile",
  "operation": "update_profile",
  "status": "error",
  "error_code": "VALIDATION_FAILED",
  "field": "email",
  "request_id": "abc-123",
  "correlation_id": "def-456"
}
```
This log entry describes what happened without including the user’s email address, the invalid value, or any identifying information. The request_id and correlation_id allow correlation with other log entries and traces without embedding user identity.
The privacy property: the log store contains operational data that cannot be linked to individuals without access to a separate identity resolution service. The identity resolution service can be restricted to authorized personnel and audited independently.
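A sketch of this separation, assuming a hypothetical `identity_store` standing in for the restricted identity-resolution service (the field names and helper are illustrative, not a specific library's API):

```python
import json
import secrets

# Restricted identity-resolution map: access-controlled and audited
# separately from the log store. Hypothetical stand-in for a real service.
identity_store = {}

def log_operation(service, operation, status, error_code=None, *, user_id=None):
    """Build an operational log entry that carries no user data."""
    request_id = secrets.token_hex(8)
    if user_id is not None:
        identity_store[request_id] = user_id  # only this store links id -> user
    return {
        "service": service,
        "operation": operation,
        "status": status,
        "error_code": error_code,
        "request_id": request_id,
    }

entry = log_operation("user-profile", "update_profile", "error",
                      error_code="VALIDATION_FAILED", user_id="u-9912")
print(json.dumps(entry))  # safe to ship to the log store: no user identity present
```

The log store receives only `entry`; resolving a `request_id` back to a user requires access to `identity_store`, which can be gated and audited on its own terms.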
Approach 3: Differential Privacy for Metrics
Differential privacy adds calibrated noise to metric data to prevent individual-level inference while preserving statistical utility. A metric tracking “average session duration by region” with differential privacy guarantees that the presence or absence of any single user’s data does not meaningfully change the metric value.
Google’s RAPPOR and Apple’s differential privacy implementations in iOS demonstrate that differential privacy is practical for high-volume metrics. For cloud observability, applying differential privacy to per-customer or per-user metrics prevents the disaggregation attack described above.
The trade-off: noisy metrics are less precise for debugging specific issues. Differential privacy is suited for trend analysis and alerting, not for root-cause investigation of individual requests.
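The standard construction here is the Laplace mechanism: add noise drawn from a Laplace distribution with scale sensitivity/epsilon to each released value. A stdlib-only sketch (inverse-CDF sampling; no differential-privacy library assumed):

```python
import math
import random

def dp_release(true_value, epsilon, sensitivity=1.0):
    """Laplace mechanism: release true_value + Laplace(0, sensitivity/epsilon).

    Smaller epsilon means stronger privacy and noisier output.
    """
    scale = sensitivity / epsilon
    u = random.random() - 0.5           # u in [-0.5, 0.5); log(0) is vanishingly rare
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_value + noise

# A per-region request count of 1000, released with epsilon = 1:
print(dp_release(1000, epsilon=1.0))  # e.g. 1000.7 -- close, but not exact
```

With epsilon = 1 and sensitivity 1, the noise is almost always within a few units, so a count of 1000 stays useful for trend analysis while the contribution of any single user is masked. The same noise makes the value useless for pinpointing one specific request, which is the trade-off the paragraph above describes.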
Approach 4: Trace Sampling with Privacy Bias
Rather than capturing every trace, sample traces based on operational relevance:
- Error-biased sampling: Capture 100% of error traces, 1% of success traces. Error traces are operationally valuable; success traces rarely are.
- Latency-biased sampling: Capture traces that exceed latency thresholds (p99+ traces). Normal-latency traces provide little diagnostic value.
- Privacy-biased sampling: Never capture traces for requests to privacy-sensitive endpoints. A trace through the /api/health-records/ path is a privacy risk regardless of its operational value.
Combining these strategies reduces the volume of trace data by 90-99% while retaining the traces most useful for debugging — and eliminating traces that primarily create privacy exposure.
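The three strategies combine into a single sampling decision, sketched below. The endpoint list, thresholds, and sample rate are illustrative; note that the privacy rule is checked first, so it overrides even error bias:

```python
import random

# Illustrative: endpoints whose traces are never captured.
PRIVACY_SENSITIVE_PREFIXES = ("/api/health-records/", "/api/payments/")

def keep_trace(path, status_code, latency_ms,
               p99_latency_ms=200.0, success_sample_rate=0.01):
    # Privacy-biased: never trace privacy-sensitive endpoints, no exceptions.
    if path.startswith(PRIVACY_SENSITIVE_PREFIXES):
        return False
    # Error-biased: keep every error trace.
    if status_code >= 500:
        return True
    # Latency-biased: keep traces slower than the p99 threshold.
    if latency_ms > p99_latency_ms:
        return True
    # Otherwise keep a small fraction of healthy traffic as a baseline.
    return random.random() < success_sample_rate

keep_trace("/api/health-records/123", 500, 950.0)  # False: privacy rule wins
keep_trace("/api/orders", 503, 12.0)               # True: error trace
keep_trace("/api/orders", 200, 900.0)              # True: slow trace
```

Ordering the rules this way is the design choice that matters: a health-records trace is dropped even when it errors, which is exactly the asymmetry the privacy-biased strategy demands.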
Approach 5: Zero-Knowledge Observability
The most aggressive approach, and the one aligned with zero-knowledge architecture: the observability system operates without access to user data.
Implementation:
- Applications emit only operational metrics and anonymized logs (no request payloads, no user identifiers, no PII)
- Trace IDs are cryptographic hashes that cannot be reversed to user identifiers without a client-side lookup table
- Error details are stored encrypted with a key held by a designated incident response role, not accessible to the observability platform
- The service mesh provides traffic metrics (request count, latency, error rate) without inspecting payloads
This approach works for systems where the observability system is itself untrusted — either because it is operated by a third party (Datadog, New Relic) or because the organization wants to minimize the number of systems with access to user data.
The operational cost: debugging is harder. An engineer investigating a user-reported bug cannot simply grep the logs for the user’s ID. They must use the anonymized correlation IDs, access the encrypted error details through an authorized process, and reconstruct the issue from sanitized data. This is slower. It is also more private.
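The irreversible-trace-ID property can be sketched with a keyed hash. The key below is a hypothetical client-held secret; the observability platform stores only the resulting opaque IDs:

```python
import hashlib
import hmac

# Hypothetical client-held key: never shipped to the observability platform.
CLIENT_KEY = b"client-held-secret"

def correlation_id(user_id: str) -> str:
    """Keyed hash of a user identifier.

    The platform sees an opaque, stable ID it cannot reverse -- and, because
    the hash is keyed, cannot brute-force from a list of known user IDs.
    """
    return hmac.new(CLIENT_KEY, user_id.encode(), hashlib.sha256).hexdigest()[:16]

cid = correlation_id("u-9912")
# The key holder can recompute the ID to locate a specific user's traces;
# anyone holding only the telemetry cannot go backwards.
```

An unkeyed hash would not be enough: `sha256(user_id)` can be reversed by hashing every known user ID and comparing. The HMAC key is what makes the lookup table client-side only.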
OpenTelemetry and Privacy
OpenTelemetry (OTel), the CNCF project that is becoming the standard for cloud-native telemetry, provides several privacy-relevant capabilities:
Attribute redaction: OTel collectors can filter or transform attributes before export. Sensitive attributes (user IDs, IP addresses, request bodies) can be removed or hashed in the collector pipeline.
Sampling: OTel supports head-based sampling (decide at trace start whether to capture), tail-based sampling (decide after trace completion based on outcomes), and probabilistic sampling. Tail-based sampling is particularly useful for privacy because it captures only traces that match operational criteria (errors, high latency) rather than all traces.
Multiple exporters: OTel can export different telemetry streams to different backends. Full-fidelity traces go to a restricted, short-retention store for active incident response. Redacted, aggregated metrics go to a long-retention store for trend analysis. This separation allows different access controls and retention policies for different privacy tiers of observability data.
Semantic conventions: OTel’s semantic conventions define standard attribute names and values. By adhering to conventions, organizations can apply uniform redaction rules across all services. An attribute named http.request.body can be systematically redacted; a custom attribute named payload might be missed.
The OTel specification does not mandate privacy controls — it provides the mechanisms. The implementation remains the organization’s responsibility.
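As a sketch of collector-side redaction, the core `attributes` processor can delete or hash attributes before export. The attribute keys and pipeline wiring below are illustrative, and the receiver/exporter definitions are elided:

```yaml
processors:
  attributes/redact:
    actions:
      - key: http.request.body   # drop payloads entirely
        action: delete
      - key: user.id             # keep correlation, lose identity
        action: hash
      - key: client.address
        action: delete

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [attributes/redact]
      exporters: [otlp]
```

Because the processor runs in the collector, redaction happens once, uniformly, for every service that exports through it, rather than per-application.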
Vendor Considerations
Third-party observability vendors introduce additional privacy considerations:
Data residency: Datadog processes data in the US and EU. Splunk Cloud is available in multiple regions. New Relic processes primarily in the US. If your observability data contains personal data (it almost certainly does), the vendor’s processing location is a GDPR consideration.
Data retention: Vendors retain data according to pricing tier: Datadog, for example, retains logs for 15 days (standard) to 60 days (premium). Longer retention extends the window of privacy exposure.
Sub-processing: Observability vendors use sub-processors (cloud providers, CDNs, support tools) that may have access to customer telemetry data. Sub-processor lists are typically disclosed in DPAs but change over time.
Vendor access: Vendor support engineers may access customer data for troubleshooting. The scope and controls for this access vary by vendor and are rarely negotiable for standard-tier customers.
For organizations with strict privacy requirements, self-hosted observability (Prometheus + Grafana + Loki + Tempo) eliminates vendor access and data residency concerns at the cost of operational overhead. The infrastructure-as-code approach makes self-hosted observability reproducible and maintainable, though it requires dedicated operations expertise.
The Stealth Cloud Perspective
The observability-privacy tension is real and irreducible in conventional architectures. If the server processes plaintext user data, the observability system that monitors the server will inevitably encounter that data — in logs, in error messages, in trace attributes, in metric labels. Redaction, sampling, and access controls are mitigation strategies. They reduce the exposure. They do not eliminate it.
Stealth Cloud resolves the tension by removing its source. When the server never processes plaintext user data — when PII is stripped client-side and data arrives encrypted with client-held keys — the observability system cannot capture what the server does not have. Logs contain encrypted request identifiers and operational metrics. Traces show the path of ciphertext through ephemeral infrastructure. Error messages reference encrypted payloads, not user data.
The observability data is still useful. Request counts, latency distributions, error rates, and resource utilization are all measurable without access to user data. What is lost is the ability to grep logs for a specific user’s email address or inspect a specific user’s request payload in a trace. In exchange, what is gained is the structural guarantee that the observability system — and anyone with access to it — cannot reconstruct user behavior from telemetry data.
This is not a compromise. It is a different model. Observability that monitors infrastructure health without observing user data. Visibility into system behavior without visibility into human behavior. Seeing everything that matters for operations. Knowing nothing that matters for privacy. The tension resolved not by balancing two competing goals, but by building an architecture where they do not conflict.