Sampling: Observing Enough to Understand
High-throughput systems cannot always record every trace. Sampling reduces volume while preserving meaning.
SpanForge samples at the trace level. Either all spans within a trace are recorded, or none are. This avoids partial traces, which are difficult to interpret.
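SpanForge's sampling interface is not shown here; what follows is a minimal sketch of the standard way to get the all-or-none property, with the function name and hash choice as assumptions. Making the decision a pure function of the trace ID guarantees that every span in a trace reaches the same verdict.

```python
import hashlib

def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Deterministic trace-level sampling decision.

    Hashing the trace ID, rather than rolling a die per span, means
    every span in a trace maps to the same keep-or-drop verdict, so
    traces are always complete or absent, never partial.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a float in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate
```

Because the decision depends only on the trace ID, any process that sees a span can recompute it independently, with no coordination between services.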
Consider a system processing 10,000 requests per second. With a 10% sampling rate, approximately 1,000 complete traces are recorded per second. If each trace contains 1,000 spans, throughput drops from 10 million spans per second to 1 million—an order-of-magnitude reduction.
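The same arithmetic, spelled out as a quick check:

```python
requests_per_second = 10_000
spans_per_trace = 1_000
sample_rate = 0.10

spans_before = requests_per_second * spans_per_trace   # 10,000,000 spans/s
spans_after = spans_before * sample_rate               # 1,000,000 spans/s
traces_kept = requests_per_second * sample_rate        # ~1,000 complete traces/s

print(f"{spans_before:,} -> {spans_after:,.0f} spans per second")
```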
Sampling strategies can be adapted to where the system runs. Development environments may record everything. Production systems may sample probabilistically. Critical events, such as guardrail failures, can override sampling to ensure they are always captured.
In the contract example, a hallucination detected by the guardrail is preserved regardless of the base sampling rate.
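How the override interacts with the base rate is a design question the text leaves open. One common shape is to defer the final keep-or-drop verdict until the trace ends, so a late signal can upgrade it. A sketch, reusing `keep_trace` from above, with the class and method names as assumptions:

```python
class TraceSampler:
    """Defer the final decision until a trace completes, so a late
    signal such as a guardrail failure can override the base rate."""

    def __init__(self, base_rate: float):
        self.base_rate = base_rate
        self._critical: set[str] = set()

    def mark_critical(self, trace_id: str) -> None:
        # Called by, e.g., the guardrail when it flags a hallucination.
        self._critical.add(trace_id)

    def decide(self, trace_id: str) -> bool:
        # Evaluated once, when the trace ends.
        if trace_id in self._critical:
            self._critical.remove(trace_id)
            return True
        return keep_trace(trace_id, self.base_rate)
```

Deferring the verdict means spans must be held somewhere until the trace completes; this is one reason buffering matters.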
The objective is to capture what matters, not everything.
From Data to Explanation
At this point, the system is complete.
A request produces a trace. The trace consists of spans. Each span becomes an event envelope. Envelopes are ordered, buffered, and exported. Sampling controls volume while preserving integrity.
When a failure occurs, the system can reconstruct the full sequence of events. It can show what was retrieved, how the prompt was constructed, what the model produced, and why the guardrail flagged it.
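The envelope fields and summaries below are hypothetical, but they show the shape of such a reconstruction: filter one trace's envelopes, sort by sequence number, and replay them as a narrative.

```python
def explain(envelopes: list[dict], trace_id: str) -> None:
    """Replay one trace's envelopes in order, as a readable timeline."""
    events = sorted(
        (e for e in envelopes if e["trace_id"] == trace_id),
        key=lambda e: e["seq"],
    )
    for e in events:
        print(f'{e["seq"]:>3}  {e["span"]:<14} {e["summary"]}')

# Hypothetical envelopes for the contract example.
envelopes = [
    {"trace_id": "t-42", "seq": 1, "span": "retrieval",
     "summary": "3 contract clauses retrieved"},
    {"trace_id": "t-42", "seq": 2, "span": "prompt_build",
     "summary": "clauses interpolated into the template"},
    {"trace_id": "t-42", "seq": 3, "span": "model_call",
     "summary": "completion cites a clause that was never retrieved"},
    {"trace_id": "t-42", "seq": 4, "span": "guardrail",
     "summary": "flagged: citation has no retrieved source"},
]
explain(envelopes, "t-42")
```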
SpanForge does not simply record that something happened. It records enough to explain why.
In systems where correctness cannot be assumed, the ability to explain behavior is not an enhancement.
It is the system itself.