Debugging & Visualization
spanforge 2.0 ships a debug module with three tools for inspecting traces
during development, plus production-safe sampling controls.
print_tree(spans, *, file=None)
Pretty-print a hierarchical span tree to stdout.
import spanforge
spanforge.configure(exporter="console", service_name="my-agent")
with spanforge.start_trace("research-agent") as trace:
    with trace.llm_call("gpt-4o") as span:
        span.set_token_usage(input=512, output=200, total=712)
        span.set_status("ok")
    with trace.tool_call("web_search") as span:
        span.set_status("ok")
trace.print_tree()
Example output:
— Agent Run: research-agent [1.2s]
├─ LLM Call: gpt-4o [0.8s] in=512 out=200 tokens $0.0034
└─ Tool Call: web_search [0.4s] ok
In multi-agent workflows, the tree includes nested agent runs with their rolled-up costs:
— Agent Run: coordinator [3.4s]
├─ LLM Call: gpt-4o [0.5s] in=200 out=100 $0.0015
├─ Agent Run: researcher [1.8s]
│ └─ LLM Call: claude-3-5-sonnet [1.6s] in=800 out=400 $0.0084
└─ Agent Run: writer [1.1s]
  └─ LLM Call: gpt-4o-mini [0.9s] in=1000 out=500 $0.0005
Total (with children): $0.0104
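To make the rolled-up total concrete, here is a minimal sketch of how a parent's cost can be computed from its nested children. The `Node` class and `rolled_up_cost` helper are hypothetical stand-ins written for this illustration, not spanforge types, and the library may compute the total differently.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a span in the tree (not a spanforge type)."""
    name: str
    cost_usd: float = 0.0
    children: list["Node"] = field(default_factory=list)


def rolled_up_cost(node: Node) -> float:
    """Return the node's own cost plus the cost of all its descendants."""
    return node.cost_usd + sum(rolled_up_cost(child) for child in node.children)


# Mirrors the coordinator example above; agent runs carry no direct cost here.
coordinator = Node("coordinator", children=[
    Node("gpt-4o", 0.0015),
    Node("researcher", children=[Node("claude-3-5-sonnet", 0.0084)]),
    Node("writer", children=[Node("gpt-4o-mini", 0.0005)]),
])

print(f"Total (with children): ${rolled_up_cost(coordinator):.4f}")
```

With the three LLM costs from the example tree, the sum is 0.0015 + 0.0084 + 0.0005 = 0.0104, which matches the total line in the output.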
print_tree() can also be called as a standalone function:
from spanforge.debug import print_tree
from spanforge.stream import EventStream
events = list(EventStream.from_file("events.jsonl"))
spans = [e for e in events if "span_name" in e.payload]
print_tree(spans)
Tip: Set NO_COLOR=1 in your environment to suppress ANSI colour codes (e.g. in CI pipelines or when piping to a file).
summary(spans) -> dict
Returns an aggregated statistics dictionary for a collection of spans.
from spanforge.debug import summary
stats = trace.summary()
# or:
stats = summary(spans)
print(stats)
# {
#     'trace_id': '01JP...',
#     'agent_name': 'research-agent',
#     'total_duration_ms': 1200.0,
#     'span_count': 3,
#     'llm_calls': 1,
#     'tool_calls': 1,
#     'total_input_tokens': 512,
#     'total_output_tokens': 200,
#     'total_cost_usd': 0.0034,
#     'errors': 0,
# }
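To show how the count and total fields relate to individual spans, the sketch below aggregates plain dictionaries. The record keys (`kind`, `input_tokens`, and so on) and the `aggregate` helper are assumptions made for this illustration; they are not the spanforge span schema, and the real summary() may derive its fields differently.

```python
# Hypothetical span records as plain dicts; the key names are illustrative only.
spans = [
    {"kind": "agent", "status": "ok"},
    {"kind": "llm", "status": "ok",
     "input_tokens": 512, "output_tokens": 200, "cost_usd": 0.0034},
    {"kind": "tool", "status": "ok"},
]


def aggregate(spans):
    """Count spans by kind and sum token and cost fields, defaulting to zero."""
    return {
        "span_count": len(spans),
        "llm_calls": sum(1 for s in spans if s["kind"] == "llm"),
        "tool_calls": sum(1 for s in spans if s["kind"] == "tool"),
        "total_input_tokens": sum(s.get("input_tokens", 0) for s in spans),
        "total_output_tokens": sum(s.get("output_tokens", 0) for s in spans),
        "total_cost_usd": sum(s.get("cost_usd", 0.0) for s in spans),
        "errors": sum(1 for s in spans if s["status"] in ("error", "timeout")),
    }


print(aggregate(spans))
```

Spans that carry no token or cost data (the agent run and the tool call here) contribute zero to the totals, so the figures line up with the single LLM call in the example.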
visualize(spans, output="html", *, path=None) -> str
Generate a self-contained Gantt-timeline HTML page — no external dependencies, no network calls.
from spanforge.debug import visualize
# Return as a string
html = visualize(trace.spans)
# Write directly to a file
visualize(trace.spans, path="trace.html")
The generated page shows each span as a proportional horizontal bar on a shared timeline axis. Hovering over a bar shows the span name, duration, model/tool name, token counts, and cost.
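As an illustration of what "proportional" means here, the sketch below converts span start times and durations into left-offset and width percentages on a shared axis. The `bar_geometry` helper, the dictionary keys, and the start offsets are assumptions for this example; the actual layout logic inside visualize() is not documented here.

```python
def bar_geometry(spans):
    """Return {name: (left_pct, width_pct)} relative to the whole trace."""
    t0 = min(s["start_ms"] for s in spans)
    t1 = max(s["start_ms"] + s["duration_ms"] for s in spans)
    total = (t1 - t0) or 1.0  # avoid dividing by zero for instantaneous traces
    return {
        s["name"]: (
            100.0 * (s["start_ms"] - t0) / total,
            100.0 * s["duration_ms"] / total,
        )
        for s in spans
    }


# Hypothetical timings: the tool call is assumed to start when the LLM call ends.
spans = [
    {"name": "research-agent", "start_ms": 0.0, "duration_ms": 1200.0},
    {"name": "gpt-4o", "start_ms": 0.0, "duration_ms": 800.0},
    {"name": "web_search", "start_ms": 800.0, "duration_ms": 400.0},
]

for name, (left, width) in bar_geometry(spans).items():
    print(f"{name:15s} left={left:5.1f}% width={width:5.1f}%")
```

Under these assumed timings the LLM call occupies the first two thirds of the axis and the tool call the final third.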
Sampling controls
In production you often want to emit only a fraction of traces to reduce
telemetry volume. Sampling is configured via SpanForgeConfig:
spanforge.configure(
    exporter="otlp",
    otlp_endpoint="http://otel-collector:4318",
    sample_rate=0.1,             # emit 10% of traces
    always_sample_errors=True,   # always emit error/timeout spans
)
Or via environment variable:
export SPANFORGE_SAMPLE_RATE=0.25
How sampling works
- The sampling decision is made per trace_id (deterministic SHA-256 hash), so all spans of a trace are either all emitted or all dropped; you never see a partial trace.
- always_sample_errors=True (the default) ensures that any span with status="error" or status="timeout" is always emitted regardless of sample_rate.
- Set sample_rate=1.0 (the default) to disable sampling.
- Set sample_rate=0.0 to drop all traces except errors.
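The rules above can be sketched as follows. This is a minimal illustration of hash-based deterministic sampling, not spanforge's implementation: the function names, the choice of the first 8 digest bytes, and the exact comparison are all assumptions.

```python
import hashlib


def should_sample(trace_id: str, sample_rate: float) -> bool:
    """Deterministically map a trace_id to a keep/drop decision."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it into [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


def should_emit(trace_id: str, status: str, sample_rate: float,
                always_sample_errors: bool = True) -> bool:
    """Apply the error override first, then the per-trace sampling decision."""
    if always_sample_errors and status in ("error", "timeout"):
        return True
    return should_sample(trace_id, sample_rate)


# The same trace_id always yields the same decision, so spans stay together.
print(should_sample("trace-abc", 0.1) == should_sample("trace-abc", 0.1))  # True
print(should_emit("trace-abc", "error", 0.0))  # True: errors bypass the rate
print(should_emit("trace-abc", "ok", 0.0))     # False: rate 0.0 drops the rest
```

Because the decision depends only on the trace_id and the rate, every process that sees a span from the same trace reaches the same verdict without coordination.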
Custom trace filters
Add arbitrary per-event predicates that run after the probabilistic gate:
from spanforge import configure, Event
def only_expensive_traces(event: Event) -> bool:
    cost = event.payload.get("cost_usd", 0)
    return cost > 0.01  # only emit spans that cost more than $0.01
configure(
    exporter="console",
    sample_rate=1.0,
    trace_filters=[only_expensive_traces],
)
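A predicate like this can be checked in isolation before it is registered. The sketch below calls it with stand-in objects that expose only a payload dictionary; `SimpleNamespace` is used here in place of a real Event purely as an assumption for testing, and the sample payloads are made up.

```python
from types import SimpleNamespace


def only_expensive_traces(event) -> bool:
    """Same predicate as above, without the spanforge type annotation."""
    cost = event.payload.get("cost_usd", 0)
    return cost > 0.01


# Stand-in events: anything with a .payload dict is enough for this predicate.
cheap = SimpleNamespace(payload={"cost_usd": 0.0034})
pricey = SimpleNamespace(payload={"cost_usd": 0.02})
no_cost = SimpleNamespace(payload={})

print(only_expensive_traces(cheap))    # False
print(only_expensive_traces(pricey))   # True
print(only_expensive_traces(no_cost))  # False
```

The last case is worth noting: events with no cost_usd in their payload are treated as costing 0 and rejected by the predicate, so a filter written this way would also reject spans that never carry a cost, such as tool calls.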