spanforge.migrate
Migration helpers for upgrading events from one schema version to the next, plus the Phase 9 v2 migration roadmap with structured deprecation records.
See the Migration Guide for background and strategy.
MigrationStats
@dataclass(frozen=True)
class MigrationStats:
    total: int
    migrated: int
    skipped: int
    errors: int
    warnings: list[str] = field(default_factory=list)
    output_path: str = ""
    transformed_fields: dict[str, int] = field(default_factory=dict)
Result of a bulk migration operation (returned by migrate_file()).
Attributes:
| Attribute | Type | Description |
|---|---|---|
| total | int | Total events processed. |
| migrated | int | Events that were upgraded to a new schema version. |
| skipped | int | Events already at the target version (not modified). |
| errors | int | Events that could not be parsed or migrated. |
| warnings | list[str] | Non-fatal warnings encountered during migration. |
| output_path | str | Path where the migrated events were written. |
| transformed_fields | dict[str, int] | Mapping of field names to the count of events where that field was transformed (e.g. "payload.model→model_id", "checksum.md5→sha256", "tags.value_coercion"). |
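As a rough illustration of how these counters might be consumed, the sketch below re-declares the dataclass locally (copied from the definition above so the snippet runs without SpanForge installed) and derives a success ratio. The `success_ratio` helper is hypothetical and not part of spanforge.migrate, and it assumes that `total` covers migrated, skipped and errored events.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MigrationStats:
    """Local stand-in mirroring the documented fields (not the real class)."""
    total: int
    migrated: int
    skipped: int
    errors: int
    warnings: list[str] = field(default_factory=list)
    output_path: str = ""
    transformed_fields: dict[str, int] = field(default_factory=dict)


def success_ratio(stats: MigrationStats) -> float:
    """Hypothetical helper: share of events that did not end in an error."""
    if stats.total == 0:
        return 1.0
    return (stats.migrated + stats.skipped) / stats.total


stats = MigrationStats(
    total=10,
    migrated=7,
    skipped=2,
    errors=1,
    transformed_fields={"payload.model→model_id": 7},
)
ratio = success_ratio(stats)
print(f"{ratio:.0%} of events processed without error")
```

Because the dataclass is frozen, a stats object can be passed around or logged without risk of a caller mutating the counters.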
SunsetPolicy
class SunsetPolicy(str, Enum):
    NEXT_MAJOR = "next_major"
    NEXT_MINOR = "next_minor"
    LONG_TERM = "long_term"
    UNSCHEDULED = "unscheduled"
Describes how aggressively a deprecated item will be removed.
| Value | Meaning |
|---|---|
| NEXT_MAJOR | Removed in the next major release. |
| NEXT_MINOR | Removed in the next minor release. |
| LONG_TERM | Kept for at least two more major releases. |
| UNSCHEDULED | No removal planned; deprecation is advisory only. |
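Because the enum mixes in `str`, its members behave like plain strings in comparisons and JSON output. The sketch below uses a local copy of the enum (so it runs standalone) to show the standard-library behaviour this implies; nothing here is specific to SpanForge beyond the member names.

```python
import json
from enum import Enum


class SunsetPolicy(str, Enum):
    """Local copy of the documented enum, so the snippet runs standalone."""
    NEXT_MAJOR = "next_major"
    NEXT_MINOR = "next_minor"
    LONG_TERM = "long_term"
    UNSCHEDULED = "unscheduled"


# Members compare equal to their string values because of the str mixin.
is_equal = SunsetPolicy.NEXT_MAJOR == "next_major"

# Look up a member from a stored value, e.g. one read back from JSON.
parsed = SunsetPolicy("long_term")

# The str mixin also lets json serialise members as plain strings.
encoded = json.dumps({"policy": SunsetPolicy.UNSCHEDULED})

# Unknown values raise ValueError rather than returning None.
try:
    SunsetPolicy("someday")
    rejected = False
except ValueError:
    rejected = True
```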
DeprecationRecord
@dataclass(frozen=True)
class DeprecationRecord:
    event_type: str
    since: str
    sunset: str
    sunset_policy: SunsetPolicy = SunsetPolicy.NEXT_MAJOR
    replacement: str | None = None
    migration_notes: str | None = None
    field_renames: Dict[str, str] = field(default_factory=dict)
Structured deprecation metadata for a single event type on the migration roadmap.
Attributes:
| Attribute | Type | Description |
|---|---|---|
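| event_type | str | The deprecated event type. |
| since | str | Version in which the type was marked deprecated. |
| sunset | str | Target version for removal. |
| sunset_policy | SunsetPolicy | Policy governing the removal timeline. Default: SunsetPolicy.NEXT_MAJOR. |
| replacement | str \| None | Recommended replacement event type, if any. Default: None. |
| migration_notes | str \| None | Additional migration notes, if any. Default: None. |
| field_renames | Dict[str, str] | Field renames relevant to the migration. Default: {}. |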
Methods
summary() -> str
Return a single-line summary of the deprecation record.
Example:
llm.eval.regression → llm.eval.regression_failed (since 1.0.0, sunset 2.0.0, NEXT_MAJOR)
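To make the summary format concrete, here is a minimal sketch that reproduces the example line. The class is a local stand-in built from the documented fields, and the body of `summary()` is an assumption inferred from that one example; in particular, how the real method renders a record with no replacement is not documented, so the "(no replacement)" text is purely illustrative.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class SunsetPolicy(str, Enum):
    NEXT_MAJOR = "next_major"
    NEXT_MINOR = "next_minor"
    LONG_TERM = "long_term"
    UNSCHEDULED = "unscheduled"


@dataclass(frozen=True)
class DeprecationRecord:
    """Local stand-in mirroring the documented fields (not the real class)."""
    event_type: str
    since: str
    sunset: str
    sunset_policy: SunsetPolicy = SunsetPolicy.NEXT_MAJOR
    replacement: Optional[str] = None
    migration_notes: Optional[str] = None
    field_renames: Dict[str, str] = field(default_factory=dict)

    def summary(self) -> str:
        # Assumption: a missing replacement is shown as "(no replacement)".
        target = self.replacement or "(no replacement)"
        return (
            f"{self.event_type} → {target} "
            f"(since {self.since}, sunset {self.sunset}, {self.sunset_policy.name})"
        )


record = DeprecationRecord(
    event_type="llm.eval.regression",
    since="1.0.0",
    sunset="2.0.0",
    replacement="llm.eval.regression_failed",
)
line = record.summary()
```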
Module-level functions
v2_migration_roadmap() -> List[DeprecationRecord]
Return the complete list of event types deprecated in v1.0.0 and scheduled for
removal in v2.0.0, sorted by event_type.
Each entry documents the recommended replacement, any relevant field renames,
and the SunsetPolicy governing its removal timeline.
Returns: List[DeprecationRecord] — 9 entries covering the llm.trace.*,
llm.eval.*, llm.guard.*, llm.cost.*, and llm.cache.* namespaces.
Example:
from spanforge.migrate import v2_migration_roadmap
for record in v2_migration_roadmap():
    print(record.summary())
v1_to_v2(event) -> Event | dict
def v1_to_v2(event: Event | dict) -> Event | dict
Migrate a single event from schema version 1.0 to 2.0.
Accepts either an Event instance or a plain dict (as loaded from JSONL).
Returns the same type as the input. Idempotent — events already at version
"2.0" are returned unchanged.
Changes applied:
- `schema_version` set to `"2.0"`.
- Missing `org_id` / `team_id` set to `None`.
- Payload key `model` normalised to `model_id` if present.
- `tags` initialised to `{}` if missing; all values coerced to strings.
- `checksum` re-hashed from md5 to sha256 if applicable.
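The listed changes can be sketched for the plain-dict case as follows. This is an illustrative approximation, not the library's implementation: the function name `v1_to_v2_sketch` is hypothetical, the checksum step is left out because the source does not say what data is hashed, and copying the input rather than mutating it is an assumption.

```python
import copy
from typing import Any, Dict


def v1_to_v2_sketch(event: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical dict-only approximation of the documented v1 -> v2 changes."""
    # Idempotence: events already at "2.0" come back unchanged.
    if event.get("schema_version") == "2.0":
        return event

    out = copy.deepcopy(event)  # Assumption: the caller's dict is not mutated.
    out["schema_version"] = "2.0"

    # Missing org_id / team_id are set to None; existing values are kept.
    out.setdefault("org_id", None)
    out.setdefault("team_id", None)

    # Payload key "model" is renamed to "model_id" if present.
    payload = out.get("payload")
    if isinstance(payload, dict) and "model" in payload:
        model = payload.pop("model")
        payload.setdefault("model_id", model)

    # tags defaults to {} and every value is coerced to a string.
    out["tags"] = {key: str(value) for key, value in (out.get("tags") or {}).items()}

    # The md5 -> sha256 checksum step is deliberately omitted from this sketch.
    return out


v1 = {"schema_version": "1.0", "payload": {"model": "m-1"}, "tags": {"retries": 3}}
v2 = v1_to_v2_sketch(v1)
```

Running the sketch a second time on `v2` returns the same object, which is the idempotence property described above.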
Args:
| Parameter | Type | Description |
|---|---|---|
| event | Event \| dict | A v1.0 event (Event instance or dict from JSONL). |
Returns: The migrated event (same type as input).
Raises: TypeError — if the input is neither an Event nor a dict.
Example:
from spanforge.migrate import v1_to_v2
# Migrate an Event
v2_event = v1_to_v2(v1_event)
assert v2_event.schema_version == "2.0"
# Migrate a dict from raw JSONL
v2_dict = v1_to_v2({"schema_version": "1.0", "event_type": "llm.trace.span.completed", ...})
assert v2_dict["schema_version"] == "2.0"
migrate_file(input_path, *, output=None, org_secret=None, target_version="2.0", dry_run=False) -> MigrationStats
def migrate_file(
    input_path: str | Path,
    *,
    output: str | Path | None = None,
    org_secret: str | None = None,
    target_version: str = "2.0",
    dry_run: bool = False,
) -> MigrationStats
Migrate all events in a JSONL file from v1 to v2.
Reads the file line by line, applies v1_to_v2() to each JSON object, and writes the
result to output (default: `<stem>_v2.jsonl`).
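The read, upgrade and write loop described here might look roughly like the sketch below. Both function names are hypothetical, the per-event upgrade is passed in as a callable so the snippet runs without SpanForge, and several details are assumptions rather than documented behaviour: blank lines are ignored, already-migrated events are still copied to the output, and the default output file sits next to the input.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict, Optional, Union

PathLike = Union[str, Path]


def default_output_path(input_path: PathLike) -> Path:
    """Derive <stem>_v2.jsonl next to the input file (location is an assumption)."""
    src = Path(input_path)
    return src.with_name(f"{src.stem}_v2.jsonl")


def migrate_lines(
    input_path: PathLike,
    upgrade: Callable[[Dict[str, Any]], Dict[str, Any]],
    output: Optional[PathLike] = None,
    dry_run: bool = False,
) -> Dict[str, int]:
    """Hypothetical line-by-line migration loop returning simple counters."""
    src = Path(input_path)
    dst = Path(output) if output is not None else default_output_path(src)
    counts = {"total": 0, "migrated": 0, "skipped": 0, "errors": 0}
    out_lines = []

    with src.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # Assumption: blank lines are not counted.
            counts["total"] += 1
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                counts["errors"] += 1
                continue
            if not isinstance(event, dict):
                counts["errors"] += 1
                continue
            if event.get("schema_version") == "2.0":
                counts["skipped"] += 1
            else:
                event = upgrade(event)
                counts["migrated"] += 1
            out_lines.append(json.dumps(event))

    if not dry_run:
        dst.write_text("".join(item + "\n" for item in out_lines), encoding="utf-8")
    return counts
```

Collecting the output in memory keeps the sketch short; a streaming writer would be the more natural choice for large audit files.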
Args:
| Parameter | Type | Description |
|---|---|---|
| input_path | str \| Path | Path to the source JSONL file. |
| output | str \| Path \| None | Output file path. Default: `<stem>_v2.jsonl`. |
| org_secret | str \| None | When provided, re-signs the entire migrated chain using HMAC. |
| target_version | str | Target schema version (default "2.0"). |
| dry_run | bool | When True, report stats without writing output. |
Returns: MigrationStats — summary of the operation.
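The reason the whole chain is re-signed, rather than only the changed events, is easier to see with a generic HMAC chain. The sketch below is not SpanForge's signing scheme: the field names `signature` and `prev_signature`, the canonical JSON encoding and the choice of SHA-256 are all assumptions made for illustration. It only shows that when each signature covers the previous one, rewriting any event body invalidates every later link.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, List


def sign_chain(events: List[Dict[str, Any]], org_secret: str) -> List[Dict[str, Any]]:
    """Hypothetical HMAC chain: each signature covers the previous signature."""
    key = org_secret.encode("utf-8")
    prev_sig = ""
    signed = []
    for event in events:
        # Canonical encoding so the same event always yields the same bytes.
        body = json.dumps(event, sort_keys=True, separators=(",", ":"))
        sig = hmac.new(key, (prev_sig + body).encode("utf-8"), hashlib.sha256).hexdigest()
        signed.append({**event, "signature": sig, "prev_signature": prev_sig})
        prev_sig = sig
    return signed


v1_chain = sign_chain([{"schema_version": "1.0", "n": 1}, {"schema_version": "1.0", "n": 2}], "my-key")
v2_chain = sign_chain([{"schema_version": "2.0", "n": 1}, {"schema_version": "2.0", "n": 2}], "my-key")

# Changing the first body changes its signature, which changes the second one too.
second_link_changed = v1_chain[1]["signature"] != v2_chain[1]["signature"]
```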
Example:
from spanforge.migrate import migrate_file
# Basic migration
stats = migrate_file("audit.jsonl")
print(f"Migrated {stats.migrated}/{stats.total} events → {stats.output_path}")
# Re-sign with a new key
stats = migrate_file("audit.jsonl", output="audit_v2.jsonl", org_secret="my-key")
# Preview without writing
stats = migrate_file("audit.jsonl", dry_run=True)
print(f"Would migrate {stats.migrated} events, skip {stats.skipped}")
print(f"Transformed fields: {stats.transformed_fields}")
migrate_from_langsmith(runs, *, source="langsmith-import") -> list[dict]
def migrate_from_langsmith(
    runs: list[dict[str, Any]],
    *,
    source: str = "langsmith-import",
) -> list[dict[str, Any]]
Added in: 2.0.12 (F-27)
Convert a list of LangSmith run dicts to SpanForge v2 event dicts.
Supports both the JSON array and JSONL line shapes that LangSmith produces when
you export a project. Returns a ready-to-use list of SpanForge v2 event dicts
suitable for writing as JSONL or passing to SFAuditClient.
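Since `runs` is typed as a list of dicts, the caller still has to turn an export file into that list. One possible way to handle either export shape is sketched below; `load_runs` is a hypothetical helper, not part of spanforge.migrate, and it assumes a JSON-array export starts with `[` while a JSONL export holds one object per line.

```python
import json
from typing import Any, Dict, List


def load_runs(text: str) -> List[Dict[str, Any]]:
    """Hypothetical loader: accept a JSON array or JSONL and return run dicts."""
    stripped = text.lstrip()
    if stripped.startswith("["):
        # JSON array export: the whole document is one list.
        return json.loads(stripped)
    # JSONL export: one JSON object per non-empty line.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


from_array = load_runs('[{"id": "a", "run_type": "llm"}, {"id": "b", "run_type": "tool"}]')
from_jsonl = load_runs('{"id": "a", "run_type": "llm"}\n{"id": "b", "run_type": "tool"}\n')
```

Either result can then be passed as the `runs` argument.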
Args:
| Parameter | Type | Default | Description |
|---|---|---|---|
| runs | list[dict] | (required) | List of LangSmith run dicts (from a .json or .jsonl export). |
| source | str | "langsmith-import" | source label stamped on every output event. |
Returns: list[dict] — SpanForge v2 event dicts (one per input run).
Run-type mapping:
| LangSmith run_type | SpanForge event_type |
|---|---|
| "llm" | llm.trace.span.completed |
| "tool" | llm.tool.call.completed |
| "retriever" | llm.tool.call.completed |
| "chain" | llm.chain.completed |
| (other) | llm.trace.span.completed |
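The table above amounts to a dictionary lookup with a fallback. A minimal sketch, with hypothetical constant and function names, could be:

```python
from typing import Optional

# Values copied from the run-type mapping table.
RUN_TYPE_TO_EVENT_TYPE = {
    "llm": "llm.trace.span.completed",
    "tool": "llm.tool.call.completed",
    "retriever": "llm.tool.call.completed",
    "chain": "llm.chain.completed",
}
DEFAULT_EVENT_TYPE = "llm.trace.span.completed"


def event_type_for(run_type: Optional[str]) -> str:
    """Map a LangSmith run_type to an event type, falling back for unknown values."""
    return RUN_TYPE_TO_EVENT_TYPE.get(run_type, DEFAULT_EVENT_TYPE)


mapped = [event_type_for(t) for t in ("llm", "retriever", "prompt", None)]
```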
Fields mapped:
| LangSmith field | SpanForge payload field |
|---|---|
| name | payload.span_name |
| run_type | payload.run_type |
| status | payload.status |
| prompt_tokens / completion_tokens / total_tokens | payload.token_usage |
| start_time / end_time | payload.start_time, payload.end_time |
| inputs (keys only) | payload.input_keys |
| outputs (keys only) | payload.output_keys |
| error (truncated to 500 chars) | payload.error |
| id | tags.langsmith_run_id |
| trace_id / session_id | tags.langsmith_trace_id |
| parent_run_id | tags.langsmith_parent_id |
Note: Raw `inputs` and `outputs` values are never stored; only the dict key names are recorded, to avoid persisting potentially sensitive content.
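The keys-only rule and the 500-character error limit could be sketched as below. `redact_run_payload` is a hypothetical helper that covers just these three payload fields; sorting the key names and omitting `error` when the run has none are assumptions, not documented behaviour.

```python
from typing import Any, Dict

MAX_ERROR_CHARS = 500  # Limit taken from the field-mapping table.


def redact_run_payload(run: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical sketch: keep key names only and truncate the error text."""
    payload: Dict[str, Any] = {
        # Only the key names are kept; the values are dropped entirely.
        "input_keys": sorted((run.get("inputs") or {}).keys()),
        "output_keys": sorted((run.get("outputs") or {}).keys()),
    }
    error = run.get("error")
    if error:
        payload["error"] = str(error)[:MAX_ERROR_CHARS]
    return payload


run = {
    "inputs": {"question": "secret text", "context": "more secret text"},
    "outputs": {"answer": "also secret"},
    "error": "x" * 800,
}
payload = redact_run_payload(run)
```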
Example:
import json
from spanforge.migrate import migrate_from_langsmith
# Load a LangSmith project export
with open("langsmith_export.json") as fh:
    runs = json.load(fh)
events = migrate_from_langsmith(runs, source="my-project")
# Write as SpanForge v2 JSONL, one event per line
with open("spanforge_events.jsonl", "w") as out:
    for ev in events:
        out.write(json.dumps(ev) + "\n")
print(f"Converted {len(events)} LangSmith runs → SpanForge events")