spanforge.migrate

Migration helpers for upgrading events from one schema version to the next, plus the Phase 9 v2 migration roadmap with structured deprecation records.

See the Migration Guide for background and strategy.


MigrationStats

@dataclass(frozen=True)
class MigrationStats:
    total: int
    migrated: int
    skipped: int
    errors: int
    warnings: list[str] = field(default_factory=list)
    output_path: str = ""
    transformed_fields: dict[str, int] = field(default_factory=dict)

Result of a bulk migration operation (returned by migrate_file()).

Attributes:

| Attribute | Type | Description |
| --- | --- | --- |
| total | int | Total events processed. |
| migrated | int | Events that were upgraded to a new schema version. |
| skipped | int | Events already at the target version (not modified). |
| errors | int | Events that could not be parsed or migrated. |
| warnings | list[str] | Non-fatal warnings encountered during migration. |
| output_path | str | Path where the migrated events were written. |
| transformed_fields | dict[str, int] | Mapping of field names to the count of events where that field was transformed (e.g. "payload.model→model_id", "checksum.md5→sha256", "tags.value_coercion"). |
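As a rough illustration of how these fields might be consumed, the sketch below renders a short report from a stats object. The dataclass is a local stand-in that mirrors the documented fields so the snippet runs without SpanForge installed; `describe()` and the sample numbers are hypothetical and not part of the library.

```python
from dataclasses import dataclass, field


# Local stand-in mirroring the documented fields. In real code you would use
# the MigrationStats object returned by spanforge.migrate.migrate_file().
@dataclass(frozen=True)
class MigrationStats:
    total: int
    migrated: int
    skipped: int
    errors: int
    warnings: list[str] = field(default_factory=list)
    output_path: str = ""
    transformed_fields: dict[str, int] = field(default_factory=dict)


def describe(stats: MigrationStats) -> str:
    """Hypothetical helper: render a short human-readable report."""
    lines = [
        f"{stats.migrated}/{stats.total} migrated, "
        f"{stats.skipped} skipped, {stats.errors} errors"
    ]
    # List the most frequently transformed fields first.
    for name, count in sorted(
        stats.transformed_fields.items(), key=lambda kv: kv[1], reverse=True
    ):
        lines.append(f"  {name}: {count}")
    return "\n".join(lines)


# Illustrative values only.
stats = MigrationStats(
    total=10,
    migrated=7,
    skipped=2,
    errors=1,
    transformed_fields={"tags.value_coercion": 3, "payload.model→model_id": 7},
)
report = describe(stats)
print(report)
```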

SunsetPolicy

class SunsetPolicy(str, Enum):
    NEXT_MAJOR    = "next_major"
    NEXT_MINOR    = "next_minor"
    LONG_TERM     = "long_term"
    UNSCHEDULED   = "unscheduled"

Describes how aggressively a deprecated item will be removed.

| Value | Meaning |
| --- | --- |
| NEXT_MAJOR | Removed in the next major release. |
| NEXT_MINOR | Removed in the next minor release. |
| LONG_TERM | Kept for at least two more major releases. |
| UNSCHEDULED | No removal planned; deprecation is advisory only. |
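Because SunsetPolicy mixes in str, its members behave like their string values. The sketch below shows what that means in practice; the enum is re-declared locally (copied from the definition above) so the snippet runs on its own.

```python
from enum import Enum


# Local copy of the documented enum so the snippet is self-contained.
class SunsetPolicy(str, Enum):
    NEXT_MAJOR = "next_major"
    NEXT_MINOR = "next_minor"
    LONG_TERM = "long_term"
    UNSCHEDULED = "unscheduled"


# Look up a member from its serialized value.
policy = SunsetPolicy("next_major")
assert policy is SunsetPolicy.NEXT_MAJOR

# Thanks to the str mixin, a member compares equal to its plain string value.
assert policy == "next_major"

# .value yields the plain string, for example when writing JSON.
serialized = policy.value
```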

DeprecationRecord

@dataclass(frozen=True)
class DeprecationRecord:
    event_type: str
    since: str
    sunset: str
    sunset_policy: SunsetPolicy = SunsetPolicy.NEXT_MAJOR
    replacement: str | None = None
    migration_notes: str | None = None
    field_renames: Dict[str, str] = field(default_factory=dict)

Structured deprecation metadata for a single event type on the migration roadmap.

Attributes:

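| Attribute | Type | Description |
| --- | --- | --- |
| event_type | str | The deprecated event type. |
| since | str | Version in which the type was marked deprecated. |
| sunset | str | Target version for removal. |
| sunset_policy | SunsetPolicy | Policy governing the removal timeline. Default: SunsetPolicy.NEXT_MAJOR. |
| replacement | str \| None | Recommended replacement, if any. Default: None. |
| migration_notes | str \| None | Free-form migration notes, if any. Default: None. |
| field_renames | Dict[str, str] | Field renames relevant to the migration. Default: {}. |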

Methods

summary() -> str

Return a single-line summary of the deprecation record.

Example:

llm.eval.regression → llm.eval.regression_failed (since 1.0.0, sunset 2.0.0, NEXT_MAJOR)
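To make the summary format concrete, here is a minimal sketch of a record that produces the example line above. The format string is inferred from that single example, so treat it as an assumption and not as the library's implementation; the "(no replacement)" fallback is purely hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class SunsetPolicy(str, Enum):
    NEXT_MAJOR = "next_major"
    NEXT_MINOR = "next_minor"
    LONG_TERM = "long_term"
    UNSCHEDULED = "unscheduled"


# Local stand-in that mirrors the documented dataclass fields.
@dataclass(frozen=True)
class DeprecationRecord:
    event_type: str
    since: str
    sunset: str
    sunset_policy: SunsetPolicy = SunsetPolicy.NEXT_MAJOR
    replacement: Optional[str] = None
    migration_notes: Optional[str] = None
    field_renames: Dict[str, str] = field(default_factory=dict)

    def summary(self) -> str:
        # Assumed format, inferred from the documented example line.
        target = self.replacement or "(no replacement)"  # hypothetical fallback
        return (
            f"{self.event_type} → {target} "
            f"(since {self.since}, sunset {self.sunset}, {self.sunset_policy.name})"
        )


record = DeprecationRecord(
    event_type="llm.eval.regression",
    since="1.0.0",
    sunset="2.0.0",
    replacement="llm.eval.regression_failed",
)
line = record.summary()
print(line)
```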

Module-level functions

v2_migration_roadmap() -> List[DeprecationRecord]

Return the complete list of event types deprecated in v1.0.0 and scheduled for removal in v2.0.0, sorted by event_type.

Each entry documents the recommended replacement, any relevant field renames, and the SunsetPolicy governing its removal timeline.

Returns: List[DeprecationRecord] — 9 entries covering the llm.trace.*, llm.eval.*, llm.guard.*, llm.cost.*, and llm.cache.* namespaces.

Example:

from spanforge.migrate import v2_migration_roadmap

for record in v2_migration_roadmap():
    print(record.summary())

v1_to_v2(event) -> Event | dict

def v1_to_v2(event: Event | dict) -> Event | dict

Migrate a single event from schema version 1.0 to 2.0.

Accepts either an Event instance or a plain dict (as loaded from JSONL). Returns the same type as the input. Idempotent — events already at version "2.0" are returned unchanged.

Changes applied:

  • schema_version set to "2.0".
  • Missing org_id / team_id set to None.
  • Payload key model normalised to model_id if present.
  • tags initialised to {} if missing; all values coerced to strings.
  • checksum re-hashed from md5 to sha256 if applicable.
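The changes listed above can be sketched for the plain-dict case roughly as follows. This is an illustrative stand-in, not the actual v1_to_v2() source: `upgrade_dict_sketch` is a hypothetical name, org_id / team_id are assumed to be top-level keys, copying the input is a choice made here, and the checksum step is left out because the checksum layout is not described on this page.

```python
import copy
from typing import Any


def upgrade_dict_sketch(event: dict[str, Any]) -> dict[str, Any]:
    """Illustrative stand-in for the dict branch of v1_to_v2()."""
    if event.get("schema_version") == "2.0":
        return event  # already migrated: returned unchanged (idempotent)

    out = copy.deepcopy(event)  # assumption: leave the caller's dict untouched
    out["schema_version"] = "2.0"

    # Missing org_id / team_id default to None.
    out.setdefault("org_id", None)
    out.setdefault("team_id", None)

    # Rename payload["model"] to payload["model_id"] when present.
    payload = out.get("payload")
    if isinstance(payload, dict) and "model" in payload:
        payload["model_id"] = payload.pop("model")

    # tags defaults to {} and every value is coerced to a string.
    tags = out.get("tags") or {}
    out["tags"] = {key: str(value) for key, value in tags.items()}

    # The md5 -> sha256 checksum step is intentionally omitted in this sketch.
    return out


v1 = {"schema_version": "1.0", "payload": {"model": "m-1"}, "tags": {"retries": 3}}
v2 = upgrade_dict_sketch(v1)
```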

Args:

| Parameter | Type | Description |
| --- | --- | --- |
| event | Event \| dict | A v1.0 event (Event instance or dict from JSONL). |

Returns: The migrated event (same type as input).

Raises: TypeError — if the input is neither an Event nor a dict.

Example:

from spanforge.migrate import v1_to_v2

# Migrate an Event
v2_event = v1_to_v2(v1_event)
assert v2_event.schema_version == "2.0"

# Migrate a dict from raw JSONL
v2_dict = v1_to_v2({"schema_version": "1.0", "event_type": "llm.trace.span.completed", ...})
assert v2_dict["schema_version"] == "2.0"

migrate_file(input_path, *, output=None, org_secret=None, target_version="2.0", dry_run=False) -> MigrationStats

def migrate_file(
    input_path: str | Path,
    *,
    output: str | Path | None = None,
    org_secret: str | None = None,
    target_version: str = "2.0",
    dry_run: bool = False,
) -> MigrationStats

Migrate all events in a JSONL file from v1 to v2.

Reads the file line by line, applies v1_to_v2() to each JSON object, and writes the result to output (default: <stem>_v2.jsonl, where <stem> is the input file name without its extension).

Args:

| Parameter | Type | Description |
| --- | --- | --- |
| input_path | str \| Path | Path to the source JSONL file. |
| output | str \| Path \| None | Output file path. Default: <stem>_v2.jsonl. |
| org_secret | str \| None | When provided, re-signs the entire migrated chain using HMAC. |
| target_version | str | Target schema version (default "2.0"). |
| dry_run | bool | When True, report stats without writing output. |

Returns: MigrationStats — summary of the operation.
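For readers unfamiliar with HMAC chains, the sketch below shows one generic way to sign a sequence of events so that each signature also covers the previous one. It is not SpanForge's signing scheme: the field names `signature` and `prev_signature`, the canonical JSON encoding, and both helper functions are assumptions made for illustration only.

```python
import hashlib
import hmac
import json


def _body(event: dict) -> str:
    # Canonical JSON so the same event always yields the same bytes.
    fields = {k: v for k, v in event.items() if k not in ("signature", "prev_signature")}
    return json.dumps(fields, sort_keys=True, separators=(",", ":"))


def sign_chain(events: list[dict], secret: str) -> list[dict]:
    """Hypothetical chain signer: each signature covers the previous one."""
    key = secret.encode("utf-8")
    previous = ""  # the first event has no predecessor
    signed = []
    for event in events:
        message = (previous + _body(event)).encode("utf-8")
        digest = hmac.new(key, message, hashlib.sha256).hexdigest()
        signed.append({**event, "signature": digest, "prev_signature": previous})
        previous = digest
    return signed


def verify_chain(signed: list[dict], secret: str) -> bool:
    """Recompute every signature and compare in constant time."""
    key = secret.encode("utf-8")
    previous = ""
    for event in signed:
        message = (previous + _body(event)).encode("utf-8")
        expected = hmac.new(key, message, hashlib.sha256).hexdigest()
        if event["prev_signature"] != previous:
            return False
        if not hmac.compare_digest(expected, event["signature"]):
            return False
        previous = expected
    return True


chain = sign_chain([{"id": 1}, {"id": 2}], "my-key")
```

Chaining means that editing, removing, or reordering any event invalidates every signature after it, which is why a schema migration that rewrites events would need to re-sign the whole chain.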

Example:

from spanforge.migrate import migrate_file

# Basic migration
stats = migrate_file("audit.jsonl")
print(f"Migrated {stats.migrated}/{stats.total} events → {stats.output_path}")

# Re-sign with a new key
stats = migrate_file("audit.jsonl", output="audit_v2.jsonl", org_secret="my-key")

# Preview without writing
stats = migrate_file("audit.jsonl", dry_run=True)
print(f"Would migrate {stats.migrated} events, skip {stats.skipped}")
print(f"Transformed fields: {stats.transformed_fields}")
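The line-by-line flow described for migrate_file() can be approximated with the standard library alone. The sketch below is a simplified stand-in, not the real implementation: `migrate_jsonl_sketch` is a hypothetical name, `upgrade` stands in for v1_to_v2(), and skipping blank lines and copying already-migrated events to the output are assumptions.

```python
import json
import tempfile
from pathlib import Path


def migrate_jsonl_sketch(input_path, upgrade, dry_run=False):
    """Illustrative loop; `upgrade` stands in for v1_to_v2()."""
    src = Path(input_path)
    dst = src.with_name(f"{src.stem}_v2.jsonl")  # <stem>_v2.jsonl default
    counts = {"total": 0, "migrated": 0, "skipped": 0, "errors": 0}
    out_lines = []
    with src.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # assumption: blank lines are ignored
            counts["total"] += 1
            try:
                event = json.loads(line)
                if not isinstance(event, dict):
                    raise ValueError("expected a JSON object")
            except ValueError:  # json.JSONDecodeError is a ValueError subclass
                counts["errors"] += 1
                continue
            if event.get("schema_version") == "2.0":
                counts["skipped"] += 1  # already at the target version
            else:
                event = upgrade(event)
                counts["migrated"] += 1
            out_lines.append(json.dumps(event))
    if not dry_run:
        dst.write_text("".join(f"{item}\n" for item in out_lines), encoding="utf-8")
    return counts, dst


# Tiny demonstration on a temporary file with one v1 event, one v2 event,
# and one unparseable line.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "audit.jsonl"
    source.write_text(
        '{"schema_version": "1.0"}\n{"schema_version": "2.0"}\nnot json\n',
        encoding="utf-8",
    )
    counts, dst = migrate_jsonl_sketch(source, lambda e: {**e, "schema_version": "2.0"})
    written = dst.read_text(encoding="utf-8").splitlines()
```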

migrate_from_langsmith(runs, *, source="langsmith-import") -> list[dict]

def migrate_from_langsmith(
    runs: list[dict[str, Any]],
    *,
    source: str = "langsmith-import",
) -> list[dict[str, Any]]

Added in: 2.0.12 (F-27)

Convert a list of LangSmith run dicts to SpanForge v2 event dicts.

Supports both the JSON array and JSONL line shapes that LangSmith produces when you export a project. Returns a ready-to-use list of SpanForge v2 event dicts suitable for writing as JSONL or passing to SFAuditClient.

Args:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runs | list[dict] | (required) | List of LangSmith run dicts (from a .json or .jsonl export). |
| source | str | "langsmith-import" | source label stamped on every output event. |

Returns: list[dict] — SpanForge v2 event dicts (one per input run).

Run-type mapping:

| LangSmith run_type | SpanForge event_type |
| --- | --- |
| "llm" | llm.trace.span.completed |
| "tool" | llm.tool.call.completed |
| "retriever" | llm.tool.call.completed |
| "chain" | llm.chain.completed |
| (other) | llm.trace.span.completed |
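The mapping above amounts to a dictionary lookup with a fallback. The sketch below expresses it that way; the constant and function names are hypothetical and only the mapping values come from the table.

```python
# Mapping values copied from the run-type table; names are illustrative.
RUN_TYPE_TO_EVENT_TYPE = {
    "llm": "llm.trace.span.completed",
    "tool": "llm.tool.call.completed",
    "retriever": "llm.tool.call.completed",
    "chain": "llm.chain.completed",
}
DEFAULT_EVENT_TYPE = "llm.trace.span.completed"


def event_type_for(run_type):
    """Return the SpanForge event_type for a LangSmith run_type."""
    # Unknown or missing run types fall back to the generic span event.
    return RUN_TYPE_TO_EVENT_TYPE.get(run_type, DEFAULT_EVENT_TYPE)
```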

Fields mapped:

| LangSmith field | SpanForge payload field |
| --- | --- |
| name | payload.span_name |
| run_type | payload.run_type |
| status | payload.status |
| prompt_tokens / completion_tokens / total_tokens | payload.token_usage |
| start_time / end_time | payload.start_time, payload.end_time |
| inputs (keys only) | payload.input_keys |
| outputs (keys only) | payload.output_keys |
| error (truncated to 500 chars) | payload.error |
| id | tags.langsmith_run_id |
| trace_id / session_id | tags.langsmith_trace_id |
| parent_run_id | tags.langsmith_parent_id |

Note: Raw inputs and outputs values are never stored — only the dict key names are recorded to avoid persisting potentially sensitive content.
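The keys-only rule and the 500-character error limit can be sketched as below. `redact_run` is a hypothetical helper covering only a few of the mapped fields; how the real converter orders keys or handles a missing error is not documented here, so those details are assumptions.

```python
from typing import Any

MAX_ERROR_CHARS = 500  # limit stated in the field table


def redact_run(run: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: build a payload that never stores raw values."""
    inputs = run.get("inputs") or {}
    outputs = run.get("outputs") or {}
    payload: dict[str, Any] = {
        "span_name": run.get("name"),
        # Only the key names are kept; the values are dropped.
        "input_keys": list(inputs),
        "output_keys": list(outputs),
    }
    error = run.get("error")
    if error:
        payload["error"] = str(error)[:MAX_ERROR_CHARS]
    return payload


run = {
    "name": "summarize",
    "inputs": {"question": "secret text"},
    "outputs": {"answer": "also secret"},
    "error": "x" * 600,
}
payload = redact_run(run)
```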

Example:

import json
from spanforge.migrate import migrate_from_langsmith

# Load a LangSmith project export
with open("langsmith_export.json") as fh:
    runs = json.load(fh)

events = migrate_from_langsmith(runs, source="my-project")

# Write as SpanForge v2 JSONL (json is already imported above)
with open("spanforge_events.jsonl", "w") as out:
    for ev in events:
        out.write(json.dumps(ev) + "\n")

print(f"Converted {len(events)} LangSmith runs → SpanForge events")
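The example above loads a JSON array export. If the export is in JSONL form instead, the runs have to be parsed one line at a time before they are passed in. The loader below is a hypothetical helper (not part of spanforge.migrate) that accepts either shape by checking whether the file starts with `[`.

```python
import json
import tempfile
from pathlib import Path


def load_runs(path):
    """Hypothetical loader: accept a JSON array or one JSON object per line."""
    text = Path(path).read_text(encoding="utf-8")
    if text.lstrip().startswith("["):
        return json.loads(text)  # JSON array export
    # JSONL export: parse each non-blank line separately.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Demonstrate that both shapes produce the same list of run dicts.
sample = [{"id": "r1", "run_type": "llm"}, {"id": "r2", "run_type": "tool"}]
with tempfile.TemporaryDirectory() as tmp:
    array_file = Path(tmp) / "export.json"
    array_file.write_text(json.dumps(sample), encoding="utf-8")
    lines_file = Path(tmp) / "export.jsonl"
    lines_file.write_text(
        "\n".join(json.dumps(item) for item in sample) + "\n", encoding="utf-8"
    )
    from_array = load_runs(array_file)
    from_lines = load_runs(lines_file)
```

Either result can then be passed as the runs argument of migrate_from_langsmith().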