llm.cache — Semantic Cache Events

Auto-documented module: spanforge.namespaces.cache

The llm.cache.* namespace records the outcome of semantic cache lookups, writes, and evictions (RFC-0001 §7).

Payload classes

| Class | Event type | Description |
|-------|------------|-------------|
| `CacheHitPayload` | `llm.cache.hit` | A cache lookup succeeded |
| `CacheMissPayload` | `llm.cache.miss` | A cache lookup failed |
| `CacheEvictedPayload` | `llm.cache.evicted` | A cache entry was removed |
| `CacheWrittenPayload` | `llm.cache.written` | A response was written to cache |
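
As a rough illustration of the table above, the four event type strings can be keyed to their payload class names, for example when routing incoming events. The dictionary and the `payload_class_name` helper are hypothetical names for this sketch and are not part of spanforge.

```python
# Hypothetical lookup table built from the event types listed above.
# Class names are kept as strings so the sketch runs without spanforge installed.
PAYLOAD_CLASS_BY_EVENT_TYPE = {
    "llm.cache.hit": "CacheHitPayload",
    "llm.cache.miss": "CacheMissPayload",
    "llm.cache.evicted": "CacheEvictedPayload",
    "llm.cache.written": "CacheWrittenPayload",
}


def payload_class_name(event_type: str) -> str:
    """Return the payload class name documented for an llm.cache.* event type."""
    try:
        return PAYLOAD_CLASS_BY_EVENT_TYPE[event_type]
    except KeyError:
        # Anything outside the four documented types is rejected.
        raise ValueError(f"not an llm.cache event type: {event_type!r}") from None
```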

CacheHitPayload

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `key_hash` | `str` | Yes | Opaque hash of the cache lookup key |
| `namespace` | `str` | Yes | Cache namespace (e.g. `"prompts"`, `"responses"`) |
| `similarity_score` | `float` | Yes | Semantic similarity score in [0.0, 1.0] |
| `ttl_remaining_seconds` | `int \| None` | No | Seconds until the entry expires |
| `cached_model` | `ModelInfo \| None` | No | Model that produced the cached response |
| `cost_saved` | `CostBreakdown \| None` | No | Estimated cost avoided by the cache hit |
| `tokens_saved` | `TokenUsage \| None` | No | Tokens avoided by the cache hit |
| `lookup_duration_ms` | `float \| None` | No | Cache lookup latency in milliseconds |
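
The `similarity_score` range can be checked before a payload is built. This page does not say whether `CacheHitPayload` validates the range itself, so the sketch below is a standalone guard; `check_similarity_score` is a hypothetical helper name.

```python
def check_similarity_score(score: float) -> float:
    """Return the score unchanged if it lies in [0.0, 1.0], else raise ValueError."""
    # NaN fails every comparison, so it is rejected by the same test.
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"similarity_score must be in [0.0, 1.0], got {score!r}")
    return score
```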

CacheMissPayload

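| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `key_hash` | `str` | Yes | Opaque hash of the cache lookup key |
| `namespace` | `str` | Yes | Cache namespace |
| `best_similarity_score` | `float \| None` | No | Nearest-neighbour score found (if any) |
| `similarity_threshold` | `float \| None` | No | Minimum score required for a hit |
| `lookup_duration_ms` | `float \| None` | No | Cache lookup latency in milliseconds |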
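
The relationship between `best_similarity_score` and `similarity_threshold` can be sketched as a small decision function. It assumes the threshold is inclusive (a score equal to the threshold counts as a hit), which follows from "minimum score required" but is not stated explicitly here; `classify_lookup` is a hypothetical name.

```python
from typing import Optional


def classify_lookup(best_score: Optional[float], threshold: float) -> str:
    """Return the event type string a lookup with this outcome would emit."""
    # No neighbour found at all (None) is always a miss.
    if best_score is not None and best_score >= threshold:
        return "llm.cache.hit"
    return "llm.cache.miss"
```

For instance, `classify_lookup(0.82, 0.90)` yields `"llm.cache.miss"`, and the 0.82 would be reported as `best_similarity_score` alongside the 0.90 threshold.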

CacheEvictedPayload

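| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `key_hash` | `str` | Yes | Hash of the evicted cache key |
| `namespace` | `str` | Yes | Cache namespace |
| `eviction_reason` | `str` | Yes | One of `"ttl_expired"`, `"lru_eviction"`, `"manual_invalidation"`, `"capacity_exceeded"`, `"schema_upgrade"` |
| `entry_age_seconds` | `int \| None` | No | Age of the entry at eviction time |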
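
Because `eviction_reason` is a plain `str` restricted to five values, a caller may want to reject typos before emitting the event. The following is a minimal sketch; `EVICTION_REASONS` and `check_eviction_reason` are hypothetical names, and whether spanforge performs the same check internally is not covered on this page.

```python
# The five reasons listed in the table above.
EVICTION_REASONS = frozenset(
    {
        "ttl_expired",
        "lru_eviction",
        "manual_invalidation",
        "capacity_exceeded",
        "schema_upgrade",
    }
)


def check_eviction_reason(reason: str) -> str:
    """Return the reason unchanged if it is one of the documented values."""
    if reason not in EVICTION_REASONS:
        allowed = ", ".join(sorted(EVICTION_REASONS))
        raise ValueError(f"unknown eviction_reason {reason!r}; expected one of: {allowed}")
    return reason
```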

CacheWrittenPayload

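| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `key_hash` | `str` | Yes | Hash of the written cache key |
| `namespace` | `str` | Yes | Cache namespace |
| `ttl_seconds` | `int` | Yes | TTL assigned to the cache entry |
| `model` | `ModelInfo \| None` | No | Model that produced the cached response |
| `response_token_count` | `int \| None` | No | Token count of the cached response |
| `write_duration_ms` | `float \| None` | No | Cache write latency in milliseconds |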
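
One way to obtain `write_duration_ms` is to time the write with `time.perf_counter`. The sketch below returns a plain dictionary whose keys match the fields above; `timed_write` and the in-memory `store` are hypothetical stand-ins, and passing the values to `CacheWrittenPayload` as keyword arguments is an assumption based on the example on this page.

```python
import time
from typing import Any, Callable, Dict


def timed_write(
    write: Callable[[], None],
    key_hash: str,
    namespace: str,
    ttl_seconds: int,
) -> Dict[str, Any]:
    """Run the write callable and return field values for a written event."""
    start = time.perf_counter()
    write()
    # perf_counter returns seconds; the field is documented in milliseconds.
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {
        "key_hash": key_hash,
        "namespace": namespace,
        "ttl_seconds": ttl_seconds,
        "write_duration_ms": elapsed_ms,
    }


# Usage with a dict standing in for the real cache backend.
store: Dict[str, str] = {}
fields = timed_write(
    lambda: store.update({"sha256:abc123def456": "cached response"}),
    key_hash="sha256:abc123def456",
    namespace="responses",
    ttl_seconds=3600,
)
```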

Example

from spanforge import Event, EventType
from spanforge.namespaces.cache import CacheHitPayload
from spanforge.namespaces.trace import TokenUsage

# Tokens the cache hit avoided sending to and receiving from the model.
tokens_saved = TokenUsage(input_tokens=512, output_tokens=128, total_tokens=640)

# Describe the hit; cached_model and cost_saved are optional and omitted here.
payload = CacheHitPayload(
    key_hash="sha256:abc123def456",
    namespace="responses",
    similarity_score=0.97,
    ttl_remaining_seconds=1800,
    tokens_saved=tokens_saved,
    lookup_duration_ms=2.1,
)

# Wrap the serialized payload in an event envelope.
event = Event(
    event_type=EventType.CACHE_HIT,
    source="my-app@1.0.0",
    org_id="org_01HX",
    payload=payload.to_dict(),
)
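
The `key_hash` value in the example carries a `sha256:` prefix. How the key is actually derived is not specified on this page, so the following is only one possible scheme: it hashes the namespace and prompt text together and prepends the algorithm name. `make_key_hash` is a hypothetical helper, and it emits the full 64-character digest rather than the shortened value shown above.

```python
import hashlib


def make_key_hash(namespace: str, prompt: str) -> str:
    """Return an opaque 'sha256:<hex digest>' key for a namespace and prompt."""
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from hashing the same.
    material = f"{namespace}\x00{prompt}".encode("utf-8")
    return "sha256:" + hashlib.sha256(material).hexdigest()
```

Hashing keeps the raw prompt text out of the event, which matches the description of `key_hash` as opaque.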