spanforge.redact

PII redaction framework: sensitivity levels, policy-driven field redaction, and redaction guards.

See the Redaction User Guide for full usage examples.


Sensitivity

class Sensitivity(str, Enum)

Ordered enumeration of sensitivity levels for Redactable values.

Supports ordered comparisons (<, <=, >, >=) so policies can filter by minimum sensitivity threshold.

Members:

| Member | String value | Description |
| --- | --- | --- |
| Sensitivity.LOW | "low" | Low sensitivity, e.g. environment names. |
| Sensitivity.MEDIUM | "medium" | Medium sensitivity, e.g. internal identifiers. |
| Sensitivity.HIGH | "high" | High sensitivity, e.g. internal system data. |
| Sensitivity.PII | "pii" | Personally Identifiable Information. |
| Sensitivity.PHI | "phi" | Protected Health Information (strictest). |

Example:

from spanforge.redact import Sensitivity

assert Sensitivity.PII > Sensitivity.MEDIUM
assert Sensitivity.PHI > Sensitivity.PII
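
The ordering matters because a plain str-based Enum would compare members lexicographically ("high" sorts before "low"). The following is a minimal sketch of one way such an ordered enum could be built. SketchSensitivity and _ORDER are hypothetical names used for illustration and are not taken from spanforge's source.

```python
from enum import Enum

# Ascending strictness, mirroring the member table above.
_ORDER = ("low", "medium", "high", "pii", "phi")


class SketchSensitivity(str, Enum):
    """Hypothetical stand-in for Sensitivity, for illustration only."""

    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    PII = "pii"
    PHI = "phi"

    def _rank(self) -> int:
        # Position in _ORDER: LOW=0 ... PHI=4.
        return _ORDER.index(self.value)

    # str already defines all four rich comparisons (lexicographic),
    # so each one is overridden explicitly to compare by rank instead.
    def __lt__(self, other):
        if not isinstance(other, SketchSensitivity):
            return NotImplemented
        return self._rank() < other._rank()

    def __le__(self, other):
        if not isinstance(other, SketchSensitivity):
            return NotImplemented
        return self._rank() <= other._rank()

    def __gt__(self, other):
        if not isinstance(other, SketchSensitivity):
            return NotImplemented
        return self._rank() > other._rank()

    def __ge__(self, other):
        if not isinstance(other, SketchSensitivity):
            return NotImplemented
        return self._rank() >= other._rank()


# A minimum-threshold filter, as a policy might apply it.
at_least_high = [s for s in SketchSensitivity if s >= SketchSensitivity.HIGH]
```

Overriding all four operators, rather than relying on functools.total_ordering, is deliberate here: total_ordering does not replace comparison methods inherited from str.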

Redactable

class Redactable(value: Any, sensitivity: Sensitivity, pii_types: FrozenSet[str] = frozenset())

A wrapper that marks a payload value as sensitive.

Redactable never exposes the wrapped value in __repr__ or __str__, preventing accidental logging of sensitive data.

Args:

| Parameter | Type | Description |
| --- | --- | --- |
| value | Any | The sensitive value to wrap. |
| sensitivity | Sensitivity | Sensitivity level of the wrapped value. |
| pii_types | FrozenSet[str] | Set of PII type tags (e.g. {"email", "phone"}). Defaults to an empty set. |

Example:

from spanforge.redact import Redactable, Sensitivity

field = Redactable("alice@example.com", Sensitivity.PII, frozenset({"email"}))
str(field)           # '<Redactable:pii>'
print(field.reveal()) # alice@example.com

Properties

sensitivity -> Sensitivity

The sensitivity level of the wrapped value.

pii_types -> FrozenSet[str]

Set of PII type category tags (e.g. {"email", "ssn"}).

Methods

reveal() -> Any

Return the underlying sensitive value.

⚠️ Use with care — the returned value is the raw sensitive data.
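
To illustrate why hiding the value in __repr__ and __str__ helps, here is a minimal sketch of a wrapper with the same shape. SketchRedactable is a hypothetical stand-in, not spanforge's implementation, and it takes the sensitivity as a plain string to stay self-contained.

```python
from typing import Any, FrozenSet


class SketchRedactable:
    """Hypothetical stand-in for Redactable, for illustration only."""

    __slots__ = ("_value", "_sensitivity", "_pii_types")

    def __init__(
        self,
        value: Any,
        sensitivity: str,
        pii_types: FrozenSet[str] = frozenset(),
    ) -> None:
        self._value = value
        self._sensitivity = sensitivity
        self._pii_types = frozenset(pii_types)

    @property
    def sensitivity(self) -> str:
        return self._sensitivity

    @property
    def pii_types(self) -> FrozenSet[str]:
        return self._pii_types

    def reveal(self) -> Any:
        # The only path to the raw value; callers opt in explicitly.
        return self._value

    def __repr__(self) -> str:
        # Only the sensitivity label is shown, never the wrapped value.
        return f"<Redactable:{self._sensitivity}>"

    __str__ = __repr__


field = SketchRedactable("alice@example.com", "pii", frozenset({"email"}))

# Accidental interpolation into a log line does not leak the address.
log_line = f"user contact: {field}"
```

Because string formatting goes through __str__, the raw value only appears where reveal() is called on purpose.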


RedactionResult

@dataclass(frozen=True)
class RedactionResult:
    event: Event
    redaction_count: int
    redacted_at: str
    redacted_by: str

Result returned by RedactionPolicy.apply().

Attributes:

| Attribute | Type | Description |
| --- | --- | --- |
| event | Event | The new event with sensitive fields replaced by redaction placeholders. |
| redaction_count | int | Number of payload values that were redacted. |
| redacted_at | str | UTC ISO-8601 timestamp when redaction was applied. |
| redacted_by | str | Identifier of the policy that performed the redaction. |

PIINotRedactedError

class PIINotRedactedError(count: int, context: str = "")

Raised by assert_redacted() when PII is still present in an event.

Attributes:

| Attribute | Type | Description |
| --- | --- | --- |
| count | int | Number of unredacted PII/PHI values found. |

RedactionPolicy

@dataclass
class RedactionPolicy(
    min_sensitivity: Sensitivity = Sensitivity.PII,
    redacted_by: str = "policy:default",
    replacement_template: str = "[REDACTED:{sensitivity}]",
)

Policy that drives which Redactable fields are replaced in an event.

All three fields are configurable at construction time.

Fields:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| min_sensitivity | Sensitivity | Sensitivity.PII | Minimum sensitivity level to redact. Values below this threshold are left as-is. |
| redacted_by | str | "policy:default" | Identifier embedded in RedactionResult.redacted_by. |
| replacement_template | str | "[REDACTED:{sensitivity}]" | Template for the replacement string. {sensitivity} is substituted with the sensitivity name. |

Example:

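from spanforge.redact import RedactionPolicy, Sensitivity

# `event` is an existing Event whose payload may contain Redactable values.
policy = RedactionPolicy(min_sensitivity=Sensitivity.MEDIUM)
result = policy.apply(event)
print(result.redaction_count)  # number of payload values that were replaced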

Methods

apply(event: Event) -> RedactionResult

Apply this policy to an event and return a new redacted event.

Traverses the event payload and replaces every Redactable value whose sensitivity >= min_sensitivity with the formatted replacement_template. The original event is not mutated.

Args:

| Parameter | Type | Description |
| --- | --- | --- |
| event | Event | The event to redact. |

Returns: RedactionResult
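
The traversal described above can be sketched as a recursive copy that swaps marked values for formatted placeholders. This is a simplified illustration, not spanforge's code: Marked, LEVELS, and redact_payload are hypothetical names, the payload is assumed to be a plain dict, and {sensitivity} is assumed to expand to the lowercase string value.

```python
from dataclasses import dataclass
from typing import Any, Tuple

# Ascending strictness; the index doubles as the comparison rank.
LEVELS = ("low", "medium", "high", "pii", "phi")


@dataclass(frozen=True)
class Marked:
    """Hypothetical stand-in for a Redactable value."""

    value: Any
    sensitivity: str


def redact_payload(
    payload: Any,
    min_sensitivity: str = "pii",
    template: str = "[REDACTED:{sensitivity}]",
) -> Tuple[Any, int]:
    """Return a redacted copy of payload and the number of replacements."""
    threshold = LEVELS.index(min_sensitivity)
    count = 0

    def walk(node: Any) -> Any:
        nonlocal count
        if isinstance(node, Marked):
            if LEVELS.index(node.sensitivity) >= threshold:
                count += 1
                return template.format(sensitivity=node.sensitivity)
            return node  # below the threshold: left as-is
        if isinstance(node, dict):
            return {key: walk(value) for key, value in node.items()}
        if isinstance(node, list):
            return [walk(value) for value in node]
        return node

    # New containers are built on the way down, so the input is not mutated.
    return walk(payload), count


original = {
    "env": Marked("staging", "low"),
    "user": {"email": Marked("alice@example.com", "pii")},
    "tags": ["a", Marked("patient-17", "phi")],
}
redacted, replaced = redact_payload(original)
```

With the default threshold, the "low" value survives while the "pii" and "phi" values become placeholders; lowering min_sensitivity to "low" would replace all three.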


Module-level functions

contains_pii(event: Event, *, scan_raw: bool = True) -> bool

Return True if any payload value is a Redactable with sensitivity >= Sensitivity.PII or, when scan_raw is True, if regex scanning finds raw PII in a payload string.

Args:

| Parameter | Type | Description |
| --- | --- | --- |
| event | Event | The event to inspect. |
| scan_raw | bool | When True (default), also run regex-based PII scanning on payload strings (via scan_payload()), not just check for Redactable wrappers. Pass False to check Redactable wrappers only. |

Returns: bool


assert_redacted(event: Event, context: str = "", *, scan_raw: bool = True) -> None

Raise PIINotRedactedError if the event still contains unredacted PII or PHI.

Use this as a guardrail before exporting events.

Args:

| Parameter | Type | Description |
| --- | --- | --- |
| event | Event | The event to check. |
| context | str | Optional context string embedded in the exception message. |
| scan_raw | bool | When True (default), also run regex-based PII scanning. Pass False to check Redactable wrappers only. |

Raises: PIINotRedactedError — if any Redactable PII/PHI values or raw PII patterns remain in the payload.
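
As a rough picture of the guardrail pattern, the sketch below checks a payload right before it is handed to an exporter. Everything in it is a hypothetical stand-in (Wrapped, SketchPIINotRedactedError, sketch_assert_redacted, export): it handles only a flat dict and a deliberately simple email pattern, whereas the real function works on an Event.

```python
import re
from typing import Any, Dict, List

# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class Wrapped:
    """Hypothetical marker for a value still wrapped as PII/PHI."""

    def __init__(self, value: Any) -> None:
        self._value = value


class SketchPIINotRedactedError(Exception):
    """Hypothetical stand-in for PIINotRedactedError."""

    def __init__(self, count: int, context: str = "") -> None:
        self.count = count
        suffix = f" [{context}]" if context else ""
        super().__init__(f"{count} unredacted PII/PHI value(s) remain{suffix}")


def sketch_assert_redacted(
    payload: Dict[str, Any], context: str = "", *, scan_raw: bool = True
) -> None:
    values = list(payload.values())  # flat payload only, to keep this short
    count = sum(isinstance(v, Wrapped) for v in values)
    if scan_raw:
        # Also catch PII that was never wrapped in the first place.
        count += sum(
            isinstance(v, str) and bool(EMAIL_RE.search(v)) for v in values
        )
    if count:
        raise SketchPIINotRedactedError(count, context)


def export(payload: Dict[str, Any], sink: List[Dict[str, Any]]) -> None:
    # Guard first: nothing reaches the sink unless the check passes.
    sketch_assert_redacted(payload, context="export")
    sink.append(payload)


sink: List[Dict[str, Any]] = []
export({"email": "[REDACTED:pii]"}, sink)  # passes the guard
```

Placing the check inside the export path, rather than at each call site, means a forgotten redaction step fails loudly instead of shipping data.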


Deep PII Scanning (new in v1.0.0)

PIIScanHit

@dataclass(frozen=True)
class PIIScanHit:
    pii_type: str
    path: str
    match_count: int = 1
    sensitivity: str = "medium"

A single PII detection hit from scan_payload().

Attributes:

| Attribute | Type | Description |
| --- | --- | --- |
| pii_type | str | Type of PII detected (e.g. "email", "ssn", "credit_card"). |
| path | str | Dot-separated path to the field in the payload (e.g. "user.email"). |
| match_count | int | Number of matches of this type at this path. |
| sensitivity | str | Sensitivity level: "high" for SSN/credit card, "medium" for email/phone, "low" for IP address/UK National Insurance number. |

Security: matched values are never included in the hit — only the type, path, count, and sensitivity.


PIIScanResult

@dataclass(frozen=True)
class PIIScanResult:
    hits: list[PIIScanHit]
    scanned: int

Result of a scan_payload() call.

Attributes:

| Attribute | Type | Description |
| --- | --- | --- |
| hits | list[PIIScanHit] | List of PII detections. |
| scanned | int | Number of string values scanned. |

Properties:

| Property | Type | Description |
| --- | --- | --- |
| clean | bool | True if no PII was detected. |

scan_payload(payload, *, extra_patterns=None, max_depth=10) -> PIIScanResult

def scan_payload(
    payload: dict[str, Any],
    *,
    extra_patterns: dict[str, re.Pattern[str]] | None = None,
    max_depth: int = 10,
) -> PIIScanResult

Scan a payload dictionary for PII using regex detectors.

Walks the entire payload recursively (up to max_depth), testing every string value against the built-in pattern set plus any caller-supplied patterns.

Built-in detectors: email, phone, ssn (with SSA range validation via _is_valid_ssn), credit_card (with Luhn validation), ip_address, uk_national_insurance, date_of_birth (global formats — ISO, US MDY, day-first DMY, written-month DMY/MDY — with calendar validation via _is_valid_date), address.
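
Checksum validation is what keeps a credit_card detector from flagging every 16-digit number. The standard Luhn check can be sketched as follows; luhn_valid is a hypothetical helper name, and this is not necessarily how spanforge implements its validator.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the decimal digits in `number` pass the Luhn checksum."""
    # Ignore separators such as spaces and dashes.
    digits = [int(ch) for ch in number if ch.isdecimal()]
    if len(digits) < 2:
        return False

    total = 0
    # Walk right to left, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True (a common test number)
print(luhn_valid("4111 1111 1111 1112"))  # False (last digit changed)
```

A regex match followed by a check like this trades a small amount of CPU for far fewer false positives on order numbers and other long digit strings.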

Args:

| Parameter | Type | Description |
| --- | --- | --- |
| payload | dict[str, Any] | The dictionary to scan. |
| extra_patterns | dict[str, Pattern] or None | Additional {label: compiled_regex} detectors. |
| max_depth | int | Maximum nesting depth to scan (default 10). |

Returns: PIIScanResult

Example:

from spanforge.redact import scan_payload

result = scan_payload({"email": "alice@example.com", "notes": "SSN: 123-45-6789"})
assert not result.clean
for hit in result.hits:
    print(f"{hit.pii_type} at {hit.path} (sensitivity={hit.sensitivity})")
# email at email (sensitivity=medium)
# ssn at notes (sensitivity=high)
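
To make the recursive walk, dot-separated paths, and caller-supplied patterns concrete, here is a simplified sketch. sketch_scan is hypothetical and returns plain tuples rather than PIIScanHit objects. The way list indices appear in paths ("notes.0") and the exact meaning of max_depth (how many container levels are descended) are assumptions, and the employee_id pattern is an invented example of an extra detector.

```python
import re
from typing import Any, Dict, List, Pattern, Tuple

Hit = Tuple[str, str, int]  # (label, path, match_count); never the matched text


def sketch_scan(
    payload: Dict[str, Any],
    patterns: Dict[str, Pattern[str]],
    max_depth: int = 10,
) -> Tuple[List[Hit], int]:
    hits: List[Hit] = []
    scanned = 0

    def walk(node: Any, path: str, depth: int) -> None:
        nonlocal scanned
        if isinstance(node, str):
            scanned += 1
            for label, pattern in patterns.items():
                matches = sum(1 for _ in pattern.finditer(node))
                if matches:
                    hits.append((label, path, matches))
        elif depth < max_depth:
            # Containers are only descended while under the depth limit.
            if isinstance(node, dict):
                for key, value in node.items():
                    child = f"{path}.{key}" if path else str(key)
                    walk(value, child, depth + 1)
            elif isinstance(node, (list, tuple)):
                for index, value in enumerate(node):
                    child = f"{path}.{index}" if path else str(index)
                    walk(value, child, depth + 1)

    walk(payload, "", 0)
    return hits, scanned


patterns = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    # Invented caller-supplied detector, in the spirit of extra_patterns.
    "employee_id": re.compile(r"\bEMP-\d{6}\b"),
}
payload = {
    "user": {"email": "alice@example.com"},
    "notes": ["badges EMP-123456 and EMP-654321"],
}
hits, scanned = sketch_scan(payload, patterns)
```

Recording only the label, path, and count mirrors the security note above: a scan report built this way can be logged without reintroducing the PII it found.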