What PII types are supported?

Supported types include Names, Contact, Financial, Government IDs, Medical, and Custom. You configure which types to detect and can add custom patterns to cover organization-specific data.

Can redaction be reversible?

Yes. Pseudonymization, tokenization, and encryption can be reversed under controlled conditions (for example, by whoever holds the mapping table or decryption key), which supports audit trails and data restoration when permitted. Masking is not reversible.
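
To make the difference concrete, here is a minimal sketch of reversible tokenization next to irreversible masking. It uses only the Python standard library; the TokenVault class, its method names, and the token format are hypothetical illustrations, not the skill's actual API.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault; a real deployment would use a secured, access-controlled store."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str, kind: str) -> str:
        # Reuse the existing token so the same value always maps to the same token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = f"<{kind}_{secrets.token_hex(4)}>"
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Raises KeyError for tokens the vault never issued.
        return self._token_to_value[token]


def mask(value: str) -> str:
    # Masking discards the original value, so nothing can be restored later.
    return "[REDACTED]"


vault = TokenVault()
token = vault.tokenize("jane@example.com", "EMAIL")
restored = vault.detokenize(token)   # the original value, recovered through the vault
masked = mask("jane@example.com")    # a fixed placeholder with no path back
```

Reversal is only as controlled as access to the vault, so in practice detokenization would sit behind authorization checks and be logged.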

What dependencies are required?

The skill relies on presidio-analyzer for detection, presidio-anonymizer for redaction, and spaCy for language processing, which underpins multilingual support.

pii-redaction

npx machina-cli add skill a5c-ai/babysitter/pii-redaction --openclaw

Files (1)

SKILL.md

1.3 KB

PII Redaction Skill

Capabilities

Detect personally identifiable information
Implement redaction strategies
Configure detection for various PII types
Design reversible anonymization
Implement compliance logging
Create audit trails for PII handling

Target Processes

content-moderation-safety
system-prompt-guardrails

Implementation Details

PII Types

Names: Person names, usernames
Contact: Email, phone, address
Financial: Credit cards, bank accounts
Government IDs: SSN, passport, driver's license
Medical: Health information
Custom: Organization-specific PII

Redaction Methods

Masking ([REDACTED])
Pseudonymization (fake values)
Tokenization (reversible)
Encryption
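
As a rough illustration of how the first two methods differ, the sketch below implements masking and a keyed-hash form of pseudonymization with the standard library. The function names, the PSEUDONYM_KEY constant, and the output format are assumptions made for this example. Tokenization and encryption are left out here because they need a token store or a cryptography library.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would come from a key manager, not source code.
PSEUDONYM_KEY = b"example-key-do-not-use"


def mask(value: str, kind: str) -> str:
    """Masking: replace the value with a fixed placeholder."""
    return "[REDACTED]"


def pseudonymize(value: str, kind: str) -> str:
    """Pseudonymization: derive a stable stand-in value from a keyed hash."""
    digest = hmac.new(PSEUDONYM_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{kind.lower()}_{digest[:8]}"


REDACTORS = {"mask": mask, "pseudonymize": pseudonymize}


def redact(value: str, kind: str, method: str) -> str:
    """Dispatch to the configured redaction method."""
    if method not in REDACTORS:
        raise ValueError(f"unsupported redaction method: {method}")
    return REDACTORS[method](value, kind)


alias = redact("Jane Doe", "NAME", "pseudonymize")
```

Because the same input always yields the same alias, records about one person stay linkable after redaction. Recovering the original from such an alias would require a separately stored mapping.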

Configuration Options

PII types to detect
Detection sensitivity
Redaction method
Language support
Custom patterns
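
One way these options could be grouped is a single validated configuration object. The sketch below is an assumption about shape only: the RedactionConfig class, its field names, and the 0.0 to 1.0 sensitivity scale are hypothetical and not taken from the skill.

```python
import re
from dataclasses import dataclass, field

PII_TYPES = {"Names", "Contact", "Financial", "Government IDs", "Medical", "Custom"}
REDACTION_METHODS = {"masking", "pseudonymization", "tokenization", "encryption"}


@dataclass
class RedactionConfig:
    pii_types: set = field(default_factory=lambda: set(PII_TYPES))
    sensitivity: float = 0.5  # hypothetical scale: minimum detector confidence, 0.0 to 1.0
    method: str = "masking"
    language: str = "en"
    custom_patterns: dict = field(default_factory=dict)  # pattern name -> regex source

    def __post_init__(self) -> None:
        unknown = set(self.pii_types) - PII_TYPES
        if unknown:
            raise ValueError(f"unknown PII types: {sorted(unknown)}")
        if self.method not in REDACTION_METHODS:
            raise ValueError(f"unknown redaction method: {self.method}")
        if not 0.0 <= self.sensitivity <= 1.0:
            raise ValueError("sensitivity must be between 0.0 and 1.0")
        for pattern in self.custom_patterns.values():
            re.compile(pattern)  # raises re.error if a custom pattern is malformed


config = RedactionConfig(pii_types={"Contact", "Financial"}, method="tokenization")
```

Validating at construction time means a misspelled type or method fails immediately rather than silently skipping detection.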

Best Practices

Comprehensive PII coverage
Regular pattern updates
Audit logging
Compliance alignment
Testing with diverse data

Dependencies

presidio-analyzer
presidio-anonymizer
spacy
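
A minimal sketch of how these three packages might fit together, using Presidio's AnalyzerEngine and AnonymizerEngine. The setup commands in the comments, the SKILL_TO_PRESIDIO mapping from the skill's categories to Presidio entity names, and both function names are assumptions for illustration; the source does not specify them. Medical and Custom are omitted because they would likely need custom recognizers.

```python
# Assumed setup (not stated in the source):
#   pip install presidio-analyzer presidio-anonymizer
#   python -m spacy download en_core_web_lg

# Assumed mapping from the skill's categories to Presidio's built-in entity names.
SKILL_TO_PRESIDIO = {
    "Names": ["PERSON"],
    "Contact": ["EMAIL_ADDRESS", "PHONE_NUMBER", "LOCATION"],
    "Financial": ["CREDIT_CARD", "IBAN_CODE", "US_BANK_NUMBER"],
    "Government IDs": ["US_SSN", "US_PASSPORT", "US_DRIVER_LICENSE"],
}


def presidio_entities(pii_types):
    """Flatten the selected skill categories into a list of Presidio entity names."""
    return [entity for pii_type in pii_types for entity in SKILL_TO_PRESIDIO[pii_type]]


def redact_with_presidio(text, pii_types, language="en"):
    """Detect the selected entities and replace each one with a fixed placeholder."""
    # Imported lazily so the mapping above can be used without Presidio installed.
    from presidio_analyzer import AnalyzerEngine
    from presidio_anonymizer import AnonymizerEngine
    from presidio_anonymizer.entities import OperatorConfig

    analyzer = AnalyzerEngine()  # loads a spaCy model for language processing
    findings = analyzer.analyze(
        text=text, entities=presidio_entities(pii_types), language=language
    )
    result = AnonymizerEngine().anonymize(
        text=text,
        analyzer_results=findings,
        operators={"DEFAULT": OperatorConfig("replace", {"new_value": "[REDACTED]"})},
    )
    return result.text
```

Calling redact_with_presidio("Email jane@example.com", ["Contact"]) would run detection and masking in one pass. Building the engines once and reusing them would be preferable in a long-running service, since model loading is slow.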

Source

git clone https://github.com/a5c-ai/babysitter.git

The skill file is at plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/pii-redaction/SKILL.md on the main branch.


Overview

The pii-redaction skill detects personally identifiable information in conversational data and applies configurable redaction strategies to protect user privacy. It supports reversible anonymization, audit logging, and language-aware configuration to satisfy privacy and compliance needs.

How This Skill Works

It uses a PII detector to flag sensitive data across defined types, then applies a configured redaction method such as masking, pseudonymization, tokenization, or encryption. Redaction results are logged for compliance, and the system can be tailored to language support and custom patterns for organization-specific PII.
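
The detect, redact, and log flow described above can be sketched with the standard library alone. The two regular expressions are simplified stand-ins for a real detector, and the function names and audit record fields are hypothetical.

```python
import json
import re
from datetime import datetime, timezone

# Simplified illustrative patterns; a real detector handles far more formats.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def detect(text):
    """Return (start, end, kind) spans sorted by position. Overlaps are not resolved."""
    spans = [
        (match.start(), match.end(), kind)
        for kind, pattern in PATTERNS.items()
        for match in pattern.finditer(text)
    ]
    return sorted(spans)


def redact(text, spans):
    """Mask each span with a typed placeholder."""
    # Replace from the end of the text so earlier offsets stay valid.
    for start, end, kind in sorted(spans, reverse=True):
        text = text[:start] + f"[{kind}]" + text[end:]
    return text


def audit_records(spans, method):
    """Describe each redaction by type and position only, never by raw value."""
    now = datetime.now(timezone.utc).isoformat()
    return [
        {"time": now, "type": kind, "start": start, "end": end, "method": method}
        for start, end, kind in spans
    ]


message = "Reach me at jane@example.com or 555-123-4567."
spans = detect(message)
clean = redact(message, spans)  # "Reach me at [EMAIL] or [PHONE]."
log_lines = [json.dumps(record) for record in audit_records(spans, "masking")]
```

Keeping raw values out of the audit records matters: a compliance log that stores the redacted text would itself become a PII store.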

When to Use It

Moderate user-generated chat content to redact emails, phone numbers, and names before storage or display.
Configure system prompts and guardrails to prevent PII leakage to users or agents.
Store or transmit chat logs where regulations require an audit trail of PII handling.
Handle financial or health-related conversations by applying redaction or anonymization to sensitive fields.
Test and validate redaction coverage with diverse data patterns and languages to ensure robustness.

Quick Start

Step 1: Define the PII types to detect (Names, Contact, Financial, Government IDs, Medical, Custom) and set the detection sensitivity.
Step 2: Choose a redaction method (masking, pseudonymization, tokenization, or encryption) and wire it into the redaction pipeline.
Step 3: Enable compliance logging and run tests with diverse data to verify coverage, reversibility where configured, and audit trails.

Best Practices

Cover PII comprehensively by enumerating all relevant types (Names, Contact, Financial, Government IDs, Medical, Custom).
Update patterns regularly to keep detection effective against new formats and aliases.
Log every redaction event and every reversal of anonymized data.
Align with regional data protection laws and internal policies.
Test with diverse data, languages, and edge cases to confirm reliable detection.

Example Use Cases

Redacting emails and phone numbers from customer support transcripts before analytics or storage.
Masking names and usernames in content moderation logs to protect user identity.
Pseudonymizing patient identifiers in medical chatbot conversations for research or support.
Tokenizing or encrypting sensitive fields like account numbers in logs for secure storage.
Defining and applying custom patterns for organization-specific PII such as employee IDs.
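
For the custom-pattern case, a small sketch follows. The employee ID format ("EMP-" followed by six digits) is invented for illustration, as are the names CUSTOM_PATTERNS and redact_custom. Presidio provides a PatternRecognizer class for this purpose; the version below uses only the standard library to show the idea.

```python
import re

# Hypothetical format: "EMP-" followed by exactly six digits.
CUSTOM_PATTERNS = {"EMPLOYEE_ID": re.compile(r"\bEMP-\d{6}\b")}


def redact_custom(text: str) -> str:
    """Replace every match of each custom pattern with a typed placeholder."""
    for kind, pattern in CUSTOM_PATTERNS.items():
        text = pattern.sub(f"[{kind}]", text)
    return text


redacted = redact_custom("Ticket opened by EMP-004217 yesterday.")
# redacted == "Ticket opened by [EMPLOYEE_ID] yesterday."
```

The word boundaries keep the pattern from firing on longer or shorter digit runs, which is the kind of edge case worth covering when testing custom patterns.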

Frequently Asked Questions
