Agent skill
pii-redaction
Automatically applies when logging sensitive data. Ensures PII (phone numbers, emails, IDs, payment data) is redacted in all logs and outputs for compliance.
Stars
163
Forks
31
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/pii-redaction
SKILL.md
PII Redaction Enforcer
When you are writing code that logs or outputs:
- Phone numbers (mobile, landline)
- Email addresses
- User IDs / Subscriber IDs
- Payment tokens / Credit card numbers
- Social Security Numbers / Tax IDs
- Physical addresses
- Names (in sensitive contexts)
- IP addresses (in some jurisdictions)
- Any other personally identifiable information
Always redact PII in logs, error messages, and debug output.
✅ Correct Pattern
python
import logging
import re
logger = logging.getLogger(__name__)
def redact_pii(value: str, show_last: int = 4) -> str:
"""
Redact PII showing only last N characters.
Args:
value: String to redact
show_last: Number of characters to show (default: 4)
Returns:
Redacted string (e.g., "***5678")
"""
if not value or len(value) <= show_last:
return "***"
return f"***{value[-show_last:]}"
def redact_email(email: str) -> str:
"""
Redact email preserving domain.
Args:
email: Email address to redact
Returns:
Redacted email (e.g., "u***r@example.com")
"""
if not email or '@' not in email:
return "***"
local, domain = email.split('@', 1)
if len(local) <= 2:
return f"***@{domain}"
return f"{local[0]}***{local[-1]}@{domain}"
# Usage in logging
logger.info(f"Processing user | subscriber_id={redact_pii(subscriber_id)}")
logger.info(f"Payment received | token={redact_pii(payment_token)} | user={redact_email(email)}")
logger.info(f"Order created | phone={redact_pii(phone_number)}")
Output:
Processing user | subscriber_id=***5678
Payment received | token=***2344 | user=u***r@example.com
Order created | phone=***1234
❌ Incorrect Pattern (PII Leak)
python
# ❌ Full PII exposed
logger.info(f"Processing subscriber_id: {subscriber_id}")
logger.info(f"User email: {email}")
logger.info(f"Phone: {phone_number}")
# ❌ PII in error messages
raise ValueError(f"Invalid subscriber ID: {subscriber_id}")
# ❌ PII in API responses (should only redact in logs)
return {"error": f"User {email} not found"} # OK in response, but log should redact
# ❌ Insufficient redaction
logger.info(f"Phone: {phone_number[:3]}****") # Shows area code - still PII!
Redaction Helper Functions
python
import re
from typing import Optional
class PIIRedactor:
"""Helper class for consistent PII redaction."""
@staticmethod
def phone(phone: str, show_last: int = 4) -> str:
"""Redact phone number."""
digits = re.sub(r'\D', '', phone)
if len(digits) <= show_last:
return "***"
return f"***{digits[-show_last:]}"
@staticmethod
def email(email: str) -> str:
"""Redact email preserving domain."""
if not email or '@' not in email:
return "***"
local, domain = email.split('@', 1)
if len(local) <= 2:
return f"***@{domain}"
return f"{local[0]}***{local[-1]}@{domain}"
@staticmethod
def card_number(card: str, show_last: int = 4) -> str:
"""Redact credit card number."""
digits = re.sub(r'\D', '', card)
if len(digits) <= show_last:
return "***"
return f"***{digits[-show_last:]}"
@staticmethod
def ssn(ssn: str) -> str:
"""Redact SSN completely."""
return "***-**-****"
@staticmethod
def address(address: str) -> str:
"""Redact street address, keep city/state."""
# Example: "123 Main St, Boston, MA" -> "*** Main St, Boston, MA"
parts = address.split(',')
if parts:
street = parts[0]
# Redact house number but keep street name
street_redacted = re.sub(r'^\d+', '***', street)
parts[0] = street_redacted
return ','.join(parts)
@staticmethod
def generic(value: str, show_last: int = 4) -> str:
"""Generic redaction for any string."""
if not value or len(value) <= show_last:
return "***"
return f"***{value[-show_last:]}"
# Usage
redactor = PIIRedactor()
logger.info(f"User phone: {redactor.phone('555-123-4567')}")
logger.info(f"User email: {redactor.email('john.doe@example.com')}")
logger.info(f"Card: {redactor.card_number('4111-1111-1111-1111')}")
When to Redact
Always redact in:
- ✅ Log statements (
logger.info,logger.debug,logger.error) - ✅ Audit logs
- ✅ Error messages (logs)
- ✅ Monitoring/observability data (OTEL, metrics)
- ✅ Debug output
- ✅ Stack traces (if they contain PII)
Don't redact in:
- ❌ Actual API calls (need full data)
- ❌ Database queries (need full data)
- ❌ Function parameters (internal use)
- ❌ Return values to authorized clients
- ❌ Encrypted storage
- ❌ Secure internal processing
Context matters:
python
# ❌ Bad: Redacting in business logic
def send_email(email: str):
send_to(redact_email(email)) # Wrong! Need real email to send!
# ✅ Good: Redacting in logs
def send_email(email: str):
logger.info(f"Sending email to {redact_email(email)}") # Log: redacted
send_to(email) # Actual operation: full email
Structured Logging with Redaction
python
import logging
import structlog
# Configure structlog with redaction
def redact_processor(logger, method_name, event_dict):
"""Processor to redact PII fields."""
pii_fields = ['email', 'phone', 'ssn', 'card_number', 'subscriber_id']
for field in pii_fields:
if field in event_dict:
if field == 'email':
event_dict[field] = PIIRedactor.email(event_dict[field])
else:
event_dict[field] = PIIRedactor.generic(event_dict[field])
return event_dict
structlog.configure(
processors=[
redact_processor,
structlog.processors.JSONRenderer()
]
)
logger = structlog.get_logger()
# Usage - automatic redaction
logger.info("user_action", email="user@example.com", phone="555-1234")
# Output: {"event": "user_action", "email": "u***r@example.com", "phone": "***1234"}
Compliance Considerations
GDPR (EU):
- Personal data must be protected
- Logs are considered data processing
- Must have legal basis for logging PII
- Right to be forgotten applies to logs
CCPA (California):
- Consumer personal information must be protected
- Includes identifiers, browsing history, biometric data
PCI-DSS (Payment Cards):
- Never log full card numbers (PAN)
- Never log CVV/CVV2
- Card holder name is sensitive too
- Mask PAN when displayed (show last 4 only)
HIPAA (Healthcare):
- PHI (Protected Health Information) must be secured
- Includes names, addresses, dates, phone numbers
- Medical record numbers, device identifiers
python
# PCI-DSS compliant logging
def process_payment(card_number: str, cvv: str, amount: Decimal):
"""
Process payment transaction.
Security Note:
PCI-DSS compliant. Never logs full card number or CVV.
"""
# ✅ Redact card number in logs
logger.info(
f"Processing payment | "
f"card={redact_pii(card_number, show_last=4)} | "
f"amount={amount}"
# ✅ NEVER log CVV
)
# Process with full data
result = payment_gateway.charge(card_number, cvv, amount)
# ✅ Redact in audit log
logger.info(
f"AUDIT: Payment processed | "
f"card={redact_pii(card_number, show_last=4)} | "
f"amount={amount} | "
f"status={result.status}"
)
return result
Testing Redaction
python
import pytest
def test_redact_phone():
"""Test phone redaction."""
assert redact_pii("5551234567") == "***4567"
assert redact_pii("555-123-4567") == "***4567"
def test_redact_email():
"""Test email redaction."""
assert redact_email("user@example.com") == "u***r@example.com"
assert redact_email("a@example.com") == "***@example.com"
def test_redact_empty():
"""Test empty string handling."""
assert redact_pii("") == "***"
assert redact_pii(None) == "***"
def test_log_redaction(caplog):
"""Test that logs contain redacted PII."""
with caplog.at_level(logging.INFO):
user_id = "user_12345678"
logger.info(f"Processing user={redact_pii(user_id)}")
assert "***5678" in caplog.text
assert "user_12345678" not in caplog.text
❌ Anti-Patterns
python
# ❌ Not redacting at all
logger.info(f"User {user_id} logged in from {ip_address}")
# ❌ Inconsistent redaction
logger.info(f"User: ***{user_id[-4:]}") # Different format each time
logger.debug(f"User: {user_id[:4]}****")
# ❌ Insufficient redaction
logger.info(f"Email: {email[0]}****") # Just first letter isn't enough
# ❌ Redacting too much (lost debugability)
logger.info("User logged in") # No identifier at all - can't debug!
# ✅ Better: Redact but keep debugability
logger.info(f"User logged in | user_id={redact_pii(user_id)}")
# ❌ Forgetting error messages
try:
process_user(email)
except Exception as e:
logger.error(f"Failed for {email}: {e}") # PII in error log!
# ✅ Better
try:
process_user(email)
except Exception as e:
logger.error(f"Failed for {redact_email(email)}: {e}")
Best Practices Checklist
- ✅ Create consistent redaction helper functions
- ✅ Redact all PII in logs automatically
- ✅ Show last 4 characters for debugging
- ✅ Preserve enough info for support/debugging
- ✅ Use structured logging with auto-redaction
- ✅ Document PII handling in Security Notes
- ✅ Test that redaction works correctly
- ✅ Train team on PII identification
- ✅ Regular audit of logs for PII leaks
- ✅ Consider legal requirements (GDPR, CCPA, PCI-DSS)
Auto-Apply
When you see logging of PII, automatically:
- Wrap with appropriate redaction function
- Show last 4 characters for debugging
- Add Security Note to docstring if handling PII
- Use consistent redaction format
- Test that PII is not in logs
Related Skills
- structured-errors - Redact PII in error responses
- docstring-format - Document PII handling with Security Note
- tool-design-pattern - Redact PII in tool logs
- pytest-patterns - Test PII redaction
Didn't find tool you were looking for?