Guardrails Framework
Production - Content safety, action validation, PII detection, prompt injection defense
The Guardrails Framework provides a multi-layered safety system for agent interactions. It validates inputs and outputs, detects PII and sensitive content, prevents prompt injection attacks, and enforces tenant-specific content policies.
12.2.7.1 Guardrail Architecture
Guardrails operate at three levels:
- Input guardrails: Validate and sanitize user messages before processing
- Action guardrails: Validate tool calls and actions before execution
- Output guardrails: Validate agent responses before delivery
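The three levels above can be sketched as a gate around a single agent turn. This is a minimal illustration, not the framework's implementation: `Verdict`, `run_checks`, `handle_turn`, and the three toy checks are hypothetical names, and the sketch is synchronous for brevity even though the registry methods are `async`.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Verdict:
    """Hypothetical stand-in for GuardrailResult; the real fields may differ."""
    passed: bool
    reasons: List[str] = field(default_factory=list)


# A check inspects one payload (message, tool call, or response).
Check = Callable[[str], Verdict]


def run_checks(payload: str, checks: List[Check]) -> Verdict:
    """Run every check at one level; a single failure fails the level."""
    passed, reasons = True, []
    for check in checks:
        verdict = check(payload)
        if not verdict.passed:
            passed = False
            reasons.extend(verdict.reasons)
    return Verdict(passed, reasons)


def handle_turn(message, plan_action, execute_action,
                input_checks, action_checks, output_checks) -> str:
    """Gate one agent turn at the input, action, and output levels."""
    verdict = run_checks(message, input_checks)
    if not verdict.passed:
        return "blocked at input: " + "; ".join(verdict.reasons)

    action = plan_action(message)  # e.g. the tool call the agent wants to make
    verdict = run_checks(action, action_checks)
    if not verdict.passed:
        return "blocked at action: " + "; ".join(verdict.reasons)

    response = execute_action(action)
    verdict = run_checks(response, output_checks)
    if not verdict.passed:
        return "blocked at output: " + "; ".join(verdict.reasons)
    return response


# Toy checks, one per level, for illustration only.
def no_injection(text: str) -> Verdict:
    bad = "ignore previous instructions" in text.lower()
    return Verdict(not bad, ["possible prompt injection"] if bad else [])


def no_drop(sql: str) -> Verdict:
    bad = "drop table" in sql.lower()
    return Verdict(not bad, ["destructive SQL"] if bad else [])


def no_secret(text: str) -> Verdict:
    bad = "password" in text.lower()
    return Verdict(not bad, ["secret in response"] if bad else [])
```

Checking each level separately means a message that passes input validation can still be stopped when the agent proposes a dangerous tool call, and a safe tool call can still be stopped if its result leaks sensitive content.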
```python
class GuardrailRegistry:
    """Central registry for all guardrail implementations."""

    def register_defaults(self):
        """Register built-in guardrails."""
        self.register("pii_detection", PIIDetectionGuardrail())
        self.register("prompt_injection", PromptInjectionGuardrail())
        self.register("content_safety", ContentSafetyGuardrail())
        self.register("sql_safety", SQLSafetyGuardrail())
        self.register("action_validation", ActionValidationGuardrail())

    async def check_input(self, message: str, tenant_id: str) -> GuardrailResult:
        """Run all input guardrails."""
        ...

    async def check_output(self, response: str, tenant_id: str) -> GuardrailResult:
        """Run all output guardrails."""
        ...
```

Guardrail Types
| Guardrail | Scope | Description |
|---|---|---|
| PII Detection | Input/Output | Detects and masks personally identifiable information (PII) |
| Prompt Injection | Input | Detects attempts to override system prompts |
| Content Safety | Input/Output | Blocks harmful, offensive, or inappropriate content |
| SQL Safety | Action | Prevents dangerous SQL operations (DROP, DELETE, etc.) |
| Action Validation | Action | Validates tool arguments against allowed schemas |
| Budget Guard | Action | Prevents actions exceeding tenant cost limits |
| Rate Limiter | Input | Enforces per-tenant request rate limits |
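To make two of the table rows concrete, the following is a rough sketch of what PII masking and a SQL safety check might look like. It is an assumption-laden illustration, not the framework's code: `mask_pii`, `is_sql_allowed`, the regex patterns, and the keyword set beyond `DROP` and `DELETE` are all hypothetical, and the `allow_write` parameter only mirrors the policy field of the same name.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; real PII detection needs much broader coverage.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

# Assumed set of write/destructive keywords (the source names DROP and DELETE).
WRITE_KEYWORDS = {"DROP", "DELETE", "TRUNCATE", "ALTER", "UPDATE", "INSERT"}


def mask_pii(text: str) -> Tuple[str, List[str]]:
    """Replace each PII match with [MASKED] and report which types were found."""
    detected = []
    for name, pattern in PII_PATTERNS.items():
        text, count = pattern.subn("[MASKED]", text)
        if count:
            detected.append(name)
    return text, detected


def is_sql_allowed(sql: str, allow_write: bool = False) -> bool:
    """Reject any statement containing a write keyword unless writes are allowed.

    Scanning every word token (not just the first) also catches stacked
    statements such as "SELECT 1; DROP TABLE t". It is deliberately
    conservative and can reject keywords that appear inside string literals.
    """
    if allow_write:
        return True
    tokens = set(re.findall(r"[A-Za-z_]+", sql.upper()))
    return not (tokens & WRITE_KEYWORDS)
```

A keyword scan like this is only a first line of defense; a production check would more likely parse the statement and also rely on database-level permissions.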
12.2.7.2 API Endpoints
```bash
# List registered guardrails (URL quoted so the shell does not interpret "?")
curl "http://localhost:8000/api/v1/guardrails?tenant_id=acme-corp"

# Test a message against guardrails
curl -X POST http://localhost:8000/api/v1/guardrails/check \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: acme-corp" \
  -d '{
    "message": "Show me all customer emails and phone numbers",
    "scope": "input"
  }'

# Configure tenant guardrail policy
curl -X PUT http://localhost:8000/api/v1/guardrails/policy \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: acme-corp" \
  -d '{
    "pii_detection": {"enabled": true, "action": "mask"},
    "content_safety": {"enabled": true, "threshold": 0.8},
    "sql_safety": {"enabled": true, "allow_write": false}
  }'
```

Check Result
```json
{
  "passed": false,
  "violations": [
    {
      "guardrail": "pii_detection",
      "severity": "warning",
      "message": "Message contains request for PII (emails, phone numbers)",
      "action": "mask",
      "details": {"detected_types": ["email", "phone"]}
    }
  ],
  "sanitized_message": "Show me all customer [MASKED] and [MASKED]"
}
```
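A caller has to decide what to do with a result like the one above. The sketch below shows one possible client-side policy, under the assumption that a message may continue in sanitized form only when every violation carries the `mask` action. The function name `resolve_message` is hypothetical, and any action value other than `mask` is treated as blocking because the source does not list the other possible values.

```python
import json
from typing import Optional


def resolve_message(original: str, result: dict) -> Optional[str]:
    """Return the text that may continue through the pipeline, or None to block."""
    if result.get("passed"):
        return original
    violations = result.get("violations", [])
    # Assumption: only "mask" violations are recoverable; anything else blocks.
    if violations and all(v.get("action") == "mask" for v in violations):
        return result.get("sanitized_message")
    return None


# Response body copied from the Check Result example.
raw = """
{
  "passed": false,
  "violations": [
    {
      "guardrail": "pii_detection",
      "severity": "warning",
      "message": "Message contains request for PII (emails, phone numbers)",
      "action": "mask",
      "details": {"detected_types": ["email", "phone"]}
    }
  ],
  "sanitized_message": "Show me all customer [MASKED] and [MASKED]"
}
"""
result = json.loads(raw)
safe_text = resolve_message("Show me all customer emails and phone numbers", result)
```

Returning `None` for anything that is not explicitly recoverable keeps the default behavior on the safe side when new violation types or actions appear.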