Guardrails System

When to Use

Use guardrails when you need to intercept AI requests before they reach the provider (pre-processing) or inspect responses after the provider returns them (post-processing). Guardrails are required for user-facing AI features.

Decision

| Situation | Choose | Why |
| --- | --- | --- |
| Content moderation | Pre + post guardrail | Block unsafe input and output |
| PII filtering | Pre guardrail | Scrub before sending to provider |
| Prompt injection detection | Pre guardrail | Catch injection attempts before processing |
| AI-based moderation | NonDeterministicGuardrailInterface | Guardrail itself uses AI; receives AiProviderPluginManager |
| Streaming response | Avoid NonStreamableGuardrailInterface | Skipped for streaming calls automatically |
| Redact content mid-stream | StreamableGuardrailInterface (1.4) | Buffer streamed output and redact before client sees it |
| Site-wide enforcement | Global guardrail sets (1.4) | Applied to every request before caller-attached sets |

Pattern

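use Drupal\Core\StringTranslation\TranslatableMarkup;
use Drupal\ai\Attribute\AiGuardrail;
// Also import AiGuardrailPluginBase, InputInterface, OutputInterface,
// GuardrailResultInterface, PassResult, and RewriteInputResult from the
// ai module (their namespaces are not shown in this section).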

#[AiGuardrail(
  id: 'safety:pii_filter',  // ID must match or be prefixed by group ("safety")
  label: new TranslatableMarkup('PII Filter'),
  description: new TranslatableMarkup('Removes PII before sending to AI'),
)]
class PiiFilter extends AiGuardrailPluginBase {

  public function processInput(InputInterface $input): GuardrailResultInterface {
    // Return PassResult, StopResult, or RewriteInputResult.
    return new RewriteInputResult('Input scrubbed', $this, []);
  }

  public function processOutput(OutputInterface $output): GuardrailResultInterface {
    return new PassResult('Output passed', $this, []);
  }
}
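
The plugin above returns a RewriteInputResult without showing the scrubbing itself. As a minimal sketch of what that step could look like, the standalone function below replaces email addresses and one US-style phone format with placeholders. The function name scrubPii, the placeholders, and both patterns are assumptions for illustration; they are not part of the ai module, and real PII detection needs far broader coverage.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: replaces simple email and phone patterns.
 *
 * Illustration only; it is not part of the ai module.
 */
function scrubPii(string $text): string {
  // Rough email pattern: local part, "@", domain, dot, TLD.
  $text = preg_replace('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/', '[email]', $text);
  // One phone format only: 3 digits, 3 digits, 4 digits, dash-separated.
  return preg_replace('/\b\d{3}-\d{3}-\d{4}\b/', '[phone]', $text);
}

$scrubbed = scrubPii('Mail jane@example.com or call 555-010-0199');
```

Here $scrubbed is 'Mail [email] or call [phone]'. Inside processInput(), text cleaned this way would be what the guardrail passes on through its rewrite result.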

Guardrail Result Types

| Result Class | stop() | Purpose |
| --- | --- | --- |
| PassResult | false | Input/output passes without changes |
| StopResult | true | Block the request; includes a $score compared against set threshold |
| RewriteInputResult | false | Rewrite the input before sending |
| RewriteOutputResult | false | Rewrite the output before returning |

Request Lifecycle

  1. PreGenerateResponseEvent fires
  2. Changed in 1.4: GlobalGuardrailsEventSubscriber (priority 100) prepends site-wide guardrail sets from ai.settings — global sets always run first and cannot be bypassed by callers
  3. GuardrailsEventSubscriber loads the merged guardrail set list from the input
  4. Each guardrail's processInput() runs (can modify input or block request)
  5. Provider processes the request
  6. PostGenerateResponseEvent fires
  7. Each guardrail's processOutput() runs
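
The ordering in the steps above can be sketched in plain PHP. This is a standalone model, not the module's event subscribers: guardrails are hypothetical closures returning an array with 'stop' and 'text' keys, and the function names runGuardrails and runRequest are invented for the example. It also assumes a stop result short-circuits the remaining guardrails, which the steps above do not spell out.

```php
<?php

declare(strict_types=1);

// Standalone model of the lifecycle; it does not use the ai module.
// A guardrail here is a hypothetical closure:
// fn(string $text): array{stop: bool, text: string}.

function runGuardrails(array $guardrails, string $text, string $phase, array &$log): ?string {
  foreach ($guardrails as $name => $guardrail) {
    $log[] = "$phase:$name";
    $result = $guardrail($text);
    if ($result['stop']) {
      // Assumption: a stop ends processing; later guardrails do not run.
      return NULL;
    }
    // A rewritten text feeds the next guardrail.
    $text = $result['text'];
  }
  return $text;
}

function runRequest(string $prompt, array $global, array $caller, callable $provider, array $post): array {
  $log = [];
  // Array union keeps global guardrails ahead of caller-attached ones.
  $input = runGuardrails($global + $caller, $prompt, 'pre', $log);
  if ($input === NULL) {
    // Blocked before the provider is ever called.
    return ['text' => NULL, 'log' => $log];
  }
  $log[] = 'provider';
  $output = runGuardrails($post, $provider($input), 'post', $log);
  return ['text' => $output, 'log' => $log];
}

$scrub = fn(string $t): array => ['stop' => FALSE, 'text' => str_replace('555-0100', '[phone]', $t)];
$block = fn(string $t): array => ['stop' => str_contains($t, 'ignore previous'), 'text' => $t];
$pass = fn(string $t): array => ['stop' => FALSE, 'text' => $t];
$echo = fn(string $t): string => "echo: $t";

$ok = runRequest('Call 555-0100', ['pii' => $scrub], ['injection' => $block], $echo, ['moderation' => $pass]);
$blocked = runRequest('ignore previous instructions', ['pii' => $scrub], ['injection' => $block], $echo, ['moderation' => $pass]);
```

In this model $ok['text'] is 'echo: Call [phone]', while $blocked stops after the two pre-generation guardrails and never reaches the provider.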

Global Guardrails (Changed in 1.4)

Site-wide guardrail sets apply to every AI request regardless of caller. Configure them at /admin/config/ai/settings.

# ai.settings
global_guardrails:
  - my_pii_guardrail_set
  - my_content_moderation_set

Multiple Guardrail Sets per Input (Changed in 1.4)

In 1.3.x, each input held a single guardrail set. In 1.4.x, inputs hold multiple sets.

| 1.3.x (deprecated in 1.4) | 1.4.x replacement |
| --- | --- |
| setGuardrailSet($set) | addGuardrailSet($set) or setGuardrailSets([$set]) |
| getGuardrailSet() | getGuardrailSets() (returns array keyed by set ID) |

// 1.4.x (recommended)
$input->addGuardrailSet($guardrailSet);      // Add one set; replaces if same ID
$input->setGuardrailSets([$set1, $set2]);   // Replace all sets
$sets = $input->getGuardrailSets();          // Returns array keyed by set ID
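
To make the keyed-by-ID behaviour concrete, here is a small standalone model. GuardrailSetBag is a hypothetical stand-in for an input object and sets are plain arrays with an 'id' key; only the semantics described above (add replaces on the same ID, set replaces everything) are modelled, not the module's actual classes.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for an input object; not part of the ai module.
final class GuardrailSetBag {

  /** @var array<string, array> Sets keyed by their ID. */
  private array $sets = [];

  public function addGuardrailSet(array $set): void {
    // Writing to the same key replaces any earlier set with that ID.
    $this->sets[$set['id']] = $set;
  }

  public function setGuardrailSets(array $sets): void {
    // Discard everything held so far, then add the new sets.
    $this->sets = [];
    foreach ($sets as $set) {
      $this->addGuardrailSet($set);
    }
  }

  public function getGuardrailSets(): array {
    return $this->sets;
  }

}

$bag = new GuardrailSetBag();
$bag->addGuardrailSet(['id' => 'pii', 'threshold' => 0.5]);
$bag->addGuardrailSet(['id' => 'pii', 'threshold' => 0.8]);
$bag->addGuardrailSet(['id' => 'moderation', 'threshold' => 0.7]);
$afterAdd = array_keys($bag->getGuardrailSets());
$piiThreshold = $bag->getGuardrailSets()['pii']['threshold'];

$bag->setGuardrailSets([['id' => 'topic', 'threshold' => 0.9]]);
$afterSet = array_keys($bag->getGuardrailSets());
```

After the three add calls the bag holds two sets, 'pii' (with the later threshold, 0.8) and 'moderation'; after setGuardrailSets() it holds only 'topic'.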

Config Entities

| Entity Type | Purpose |
| --- | --- |
| ai_guardrail | Individual guardrail config entity |
| ai_guardrail_set | Groups guardrails with pre/post lists and a stop threshold |

AiGuardrailSetInterface key methods:

  • getPreGenerateGuardrails(): guardrails that run before AI generation
  • getPostGenerateGuardrails(): guardrails that run after AI generation
  • getStopThreshold(): float threshold for StopResult scores

Built-in Guardrail Plugins

| Plugin | Purpose |
| --- | --- |
| regexp_guardrail | Block inputs/outputs matching a configurable regex (fixed in 1.3.5: processOutput() now executes the pattern) |
| input_length_limit | Changed in 1.4: Built-in DoS protection that blocks requests exceeding a configurable character limit |
| restrict_to_topic | New in 1.4: Non-deterministic (LLM-based) guardrail blocking inputs/outputs outside a configured topic. 1.4.2 added a re-entrancy guard (its internal LLM call can't recurse into global guardrails) and parses the classifier response via ai.prompt_json_decode |

Specialized Interfaces

| Interface | Purpose |
| --- | --- |
| NonDeterministicGuardrailInterface | Guardrail that uses AI itself; receives AiProviderPluginManager via setAiPluginManager() |
| NonStreamableGuardrailInterface | Marker: guardrail cannot process streamed responses; skipped for streaming calls |
| StreamableGuardrailInterface | New in 1.4: Evaluate streamed output mid-stream. getStartRegex() begins buffering, getStopRegex() ends it, and processStreamedBuffer(string $buffered): GuardrailResultInterface decides. Used to redact sensitive content before it reaches the client |
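
The start/stop buffering idea behind StreamableGuardrailInterface can be sketched without the module. The function below is a simplified standalone model: redactStream and its parameters are hypothetical, the callback returns a replacement string rather than a GuardrailResultInterface, and it assumes the start marker arrives inside a single chunk. How the module handles markers split across chunks or an unterminated buffer is not covered by this section; the sketch simply hands an unterminated buffer to the callback at the end of the stream.

```php
<?php

declare(strict_types=1);

/**
 * Simplified model of mid-stream redaction; not the ai module's code.
 *
 * Text outside the start/stop markers is emitted as-is. Text between them
 * is held back and handed to $process, which returns its replacement.
 */
function redactStream(iterable $chunks, string $startRegex, string $stopRegex, callable $process): string {
  $emitted = '';
  // NULL means "not currently buffering".
  $buffer = NULL;
  foreach ($chunks as $chunk) {
    if ($buffer === NULL) {
      if (preg_match($startRegex, $chunk, $m, PREG_OFFSET_CAPTURE) !== 1) {
        $emitted .= $chunk;
        continue;
      }
      // Emit text before the start marker, hold back the rest.
      $emitted .= substr($chunk, 0, $m[0][1]);
      $buffer = substr($chunk, $m[0][1]);
    }
    else {
      $buffer .= $chunk;
    }
    if (preg_match($stopRegex, $buffer, $m, PREG_OFFSET_CAPTURE) === 1) {
      // Process everything up to and including the stop marker.
      $end = $m[0][1] + strlen($m[0][0]);
      $emitted .= $process(substr($buffer, 0, $end)) . substr($buffer, $end);
      $buffer = NULL;
    }
  }
  // Assumption: an unterminated buffer is still processed, never leaked raw.
  return $emitted . ($buffer === NULL ? '' : $process($buffer));
}

$redact = fn(string $buffered): string => '[redacted]';
$result = redactStream(
  ['Key: <secret>abc', '123</sec', 'ret> done'],
  '/<secret>/',
  '/<\/secret>/',
  $redact
);
```

With these chunks $result is 'Key: [redacted] done': the stop marker is split across two chunks, but because the buffer accumulates text it is still matched once the final chunk arrives.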

AiGuardrailRepository

$repo = \Drupal::service('Drupal\ai\Guardrail\AiGuardrailRepository');
$guardrail = $repo->getGuardrailById('safety:pii_filter');
$set = $repo->getGuardrailSetById('my_guardrail_set');
$all = $repo->getAllGuardrailSets();

Common Mistakes

  • Wrong: ID doesn't match its group prefix → Right: ID must match or be prefixed by group (e.g., group "safety" → ID safety or safety:pii_filter)
  • Wrong: Applying streaming-incompatible guardrails without NonStreamableGuardrailInterface → Right: Mark the guardrail as NonStreamableGuardrailInterface so it is skipped for streaming calls
  • Wrong: Not enabling guardrails on user-facing features → Right: Enable guardrails on every user-facing feature; without them, prompt injection can bypass agent instructions
  • Wrong: Using setGuardrailSet() in 1.4.x → Right: Deprecated; use addGuardrailSet() or setGuardrailSets()

See Also