Security & Performance

When to Use

Use this guide when hardening taxonomy implementations against security vulnerabilities and optimizing for large-scale performance.

Decision

Security

| Situation | Choose | Why |
| Vocabulary-level permissions | Per-vocabulary permissions (edit terms in $vid) | Granular control instead of the overly permissive administer taxonomy |
| Term view access | Ensure users have the access content permission | Viewing a term requires published status AND access content |
| Auto-create validation | Implement hook_taxonomy_term_presave() | Prevent spam/XSS in auto-created terms |
| Custom output | Use Html::escape($term->getName()) | Prevent XSS attacks |
| CSRF protection | Use Form API | Automatic CSRF token inclusion |

Performance

| Term Count | Performance Impact | Mitigation |
| <100 | Negligible | Use default approaches |
| 100-1,000 | loadTree() slows; dropdown widgets lag | Cache trees; use autocomplete widgets |
| 1,000-10,000 | loadTree() with load_entities causes memory issues | Use load_entities = FALSE; paginate admin UI |
| >10,000 | Admin UI fails; exposed filters time out | Disable overview form; use Search API/Solr for faceting |

Pattern

Access control:

// GOOD: Granular per-vocabulary permissions
$account->hasPermission("edit terms in $vid");

// BAD: Overly permissive
$account->hasPermission('administer taxonomy');

XSS prevention:

use Drupal\Component\Utility\Html;

// ALWAYS escape term names in custom (non-Twig) output
$safe_name = Html::escape($term->getName());

// Twig auto-escapes printed values
{{ term.label }} {# Safe: auto-escaped #}
{{ term.label|raw }} {# Dangerous unless already sanitized #}

Auto-create validation:

use Drupal\taxonomy\TermInterface;

// Prevent spam/XSS in auto-created terms
function mymodule_taxonomy_term_presave(TermInterface $term) {
  if ($term->isNew()) {
    // Strip HTML tags, then normalize whitespace. Each step works on the
    // result of the previous one, not on the original name.
    $name = strip_tags($term->getName());
    $name = trim(preg_replace('/\s+/', ' ', $name));
    // Enforce max length last, counting characters rather than bytes
    if (mb_strlen($name) > 50) {
      $name = mb_substr($name, 0, 50);
    }
    $term->setName($name);
  }
}
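
The cleanup steps can also be pulled into a plain function so they can be unit-tested without bootstrapping Drupal. This is a minimal sketch: mymodule_clean_term_name() is a hypothetical helper, and the default limit of 50 simply mirrors the hook above. It only reduces junk in stored names; output escaping is still required when the name is rendered.

```php
<?php

/**
 * Hypothetical helper: cleans a user-supplied term name before saving.
 */
function mymodule_clean_term_name(string $name, int $max_length = 50): string {
  // Strip HTML tags first so markup does not count toward the length limit.
  $name = strip_tags($name);
  // Collapse whitespace runs; preg_replace() returns NULL on invalid UTF-8,
  // in which case we fall back to an empty string.
  $name = trim(preg_replace('/\s+/u', ' ', $name) ?? '');
  // Truncate by characters, not bytes, then drop any trailing space left
  // behind by the cut.
  return rtrim(mb_substr($name, 0, $max_length));
}
```

Inside the presave hook this would be called as $term->setName(mymodule_clean_term_name($term->getName())).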

N+1 query problem:

// BAD: N+1 queries
foreach ($nodes as $node) {
  $terms = $node->get('field_tags')->referencedEntities();
}

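// GOOD: Preload all referenced terms
$term_storage = \Drupal::entityTypeManager()->getStorage('taxonomy_term');
$tids = [];
foreach ($nodes as $node) {
  foreach ($node->get('field_tags') as $item) {
    $tids[] = $item->target_id;
  }
}
// One multi-load; the result is keyed by term ID
$terms = $term_storage->loadMultiple(array_unique(array_filter($tids)));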
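
After the single multi-load, the terms still have to be matched back to each node. The sketch below shows that bookkeeping with plain arrays; both function names are hypothetical, and the $terms array stands in for the ID-keyed result of loadMultiple().

```php
<?php

/**
 * Hypothetical helper: flattens per-node term ID lists into one unique list.
 *
 * @param array $tids_by_node
 *   Term ID lists keyed by node ID, e.g. [10 => [1, 2], 11 => [2, 3]].
 */
function mymodule_unique_term_ids(array $tids_by_node): array {
  $tids = [];
  foreach ($tids_by_node as $node_tids) {
    foreach ($node_tids as $tid) {
      // Keying by ID removes duplicates as we go.
      $tids[(int) $tid] = (int) $tid;
    }
  }
  return array_values($tids);
}

/**
 * Hypothetical helper: maps each node ID to its already-loaded terms.
 *
 * @param array $tids_by_node
 *   Term ID lists keyed by node ID.
 * @param array $terms
 *   Loaded terms keyed by term ID.
 */
function mymodule_group_terms_by_node(array $tids_by_node, array $terms): array {
  $grouped = [];
  foreach ($tids_by_node as $nid => $node_tids) {
    $grouped[$nid] = [];
    foreach ($node_tids as $tid) {
      // Skip IDs that did not load, e.g. references to deleted terms.
      if (isset($terms[$tid])) {
        $grouped[$nid][] = $terms[$tid];
      }
    }
  }
  return $grouped;
}
```

The per-node order of the field values is preserved, and dangling references are dropped silently rather than raising an error.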

loadTree() optimization:

// BAD: Out-of-memory with 10k+ terms
$tree = $term_storage->loadTree($vid, 0, NULL, TRUE);

// GOOD: Load lightweight objects, cherry-pick entities
$tree = $term_storage->loadTree($vid, 0, NULL, FALSE);
$tids_to_load = array_slice(array_column($tree, 'tid'), 0, 100);
$terms = $term_storage->loadMultiple($tids_to_load);
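
For labels and select options, the lightweight rows are often enough on their own. The following sketch builds a paged, depth-indented option list; it assumes each row exposes tid, name, and depth properties, as the rows returned with load_entities = FALSE do. mymodule_tree_options() is a hypothetical name.

```php
<?php

/**
 * Hypothetical helper: builds indented option labels from lightweight rows.
 *
 * @param object[] $tree
 *   Flat tree rows with tid, name, and depth properties.
 * @param int $offset
 *   Index of the first row to include.
 * @param int|null $limit
 *   Maximum number of rows, or NULL for all remaining rows.
 */
function mymodule_tree_options(array $tree, int $offset = 0, ?int $limit = NULL): array {
  $options = [];
  foreach (array_slice($tree, $offset, $limit) as $row) {
    // Prefix one dash per depth level to show the hierarchy.
    $options[(int) $row->tid] = str_repeat('-', (int) $row->depth) . $row->name;
  }
  return $options;
}
```

The labels are plain strings and are not escaped here, so they should still go through the Form API or Twig rather than being printed directly.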

Caching term trees:

// Cache tree for 1 hour
$cid = "taxonomy_tree:$vid";
$cache = \Drupal::cache()->get($cid);

if ($cache) {
  $tree = $cache->data;
} else {
  $tree = $term_storage->loadTree($vid);
  \Drupal::cache()->set($cid, $tree, time() + 3600, ["taxonomy_term_list:$vid"]);
}

Large vocabulary strategies:

  • Disable term overview page: hook_entity_operation_alter() to remove "List terms" link
  • Use autocomplete everywhere: never render full term list
  • Consider hierarchical facets in Search API instead of Views exposed filters
  • Partition large vocabularies: "US States", "Canadian Provinces" instead of "All Regions"
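
The first strategy could look roughly like the sketch below. It rests on two assumptions: that the "List terms" operation is keyed as 'list', and that 'tags' is one of the large vocabularies (a placeholder machine name). The $entity parameter is left untyped only so the snippet runs standalone; in a module it would normally be type-hinted as \Drupal\Core\Entity\EntityInterface.

```php
<?php

/**
 * Implements hook_entity_operation_alter().
 */
function mymodule_entity_operation_alter(array &$operations, $entity) {
  // Hypothetical list of vocabularies too large for the overview form.
  $large_vocabularies = ['tags'];

  if ($entity->getEntityTypeId() === 'taxonomy_vocabulary'
    && in_array($entity->id(), $large_vocabularies, TRUE)) {
    // Assumption: 'list' is the key of the "List terms" operation.
    unset($operations['list']);
  }
}
```

This hides the link only. The overview route itself stays reachable by URL, so access to it still needs to be restricted separately.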

Common Mistakes

  • Wrong: Trusting user input in auto-created terms → Right: Always validate and sanitize in presave hook
  • Wrong: Not setting cache tags on term-dependent data → Right: Use ['taxonomy_term:' . $tid] cache tag
  • Wrong: Exposing term overview form for large vocabularies → Right: Restrict access or provide filtered views
  • Wrong: Using taxonomy_index for non-nodes → Right: Use entity reference queries or build custom index table
  • Wrong: Not invalidating term tree cache → Right: Use cache tag taxonomy_term_list:$vid
  • Wrong: Granting 'administer taxonomy' to untrusted roles → Right: Use per-vocabulary permissions

See Also