Security & Performance
When to Use
Use this guide when hardening taxonomy implementations against security vulnerabilities and optimizing for large-scale performance.
Decision
Security
| Situation | Choose | Why |
|---|---|---|
| Vocabulary-level permissions | Per-vocabulary permissions (edit terms in $vid) |
Granular control vs overly permissive administer taxonomy |
| Term view access | Ensure users have access content permission |
Terms require published status AND access content |
| Auto-create validation | Implement hook_taxonomy_term_presave() |
Prevent spam/XSS in auto-created terms |
| Custom output | Use Html::escape($term->getName()) |
Prevent XSS attacks |
| CSRF protection | Use Form API | Automatic CSRF token inclusion |
Performance
| Term Count | Performance Impact | Mitigation |
|---|---|---|
| <100 | Negligible | Use default approaches |
| 100-1,000 | loadTree() slows; dropdown widgets lag | Cache trees; use autocomplete widgets |
| 1,000-10,000 | loadTree() with load_entities causes memory issues | Use load_entities = FALSE; paginate admin UI |
| >10,000 | Admin UI fails; exposed filters timeout | Disable overview form; use Search API/Solr for faceting |
Pattern
Access control:
// GOOD: Granular per-vocabulary permissions
$account->hasPermission("edit terms in $vid");
// BAD: Overly permissive
$account->hasPermission('administer taxonomy');
XSS prevention:
// ALWAYS sanitize term names in custom output
$safe_name = Html::escape($term->getName());
// Twig auto-escapes
{{ term.name }} {# Safe #}
{{ term.name|raw }} {# Dangerous unless sanitized #}
Auto-create validation:
// Prevent spam/XSS in auto-created terms
function mymodule_taxonomy_term_presave(Term $term) {
if ($term->isNew()) {
$name = $term->getName();
// Enforce max length
if (strlen($name) > 50) {
$term->setName(substr($name, 0, 50));
}
// Strip HTML tags
$term->setName(strip_tags($name));
// Normalize whitespace
$term->setName(preg_replace('/\s+/', ' ', trim($name)));
}
}
N+1 query problem:
// BAD: N+1 queries
foreach ($nodes as $node) {
$terms = $node->get('field_tags')->referencedEntities();
}
// GOOD: Preload all referenced terms
$tids = [];
foreach ($nodes as $node) {
foreach ($node->get('field_tags') as $item) {
$tids[] = $item->target_id;
}
}
$terms = $term_storage->loadMultiple(array_unique($tids));
loadTree() optimization:
// BAD: Out-of-memory with 10k+ terms
$tree = $term_storage->loadTree($vid, 0, NULL, TRUE);
// GOOD: Load lightweight objects, cherry-pick entities
$tree = $term_storage->loadTree($vid, 0, NULL, FALSE);
$tids_to_load = array_slice(array_column($tree, 'tid'), 0, 100);
$terms = $term_storage->loadMultiple($tids_to_load);
Caching term trees:
// Cache tree for 1 hour
$cid = "taxonomy_tree:$vid";
$cache = \Drupal::cache()->get($cid);
if ($cache) {
$tree = $cache->data;
} else {
$tree = $term_storage->loadTree($vid);
\Drupal::cache()->set($cid, $tree, time() + 3600, ["taxonomy_term_list:$vid"]);
}
Large vocabulary strategies:
- Disable term overview page: hook_entity_operation_alter() to remove "List terms" link
- Use autocomplete everywhere: never render full term list
- Consider hierarchical facets in Search API instead of Views exposed filters
- Partition large vocabularies: "US States", "Canadian Provinces" instead of "All Regions"
Common Mistakes
- Wrong: Trusting user input in auto-created terms → Right: Always validate and sanitize in presave hook
- Wrong: Not setting cache tags on term-dependent data → Right: Use
['taxonomy_term:' . $tid]cache tag - Wrong: Exposing term overview form for large vocabularies → Right: Restrict access or provide filtered views
- Wrong: Using taxonomy_index for non-nodes → Right: Use entity reference queries or build custom index table
- Wrong: Not invalidating term tree cache → Right: Use cache tag
taxonomy_term_list:$vid - Wrong: Granting 'administer taxonomy' to untrusted roles → Right: Use per-vocabulary permissions