AI Best Practices and Evals

Confidence label: This is a fast-moving area. The ai_best_practices project and Drupal Eval Commons are both pre-MVP as of 2026-05-21. Treat this as current-state orientation, not settled reference. Refresh cadence: quarterly, or whenever ai_best_practices releases a new version.

When to Use

Use this when you want to understand the canonical Drupal AI guidance ecosystem — where authoritative best-practice guidance will live, what the eval infrastructure is, and what tools are and are not appropriate for Drupal contribution quality assurance.

Decision: The Drupal AI Guidance Landscape

Resource	What it is	Status	Use it?
`ai_best_practices` drupal.org project	Canonical, opinionated Drupal AI guidance; `anthropics/skills`-format skill files + `evals/evals.json`; maintained by webchick, Dries, 19+ contributors	Pre-MVP (`1.0.x-dev`); MVP roadmap in progress (#3585542)	Watch it — will become the authoritative source; do not hard-depend yet
Adopted AI contribution policy	The live policy (disclosure, responsibility, enforcement)	Adopted 2026-04-23	Yes — current, enforced guidance
Drupal Eval Commons	Five-layer eval infrastructure proposal: cases/rubrics, result envelope, registry, browser, domain extensions	Proof of concept (built by Angie Byron with Claude); proposal at #3586445 — not independently confirmed	Orientation only — not a dependency
"Every Eval Ever" envelope	Standardized eval result format (EvalEval-Coalition external standard)	Stable format (external standard); Drupal's binding to it is not yet designed	The envelope format is stable; Drupal's registry is not
`promptfoo`	Eval tooling formerly used in the community	Dead — acquired by OpenAI March 2026, open-source project closed	Do not adopt

Pattern: `ai_best_practices` — What It Is and Why It Matters

The ai_best_practices project is positioned as the canonical, opinionated source of truth for how AI agents and their humans should write Drupal code. It is not documentation — it is an active, executable guidance system:

Skill files (e.g., hook-implementations.md, issue-creation.md): guidance in anthropics/skills format, each with an evals.json
Evaluation framework (#3581832): an evals.json spec and a grader script that runs PHP lint and phpcs and checks diff validity, security patterns, and report structure; it runs offline and requires no API key
Policy framework: disclosure rules, conduct, and enforcement, aligned with the adopted policy

When the MVP ships, it becomes the primary reference for AI-assisted Drupal contribution quality. Until then, the adopted policy (see Drupal AI Policy) is the enforced rule.
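
To make the idea of an offline grader concrete, here is a minimal sketch in Python. It is not the project's grader: the names `CheckResult`, `check_php_lint`, `check_security_patterns`, `grade`, and the two regex patterns are all hypothetical and chosen for illustration. The only external command used is `php -l`, PHP's built-in syntax check, and the sketch skips it when no `php` binary is installed. No network access or API key is involved.

```python
import os
import re
import shutil
import subprocess
import tempfile
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class CheckResult:
    name: str
    passed: Optional[bool]  # None means the check was skipped
    detail: str = ""


def check_php_lint(code: str) -> CheckResult:
    """Run `php -l` on the code; skip when no php binary is installed."""
    php = shutil.which("php")
    if php is None:
        return CheckResult("php_lint", None, "php not found; skipped")
    with tempfile.NamedTemporaryFile("w", suffix=".php", delete=False) as fh:
        fh.write(code)
        path = fh.name
    try:
        proc = subprocess.run([php, "-l", path], capture_output=True, text=True)
    finally:
        os.unlink(path)  # always remove the temporary file
    return CheckResult("php_lint", proc.returncode == 0, proc.stdout.strip())


# Illustrative patterns only (hypothetical); a real security check
# would need to be far more thorough than two regexes.
RISKY_PATTERNS = {
    "eval() call": re.compile(r"\beval\s*\("),
    "superglobal read": re.compile(r"\$_(GET|POST|REQUEST)\b"),
}


def check_security_patterns(code: str) -> CheckResult:
    """Flag the code if any of the illustrative risky patterns appears."""
    hits = [label for label, pattern in RISKY_PATTERNS.items() if pattern.search(code)]
    return CheckResult("security_patterns", not hits, ", ".join(hits))


def grade(code: str, checks: List[Callable[[str], CheckResult]]) -> Dict[str, object]:
    """Run every check; a skipped check (passed is None) does not fail the run."""
    results = [check(code) for check in checks]
    return {
        "passed": all(r.passed is not False for r in results),
        "results": results,
    }
```

Calling `grade(source, [check_php_lint, check_security_patterns])` returns a dictionary with an overall `passed` flag and the per-check results. Passing the checks in as a list keeps the sketch independent of any particular evals.json schema, which is still changing.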

Pattern: The Eval Axis vs. the Policy Axis

These are two separate concerns that are often conflated:

Axis	What it measures	Where it lives
Policy	Disclosure, responsibility, conduct — whether AI use was acceptable	Adopted policy doc; enforced via credit/abuse rules
Quality / Evals	Whether AI-generated code follows Drupal best practices — phpcs passes, correct APIs, security patterns	`ai_best_practices` `evals.json` + grader

Passing the policy axis (disclosing correctly) does not guarantee passing the quality axis. Both matter. The policy is enforced today; the eval infrastructure is being built.
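
The separation of the two axes can be sketched as two independent predicates that must both hold. This is a simplified illustration, not an encoding of the adopted policy: the `Submission` type and its field names are hypothetical, and the real policy covers more than these flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    used_ai: bool          # was AI involved in producing the change?
    ai_disclosed: bool     # policy axis: was that AI use disclosed?
    author_reviewed: bool  # policy axis: does a human take responsibility?
    evals_passed: bool     # quality axis: did the grader checks pass?


def policy_ok(s: Submission) -> bool:
    """Policy axis: disclosure (when AI was used) and human responsibility."""
    return (not s.used_ai or s.ai_disclosed) and s.author_reviewed


def quality_ok(s: Submission) -> bool:
    """Quality axis: the eval grader result, independent of disclosure."""
    return s.evals_passed


def ready_for_review(s: Submission) -> bool:
    """Neither axis substitutes for the other; both must pass."""
    return policy_ok(s) and quality_ok(s)
```

A correctly disclosed submission with failing evals is rejected by `ready_for_review`, and so is an undisclosed AI-assisted submission whose code passes every check.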

Pattern: What to Track, What to Ignore

Track (check quarterly):
- ai_best_practices releases: when the MVP ships, update Drupal AI Policy and this guide to point to it
- Drupal Eval Commons #3586445: once the registry design is finalized, evals can become a usable contribution gate
- Governance issue #3565917: if it is un-postponed and adopted, it may add to or modify the current policy

Ignore / do not adopt:
- promptfoo: dead (March 2026 acquisition by OpenAI)
- Any eval registry or evals.json schema as a hard dependency: both are [DRAFT] and subject to breaking change
- Unconfirmed Eval Commons work items: #3586445 could not be independently confirmed as of 2026-05-21
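
One way to avoid a hard dependency on a draft schema is to read eval files defensively and treat anything unexpected as "no evals available". The sketch below assumes two hypothetical shapes (a top-level list, or an object with an `evals` list); neither is confirmed as the project's schema, and `load_eval_cases` is an illustrative name.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_eval_cases(path: str) -> List[Dict[str, Any]]:
    """Return eval cases from a JSON file, or [] if it is missing or unexpected."""
    file = Path(path)
    if not file.is_file():
        return []
    try:
        data = json.loads(file.read_text(encoding="utf-8"))
    except (ValueError, OSError):
        # Invalid JSON, a decoding error, or an unreadable file.
        return []
    # Hypothetical shapes: a top-level list, or an object with an "evals" list.
    if isinstance(data, dict):
        data = data.get("evals", [])
    if not isinstance(data, list):
        return []
    # Keep only object-like entries; ignore anything we do not recognize.
    return [case for case in data if isinstance(case, dict)]
```

Because the loader never raises on a schema change, a workflow built on it degrades to running no evals instead of breaking when the draft format moves.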

Common Mistakes

Wrong: Treating ai_best_practices as a stable reference → Right: It is pre-MVP. Skill file content and the evals.json schema are actively changing. Reference the project by URL rather than pinning to its internal structure.
Wrong: Using promptfoo → Right: It is no longer maintained. Do not adopt it for any Drupal eval workflow.
Wrong: Conflating the policy gate with the quality gate → Right: Disclosing AI use correctly does not mean the AI-generated code passes Drupal's quality bar. Both checks are needed.
Wrong: Treating a passing eval as a policy substitute → Right: The eval grader checks code quality; it does not verify that you disclosed AI use appropriately or understood the issue before submitting.