AI Test Generation — Overview
When to Use
Use AI test generation when the site exists, its behavior is defined, and you need coverage you don't have time to write by hand. Do not use it when intent is unclear: the AI will encode whatever behavior it finds, and that behavior may be the bug.
Decision
| If you're... | Do this |
|---|---|
| Backfilling test coverage on an existing site | Use AI generation per flow, review each plan, commit |
| Doing TDD on a net-new feature | Write the plan yourself first; let AI generate the code from your plan |
| Reproducing a bug for a regression test | Describe the bug to the Planner; review the plan; generate |
| Maintaining a stable suite while the UI changes | Let the Healer fix locators; review every change |
| Writing a test for behavior you can't articulate yet | Don't generate — figure out the intent first |
Pattern
Intent (user story / code / prompt)
↓
Planner agent
↓
Markdown test plan ← HUMAN REVIEWS HERE
↓
Generator agent
↓
Playwright code ← HUMAN REVIEWS HERE
↓
Runs in CI
↓ (when locators drift)
Healer agent
↓
Patched Playwright code ← HUMAN REVIEWS HERE
There are three review gates: the Markdown test plan, the generated Playwright code, and every Healer patch. Skipping any of them defeats the workflow.
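One way to picture the gates is as a pipeline that refuses to hand an artifact to the next agent until a human has signed off on it. The sketch below is a minimal illustration under that assumption, not how any agent tooling is implemented: `Stage`, `Artifact`, `produce`, `approve`, and `advance` are hypothetical names, and the `agent` callback is a stand-in for a real Generator or Healer call.

```typescript
// Hypothetical sketch: none of these names come from Playwright or its agents.

type Stage = "plan" | "code" | "patch";

interface Artifact {
  stage: Stage;
  content: string;
  approvedBy: string | null; // null until a human reviewer signs off
}

// What each approved artifact may turn into next.
const NEXT_STAGE: Record<Stage, Stage> = {
  plan: "code",   // an approved plan feeds the Generator
  code: "patch",  // approved code may later be patched by the Healer
  patch: "patch", // a patched test can drift and be healed again
};

// Every agent output starts out unreviewed.
function produce(stage: Stage, content: string): Artifact {
  return { stage, content, approvedBy: null };
}

// Records the human sign-off without mutating the original artifact.
function approve(artifact: Artifact, reviewer: string): Artifact {
  return { ...artifact, approvedBy: reviewer };
}

// Runs the next agent only if the input passed its review gate.
function advance(input: Artifact, agent: (content: string) => string): Artifact {
  if (input.approvedBy === null) {
    throw new Error(`Review gate: ${input.stage} has not been approved`);
  }
  return produce(NEXT_STAGE[input.stage], agent(input.content));
}

// Usage: the plan must be approved before code is generated from it.
const draftPlan = produce("plan", "1. Open login page\n2. Submit valid credentials");
const generated = advance(
  approve(draftPlan, "reviewer"),
  (plan) => `// generated from plan:\n${plan}`, // stand-in for the Generator
);
```

Note that the output of `advance` is itself unapproved, so generated code and Healer patches each need their own sign-off before anything builds on them.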
Common Mistakes
- Wrong: Generating code directly from a prompt without the plan stage → Right: plan first, every time — it's the only review gate a non-developer can use
- Wrong: Treating AI-generated tests as final at commit time → Right: review every output as if a junior engineer wrote it
- Wrong: Using AI generation when you can't articulate what "correct" looks like → Right: define intent first; then generate