Baseline Update Workflow

When to Use

Use this guide after an intentional UI change that is expected to produce visual diffs. Unreviewed bulk updates and baseline-only commits are the primary ways visual regression (VR) programs fail.

Decision

Model | When
Inline: UI change + baseline updates in the same PR | Small, targeted changes (≤5 baseline files)
Distinct VR-baseline commits: UI change in commit A; "chore(vr): update baselines for button restyle" in commit B | Larger refactors; easier revert
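The ≤5-file threshold above can be checked mechanically. The following is a minimal sketch, not part of Playwright: the function names are hypothetical, and it assumes baselines are PNG files stored under Playwright's default "*-snapshots/" directories. Adjust the pattern if your project configures a different snapshot location.

```shell
# Assumption: baselines are PNGs under "*-snapshots/" directories.
# Reads file paths on stdin and prints how many look like baseline images.
count_baselines() {
  # grep -c prints 0 and exits non-zero when nothing matches; ignore that status.
  grep -cE -- '-snapshots/.*\.png$' || true
}

# Reads file paths on stdin and suggests a model using the 5-file threshold.
suggest_model() {
  local n
  n=$(count_baselines)
  if [ "$n" -le 5 ]; then
    echo "inline: $n baseline file(s)"
  else
    echo "separate baseline commit: $n baseline file(s)"
  fi
}
```

A possible invocation, assuming the base branch is called main: `git diff --name-only main...HEAD | suggest_model`.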

Pattern

Scoped local update:

# Always scope with --grep to the affected tests
npx playwright test --update-snapshots --grep "button"

Review checklist using the HTML report:

npx playwright show-report

For each failed assertion, check:

1. Scrub the slider: is the diff in the region the PR claims to change?
2. Are there incidental diffs in other regions? If yes, investigate; this is the regression-finding moment.
3. Are diffs cross-viewport consistent? A change affecting only desktop but not mobile suggests a media-query bug.
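The cross-viewport check can be pre-screened from the list of changed baselines before opening the report. This is a rough sketch under loud assumptions: the two project names "desktop" and "mobile" are hypothetical, baseline files are assumed to be named `<name>-<project>-<platform>.png`, and paths are assumed to contain no spaces. It is a proxy only and does not replace scrubbing the diff.

```shell
# Reads changed baseline paths on stdin. Prints every snapshot whose baseline
# changed in only one of the two hypothetical projects ("desktop", "mobile").
viewport_mismatches() {
  # Split each path into "<snapshot> <project>", dropping the platform suffix.
  sed -nE 's#^(.*)-(desktop|mobile)-[^-/]+\.png$#\1 \2#p' |
    sort -u |
    awk '{ seen[$1] = seen[$1] " " $2; count[$1]++ }
         END { for (k in count) if (count[k] == 1) print k ":" seen[k] }' |
    sort
}
```

A snapshot listed here changed on one viewport but not the other, which is worth a closer look in the report; an empty result only means both viewports changed, not that they changed correctly.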

The atomic baseline-update rule:

Never update a baseline without a corresponding code change that explains why, either in the same commit or in the commit immediately before it in the same PR.

A baseline-only commit (update baselines) with no corresponding source change is, definitionally, accepting a regression you don't understand.
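A guard for this rule can be sketched as a small shell function that inspects a list of changed paths, taken from one commit or from a whole PR range. The function name is hypothetical, and the "*-snapshots/*.png" pattern assumes Playwright's default snapshot layout.

```shell
# Reads changed paths on stdin.
# Returns 0 when every changed path is a baseline image (a baseline-only change),
# and non-zero when the list is empty or includes at least one other file.
is_baseline_only() {
  local total=0 baselines=0 path
  while IFS= read -r path; do
    [ -n "$path" ] || continue          # skip blank lines
    total=$((total + 1))
    case "$path" in
      *-snapshots/*.png) baselines=$((baselines + 1)) ;;
    esac
  done
  [ "$total" -gt 0 ] && [ "$total" -eq "$baselines" ]
}
```

For example, `git diff-tree --no-commit-id --name-only -r HEAD | is_baseline_only && echo "baseline-only commit: explain the source change"` flags the latest commit. Feeding it a PR-wide range instead (for instance `git diff --name-only main...HEAD`, with main as an assumed base branch) checks that the PR as a whole carries a source change.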

Common Mistakes

  • Wrong: bulk --update-snapshots as the default response to red CI → Why it fails: accepts regressions without review
  • Wrong: updating baselines in CI on a separate "fix" commit with no explanation → Why it fails: same problem, scaled
  • Wrong: reviewing PNG diffs only in GitHub's UI without scrubbing the Playwright report → Why it fails: misses sub-pixel shifts

See Also