GOV 05 TESTING

This page embeds the canonical rule text and adds commentary after each section to explain why the section exists.

Governance: Testing

Commentary: Ensures implementation is validated by evidence, not assumptions.

Core Testing Principle

Tests are proof of claims, not ceremony.

Each important claim should map to verifiable evidence.

Commentary: Anchors everything that follows in a single principle, so each later expectation traces back to proving claims with evidence.

Test Design Standards

  • prefer behavior-focused tests over implementation-detail tests
  • include success and failure paths
  • include edge cases for high-risk logic
  • keep tests deterministic and repeatable
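
As a rough illustration of these standards, the sketch below assumes pytest as the runner and a hypothetical parse_price function; both the function and its contract are invented for the example.

    import pytest

    from pricing import parse_price  # hypothetical module under test

    def test_parse_price_returns_cents_for_valid_input():
        # Behavior-focused: asserts on the observable result,
        # not on which internal helpers were called.
        assert parse_price("$12.50") == 1250

    def test_parse_price_rejects_malformed_input():
        # Failure path covered alongside the success path; fixed inputs
        # keep the test deterministic and repeatable.
        with pytest.raises(ValueError):
            parse_price("twelve dollars")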

Commentary: Sets design standards so tests prove behavior reliably instead of restating implementation details.

Testing Layers (use what fits scope)

  • Unit: isolated logic correctness
  • Integration: component boundaries and contracts
  • End-to-end: user-critical workflows
  • Regression: prevent behavior drift

Not every change needs every layer, but critical paths must be covered.
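
One way to make the layers selectable is to register a pytest marker per layer, so targeted runs can pick the layer that fits the scope of a change (for example, pytest -m unit during iteration). The marker names below are illustrative, not mandated by this rule.

    # pytest.ini: one marker per testing layer
    [pytest]
    markers =
        unit: isolated logic correctness
        integration: component boundaries and contracts
        e2e: user-critical workflows
        regression: prevent behavior drift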

Commentary: Defines the layers so evidence can be gathered at the cheapest level that still proves the claim, while critical paths stay covered.

Unit-Test Expectations

Unit tests are expected when the scoped behavior can be proved meaningfully at the isolated-logic layer, especially for:

  • pure functions, transforms, selectors, parsers, validators, and mappers
  • branching business logic with clear input/output behavior
  • bug fixes where the failure can be reproduced below the integration/UI layer
  • boundary conditions and edge cases in deterministic code paths

Unit tests are not sufficient by themselves when the governed claim depends on:

  • persistence, refresh, sync, or deletion behavior
  • cross-component or cross-service contracts
  • user-visible workflows, permissions, or navigation
  • environment/runtime wiring, startup, deployment, or release-readiness behavior

When unit tests are the right layer, they should be preferred over slower broad tests for proving isolated logic. When the claim extends beyond isolated logic, pair unit coverage with the higher-layer evidence the claim actually needs.
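
For instance, a pure validator is a natural unit-layer claim. The validate_username function below, and its 3-to-32-character rule, are hypothetical; the point is that boundary conditions in deterministic code are proved at this layer.

    import pytest

    from validators import validate_username  # hypothetical pure validator

    @pytest.mark.parametrize("name, expected", [
        ("ab", False),      # boundary: one below the assumed 3-char minimum
        ("abc", True),      # boundary: exactly the minimum
        ("a" * 32, True),   # boundary: exactly the assumed 32-char maximum
        ("a" * 33, False),  # boundary: one above the maximum
        ("", False),        # edge case: empty input
    ])
    def test_validate_username_length_boundaries(name, expected):
        # Deterministic, isolated logic: the right claim to prove at the
        # unit layer, per the expectations above.
        assert validate_username(name) == expected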

Commentary: Spells out when unit tests are the right proving layer and when they must be paired with higher-layer evidence.

Test-to-Intent Traceability

For governed delivery, link tests back to at least one of:

  • OpenSpec requirement IDs
  • acceptance criteria
  • explicit behavior statements

OpenSpec-first rule:

  • if required behavior is not represented in OpenSpec, add/update requirement IDs before marking coverage complete.
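
One lightweight way to record the link is a custom pytest marker that carries the requirement ID; the marker name and ID format below are illustrative.

    import pytest

    # "requirement" is a custom marker (register it in pytest.ini so pytest
    # does not warn about an unknown mark); the ID is a made-up example.
    @pytest.mark.requirement("OS-REQ-042")
    def test_settled_order_cannot_be_refunded():
        """Acceptance criterion: refunds are rejected once an order settles."""
        ...

Tagging tests this way lets a report enumerate which requirement IDs actually have coverage before coverage is marked complete.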

Commentary: Ties every test back to a stated requirement so coverage claims can be audited rather than assumed.

Test Execution Expectations

For each meaningful test run, explicitly consider:

  • the exact claim / requirement / acceptance criterion under test
  • which scenario classes were exercised
  • which scenario classes were blocked, deferred, or not applicable
  • what kind of evidence was produced
  • whether the evidence directly proves the claim or only a proxy
  • what remains unverified and what follow-up artifact was created
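
A minimal execution record answering these questions might look like the following sketch; every field name and value is illustrative.

    execution_record = {
        "claim": "OS-REQ-042: refunds are rejected once an order settles",
        "scenarios_exercised": ["happy path", "invalid input", "error state"],
        "scenarios_deferred": {"role/permission variation": "follow-up filed"},
        "evidence": "pytest output plus post-action database state",
        "proof_strength": "direct",
        "unverified": "sync behavior after the rejected refund",
        "follow_up": "task created for permission-variant coverage",
    }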

Commentary: Makes each test run account for what it actually proved, so verification gaps stay visible instead of being silently absorbed.

Scenario Coverage Expectations

Consider these scenario classes where relevant:

  • happy path
  • invalid input
  • empty state
  • loading state
  • error state
  • cancel/back-out path
  • persistence/refresh behavior
  • downstream side effects
  • role/permission variation
  • keyboard/accessibility path

Not every change needs every scenario class, but the execution record should make coverage boundaries visible.
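
As a sketch, several of these scenario classes can sit side by side for a single behavior. The search function and its contract here are hypothetical.

    import pytest

    from catalog import search  # hypothetical function under test

    def test_search_happy_path():
        assert search("widget") != []

    def test_search_empty_state():
        # Empty state: no matches should yield an empty list, not an error.
        assert search("term-with-no-matches") == []

    def test_search_invalid_input():
        # Invalid input: assumed to raise rather than return junk.
        with pytest.raises(ValueError):
            search("")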

Commentary: Names the scenario classes so contributors and agents declare coverage boundaries consistently.

Result Classification

Meaningful test outcomes should be classifiable as:

  • Verified
  • Invalidated
  • Blocked
  • Deferred
  • Not applicable

This helps prevent weak “green enough” reporting.
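
Modeling the classification explicitly, for example as an enum, keeps run records from drifting back into vague wording. This sketch simply mirrors the list above.

    from enum import Enum

    class TestOutcome(Enum):
        VERIFIED = "verified"              # evidence directly proves the claim
        INVALIDATED = "invalidated"        # evidence disproves the claim
        BLOCKED = "blocked"                # could not run; failing output captured
        DEFERRED = "deferred"              # intentionally postponed, with follow-up
        NOT_APPLICABLE = "not applicable"  # scenario class does not apply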

Commentary: Forces outcomes into explicit categories so weak “green enough” reporting has nowhere to hide.

Proof Strength

Evidence should be judged honestly:

  • direct proof — clearly proves the intended behavior
  • partial proof — proves only part of the claim
  • surrogate-only proof — proves a proxy, not the actual outcome

Passing build output, a success toast, or a 200 response is not always direct proof of the intended behavior.
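
To make the grading concrete with an invented example, consider the claim "the export job writes a CSV to object storage":

    # Grading three pieces of evidence for that one (hypothetical) claim.
    evidence_strength = {
        "202 Accepted from the trigger endpoint": "surrogate-only",
        "job log line saying the export finished": "partial",
        "CSV read back from storage with expected rows": "direct",
    }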

Commentary: Requires honest grading of evidence so surrogate signals are never reported as direct proof.

Persistence and Post-Action Proof

When work changes persisted, synced, or otherwise durable state, including deletions, verification should check the post-action reality where relevant:

  • refresh/reload state
  • source-of-truth state
  • downstream consumer state
  • deletion/removal persistence
  • sync behavior after the action

UI-only success must not be treated as sufficient proof for mutation-heavy work.
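
A sketch of post-action verification for a deletion, assuming hypothetical client and db test fixtures:

    def test_delete_persists_beyond_the_ui(client, db):
        # client and db are assumed fixtures; all names are illustrative.
        item_id = db.insert_item(name="obsolete")

        response = client.delete(f"/items/{item_id}")
        assert response.status_code == 200  # surrogate proof on its own

        # Source-of-truth state: the record is really gone.
        assert db.get_item(item_id) is None

        # Refresh/reload state: a fresh read agrees with the deletion.
        assert client.get(f"/items/{item_id}").status_code == 404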

Commentary: Prevents UI-only success from standing in for proof of durable state changes.

Execution Expectations

  • run targeted tests during iteration
  • run appropriate regression checks before completion
  • add or update unit tests when isolated logic is changed and unit coverage is the right proving layer
  • do not treat unit tests as sufficient when the governed claim depends on higher-layer behavior
  • capture failing output when blocked
  • avoid marking complete without evidence
  • generated tests only count when they actually prove the intended behavior or requirement
  • passing tests are evidence only when the asserted behavior matches the governed claim
  • if meaningful coverage is missing, create follow-up work instead of silently treating the gap as acceptable

Commentary: Turns the rule into an operational checklist so contributors and agents apply it consistently.

Test Quality Anti-Patterns

  • flaky tests accepted as “good enough”
  • broad snapshot assertions without intent
  • coverage metrics used without behavioral relevance
  • tests that pass but do not prove the requirement
  • partial coverage reported as full validation
  • surrogate-only proof reported as direct proof
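
The last two anti-patterns are easiest to see side by side. Here make_order and apply_discount are hypothetical, and the "10% off orders over $100" rule is invented for the example.

    from orders import apply_discount, make_order  # hypothetical helpers

    def test_discount_weak():
        # Anti-pattern: passes, but only proves a proxy, not the claim
        # "orders over $100 get a 10% discount".
        order = make_order(total=150)
        apply_discount(order)
        assert order is not None

    def test_discount_applies_above_threshold():
        # Direct proof: asserts the governed behavior itself.
        order = make_order(total=150)
        apply_discount(order)
        assert order.total == 135  # 150 minus 10%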

Commentary: Names the common failure modes so they can be rejected in review instead of rationalized.