From Vibe Coding to Governed Delivery

May 7, 2026 · 10 min read

Governance Foundation

AI coding agents are getting good enough that the old question, "Can they write code?", is becoming less interesting.

The harder question is whether they can participate in a real delivery system without turning the repo into a mess.

Once agents can read issues, modify files, run tests, create branches, and merge work, the risk changes. The problem is no longer capability. The problem is control.

More agents do not automatically create more delivery. Without an operating model, they create duplicated work, unclear ownership, long-lived branches, hidden feature flags, broken integration, and a growing gap between what the system appears to be doing and what is actually safe to ship.

That is the problem VibeGov is designed to address.

The mistake is treating agents like clever freelancers

A repo does not need a crowd of clever freelancers.

It needs a governed delivery system.

In many AI-assisted workflows, each agent is given a task, a prompt, and access to the repo. That can work for a small change. It does not scale into reliable delivery.

The moment multiple agents are involved, the system needs answers to basic governance questions:

Who decides what the issue means?
Who decides whether the issue is ready to build?
Who owns the architecture boundary?
Who owns delivery into the integration branch?
Who owns the user experience and design-system contract?
Who verifies the outcome independently?
Who watches for stale work, broken state, and follow-through?
Who is allowed to block unsafe change?

If those answers are not explicit, agents will fill the gaps with assumptions.

And assumptions are where delivery drift begins.

Prompts are not governance

Agent instructions matter, but prompts alone are not enough.

A prompt can say:

Do not expand scope.

But the delivery system still needs a place where scope is defined, reviewed, and enforced.

A prompt can say:

Keep the repo clean.

But the workflow still needs branch rules, validation gates, issue evidence, and a clear definition of done.

A prompt can say:

Follow the architecture.

But the project still needs someone or something accountable for defining that architecture, maintaining ADRs, and deciding when a change crosses a boundary.

VibeGov starts from a simple assumption:

Agents should be autonomous inside clear boundaries, not free outside accountability.

The issue is the work contract

In AI-assisted delivery, the issue becomes more important, not less.

A weak issue gives the agent room to guess. A strong issue gives the agent a contract to execute.

That contract should define:

the intended outcome
why it matters
scope and non-goals
OpenSpec binding or SPEC_GAP
acceptance criteria
verification expectations
risk level
any required research, exploration, design, security, or architecture input

This is why a one-line issue should not move straight into development.

Fast capture is fine. Fast execution from unclear intent is not.

The work can start as:

Fix login weirdness.

But it should not reach implementation until the issue explains what is weird, what correct behaviour looks like, how it binds to the spec, and how the result will be verified.

Intake can be loose. Execution should not be.
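
As a minimal sketch of that gate, assume a hypothetical IssueContract record whose field names mirror the contract list above. None of these names come from VibeGov itself. A readiness check could simply report which parts of the contract are still empty:

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IssueContract:
    """Hypothetical issue record; field names are illustrative assumptions."""
    title: str
    intended_outcome: str = ""
    why_it_matters: str = ""
    scope: str = ""
    non_goals: str = ""
    spec_binding: str = ""  # an OpenSpec reference, or the literal "SPEC_GAP"
    acceptance_criteria: list[str] = field(default_factory=list)
    verification: str = ""
    risk_level: str = ""


REQUIRED_FIELDS = (
    "intended_outcome", "why_it_matters", "scope", "non_goals",
    "spec_binding", "acceptance_criteria", "verification", "risk_level",
)


def missing_contract_fields(issue: IssueContract) -> list[str]:
    """Return the names of contract fields that are still empty."""
    return [name for name in REQUIRED_FIELDS if not getattr(issue, name)]


def is_buildable(issue: IssueContract) -> bool:
    """An issue may reach implementation only when nothing is missing."""
    return not missing_contract_fields(issue)


# Fast capture is allowed: the record exists, but it is not buildable yet.
captured = IssueContract(title="Fix login weirdness.")
print(is_buildable(captured))  # False
print(missing_contract_fields(captured))
```

The one-line issue can still be captured. It just cannot pass the gate until the contract is filled in.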

The board is the operating system

The project board is not just a reporting tool. It is the operational state machine.

A simple board is enough:

No status
Backlog
Ready
In Progress - In Dev
In Review - In Test
Done
Blocked
Parking Lot

The important part is not the labels. It is what they mean.

Ready means the issue is buildable and releasable.

In Progress - In Dev means the Developer agent is actively delivering it.

In Review - In Test means the change is being validated through automation, verifier activity, or release confidence checks.

Done means the work has landed cleanly and the integration branch is healthy.

Blocked means progress needs an explicit unblocker, not silent waiting.

Parking Lot means the idea is acknowledged but intentionally outside the current path.

This gives agents a shared operating surface. They do not need to invent side queues, hidden TODOs, or chat-based promises.

The board is where state lives.
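
To make the state-machine idea concrete, here is a small sketch. The column names come from the board above. The transition table is an assumption for illustration only, since which moves are legal is a policy each team would set for itself:

```python
from enum import Enum


class Status(Enum):
    NO_STATUS = "No status"
    BACKLOG = "Backlog"
    READY = "Ready"
    IN_DEV = "In Progress - In Dev"
    IN_TEST = "In Review - In Test"
    DONE = "Done"
    BLOCKED = "Blocked"
    PARKING_LOT = "Parking Lot"


# Assumed transition policy (hypothetical); adjust to the team's own rules.
ALLOWED = {
    Status.NO_STATUS: {Status.BACKLOG, Status.PARKING_LOT},
    Status.BACKLOG: {Status.READY, Status.PARKING_LOT, Status.BLOCKED},
    Status.READY: {Status.IN_DEV, Status.BACKLOG, Status.BLOCKED},
    Status.IN_DEV: {Status.IN_TEST, Status.BLOCKED},
    Status.IN_TEST: {Status.DONE, Status.IN_DEV, Status.BLOCKED},
    Status.BLOCKED: {Status.BACKLOG, Status.READY, Status.IN_DEV, Status.IN_TEST},
    Status.PARKING_LOT: {Status.BACKLOG},
    Status.DONE: set(),
}


def move(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the board policy forbids the move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"Illegal move: {current.value} -> {target.value}")
    return target


status = move(Status.READY, Status.IN_DEV)
print(status.value)  # In Progress - In Dev
```

Under this assumed policy an issue cannot jump from Backlog to Done, so skipping the Ready gate fails loudly instead of passing unnoticed.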

Ready means releasable

One of the most important rules in an agent delivery system is this:

Ready means releasable.

An issue should not enter Ready unless the work can safely land on the integration branch and move toward release.

That does not mean every issue must deliver a large user-facing feature. It means the increment should be coherent, integrated, and safe.

Bad ready work looks like:

build half a feature and hide it
create a parallel implementation path
start a migration with no cutover plan
add a feature toggle with no owner or removal condition
implement speculative code for a future product decision

Good ready work looks like:

deliver a complete behaviour change
add a tested internal capability with a clear future use
implement a paid feature as an explicit entitlement
add an operational toggle with defined enabled and disabled behaviour
create a migration step that leaves the system stable

Agents move quickly. That makes issue slicing more important.

If the work is not safe to land, it is not ready for Dev.

Done means green integration state

Code written is not done.

Tests passing locally is not done.

A branch that looks good is not done.

Done means the work has made it to the integration branch and that integration state is still green.

This matters because agent delivery can create a false sense of progress. The agent can produce code, explain the change, and sound confident. But until the work is integrated, validated, and traceable to the issue, it has not improved the product.

The Developer agent should own the path from ready issue to green integration state:

start from a clean integration branch
implement the issue
update tests, docs, and config where required
validate locally
refresh from the current integration branch
integrate the change according to repo policy
watch automation
fix immediately if the pipeline fails
close the issue only when evidence is complete

This is not bureaucracy. It is delivery closure.
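
The closure rule can be sketched as a check that lists what still blocks Done. The DeliveryEvidence record and its field names are hypothetical; in practice these facts would come from the repo host and the pipeline:

```python
from dataclasses import dataclass


@dataclass
class DeliveryEvidence:
    """Hypothetical evidence record for one issue; field names are assumptions."""
    issue_id: str
    merged_to_integration: bool
    integration_pipeline_green: bool
    evidence_linked_on_issue: bool


def done_blockers(evidence: DeliveryEvidence) -> list:
    """List the reasons an issue cannot be closed yet (empty means Done)."""
    blockers = []
    if not evidence.merged_to_integration:
        blockers.append("change has not landed on the integration branch")
    if not evidence.integration_pipeline_green:
        blockers.append("integration pipeline is not green")
    if not evidence.evidence_linked_on_issue:
        blockers.append("evidence is not recorded on the issue")
    return blockers


# Code that exists only on a local branch is not done, however good it looks.
local_only = DeliveryEvidence("ISSUE-1", False, False, False)
landed = DeliveryEvidence("ISSUE-1", True, True, True)
print(done_blockers(local_only))
print(done_blockers(landed))  # []
```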

No wild forks

Branches are useful as temporary implementation workspaces.

They are not product states.

Long-lived branches, hidden futures, and parallel product lines create exactly the kind of ambiguity AI delivery should avoid.

The rule should be blunt:

All development must converge.

If a feature is worth building, it should be shaped into a releasable increment. If it is not ready to be released, it should remain in Backlog, Parking Lot, research, design, or architecture analysis.

Do not let the repo become a museum of abandoned futures.

Feature toggles are configuration, not hiding places

Feature toggles are not bad.

Undisciplined toggles are bad.

A feature toggle should be an explicit product, operational, or release control. It should not be a way to merge unfinished code and decide later what it means.

Good toggle use includes:

paid feature entitlement
tenant or customer-specific enablement
environment-specific behaviour
staged rollout
operational kill switch
time-bound experiment

For every toggle, define:

name
purpose
owner
configuration location
default state
enabled behaviour
disabled behaviour
tests for both states
removal condition if temporary

The key rule is simple:

No feature should require code edits to enable after development.

If a feature is optional, paid, staged, or tenant-specific, build it that way from the start.

Toggles are configuration and product controls, not hiding places for incomplete work.
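
One way to hold toggles to that checklist is to make the definition a record that can be validated, and to resolve state only from configuration. This is a sketch under assumptions: ToggleDefinition, its fields, and the export_csv example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ToggleDefinition:
    """Hypothetical toggle record; fields mirror the checklist above."""
    name: str
    purpose: str
    owner: str
    config_location: str
    default_enabled: bool
    enabled_behaviour: str
    disabled_behaviour: str
    temporary: bool = False
    removal_condition: str = ""


def toggle_problems(toggle: ToggleDefinition) -> list:
    """Return what is missing from a toggle definition."""
    problems = []
    for attr in ("name", "purpose", "owner", "config_location",
                 "enabled_behaviour", "disabled_behaviour"):
        if not getattr(toggle, attr):
            problems.append(f"missing {attr}")
    if toggle.temporary and not toggle.removal_condition:
        problems.append("temporary toggle has no removal condition")
    return problems


def is_enabled(toggle: ToggleDefinition, config: Mapping[str, bool]) -> bool:
    """Resolve state from configuration, never from a code edit."""
    return config.get(toggle.name, toggle.default_enabled)


export_csv = ToggleDefinition(
    name="export_csv", purpose="paid feature entitlement", owner="billing team",
    config_location="tenant settings", default_enabled=False,
    enabled_behaviour="export button is shown",
    disabled_behaviour="export button is replaced by an upgrade prompt",
)
print(toggle_problems(export_csv))                   # []
print(is_enabled(export_csv, {"export_csv": True}))  # True
print(is_enabled(export_csv, {}))                    # False
```

The record does not capture "tests for both states". A test suite would cover that by asserting the behaviour with the toggle on and with it off.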

Separate roles are useful when they create real control

The goal is not to create an agent circus.

Separate roles are useful when they create clearer accountability.

A practical operating model can include:

planner for intake, prioritisation, backlog hygiene, and developer handoff
architect for system design, ADRs, boundaries, migrations, developer-experience architecture, and technical direction
designer for UI/UX intent, Design Language System stewardship, user flows, component states, and accessibility-by-design
developer for issue execution, coding, testing, git hygiene, and integration
researcher for external evidence gathering, source evaluation, and cited synthesis
explorer for repo, UI, and API exploration, evidence capture, finding triage, and spec gaps
verifier for independent QA, regression checks, acceptance evidence, and release confidence
security for threat modelling, secrets, auth, privacy, dependency, licensing, and exposure review
documenter for READMEs, install guides, changelogs, user docs, and public comms
maintainer for repo hygiene, branch closure, changelogs, versioning, and release readiness
operator for recurring sweeps, task/state orchestration, reminders, and follow-through

Not every issue should pass through every role.

That would kill delivery speed.

Instead, route work by need.

Researcher and Explorer feed evidence. Designer shapes experience intent. Security blocks unsafe change. Architect protects direction. Planner protects readiness. Developer ships. Verifier proves. Documenter keeps the written surface aligned. Maintainer keeps release and repo hygiene clean. Operator keeps the system moving.

The model is not many agents doing whatever they want.

It is governed autonomy.
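
Routing by need can be sketched as a lookup from an issue's declared needs to the specialists it must pass through. The need flags below are hypothetical labels, and the fixed planner, developer, verifier spine is an assumption drawn from the role descriptions above rather than a prescribed VibeGov rule:

```python
# Hypothetical need flags mapped to the role that answers them.
NEED_TO_ROLE = {
    "external_evidence": "researcher",
    "repo_exploration": "explorer",
    "ui_change": "designer",
    "auth_or_secrets": "security",
    "boundary_change": "architect",
    "user_docs": "documenter",
}


def route(needs: set) -> list:
    """Return the ordered roles an issue passes through, based on its needs."""
    unknown = needs - set(NEED_TO_ROLE)
    if unknown:
        raise ValueError(f"Unknown needs: {sorted(unknown)}")
    # Specialists are added only when the issue actually needs them.
    specialists = [role for need, role in NEED_TO_ROLE.items() if need in needs]
    return ["planner", *specialists, "developer", "verifier"]


print(route(set()))
# ['planner', 'developer', 'verifier']
print(route({"ui_change", "auth_or_secrets"}))
# ['planner', 'designer', 'security', 'developer', 'verifier']
```

A small copy fix takes the short path. A change that touches auth and the UI picks up Security and Designer on the way, and nothing else.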

Specialists should feed the spec, not bypass it

A clean pattern is:

Raw idea
 ↓
Planner triage
 ↓
Research / exploration / design / security input as needed
 ↓
Architect or Planner creates the build-ready issue
 ↓
Developer delivers
 ↓
Automation and Verifier validate
 ↓
Integration remains green

Specialist work can happen independently of code. A Researcher can answer a question. An Explorer can inspect the repo. A Designer can define the user flow. Security can identify controls.

But those outputs should flow back into the issue or OpenSpec before development starts.

Research and design should not bypass the accountable delivery contract.

Automation proves mechanics; governance preserves meaning

Automation is essential, but it cannot do the whole job.

Automation can prove:

tests pass
build succeeds
lint and type checks pass
secrets are not detected
dependency checks are clean
the pipeline was triggered
an artifact was produced

But automation cannot fully decide:

whether the issue meant the right thing
whether the architecture direction is sound
whether the user experience is coherent
whether the trade-off is acceptable
whether the feature should exist
whether scope was silently expanded
whether the disabled state of a paid feature makes product sense

That is why governance still matters.

Automation is the proof layer. It does not replace accountability.
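
A rough sketch of that split: a ship decision that needs every automated check to pass and also needs a named sign-off for the questions automation cannot answer. The check names, judgment keys, and role names here are illustrative assumptions:

```python
from typing import Dict, Optional

# Hypothetical judgment questions that need an accountable sign-off.
REQUIRED_JUDGMENTS = (
    "intent_matches_issue",
    "scope_unchanged",
    "architecture_sound",
)


def ship_blockers(checks: Dict[str, bool],
                  signoffs: Dict[str, Optional[str]]) -> list:
    """Combine mechanical results with recorded accountability."""
    blockers = [f"automation failed: {name}"
                for name, passed in checks.items() if not passed]
    blockers += [f"missing sign-off: {question}"
                 for question in REQUIRED_JUDGMENTS
                 if not signoffs.get(question)]
    return blockers


checks = {"tests": True, "build": True, "lint": True}
signoffs = {"intent_matches_issue": "verifier", "scope_unchanged": "planner"}
print(ship_blockers(checks, signoffs))
# ['missing sign-off: architecture_sound']
```

A fully green pipeline still leaves one blocker here, because nobody has yet taken responsibility for the architecture question.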

The real unlock is governed autonomy

The next phase of AI software delivery will not be won by giving agents unlimited freedom.

It will be won by teams that can give agents enough autonomy to move fast and enough governance to keep the system coherent.

That means:

issues are treated as execution contracts
OpenSpec captures requirement truth
the project board carries operational state
the integration branch remains the integration truth
the release branch remains the release truth
agents act within role authority
automation validates the mechanics
security and verification provide independent confidence
operators keep the loop moving

Vibe coding showed how quickly software can be produced when humans and AI work fluidly together.

The next step is making that flow reliable enough for serious delivery.

That is the shift from vibe coding to governed delivery.
