Can an AI agent backlog evolve itself?
When a persona calls splitIssue on the parent card, it creates real child cards on the board, and later cycles dispatch agents against those new cards. Backlog growth is real: productive on a well-scoped parent, runaway on an ambiguous one.
The phenomenon, in one paragraph
Drop one ticket on the board: "Add invoice export." Dispatch Feature-Dev autopilot with the product author persona in the roster. Cycle 0 runs the product author, which examines the ticket and decides it is really four things: schema for the export job, a worker that runs the export, a download endpoint, and an audit log entry. The persona calls splitIssue four times. Now the board has one parent plus four children — the agent grew the backlog while you were not watching.
Subsequent cycles pick those children up. If a child is itself under-scoped, the next product author run on it may split it further. This is the self-evolving property — the agent system discovers work and files it, instead of treating your initial ticket as the complete spec. It can be the best thing the orchestrator does. It can also explode.
How splitting actually happens
Inside a Feature-Dev session each persona run is a normal Claude Code or Codex dispatch with access to the agent bridge tools. One of those tools is splitIssue. When a persona invokes it, the dispatcher writes new rows into the local SQLite database (or, in GitHub mode, opens new sub-issues via Octokit) and links them as children of the parent card. The board updates live; you see the new cards land under the parent.
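A minimal in-memory sketch of what a split does to the board, assuming a simple card record with a parent link. The names SketchCard, SketchBoard, addCard, and childrenOf are hypothetical and only illustrate the parent-child linkage; the real dispatcher writes to SQLite or opens GitHub sub-issues instead.

```typescript
// Hypothetical card shape; field names are assumptions, not the KanBots schema.
interface SketchCard {
  id: number;
  title: string;
  parentId: number | null;
  state: "queued" | "running" | "done";
}

class SketchBoard {
  private cards: SketchCard[] = [];
  private nextId: number;

  constructor(firstId: number) {
    this.nextId = firstId;
  }

  addCard(title: string, parentId: number | null): SketchCard {
    // New cards start out queued.
    const card: SketchCard = { id: this.nextId, title, parentId, state: "queued" };
    this.nextId += 1;
    this.cards.push(card);
    return card;
  }

  // Stand-in for the splitIssue tool: one call files one child under the parent.
  splitIssue(parentId: number, childTitle: string): SketchCard {
    if (!this.cards.some((c) => c.id === parentId)) {
      throw new Error(`unknown parent card #${parentId}`);
    }
    return this.addCard(childTitle, parentId);
  }

  childrenOf(parentId: number): SketchCard[] {
    return this.cards.filter((c) => c.parentId === parentId);
  }
}

// The "Add invoice export" ticket, split four times by the product author.
const sketchBoard = new SketchBoard(310);
const parentCard = sketchBoard.addCard("Add invoice export", null);
const childTitles = ["export job schema", "export worker", "download endpoint", "audit log entry"];
for (const title of childTitles) {
  sketchBoard.splitIssue(parentCard.id, title);
}
```

After the loop, childrenOf(parentCard.id) returns the four new cards, each linked to the parent and waiting in the queue.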
Each child gets a status of queued by default. The autopilot orchestrator does not auto-dispatch children — they wait in queue for a slot to claim them on a future cycle. Once a slot is free and the round-robin counter lands on a child, that child runs with the next persona in the rotation.
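The claim step can be sketched roughly as below. This assumes the simplest reading of the behaviour described above: a free slot takes the first queued card, and a single shared counter picks the next persona from the roster. PersonaRotation, WaitingCard, SlotClaim, and claimNext are hypothetical names, and the real orchestrator may order cards differently.

```typescript
type RotationPersona = "product" | "engineer" | "reviewer" | "tester";

interface WaitingCard {
  id: number;
  state: "queued" | "running";
}

interface SlotClaim {
  cardId: number;
  persona: RotationPersona;
}

class PersonaRotation {
  private counter = 0;
  private roster: RotationPersona[];

  constructor(roster: RotationPersona[]) {
    if (roster.length === 0) {
      throw new Error("roster must contain at least one persona");
    }
    this.roster = roster;
  }

  // Called when a slot frees up. Returns null if nothing is waiting.
  claimNext(cards: WaitingCard[]): SlotClaim | null {
    for (const card of cards) {
      if (card.state === "queued") {
        card.state = "running";
        const persona = this.roster[this.counter % this.roster.length];
        this.counter += 1;
        return { cardId: card.id, persona };
      }
    }
    return null;
  }
}

const waiting: WaitingCard[] = [
  { id: 311, state: "queued" },
  { id: 312, state: "queued" },
  { id: 313, state: "queued" },
];
const rotation = new PersonaRotation(["engineer", "tester"]);
const firstClaim = rotation.claimNext(waiting);  // card 311 with the engineer
const secondClaim = rotation.claimNext(waiting); // card 312 with the tester
```

Nothing here dispatches a child at the moment it is created; a card only leaves the queue when a slot asks for work.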
Why this can be productive
Real software work decomposes at write time, not at plan time. If you wrote one ticket on a Monday and reviewed it on Wednesday, you would already have split it three ways yourself. The product author persona is doing the work you would do — it just does it as the implementation begins, which is when ambiguity surfaces.
A well-scoped parent ticket plus a product author persona produces a clean decomposition: 3 to 7 children, each independently implementable. The engineer persona then picks them up one by one, the reviewer reads the diffs, the tester runs the suite. Backlog grew, work happened, backlog shrank again as cards reached done.
Why this can be runaway
An under-scoped parent ticket is fuel. "Improve onboarding" gets split into 8 children: simplify form, fewer fields, better copy, social login, email verification, role selection, welcome screen, sample data. The product author runs again on "simplify form" because it still does not have an acceptance criterion. That spawns another 5 children. Now the board has 1 parent and 13 descendants and the loop has not implemented anything yet.
The economic shape is the part to watch. Each split is cheap — $0.30 to $0.80 of product author tokens. But splits create future runs, and future runs cost engineer time, reviewer time, tester time. A 13-card decomposition at $5 average per card is $65 of downstream spend you authorized by clicking autopilot on the wrong parent.
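The arithmetic is simple enough to write down. In this sketch the $5 average per card comes from the paragraph above, while the $0.50 per split is an illustrative value picked from inside the stated $0.30 to $0.80 range; SplitEstimate and both function names are hypothetical.

```typescript
interface SplitEstimate {
  splitCostUsd: number;      // what the product author spent on splitting
  cardsCreated: number;      // descendants now on the board
  avgCostPerCardUsd: number; // assumed downstream cost per card
}

// Spend that the split has committed you to, if every card gets implemented.
function downstreamSpendUsd(e: SplitEstimate): number {
  return e.cardsCreated * e.avgCostPerCardUsd;
}

// Split cost plus downstream spend.
function totalExposureUsd(e: SplitEstimate): number {
  return e.splitCostUsd + downstreamSpendUsd(e);
}

// "Improve onboarding": two split runs at an assumed $0.50 each, 13 descendants.
const onboardingEstimate: SplitEstimate = {
  splitCostUsd: 2 * 0.5,
  cardsCreated: 13,
  avgCostPerCardUsd: 5,
};

const downstream = downstreamSpendUsd(onboardingEstimate); // 65
const exposure = totalExposureUsd(onboardingEstimate);     // 66
```

The split cost is a rounding error next to the downstream figure, which is why the card count is the number to watch.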
How KanBots bounds the growth
Three guards. None of them is a depth-counter on the tree directly; the bounds are economic and decisional.
- Session budget cap: set on the Autopilot — Feature Dev modal as a USD number. When the accumulated cost across every child run in the session crosses the cap, the orchestrator throws SessionBudgetExceededError and every slot returns. Splits stop because no more cycles run. See cost and budget control for the mechanics.
- Parallelism cap: slots are clamped to MAX_PARALLELISM = 4. Even if the backlog explodes to 50 cards, only four can be running at any moment, so the burn rate has a ceiling.
- The promote step: agents cannot push branches themselves; promote-to-PR is a human action. New child cards are queued, but their commits never reach the remote without you. Runaway splitting can pollute the board but cannot pollute your main history. See the feature-branch workflow.
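The first two guards can be sketched as follows. Only the names SessionBudgetExceededError and MAX_PARALLELISM = 4 come from the text above; the SessionBudget class, clampParallelism, the error message, and the lower bound of one slot are assumptions made for illustration, not the KanBots implementation.

```typescript
const MAX_PARALLELISM = 4;

class SessionBudgetExceededError extends Error {
  constructor(spentUsd: number, capUsd: number) {
    super(`session spent $${spentUsd.toFixed(2)}, cap is $${capUsd.toFixed(2)}`);
    this.name = "SessionBudgetExceededError";
  }
}

// Clamp a requested slot count into the range 1..MAX_PARALLELISM.
// The lower bound of 1 is an assumption.
function clampParallelism(requested: number): number {
  return Math.min(Math.max(1, Math.floor(requested)), MAX_PARALLELISM);
}

class SessionBudget {
  private spentUsd = 0;
  private capUsd: number;

  constructor(capUsd: number) {
    this.capUsd = capUsd;
  }

  // Record a finished run's cost; throw once the accumulated total crosses the cap.
  record(runCostUsd: number): void {
    this.spentUsd += runCostUsd;
    if (this.spentUsd > this.capUsd) {
      throw new SessionBudgetExceededError(this.spentUsd, this.capUsd);
    }
  }

  total(): number {
    return this.spentUsd;
  }
}

const slots = clampParallelism(50); // 4, however large the backlog gets
const sessionBudget = new SessionBudget(20);
sessionBudget.record(0.4); // cycle 0 split, well under the cap
```

Because the check runs on the session total rather than per card, a burst of cheap splits and a single expensive engineer run hit the same ceiling.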
Personas themselves are also a soft bound. If the reviewer persona is in the roster, every implementation run is followed by a reviewer run that can reject the change. A reviewer that says "this child should have been part of the parent, don't ship it separately" causes the engineer to fold the work back into the parent on the next cycle. The board shrinks.
A worked example, both shapes
Productive split. Card #310, "Add CSV export for invoices, downloadable from the customer dashboard, audit log entry on every export." Run Feature-Dev with product, engineer, reviewer, tester. Parallelism 2. Budget $20.
- Cycle 0 (product): splits #310 into three children: schema for the export job (#311), endpoint at GET /invoices/export (#312), audit log row writer (#313). Cost $0.40.
- Cycles 1–4: engineer and tester alternate across the three children. Each child finishes in 2 cycles. Total spend at end of cycle 4: $7.20.
- Cycle 5 (reviewer): reads the parent's accumulated state and approves. Cycle 6 (product): re-runs the split tool, sees no new gaps, returns. Loop ends. Final spend $9.60.
Runaway split. Card #311, "Improve onboarding." Same configuration.
- Cycle 0 (product): splits into 8 children. Cost $0.60.
- Cycle 1 (engineer): picks up child #312, "simplify form," starts implementing without an acceptance criterion, makes guesses.
- Cycle 2 (reviewer): cannot approve because there is no acceptance criterion to check the diff against. Emits a decision event asking for clarification.
- You are at lunch. The slot pauses. The other slot is on cycle 3, running the product author on child #313, which splits that into 4 more children.
- You come back to a board with 12 new cards and $4 spent. Stop the session. Edit the parent ticket to have a real acceptance criterion. Discard the spurious children. Restart with budget $10.
Practical guidance
Three habits that keep self-evolving useful instead of runaway.
Always include an acceptance criterion in the parent ticket. One sentence: "Done when an exported CSV download arrives in the dashboard and an audit log row is written." Personas read this; reviewer rejects implementations that do not satisfy it; product author splits cleanly around it.
Run with a tight cap on first dispatch. $5 to $10 the first time you autopilot a card you have never run. Watch the autopilot panel for the first 5 minutes. If the product author has split into 3 children that look right, raise the cap and let it run. If it has split into 12, stop, fix the parent, restart.
Promote winners before re-dispatching. When a child finishes cleanly, promote it (as commit or draft PR) before letting more autopilot cycles run. This pulls completed work out of the autopilot session's accounting and lets you re-dispatch on a smaller surface.
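The first two habits can be turned into a quick pre-dispatch check. This is a rough heuristic sketch, not a KanBots feature: DispatchPlan and preflightWarnings are hypothetical names, the "Done when" pattern only matches criteria phrased like the example above, and the $10 threshold is the upper end of the suggested first-run range.

```typescript
interface DispatchPlan {
  ticketBody: string;
  budgetCapUsd: number;
  firstRun: boolean; // true if this card has never been autopiloted
}

function preflightWarnings(plan: DispatchPlan): string[] {
  const warnings: string[] = [];
  // Heuristic only: looks for a "Done when ..." sentence in the ticket.
  if (!/\bdone when\b/i.test(plan.ticketBody)) {
    warnings.push("no acceptance criterion found in the parent ticket");
  }
  if (plan.firstRun && plan.budgetCapUsd > 10) {
    warnings.push("first dispatch cap is above $10");
  }
  return warnings;
}

const vaguePlan: DispatchPlan = {
  ticketBody: "Improve onboarding",
  budgetCapUsd: 20,
  firstRun: true,
};
const scopedPlan: DispatchPlan = {
  ticketBody:
    "Add CSV export for invoices. Done when an exported CSV download arrives in the dashboard and an audit log row is written.",
  budgetCapUsd: 5,
  firstRun: true,
};

const vagueWarnings = preflightWarnings(vaguePlan);   // two warnings
const scopedWarnings = preflightWarnings(scopedPlan); // none
```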
Three failure modes
Recursive splitting on the same child. Product author runs on a child, splits it; runs on the new grandchild, splits that. Symptom: a card with a sub-sub-sub tree. Fix: open the persona prompt for product author and add "Split only when the parent ticket lacks an acceptance criterion. If the parent is concrete, return without calling splitIssue."
Duplicate children. Two product author runs in parallel both split the same parent and create overlapping children — "write export endpoint" and "implement /invoices/export" as two cards. Fix: lower parallelism to 1 on the first cycle through the personas, raise it after the split phase has stabilized.
Orphan children after stop. You stop the session mid-loop. Children are on the board with no runs in flight; they sit in queued indefinitely. Fix: either dispatch on them manually, run a fresh autopilot session targeting them, or discard them from the card menu. KanBots does not auto-clean.
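Two of these symptoms are easy to spot mechanically if you can export the card list. The sketch below assumes a flat list of cards with a parent link and a state field; TreeCard, depthOf, and orphanedAfterStop are hypothetical helpers for inspection only and do not correspond to any KanBots API.

```typescript
interface TreeCard {
  id: number;
  parentId: number | null;
  state: "queued" | "running" | "done";
}

// Depth of a card: 0 for a root, 1 for a child, 2 for a grandchild, and so on.
function depthOf(card: TreeCard, cards: TreeCard[]): number {
  const byId: { [id: number]: TreeCard } = {};
  for (const c of cards) {
    byId[c.id] = c;
  }
  let depth = 0;
  let current = card;
  // Bounded by cards.length so a malformed parent cycle cannot loop forever.
  while (current.parentId !== null && depth < cards.length) {
    const next = byId[current.parentId];
    if (next === undefined) {
      break;
    }
    current = next;
    depth += 1;
  }
  return depth;
}

// After a stop, queued children have no runs in flight and nothing will claim them.
function orphanedAfterStop(cards: TreeCard[], sessionRunning: boolean): TreeCard[] {
  if (sessionRunning) {
    return [];
  }
  return cards.filter((c) => c.parentId !== null && c.state === "queued");
}

const treeCards: TreeCard[] = [
  { id: 311, parentId: null, state: "running" },
  { id: 312, parentId: 311, state: "done" },
  { id: 320, parentId: 312, state: "queued" },
  { id: 321, parentId: 320, state: "queued" },
];

const deepest = depthOf(treeCards[3], treeCards);      // 3: a sub-sub-sub card
const orphans = orphanedAfterStop(treeCards, false);   // cards 320 and 321
```

A depth of 3 or more is the recursive-splitting symptom; a non-empty orphan list after a stop is the cue to dispatch manually, start a fresh session, or discard.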
When self-evolving is wrong
It is wrong on production hotfixes. A burning bug needs one fix, not a decomposition. Run a single dispatch with the engineer persona and a tight cap. Self-evolving is also wrong when the parent ticket is itself a research question ("evaluate three approaches and recommend one") — the product author will treat the question as a feature spec and split it into implementation children. Run a one-off dispatch with the engineer persona explicitly told to produce a comparison doc instead.
For the round-robin that drives splits and dispatches in the same session, see multi-persona orchestration; for the outer autopilot loop see autopilot mode.
Related questions
- What is autopilot mode for Claude Code? Autopilot picks personas, parallelism, and budget. It loops until the work converges or the cost cap hits. The mental model and when to use it.
- How does multi-persona AI agent orchestration work? Product author → engineer → reviewer → tester. How round-robin persona cycles produce better output than single-persona loops, and how to configure them.
- How do you put a budget cap on AI coding agents? Per-run cost tracking, per-card rollups, per-autopilot-session caps. Stop runaway spend before it stops you.
- How do AI agents fit a feature-branch workflow? One agent → one branch → one PR, isolated by worktree, with pre-push hooks preventing agent-side pushes. The exact branch naming and promote flow.