How do you spawn multiple coding agents from one machine?

Spawning multiple coding agents on one laptop needs two kinds of isolation: process isolation (real OS children, separate stdio, separate exit codes) and source isolation (separate git worktrees, separate branches). Skip either and the agents race each other; nail both and your machine is the only limit.

The short answer

Any agent CLI — Claude Code, Codex CLI, Aider, Cline-as-CLI — can be spawned with child_process.spawn (or your shell equivalent). That gives you process isolation: each one has its own PID, its own stdin/stdout, its own cost meter. To get source isolation you create a worktree per process. The recipe is the same regardless of which CLI you use, which is exactly why a board-driven supervisor can treat them uniformly.

How it actually works under the hood

From the OS’s perspective, an “AI coding agent” is a long-lived child process that reads a prompt, makes HTTPS calls to a model provider, and edits files in its current working directory. Spawning N of them is no different from spawning N npm test processes; you just need separate directories so they do not fight over the filesystem.

The two CLIs that matter today both stream NDJSON:

  • Claude Code: claude -p "..." --output-format stream-json --verbose --permission-mode bypassPermissions. Stdin delivery for the prompt, NDJSON on stdout, events typed system / assistant / tool_use / tool_result / result.
  • Codex CLI: codex exec --json "..." (with sandbox flags as needed). Different event shapes but the same fundamental wire format: one JSON object per line.

Any supervisor that can spawn a process, pipe its stdout through a line-splitter, and dispatch per-line based on a small adapter, can drive both. That is what the AgentCliAdapter interface in @kanbots/dispatcher is: a command, a buildArgs function, and a parseLine function. Plug in any agent CLI that streams JSON and you get all the supervisor benefits for free.
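As a minimal sketch of that contract, assuming a shape inferred from the description above (the real AgentCliAdapter in @kanbots/dispatcher may differ, and AgentEvent, parseJsonLine and createLineSplitter are hypothetical names), an adapter pair plus a line-splitter could look like this:

```typescript
// Hypothetical shapes, modelled on the prose above; not the published API.
type AgentEvent = Record<string, unknown>;

interface AgentCliAdapter {
  command: string;
  buildArgs(prompt: string): string[];
  parseLine(line: string): AgentEvent | null;
}

// Parse one NDJSON line; blank lines and non-JSON noise yield null.
function parseJsonLine(line: string): AgentEvent | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;
  try {
    const value: unknown = JSON.parse(trimmed);
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
      return null; // only JSON objects count as events
    }
    return value as AgentEvent;
  } catch {
    return null;
  }
}

// Flags taken from the commands quoted above.
const claudeAdapter: AgentCliAdapter = {
  command: "claude",
  buildArgs: (prompt) => ["-p", prompt, "--output-format", "stream-json", "--verbose"],
  parseLine: parseJsonLine,
};

const codexAdapter: AgentCliAdapter = {
  command: "codex",
  buildArgs: (prompt) => ["exec", "--json", prompt],
  parseLine: parseJsonLine,
};

// Stdout arrives in arbitrary chunks, so keep the trailing partial line
// and only emit lines that have been terminated by "\n".
function createLineSplitter(onLine: (line: string) => void): (chunk: string) => void {
  let pending = "";
  return (chunk) => {
    pending += chunk;
    const lines = pending.split("\n");
    pending = lines.pop() ?? "";
    for (const line of lines) onLine(line);
  };
}
```

A supervisor would feed each stdout chunk (decoded to a string) into the splitter and pass every complete line to the adapter's parseLine, dispatching on whatever fields that CLI's events carry.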

The two isolation axes

Process isolation is the easy one. spawn gives you a real OS child with its own PID, its own environment, its own cwd. Send signals to it (SIGTERM on user stop, SIGKILL on hung process), let it exit cleanly, capture its stdout/stderr.
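The SIGTERM-then-SIGKILL escalation can be sketched as follows. stopChild, Stoppable and the five-second grace period are illustrative assumptions; Node's ChildProcess exposes the exitCode, signalCode and kill members used here.

```typescript
// The few members we need from a child process (hypothetical interface).
interface Stoppable {
  exitCode: number | null;   // set when the child exits normally
  signalCode: string | null; // set when the child is killed by a signal
  kill(signal: "SIGTERM" | "SIGKILL"): boolean;
}

// Injectable timer so the escalation can be exercised without waiting.
type Timer = (fn: () => void, ms: number) => unknown;

function hasExited(child: Stoppable): boolean {
  return child.exitCode !== null || child.signalCode !== null;
}

// Ask politely with SIGTERM, then force with SIGKILL if the child is
// still alive once the grace period has elapsed.
function stopChild(child: Stoppable, graceMs = 5000, timer: Timer = setTimeout): void {
  if (hasExited(child)) return;
  child.kill("SIGTERM");
  timer(() => {
    if (!hasExited(child)) child.kill("SIGKILL");
  }, graceMs);
}
```

In a real supervisor the same function would be called on user stop, and a separate watchdog (no output for some time) could call it for hung processes.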

Source isolation is the one people skip and regret. Two agents in one working directory will read each other’s half-written code, both run npm install against the same node_modules, both add imports the other one hasn’t defined yet. The fix is a git worktree per agent (covered in detail at the Claude Code git worktree workflow), but the short form is:

# one worktree per agent, each on its own branch
git worktree add ../wt/a -b agents/a main
git worktree add ../wt/b -b agents/b main
git worktree add ../wt/c -b agents/c main

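# spawn agents into those directories
# (claude's stream-json output in -p mode requires --verbose)
( cd ../wt/a && claude -p "task A" --output-format stream-json --verbose ) &
( cd ../wt/b && codex  exec --json "task B" ) &
( cd ../wt/c && claude -p "task C" --output-format stream-json --verbose ) &
wait  # block until all three agents exit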
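
For a supervisor written in Node, the same recipe might look like the sketch below. AgentSpec, worktreeArgs, launch and waitForExit are hypothetical names, and the layout (one directory per agent under a shared root, branches named agents/<name>) simply mirrors the shell commands above.

```typescript
import { execFileSync, spawn } from "node:child_process";
import * as path from "node:path";

// Hypothetical description of one agent run.
interface AgentSpec {
  name: string;    // used for the worktree directory and branch name
  command: string; // e.g. "claude" or "codex"
  args: string[];  // CLI arguments, including the prompt
}

// Pure helper: the git arguments for one worktree on its own branch.
function worktreeArgs(root: string, name: string, base = "main"): string[] {
  return ["worktree", "add", path.join(root, name), "-b", `agents/${name}`, base];
}

// Create the worktree, then start the agent with that directory as its cwd.
// Run this from inside the main checkout so git finds the repository.
function launch(root: string, spec: AgentSpec) {
  execFileSync("git", worktreeArgs(root, spec.name), { stdio: "inherit" });
  return spawn(spec.command, spec.args, {
    cwd: path.join(root, spec.name),
    stdio: ["ignore", "pipe", "pipe"], // no stdin; capture stdout and stderr
  });
}

// Resolve with the exit code (null if the child was killed by a signal).
function waitForExit(child: ReturnType<typeof launch>): Promise<number | null> {
  return new Promise((resolve, reject) => {
    child.once("error", reject); // e.g. the CLI is not installed
    child.once("exit", (code) => resolve(code));
  });
}
```

Calling launch once per spec and awaiting Promise.all over waitForExit plays the role of the trailing wait in the shell version, with the advantage that each child's stdout stays separate instead of interleaving in one terminal.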

How KanBots does this specifically

KanBots is the supervisor that owns both isolation axes per card. The board is the spawn surface; each card is a potential agent run; each run is a real OS child of the Electron main process running in a real worktree under .kanbots/worktrees/.

The model-agnostic side is the AgentCliAdapter contract. The Claude Code adapter and the Codex adapter implement the same three methods; the dispatcher itself does not know which one it is running. Practical consequence: you can have card #42 dispatched with Claude and card #43 dispatched with Codex on the same board, and they show up in the UI identically: same status dots, same threaded events, same decision prompts. Pick the model per card from the Providers modal.

Cost scales with the number of processes, so KanBots reads the usage field in each result event, aggregates it per card, and rolls it up per autopilot session. Two stop conditions matter:

  • Per-run cap: a single Claude/Codex run can be capped by tokens or dollars; the supervisor kills the process when the cap hits.
  • Per-autopilot-session cap: with parallelism = 4 and a loop running for an hour, dollars accrue fast. The session has a hard budget; when the rolling total crosses it, the supervisor stops dispatching new runs and lets the in-flight ones drain.

Both caps surface in the UI as a stop reason on the autopilot session row. No surprise bills, no silent drift. More detail in how to parallelize AI coding agents safely.
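
The two caps can be modelled with a small tracker like the one below. This is a sketch under stated assumptions, not the KanBots implementation: SessionBudget and its method names are hypothetical, costs are taken as dollars already derived from each result event, and each card is assumed to have a single run.

```typescript
// Hypothetical budget tracker for one autopilot session.
class SessionBudget {
  private readonly spentByCard = new Map<string, number>();
  private readonly runCapUsd: number;
  private readonly sessionCapUsd: number;

  constructor(runCapUsd: number, sessionCapUsd: number) {
    this.runCapUsd = runCapUsd;
    this.sessionCapUsd = sessionCapUsd;
  }

  // Add the cost reported by one result event to the card's running total.
  record(cardId: string, costUsd: number): void {
    this.spentByCard.set(cardId, this.spentFor(cardId) + costUsd);
  }

  spentFor(cardId: string): number {
    return this.spentByCard.get(cardId) ?? 0;
  }

  // Roll-up across every card in the session.
  sessionTotal(): number {
    let total = 0;
    this.spentByCard.forEach((spent) => { total += spent; });
    return total;
  }

  // Per-run cap: when true, the supervisor should kill this card's process.
  shouldKillRun(cardId: string): boolean {
    return this.spentFor(cardId) >= this.runCapUsd;
  }

  // Per-session cap: when false, stop dispatching new runs and let the
  // in-flight ones drain.
  canDispatch(): boolean {
    return this.sessionTotal() < this.sessionCapUsd;
  }
}
```

Note the asymmetry between the two checks: crossing the run cap ends a process, while crossing the session cap only closes the door to new ones.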

A concrete walkthrough

Say you want to run two Claude agents and one Codex agent against three different cards on a board.

  1. Open the desktop app and pick the repo folder. In Providers, sign in to both Claude Code (claude /login) and Codex (OPENAI_API_KEY or ChatGPT Codex login).
  2. On three cards, hit the per-card model picker: pick Claude Sonnet for the first two, Codex for the third.
  3. Dispatch all three. Three worktrees appear under .kanbots/worktrees/; three OS child processes start; three live threads render in the UI.
  4. On a laptop, your fans spin up. top shows three node/codex processes pegging cores. (Most of the wall-clock is HTTPS to the model, not CPU, but the stream parsers do real work.)
  5. When a card finishes, the worktree stays on disk and the card moves to In review. Choose Open draft PR, Promote commit, or Discard.

Practical limits on a single machine

Memory is rarely the bottleneck — each agent process is on the order of 200–500 MB depending on the CLI. Disk is real: each worktree is a full checkout, and large monorepos balloon fast. Network is the silent killer: four agents each holding open HTTPS streams to a model provider on a flaky cafe wifi will produce more rate limits and more retries than four agents on home fiber. Some calibration from real use:

  • M-series Mac (16 GB+): 4 parallel agent runs is comfortable. 6 is fine if the repo is small. 8+ starts to thrash on disk if your monorepo is heavy.
  • Linux desktop with 32 GB+: 8 parallel runs is fine. The bottleneck becomes your CLI’s rate limits, not the machine.
  • Cloud VM (4 cores, 8 GB): 2 is the sweet spot. Avoid running Electron there; use the OSS dispatcher headless or the cloud product instead.

The MAX_PARALLELISM constant baked into KanBots autopilot is 4. That is not an arbitrary number — it is the point where decision-prompt frequency stops scaling linearly with slots. Past 4, you spend more time answering decisions than the agents save you.
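
A parallelism cap of this kind is usually enforced with a slot-limited queue. The sketch below assumes nothing about how KanBots does it internally; SlotQueue is a hypothetical name and only the value 4 comes from the text.

```typescript
const MAX_PARALLELISM = 4; // value quoted in the text

// Start at most `limit` runs at once; when one finishes, start the next
// queued run in its place.
class SlotQueue {
  private running = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly limit: number;

  constructor(limit: number = MAX_PARALLELISM) {
    this.limit = limit;
  }

  // `start` receives a `done` callback to call when the run ends.
  submit(start: (done: () => void) => void): void {
    const run = () => {
      this.running += 1;
      let finished = false;
      start(() => {
        if (finished) return; // ignore duplicate completion signals
        finished = true;
        this.running -= 1;
        const next = this.waiting.shift();
        if (next) next();
      });
    };
    if (this.running < this.limit) run();
    else this.waiting.push(run);
  }

  get active(): number {
    return this.running;
  }

  get queued(): number {
    return this.waiting.length;
  }
}
```

In practice `start` would spawn the agent process and call `done` from the child's exit handler, so a crashed agent frees its slot the same way a successful one does.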

Common failure modes

Rate-limit thundering herd

N agents all start, all hit the model, all retry on the same 429. Most providers’ client libraries back off with jitter on their own, but if you wrote the supervisor yourself, you have to add that yourself. KanBots’s dispatcher detects rate-limit text in stderr and exposes it as a rate_limit event on the card; the run pauses with a clear status instead of looking like a generic stall.
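
If you are writing the supervisor yourself, a common way to break up the herd is exponential backoff with full jitter. The sketch below is one such scheme, not what any particular CLI or KanBots does; the function names, the default delays and the stderr patterns are all assumptions.

```typescript
// Full-jitter exponential backoff: pick a random delay in
// [0, min(capMs, baseMs * 2^attempt)). `random` is injectable for tests.
function backoffDelayMs(
  attempt: number,
  baseMs = 1000,
  capMs = 60000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Heuristic only: these patterns are guesses at what rate-limit text
// looks like, not strings that any CLI guarantees to print.
function looksRateLimited(stderrLine: string): boolean {
  return /\b429\b|rate.?limit|too many requests/i.test(stderrLine);
}
```

Because every agent draws its own random delay, N agents that hit the same 429 spread their retries across the window instead of all retrying at the same moment.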

Shared dev server

Multiple agents on the same repo, multiple npm run dev servers all wanting port 3000. Use per-worktree PORT environment variables, or use KanBots’s Branch preview which picks a free port per worktree automatically.
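
One simple way to hand out ports is to derive them from the agent's slot index, as in this sketch. envForSlot and the base port of 3000 are assumptions, and it only helps if the dev server actually reads PORT.

```typescript
type Env = Record<string, string | undefined>;

// Copy the base environment and give this slot its own PORT, so slot 0
// gets 3000, slot 1 gets 3001, and so on. The input is not mutated.
function envForSlot(slot: number, baseEnv: Env, basePort = 3000): Env {
  return { ...baseEnv, PORT: String(basePort + slot) };
}

// Intended use with Node's spawn (not executed here):
//   spawn("npm", ["run", "dev"], { cwd: worktreeDir, env: envForSlot(i, process.env) })
```

This does not check that the port is actually free; a supervisor that needs that guarantee would have to probe the port first.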

Agents reading uncommitted state from another worktree

This should not happen because the worktrees are separate directories, but it does happen when an agent runs git diff main.. with a path that escapes its working tree, or when the agent cats a file from ../sibling-worktree/ “to compare.” The fix is containment: KanBots’s dispatcher ships a containment guard that rejects tool calls touching paths outside the worktree. Without that guard, you rely on the model being well-behaved.
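
The core of such a guard is a path check. The following is a minimal sketch of the idea, not KanBots's actual guard: isInsideWorktree is a hypothetical name, and the check is purely lexical, so it does not follow symlinks (a stricter version would resolve real paths first).

```typescript
import * as path from "node:path";

// True when `candidate` (absolute, or relative to the worktree) resolves
// to the worktree root or to something beneath it.
function isInsideWorktree(worktreeRoot: string, candidate: string): boolean {
  const root = path.resolve(worktreeRoot);
  const target = path.resolve(root, candidate);
  const rel = path.relative(root, target);
  if (rel === "") return true; // the root itself
  // Escapes show up as ".." or a leading "../"; on Windows a different
  // drive shows up as an absolute relative path.
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return !escapes;
}
```

A supervisor would run this on every path argument of a tool call before letting it through, and reject the call when it returns false.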

When this is the wrong tool

Spawning N agents is the right pattern for “I have N independent tasks.” It is the wrong pattern for:

  • A single complex feature. One agent on the parent issue, with multi-persona round-robin inside that one card, beats four agents racing to write the same feature differently.
  • Anything that demands a single source of truth in the working tree (e.g. a long-running interactive REPL workflow). The whole point of worktrees is they diverge.
  • Production environments. The whole shape here is for a developer machine; if you want this with team-level reviews and shared state, use the cloud product, not your laptop.

For the developer-machine case — one human, many cards, several agents — pair this with parallel Claude Code for the Claude-only version and how to parallelize AI coding agents safely for the failure-mode-by-failure-mode treatment.

Try it on your own folder

Drop a folder, get a board, dispatch parallel agents. The desktop runs locally on macOS, Linux, and Windows.