Unpossible: A minimal multi-agent orchestrator
The idea is simple: spawn a new agent for each task, with no previous context, so each task starts fresh with as much of the context window available as possible.
I wanted to see if I could take it one step further and run multiple Ralphs in parallel, coordinating on a shared task list. The result: Unpossible, a minimal multi-agent orchestrator using filesystem locks and git worktrees.
Honest disclaimer: For practical purposes, just use Codex or Claude Code with small enough tasks and you should be fine. Codex in particular is very good at managing context and running for very long sessions. Personally, I have never found the need for Ralphs. Regardless, it was a great learning experience, and if you want to know what I learned building my own minimal orchestrator, read on.
Ralph: The minimal loop
At its simplest, a Ralph is just a while loop:
    iteration=0
    while [ "$iteration" -lt "$MAX_ITERATIONS" ]; do
        task_id=$(find_pending_task)
        if [ -z "$task_id" ]; then break; fi   # nothing left to do

        # $prompt is the prompt template rendered for this task
        claude --print --output-format stream-json \
            -p "$prompt" > output.jsonl

        # check if task completed or needs to skip
        iteration=$((iteration + 1))
    done
That’s it. Find a task, run Claude on it, repeat.
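The loop leaves find_pending_task undefined. One way it could look, as a sketch only: it assumes prd.json is a JSON array of task objects with id, done and dependsOn fields (the format described later in this post) and that jq is installed. The jq filter and the optional file argument are illustrative, not necessarily what the repository does.

```shell
# Hypothetical helper: print the id of the first task that is not done and
# whose dependencies are all done. Prints nothing when no task is ready.
# Assumes prd.json is a JSON array of objects with id, done, dependsOn.
find_pending_task() {
    local prd_file=${1:-prd.json}
    jq -r '
        (map(select(.done)) | map(.id)) as $done
        | map(select(.done | not))
        | map(select(((.dependsOn // []) - $done) | length == 0))
        | (.[0].id // empty)
    ' "$prd_file"
}
```

An empty result is what lets the loop's `[ -z "$task_id" ]` check break out cleanly.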
Unpossible: Running multiple Ralphs
Now let’s take Ralph and run multiple of them in parallel.
The setup:
- prd.json: a JSON array of tasks with id, title, description, done, dependsOn, validation fields
- progress.txt: append-only log where Ralphs record what they did (helps them coordinate)
- prompt.template.md: instructions injected into each Ralph instance
- One git worktree per Ralph, so they can work without stepping on each other
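Creating one worktree per Ralph could look roughly like this. It is a sketch under assumptions: the ralph-N branch names, the worktree root directory and the function name are all hypothetical, and only `git worktree add -b <branch> <path> <commit-ish>` is standard git.

```shell
# Hypothetical helper: create one worktree and one branch per Ralph.
# Usage: create_worktrees <main_repo_dir> <base_branch> <worktree_root> <count>
# The ralph-N naming scheme is illustrative.
create_worktrees() {
    local main_dir=$1 base_branch=$2 root=$3 count=$4 i=1
    mkdir -p "$root" || return 1
    while [ "$i" -le "$count" ]; do
        # New branch ralph-N, checked out in its own directory, starting at base_branch.
        git -C "$main_dir" worktree add -b "ralph-$i" "$root/ralph-$i" "$base_branch" || return 1
        i=$((i + 1))
    done
}
```

Keeping the worktrees outside the main checkout avoids them showing up as untracked files there.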
The biggest factor in success? Validation steps. If you don’t tell the model exactly how to verify its work, it can mark incomplete tasks as done.
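One option is to have the orchestrator re-run the validation commands itself instead of trusting the agent's report. A minimal sketch, assuming each validation step is a single shell command line; run_validation is a hypothetical name and this is not necessarily how the repository handles it.

```shell
# Hypothetical helper: run each validation command in order and stop at the
# first failure. Each argument is one shell command line taken from the
# task's validation field.
run_validation() {
    local cmd
    for cmd in "$@"; do
        echo "validating: $cmd" >&2
        if ! bash -c "$cmd"; then
            echo "validation failed: $cmd" >&2
            return 1
        fi
    done
    return 0
}
```

A caller would only mark the task done when something like `run_validation "npm run build" "npm test"` returns zero.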
Task claiming with filesystem locks
Each Ralph has its own copy of prd.json in its worktree. But we need to prevent two Ralphs from grabbing the same task.
The solution: atomic directory creation.
    claim_task() {
        local task_id=$1
        if mkdir ".unpossible/locks/$task_id" 2>/dev/null; then
            return 0  # got the lock
        else
            return 1  # someone else has it
        fi
    }
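mkdir is atomic on Unix: the call either creates the directory or fails because it already exists, so only one Ralph can succeed for a given task ID. If it succeeds, you own that task. If it fails, someone else already claimed it. No race conditions, no external dependencies.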
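To see the claim behave under contention, here is a small self-contained sketch that lets several background workers race for the same task. LOCK_DIR, release_task and race_for_task are hypothetical names for this demo; the lock directory is a temp directory standing in for a location that every Ralph can see.

```shell
# Sketch: several workers race for the same task; mkdir lets only one win.
# LOCK_DIR is a hypothetical shared location visible to every worker.
LOCK_DIR=$(mktemp -d)

claim_task() {
    local task_id=$1
    mkdir "$LOCK_DIR/$task_id" 2>/dev/null
}

release_task() {
    local task_id=$1
    rmdir "$LOCK_DIR/$task_id" 2>/dev/null
}

# Start <workers> background subshells that all try to claim <task_id>,
# then print the name of every worker whose claim succeeded.
race_for_task() {
    local task_id=$1 workers=$2 i=1 results
    results=$(mktemp)
    while [ "$i" -le "$workers" ]; do
        ( claim_task "$task_id" && echo "ralph-$i" >> "$results" ) &
        i=$((i + 1))
    done
    wait
    cat "$results"
    rm -f "$results"
}

winners=$(race_for_task TASK-005 8)
echo "claimed by: $winners"
```

Which worker wins varies from run to run, but only one name is printed. Releasing is just rmdir, after which the task can be claimed again.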
When a Ralph finishes:
- Commit changes
- Rebase onto latest main
- Fast-forward merge back to main
- Release the lock
If there are merge conflicts, the Ralph resolves them. This actually works better than you’d expect - the model can look at git show <conflicting-commit> and understand what the other Ralph was doing.
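The four finishing steps can be sketched as a single function. This is an illustration under assumptions, not the repository's code: finish_task, its argument names, the commit message and the lock directory layout are all hypothetical. It is meant to run from inside the Ralph's worktree.

```shell
# Hypothetical helper mirroring the four finishing steps.
# Run from inside the Ralph's worktree.
# Usage: finish_task <task_id> <main_dir> <base_branch> <ralph_branch> <lock_dir>
finish_task() {
    local task_id=$1 main_dir=$2 base_branch=$3 ralph_branch=$4 lock_dir=$5

    # 1. Commit changes (skip the commit when nothing is staged).
    git add -A
    git diff --cached --quiet || git commit -q -m "$task_id: complete" || return 1

    # 2. Rebase onto the latest base branch. On conflict, return with the
    #    rebase still in progress so the agent can resolve it, run
    #    `git rebase --continue`, and call this function again.
    git rebase -q "$base_branch" || return 2

    # 3. Fast-forward the base branch, which is checked out in main_dir.
    (cd "$main_dir" && git merge -q --ff-only "$ralph_branch") || return 3

    # 4. Release the lock.
    rmdir "$lock_dir/$task_id"
}
```

Because the merge is fast-forward only, a Ralph that lost the race to main fails at step 3 and has to rebase again, which is what keeps main linear.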
The dependency problem
The hardest part: what happens when Ralph-1 picks TASK-005 but realizes it needs TASK-003 done first? And TASK-003 is either being worked on by Ralph-2 or nobody’s started it yet.
Two approaches:
Strict mode: If a task’s dependsOn entries aren’t all complete, skip it: add the missing dependency to prd.json, emit <promise>SKIP</promise>, release the lock, and pick another task. If nothing is ready, sleep and retry.
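The two checks strict mode relies on could be sketched like this. Both helper names are hypothetical, and the task IDs are passed as plain space-separated strings on the assumption that they have already been extracted from prd.json.

```shell
# Hypothetical helpers for strict mode. Task ids are passed as plain
# space-separated strings; extracting them from prd.json is left out.

# deps_satisfied "<done ids>" "<dependsOn ids>"
# Succeeds only if every dependency appears in the done list.
deps_satisfied() {
    local done_ids=" $1 " dep
    for dep in $2; do   # intentionally unquoted: split on whitespace
        case $done_ids in
            *" $dep "*) ;;          # dependency is done, keep checking
            *) return 1 ;;          # at least one dependency is missing
        esac
    done
    return 0
}

# agent_skipped <output_file>
# Succeeds if the agent emitted the SKIP marker somewhere in its output.
agent_skipped() {
    grep -q '<promise>SKIP</promise>' "$1"
}
```

A Ralph loop would call deps_satisfied before claiming a task, and agent_skipped on output.jsonl afterwards to decide whether to release the lock and move on.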
Overlap mode: Let Ralphs implement minimal prerequisites to unblock themselves. Ralph-1 can implement just enough of TASK-003 to finish TASK-005. When Ralph-2 comes through with the “proper” TASK-003 implementation later, they merge. The last one to merge synthesizes both approaches.
Personally, I had better results with overlap mode. It’s faster (no blocking), and you get an emergent synthesis where different implementations are merged together: each agent approaches a task in a slightly different way and can add something the previous agent missed.
What the prompt looks like
The prompt template gets placeholders replaced at runtime:
# Task Assignment
You are {{TASK_ID}} ralph, working in {{RALPH_DIR}}.
## Your Task
{{TASK_JSON}}
## Validation
Before marking done, verify:
{{VALIDATION_STEPS}}
## After Committing
git rebase {{BASE_BRANCH}}
(cd {{MAIN_DIR}} && git merge {{RALPH_BRANCH}} --ff-only)
The validation section is where you spell out exactly what “done” means: npm run build && npm test, or whatever fits your project. Without this, you will very likely get tasks marked complete that aren’t.
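The placeholder replacement can be sketched in a few lines of bash. render_template and its KEY=value calling convention are hypothetical; the sketch relies on bash's `${var//pattern/replacement}` expansion, so it needs bash rather than plain sh.

```shell
# Hypothetical helper: replace {{KEY}} placeholders in a template string.
# Usage: render_template "<template text>" KEY=value [KEY=value ...]
# Requires bash (uses ${var//pattern/replacement}).
render_template() {
    local out=$1 pair key val pat
    shift
    for pair in "$@"; do
        key=${pair%%=*}             # text before the first '='
        val=${pair#*=}              # text after the first '='
        pat="{{${key}}}"
        out=${out//"$pat"/"$val"}   # quoted so both sides are taken literally
    done
    printf '%s\n' "$out"
}

render_template 'You are {{TASK_ID}} ralph, working in {{RALPH_DIR}}.' \
    TASK_ID=TASK-005 RALPH_DIR=/tmp/ralph-1
# prints: You are TASK-005 ralph, working in /tmp/ralph-1.
```

Placeholders with no matching KEY are left untouched, which makes a forgotten value easy to spot in the rendered prompt.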
The flow
┌─────────────────────────────────────────────────────────┐
│                      unpossible.sh                      │
│   Creates worktrees, spawns Ralphs, monitors progress   │
└─────────────────────────────────────────────────────────┘
             │                              │
             ▼                              ▼
      ┌─────────────┐                ┌─────────────┐
      │   Ralph-1   │                │   Ralph-2   │
      │  worktree   │                │  worktree   │
      └─────────────┘                └─────────────┘
             │                              │
             ▼                              ▼
      ┌─────────────┐                ┌─────────────┐
      │ claim task  │                │ claim task  │
      │  via mkdir  │                │  via mkdir  │
      └─────────────┘                └─────────────┘
             │                              │
             ▼                              ▼
      ┌─────────────┐                ┌─────────────┐
      │ run claude  │                │ run claude  │
      │  implement  │                │  implement  │
      │  validate   │                │  validate   │
      └─────────────┘                └─────────────┘
             │                              │
             ▼                              ▼
      ┌─────────────┐                ┌─────────────┐
      │   commit    │                │   commit    │
      │   rebase    │◄───conflicts───│   rebase    │
      │  ff-merge   │                │  ff-merge   │
      └─────────────┘                └─────────────┘
             │                              │
             └──────────────┬───────────────┘
                            ▼
                     ┌─────────────┐
                     │    main     │
                     │   branch    │
                     │  (linear)   │
                     └─────────────┘
When conflicts happen during rebase, the Ralph examines both implementations and merges them, keeping the functionality of both sides working. I’ve seen it adopt error handling from one side and more complete logic from the other.
What I learned
Task quality is everything. Vague tasks produce vague results. Each task should be atomic (one thing), verifiable (clear validation), and scoped (don’t bleed into other tasks).
Validation is non-negotiable. If you can’t tell the model exactly how to check its work, it will lie to you. Not maliciously - it just doesn’t know what “done” means.
Overlap mode produces better results. When Ralphs can implement minimal prerequisites and merge conflicts intelligently, you get synthesis instead of blocking. Different approaches converge into something better than either alone. And since the tasks are already quite small, the extra work doesn’t slow a Ralph down much.
Filesystem locks are underrated. No Redis, no database, no coordination service. Just mkdir. It’s atomic, it’s simple and it works.
Code
The orchestrator is at github.com/muneebshahid/unpossible. It’s deliberately minimal - two shell scripts and some prompt templates. The README has setup instructions and test fixtures.
Fair warning: this is experimental. The model can ignore instructions. Tasks can conflict in weird ways. But when it works, watching three agents coordinate through filesystem locks and git merges is genuinely satisfying.