Vibing With Kerim Experiments in AI Coding


My Workflow

Published by P. Kerim Friedman · 6 min read

I’ve written before about Loop, which is the idea of having coding agents review Claude’s plans over and over until no “critical” issues remain. The problem is that in practice I kept forgetting to do it. Or I’d do one round, see a warning or two, and move on. Or Claude itself would skip the second round because the first one “looked fine.” So I automated the discipline with a hook.

The plan-review gate

Claude Code lets you attach a PreToolUse hook to ExitPlanMode, the tool Claude calls whenever it tries to leave planning and start writing code. I pointed mine at a small shell wrapper around a Python script that checks the filesystem for review artifacts before letting the plan through. The wiring in ~/.claude/settings.json looks like this:

"hooks": {
"PreToolUse": [
{
"matcher": "ExitPlanMode",
"hooks": [
{
"type": "command",
"command": "/Users/YOUR-USERNAME/.claude/hooks/verify-plan-reviews.sh"
}
]
}
]
}

For every plan, Claude has to dispatch three or more reviewer agents and save each reviewer’s full response to ~/.claude/plan-reviews/<plan-slug>/round-1/<agent-slug>-<index>.md. If any review comes back with a FAIL verdict, the hook denies ExitPlanMode and Claude has to revise the plan, create a round-2/ folder, and dispatch a fresh set of reviewers. This loops until a whole round comes back clean.
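
To make the folder convention concrete, here is a minimal sketch of the "find the latest round and check it has enough reviews" step. It is not the actual verifier: the function names and the `MIN_REVIEWS` constant are hypothetical, and it assumes round folders are named exactly `round-N` and that every review is a `.md` file directly inside one.

```python
from __future__ import annotations

import re
from pathlib import Path

ROUND_RE = re.compile(r"^round-(\d+)$")
MIN_REVIEWS = 3  # "three or more reviewer agents" (assumed to be a fixed floor)


def latest_round(plan_dir: Path) -> Path | None:
    """Return the highest-numbered round-N directory, or None if there is none."""
    rounds = []
    for child in plan_dir.iterdir():
        match = ROUND_RE.match(child.name)
        if match and child.is_dir():
            # Compare numerically so round-10 sorts after round-2.
            rounds.append((int(match.group(1)), child))
    if not rounds:
        return None
    return max(rounds, key=lambda pair: pair[0])[1]


def review_files(round_dir: Path) -> list[Path]:
    """List the review files in a round, in a stable order."""
    return sorted(round_dir.glob("*.md"))


def round_is_complete(round_dir: Path) -> bool:
    """True when the round holds at least MIN_REVIEWS review files."""
    return len(review_files(round_dir)) >= MIN_REVIEWS
```

Only the newest round matters for the gate, because earlier rounds describe versions of the plan that have already been revised.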

Keeping the reviews cheap to read

The thing I had to get right was the shape of the review files themselves. A naive setup would have each reviewer write a long essay, and then Claude would have to read all of them end to end to figure out what was critical. That wastes tokens, which costs money and bloats the context window.

So every reviewer is required to produce a file with three parts, in this order:

  1. A ## Findings table at the top, one row per issue, with a severity column (CRITICAL, WARNING, or NOTE) and a one-line description. This is the scannable part.
  2. A ## Details section below, with 100 to 200 words of analysis per finding. This only gets read if the table flags something worth digging into.
  3. A five-line verdict block at the very end, listing the agent, the CRITICAL count, and a PASS/FAIL verdict.

The trick is that Claude only reads the first fifteen or so lines of each review file, which is enough to see the whole findings table. It only drops down into the details section when there’s a CRITICAL that needs a fix. The verdict block at the bottom is what the hook itself parses: it reads the last five lines, checks the critical count and the verdict, and decides whether to let the plan through.
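
The "read only the top of the file" idea can be sketched in a few lines of Python. This is an illustration rather than anything Claude literally runs: `scan_findings` and `needs_details` are hypothetical names, the fifteen-line cutoff is taken from the paragraph above, and the parser assumes no finding contains a literal `|` in its text.

```python
from itertools import islice
from pathlib import Path

HEAD_LINES = 15  # "the first fifteen or so lines"


def scan_findings(path: Path, head_lines: int = HEAD_LINES) -> list[dict]:
    """Parse findings-table rows from the first few lines of a review file."""
    findings = []
    with path.open(encoding="utf-8") as fh:
        # islice stops after head_lines, so the long Details section is never read.
        for line in islice(fh, head_lines):
            cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
            # Data rows have four cells and a numeric first cell; this skips
            # the heading, the column-name row, and the |---| separator row.
            if len(cells) != 4 or not cells[0].isdigit():
                continue
            findings.append(
                {
                    "n": int(cells[0]),
                    "severity": cells[1],
                    "finding": cells[2],
                    "ref": cells[3],
                }
            )
    return findings


def needs_details(findings: list[dict]) -> bool:
    """Only a CRITICAL finding justifies reading the Details section."""
    return any(f["severity"] == "CRITICAL" for f in findings)
```

A review whose table has more rows than fit in the window would be truncated, which is one reason to keep the table at the very top and each row to one line.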

This means scanning ten review files across three rounds is cheap, and the details are there when I actually need them.

The files

The three files that make this work are all short. I’ve linked them for download below and inlined the two shortest.

The reviewer template, which Claude pastes verbatim into every review dispatch so that every reviewer produces the same structure:

Your output must follow this exact structure. Do not deviate.
Section 1 — Findings table. Start your response with:
## Findings
| # | Severity | Finding | Ref |
|---|----------|---------|-----|
Add one row per finding. Every finding must have a severity:
CRITICAL, WARNING, or NOTE. CRITICAL means it would block merge,
cause data loss, introduce security issues, or break the plan's
stated goals. Everything else is WARNING or NOTE. If you find no
issues at all, write a single row:
| 1 | NOTE | No issues found | - |
Section 2 — Details. After the table, write:
## Details
Then for each finding, write a subsection headed
### N. <finding title> with concrete analysis (100-200 words max
per finding). Reference specific plan sections. Propose fixes for
CRITICALs.
Section 3 — Verdict block. End your response with exactly these
5 lines as the final content. The very last line must be the
closing --- with nothing after it.
---
agent: <your agent type, e.g. superpowers:code-reviewer>
critical: <number of CRITICAL findings, 0 if none>
verdict: PASS
---
Use verdict: FAIL if and only if critical is greater than 0.
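
For a sense of how little parsing that five-line verdict block needs, here is a rough Python sketch. It is not the actual verifier: `parse_verdict` is a hypothetical name, and treating any malformed or self-contradictory block as unparseable (returning `None`) is an assumption about how strict the gate should be.

```python
from __future__ import annotations


def parse_verdict(text: str) -> dict | None:
    """Parse the trailing five-line verdict block; return None if it is malformed."""
    lines = [ln.strip() for ln in text.rstrip().splitlines()[-5:]]
    if len(lines) != 5 or lines[0] != "---" or lines[4] != "---":
        return None

    fields = {}
    for line, key in zip(lines[1:4], ("agent", "critical", "verdict")):
        # partition splits on the first colon only, so agent names such as
        # "superpowers:code-reviewer" survive intact.
        name, sep, value = line.partition(":")
        if not sep or name.strip() != key:
            return None
        fields[key] = value.strip()

    if not fields["critical"].isdigit() or fields["verdict"] not in ("PASS", "FAIL"):
        return None

    critical = int(fields["critical"])
    # The template says FAIL if and only if critical > 0; reject blocks that
    # contradict themselves instead of trusting either field.
    if (critical > 0) != (fields["verdict"] == "FAIL"):
        return None
    return {"agent": fields["agent"], "critical": critical, "verdict": fields["verdict"]}
```

A caller would deny the plan when any review returns `None` or a `FAIL` verdict, and allow it only when every review in the round parses cleanly as `PASS`.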

And the shell trampoline that Claude Code actually calls, which just runs the Python verifier and turns any unexpected exit code into a clean denial:

#!/bin/sh
# Trampoline for the plan-review gate: run the Python verifier and turn
# every unexpected failure into a denial.
set -u
trap 'exit 2' INT TERM HUP

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
SCRIPT="$SCRIPT_DIR/verify-plan-reviews.py"

# Deny if the verifier is missing or not executable.
if [ ! -f "$SCRIPT" ] || [ ! -x "$SCRIPT" ]; then
    printf '{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"deny","permissionDecisionReason":"GATE: verifier script missing or not executable. Reinstall the hook."}}\n'
    exit 2
fi

/usr/bin/env python3 "$SCRIPT"
STATUS=$?

# 0 and 2 are the verifier's expected exit codes; pass them through unchanged.
if [ "$STATUS" -eq 0 ] || [ "$STATUS" -eq 2 ]; then
    exit "$STATUS"
fi

# Anything else is unexpected: report why, then deny.
if [ "$STATUS" -eq 127 ]; then
    printf '{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"deny","permissionDecisionReason":"GATE: python3 not found in PATH - install Python 3"}}\n'
else
    printf '{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"deny","permissionDecisionReason":"GATE: verifier script crashed with exit %d - check logs"}}\n' "$STATUS"
fi
exit 2

The Python verifier itself is longer. It identifies the plan by hashing its contents against files in ~/.claude/plans/, finds the highest-numbered round directory, checks that there are at least three review files in it, parses each verdict block, and denies with a helpful message if anything looks off. It also cleans up review directories older than two weeks so the folder doesn’t grow forever.
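
Two of those steps, matching the plan by hash and pruning old review folders, are easy to sketch. What follows is an illustration under stated assumptions, not the linked verifier: the function names are hypothetical, SHA-256 is an assumed choice of hash, plans are assumed to be `.md` files whose filename stem is the plan slug, and "older than two weeks" is assumed to mean the folder's modification time.

```python
from __future__ import annotations

import hashlib
import shutil
import time
from pathlib import Path

TWO_WEEKS = 14 * 24 * 60 * 60  # seconds


def digest(data: bytes) -> str:
    """Hex SHA-256 of raw bytes (the hash algorithm is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def find_plan_slug(plan_text: str, plans_dir: Path) -> str | None:
    """Return the stem of the plan file whose contents match plan_text exactly."""
    target = digest(plan_text.encode("utf-8"))
    for candidate in sorted(plans_dir.glob("*.md")):
        if digest(candidate.read_bytes()) == target:
            return candidate.stem
    return None


def prune_old_reviews(
    reviews_dir: Path, now: float | None = None, max_age: float = TWO_WEEKS
) -> list[str]:
    """Delete review folders not modified within max_age; return their names."""
    now = time.time() if now is None else now
    removed = []
    for child in sorted(reviews_dir.iterdir()):
        if child.is_dir() and now - child.stat().st_mtime > max_age:
            shutil.rmtree(child)
            removed.append(child.name)
    return removed
```

Matching on content rather than filename means the gate does not have to trust whatever name Claude reports for the plan, at the cost that any edit to the plan, even whitespace, produces a different hash.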

Downloads for everything:

Close-out

The other command I use constantly is /close-out, which I run whenever a branch is ready to merge. It’s a short slash command file at ~/.claude/commands/close-out.md and it walks through six steps: update any relevant docs, run the linter on what changed, stage and commit, check the project’s task list to see if anything got finished, run /simplify on the new code, and then run /wiki-update to fold everything back into the Obsidian Wiki. The last two steps are the ones I’d always forget if I did this by hand, so baking them into a command means I don’t have to remember. The full file is twelve lines:

---
description: Close out a branch — docs, lint, commit, simplify, wiki-update
---
Create a task list and work through it one item at a time. Be sure to include all fixes from this branch, not just the ones from the last couple of plans.
1. Update any relevant files in `/docs` if important architectural changes were made or relevant debugging findings found since the last document update in this branch.
2. Run relevant linting based on the codebase on the code added since the last time you ran linting.
3. Stage and commit everything.
4. Check whether commits on this branch completed any tasks in the project's wiki task list. Read `~/.obsidian-wiki/config` for `OBSIDIAN_VAULT_PATH`, derive the project name from cwd (basename, lowercased, spaces/underscores → dashes), and read `$OBSIDIAN_VAULT_PATH/projects/<project-name>/tasks.md` if it exists. Compare the `## Next` section against commits on this branch (`git log main..HEAD --oneline`, or against the branch's merge base if `main` isn't the base). For each task that looks completed by the committed work, present the match to the user and — on confirmation — invoke `/task-done` with a unique substring of the task description. Skip this step silently if no task file exists for the project.
5. Run `/simplify`.
6. Run `/wiki-update`.

Download the close-out slash command
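
Step 4 asks Claude to derive the project name from the working directory, which is simple enough to show directly. This is a small sketch of that rule as I read it, with hypothetical function names. It maps each space or underscore to one dash and does not collapse runs of them, which is an assumption, since the command file does not say either way.

```python
import re
from pathlib import Path


def project_name(cwd: Path) -> str:
    """Basename of cwd, lowercased, with spaces and underscores turned into dashes."""
    return re.sub(r"[ _]", "-", cwd.name.lower())


def tasks_path(vault: Path, cwd: Path) -> Path:
    """Location of the project's task list inside the Obsidian vault."""
    return vault / "projects" / project_name(cwd) / "tasks.md"


if __name__ == "__main__":
    # Example with made-up paths.
    print(project_name(Path("/home/me/My Cool_App")))
    print(tasks_path(Path("/vault"), Path("/home/me/My Cool_App")))
```

If `tasks_path(...)` does not exist, the step is skipped silently, as the command file specifies.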

None of these files are long. The whole point is to make the workflow cheap enough to read, cheap enough to run, and cheap enough for me to actually use it.