I Read the Code
·2 min·Comic
The planner, coder, judge split is a good harness: an expensive model plans, a cheap one writes, the expensive one judges, with the budget spent where judgment lives. But an LLM judge is a filter, not an alibi; it catches the cheap model's misses, not whether the plan solved your problem or whether the edge cases the models quietly agreed on match the ones your users will find. The one step that does not delegate is the last: somebody opens the diff and reads it.