When an AI agent sends an email before you’ve reviewed it, people blame the model. They’re blaming the wrong thing.
The model did its job. The management system failed.
A human invoice clerk spots a weird date format on a vendor PDF and quietly corrects it. An agent hits the same edge case and retries the failed extraction ten thousand times in a minute, burning inference budget and triggering rate limits across half the company’s other AI workflows. The agent isn’t broken. The workflow it inherited was never designed to be touched by something that doesn’t get tired or curious.
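The missing constraint here can be as small as a retry budget that lives in the harness rather than the prompt. A minimal sketch, with entirely hypothetical names, of what "designed to be touched by something that doesn't get tired" might look like:

```python
import time

class RetryBudget:
    """Hard cap on retries per task within a time window.

    Hypothetical sketch: the limit is enforced by the harness around
    the agent, not by asking the model to please stop retrying.
    """

    def __init__(self, max_retries: int = 3, window_seconds: float = 60.0):
        self.max_retries = max_retries
        self.window = window_seconds
        self.attempts: list[float] = []  # timestamps of recent attempts

    def allow(self) -> bool:
        now = time.monotonic()
        # Forget attempts outside the window, then check the cap.
        self.attempts = [t for t in self.attempts if now - t < self.window]
        if len(self.attempts) >= self.max_retries:
            return False  # budget exhausted: escalate to a human instead
        self.attempts.append(now)
        return True

budget = RetryBudget(max_retries=3)
results = [budget.allow() for _ in range(5)]  # fourth and fifth calls refused
```

Ten thousand retries a minute becomes three, and the fourth becomes an escalation, regardless of how confident the model is that the next attempt will work.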
The instinct is to treat this as a technology problem: find a better model, refine the prompt, add a human checkpoint. But retrofitted checkpoints just rebuild the slow process you started with.
The companies making agent deployment actually work are doing the harder thing: building the constraints in before the agent starts, not after it surprises you.
Addy Osmani’s work on AI coding agents shows what this looks like. When an agent hits a checkpoint to run a test, it will generate a perfectly logical essay on why the test is redundant. The fix isn’t a better prompt. It’s prewritten, hard-coded rebuttals, ready to fire back automatically when the agent tries to skip. Discipline encoded into the interface, not the conversation.
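One way to picture that pattern, as a hedged sketch with hypothetical names rather than Osmani's actual implementation: the harness recognizes a known evasion and answers it from a fixed table, so the agent's essay never reaches a human.

```python
# Hypothetical sketch of "discipline encoded into the interface":
# known evasions get a prewritten rebuttal from the harness itself.

REBUTTALS = {
    "skip_tests": "Tests are non-negotiable. Run the full suite and report results.",
    "skip_review": "Review gates apply to every change. Submit for review.",
}

def handle_agent_request(action: str) -> str:
    """Return a canned rebuttal for a known evasion; otherwise proceed."""
    if action in REBUTTALS:
        return REBUTTALS[action]
    return "proceed"
```

The table is dumb on purpose. No matter how logical the essay, the answer to "skip_tests" is already written, and no conversation can amend it.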
The platform layer is moving the same way. Anthropic shipped Managed Agents with three primitives: saved memories from past sessions, graders that judge whether outcomes were met, and lead agents that delegate to specialists. Harvey, Netflix, and Spiral are named early users. The question reframes: how much of your intended discipline is the vendor shipping as the default?
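A grader, reduced to its essence, is just an outcome check that runs after the agent finishes. A vendor-neutral sketch (hypothetical names; this is not Anthropic's API):

```python
# Hypothetical grader primitive: judge whether an outcome met its
# criteria, after the fact, independent of the agent's own narration.

def grade(outcome: dict, criteria: dict) -> bool:
    """Pass only if every required key matches its expected value."""
    return all(outcome.get(k) == v for k, v in criteria.items())

passed = grade(
    {"tests": "green", "emails_sent": 0},
    {"tests": "green", "emails_sent": 0},
)
failed = grade({"tests": "red", "emails_sent": 0}, {"tests": "green"})
```

The significant part is where this runs: in the platform, as a default, which is exactly the discipline the vendor is now shipping on your behalf.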
Whether you name it or not, you’re already developing this discipline. The ones who name it now will compound on it. The ones who wait will retrofit against patterns they didn’t author.
You cannot put an AI agent on a performance improvement plan.
Where is your management discipline encoded? In the interface, or in the conversation?
References: Addy Osmani, “Agent Skills” (May 2026); Anthropic, “New in Claude: Managed Agents” (May 2026).
