1 comment

  • steer_dev a day ago

    Doing post-mortems on my agent's failures over the holidays made me realize the problem isn't the model. It is the lack of a deterministic inference-time verification layer.

    I spent the break reading the recent Stanford/Harvard paper on agentic adaptation [1]. Their research provides mathematical proof for what I experienced in Q4: supervising only final outputs is a dead end. Agents learn to "ignore tools and improve likelihood," meaning they learn to lie more convincingly to pass evaluations while the underlying logic rots.

    I call this the Agent Lobotomy.

    The agent I have in production today is significantly dumber than the one I demoed in December. I was forced to strip autonomy, remove context, and add human checkpoints because I could not trust the probabilistic output. We are stuck in an Autonomy Retreat, creating an Authority Bottleneck [2] where agents are relegated to assistive tasks because the tail risk of autonomous action is too high.

    I built Steer (open source) to stop the bleed. In v0.4.0, I moved the architecture to an Agent Service Mesh pattern. Instead of decorating every function, you patch the framework (e.g. PydanticAI) at the entry point. It auto-discovers tools and enforces a reliability policy globally via deterministic Reality Locks.
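
    A minimal sketch of the entry-point patching idea, for illustration only. This is not Steer's actual API and it does not touch PydanticAI: `ToolRegistry`, `patch`, `sql_read_only_lock`, and `BlockedCall` are hypothetical names standing in for a framework's tool dispatcher and a deterministic lock.

```python
import re
from typing import Any, Callable, Dict, List


class BlockedCall(Exception):
    """Raised when a lock rejects a tool call (hypothetical name)."""


# A lock inspects the tool name and arguments and raises BlockedCall on violation.
Lock = Callable[[str, Dict[str, Any]], None]


class ToolRegistry:
    """Stand-in for a framework's tool dispatcher (hypothetical)."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Any]] = {}

    def tool(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        # Registering by function name is what lets the patch "discover" tools.
        self.tools[fn.__name__] = fn
        return fn

    def call(self, name: str, **kwargs: Any) -> Any:
        return self.tools[name](**kwargs)


def patch(registry: ToolRegistry, locks: List[Lock]) -> None:
    """Wrap the single dispatch entry point so every tool passes through the locks."""
    original_call = registry.call

    def guarded_call(name: str, **kwargs: Any) -> Any:
        for lock in locks:
            lock(name, kwargs)  # deterministic check, no model involved
        return original_call(name, **kwargs)

    registry.call = guarded_call  # type: ignore[method-assign]


_WRITE_SQL = re.compile(r"\b(insert|update|delete|drop|alter|truncate)\b", re.I)


def sql_read_only_lock(tool_name: str, args: Dict[str, Any]) -> None:
    """Reject any 'query' argument that contains a write or DDL keyword."""
    query = args.get("query")
    if isinstance(query, str) and _WRITE_SQL.search(query):
        raise BlockedCall(f"{tool_name}: write statement rejected")


registry = ToolRegistry()


@registry.tool
def run_sql(query: str) -> str:
    # Placeholder tool body; a real tool would hit a database.
    return f"executed: {query}"


patch(registry, [sql_read_only_lock])
```

    The point of the design is that the tool functions carry no decorators or policy code of their own. The policy lives in one place and applies to every tool registered before or after the patch.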

    The real unlock is the data. By capturing the delta between a Blocked Response and a Taught Fix, Steer acts as a synthetic data factory for DPO. It moves reliability from a runtime tax to a training asset, allowing you to eventually refactor your prompt monolith into fine-tuned model weights.
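
    Roughly, the blocked/fixed delta could be turned into preference pairs like this. It is a sketch under assumptions: the `Incident` record and its field names are hypothetical, and the `prompt`/`chosen`/`rejected` layout is a common convention for preference datasets, not necessarily what Steer exports.

```python
import json
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Incident:
    """One captured failure (hypothetical record shape)."""
    prompt: str
    blocked_response: str  # what the agent produced and the lock rejected
    taught_fix: str        # the corrected response supplied afterwards


def to_dpo_pairs(incidents: Iterable[Incident]) -> List[Dict[str, str]]:
    """Map each incident to a preference pair: the fix is preferred over the blocked output."""
    pairs: List[Dict[str, str]] = []
    for inc in incidents:
        # A pair with identical sides carries no preference signal, so skip it.
        if inc.blocked_response.strip() == inc.taught_fix.strip():
            continue
        pairs.append(
            {
                "prompt": inc.prompt,
                "chosen": inc.taught_fix,
                "rejected": inc.blocked_response,
            }
        )
    return pairs


def to_jsonl(pairs: Iterable[Dict[str, str]]) -> str:
    """Serialize pairs as one JSON object per line."""
    return "\n".join(json.dumps(p) for p in pairs)


incidents = [
    Incident("Count active users", "DELETE FROM users", "SELECT COUNT(*) FROM users WHERE active"),
    Incident("Say hi", "hi", "hi"),  # no delta, dropped
]
pairs = to_dpo_pairs(incidents)
jsonl = to_jsonl(pairs)
```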

    I've put together three cookbooks showing how this stops the lobotomy in SQL and RAG workflows:

    1/ Framework Patching: https://github.com/imtt-dev/steer/blob/main/steer/cookbook/p...
    2/ SQL Security Lock: https://github.com/imtt-dev/steer/blob/main/steer/cookbook/s...
    3/ RAG Grounding Guard: https://github.com/imtt-dev/steer/blob/main/steer/cookbook/r...

    References:
    [1] https://arxiv.org/abs/2512.16301
    [2] https://cloudedjudgement.substack.com/p/clouded-judgement-12...