Can We Close the Loop in 2026?
If you spend enough time with agents, you feel a difference. The good ones feel like working with a colleague. They plan, check their work, try to catch their own mistakes, and only then tell you it's done. They try to close their own loop. The bad ones end with you typing "review your work and find the errors" or "did you test it?".
From what I can tell, two things separate the good ones from the bad:
- Self-awareness: Does the agent understand what it is (an LLM), what tools it has, and how to write its own instructions?
- Closing the loop: Can the agent verify its own work before responding to the user?
What "Self-Awareness" Is (and Isn't)
This is not about consciousness or sentience. In the context of AI agents, self-awareness refers to operational self-knowledge. A self-aware agent (sketched as a system prompt after this list):
- Knows its constraints. An LLM that understands context window sizes, how system instructions shape behavior, and how to write prompts will write better instructions for itself and for follow-up tasks or sub-agents.
- Understands its own mechanics. An agent aware of its tools (file I/O, shell, browser) picks the right one instead of hallucinating an API that doesn't exist.
- Knows when it's likely wrong. Calibrated uncertainty—recognizing "this regex is complex, I should test it" or "I am not sure about this" rather than confidently outputting something broken.
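To make that concrete, here's what operational self-knowledge might look like when spelled out in an agent's system prompt. This is a hypothetical fragment for illustration, not any vendor's actual prompt.

```python
# Hypothetical fragment of an agent's system prompt; illustrative only.
SELF_AWARENESS_PROMPT = """\
You are an LLM-based coding agent.
Constraints: your context window is limited. Summarize long files before reasoning
about them, and write explicit, self-contained instructions when delegating to sub-agents.
Tools: you can read and write files, run shell commands, and browse. Prefer a real tool
call over recalling an API from memory.
Uncertainty: when a result is cheap to check (a regex, a query, date math), check it
before answering. If you are not sure, say so instead of guessing.
"""
```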
Researchers call this situational awareness (Berglund et al., 2023) or meta-cognition. Anthropic's introspection research (Oct 2025) showed that Claude can sometimes tell whether an output was artificially injected or generated by the model itself. Not in a human sense, but as a mechanism that produces better meta-task behavior.
What it is not: Understanding why it generates specific tokens, having subjective experience, or "truly knowing itself". Models are trained to behave as if they understand their role. That behavioral consistency produces better meta-task outputs. Useful, not magic.
What "Closing the Loop" Means
Closing the loop means the agent doesn't return to you until it has verified its own work, not against its own judgment of correctness, but against external signals: the compiler, the test runner, the filesystem, other LLM calls.
Examples of closing the loop today:
- Code changes → agent runs npm test, reads the error trace, fixes the edge case, and re-runs until green before telling you it's done (a minimal sketch of this loop follows the list).
- File writes → agent reads the file back after writing to confirm the diff matches intent.
- Requirements → agent compares its final output against the original task list to check nothing was missed.
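Here is that first pattern, the test loop, as a minimal Python sketch. The `propose_fix` callback is a hypothetical placeholder for whatever LLM call generates and applies a patch; the point is that the exit condition is the test runner's return code, not the model's own confidence.

```python
import subprocess

MAX_ATTEMPTS = 5  # bound the loop so a stuck agent doesn't retry forever

def run_tests() -> tuple[bool, str]:
    """Run the test suite and return (passed, combined output)."""
    proc = subprocess.run(["npm", "test"], capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def close_the_loop(task: str, propose_fix) -> bool:
    """Iterate until the external signal (the test runner) says green, or give up."""
    for _ in range(MAX_ATTEMPTS):
        passed, output = run_tests()
        if passed:
            return True  # only now is it safe to tell the user "done"
        # Feed the real error trace back to the model; propose_fix is assumed
        # to call the LLM and apply whatever patch it produces.
        propose_fix(task=task, test_output=output)
    return False  # surface the failure instead of claiming success
```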
Most loop-closing in production today is scaffolding: verification steps built around the agent, not initiated by it. Spotify's background coding agent is a good example. After 1,500+ agent-generated PRs, they built independent verifiers (Maven builds, test runners, formatters) exposed to the agent as a single MCP tool. The agent doesn't know what the verifier does internally; it just knows that it needs to call "verify" and then gets pass/fail feedback. On top of that, an LLM judge compares the final diff against the original prompt to catch scope creep (agents "getting creative" and refactoring code nobody asked them to touch).
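The pattern is easy to sketch even though Spotify's internals aren't public: bundle several independent checks behind one entry point and give the agent only pass/fail plus the failure output. The commands below are placeholders (a Maven-style build is assumed), and the MCP wiring that would expose `verify` as a tool is left out.

```python
import subprocess
from dataclasses import dataclass

@dataclass
class VerifyResult:
    passed: bool
    report: str  # the only thing the agent gets to see

# Placeholder checks; swap in whatever your stack actually uses.
CHECKS = {
    "build": ["mvn", "-q", "compile"],
    "tests": ["mvn", "-q", "test"],
    "format": ["mvn", "-q", "spotless:check"],
}

def verify() -> VerifyResult:
    """Single entry point for the agent; the individual tools stay hidden behind it."""
    failures = []
    for name, cmd in CHECKS.items():
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode != 0:
            failures.append(f"{name} failed:\n{proc.stdout}{proc.stderr}")
    if failures:
        return VerifyResult(passed=False, report="\n\n".join(failures))
    return VerifyResult(passed=True, report="all checks passed")
```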
The research frontier is moving towards spontaneous verification. A recent paper on intrinsic self-critique from DeepMind showed that a single LLM can check its own plan step-by-step against the task rules, with no external signal. It just asks itself "does this step actually follow the rules?" after each action, and if not, rewrites the plan. That alone lifted planning success from 50% to 89%. It hints at a future where verification isn't bolted on; it's just what the model does.
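In rough Python, the loop looks something like the sketch below. `llm` stands in for a generic text-in, text-out model call, and the prompts are simplified stand-ins rather than the paper's actual ones.

```python
def plan_with_self_critique(task: str, rules: str, llm, max_rounds: int = 3) -> str:
    """llm: a hypothetical text-completion function, not a specific API."""
    plan = llm(f"Task: {task}\nRules: {rules}\nWrite a step-by-step plan.")
    for _ in range(max_rounds):
        # The same model checks its own plan against the rules: no tools, no tests.
        critique = llm(
            f"Rules: {rules}\nPlan:\n{plan}\n"
            "For each step, does it actually follow the rules? "
            "Reply OK if every step complies, otherwise describe the violations."
        )
        if critique.strip().upper().startswith("OK"):
            return plan  # the model's own check found no violations
        # Otherwise, rewrite the plan using its own critique as feedback.
        plan = llm(
            f"Task: {task}\nRules: {rules}\nPlan:\n{plan}\n"
            f"Critique:\n{critique}\nRewrite the plan to fix these violations."
        )
    return plan
```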
What Comes Next?
Today, closing the loop works when the scaffolding is right. Tools like OpenClaw are among the first to push this further, persisting learnings across sessions through memory and a "heartbeat" mechanism that proactively checks and acts on tasks without being prompted. But it's still fragile. The agent does it when the setup tells it to, not because it knows it should.
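A heartbeat is conceptually simple, something like the sketch below: wake up on a timer, re-read whatever state persists across sessions, and act on anything pending. The file name and structure here are invented for illustration; OpenClaw's actual mechanism may look quite different.

```python
import json
import time
from pathlib import Path

MEMORY_FILE = Path("agent_memory.json")   # hypothetical persistent store
HEARTBEAT_SECONDS = 600                   # how often the agent wakes itself up

def heartbeat(act_on):
    """Wake on a timer, re-read persisted tasks, and act without being prompted."""
    while True:
        memory = json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else {}
        for task in memory.get("pending_tasks", []):
            act_on(task)  # e.g. check a deadline, retry a failed job, ping the user
        time.sleep(HEARTBEAT_SECONDS)
```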
Same for delegation. Gemini CLI, Claude Code, and Codex CLI all let a lead agent spin up sub-agents in parallel. The mechanism works, but the communication is unreliable: the lead agent doesn't always know what context to pass, what instructions to write, or what the sub-agent is missing. That's where self-awareness will make a difference.
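Concretely, a more self-aware lead agent would hand each sub-agent an explicit brief rather than a bare instruction. The shape below is my own sketch (the field names and example values are invented), but it captures what's usually missing: constraints, context, and a definition of done.

```python
from dataclasses import dataclass, field

@dataclass
class SubAgentBrief:
    goal: str                       # what "done" means, stated up front
    constraints: list[str]          # rules the sub-agent can't infer on its own
    relevant_files: list[str]       # context the lead agent has and the sub-agent doesn't
    verification: str               # how the sub-agent should close its own loop
    open_questions: list[str] = field(default_factory=list)

# Invented example values, purely for illustration.
brief = SubAgentBrief(
    goal="Add retry logic to the payment client without changing its public API",
    constraints=["Don't touch generated code", "Match the existing error-handling style"],
    relevant_files=["src/payments/client.ts", "src/payments/client.test.ts"],
    verification="Run the test suite and report its output, not just 'done'",
)
```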
What comes next is consistency. Self-awareness that's trained in. Loop-closing that's default behavior. And eventually, continuous learning: the inner loop catching mistakes within a task, the outer loop carrying lessons across sessions. When agents do all of this reliably (understand themselves, verify their work, remember what they learned), they start being collaborators we can trust.
Thanks for reading! If you have any questions or feedback, please let me know on Twitter or LinkedIn.