Loop Engineering · AI Agents · 2026
The term is hype. The change underneath is real. Here you see it running: the genealogy, the one thing that changed, and seven agents looping in production.
The honest verdict
Some tell you loop engineering is the future of programming. Others, that it's pure repackaged hype. Both are half right: the term is marketing, but underneath there's a change that is real. The difference is that neither side shows it to you working. Here it does.
They slapped "engineering" onto something we've been doing for years. Let's go one by one.
1970s → cron jobs (run this on a schedule)
2022 → ReAct (reason + act, in a cycle)
2025 → Ralph Wiggum (while loop + one prompt)
2026 → "Loop Engineering" ← the new name
1970s · Cron jobs
"Run this task on a schedule." The first loop, older than most of us. Scheduled automation, decades before the term.
2022 · ReAct (Yao et al.)
Reason and act in a cycle: the model thinks, acts, observes the result and thinks again. That loop is, literally, the heart of what they now call loop engineering. Almost four years earlier.
July 2025 · Geoffrey Huntley
"Ralph Wiggum as a software engineer." An infinite while loop that grabs a text file, hands it to the agent and starts over. As dumb as Ralph — it doesn't think, it just repeats. And it still worked.
June 2026 · "Loop Engineering"
The new name. The same while loop from a year ago, repackaged as "the new paradigm." What actually changed isn't the loop: it's the model running inside it.
Ralph's original loop (Huntley, 2025)
while :; do cat PROMPT.md | npx --yes @sourcegraph/amp ; done
That's all: an infinite while that grabs a text file, hands it to the agent and starts over. Forever. That was almost a year ago. And that's exactly what they're selling you this week as "the new paradigm."
The while loop has existed for years. So why did it explode right now? Because until recently, the loop was useless.
THE SAME LOOP, TWO RESULTS
2023 model:                    2026 model:
you hand it the error          you hand it the error
→ can't use it                 → understands and fixes it
→ drifts off                   → progresses toward the goal
→ burns tokens, never ends     → keeps going until verification says "done" ✅
This generation of models, for the first time, grabs a "this is wrong" signal and, instead of derailing, corrects toward the goal, step after step, hundreds of times. Careful: they still don't know on their own when they're done. What changed is that they now know how to use a verification to move forward.
The day the model knows how to move forward with a signal, the bottleneck stops being the model and becomes you. You stop babysitting it and copying errors by hand. You go from writing the prompt to designing the system that writes the prompt for you.
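"The system that writes the prompt for you" can be sketched as a function that folds the last verification failure into the next prompt. This is a minimal illustration, not anyone's actual tooling: `build_prompt` and its wording are hypothetical names and text chosen for the example.

```python
from typing import Optional


def build_prompt(goal: str, last_error: Optional[str] = None) -> str:
    """Assemble the next prompt from the goal plus the latest failure signal.

    Hypothetical helper: the loop calls this instead of a human
    copying the error and pasting it back by hand.
    """
    lines = [f"Goal: {goal}"]
    if last_error:
        # The "this is wrong" signal becomes part of the next pass.
        lines.append("The previous attempt failed verification with:")
        lines.append(last_error.strip())
        lines.append("Fix the cause of this failure and try again.")
    return "\n".join(lines)
```

On the first pass there is no error, so the prompt is only the goal. On every later pass the loop, not you, carries the failure forward.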
One single thing turns a dumb while loop into an engineering loop: verification.
Dumb loop (Ralph)
while true:
    do the prompt thing
    repeat
It doesn't know if it finished. It doesn't know how to stop. It spins forever, burning tokens. Without verification you didn't build a loop: you built a very confident token-burning furnace.
Engineering loop
while NOT meets_goal:
    look at the state
    decide and act
    VERIFY the result
    met it? → stop
It has a verifiable exit condition. It knows when to stop. That verification is the exact line between hype and engineering.
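The engineering loop above can be sketched in runnable form. This is a toy under stated assumptions: `engineering_loop`, `ToyAgent` and `make_verifier` are hypothetical names, and the "agent" is a number guesser standing in for a model, so the only thing shown is the act, verify, stop structure.

```python
from typing import Callable, Optional, Tuple

Verdict = Tuple[bool, str]  # (goal met?, feedback for the next pass)


def engineering_loop(act: Callable[[Optional[str]], int],
                     verify: Callable[[int], Verdict],
                     max_iterations: int = 20) -> Tuple[int, int]:
    """Act, verify, and stop as soon as verification says the goal is met."""
    feedback: Optional[str] = None
    for iteration in range(1, max_iterations + 1):
        result = act(feedback)            # decide and act, using the last signal
        done, feedback = verify(result)   # VERIFY the result
        if done:
            return result, iteration      # met it: stop
    raise RuntimeError("iteration limit reached without meeting the goal")


class ToyAgent:
    """Hypothetical stand-in for a model: narrows a guess using feedback."""

    def __init__(self) -> None:
        self.low, self.high = 0, 100
        self.guess = 0

    def __call__(self, feedback: Optional[str]) -> int:
        if feedback == "too low":
            self.low = self.guess + 1
        elif feedback == "too high":
            self.high = self.guess - 1
        self.guess = (self.low + self.high) // 2
        return self.guess


def make_verifier(target: int) -> Callable[[int], Verdict]:
    """Build a verifier that knows the goal and reports how the result misses it."""
    def verify(result: int) -> Verdict:
        if result == target:
            return True, "done"
        return False, ("too low" if result < target else "too high")
    return verify
```

With a target of 37 the toy agent guesses 50, 24, then 37, and the loop returns `(37, 3)`. Remove the verifier's feedback and the same agent would guess 50 forever.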
The exit condition can take two forms. Both work; the trap comes after.
It either passes or it doesn't, no middle ground. The tests pass, CI is green, the build compiles. The loop runs until that's met, and then it stops. No discussion.
Another agent reviews the work and says "this still looks off, keep going." The one that builds isn't the one that judges. It's inferential verification.
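The two forms of exit condition can be sketched side by side. The function names are hypothetical, the "approve" convention is an assumption made for the example, and `toy_judge` is a trivial stand-in for a reviewer agent, not a real model call.

```python
import subprocess
import sys
from typing import Callable, List


def deterministic_check(command: List[str]) -> bool:
    """Binary verdict: True only if the command exits with code 0.

    In practice the command would be a test runner or a build;
    which one is up to the project.
    """
    return subprocess.run(command, capture_output=True).returncode == 0


def inferential_check(work: str, judge: Callable[[str], str]) -> bool:
    """A separate reviewer reads the work and returns a written verdict.

    Assumption for this sketch: a verdict starting with "approve" passes.
    """
    verdict = judge(work)
    return verdict.strip().lower().startswith("approve")


def toy_judge(work: str) -> str:
    """Hypothetical reviewer: rejects anything that still contains a TODO."""
    if "TODO" in work:
        return "keep going: unfinished TODO found"
    return "approve"
```

For example, `deterministic_check([sys.executable, "-c", "raise SystemExit(1)"])` is `False` because the process exits non-zero, while `inferential_check("def f(): return 1", toy_judge)` is `True`. The first form leaves no room for interpretation; the second is only as good as the judge.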
The trap · the sharpest critique (and it's right)
If your exit condition is "the tests pass," the loop hands you something that passes the tests. But green doesn't mean good: not maintainable, not secure, not fast, not understandable in six months. The loop finds one way that works today, and "works today" isn't "good software." The real engineering work isn't writing the while — it's designing a verification that captures real quality. Nobody has that solved yet. It's the frontier.
Addy Osmani (Google) broke it down clearly. A working loop needs five pieces — plus a sixth that holds it up over time.
Automations
The loop wakes itself up: an event or a schedule triggers it, without you launching it by hand.
Worktrees
Several agents in parallel, each in its own copy, without stepping on each other's work.
Skills
The project rules written once, so the agent doesn't have to guess on every pass.
Connectors
Real tools wired to the world: GitHub, Slack, a database. The loop acts, it doesn't just talk.
Sub-agents
The one that builds isn't the same as the one that judges. One agent does the work, another reviews it.
Memory
The model forgets between sessions; the loop doesn't. It leaves something written that survives, so today's work feeds tomorrow's without you remembering it.
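The memory piece, "something written that survives", can be sketched as notes kept in a file between runs. This is an assumption-heavy illustration: the JSON format, the `memory.json` file name and the helper names are all hypothetical, not how any particular agent stores its memory.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def load_notes(path: Path) -> List[str]:
    """Read what earlier sessions left behind (empty on the first run)."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return []


def save_note(path: Path, note: str) -> List[str]:
    """Append one lesson from this session so the next one starts with it."""
    notes = load_notes(path)
    notes.append(note)
    path.write_text(json.dumps(notes, indent=2), encoding="utf-8")
    return notes


def demo() -> List[str]:
    """Simulate two sessions sharing one memory file in a temp directory."""
    with tempfile.TemporaryDirectory() as folder:
        memory = Path(folder) / "memory.json"   # hypothetical file name
        save_note(memory, "session 1: run the linter before the tests")
        save_note(memory, "session 2: the staging database is read-only")
        return load_notes(memory)
```

Each session starts by calling `load_notes`, so what yesterday's pass wrote down is in today's context without anyone retyping it.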
They're easy to confuse. The loop isn't "part" of the harness: they're two distinct layers, on two distinct axes.
HARNESS → what the agent has on ONE pass
(tools, permissions, secrets, context)
LOOP → how that pass REPEATS over time
(trigger, verification, stop condition)
The loop is the outer while.
The harness is the workshop: the tools, the keys, the safety rules, everything the agent has at hand on each pass.
The loop is the act of working in that workshop over and over until done, knowing when to stop. The loop runs the harness over time: it sits one floor above.
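The two layers can be sketched as two separate pieces of code: a `Harness` object that describes one pass, and a `loop` function that repeats that pass. Both names, the toy tools and the permission check are hypothetical, chosen only to show that the loop never looks inside the harness.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple


@dataclass
class Harness:
    """What the agent has on ONE pass: tools, permissions, context."""
    tools: Dict[str, Callable[[str], str]]
    permissions: Set[str] = field(default_factory=set)
    context: str = ""

    def run_pass(self, work: str) -> str:
        # Apply only the tools the permissions allow, in insertion order.
        for name, tool in self.tools.items():
            if name in self.permissions:
                work = tool(work)
        return work


def loop(harness: Harness, work: str,
         verify: Callable[[str], bool],
         max_passes: int = 5) -> Tuple[str, int]:
    """How that pass REPEATS over time: verification and a stop condition."""
    for passes in range(1, max_passes + 1):
        work = harness.run_pass(work)
        if verify(work):
            return work, passes
    raise RuntimeError("pass limit reached without meeting the goal")


# Toy setup: one permitted tool, one tool the harness refuses to run.
toy_harness = Harness(
    tools={"append_x": lambda s: s + "x", "delete_all": lambda s: ""},
    permissions={"append_x"},
)
```

`loop(toy_harness, "", lambda s: len(s) >= 3)` returns `("xxx", 3)`: three passes through the same harness, and `delete_all` never runs because it is not permitted. Swapping the harness changes what a pass can do; swapping the loop changes when it stops.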
Where does the harness come from? I took it apart in another video: Harness Engineering.
A concept you don't see running is exactly the same as a concept that doesn't exist. Three levels, from lower to higher stakes.
A single line: run the agent over and over until the test passes. It fails, fixes, fails, fixes, and stops at the exact moment it's ready. I don't appear anywhere in the loop. The cameo: the same loop without the test goes blind — it says "all done" forever. The test is the line between hype and engineering.
Hermes runs as a continuous process, always on. Its trigger isn't during the conversation: it's when you close it. On exit, a subagent reviews the whole session, finds a pattern worth keeping and creates a new skill. By itself. That skill persists and loads itself tomorrow: that's the sixth piece, memory.
Seven agents on Google Cloud Run, each with its own URL, talking over the A2A protocol. Each agent is a loop; the orchestrator is a loop of loops: it calls one, waits for its result, passes it to the next. Context travels from loop to loop, only with messages. A real loop, at scale.
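The "loop of loops" shape can be sketched with plain function calls. This does not use the A2A protocol or Cloud Run: the dictionary messages, `make_agent` and `orchestrate` are hypothetical stand-ins that only show each agent looping on its own while the orchestrator passes one message from loop to loop.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def make_agent(name: str,
               step: Callable[[str], str],
               done: Callable[[str], bool],
               max_steps: int = 10) -> Callable[[Message], Message]:
    """Build an agent that loops on its own task until its own check passes."""
    def agent(message: Message) -> Message:
        text = message["text"]
        for _ in range(max_steps):
            if done(text):
                return {"from": name, "text": text}
            text = step(text)
        raise RuntimeError(f"{name} hit its step limit")
    return agent


def orchestrate(agents: List[Callable[[Message], Message]],
                first_message: Message) -> Message:
    """Outer loop: call one agent, wait for its result, pass it to the next."""
    message = first_message
    for agent in agents:
        message = agent(message)   # context travels only inside the message
    return message


# Two toy agents standing in for real ones.
pad = make_agent("pad", lambda t: t + "!", lambda t: len(t) >= 5)
upper = make_agent("upper", str.upper, lambda t: t.isupper())
```

`orchestrate([pad, upper], {"from": "user", "text": "hi"})` returns `{"from": "upper", "text": "HI!!!"}`. The second agent never sees how the first one worked, only the message it produced.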
After building all this and showing it running, the honest answer.
The term is hype
It's a trendy name for a while loop with verification, an idea that's years old. If someone tells you this was invented this week, they're selling you something.
The change underneath is real
For the first time the model knows how to use a verification signal to move forward instead of derailing. That changes your role: from writing prompts to designing the loops that write them. That's already happening.
The loop shines
Long, repetitive and verifiable tasks: run the tests, fix the lint, close the ticket. If you can write the "this is done" condition, you have a loop.
The loop is dangerous
Fuzzy exploratory work where not even you know what the end looks like. The loop optimizes toward your vague description with blind confidence. If you can't write the "done," you don't have a loop: you have a wish.
The reminder that lands
The loop speeds up your work. It doesn't replace your judgment. And it costs: a loop burns several times more tokens than a single prompt, and in multi-agent setups much more. With no exit condition, no budget and no iteration limit, you pay and get nothing back. The day you hand off the goal without knowing exactly what you're verifying, you didn't build an autonomous system: you built an expensive way to generate code nobody reviewed. The judgment is yours. That's the part no hype will give you.
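The three guards named above (exit condition, budget, iteration limit) can be sketched in one function. The names are hypothetical and the token counts are made-up numbers reported by a stub; a real loop would read usage from whatever API it calls.

```python
from typing import Callable, Dict, Tuple, Union

Report = Dict[str, Union[str, int]]


def run_with_budget(step: Callable[[], Tuple[str, int]],
                    verify: Callable[[str], bool],
                    token_budget: int,
                    max_iterations: int) -> Report:
    """Stop on success, on an exhausted token budget, or on the iteration cap."""
    spent = 0
    for iteration in range(1, max_iterations + 1):
        result, tokens = step()        # one pass and what it cost
        spent += tokens
        if verify(result):             # exit condition
            return {"status": "done", "iterations": iteration, "tokens": spent}
        if spent >= token_budget:      # budget
            return {"status": "budget_exhausted",
                    "iterations": iteration, "tokens": spent}
    # Iteration limit reached without success or budget exhaustion.
    return {"status": "iteration_limit",
            "iterations": max_iterations, "tokens": spent}


def stub_step() -> Tuple[str, int]:
    """Hypothetical pass that never succeeds and costs 1000 tokens each time."""
    return "draft", 1000
```

With `token_budget=3500` and a verifier that never passes, `run_with_budget(stub_step, lambda r: False, 3500, 10)` stops after four passes and 4000 tokens with status `budget_exhausted`, instead of running until the bill arrives.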
The primary sources cited in the video.
Geoffrey Huntley · July 2025
The original while loop: a text file with instructions, handed to the agent, in an infinite loop. The starting point of all this.
Yao et al. (Princeton + Google) · 2022
The research that defined the reason + act cycle. The academic heart of the loop, almost four years before it had a trendy name.
Addy Osmani (Google) · June 2026
Breaks a working loop into five pieces: automations, worktrees, skills, connectors and sub-agents. Plus memory, because the model forgets but the loop doesn't.
Peter Steinberger · 2026
Creator of OpenClaw. The triggering tweet: you no longer prompt your agents — you design the loops that prompt them.
Boris Cherny · 2026
Creator of Claude Code at Anthropic. His job is no longer writing prompts — it's writing loops that run on their own.
Lance Martin · 2026
Not everything deserves a loop. On one end, small verifiable tasks where the loop shines. On the other, fuzzy exploratory work where the loop burns tokens with blind confidence.
Every piece of the loop, already shown running in another video. These are the ones in the clips.
The workshop where the loop works: everything the agent has on each pass. The loop sits one floor above.
The reviewer in action: an agent that judges another's work and sends it back to fix. Inferential verification.
The 7 agents that delegate tasks in production, each running its own loop.
Skills as project rules: the third piece of a loop, so the agent doesn't guess every time.
Memory that survives between sessions: today's work feeds tomorrow's, without you remembering it.
How each agent is built before it runs inside a loop coordinated by A2A.
The essentials about loop engineering.
It's designing the system that iterates for you: an agent that reasons, acts, verifies the result and repeats until it meets a verifiable goal, without you copying errors and pasting them by hand. The term is new (June 2026), but the idea comes from the ReAct research (2022) and the Ralph Wiggum while loop (2025).
Almost, and that's the trap. The Ralph loop is an infinite while with no verification: it does the work and never knows it finished, so it burns tokens forever. Loop engineering adds a verifiable exit condition — the loop knows when to stop. That verification is the only difference between a token-burning furnace and a system that runs itself.
The loop didn't change: the model running inside it did. Before, if you put a model in an infinite while, it derailed — you'd hand it the error to its face and it couldn't use it. This generation of models, for the first time, grabs a "this is wrong" signal and corrects toward the goal instead of getting lost. The loop existed, but the model didn't know how to walk inside it.
They're two distinct layers, on two distinct axes. The harness is what the agent has on a single pass: tools, permissions, secrets, context. The loop is how that pass repeats over time: trigger, verification, stop condition. The loop is the outer while; the harness is everything the agent uses inside each pass. The loop runs the harness over time — it sits one floor above.
Lance Martin frames it as a spectrum. On one end, long, repetitive and verifiable tasks — run the tests, fix the lint, close the ticket — that's where the loop shines. On the other, exploratory work where not even you know what the final result looks like: there the loop optimizes toward your vague description with total confidence. If you can't write the "this is done" condition, you don't have a loop yet: you have a wish, and a prompt is still cheaper.
Yes, quite a bit more. A loop can burn four or five times more tokens than a normal prompt, and in multi-agent setups much more. If you leave it with no exit condition, no budget and no iteration limit, there are two endings: an ugly API bill, or running out of quota mid-week. The loop speeds up your work; it doesn't replace your judgment.
No one invented it this week, even if that's how it's sold. The original while loop is from Geoffrey Huntley ("Ralph Wiggum as a software engineer", July 2025). Addy Osmani (Google) broke it into five blocks. Peter Steinberger and Boris Cherny popularized it from OpenAI and Anthropic. And the academic root is ReAct (Yao et al., 2022).
YouTube channel
@NicolasNeiraGarcia
ADK · A2A · Claude Code · Automation · Infrastructure