dangerouslyskip
Essay · Jun 20, 2026 · 9 min

Human in the Loop Is a Trap

Turn-taking is the rendering. Memory is the runtime. We solved multiplayer for the machines and left the humans alone in a corner.

We have spent two years teaching machines to work together, and somehow used the same two years to make humans work more alone than we ever have.

Sit with that, because it's genuinely strange. Go look at any modern agentic system and you'll find a small miracle of coordination: an orchestrator delegating to a swarm of agents, each handling its own slice, negotiating ambiguity, tolerating each other's latency, picking up where another left off, reconciling the mess into something coherent. The machines are multiplayer. We solved it. The hard, two-hundred-thousand-year-old problem of getting independent actors to collaborate without stepping on each other — we cracked a workable version of it for software in about eighteen months.

And then we sat exactly one lonely human in front of the whole apparatus, and handed them a chat box.

That's the asymmetry that should be keeping people up at night, and almost nobody's naming it. The agents got multiplayer. The human experience stayed resolutely single-player. From chat to the IDE, every primitive we built is single-player by default — one person, one session, one context window, copy-pasting fragments into Slack to let anyone else see what they're doing. We orchestrated the robots into a team and left the humans as fragmented individuals, each holding their own little candle, lighting their own little patch of dark. We painted ourselves into a corner, and we did it while congratulating ourselves on the orchestration.

Webb's near-miss: the turn is a red herring

The most instructive thing I've read on this is Matt Webb's piece on multiplayer turn-taking, precisely because it's a brilliant near-miss. Webb takes a chatroom with multiple humans and multiple bots and solves the question of who should speak next — a clever "enthusiasm" score, grounded in the actual social-science literature on conversational turn-taking. It's genuinely good engineering, and he's dead right about one thing: even a single human with a single AI is already a multiplayer situation, two actors sharing an environment.

But here's where I want to gently part ways with him, because the parting is the whole point. Webb did superb functional work on top of a structural mistake. He accepted the turn as the primitive — synchronous, sequential, who-speaks-next — and then did clever things inside it. That's optimising the square peg. Turn-taking was never the problem to solve. It's a symptom of accepting the wrong substrate, a red herring that feels like the hard problem because it's fiddly, while the actual hard problem sits one layer down, untouched.

Turn-taking is real — but only at the speech layer. The live, co-present, 200-millisecond-gap, gaze-and-prosody layer that Webb's own citations describe. That layer genuinely runs on turns. But that is not where the multiplayer-AI fight is. The fight is at the collaboration layer — teams, orgs, async, work that unfolds over hours and days — and there, humans flatly do not reason turn-by-turn. We never have. When you reason about a group project, you don't replay the conversation message by message. You query a model of it: who knows what, what's been decided, what's still open, who's good at which thing. We reason from memory, not from a transcript.

And here's the kill shot, and it's hiding in Webb's own solution: to decide who speaks next, his system has to constantly reconstruct a little state model of the room — who was addressed, who spoke recently, whether there's a sub-conversation in flight. Memory was doing the real work the whole time. He just kept it subordinate to the turn instead of inverting the hierarchy. Which gives you the line this entire essay turns on:

Turn-taking is the rendering. Memory is the runtime.

The turn is the UI. Memory is the database. Everyone keeps lovingly optimising the UI. (This is also, by the way, the deeper why underneath Chat Can't Be the Answer — chat is intrinsically turn-based, and turns are the wrong primitive for work. Same disease, different organ.)
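To make "memory was doing the real work" concrete, here is a minimal sketch of a who-speaks-next picker. It is not Webb's actual scoring: the `RoomState` fields, the weights and the `enthusiasm` function are all assumptions invented for illustration. The thing to notice is that the score is a pure read of stored state, so the turn falls out of memory rather than the other way round.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RoomState:
    """A tiny memory of the room. Every field is an illustrative assumption."""
    last_spoke: dict = field(default_factory=dict)   # participant -> turn index
    addressed: Optional[str] = None                  # who the latest message was aimed at
    open_thread: set = field(default_factory=set)    # members of an in-flight sub-conversation
    turn: int = 0

    def record(self, speaker: str, addressed: Optional[str] = None) -> None:
        """Write one message into memory."""
        self.turn += 1
        self.last_spoke[speaker] = self.turn
        self.addressed = addressed
        # A directed message opens a two-party thread; an undirected one closes it.
        self.open_thread = {speaker, addressed} if addressed is not None else set()


def enthusiasm(state: RoomState, who: str) -> float:
    """Hypothetical score with made-up weights. It only ever reads from state."""
    score = 0.0
    if state.addressed == who:
        score += 2.0          # directly addressed
    if who in state.open_thread:
        score += 1.0          # part of the live sub-conversation
    if state.last_spoke.get(who) == state.turn:
        score -= 3.0          # just spoke, so hold back
    return score


def next_speaker(state: RoomState, participants: list) -> str:
    """The 'turn' is rendered by querying memory, nothing more."""
    return max(participants, key=lambda p: enthusiasm(state, p))
```

After `state.record("alice", addressed="bot_a")`, calling `next_speaker(state, ["alice", "bot_a", "bot_b"])` picks `bot_a`, and it does so without ever replaying the transcript.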

Get this and the whole prescription falls out of it like a dropped stitch unravelling. Change the primitive from turns to memory, and watch what stops being hard.

What memory-as-primitive unlocks

Async stops being a feature and becomes the default. In a turn world, you have to be present to take your turn — that's the synchronous bottleneck baked straight into the primitive. In a memory world, every participant reads from and writes to shared state whenever, and an orchestrator reconciles. You don't add async. You get it for free the moment you stop building on turns.
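A rough sketch of what that looks like in code, under loud assumptions: `SharedMemory`, `Entry` and the last-write-wins `reconcile` are hypothetical names and a deliberately naive policy, standing in for whatever a real orchestrator would do. Nobody takes a turn here. Participants append whenever they like and catch up from wherever they last read.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    seq: int      # arrival order of the write
    author: str   # a human or an agent, the log does not care which
    key: str      # what the entry is about
    value: str


class SharedMemory:
    """Append-only shared state. A hypothetical stand-in, not a real product API."""

    def __init__(self) -> None:
        self._log = []
        self._seq = itertools.count(1)

    def write(self, author: str, key: str, value: str) -> Entry:
        """Anyone can write at any time; there is no turn to wait for."""
        entry = Entry(next(self._seq), author, key, value)
        self._log.append(entry)
        return entry

    def read_since(self, seq: int) -> list:
        """Catch up on everything written after the last entry you saw."""
        return [e for e in self._log if e.seq > seq]

    def reconcile(self) -> dict:
        """Naive reconciliation (assumed): the latest write per key wins."""
        view = {}
        for entry in self._log:
            view[entry.key] = entry
        return view
```

If an agent writes `db.choice = postgres` in the morning and a human overrides it with `sqlite` that afternoon, `reconcile()` returns the human's entry, author attached, and any agent that was away picks up the change through `read_since`.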

Which means you can finally treat the human as just another agent in the stack. I know how that sounds, so let me be precise, because the careless version of this claim is dehumanising and deserves the dunking it would get. This is not "humans are cogs." It's the opposite. Right now the human is the synchronous chokepoint — the one node everything blocks on, waiting for them to respond, approve, weigh in. Treating the human as an agent the orchestrator schedules around means the system absorbs human latency and ambiguity the way it already absorbs a slow tool-using agent's: it routes around you, proceeds with what it can, queues what genuinely needs you, and folds your judgment in when it lands. The human stops being the thing the whole pipeline stalls on, and finally gets the same first-class async treatment we lovingly engineered for the machines and then selfishly kept from ourselves. It's not a demotion to peer. It's a promotion out of bottleneck.
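Here is one way the "schedule around the human" idea could be sketched. The `Task` and `Orchestrator` classes are hypothetical and the scheduling is as simple as I could make it: run everything that is ready, park what needs a person in a queue, and resume dependents the moment the decision lands.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    needs_human: bool = False
    depends_on: tuple = ()


class Orchestrator:
    """Toy scheduler (assumed design): the human is a slow peer, not a gate."""

    def __init__(self, tasks) -> None:
        self.tasks = {t.name: t for t in tasks}
        self.done = {}          # task name -> result
        self.human_queue = []   # decisions waiting on a person

    def _ready(self, task: Task) -> bool:
        return task.name not in self.done and all(
            dep in self.done for dep in task.depends_on
        )

    def run_until_blocked(self) -> None:
        """Proceed with everything that does not need the human right now."""
        progressed = True
        while progressed:
            progressed = False
            for task in self.tasks.values():
                if not self._ready(task):
                    continue
                if task.needs_human:
                    # Queue it and keep going instead of stalling the pipeline.
                    if task.name not in self.human_queue:
                        self.human_queue.append(task.name)
                    continue
                self.done[task.name] = "agent"
                progressed = True

    def human_decides(self, name: str, decision: str) -> None:
        """Fold the human's judgment in when it lands, then resume."""
        self.human_queue.remove(name)
        self.done[name] = decision
        self.run_until_blocked()
```

With four tasks where only `migrate_schema` needs a person and `deploy` depends on it, the first pass finishes `lint` and `docs`, leaves exactly one item in `human_queue`, and `deploy` runs as soon as `human_decides` is called.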

And that's what springs the Human-in-the-Loop trap. Because "human in the loop," as practised, is a turn-based idea wearing a compliance lanyard. The human is a mandatory turn the pipeline must block on — and worse, it's usually a context-poor human being asked to grade the output of context-rich agents. Think about how backwards that is. The agent has read the whole codebase, the whole ticket history, the whole blast radius; the human glances at a diff for eight seconds and clicks approve. That's not oversight. That's theatre — latency and false assurance dressed up as control. (It is the exact cousin of the approve-approve-approve button I went after in Kenny Loggins Mode: a thing that made us feel safe while doing nothing.)

The human's real edge was never grading. It's deep domain judgment on the few things that genuinely need it — the call only someone with the scars can make. So the move isn't "human in the loop" everywhere. It's "human in the right loop": the orchestrator routes the specific decisions where human expertise actually dominates to the human, and lets the agents own the rest.

Let me concede the obvious objection before someone hands it to me, because it's a real one: in regulated, high-stakes, liability-bearing work you cannot simply delete the human gate. Accountability is not optional, and I'm not pretending it is. But that's an argument against blanket, low-context grading, not against human oversight. Keep humans for genuine judgment and genuine accountability. Stop wasting them as rubber stamps. And note that the machinery to do this safely is buildable, because I've built it — Echo earns the right to auto-approve the low-risk work so humans only see what deserves a human, and Governor lets every team's policy travel with the work as code, asserted in CI, instead of as a meeting the pipeline waits on. The answer to "but who's accountable?" isn't a mandatory human turn. It's policy-as-code plus earned, inspectable autonomy. The guardrails replace the tollbooth.
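For a feel of what policy-as-code plus earned autonomy might look like, here is a minimal sketch. To be clear, this is not how Echo or Governor are implemented. The `Change`, `Policy` and `route` names, the fields and every threshold are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    area: str                     # e.g. "docs" or "billing"
    lines_changed: int
    touches_secrets: bool = False


@dataclass(frozen=True)
class Policy:
    """A team's rules as data, so they can travel with the work (assumed shape)."""
    human_required_areas: frozenset
    max_auto_lines: int
    min_clean_streak: int   # clean auto-approvals needed before autonomy is earned


def route(change: Change, policy: Policy, clean_streak: int) -> str:
    """Return "human" or "auto". All thresholds are illustrative assumptions."""
    if change.touches_secrets or change.area in policy.human_required_areas:
        return "human"      # accountability-bearing work always gets a person
    if change.lines_changed > policy.max_auto_lines:
        return "human"      # too large to wave through
    if clean_streak < policy.min_clean_streak:
        return "human"      # autonomy has not been earned yet
    return "auto"


# The same policy can be asserted in CI as ordinary tests.
policy = Policy(frozenset({"billing", "auth"}), max_auto_lines=50, min_clean_streak=20)
assert route(Change("docs", 10), policy, clean_streak=25) == "auto"
assert route(Change("billing", 2), policy, clean_streak=100) == "human"
```

The design choice worth noting is the order of the checks: the rules that carry accountability come first and cannot be overridden by a good track record, while the track record only unlocks the low-risk remainder.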

From hunger games to a three-way powerhouse

Here's why this matters beyond the architecture diagram. The current enterprise conversation about AI offers you exactly two games, and both are zero-sum. There's human-versus-human — the old siloed rivalry, who ships more, who gets the promotion. And there's the new one everyone's actually frightened of: human-versus-agent, the replacement hunger games, who gets automated away. Both are losing games. Both are what you get when the primitive is the lonely turn and the only question is who takes it.

Change the substrate to shared memory and a third game appears, and it's the only positive-sum one on the board: human-human-agent, collaborating as a three-way knowledge powerhouse. That means human domain judgment, human-to-human trust and context, and agent breadth and tirelessness, all reading and writing the same shared state. And this isn't wishful thinking. It's the one place the data is already in: a 2025 Harvard Business School field experiment with 776 professionals found that teams working with AI were the most likely to produce top-tier work, ahead of both individuals using AI and teams working without it. The three-way beats both binaries. We have the receipt. We're just still building for the binaries.

The primitives, briefly, because I'd rather show than sell

None of this is hypothetical hand-waving, and I'm wary of turning it into a brochure, so I'll be quick. If memory is the runtime, you need real shared memory — which is why I built Rekall (personal, hive and agent memory, with provenance, so institutional knowledge stops walking out the door with every resignation) and Skynet (shared sight across the whole codebase, so the context is ambient instead of copy-pasted). You need to shift the org's input left, async, into the moment of work rather than a downstream review — which is Governor. And you need an orchestration layer that treats a human as a schedulable peer rather than a blocking gate. I've built versions of each, which is mostly how I know they're buildable rather than just bloggable. The shared-memory part isn't the novel idea — vendors are already selling it. The novel idea is that memory isn't a feature of multiplayer. It's the primitive. It's the thing that replaces the turn.

Let the humans back in the room

So here's where I land. We did something genuinely remarkable: we taught the machines to work together. And then we used that breakthrough to seat each human alone at a chat box, taking turns with a robot, blocking the whole pipeline every time they look away.

The frontier from here isn't a smarter agent. It's letting the humans back into the room — as peers in the same orchestrated, memory-backed system, not lonely operators standing outside it feeding it prompts one turn at a time. Stop optimising the rendering. Build for the runtime.

Turn-taking is how it looks. Memory is how it works. Build for how it works, and the humans get to be a team again — this time with the machines in the room.

Grab a weapon from the cornucopia, kids. Welcome to the 101st Hunger Games.
