dangerouslyskip
Essay·Jun 17, 2026·11 min

Marking My Own Homework

The 2026 AI predictions nobody asked for — graded at the halfway line, no curve.

Back on January 1 I posted a pile of predictions nobody asked for. The thing about predictions is that everyone makes them and almost nobody goes back to check, because going back to check is how you find out you were an idiot. So this is me, halfway through the year, going back to check. Lights on. No grading on a curve. I'm a Walt-Disney-level techno-optimist with a doomsday clock in my back pocket, so read the grades in both registers — giddy and faintly terrified, usually at once.

A prediction with no scorecard is just a horoscope for engineers. Here's the scorecard.

The rubric, because honesty needs one:

  • CALLED IT — happened, clearly, I'll take the points.
  • TRACKING — heading exactly where I said, not fully landed yet.
  • HALF RIGHT — right instinct, reality got messier.
  • TOO EARLY — the back half of the year hasn't happened. Ask me in December.
  • UNGRADED — I didn't do the homework to mark this fairly, so I won't bluff it.

The big one: put the wizard behind the curtain

I said: AI becomes a hidden worker — it does things in the background, without invocation, because that's where we like our wizards.

CALLED IT, and it's the one I'm smuggest about, because the whole market quietly agreed with me. The winning move in 2026 turned out to be invisible AI — the smart brands putting it in their decisions, not their content. Nobody wants to watch the magician work the levers; it ruins the trick. Hide the machinery and the thing feels more magical, not less. (I liked this idea enough that it got its own essay. It's load-bearing.)

I said: there'd be an AI backlash and a hand-made revival — and enterprise wouldn't care and would steamroll ahead anyway. The luddites lose; the horse has bolted.

CALLED IT. "AI slop" became the insult of the year. There's a measurable hand-made premium now: agencies charging a multiple for "human-crafted," a major broadcaster slapping "guaranteed human" on its output, Wikipedia editors banning LLM-written articles, "made by humans" badges everywhere, most people telling Gartner they actively distrust AI search summaries. Meanwhile enterprise gen-AI spend more than tripled in a year and didn't blink. The luddites are loud and losing, exactly as called. The kicker: the smartest response brands found to the backlash was to hide the AI, so the anti-AI movement and the wizard-behind-the-curtain prediction turned out to be the same prediction wearing two hats.

I said: "good enough" eats the world, and the hunt for perfect dies — the 1%-better principle.

TRACKING. Price wars, commodity models, and Apple straight-up buying a model instead of building one. "Cheaper and now" is beating "perfect and eventually" everywhere you look.


The lab wars

I said: Apple stays mediocre at AI, doubles down on hardware, and it's completely fine, because models are a commodity and they'll win the use, not the model.

CALLED IT — almost uncomfortably so. WWDC landed in June with the big Siri overhaul running on Google's Gemini, reportedly for around a billion dollars a year. Read that twice: Apple outsourced the brain and kept the integration. That is the commodity-model thesis with a ribbon on it. They wrapped it in a system orchestrator, leaned hard on privacy and personal context, dangled foldable hardware, and Federighi took a public swing at rivals chasing AI for its own sake. Hardware double-down, commodity model underneath, invisible enablement on top. Everyone scored it a miss. It wasn't.

I said: Google wins through vertical integration, won't sweep enterprise because the wrong people make the calls, and you should buy the stock — the rare all-three, don't-pick-two outcome.

TRACKING, leaning CALLED IT. Google opened a price war at I/O, committed an eye-watering capex number, rebranded Vertex into a Gemini enterprise agent platform, threw a partner fund at distribution, and is now the brain inside Siri. Better, faster and cheaper — the all-three outcome that's brutal to run against. And the nuance held beautifully: they aren't sweeping enterprise. Anthropic leads enterprise LLM spend at roughly 40% while OpenAI slid to about 27%, which is precisely why Google is throwing money at the gap. Vertical integration winning the war; enterprise still a knife fight. Buy the stock still looks right.

I said: Microsoft and Amazon keep faltering because generic is a trap — if it does everything mediocre it does nothing well. Amazon should sell picks and shovels; Microsoft should hang better drapes and get the wizard a bigger smoke machine.

TRACKING — with an honest hedge, because I leaned on the read here rather than stress-testing Redmond and Seattle line by line. But the tension I named — generic-everything versus the specificity people actually want — is still the live critique, and standalone Copilot share is a rounding error next to its embedded footprint. The picks-and-shovels lane (the Vercels, Supabases, RunPods) is exactly where the quieter money is being made.

I said: OpenAI slides to ~50% share, with Claude and Google taking most of the rest.

DEPENDS WHICH SCOREBOARD — and on the one that pays the bills, CALLED IT. Enterprise/API: OpenAI's already past my number, down near 27%, Anthropic around 40%, Claude and Google genuinely making up most of the rest. Consumer chatbot: ChatGPT's still about 60%, down from a near-monopoly 87% a year ago — hurtling toward 50, not there yet. I was grading the enterprise board. There, it's in.
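
For a rough sense of "hurtling toward 50", here is a back-of-envelope sketch in Python. It assumes the consumer share keeps falling in a straight line at the pace implied by the two figures above (87% a year ago, about 60% now). That straight line is an assumption made purely for illustration, not a forecast, and `months_to_target` is a hypothetical helper name.

```python
def months_to_target(start: float, now: float, target: float,
                     elapsed_months: float) -> float:
    """Months until `target` is reached if the decline stays linear.

    ASSUMPTION: share keeps dropping at the average monthly rate
    observed between `start` and `now`. Real share curves rarely do.
    """
    rate_per_month = (start - now) / elapsed_months  # points lost per month
    if rate_per_month <= 0:
        raise ValueError("share is not declining; target is never reached")
    return (now - target) / rate_per_month


# Figures quoted in the text: 87% a year ago, about 60% now, 50% predicted.
remaining = months_to_target(start=87, now=60, target=50, elapsed_months=12)
print(f"about {remaining:.1f} more months at the current pace")
```

On that assumption the 50% mark is roughly four and a half months out. A decline that slows as it goes would push the date later.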

I said: the models plateau on capability and compete on cost; gains come from the harness, not the model; world models are the next real leap.

TRACKING. The headlines went from "smarter" to "cheaper and faster" right on cue, and world models are very much the thing being pointed at as the next capability jump rather than a bigger LLM.


The build pipeline (home turf)

I said: AI ships code faster than humans can review it — unchecked agentic PRs go mainstream, and we start reviewing the reviews.

TRACKING, leaning CALLED IT. AI-assisted devs are committing at three-to-four times the old rate, and agentic tools now account for a meaningful slice of all public commits. The firehose is real; humans can't read it by hand. The "reviewing the reviews" layer is still forming — which is the entire reason I built Echo, a 7-agent PR reviewer that has to earn what it's allowed to auto-approve. Watch this space; it's where the back half of the year lives.

I said: "100% AI-written codebase" announcements go from confession to flex.

TRACKING. On schedule, and getting smug about it.

I said: AI-powered QA lags.

TRACKING. Coding is roughly 16% of an engineer's day. Pour rocket fuel on that 16% and the other 84% — reviews, releases, QA — becomes the jam. The commit rate rocketed; the review-and-test rate didn't. QA is the bottleneck the whole industry's only now noticing. It isn't that AI is broken. The plan is.
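
The 16% figure can be turned into a quick Amdahl's-law estimate. The sketch below assumes only the coding slice gets faster and everything else (reviews, releases, QA) stays at its old speed. Treating the three-to-four-times commit rate mentioned earlier as the coding speedup is also an assumption, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(coding_share: float, coding_speedup: float) -> float:
    """Amdahl's-law estimate of the whole-day speedup.

    coding_share:   fraction of the day spent coding (0.16 in the text)
    coding_speedup: how much faster that fraction becomes
    ASSUMPTION: the rest of the day does not speed up at all.
    """
    new_time = (1.0 - coding_share) + coding_share / coding_speedup
    return 1.0 / new_time


CODING_SHARE = 0.16  # "roughly 16% of an engineer's day"

# 3x and 4x mirror the commit-rate figures; infinity is the ceiling case
# where coding takes no time at all.
for factor in (3.0, 4.0, float("inf")):
    gain = overall_speedup(CODING_SHARE, factor)
    print(f"coding {factor}x faster -> whole day {gain:.2f}x faster")
```

Under those assumptions a 3x to 4x coding speedup is worth only about 1.12x to 1.14x overall, and even infinitely fast coding tops out near 1.19x. The rest of the gain has to come from the other 84%.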

I said: agent memory becomes the hot topic — context memory and execution memory.

CALLED IT, and from an angle I didn't fully price in: memory poisoning is now a named attack class. Persistent agent memory turned out to be the hottest capability and a fresh security hole at the same time. Remembering you liked pink and forgetting you changed your mind at 3am is now, officially, a bug genre.

I said: agents get trivially easy to build, people won't know they're running multi-agent systems, and the marketplaces and fragmentation accrue as enterprise debt.

TRACKING. Skill and MCP marketplaces are multiplying, "shadow AI" is the new shadow IT, and the sprawl is quietly piling up exactly as flagged. The bill comes later, as bills do.

I said: the subagent and MCP marketplace gets monetised.

TRACKING (early). Where there's sprawl there's a storefront — it's nascent but coming. The marketplaces are even getting abused before they're properly monetised, which is its own kind of milestone.

I said: in the back half of the year, focus shifts off engineering tools toward the product side and post-engineering work (releases, infra, cyber).

TOO EARLY. It's June. Ask me in December. Early smoke, though: cyber's already a bonfire, so part of this may be arriving ahead of schedule.


Cyber: the red-hot nightmare mess

I said: cybersecurity is a red-hot nightmare mess.

CALLED IT, emphatically — possibly an understatement. "Vibe-hacking" is a term of art now. LLM-built malware went from proof-of-concept to in-the-wild. Malicious packages in public repos exploded into the hundreds of thousands. Time-to-exploit collapsed from roughly two years to under two months. Agentic AI is being called the attack surface of the year, unvetted MCP servers and all. And the OpenClaw mess — hijackable local agents, a skills hub shipping infostealers — is exhibit A for the thing I keep banging on about: exposed machinery isn't just ugly UX, it's an incident waiting to happen. The wizard behind the curtain isn't only more magical. He's safer.


The physical world

I said: robots get home-ish ready.

CALLED IT, and the "ish" earned its keep. One outfit's taking home pre-orders by the thousand, another's showing its humanoid folding laundry and loading dishwashers, you can put a humanoid in your cart for under twenty grand, and Tesla's targeting consumer Optimus by year-end. Not in the living room yet — but unmistakably home-shaped and pre-orderable. Home-ish.

I said: robotics advances are rapid.

CALLED IT. The tell of the year: a leading humanoid outfit ripping out something like 109,000 lines of hand-tuned balance code and replacing it with a small neural net. That's not demo polish. That's embodied AI eating the stack.

I said: fusion advances are rapid.

TRACKING, with a caveat. Plasma-confinement records tumbling, multiple credible demo reactors on track, billions in private money piling in. Genuinely rapid advances — but nobody's put net electricity on a grid this year, so file under "accelerating," not "solved." I said rapid, not done. Fair points, honest asterisk.

I said: quantum advances are rapid.

UNGRADED. Not bluffing a grade I didn't earn — I haven't done the quantum legwork this half. Flagging it honestly. Mark me down for the cowardice if you like.

I said: a huge wave of AI-driven medicine, off the back of sequencing and an unprecedented grip on protein folds and binding sites.

CALLED IT, and the binding-sites framing was the bullseye. The standout: a drug-design engine predicting a hidden, cryptic binding pocket from sequence alone. Add paired protein language models and de-novo binder design and the medicine flywheel is properly spinning. The flashy bit was never going to be a chatbot. It was always going to be systems finding druggable pockets we didn't know were there.


The consumer edge

I said: world models become something Joe User has heard of.

TRACKING. They went consumer this half — Google shipped one as a product, a couple of labs dropped theirs, and a certain ex-Meta luminary left to start a world-model lab at a frankly silly valuation. "Joe User knows the phrase" is getting there: the enthusiasts are all over it, the masses are nearly.

I said: AI-native products and wearables reach consumers in Q1.

HALF RIGHT. Smart glasses are a real, growing consumer category — one player owns most of it and has shipped millions. But the dedicated AI-native wave (the pendants, the standalone gadgets) is still troubled; the early flame-outs spooked everyone and the serious push is back-half and 2027. Right about glasses, early on the rest.


The money and the mess

I said: enterprise mimics the wrong things — outsized layoffs without the outsized AI adoption to justify them.

TRACKING. The "we cut headcount because… AI" announcement, conspicuously light on actual deployment receipts, is now a genre. This is the Peter Principle meets Conway's Law: the wrong people deciding, across boundaries they can't talk across. The winners will be the rare few who read an x% efficiency gain as "eat x% more of the market" instead of "eat x% of our staff."

I said: thin-slice boring problems get AI solutions and sprint for acquisition — a ton of acquisitions, far more than new billionaires.

TRACKING. The acqui-hire conveyor belt is running hot. Lots of small exits, very few rocket ships.


The standings

Mostly Called It and Tracking, one honest Half Right (wearables), one Too Early (the H2 shift), and one Ungraded because I didn't do the quantum homework and won't pretend otherwise.

But the score isn't the interesting part. The interesting part is the through-line. Almost every prediction that landed hardest rhymes with the same idea: the winning move in 2026 is to hide the wizard. The backlash rewards invisible AI. Apple shipped an orchestrator, not a chatbot. The cyber nightmare is, in large part, a story about too much machinery left exposed. I thought I was making twenty-odd separate calls. I was mostly making one, over and over.

Set your watches. Come back in December for the reckoning on the Too-Earlys — and if you reckon I marked myself generously, the receipts are right below. Mark it yourself.

Stay curious. Stay humble. Keep experimenting.


Receipts