live·2026

Resumatron9000

Resumatron9000 — a JD-tailored résumé and a recruiter-facing interview agent, engineered so neither can invent a fact, figure, or credential.

I named it Resumatron9000, which sounds exactly like the kind of over-eager gadget that would cheerfully tell a recruiter I single-handedly invented the transistor. It's built to be the precise opposite. The whole engineering effort went into one unglamorous goal: a résumé tool that is constitutionally incapable of lying about me.

Because here's the thing everyone building one of these gets wrong. The hard part was never the tailoring. Reshaping a master résumé to a job description is the easy 20% — any model does a passable job in one prompt. The hard part is trust. An LLM that rewrites your résumé and then sits there answering a recruiter's questions about it will, left to its own devices, embellish. It'll round 30% up to "nearly half." It'll turn "contributed to" into "led." It'll confidently quote a metric that sounds like it came off the page and absolutely did not. Almost everything I'm about to describe exists to make that impossible — not discouraged, not caveated, structurally impossible.

That's the thesis of the whole build, and it's one I'll keep coming back to: trust is an architecture, not a disclaimer. A polite "please don't make things up" in a system prompt is not a control. It's a wish.

What it actually is

Resumatron9000 turns one master résumé into a JD-tailored, one-to-one résumé at a private link (/r/<slug>), with a built-in "interview me" agent a recruiter can interrogate. The recruiter sees a clean, designed page; print-fidelity PDF, ATS-friendly DOCX and machine-readable JSON exports; and a floating Interview me widget that answers questions grounded only in my real record. It reads like a considered application — "why I'd be a strong fit for X" — not a mail-merge, and it never waves a flag saying "AI-generated for you."

What they don't see: my referees' contact details (those never leave the server), the draft versions, or any of the private strategy that shaped the document. Which brings us to the first real decision.

Diagnose, then execute

The core architectural call was to split the work into two passes, because diagnosis is a different skill from execution. A single "rewrite my résumé for this JD" prompt produces confident mush — it tries to analyse and write at the same time and does neither well. So:

Stage one is the Role & Fit Analyst (fit.ts). It reads the JD against the master and produces a FitBrief — a private strategy document the recruiter never sees. It works out what the role actually is beyond its title (a "Head of Applied AI" that's really a pre-sales GTM role, say), classifies a role shape, extracts the hard requirements and must-have keywords, maps my genuine strengths to the role — each citing the master record that evidences it — and, critically, names the honest gaps plainly, each paired with a bridge. It raises human-decision flags ("reconcile 300 vs 500 headcount", overlapping dates) for me to resolve, rather than silently baking a guess into the document. Stage one never invents. It diagnoses.

Stage two is the Reshaper (tailor.ts). It takes the brief and reshapes the master into a tailored résumé: chosen title, profile, ordered roles, selected bullets, optional rephrasings — text only, never new facts — and skill groups. The output is a pure selection-and-rephrase contract over master record IDs. That constraint is the quiet hero of the whole system, because it's what makes the next part possible.

The anti-fabrication spine

Three layers, and every one of them was earned through a bug.

Layer one — the validator (validate.ts): every unit must trace to a master ID. A tailored bullet whose ID doesn't exist in the master is rejected. A rephrase that introduces a new number or a new named entity that wasn't in the source bullet is rejected. Drop a must-have keyword or omit core experience and it's flagged. Block-level flags stop a publish outright, and stage two sees its own block flags and self-corrects on retry. The structural guarantee: a tailored unit that can't trace back to the master cannot ship.

Two fiddly bits that are pure projects-blog gold because they're the kind of thing you only learn by hitting them. Keyword coverage had to be made synonym- and word-order-tolerant, so "AI governance" in a JD matches "governance of AI systems" in a bullet — get that wrong and you get a flood of false "you dropped a must-have!" flags. And number extraction had to be made idiom-aware: a real bug had it reading 0→1 as the number 1 and blocking a perfectly legitimate "0→1 platform build" bullet for "introducing a new number." Now it strips 0→1, 1:1, 24/7 and friends before the check runs. Small, stupid, and exactly the sort of thing that decides whether the system is usable.

Layer two — the numeric trust gate, which cost me the most. The interview agent (ask.ts) runs RAG over my record and answers recruiter questions. Left alone, it invented plausible figures — "90% adoption", "140% productivity" — that sounded like they came off the résumé but didn't. The root cause was genuinely subtle and worth the whole paragraph: the grounding context I was handing the agent included blog excerpts, company research, and interview transcripts — and those transcripts contained the AI interviewer's own invented numbers, which the answer model then dutifully cited back as established fact. I'd poisoned my own well.

The fix is the bit I'd point a sceptical engineer at. A function extracts every figure in a draft answer and checks it against a deliberately narrow canonical set — the resolved résumé plus the master, and nothing else. Not blogs. Not company briefs. Not transcripts. If a number isn't in that canonical record, the answer is regenerated; if it still fails, it falls back to a safe response. The consequence is a feature disguised as a constraint: any real metric has to be promoted into the master to be quotable at all. Every number goes through one reviewed door. There is no side entrance.

Layer three — the honest-advocate posture. The agent is an advocate, not a hype machine, and not a confessor. It leads the case. It never volunteers a gap. If a recruiter raises one, it acknowledges it in a single light line and then reframes to adjacent evidence and the teams I build to cover it — it never fabricates the missing credential, and it never invites the critique. Which is, not at all by coincidence, exactly how a good human reference behaves. Honest advocacy, not spin.

Show your work: the altitude bug

This is my favourite, because it's a clean example of diagnosis-quality driving output-quality.

Several of my leadership-voice "outcome" bullets abstract the very systems that my named-product bullets describe in detail:

| The abstract "outcome" bullet | The named systems it's summarising ||---|---|| "Architected production agentic systems…" | Skynet, J.E.F.F, Echo || "Stood up the AI security & governance fabric…" | Iron Golem, Governor || "Built an internal auth-broker & API gateway…" | AI Studio || "Built the enablement layer…" | Dojo |

A given résumé has to show each cluster at one altitude — abstract or detailed — or it describes the same work twice and reads like padding. Originally the engine picked altitude from the coarse role family, and an "Applied AI Engineer" JD got classified as family ai, which fell on the abstract side. The result was perverse: a hands-on IC role got served leadership-voice summaries, and the technically meaty named systems — the actual evidence an IC screener wants — got deduped away. Exactly backwards.

The fix separated two axes that had been quietly conflated:

An explicit altitude: 'detail' | 'abstract' field, instead of inferring it from family.
A new applied_engineer role shape (the hands-on IC: Applied AI Engineer, Forward-Deployed, ML, Staff) that maps to detail — because for those roles the named systems are the evidence.
A classification tie-breaker: a title containing "Engineer" plus a hard hands-on requirement (Python, ships production code, agent dev, reviews code) classifies as applied_engineer even when the role is customer-facing. applied_ai is reserved for the non-IC roles where the person doesn't write the code.

Re-run the relevant cut after that change and it flips to applied_engineer / altitude: detail, and all eight named systems render in their full technical glory instead of being summarised into mush. The lesson generalises well beyond résumés: what the job is (role shape) is a different axis from what a fact is about (content family), and conflating them silently corrupts the output. Most "the AI gave me a bad result" problems are a diagnosis problem upstream, not a generation problem.

Preparing the advocate

The owner side of Resumatron9000: tailoring a résumé to a JD, the Prepare-your-agent grounding screen, and grounding sources — GitHub repos, blog posts, and books & ideas.

An honest advocate is only as good as what it's grounded on, and a one-page CV is a thin, lossy summary of a career. So there's a whole prepare-your-agent layer that feeds and steers the interview agent — and, just as importantly, fences it.

It grounds in my private GitHub. The agent reads my private repos directly, because there is vastly more breadth in the code than ever fits on a document — the half-built experiments, the tools behind the tools, the projects that never earned a bullet point. It can speak to those because it has actually read them, not because I summarised them once and hoped.

It takes a grounding inbox beyond the master. I can feed it documents, additional repos, LinkedIn and blog posts, the books and ideas that shaped how I think, and curated Q&A — reference material that informs how it talks about me without ever entering the master record. That split is the entire point: the master is the canonical fact source; the grounding is context and colour.

It interviews me. There's a backend agent whose only job is to interview me — to pull out the anecdotes, the stories, the "why did you build it that way" that lives in my head and never in a résumé. Tacit knowledge in, structured grounding out. The best human reference is one who's actually heard your war stories; this manufactures that on purpose.

I can steer it — "if asked X, base your answer on Y" — so a predictable question lands on the strongest, truest material instead of whatever the model free-associates toward. And I can fence it with a list of topics to avoid or gracefully bounce from. An advocate that knows what not to talk about is a better advocate.

Here's the part that ties straight back to the well-poisoning bug, and it's the whole reason this layer can be generous: grounding gives the agent context and colour; it never gives it licence to invent facts. All that extra material lets the agent sound like it genuinely knows me — tell the story behind a system, place a project in context, recall a war story. But the numeric gate still only lets it quote figures from the master. That separation is deliberate, and it's exactly the lesson the bug taught me: the grounding gets to be rich and wide precisely because the fact-citation door stays narrow. Feed it everything. Let it quote almost nothing.

The bits that make it production, not a demo

A few things round it out. The owner can change the résumé either by talking to the agent or through an editor — same underlying data — but crucially the wording edits are stored as overrides while numbers, new systems and structure stay in code, so the trust gate keeps working even as phrasing changes. Published résumés are frozen — a recruiter's link never shifts under them; when the engine improves, cuts are regenerated deliberately through a script that prints diagnostics to stderr and aborts on any block-level flag before a single row is written. The site exposes MCP tools, a /raw and /json surface, and an llms.txt, so when another AI reads the site it gets clean grounded facts instead of scraping HTML and guessing. And there are 71 tests sitting alongside the engine, because the validator, the number extraction, the keyword coverage and the altitude dedup are exactly the things you do not want to discover are broken in front of a recruiter.

The stack, for the curious: it's a feature inside the same Next.js + Payload CMS + Postgres app that runs this blog — no separate service — with generation on Claude Sonnet 4.6 and the cheap sub-tasks on Haiku, every call on a hard timeout so an admin save never hangs on a slow model.

What I'd pull out of all this

Six things, and only the first is really about résumés:

Trust is an architecture, not a disclaimer. "Don't fabricate" is enforced by a validator and a numeric gate over canonical facts, not by a polite paragraph in a prompt.
Split diagnosis from execution. Two narrow passes beat one broad one, and the diagnosis is where output quality is actually won.
Separate your axes. Conflating "what the job is" with "what a fact is about" is what caused the worst bug in the system.
Make every number go through one door. Forcing real metrics into a single canonical source to be quotable is a constraint that pays for itself immediately.
Ground wide, cite narrow. Feed the agent everything — repos, docs, stories — for context and colour, but let it quote facts only from the canonical record. Rich grounding is safe because the fact door is narrow.
Honest advocacy beats hype — and, not coincidentally, it's what a good human reference does. Lead the case, never invent the credential, never volunteer the gap, and know which topics to bounce from.

So that's Resumatron9000: a résumé bot named like it would absolutely lie to you, engineered so it can't. The worked example I tuned the altitude fix against happens to be a particular Applied AI Engineer role. Make of that what you will.

I'm sorry Dave. I genuinely can't fabricate that.