dangerouslyskip
← Back to projects
live·2026

Notewell

My daughter couldn't take notes. So — as one does — I built her Notewell: an app that teaches the science of note-taking and quietly runs on it.

Notewell — a note-taking learning app for kids. The home screen reads “Take notes your future self will thank you for,” above a lesson path with progress, XP and the day-1 streak.

My daughter is, by her own cheerful admission, terrible at taking notes. So I did what I always do. I made an app for that.

It started, as these things should, by actually watching how she studies. And the problem wasn't studying — it was what she was studying from. Bad notes. Chaos, honestly. Fragments of whatever she'd been covering in class, no structure, obvious gaps where whole ideas should have been. The poor lil miss was trying to revise from a soup of etchings, and no amount of effort was going to turn that soup into a test result. She never had a chance, and it wasn't for lack of trying. She was working hard off a broken input.

And here's the thing that actually stopped me: nobody had ever taught her how to take notes. We hand kids a notebook and a pen and simply assume the skill arrives on its own. I went looking and realised it had never been taught to me either — somewhere along the way I'd invented my own private system and shorthand so idiosyncratic that no human alive could follow my notes but me. It works, but I couldn't tell you why it works, or teach it. Note-taking is one of the highest-leverage skills in a student's life and it's almost never explicitly taught. That's mad.

It turns out there's real science here — encoding versus storage, retrieval practice, spaced repetition, the Cornell method, dual coding — a whole evidence base on how to get information to actually stick, and approximately none of it reaches a 12-year-old. So I set the agents onto teaching my kids.

(For the record, this isn't my first rodeo. I've built NAPLAN prep, targeted maths and science and English tutors, and a much more ambitious whole-curriculum engine hanging off a knowledge graph that'll take basically any curriculum you throw at it. This one is the simple, sharp little sibling of all that. Sometimes you don't need the cathedral. You need the one room your kid is cold in.)

The idea that makes it more than a textbook

Here's the design decision I'm proudest of, and it came straight out of the brief: the app doesn't just teach the techniques. It runs on them.

It would have been easy to ship ten tidy lessons explaining spaced repetition and retrieval practice and call it educational. But a lesson that describes spacing while ignoring it is a lie of omission. So Notewell models what it teaches. Before it gives her a new lesson, it quizzes her on the old ones — that's retrieval practice and spacing, happening to her, not just explained at her. And it schedules a delayed test a couple of "days" later to prove the notes actually worked — closing the loop the science says you have to close. The medium is the message. You don't learn that spacing works because a screen told you; you learn it because the screen made you feel it land.

The lessons themselves are short and interactive, never a wall of text — including a forgetting-curve simulator she can poke at, an SVG that redraws as she toggles review sessions on and off so she can watch her future self forget, and then watch the curve flatten when she adds spaced reviews. Show, don't tell, aimed at a 12-year-old's attention span.

From a pretty mockup to a real thing

I started with two inputs: a 400-line PRD that owned the pedagogy, and a UI mockup that was a gorgeous, warm-editorial React prototype running off Babel-in-the-browser — beautiful to look at, and hollow underneath. Six of ten lessons, no backend, no tests, and some leftover "Note Ninja" branding from a previous life.

The look was where we'd landed, so I treated its design tokens as canonical and ported them verbatim into a proper Vite + React + TypeScript app, brought the six real lessons and the shared machinery across, and built the missing four from the PRD. Then came the parts a sceptical engineer actually cares about.

Heuristic-first AI grading. The app grades open-ended answers — her cue questions, her summaries, her recall attempts. The temptation is to reach straight for an LLM and make it a hard dependency. I didn't. I shipped a pure-function heuristic grader first — edit distance, keyword overlap, copied-run detection — behind a single clean gradeX() seam, fully working with zero API keys. Then, when an OpenRouter key showed up, real LLM grading (Gemini 2.5 Flash) lit up behind the exact same seam without touching the client: a tiny server proxy calls the model, the browser never sees the key, and any failure silently falls back to the heuristic. The hard constraint that drove the whole shape — no LLM key ever in the browser bundle — turned out to also be the thing that made the app robust. The AI is an upgrade, never a crutch. Pull the key and it still teaches.

The audio lectures, and the narrator who kept changing. The best idea in the build was the client's (the client being twelve): make the mini-lectures audio you take notes against, rather than text that reveals on a timer — because the entire point is practising note-taking against a live stream of speech. The four-minute "Black Death" boss lesson is exactly that. I generate per-line clips through a TTS service so the captions stay tightly synced, with an accessible captions toggle and a text-reveal fallback if no audio exists. First pass had a beautiful bug: I hadn't pinned a voice, so the TTS picked a different speaker per request and the narrator drifted mid-lecture, like the Black Death was being explained by a rotating cast of strangers. One pinned voice ID and a regenerate, and the plague had a single storyteller again.

The fake four-day streak. Two hours after launch, a screenshot caught the home screen proudly bragging about my daughter's "4 day streak." The app had existed for two hours. It was a hardcoded placeholder inherited from the mockup, and it's the perfect little parable: every mocked value is a latent bug. When you port a demo into a product, every number that was faked to look good in a screenshot is a lie waiting to ship. It kicked off a sweep that replaced the fake streak with a real one and cleaned up its siblings — XP toasts that lied on replays, "1 days" pluralisation, keyboard-inaccessible nav links, a warm-up that needed gating so it only quizzes lessons she's actually done.

Accounts, without a big-bang cutover. The biggest architectural step was going from a static single-player app to real multi-user accounts with Google sign-in and progress that follows her across devices. The interesting part isn't the hand-rolled OIDC or the per-user SQLite blob on a persistent volume — it's the deploy strategy. I built it to degrade gracefully: with no Google credentials configured, there's simply no sign-in button and the app behaves exactly like the old anonymous version. Which meant I could ship the entire new auth server to production inert, prove nothing broke, and then light up sign-in later just by adding a couple of env vars. No risky big-bang switch-over, no held breath. Graceful degradation isn't just a runtime nicety; it's a deployment strategy. (The client-side trick that made it painless: an adapter behind useStore(user) that keeps the synchronous localStorage path for anonymous users — so the whole test suite didn't have to be rewritten — and swaps to debounced server-sync when signed in, migrating local progress up on first login.)

There was the usual texture of actually shipping a thing, too: driving the deploy platform's API directly when its CLI abstraction leaked, finding an undocumented storage-type enum by grepping the platform's own source on the box, and reconciling a branch that kept diverging from a squash-merged main with the git merge -s ours trick. The unglamorous 84%, as ever.

Where it landed

It's live, and it's hers: ten interactive lessons, four original audio mini-lectures, a spaced-retrieval warm-up, a printable Cornell template, and that delayed quiz that closes the loop two "days" later. Real AI grading with an offline fallback. Google accounts with cross-device sync, a graceful anonymous mode, and a free-first-lesson on-ramp so there's the lowest possible friction to start. Around seventy-odd tests, all green — including one that deliberately wipes local storage and reloads, just to prove her progress is coming from the server and not the browser kidding us both.

So, did it work?

The honest answer is: she's taking better notes, and she can tell you why her notes are better, which is the part that'll outlast any single test. The soup is becoming structure.

But the bit I keep turning over is bigger than one kid's revision. Note-taking is a foundational, learnable skill that we systematically fail to teach, and it turns out AI is weirdly, brilliantly suited to teaching exactly this kind of unglamorous meta-skill — patiently, interactively, at a twelve-year-old's pace, modelling the technique while it teaches it, never getting bored on the fortieth repetition. That's not the flashy use of this technology. I think it might be the best one.

One day, surely, we'll just plug a USB into the back of the ole head, Matrix style, and she'll blink and say "I know note-taking." For now, this is the closest we've got — agents doing the upload in real time, through good old hard work.

For her, though. Not for me.

I get to sit on the couch and watch TV while my kid learns a new skill. Which, when you really think about what this technology is for, might be the entire point.