Session 1 · Mental models

Models, Chatbots, Agents.

This is not another explainer about “next-token prediction.” It is one idea — recursion — that turns a word-guesser into something that can read, write, search, and act in the world. Get this and the rest of AI stops being mysterious.

~12 min read · interactive: The Machine
Why bother

Understanding AI is practical, not academic.

You don't need to know how an engine works to drive — but you do if you want to tune it, notice when it's failing, and not get sold a lemon. The same is true here. A working mental model lets you do all three.

A note in the spirit of the workshop: I am not an expert. No one is. These systems are young and strange. What follows is the most honest model I know — not the last word.

The interactive

Build the machine, one step at a time.

Start at Bare model and press Step. Each layer adds exactly one capability — and it is always the same loop underneath. Watch the context grow: that growing tape is the model's entire world.

The one idea

It is recursion, all the way up.

Every layer you just stepped through is the same move: context in → model runs → output → the harness does something with that output to build new context → repeat.

“The app does something” is doing a lot of work in that sentence. It might just append a word (a chatbot). It might run a web search, read a file, write a file, execute code, or click a button. It might spawn a hundred more models and let them write to a shared file. The model never changes — the harness gets cleverer.
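That loop is concrete enough to sketch in a few lines. Here `fake_model` is a hypothetical stand-in for a real model API (deterministic, just for illustration), and the harness's "does something" is the simplest possible move: append the output to the context.

```python
def fake_model(context: str) -> str:
    """Stand-in for a real model call: emits one 'word' per step."""
    return "word" + str(len(context.split()))

def run_loop(context: str, steps: int) -> str:
    # The whole pattern: context in -> model runs -> output ->
    # harness builds new context from the output -> repeat.
    for _ in range(steps):
        output = fake_model(context)        # model runs on the current tape
        context = context + " " + output    # harness "does something": here, just append
    return context

result = run_loop("Once upon a", 3)
# The context is the only state; everything the loop "did" is sitting in it.
```

Swap the append line for a web search, a file write, or a spawned sub-model and you have every layer from the interactive — the loop itself never changes.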

“AI” names two things: the model (the brain) and the app that harnesses it (the orchestrator). Most of 2025's leap was the harness.

What to carry out of Session 1

The model has no memory

It sees only the context window. Everything it “knows about you” is text sitting in that window right now. Lose the window, lose the thread.
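What "memory" a chatbot appears to have is the harness re-sending the whole transcript on every turn. A minimal sketch, with `chat` as a hypothetical stand-in for a real chat API:

```python
def chat(history: list[dict]) -> str:
    # The model sees ONLY what is in `history` right now.
    return f"reply to {len(history)} messages"

history = []
for user_msg in ["hi", "what did I just say?"]:
    history.append({"role": "user", "content": user_msg})
    reply = chat(history)  # the entire window is resent every single turn
    history.append({"role": "assistant", "content": reply})

# Drop the history and the "memory" goes with it:
fresh = chat([{"role": "user", "content": "what did I just say?"}])
```

The second call in the loop sees three messages; the `fresh` call sees one. Same model, different window, different "memory."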

Thinking is just more tokens

“Reasoning” models emit private tokens before they answer. Useful, sometimes dramatic — but the same loop, run longer.
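In loop terms: the model emits thinking tokens into the tape, and the harness simply hides them before showing you the answer. A sketch, using a made-up `<final>` marker (real systems use their own delimiters):

```python
def model_with_thinking(context: str) -> str:
    # A "reasoning" model emits extra tokens before the final answer,
    # separated here by a hypothetical marker the harness strips.
    return "6 * 7... six sevens is forty-two... <final> 42"

raw = model_with_thinking("What is 6 * 7?")
answer = raw.split("<final>")[-1].strip()  # harness hides the thinking tokens
```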

Tools are context-building

The model can't act. It emits a pattern that means “act,” and ordinary code does the acting and returns the result as context.
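That handshake can be sketched directly. The tool name, the JSON shape, and `fake_model` below are all illustrative assumptions — the point is only that the model emits text, ordinary code recognizes it and acts, and the result comes back as context:

```python
import json

def fake_model(context: str) -> str:
    # The model can only emit text; the "action" is a structured
    # pattern in that text (here, a hypothetical JSON shape).
    return json.dumps({"tool": "read_file", "arg": "notes.txt"})

# Ordinary code, not the model, does the acting:
TOOLS = {"read_file": lambda path: f"(contents of {path})"}

def step(context: str) -> str:
    output = fake_model(context)
    call = json.loads(output)                   # harness recognizes the pattern
    result = TOOLS[call["tool"]](call["arg"])   # harness runs the tool
    return context + "\nTOOL RESULT: " + result # result becomes new context

new_context = step("Summarize my notes.")
```

On the next pass through the loop, the model reads its own tool result as if it were any other text in the window.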

Agents are recursion

An agent's most powerful tool is spawning another agent. Swarms are many loops sharing a workspace until a condition is met.
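"Spawning another agent" is literally recursion in the harness. A toy sketch, assuming a `fake_model` that "decides" to split any task containing "and" into subtasks:

```python
def fake_model(task: str) -> str:
    # Stand-in: splits compound tasks, completes simple ones.
    if " and " in task:
        return "SPAWN: " + task.replace(" and ", "|")
    return "DONE: " + task

def run_agent(task: str, workspace: list[str]) -> None:
    output = fake_model(task)
    if output.startswith("SPAWN: "):
        for subtask in output[len("SPAWN: "):].split("|"):
            run_agent(subtask, workspace)  # an agent's tool can be another agent
    else:
        workspace.append(output)           # all the loops share one workspace

workspace: list[str] = []
run_agent("write tests and fix bugs", workspace)
```

A swarm is this picture scaled up: many such loops reading and writing the same workspace until some stopping condition holds.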

How we got the model at all

Pre-training (predict the next token over an enormous corpus) builds raw competence. Post-training (reinforcement from human or model judgment) makes it helpful and safe. Fine-tuning & distillation specialize or shrink it. The vocabulary — foundation models, fine-tunes, coding agents, wrapper apps — all sits on top of the one loop you just operated.
Then we built things

From model to something useful.

In the workshop this is where we stopped talking and opened laptops — Claude Code, Cowork, Codex — and watched the harness “supercharge” ordinary chatbot work. The point of the demos is exactly the point of the interactive above: the leap is not a smarter brain, it is a smarter loop around the brain.