Model Pipeline
Text is chopped into tokens, processed through the model, then decoded back into text. The model is predicting the next useful pieces, not looking up a literal answer book.
AI feels mysterious until you separate the parts: tokens, context, model prediction, tool use, memory, and verification. This page lets you poke each part directly.
Walk through the lab in order: system map, training, diagrams, tokens, context, prompts, tools, agent loop, trace, systems, concepts, stack building, and failure modes.
The illustration shows the same system as a physical workbench. Pick a station to see what that part contributes to an agent run.
This page mostly explains what happens when you use an AI system. Model training and RLHF are a deeper topic, so they now have their own outline page.
The future page will cover pretraining, fine-tuning, RLHF, safety tuning, evaluation, data quality, and the difference between base models and chat assistants.
These are the useful shapes underneath most AI systems: the text pipeline, the agent loop, and the difference between temporary context and durable memory.
Text is chopped into tokens, processed through the model, then decoded back into text. The model is predicting the next useful pieces, not looking up a literal answer book.
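The three stages above can be sketched as a toy program. This is only an illustration under loud assumptions: the six-word vocabulary and the `NEXT_ID` lookup table are invented stand-ins for a real tokenizer and a real model, which would score every possible next token rather than follow a fixed table.

```python
# Toy text pipeline: encode text into token ids, predict the next id, decode.
# VOCAB and NEXT_ID are made up for illustration; they are not a real model.
VOCAB = ["the", "cat", "sat", "on", "mat", "."]
TOKEN_TO_ID = {token: index for index, token in enumerate(VOCAB)}

# Hypothetical "model": for each token id, the id it treats as most likely next.
NEXT_ID = {0: 1, 1: 2, 2: 3, 3: 0, 4: 5}


def encode(text):
    """Chop text into token ids (here: one id per known lowercase word)."""
    return [TOKEN_TO_ID[word] for word in text.lower().split()]


def predict_next(ids):
    """Pick the next token id from the last one; fall back to the period."""
    return NEXT_ID.get(ids[-1], TOKEN_TO_ID["."])


def decode(ids):
    """Turn token ids back into text."""
    return " ".join(VOCAB[i] for i in ids)


def generate(prompt, steps):
    """Run the full loop: encode, predict one token at a time, decode."""
    ids = encode(prompt)
    for _ in range(steps):
        ids.append(predict_next(ids))
    return decode(ids)


print(generate("the cat", 2))  # the cat sat on
```

Nothing in the sketch looks up a stored answer. The output is built one predicted piece at a time, which is the point the diagram makes.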
An agent keeps cycling until it has enough evidence or hits a limit. Good agents know when to stop and report what they verified.
Context is the working desk. Memory is the labeled drawer. If a fact needs to survive future sessions, it belongs in memory or a source file.
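One way to picture the desk-and-drawer split is a session object whose context list disappears with the session, while memory is written to a file. The `Session` class, its method names, and the JSON file layout are all hypothetical, chosen only to make the distinction concrete.

```python
import json
import os
import tempfile


class Session:
    """Hypothetical session: context is in-process, memory is a file on disk."""

    def __init__(self, memory_path):
        self.context = []               # working desk: gone when the session ends
        self.memory_path = memory_path  # labeled drawer: survives across sessions

    def note(self, text):
        """Put something on the desk for this session only."""
        self.context.append(text)

    def load_memory(self):
        """Read everything stored in the drawer (empty if nothing saved yet)."""
        if not os.path.exists(self.memory_path):
            return {}
        with open(self.memory_path, encoding="utf-8") as handle:
            return json.load(handle)

    def remember(self, key, value):
        """File a fact in the drawer so a later session can find it."""
        memory = self.load_memory()
        memory[key] = value
        with open(self.memory_path, "w", encoding="utf-8") as handle:
            json.dump(memory, handle)


path = os.path.join(tempfile.mkdtemp(), "memory.json")

first = Session(path)
first.note("user asked about deploys")   # context only
first.remember("deploy_branch", "main")  # durable memory

second = Session(path)                   # a new session starts with a clean desk
print(second.context)                    # []
print(second.load_memory())              # {'deploy_branch': 'main'}
```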
Models do not read text exactly like humans do. They see chunks called tokens. This is an approximate local demo, but it shows why wording and length matter.
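A rough sense of how text becomes tokens can be had from a few lines of code. The sketch below is a heuristic, not a real tokenizer: it assumes a split on words, numbers, and punctuation, with long words cut into 4-character chunks. Production tokenizers learn their chunks from data and will split the same text differently.

```python
import re


def rough_tokens(text):
    """Approximate tokenization: words, numbers, and punctuation marks,
    with alphabetic words longer than 4 characters cut into 4-character
    chunks. This is an assumed heuristic for illustration only."""
    pieces = re.findall(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]", text)
    tokens = []
    for piece in pieces:
        if piece.isalpha() and len(piece) > 4:
            tokens.extend(piece[i:i + 4] for i in range(0, len(piece), 4))
        else:
            tokens.append(piece)
    return tokens


print(rough_tokens("Tokenization matters!"))
# ['Toke', 'niza', 'tion', 'matt', 'ers', '!']
```

Two words and one punctuation mark become six tokens here, which is why longer or wordier phrasing uses up more of the model's budget than a word count suggests.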
A chat can be long while the model's active view is limited. Older material may stay visible, get summarized, or fall out unless it is saved somewhere durable.
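The "falls out" case can be sketched as a simple trimming step. This assumes the most basic policy, keeping the newest messages that fit a token budget and dropping the rest; the function name and the word-count token estimate are assumptions, and real systems may summarize older material instead of dropping it.

```python
def fit_to_window(messages, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep the most recent messages that fit within max_tokens.

    Returns (kept, dropped), both in original order. count_tokens defaults
    to a crude word count, which is an assumption for illustration.
    """
    kept = []
    used = 0
    # Walk backwards from the newest message until the budget is spent.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    dropped = messages[:len(messages) - len(kept)]
    return kept, dropped


chat = [
    "my name is Ada",
    "what is a token",
    "explain context windows briefly",
]
kept, dropped = fit_to_window(chat, max_tokens=8)
print(kept)     # ['what is a token', 'explain context windows briefly']
print(dropped)  # ['my name is Ada']
```

The dropped message is exactly the kind of fact that should have been saved somewhere durable if it needed to survive.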
Better prompts do not make the model magical. They reduce ambiguity, supply constraints, and tell the model what a useful answer looks like.
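As a small illustration of those three moves, a prompt can be assembled from labeled parts. The `build_prompt` helper and its section labels are hypothetical; no particular model requires this layout.

```python
def build_prompt(task, context="", constraints=(), output_format=""):
    """Assemble a prompt from a task, optional context, constraints,
    and a description of what a useful answer looks like."""
    parts = [f"Task: {task}"]
    if context:
        parts.append(f"Context: {context}")
    if constraints:
        bullet_lines = "\n".join(f"- {item}" for item in constraints)
        parts.append("Constraints:\n" + bullet_lines)
    if output_format:
        parts.append(f"Answer format: {output_format}")
    return "\n\n".join(parts)


prompt = build_prompt(
    "Summarize the release notes",
    constraints=["under 100 words", "mention breaking changes first"],
    output_format="bullet list",
)
print(prompt)
```

Compared with a bare "summarize this", the assembled prompt removes guesswork about length, priority, and shape, which is all a better prompt really does.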
A plain model predicts an answer from its training and current context. A tool-using agent can inspect files, search, run commands, generate images, or ask for approval.
A model can produce a plausible answer from context, but it may invent details when the answer depends on current files, dates, or external state.
A tool-using agent can inspect the actual source of truth, then answer with less guesswork and clearer evidence.
An agent is a model wrapped in a loop. It reads the request, checks context, chooses tools when needed, observes the result, and decides whether to keep going or answer.
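The loop described above can be written down in a few lines. This is a minimal sketch under stated assumptions: `choose_action` stands in for the model's decision, `demo_policy` and the `lookup` tool are invented stubs, and the action dictionaries are a made-up format rather than any framework's API.

```python
def run_agent(request, choose_action, tools, max_steps=5):
    """Cycle: decide, act, observe, and stop on an answer or a step limit."""
    observations = []
    for _ in range(max_steps):
        action = choose_action(request, observations)
        if action["type"] == "answer":
            return {"answer": action["text"],
                    "observations": observations,
                    "stopped": "answered"}
        # Otherwise run the chosen tool and record what came back.
        result = tools[action["tool"]](action["input"])
        observations.append({"tool": action["tool"],
                             "input": action["input"],
                             "result": result})
    return {"answer": None, "observations": observations, "stopped": "step limit"}


def demo_policy(request, observations):
    """Stand-in for the model: look something up once, then answer."""
    if not observations:
        return {"type": "tool", "tool": "lookup", "input": "latest commit"}
    return {"type": "answer", "text": f"Verified: {observations[-1]['result']}"}


demo_tools = {"lookup": lambda query: f"stub result for {query!r}"}

run = run_agent("What changed?", demo_policy, demo_tools)
print(run["stopped"])  # answered
print(run["answer"])   # Verified: stub result for 'latest commit'
```

The `max_steps` limit matters as much as the loop itself: a policy that never answers still stops, and the returned observations show what was actually checked.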
A useful AI run leaves an external trail: what was requested, what the system checked, what it observed, and how it verified the result. You do not need hidden chain-of-thought to judge whether the work is trustworthy.
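Such a trail can be as simple as a small record that is filled in during the run and printed at the end. The `RunTrace` class and its field names are hypothetical, and the recorded values below are placeholders rather than real results.

```python
from dataclasses import dataclass, field


@dataclass
class RunTrace:
    """Hypothetical trace: the request, each check with its observation,
    and a note on how the result was verified."""
    request: str
    checks: list = field(default_factory=list)
    verification: str = ""

    def record(self, checked, observed):
        """Log one thing the system looked at and what it saw."""
        self.checks.append({"checked": checked, "observed": observed})

    def report(self):
        """Render the trail as plain text a reviewer can read."""
        lines = [f"Request: {self.request}"]
        for number, check in enumerate(self.checks, start=1):
            lines.append(f"{number}. Checked {check['checked']}: {check['observed']}")
        lines.append(f"Verification: {self.verification or 'none recorded'}")
        return "\n".join(lines)


trace = RunTrace(request="Confirm the config file sets a timeout")
trace.record("config file", "timeout key present (placeholder observation)")
trace.verification = "re-read the file after the change"
print(trace.report())
```

A reviewer can judge the run from this report alone, without seeing any of the model's internal reasoning.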
AI is not one thing. The useful question is what kind of system you are using and what constraints come with it.
A model running on your own machine is private and hardware-bound. It is good for experiments, automation, and offline control, and its quality depends on the model and the machine.
A hosted model is usually stronger and faster to update. It is great for hard reasoning, coding, and multimodal work, with the tradeoffs of relying on an external service.
An agent is a model with tools, memory, permissions, and a loop. It is powerful because it can act, and risky if its guardrails are sloppy.
Image, voice, music, and video models translate prompts into pixels or sound. They need creative direction (style, mood, composition), not just facts.
AI terms get thrown around loosely. Pick a concept to see the useful definition, a practical example, and the common misunderstanding to avoid.
Most real AI systems are a stack of choices. Add or remove capabilities below and watch the profile change.
Most AI mistakes are not mysterious. They usually come from missing context, stale information, weak instructions, bad tool choice, or permission boundaries.
Question: “What changed in the latest deployed website commit?”