Reference / Glossary Agent
A reference, not a chapter. Search any term used anywhere in the five tiers. Each entry is tagged with the tier it came from so you can jump back for the long version.
01 · How to use this glossary
Type any term, or part of one, into the search bar. The list filters live as you type.
Use the A-Z jump bar to land on a letter. Useful for browsing when you don't know the exact word.
This glossary is a snapshot. The real "glossary agent" is any LLM you already have open. Paste an unfamiliar term plus "explain like I just finished the Chief A.I., Oh! basics guide" and you'll get a custom-tuned definition in seconds.
FOUNDATIONS PROMPTING LLMS INTERMEDIATE ADJACENT
An AI system that decides which step to take next on its own, instead of waiting for you to drive every move. Books a flight, fills a form, runs a workflow. As a beginner, you'll meet agents through computer-use products (Operator, Claude Code) or workflow tools like n8n and Zapier.
Multiple agents coordinating toward a broader goal, usually with an orchestrator above them choosing who does what. Less a single tool, more a way of arranging tools.
The widest umbrella for computer systems doing things we'd call intelligent. Includes everything from chess engines to spam filters to ChatGPT. Most "AI" in the wild is not generative and predates the current LLM wave.
Google's AI-generated answer that appears at the top of regular search results. Backed by Gemini and grounded in Google's index.
The AI lab behind Claude. Founded by ex-OpenAI researchers. Smaller than OpenAI and Google but widely respected for model quality, writing, and coding.
Application Programming Interface, the developer-facing way to talk to an LLM directly, without a chat UI. You won't use it as a beginner, but products you use are built on it.
Claude's side panel that renders whatever it's generating, a document, code, a chart, a small app, as a live, editable object. Lets you iterate without copy-pasting.
Developer-facing transcription APIs. You won't use them directly, but products you use for meeting transcription are often built on one of them.
The ChatGPT and Gemini equivalents of Claude's Artifacts. A side panel that holds a live, editable document or code block.
A prompting technique where you ask the model to think step by step before answering. Modern reasoning models (o-series, Claude Opus, Gemini Ultra) do this automatically. Older models needed to be asked.
OpenAI's flagship consumer product. The most-used LLM in the world. Built on GPT family models. Comes in Free, Plus, and Pro tiers.
The feature that grounds ChatGPT's answers in live web results with citations. Reduces hallucinations on factual questions.
Anthropic's LLM family. Comes in Haiku (fast), Sonnet (default), and Opus (most capable) variants, in numbered versions. Quietly the favorite of writers, lawyers, and developers.
A separate Anthropic product that runs Claude in your terminal and lets it act on real files. The most capable agentic coder for non-developers, despite the technical-sounding name.
The capability of an AI to control a computer the way a human would, moving a cursor, clicking, typing. OpenAI's Operator and Anthropic's computer-use API are the leading examples.
The maximum number of tokens an LLM can consider at once, your prompt + the conversation history + its reply. Exceed it and older content drops off the front. Modern windows range from 128K (about 250 pages) to 1M (about a bookshelf).
Microsoft's AI assistant. Powered by GPT models under the hood. Lives inside Word, Excel, Outlook, Teams, and Windows. If you live in Microsoft 365, the path of least resistance.
An AI-native code editor (forked from VS Code). The industry favorite among professional developers. Lets you pick which model powers it.
A reusable specialized assistant in ChatGPT. You define a name, instructions, knowledge files, and optionally external actions. The closest thing to "building your own AI tool" without code. Equivalent to a Gem in Gemini or a Project + system prompt in Claude.
OpenAI's image generation model, built into ChatGPT. Now joined by GPT-4o's native image generation, which handles conversational editing better.
A flavor of machine learning that uses neural networks with many layers. The "deep" just means "many layers." All modern generative AI is built on deep learning.
An agent-style feature (versions in Gemini, ChatGPT, and others) that browses the web for 10-30 minutes and produces a structured report. Best for "go research this topic and come back."
The market-leading synthetic voice product. Voice cloning from 30 seconds of sample. Hundreds of preset voices in dozens of languages. What podcasters, audiobook producers, and video creators use.
A numerical representation of a piece of text (or image, or audio) that captures its meaning. Used behind the scenes for semantic search, finding documents by what they're about, not by which keywords they contain. You won't write embeddings yourself; you'll meet them inside tools that "search your files semantically."
Including two or three examples of the desired output in your prompt. The single highest-leverage move on a hard task. Beats describing what you want with adjectives.
Additional training applied to a pre-trained model on your specific dataset, to make it better at your use case. Rare for beginners; most consumer-facing customization happens through prompts, Custom GPTs, and Projects instead.
Adobe's image generation model. Trained on licensed Adobe Stock, commercially safe for ads and client work. Lives inside Photoshop, Illustrator, and Express.
Google's fast, cheap Gemini tier. The default on the free plan. Genuinely good for most tasks; beats most "small" competitors.
An open-weights image model from Black Forest Labs. Strong quality, runs anywhere. You'll meet it inside third-party image apps like Ideogram, Krea, and Replicate.
The biggest, most capable models from labs racing each other (OpenAI, Anthropic, Google DeepMind). The opposite of open-source models you can run yourself.
A technique for vague tasks: the first prompt asks the model to ask you clarifying questions; the second prompt executes the task using your answers. Use it whenever you'd otherwise type "help me with..."
Gemini's reusable specialized assistant. Equivalent to a Custom GPT or a Claude Project + system prompt. You give it instructions and (optionally) Drive files as knowledge.
Google DeepMind's LLM family. Comes in Flash (fast), Pro (default), and Ultra (most capable). Best in class for huge context, multimodal understanding, and Workspace integration.
Gemini's voice + camera mode, strongest on Android. Point your phone at something and have a real-time conversation about it.
AI that produces new content, text, image, audio, video, code, instead of just classifying or predicting. Everything in this guide is generative AI.
The original mainstream coding AI assistant. Lives inside VS Code, JetBrains, and other editors. Microsoft + OpenAI partnership.
The family name of OpenAI's models. GPT-4o, GPT-4.1, GPT-5, o-series. "GPT" stands for Generative Pre-trained Transformer; the acronym is now just a brand.
OpenAI's marketplace of Custom GPTs. Browse before you build, thousands already exist for common tasks.
A lightweight transcription tool that runs in the background of any meeting. Captures audio, summarizes, blends in your typed notes. Increasingly the operator favorite for daily calls.
xAI's LLM, built into X (formerly Twitter). Less filtered than ChatGPT or Claude; real-time access to X data. Niche use; not a daily-driver pick for most.
Anchoring an AI's answer to real, retrieved documents (web pages, your files) rather than letting it free-associate from training. The standard defense against hallucination. ChatGPT Search, Gemini Grounding, Perplexity, and NotebookLM all do this.
Anthropic's fast, cheap Claude tier. Lightweight, near-instant. Great for quick chat, simple drafts, mobile.
When an LLM produces something that sounds right but isn't true. Fake citations, fake quotes, fake URLs, fake legal cases. Built into how the technology works, not a bug. Mitigated, not eliminated, by grounding and verification.
An image generator best in class at rendering readable text inside images. The right pick for posters, logos, ads with copy.
Google's image generation model family (including "Nano Banana" variants). Built into Gemini and Workspace. Excellent at photorealism and consistent characters across multiple images.
What happens every time you press Enter. Your prompt goes in, a response comes out. The model's weights don't change. The opposite of training.
Everything you and the system send into the model, your prompt, attached files, hidden system prompt, conversation history. Usually cheaper than output tokens.
Refining a response by giving the model more instructions in the same chat, instead of starting over. The four moves: constrain, redirect, compare, quote.
Meta's open-weights LLM family. The dominant open-source model. Free to use, modify, and run on your own hardware. Powers many cheaper third-party AI products.
A generative AI specialized in language. Reads text, writes text. GPT, Claude, Gemini, Llama. The thing under the chat box. When people say "AI" in 2026, they almost always mean an LLM.
No-code app builders. Describe an app in a sentence and get a working web app back. Best for prototypes, landing pages, internal tools. The real entry point for non-coders.
A subset of AI where systems learn patterns from data instead of following explicit rules. Recommendation engines, fraud detection, weather models. Predates the LLM era by decades.
A standardized way for LLMs to plug into external tools, calendars, CRMs, Drive, Slack. Build one MCP connection and any compatible AI can use it. You'll see "MCP servers" and "MCP connectors" referenced in Claude and ChatGPT settings.
An LLM's ability to remember things across conversations. ChatGPT's is global by default. Claude's is project-scoped and opt-in. Gemini's is global and opt-in. Useful but watch for leakage between contexts.
The aesthete's image generator. Best stylized and artistic output. Lives in a web app (originally launched on Discord).
A European open-weights model lab. Their Mistral and Mixtral models are widely used in third-party AI products and self-hosted setups.
The brain. The trained system that produces output. On its own it can think and write but can't act. An LLM is a kind of model.
A model that accepts more than just text as input, images, audio, sometimes video. Most modern frontier models are multimodal by default. The unlock for beginners: pasting screenshots.
An Imagen-family image generation model inside Gemini. Strong at photorealism and consistent characters.
The mathematical structure that learns patterns from data in machine learning. Modern "deep" networks have many layers and billions of parameters.
A Google product (separate from Gemini, but Gemini-powered) that grounds AI answers in a finite set of sources you provide. Almost no hallucination. Best for due diligence, study, and research. Also generates surprisingly good podcast-style audio overviews.
OpenAI's reasoning model family (o1, o3, o4). Slower than GPT-4o but better at hard analysis, math, and multi-step problems. Use them when GPT-4o feels shallow.
The AI lab behind ChatGPT and GPT models. The largest of the three frontier labs by user count. Ships features the fastest.
An LLM whose weights are released publicly. You can download and run it yourself, or have someone host it for you. Llama, Mistral, Qwen, DeepSeek, Gemma. Beginners rarely use these directly, but products you use are often built on them.
OpenAI's agent product. An AI that drives a virtual browser for you, "book me a table at X for Friday." Available in Pro plans. Promising, still flaky for hard tasks.
Anthropic's flagship Claude tier. Slowest, deepest, most capable. Reserved for heavy analysis, complex coding, long documents.
A real-time meeting transcription product. Integrates with Zoom, Meet, and Teams. The mainstream business choice for transcripts and summaries.
Everything the model writes back to you. Usually 3-5x more expensive than input tokens because generating is harder than reading.
One of the numbers inside a model that gets adjusted during training. Modern frontier models have hundreds of billions to trillions of them. "70B model" means 70 billion parameters. Bigger ≠ always smarter; training technique matters too.
A search-grounded AI that routes through multiple models and cites its sources for every answer. The "research" search engine. Free tier is excellent.
A bench of fast-moving smaller video generation players. Worth scanning quarterly, the leader changes.
A common name for the ~$20/month paid tier across multiple products. Claude Pro, Gemini AI Pro. Confusingly, ChatGPT's $20 tier is called Plus and Pro is its $200 tier.
A folder for related chats and uploaded files, in both ChatGPT and Claude. Carries custom instructions that apply to every chat inside it. The single highest-leverage intermediate feature.
Everything you send into an LLM, your message plus any attached files. A working prompt has five ingredients: role, goal, context, format, examples.
A technique where the system first retrieves relevant documents and then generates an answer using them. The technical name for what NotebookLM, Perplexity, and "chat with your files" tools do under the hood.
An independent video AI company. The most professional editing surface, masks, motion, style transfer. Where many serious creators do real video AI work.
A reusable instruction package for Claude. A folder containing a markdown file Claude loads when its description matches your task. More powerful than Custom GPTs but with a steeper learning curve.
Anthropic's default Claude tier. Excellent writing, careful reasoning. Where most of the day-to-day work happens for paid users.
OpenAI's video generation model. Bundled into ChatGPT for paid users. Strong at short, cinematic clips.
Text-to-music tools. Type a song prompt, get a full track with vocals. Mostly novelty for non-music users; serious for hobbyists.
Hidden instructions the platform (or you, in a Project/Custom GPT/Gem) sends at the start of a conversation. Sets the model's role, behavior, and constraints. Users see only the chat, not the system prompt, but you can write your own once you reach intermediate features.
A chunk of text the model sees as a unit. Roughly 4 characters of English, or about 3/4 of a word. Models never see letters; they see tokens. The unit by which you're billed (on APIs) and constrained (in context windows).
How a model gets built. A huge text corpus is fed in over weeks or months on GPU clusters, adjusting the model's parameters until it can predict tokens well. Happens once (per version). The opposite of inference.
The category of AI that turns written text into spoken audio. ElevenLabs is the leader. OpenAI TTS and Google Chirp are the major alternatives.
Google's flagship Gemini tier. Top reasoning, heaviest lifts. Reserved for AI Ultra subscribers.
What you type into the chat box. As a beginner, every message you send is a user prompt. At Tier 4 you'll also start writing system prompts inside Custom GPTs, Projects, and Gems.
Google's video generation model family. Inside Gemini and dedicated Google products. Tight integration with Workspace and YouTube.
Inside an LLM, a way to talk out loud and get spoken responses back. ChatGPT's Advanced Voice and Gemini Live are the strongest. The most underrated intermediate feature for thinking with AI.
OpenAI's open-source transcription model. Near-state-of-the-art accuracy. Powers many of the transcription products on the market.
A growing field of agentic coding tools. Worth a scan if Cursor and Claude Code don't fit your workflow.
Google's suite of productivity apps, Gmail, Docs, Sheets, Slides, Drive, Meet, Calendar, with Gemini built in. Gemini's biggest structural advantage over ChatGPT and Claude.
Workflow automation tools, now AI-aware. You wire steps together visually; LLMs handle the "thinking" steps. Where most real business "agents" actually live today.
Asking the model to do a task without showing it any examples. The default. Effective for easy tasks; weaker than few-shot for anything stylistic or structured.