Glossary
AI agent & automation glossary.
Plain definitions of the concepts behind Glitch Grow's AI agents — written so you can quote them in client conversations.
-
agent observability
What is AI agent observability?
Agent observability is the practice of instrumenting an AI agent so you can see what it did, why it did it, and how it performed — typically via structured logs, traces, and metrics tied to each agent decision.
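A minimal sketch of what "instrumenting" can look like in practice: one structured log line per agent decision, so you can filter and aggregate later. The field names here are illustrative, not a standard.

```python
import json
import time

def log_decision(step: str, tool: str, reasoning: str, latency_ms: float) -> str:
    """Emit one structured log line per agent decision."""
    record = {
        "ts": time.time(),       # when the decision happened
        "step": step,            # which stage of the agent loop
        "tool": tool,            # which tool the agent chose
        "reasoning": reasoning,  # why, as a short summary
        "latency_ms": latency_ms,
    }
    return json.dumps(record)

line = log_decision("act", "web_search", "user asked for current pricing", 412.0)
```

Because each line is JSON, any log pipeline can answer questions like "which tool fails most often" without parsing free text.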
-
agent runtime
What is an agent runtime?
An agent runtime is the execution environment that runs an AI agent — handling state persistence, tool dispatch, retries, observability, and the LLM API client — so the agent's logic doesn't have to.
-
agentic RAG
What is agentic RAG?
Agentic RAG is a pattern where an AI agent decides at runtime which retrieval queries to issue, iterates on results, and synthesizes across multiple retrieval rounds — instead of doing one fixed retrieval step.
-
agentic workflow
What is an agentic workflow?
An agentic workflow is an automation where an LLM-driven agent decides the next step at runtime, rather than following a hard-coded sequence — typically using a plan, act, observe, reflect loop.
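The loop above can be sketched in a few lines. This is a toy: the `decide` function stands in for the LLM, and the tool is a stub.

```python
def run_agent(goal, decide, tools, max_steps=5):
    """Minimal plan-act-observe loop. `decide` stands in for the LLM:
    given the goal and history, it returns the next action or ("finish", answer)."""
    history = []
    for _ in range(max_steps):
        action, arg = decide(goal, history)         # plan: pick the next step
        if action == "finish":
            return arg
        observation = tools[action](arg)            # act: run the chosen tool
        history.append((action, arg, observation))  # observe: feed the result back
    return None  # gave up after max_steps

# Stub "LLM": look something up, then finish with what it found.
def decide(goal, history):
    if not history:
        return ("lookup", goal)
    return ("finish", history[-1][2])

tools = {"lookup": lambda q: f"result for {q!r}"}
answer = run_agent("pricing page", decide, tools)
```

The hard-coded alternative would be a fixed sequence of steps; here the next step is chosen at runtime from the history so far.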
-
AI agent
What is an AI agent?
An AI agent is a software system that uses a large language model to plan, take actions through tools, observe results, and iterate toward a goal — without a human stepping through each action.
-
AI digital marketing
What is AI digital marketing?
AI digital marketing is the use of AI agents to run end-to-end marketing operations — paid ads, outbound sales, social posting, voice outreach, SEO, and creative production — instead of stitching together separate SaaS tools for each.
-
chain-of-thought (CoT)
What is chain-of-thought prompting?
Chain-of-thought (CoT) is a prompting technique that asks an LLM to show its reasoning step-by-step before producing a final answer, which improves accuracy on complex reasoning tasks.
-
context window
What is an LLM context window?
A context window is the maximum number of tokens (input + output combined, in most APIs) that an LLM can process in a single call — ranging from a few thousand tokens for small models to a million or more for current frontier models.
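Because input and output usually share one window, a practical habit is to reserve room for the completion before sending the call. A small sketch:

```python
def fits_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Most APIs count prompt and completion against one shared window,
    so reserve room for the output before sending the call."""
    return input_tokens + max_output_tokens <= context_window

# e.g. a 128,000-token window with a 4,096-token completion budget
ok = fits_context(120_000, 4_096, 128_000)        # 124,096 <= 128,000
too_big = fits_context(125_000, 4_096, 128_000)   # 129,096 > 128,000
```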
-
embedding
What is an embedding?
An embedding is a numerical vector representation of text (or other data) where semantically similar inputs produce vectors that are close together in vector space — used for similarity search, classification, and RAG retrieval.
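"Close together" is usually measured with cosine similarity. A toy example with made-up 3-dimensional vectors (real embeddings have hundreds to thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Similar meanings -> vectors pointing the same way -> score near 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

refund_email = [0.9, 0.1, 0.0]
refund_chat  = [0.8, 0.2, 0.1]
pizza_recipe = [0.0, 0.1, 0.9]

same_topic = cosine_similarity(refund_email, refund_chat)    # high
different  = cosine_similarity(refund_email, pizza_recipe)   # low
```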
-
fine-tuning
What is LLM fine-tuning?
Fine-tuning is additional training of a pre-trained LLM on a smaller, task-specific dataset — adjusting the model's weights to improve performance on a particular use case.
-
function calling
What is function calling?
Function calling (also called tool use) is a feature in modern LLMs that lets the model output a structured request to invoke a function — the host application runs the function, returns the result, and the model continues reasoning with that result.
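The host application's side of that loop can be sketched like this. The tool registry and the JSON shape of the model's request are simplified illustrations; real providers each have their own exact format.

```python
import json

# Hypothetical tool registry; the model never runs these itself.
TOOLS = {
    "get_weather": lambda city: {"city": city, "temp_c": 21},
}

def handle_tool_call(model_output: str) -> dict:
    """The model emits a structured request; the host parses it, runs the
    named function, and returns the result as the next message."""
    call = json.loads(model_output)  # e.g. {"name": ..., "arguments": {...}}
    result = TOOLS[call["name"]](**call["arguments"])
    return {"role": "tool", "name": call["name"], "content": json.dumps(result)}

# A model turn that requests a tool instead of answering directly:
model_output = '{"name": "get_weather", "arguments": {"city": "Lisbon"}}'
tool_message = handle_tool_call(model_output)
```

The tool message then goes back into the conversation so the model can reason with the result.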
-
headless automation
What is headless automation?
Headless automation is automation that runs without a graphical interface — typically as background services, scheduled jobs, or API-driven workflows — rather than scripted UI clicks in a visible browser.
-
human-in-the-loop (HITL)
What is human-in-the-loop (HITL) in AI agents?
Human-in-the-loop (HITL) is a pattern where an AI agent pauses at predefined checkpoints to wait for a human's approval, edit, or rejection before continuing — typically for actions that are public, irreversible, or risky.
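The checkpoint itself is simple: the agent hands a draft action to a reviewer, who can approve it, edit it, or reject it. A toy sketch with an in-process reviewer standing in for a real approval UI:

```python
def run_with_approval(action, approve):
    """Pause before a risky action. `approve` is the human checkpoint: it may
    return the action unchanged, an edited version, or None to reject."""
    decision = approve(action)
    if decision is None:
        return "rejected"
    return f"executed: {decision['body']}"

draft = {"type": "send_email", "body": "Hi, here is your quote."}

# The human edits the draft before letting the agent proceed.
def reviewer(action):
    return {**action, "body": action["body"] + " Let me know if you have questions."}

result = run_with_approval(draft, reviewer)
```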
-
inference cost
What is LLM inference cost?
Inference cost is the per-call cost of running an LLM — measured in dollars per million input tokens and dollars per million output tokens, varying widely between models and providers.
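The arithmetic is straightforward; the rates below are placeholders, not any provider's real pricing.

```python
def call_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Price is quoted in dollars per million tokens, separately
    for input and output."""
    return (input_tokens / 1_000_000) * in_price_per_m + \
           (output_tokens / 1_000_000) * out_price_per_m

# 20,000 input + 1,000 output tokens at $3/M in, $15/M out:
cost = call_cost(20_000, 1_000, 3.00, 15.00)  # $0.06 + $0.015 = $0.075
```

Note that output tokens are usually several times more expensive than input tokens, which is why long completions dominate the bill.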
-
LangChain
What is LangChain?
LangChain is an open-source framework for building applications with LLMs, providing primitives for prompts, chains, memory, agents, and tool use — across Python and TypeScript.
-
LangGraph
What is LangGraph?
LangGraph is a Python and TypeScript framework for building stateful, multi-agent LLM applications as graphs of nodes with conditional edges and persistent state.
-
LLM orchestration
What is LLM orchestration?
LLM orchestration is the coordination layer that sequences multiple LLM calls, tool invocations, retries, and conditional branches into a coherent agent or workflow — typically backed by a state machine or graph framework.
-
MCP server
What is an MCP server?
An MCP server is a small program that exposes tools, resources, and prompts to an LLM client (like Claude Desktop or an agent) over the Model Context Protocol — a standard for letting AIs interact with external systems.
-
multi-tenancy
What is multi-tenancy in AI agents?
Multi-tenancy is the architectural pattern where one running instance of an agent serves many independent customers (tenants), with each tenant's data, credentials, and configuration isolated from the others.
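The core discipline is that every request is scoped to a tenant and nothing falls back to another customer's data. A toy sketch with an in-memory config store (real systems would keep credentials in a secrets manager, not a dict):

```python
# Hypothetical per-tenant config; one process, many isolated customers.
TENANTS = {
    "acme": {"api_key": "acme-key", "tone": "formal"},
    "globex": {"api_key": "globex-key", "tone": "casual"},
}

def get_tenant_config(tenant_id: str) -> dict:
    """Missing tenants fail loudly rather than silently falling back
    to another customer's credentials or defaults."""
    if tenant_id not in TENANTS:
        raise KeyError(f"unknown tenant: {tenant_id}")
    return TENANTS[tenant_id]

config = get_tenant_config("acme")
```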
-
orchestration
What is workflow orchestration?
Orchestration is the coordination of multiple services, tools, or steps into a single coherent workflow — handling sequencing, retries, conditional branches, and state across the workflow's lifetime.
-
prompt injection
What is prompt injection?
Prompt injection is a class of attack where malicious instructions hidden in untrusted content (web pages, emails, documents) cause an AI agent to take actions or reveal information against the user's intent.
-
retrieval-augmented generation (RAG)
What is retrieval-augmented generation (RAG)?
Retrieval-augmented generation is a pattern where an LLM retrieves relevant documents from a knowledge base at inference time and uses them as context to ground its response in source material.
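Stripped to its essentials, RAG is "retrieve, then stuff the results into the prompt." This toy version uses keyword overlap in place of embedding search, but the shape is the same:

```python
def retrieve(query, docs, k=2):
    """Toy keyword-overlap retrieval standing in for embedding search."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    """Ground the model: retrieved passages go into the context, and the
    instruction tells the model to answer from them."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"

docs = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Refunds go through the billing portal.",
]
prompt = build_prompt("how do refunds work", docs)
```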
-
self-hosting
What is self-hosting?
Self-hosting is the practice of running software on infrastructure you control — your own servers, cloud accounts, or VMs — rather than using a managed SaaS provider.
-
structured output
What is LLM structured output?
Structured output is the LLM capability of returning responses that conform to a predefined schema (typically JSON) — guaranteed by the model's API, not just requested in the prompt.
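When the shape is enforced by the API, the response is guaranteed to parse; when you only ask for JSON in the prompt, you validate it yourself. A sketch of that fallback check, against a made-up lead-qualification schema:

```python
import json

# Hypothetical schema the model's response must match.
SCHEMA = {"name": str, "score": int, "qualified": bool}

def parse_structured(raw: str) -> dict:
    """Parse the model's reply and check each field's type,
    failing loudly on anything missing or malformed."""
    data = json.loads(raw)
    for field, typ in SCHEMA.items():
        if not isinstance(data.get(field), typ):
            raise ValueError(f"bad or missing field: {field}")
    return data

lead = parse_structured('{"name": "Acme Co", "score": 82, "qualified": true}')
```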
-
system prompt
What is a system prompt?
A system prompt is the instruction message that establishes an LLM's role, tone, constraints, and tools — separate from the user's message and typically given higher priority by the model.
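In chat-style APIs this shows up as the system prompt travelling as its own message, ahead of the user's. The message shape below mirrors the common chat-completions format; exact field names vary by provider.

```python
# The system prompt is its own message, separate from user input.
messages = [
    {
        "role": "system",
        "content": "You are Glitch Grow's outreach assistant. "
                   "Be concise. Never promise specific results.",
    },
    {"role": "user", "content": "Draft a follow-up email to a cold lead."},
]
```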
-
token
What is a token in LLM APIs?
A token is the unit of text an LLM processes — typically a word fragment of 1–4 characters; English averages roughly 0.75 words per token, so 1,000 tokens ≈ 750 words.
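That rule of thumb is enough for quick budget estimates; real counts come from the model's own tokenizer and vary by model.

```python
def estimate_tokens(text: str) -> int:
    """Rough rule of thumb for English: ~0.75 words per token,
    i.e. tokens ~= words / 0.75. Real tokenizers vary by model."""
    words = len(text.split())
    return round(words / 0.75)

n = estimate_tokens("the quick brown fox jumps over the lazy dog")  # 9 words -> ~12 tokens
```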
-
tool use
What is tool use in LLMs?
Tool use (also called function calling) is the LLM capability of emitting structured requests to call external functions — the host application runs the function and returns the result for the model to continue reasoning.
-
vector database
What is a vector database?
A vector database is a data store optimized for similarity search over high-dimensional vectors — used as the retrieval layer for RAG, recommendation, and semantic search applications.
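At toy scale, "similarity search" is just ranking every stored vector against the query. A vector database does the same thing over millions of vectors using approximate indexes (HNSW, IVF) instead of this brute-force scan:

```python
import math

def top_k(query, store, k=2):
    """Brute-force nearest neighbours by cosine similarity."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)
    return sorted(store, key=lambda item: cos(query, item[1]), reverse=True)[:k]

# (label, embedding) pairs; 2-dimensional vectors for illustration only.
store = [
    ("refund policy", [0.9, 0.1]),
    ("holiday hours", [0.1, 0.9]),
    ("billing portal", [0.8, 0.3]),
]
hits = top_k([1.0, 0.0], store, k=2)
```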
-
voice agent
What is a voice agent?
A voice agent is an AI agent that interacts over a phone or voice channel — taking spoken input through speech-to-text, reasoning with an LLM, and responding through text-to-speech, typically with sub-second latency.
-
webhook
What is a webhook?
A webhook is an HTTP callback that one service makes to another when an event occurs — used to push notifications between systems instead of polling for changes.
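Because a webhook endpoint is a public URL, many senders sign the request body with a shared secret so the receiver can prove the event is genuine. A sketch of that verification:

```python
import hashlib
import hmac

def verify_webhook(payload: bytes, signature: str, secret: bytes) -> bool:
    """Recompute the HMAC of the body and compare in constant time."""
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

secret = b"shared-secret"
payload = b'{"event": "lead.created", "id": "123"}'
signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()  # sender side

authentic = verify_webhook(payload, signature, secret)
tampered = verify_webhook(b'{"event": "lead.deleted"}', signature, secret)
```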
-
white-label AI
What is white-label AI?
White-label AI is an AI product or service that one company licenses or rebrands as its own, presenting it to end clients under its own brand rather than the original builder's.