What is an embedding?
Definition
An embedding is a numerical vector representation of text (or other data) in which semantically similar inputs map to vectors that are close together in vector space. Embeddings power similarity search, classification, and RAG retrieval.
Embeddings turn text into fixed-size float arrays (typically 384–3,072 dimensions). The key property: cosine similarity between embedding vectors approximates semantic similarity between the original texts. Production uses include vector search for RAG, clustering documents, finding duplicates, and recommendation systems. Models include OpenAI text-embedding-3, Cohere Embed v3, and many open-source alternatives (BGE, E5, GTE).
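The key property above can be sketched in a few lines. This is a minimal illustration using toy 4-dimensional vectors (the vector values and names are invented for the example; real models output 384–3,072 dimensions):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|); ranges from -1 to 1, higher = more similar
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" -- invented values for illustration only
cat = [0.9, 0.1, 0.0, 0.2]
kitten = [0.85, 0.15, 0.05, 0.25]
invoice = [0.0, 0.8, 0.6, 0.1]

print(cosine_similarity(cat, kitten))   # close to 1.0
print(cosine_similarity(cat, invoice))  # much lower
```

With real embedding model outputs the pattern is the same: related texts score near 1.0, unrelated texts score much lower.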
Choosing an embedding model
Dimensionality, language coverage, and domain (general-purpose vs code vs scientific) all matter. For most RAG use cases, OpenAI text-embedding-3-small or BGE-large work well. Specialized domains (legal, medical, code) benefit from domain-tuned models.
Storing embeddings
Vector databases (pgvector, Pinecone, Qdrant, Weaviate) index embeddings for fast nearest-neighbor search. For small corpora (under 100K vectors), a flat in-memory index is often enough.
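For the small-corpus case, a flat index is just a brute-force scan over all stored vectors. A minimal sketch (the document IDs and vectors are made up for illustration):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def top_k(query, index, k=2):
    """Brute-force nearest-neighbor search over a flat in-memory index.

    index is a list of (doc_id, vector) pairs. An O(n) scan per query
    is usually fast enough below ~100K vectors."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical corpus of pre-computed embeddings
index = [
    ("doc-a", [1.0, 0.0, 0.1]),
    ("doc-b", [0.9, 0.1, 0.2]),
    ("doc-c", [0.0, 1.0, 0.0]),
]
print(top_k([1.0, 0.05, 0.1], index, k=2))  # doc-a and doc-b rank first
```

A vector database replaces the linear scan with an approximate index (e.g. HNSW) so queries stay fast as the corpus grows into the millions.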