Embedding
Also known as: Vector Embedding, Text Embedding, Dense Vector
Definition
A dense numerical vector representation of content (text, images, audio) in which semantically similar items map to nearby vectors. An embedding compresses semantic meaning into a fixed-size array of floats (typically 384 to 3072 dimensions) that can be compared with a similarity or distance measure such as cosine similarity. Embeddings are the foundation of vector search: without them, there is nothing to search over.
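To make "compared mathematically" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The three 4-dimensional vectors are hand-written for illustration and do not come from any real model; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (assumption: invented values, not model output).
cat = [0.9, 0.8, 0.1, 0.0]
kitten = [0.8, 0.9, 0.2, 0.0]
invoice = [0.0, 0.1, 0.9, 0.8]

print(cosine_similarity(cat, kitten))   # high: the vectors point the same way
print(cosine_similarity(cat, invoice))  # low: the vectors are nearly orthogonal
```

The fixed size matters here: the comparison is only defined when both vectors have the same number of dimensions, which in practice means they should come from the same model.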
What this is NOT
- Not the embedding model (the model produces embeddings; an embedding is a single vector)
- Not a sparse vector (embeddings are dense; TF-IDF vectors are sparse)
- Not the text itself (embeddings are numerical representations of text)
Alternative Interpretations
Different communities use this term differently:
llm-practitioners
The output of an embedding model (for example OpenAI text-embedding-3, Cohere Embed, or sentence-transformers): a vector computed from text, stored in a vector database, and compared against other vectors by similarity.
Sources: OpenAI Embeddings API documentation, Sentence-Transformers documentation, MTEB benchmark
deep-learning
A learned representation where items are mapped to points in a continuous vector space such that geometric relationships reflect semantic relationships. Word2Vec and GloVe pioneered this for words; modern models extend to sentences and documents.
Sources: Word2Vec paper (Mikolov et al., 2013), BERT and transformer-based embedding models
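The idea that geometric relationships reflect semantic ones is often illustrated with the word-analogy arithmetic popularized by Word2Vec. The sketch below uses hand-built 2-dimensional vectors whose axes are assumed to mean roughly "royalty" and "gender". Learned embeddings have no such labeled axes and show this behavior only approximately.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical toy vectors (assumption: axis 0 ~ royalty, axis 1 ~ gender).
toy_vectors = {
    "king":   [1.0,  1.0],
    "queen":  [1.0, -1.0],
    "prince": [0.8,  1.0],
    "man":    [0.2,  1.0],
    "woman":  [0.2, -1.0],
    "apple":  [-1.0, 0.0],
}

def analogy(a, b, c, vectors):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

result = analogy("king", "man", "woman", toy_vectors)
print(result)
```

With these toy values the nearest remaining word to king - man + woman is "queen", because subtracting "man" and adding "woman" flips the gender axis while leaving the royalty axis unchanged.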
Examples
- OpenAI's text-embedding-3-small producing a 1536-dimensional vector
- Embedding a product description for a recommendation system
- Converting code snippets to vectors for code search
- Multimodal embeddings that represent both text and images
Counterexamples
Things that might seem like an embedding but are not:
- TF-IDF vectors (sparse, not learned)
- One-hot encodings (not semantic)
- Raw text strings
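A short sketch of why one-hot encodings are "not semantic": every pair of distinct words is equally dissimilar, so the vectors carry no information about meaning. The three-word vocabulary is an assumption chosen for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical vocabulary (assumption: any fixed word list behaves the same way).
vocabulary = ["cat", "kitten", "invoice"]

def one_hot(word):
    """Vector with 1.0 at the word's vocabulary position and 0.0 elsewhere."""
    return [1.0 if w == word else 0.0 for w in vocabulary]

sim_cat_kitten = cosine(one_hot("cat"), one_hot("kitten"))
sim_cat_invoice = cosine(one_hot("cat"), one_hot("invoice"))
print(sim_cat_kitten, sim_cat_invoice)  # prints: 0.0 0.0
```

"cat" is exactly as far from "kitten" as it is from "invoice", which is the property an embedding is meant to fix. The vector length also grows with the vocabulary, whereas an embedding keeps a fixed size.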
Relations
- requires vector-search (Vector search operates on embeddings)
- requires retrieval-augmented-generation (RAG typically uses embeddings for retrieval)
- overlapsWith chunking (Chunks are what get embedded)
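The relations above can be sketched end to end: text is split into chunks, each chunk is embedded, and search ranks chunks by similarity to an embedded query. Everything named here is hypothetical. In particular, `toy_embed` is a stand-in that counts words from hand-picked topic lists, used only so the example runs without a model; a real pipeline would call an embedding model instead and usually an approximate index rather than a full sort.

```python
import math

# Hypothetical topic word lists standing in for learned dimensions (assumption).
TOPICS = [
    {"cat", "kitten", "pet"},
    {"invoice", "payment", "tax"},
    {"vector", "search", "index"},
]

def toy_embed(text):
    """Stand-in embedder: one dimension per topic, counting matching words."""
    words = text.lower().split()
    return [float(sum(w in topic for w in words)) for topic in TOPICS]

def chunk(text, size):
    """Split text into consecutive chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query, index, k=1):
    """Brute-force vector search: rank stored chunks by similarity to the query."""
    q = toy_embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

document = (
    "the cat is a playful pet "
    "the invoice lists payment and tax "
    "a vector index speeds up search"
)
chunks = chunk(document, 6)                    # chunking
index = [(c, toy_embed(c)) for c in chunks]    # embedding each chunk
print(search("adopt a kitten", index))         # vector search over the embeddings
```

The query shares no content word with the chunk it retrieves; the match comes from both landing on the same dimension, which is a crude imitation of what a learned embedding does. In a RAG setup, the retrieved chunk text would then be passed to the generator as context.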
Implementations
Tools and frameworks that implement this concept: