Retrieval
12 concepts in this domain
-
Chunking
Process: The process of dividing documents into smaller pieces (chunks) for embedding and retrieval. Chunking is necessary because: (1) embedding models have input limits, (2) smaller chunks enable more precise...
Also: Text Chunking, Document Chunking, Splitting
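The fixed-size splitting described above can be sketched as follows. This is a minimal character-based version (real chunkers often split on tokens or sentence boundaries); `chunk_text` is an illustrative name, not a library API:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    Overlap between neighboring chunks helps preserve context that
    would otherwise be cut at a chunk boundary.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

Each chunk shares its last `overlap` characters with the start of the next chunk.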
-
Context Window
Property: The maximum amount of text (measured in tokens) that an LLM can process in a single inference call, including both input (prompt + context) and output (response). The context window is a hard constraint...
Also: Context Length, Max Tokens, Context Size
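Because the window must hold both input and output, a budget check has to reserve room for the response. A rough sketch (the ~4 characters-per-token heuristic is an approximation for English text; real systems use the model's own tokenizer; `fits_context` is an illustrative name):

```python
def fits_context(prompt: str, max_output_tokens: int,
                 context_window: int, chars_per_token: float = 4.0) -> bool:
    """Estimate whether prompt + reserved output fits in the context window.

    Uses a crude ~4 chars/token heuristic; a production check would
    count tokens with the model's actual tokenizer.
    """
    est_prompt_tokens = len(prompt) / chars_per_token
    return est_prompt_tokens + max_output_tokens <= context_window
```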
-
Embedding
Artifact: A dense numerical vector representation of content (text, images, audio) where similar items have similar vectors. Embeddings compress semantic meaning into a fixed-size array of floats (typically 384...
Also: Vector Embedding, Text Embedding, Dense Vector
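"Similar items have similar vectors" is usually measured with cosine similarity. A self-contained sketch over plain Python lists (real systems use NumPy or the vector store's built-in distance functions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction,
    0.0 = orthogonal (no similarity under this metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```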
-
Episodic Memory
System: Memory of specific events, experiences, and interactions—the "what happened" record. In AI agents, episodic memory stores past interactions, actions taken, and their outcomes, enabling the agent to re...
Also: Event Memory, Autobiographical Memory
-
Hybrid Search
Process: A retrieval approach that combines multiple search strategies—typically keyword-based (sparse/BM25) and embedding-based (dense/vector)—to get the benefits of both. Keyword search excels at exact match...
Also: Hybrid Retrieval, Sparse-Dense Retrieval
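One common way to combine sparse and dense result lists is Reciprocal Rank Fusion (RRF), which scores each document by its rank in every list rather than by raw scores. A minimal sketch (`rrf_fuse` is an illustrative name; k=60 is the value commonly used in the RRF literature):

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc IDs via Reciprocal Rank Fusion.

    Each document gets sum(1 / (k + rank)) over the lists it appears in,
    so documents ranked highly in multiple lists rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs no score normalization, which is why it is popular for fusing BM25 scores with cosine similarities that live on different scales.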
-
Knowledge Base
System: A structured collection of information that an AI system can query for knowledge retrieval. In LLM contexts, knowledge bases typically store documents, chunks, or facts that can be retrieved and injected...
Also: KB, Document Store, Corpus
-
Long-Term Memory
System: Memory that persists beyond a single conversation or session, enabling an agent or system to retain information across interactions over extended periods. Long-term memory requires external storage (d...
Also: Persistent Memory, External Memory
-
Reranking
Process: A second-stage ranking process that takes initial retrieval results and reorders them using a more sophisticated (and expensive) relevance model. Rerankers typically use cross-encoder models that join...
Also: Reranker, Cross-Encoder Reranking, Second-Stage Ranking
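The two-stage shape can be sketched as below. The cross-encoder is stood in for by a trivial word-overlap score so the example is self-contained; a real reranker would instead run a model over each (query, candidate) pair, which is exactly why it is applied only to the small first-stage candidate set:

```python
def rerank(query: str, candidates: list[str], top_k: int = 3) -> list[str]:
    """Reorder first-stage candidates by a per-pair relevance score.

    The score below (word overlap) is a placeholder for a cross-encoder
    model call; the two-stage pattern is the point, not the scorer.
    """
    q_words = set(query.lower().split())

    def score(doc: str) -> float:
        d_words = set(doc.lower().split())
        return len(q_words & d_words) / (len(d_words) or 1)

    return sorted(candidates, key=score, reverse=True)[:top_k]
```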
-
Retrieval-Augmented Generation
Process: A pattern that enhances LLM generation by first retrieving relevant documents from an external knowledge source, then including those documents in the prompt as context. RAG addresses LLM limitations:...
Also: RAG, Retrieval Augmentation
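The retrieve-then-generate flow can be sketched in a few lines. The `retriever` and `llm` callables are stand-ins (any search function and any model call slot in); the prompt wording is illustrative:

```python
def rag_answer(query: str, retriever, llm, top_k: int = 3) -> str:
    """Minimal RAG loop: retrieve context, inject it into the prompt, generate.

    `retriever(query)` returns ranked text chunks; `llm(prompt)` returns
    the model's completion. Both are caller-supplied callables.
    """
    # 1. Retrieve the most relevant chunks for the query.
    docs = retriever(query)[:top_k]
    # 2. Inject them into the prompt as context.
    context = "\n\n".join(docs)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    # 3. Generate with the augmented prompt.
    return llm(prompt)
```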
-
Retriever
System: A component that takes a query and returns relevant documents or passages from a corpus. The retriever is the "search" part of RAG—it determines what context the LLM sees. Retrievers can use various s...
Also: Retrieval Component, Search Component
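A retriever is ultimately just "query in, ranked passages out". A sketch of that interface with a naive keyword implementation (the `Retriever` protocol and `KeywordRetriever` class are illustrative names, not a standard API):

```python
from typing import Protocol

class Retriever(Protocol):
    """Anything that maps a query to a ranked list of passages."""
    def retrieve(self, query: str, top_k: int) -> list[str]: ...

class KeywordRetriever:
    """Toy retriever: rank corpus documents by query-term overlap."""

    def __init__(self, corpus: list[str]):
        self.corpus = corpus

    def retrieve(self, query: str, top_k: int = 3) -> list[str]:
        terms = set(query.lower().split())
        ranked = sorted(
            self.corpus,
            key=lambda doc: len(terms & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]
```

A dense retriever would satisfy the same protocol while ranking by embedding similarity instead of term overlap.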
-
Semantic Memory
System: Memory of facts, concepts, and general knowledge independent of specific episodes or events. In AI agents, semantic memory stores learned facts, user preferences, and domain knowledge that can be retrieved...
Also: Factual Memory, Knowledge Memory
-
Vector Search
Process: Finding items similar to a query by comparing their vector representations (embeddings) in a high-dimensional space. Unlike keyword search which matches exact terms, vector search captures semantic similarity...
Also: Semantic Search, Similarity Search, Nearest Neighbor Search, ANN Search
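At its core, vector search ranks stored vectors by similarity to the query vector. A brute-force (exact nearest neighbor) sketch over a tiny in-memory index; production systems use ANN indexes (e.g. HNSW) to avoid scanning every vector:

```python
import math

def nearest(query_vec: list[float], index: dict[str, list[float]],
            top_k: int = 2) -> list[str]:
    """Return the IDs of the top_k vectors most similar to query_vec,
    by cosine similarity, via an exhaustive scan of the index."""
    def cos(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    ranked = sorted(index, key=lambda k: cos(query_vec, index[k]), reverse=True)
    return ranked[:top_k]
```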