Retrieval
12 concepts in this domain
-
Chunking
Process: The process of dividing documents into smaller pieces (chunks) for embedding and retrieval. Chunking is necessary because: (1) embedding models have input limits, (2) smaller chunks enable more precise...
Also: Text Chunking, Document Chunking, Splitting
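The fixed-size splitting described above can be sketched as follows. This is a minimal character-based version (real chunkers often split on tokens or sentence boundaries); `chunk_text` is an illustrative name, not a library API:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    Overlap between neighboring chunks helps preserve context that
    would otherwise be cut at a chunk boundary.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

Each chunk shares its last `overlap` characters with the start of the next chunk.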
-
Context Window
Property: The maximum amount of text (measured in tokens) that an LLM can process in a single inference call, including both input (prompt + context) and output (response). The context window is a hard constraint...
Also: Context Length, Max Tokens, Context Size
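Because the window must hold both input and output, a budget check has to reserve room for the response. A rough sketch (the ~4 characters-per-token heuristic is an approximation for English text; real systems use the model's own tokenizer; `fits_context` is an illustrative name):

```python
def fits_context(prompt: str, max_output_tokens: int,
                 context_window: int, chars_per_token: float = 4.0) -> bool:
    """Estimate whether prompt + reserved output fits in the context window.

    Uses a crude ~4 chars/token heuristic; a production check would
    count tokens with the model's actual tokenizer.
    """
    est_prompt_tokens = len(prompt) / chars_per_token
    return est_prompt_tokens + max_output_tokens <= context_window
```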
-
Embedding
Artifact: A dense numerical vector representation of content (text, images, audio) where similar items have similar vectors. Embeddings compress semantic meaning into a fixed-size array of floats (typically 384...
Also: Vector Embedding, Text Embedding, Dense Vector
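"Similar items have similar vectors" is usually measured with cosine similarity. A self-contained sketch over plain Python lists (real systems use NumPy or the vector store's built-in distance functions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction,
    0.0 = orthogonal (no similarity under this metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```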
-
Episodic Memory
System: Memory of specific events, experiences, and interactions—the "what happened" record. In AI agents, episodic memory stores past interactions, actions taken, and their outcomes, enabling the agent to re...
Also: Event Memory, Autobiographical Memory
-
Hybrid Search
Process: A retrieval approach that combines multiple search strategies—typically keyword-based (sparse/BM25) and embedding-based (dense/vector)—to get the benefits of both. Keyword search excels at exact match...
Also: Hybrid Retrieval, Sparse-Dense Retrieval
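One common way to combine sparse and dense result lists is Reciprocal Rank Fusion (RRF), which scores each document by its rank in every list rather than by raw scores. A minimal sketch (`rrf_fuse` is an illustrative name; k=60 is the value commonly used in the RRF literature):

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc IDs via Reciprocal Rank Fusion.

    Each document gets sum(1 / (k + rank)) over the lists it appears in,
    so documents ranked highly in multiple lists rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs no score normalization, which is why it is popular for fusing BM25 scores with cosine similarities that live on different scales.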
-
Knowledge Base
System: A structured collection of information that an AI system can query for knowledge retrieval. In LLM contexts, knowledge bases typically store documents, chunks, or facts that can be retrieved and injected...
Also: KB, Document Store, Corpus
-
Long-Term Memory
System: Memory that persists beyond a single conversation or session, enabling an agent or system to retain information across interactions over extended periods. Long-term memory requires external storage (d...
Also: Persistent Memory, External Memory
-
Reranking
Process: A second-stage ranking process that takes initial retrieval results and reorders them using a more sophisticated (and expensive) relevance model. Rerankers typically use cross-encoder models that join...
Also: Reranker, Cross-Encoder Reranking, Second-Stage Ranking
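The two-stage shape can be sketched as below. The cross-encoder is stood in for by a trivial word-overlap score so the example is self-contained; a real reranker would instead run a model over each (query, candidate) pair, which is exactly why it is applied only to the small first-stage candidate set:

```python
def rerank(query: str, candidates: list[str], top_k: int = 3) -> list[str]:
    """Reorder first-stage candidates by a per-pair relevance score.

    The score below (word overlap) is a placeholder for a cross-encoder
    model call; the two-stage pattern is the point, not the scorer.
    """
    q_words = set(query.lower().split())

    def score(doc: str) -> float:
        d_words = set(doc.lower().split())
        return len(q_words & d_words) / (len(d_words) or 1)

    return sorted(candidates, key=score, reverse=True)[:top_k]
```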
-
Retrieval-Augmented Generation
Process: A pattern that enhances LLM generation by first retrieving relevant documents from an external knowledge source, then including those documents in the prompt as context. RAG addresses LLM limitations:...
Also: RAG, Retrieval Augmentation
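The retrieve-then-generate flow can be sketched in a few lines. The `retriever` and `llm` callables are stand-ins (any search function and any model call slot in); the prompt wording is illustrative:

```python
def rag_answer(query: str, retriever, llm, top_k: int = 3) -> str:
    """Minimal RAG loop: retrieve context, inject it into the prompt, generate.

    `retriever(query)` returns ranked text chunks; `llm(prompt)` returns
    the model's completion. Both are caller-supplied callables.
    """
    # 1. Retrieve the most relevant chunks for the query.
    docs = retriever(query)[:top_k]
    # 2. Inject them into the prompt as context.
    context = "\n\n".join(docs)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    # 3. Generate with the augmented prompt.
    return llm(prompt)
```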
-
Retriever
System: A component that takes a query and returns relevant documents or passages from a corpus. The retriever is the "search" part of RAG—it determines what context the LLM sees. Retrievers can use various s...
Also: Retrieval Component, Search Component
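A retriever is ultimately just "query in, ranked passages out". A sketch of that interface with a naive keyword implementation (the `Retriever` protocol and `KeywordRetriever` class are illustrative names, not a standard API):

```python
from typing import Protocol

class Retriever(Protocol):
    """Anything that maps a query to a ranked list of passages."""
    def retrieve(self, query: str, top_k: int) -> list[str]: ...

class KeywordRetriever:
    """Toy retriever: rank corpus documents by query-term overlap."""

    def __init__(self, corpus: list[str]):
        self.corpus = corpus

    def retrieve(self, query: str, top_k: int = 3) -> list[str]:
        terms = set(query.lower().split())
        ranked = sorted(
            self.corpus,
            key=lambda doc: len(terms & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]
```

A dense retriever would satisfy the same protocol while ranking by embedding similarity instead of term overlap.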
-
Semantic Memory
System: Memory of facts, concepts, and general knowledge independent of specific episodes or events. In AI agents, semantic memory stores learned facts, user preferences, and domain knowledge that can be retrieved...
Also: Factual Memory, Knowledge Memory
-
Vector Search
Process: Finding items similar to a query by comparing their vector representations (embeddings) in a high-dimensional space. Unlike keyword search which matches exact terms, vector search captures semantic similarity...
Also: Semantic Search, Similarity Search, Nearest Neighbor Search, ANN Search
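At its core, vector search ranks stored vectors by similarity to the query vector. A brute-force (exact nearest neighbor) sketch over a tiny in-memory index; production systems use ANN indexes (e.g. HNSW) to avoid scanning every vector:

```python
import math

def nearest(query_vec: list[float], index: dict[str, list[float]],
            top_k: int = 2) -> list[str]:
    """Return the IDs of the top_k vectors most similar to query_vec,
    by cosine similarity, via an exhaustive scan of the index."""
    def cos(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    ranked = sorted(index, key=lambda k: cos(query_vec, index[k]), reverse=True)
    return ranked[:top_k]
```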