System

Long-lived operational components, such as databases and orchestrators

17 concepts of this type

Agent Memory
agents

Mechanisms that allow an agent to retain and access information beyond the current context window. Memory enables agents to learn from past interactions, maintain state across sessions, and reference ...
Agentic System
agents

A software system that exhibits agent-like behavior—perceiving, reasoning, and acting—but may not fit the strict definition of a single coherent agent. Agentic systems often combine multiple compone...
API Gateway
deployment

A server that acts as an intermediary between clients and backend services, handling cross-cutting concerns like authentication, rate limiting, logging, and routing. For LLM applications, API gateways...
Episodic Memory
retrieval

Memory of specific events, experiences, and interactions—the "what happened" record. In AI agents, episodic memory stores past interactions, actions taken, and their outcomes, enabling the agent to re...
Foundation Model
models

A large model trained on broad data at scale, designed to be adapted to a wide range of downstream tasks. Foundation models are "pre-trained" on general data (text, images, code) and then specialized ...
Guardrails
evaluation

Systems that monitor, filter, or constrain LLM inputs and outputs to prevent harmful, unsafe, or policy-violating content. Guardrails act as safety layers around LLM applications, catching problems th...
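As a rough illustration of a guardrail that filters model output, the sketch below redacts text matching a blocked pattern and reports whether anything was caught. The pattern list and the `apply_guardrail` name are assumptions made for this example; production guardrails typically combine many checks (classifiers, policies, schema validation) rather than a single regular expression.

```python
import re

# Hypothetical policy: redact strings shaped like US Social Security numbers.
BLOCKED_PATTERNS = [re.compile(r"\b\d{3}-\d{2}-\d{4}\b")]


def apply_guardrail(text: str) -> tuple[bool, str]:
    """Return (flagged, redacted_text) for a candidate model output."""
    flagged = False
    redacted = text
    for pattern in BLOCKED_PATTERNS:
        # subn returns the new string and the number of substitutions made.
        redacted, count = pattern.subn("[REDACTED]", redacted)
        flagged = flagged or count > 0
    return flagged, redacted
```

A caller might log flagged responses for review while still returning the redacted text to the user.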
Knowledge Base
retrieval

A structured collection of information that an AI system can query for knowledge retrieval. In LLM contexts, knowledge bases typically store documents, chunks, or facts that can be retrieved and injec...
Large Language Model
models

A neural network trained on massive text corpora to predict and generate natural language. "Large" refers to both the model size (billions of parameters) and training data (trillions of tokens). LLMs ...
Long-Term Memory
retrieval

Memory that persists beyond a single conversation or session, enabling an agent or system to retain information across interactions over extended periods. Long-term memory requires external storage (d...
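A minimal sketch of memory that survives across sessions, assuming SQLite as the external store. The `LongTermMemory` class, its table layout, and its method names are hypothetical choices for this example, not a prescribed design.

```python
import sqlite3


class LongTermMemory:
    """Key-value memory backed by a SQLite file so it outlives one session."""

    def __init__(self, path: str) -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memory (key TEXT PRIMARY KEY, value TEXT)"
        )
        self.conn.commit()

    def remember(self, key: str, value: str) -> None:
        # Overwrite any earlier value stored under the same key.
        self.conn.execute(
            "INSERT OR REPLACE INTO memory (key, value) VALUES (?, ?)", (key, value)
        )
        self.conn.commit()

    def recall(self, key: str):
        row = self.conn.execute(
            "SELECT value FROM memory WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def close(self) -> None:
        self.conn.close()
```

Opening a second `LongTermMemory` on the same file path stands in for a later session: values written earlier remain available.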
Model Router
deployment

A system that dynamically selects which model to use for a given request based on criteria like cost, latency, capability, or query complexity. Routers enable cost optimization (use cheaper models for...
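The routing idea can be sketched as picking the cheapest model that is judged capable of handling a request. Everything named here is an assumption for illustration: the model names, the cost figures, and the word-count heuristic standing in for a real complexity classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str            # hypothetical model identifier
    cost_per_1k: float   # illustrative cost figure, not real pricing
    max_complexity: int  # highest complexity score this model should handle


# Hypothetical catalogue, ordered from cheapest to most capable.
MODELS = [
    ModelOption("small-model", 0.1, 3),
    ModelOption("medium-model", 1.0, 7),
    ModelOption("large-model", 10.0, 10),
]


def estimate_complexity(query: str) -> int:
    """Crude stand-in for a classifier: longer queries score higher (1 to 10)."""
    words = len(query.split())
    return max(1, min(10, 1 + words // 10))


def route(query: str) -> ModelOption:
    """Return the cheapest model whose limit covers the estimated complexity."""
    score = estimate_complexity(query)
    for model in MODELS:
        if score <= model.max_complexity:
            return model
    return MODELS[-1]
```

Ordering the catalogue by cost keeps the selection loop simple: the first model that qualifies is also the cheapest one that qualifies.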
Model Serving
deployment

The infrastructure and systems that make trained models available for inference requests. Model serving handles loading models into memory, processing requests, managing resources, and returning predi...
Multi-Agent System
agents

A system composed of multiple agents that interact, collaborate, or compete to achieve individual or collective goals. Agents in a multi-agent system may be homogeneous (same capabilities) or heteroge...
Multimodal Model
models

A model that can process and/or generate multiple types of data (modalities) such as text, images, audio, and video. Multimodal models understand relationships across modalities—they can describe imag...
Retriever
retrieval

A component that takes a query and returns relevant documents or passages from a corpus. The retriever is the "search" part of RAG—it determines what context the LLM sees. Retrievers can use various s...
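To make the "query in, relevant passages out" contract concrete, here is a toy retriever that ranks documents by how many words they share with the query. Word overlap is a deliberately simple scoring choice for this sketch; the `tokenize` and `retrieve` names are hypothetical.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase and split on whitespace; a stand-in for real text analysis."""
    return set(text.lower().split())


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return up to k documents sharing the most words with the query."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(doc)), doc) for doc in corpus]
    # sorted() is stable, so ties keep their original corpus order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]
```

Documents with no overlap are dropped rather than returned as filler, so a query that matches nothing yields an empty list.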
Semantic Memory
retrieval

Memory of facts, concepts, and general knowledge independent of specific episodes or events. In AI agents, semantic memory stores learned facts, user preferences, and domain knowledge that can be retr...
Tokenizer
models

A component that converts raw text into tokens (numerical IDs) that an LLM can process, and converts tokens back to text. Tokenizers define the vocabulary of a model and how text is segmented. The tok...
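The text-to-IDs round trip can be shown with a toy whitespace tokenizer. This is only a sketch: LLM tokenizers generally use subword schemes such as byte-pair encoding, and the `ToyTokenizer` class and its `<unk>` convention are assumptions made for this example.

```python
class ToyTokenizer:
    """Whitespace tokenizer with a fixed vocabulary built from sample text."""

    def __init__(self, corpus: list[str]) -> None:
        words = sorted({word for text in corpus for word in text.split()})
        self.unk_id = 0  # ID reserved for words outside the vocabulary
        self.id_to_token = ["<unk>"] + words
        self.token_to_id = {tok: i for i, tok in enumerate(self.id_to_token)}

    def encode(self, text: str) -> list[int]:
        """Map each whitespace-separated word to its numerical ID."""
        return [self.token_to_id.get(word, self.unk_id) for word in text.split()]

    def decode(self, ids: list[int]) -> str:
        """Map IDs back to words and join them with single spaces."""
        return " ".join(self.id_to_token[i] for i in ids)
```

Because the vocabulary is fixed at construction time, any unseen word collapses to `<unk>` and cannot be recovered on decode, which is one reason subword vocabularies are preferred in practice.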
Transformer
models

A neural network architecture based on self-attention mechanisms, introduced in "Attention Is All You Need" (2017). Transformers process sequences by allowing each position to attend to all other posi...
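The core of self-attention, where each position attends to every other position, can be sketched in plain Python. This simplified version assumes identity query, key, and value projections and a single head; a real Transformer layer learns separate projection matrices and adds further components.

```python
import math


def softmax(xs: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: list[list[float]]) -> list[list[float]]:
    """Scaled dot-product self-attention with identity Q/K/V (a simplification)."""
    d = len(x[0])
    output = []
    for query in x:
        # Score this position against every position, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x
        ]
        weights = softmax(scores)
        # Each output vector is a weighted average of all value vectors.
        output.append(
            [sum(w * value[j] for w, value in zip(weights, x)) for j in range(d)]
        )
    return output
```

Each output row is a weighted mix of all input rows, with the weights determined by how similar the positions are to one another.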