evaluation
7 concepts in this domain
-
Alignment
PropertyThe degree to which an AI system's behavior matches intended goals, values, and constraints. For LLMs, alignment means the model is helpful, harmless, and honest—it does what users want, avoids harmfu...
Also: AI Alignment, Value Alignment, Model Alignment
-
Benchmark
ArtifactA standardized dataset and evaluation protocol for measuring LLM performance on specific capabilities or tasks. Benchmarks enable comparison across models and tracking of progress over time. They typi...
Also: Evaluation Benchmark, Test Suite, Eval
-
Faithfulness
PropertyThe degree to which generated text accurately reflects and is supported by the source material or context provided. A faithful response contains only claims that can be verified from the given sources...
Also: Factual Consistency, Source Fidelity
-
Grounding
ProcessConstraining LLM outputs to be based on and traceable to specific source material, rather than generated from the model's parametric knowledge alone. Grounding connects generated text to verifiable so...
Also: Grounded Generation, Source Attribution
-
Guardrails
SystemSystems that monitor, filter, or constrain LLM inputs and outputs to prevent harmful, unsafe, or policy-violating content. Guardrails act as safety layers around LLM applications, catching problems th...
Also: Safety Guardrails, Content Filters, Safety Layers
-
Hallucination
PropertyWhen an LLM generates content that is factually incorrect, nonsensical, or unfaithful to provided source material, presented confidently as if true. Hallucinations are a fundamental limitation of LLMs...
Also: Confabulation, Fabrication, Making Things Up
-
Red Teaming
ProcessSystematically testing AI systems by attempting to make them fail, produce harmful outputs, or behave in unintended ways. Red teams act as adversaries, probing for vulnerabilities through prompt injec...
Also: Adversarial Testing, Safety Testing, Attack Testing