Faithfulness

Property evaluation published

Also known as: Factual Consistency, Source Fidelity

Definition

The degree to which generated text accurately reflects and is supported by the source material or context provided. A faithful response contains only claims that can be verified against the given sources and adds no unsupported information. Faithfulness is a key metric for evaluating retrieval-augmented generation (RAG) systems.

What this is NOT

Not correctness (an answer can be faithful to a source that is itself wrong)
Not completeness (a faithful answer can omit information that the source contains)
Not relevance (an answer can be faithful to the source yet fail to address the question)

Alternative Interpretations

Different communities use this term differently:

llm-practitioners

A quality measure for RAG outputs: does the generated answer accurately represent what the retrieved documents say, without adding claims that are not in the sources? Typically evaluated with a faithfulness metric such as the one provided by RAGAS.

Sources: RAGAS faithfulness metric, RAG evaluation frameworks, Summarization evaluation literature

Examples

RAGAS faithfulness score for RAG evaluation
NLI-based verification of generated claims
Checking each sentence against source documents
LLM judge evaluating 'is this claim supported by the context?'
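
The sentence-by-sentence approach in the examples above can be sketched as a score: the fraction of claims in the answer that the context supports. This is a minimal illustration and not the RAGAS implementation. The names faithfulness_score, split_claims, and overlap_judge are hypothetical. The word-overlap judge is only a crude stand-in for an NLI model or LLM judge, and treating one sentence as one claim is a simplifying assumption.

```python
import re
from typing import Callable, List


def overlap_judge(claim: str, context: str, threshold: float = 0.8) -> bool:
    """Hypothetical stand-in for an NLI model or LLM judge.

    Treats a claim as supported when at least `threshold` of its distinct
    words also appear in the context. Real systems should use entailment.
    """
    claim_words = set(re.findall(r"[a-z0-9]+", claim.lower()))
    context_words = set(re.findall(r"[a-z0-9]+", context.lower()))
    if not claim_words:
        return True  # nothing to verify in this claim
    return len(claim_words & context_words) / len(claim_words) >= threshold


def split_claims(answer: str) -> List[str]:
    """Naive assumption: each sentence is exactly one claim."""
    parts = re.split(r"(?<=[.!?])\s+", answer)
    return [p.strip() for p in parts if p.strip()]


def faithfulness_score(
    answer: str,
    context: str,
    judge: Callable[[str, str], bool] = overlap_judge,
) -> float:
    """Return supported claims divided by total claims, in [0, 1]."""
    claims = split_claims(answer)
    if not claims:
        return 1.0  # convention chosen here: an empty answer asserts nothing
    supported = sum(judge(claim, context) for claim in claims)
    return supported / len(claims)


context = "The Eiffel Tower is in Paris. It was completed in 1889."
answer = "The Eiffel Tower is in Paris. It was designed by aliens."
print(faithfulness_score(answer, context))  # 0.5
```

The judge is passed in as a parameter so the same scoring loop can be reused with a stronger verifier. Only the claim-splitting and the judge would need to change.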

Counterexamples

Things that might seem like Faithfulness but are not:

Model adding information not in context (unfaithful)
Model contradicting the provided sources
Model inventing citations
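
The last counterexample, invented citations, is the easiest to check mechanically. The sketch below assumes citations appear as bracketed markers such as [1] or [doc-2] and that the identifiers of the provided sources are known. Both the marker format and the function name invented_citations are assumptions made for illustration.

```python
import re
from typing import List, Set


def invented_citations(answer: str, source_ids: Set[str]) -> List[str]:
    """Return cited identifiers that match none of the provided sources.

    Assumes citations are written as bracketed markers, e.g. [1] or [doc-2].
    Results keep the order of first appearance and contain no duplicates.
    """
    cited = re.findall(r"\[([^\[\]]+)\]", answer)
    missing: List[str] = []
    for marker in cited:
        marker = marker.strip()
        if marker not in source_ids and marker not in missing:
            missing.append(marker)
    return missing


answer = "Revenue grew 12% [1]. Margins fell [4]. Growth was broad [1]."
print(invented_citations(answer, {"1", "2", "3"}))  # ['4']
```

This only catches citations that point to a source that does not exist. A citation to a real source that does not support the claim still needs a claim-level check such as NLI or an LLM judge.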

Relations

inTensionWith hallucination (A hallucination relative to the provided context is a faithfulness failure)
overlapsWith grounding (Grounding enables faithfulness)
overlapsWith benchmark (Faithfulness is often benchmarked)

Implementations

Tools and frameworks that implement this concept:

Langfuse secondary
RAGAS primary