Foundation Model

Also known as: Base Model, Pre-trained Model

Definition

A large model trained on broad data at scale, designed to be adapted to a wide range of downstream tasks. Foundation models are "pre-trained" on general data (text, images, code) and then specialized through fine-tuning, prompting, or other adaptation methods. The term emphasizes that these models serve as foundations for building specific applications.
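The "one base, many adaptations" idea can be sketched with a toy example. The Python below is only an illustration under heavy assumptions: `frozen_base` is a hypothetical, hand-written feature function standing in for a pre-trained model (a real foundation model learns its representation from broad data), and the adaptation shown is a small linear head trained on frozen features, which is just one of the "other adaptation methods" and not fine-tuning or prompting proper. All function names are invented for this sketch.

```python
from typing import List, Sequence, Tuple


def frozen_base(text: str) -> List[float]:
    """Toy stand-in for a pre-trained model: text -> fixed feature vector.

    Hypothetical and hand-written; it is never modified during adaptation.
    """
    n = max(len(text), 1)
    digit_frac = sum(ch.isdigit() for ch in text) / n
    upper_frac = sum(ch.isupper() for ch in text) / n
    return [1.0, digit_frac, upper_frac]  # leading 1.0 acts as a bias term


def train_head(examples: Sequence[Tuple[str, int]], epochs: int = 20) -> List[float]:
    """Fit a perceptron head on the frozen features; labels are +1 or -1."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for text, label in examples:
            feats = frozen_base(text)
            score = sum(w * f for w, f in zip(weights, feats))
            if label * score <= 0:  # misclassified or undecided: nudge weights
                weights = [w + label * f for w, f in zip(weights, feats)]
    return weights


def predict(weights: Sequence[float], text: str) -> bool:
    """Apply a trained head to the shared base's features."""
    return sum(w * f for w, f in zip(weights, frozen_base(text))) > 0


# Two downstream tasks adapted from the same base, each with its own small head.
is_numeric = train_head([("12345", 1), ("2024", 1), ("hello", -1), ("WORLD", -1)])
is_shouting = train_head([("WORLD", 1), ("STOP", 1), ("hello", -1), ("12345", -1)])

print(predict(is_numeric, "777"), predict(is_shouting, "777"))
```

The point of the design is that only the small heads differ between tasks; the base is shared and untouched, which mirrors how a single foundation model can back many applications.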

What this is NOT

Not a fine-tuned model (foundation models are the base, pre-adaptation)
Not task-specific (foundation models are general-purpose)
Not a small model (foundation models are large by design)
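The three exclusions above can be read as a rough checklist. The Python sketch below encodes them; the `ModelRecord` type, its field names, and the size threshold are all hypothetical. In particular, this entry gives no numeric definition of "large", so the default `min_params` value is an arbitrary assumption that callers should override.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical description of a model; field names are illustrative."""
    name: str
    parameter_count: int    # total number of parameters
    is_adapted: bool        # True if fine-tuned, instruction-tuned, or RLHF'd
    is_task_specific: bool  # True if built for a single task


def exclusion_reasons(model: ModelRecord, min_params: int = 100_000_000) -> List[str]:
    """Return which of the entry's three exclusions the model triggers.

    `min_params` is an assumed, arbitrary cutoff for "large".
    """
    reasons = []
    if model.is_adapted:
        reasons.append("adapted, not the base model")
    if model.is_task_specific:
        reasons.append("task-specific, not general-purpose")
    if model.parameter_count < min_params:
        reasons.append("below the assumed size threshold")
    return reasons


def looks_like_foundation_model(model: ModelRecord, min_params: int = 100_000_000) -> bool:
    """True when none of the three exclusions apply."""
    return not exclusion_reasons(model, min_params)


base = ModelRecord("Llama 3.1 8B base", 8_000_000_000, False, False)
tuned = ModelRecord("BERT fine-tuned for sentiment", 110_000_000, True, True)
print(looks_like_foundation_model(base), exclusion_reasons(tuned))
```

A boolean checklist like this is deliberately crude: the looser practitioner usage described under Alternative Interpretations would not fit it, so it is best treated as a reading aid for the strict definition only.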

Alternative Interpretations

Different communities use this term differently:

ml-research

A model paradigm introduced by Stanford HAI where a single large model trained on diverse data serves as the base for many applications, contrasting with the prior paradigm of training task-specific models.

Sources: "On the Opportunities and Risks of Foundation Models" (Bommasani et al., 2021); Stanford HAI Foundation Models report

llm-practitioners

A pre-trained model (such as the base versions of GPT-4, Claude, or Llama) before fine-tuning or RLHF. The term is sometimes also used more loosely for any major model that can be adapted for various uses.

Sources: Model provider documentation, Industry usage

Examples

Llama 3.1 base model (before instruction tuning)
CLIP (vision-language foundation model)
GPT-4 base (before RLHF)
Whisper (audio foundation model)

Counterexamples

Things that might seem like foundation models but are not:

A BERT model fine-tuned for sentiment classification (task-specific)
A small custom model trained for one application
Instruction-tuned or RLHF'd versions of a base model (these are adapted models, not foundation models)

Relations

generalizes large-language-model (LLMs are a type of foundation model)
overlapsWith fine-tuning (Foundation models are often fine-tuned for specific uses)
overlapsWith multimodal-model (Many multimodal models are foundation models that span multiple modalities)
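The relations above are effectively subject-predicate-object triples. As a rough sketch, they could be stored and queried as follows in Python. The `Relation` type and the `related` helper are hypothetical names, and the subject identifier `foundation-model` is an assumption modeled on the other identifiers in the list; only the three relations given in this entry are included.

```python
from typing import List, NamedTuple


class Relation(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The three relations listed in this entry; "foundation-model" is an assumed id.
RELATIONS = [
    Relation("foundation-model", "generalizes", "large-language-model"),
    Relation("foundation-model", "overlapsWith", "fine-tuning"),
    Relation("foundation-model", "overlapsWith", "multimodal-model"),
]


def related(subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` by `predicate`, in listed order."""
    return [
        r.obj
        for r in RELATIONS
        if r.subject == subject and r.predicate == predicate
    ]


print(related("foundation-model", "overlapsWith"))
# prints ['fine-tuning', 'multimodal-model']
```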

Implementations

Tools and frameworks that implement this concept:

Hugging Face (primary)
Llama 3 (primary)
Meta Llama (primary)
Mistral AI (primary)