Instruction Tuning
Also known as: Instruction Fine-tuning, Instruct Models
Definition
Fine-tuning a language model on datasets of instruction–response pairs to improve its ability to follow natural-language instructions. Instruction-tuned models understand and execute commands such as "Summarize this" or "Write a poem about X" far better than base models, which only continue text by predicting the next token.
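For concreteness, a minimal sketch of what one training record might look like, assuming an Alpaca-style schema (field names and contents are illustrative, not any specific dataset's exact format):

```python
# One instruction-response training pair (illustrative schema; real datasets
# such as Alpaca or FLAN use similar but not identical field names).
example = {
    "instruction": "Write a one-sentence summary of the text below.",
    "input": "Instruction tuning fine-tunes a base language model on pairs of "
             "instructions and reference responses.",
    "response": "Instruction tuning teaches a base model to follow commands by "
                "training it on instruction-response pairs.",
}

# At training time the instruction (plus any optional input) becomes the prompt,
# and the model is optimized to generate the response as its continuation.
prompt = f"{example['instruction']}\n\n{example['input']}\n\n"
print(prompt + example["response"])
```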
What this is NOT
- Not RLHF (instruction tuning is supervised learning; RLHF adds a reinforcement learning stage on top)
- Not prompting (instruction tuning changes model weights; prompting does not)
- Not task-specific fine-tuning (instruction tuning targets general instruction following, not a single task)
Alternative Interpretations
Different communities use this term differently:
llm-practitioners
Training a base model on instruction-response pairs to create an "instruct" version that follows user requests. Examples: GPT-3.5-turbo, Llama-2-chat, Claude (all instruction-tuned).
Sources: FLAN paper (Wei et al., 2021), InstructGPT paper (Ouyang et al., 2022), model provider documentation
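As a rough illustration of the supervised objective behind this, here is a minimal sketch using Hugging Face transformers; the model name and the two toy pairs are placeholders, and real runs use large instruction datasets and a training framework (Trainer, trl, Axolotl) rather than a hand-rolled loop:

```python
# Minimal supervised instruction-tuning sketch. "gpt2" is only a stand-in
# for a real base model; the data here is two toy pairs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

pairs = [
    ("Translate to French: Good morning.", "Bonjour."),
    ("Give an antonym of 'hot'.", "Cold."),
]

model.train()
for instruction, response in pairs:
    prompt = instruction + "\n"
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + response + tokenizer.eos_token,
                         return_tensors="pt").input_ids
    labels = full_ids.clone()
    # Mask the prompt tokens so the loss is computed only on the response.
    labels[:, : prompt_ids.shape[1]] = -100
    loss = model(input_ids=full_ids, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Masking the prompt tokens (label value -100) is a common design choice: the model is penalized only for how it generates the response, not for reproducing the instruction itself.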
Examples
- FLAN-T5: T5 instruction-tuned on diverse tasks
- Llama-2-chat: Llama instruction-tuned for dialogue
- Alpaca: Llama instruction-tuned on GPT-generated instructions (prompt format sketched after this list)
- Training on ShareGPT conversation data (real user-assistant chats repurposed as instruction-response pairs)
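To make the Alpaca example concrete, here is a sketch of its prompt layout, paraphrased from the Stanford Alpaca repository (the no-input variant is shown; the exact wording may differ slightly):

```python
# Approximate Alpaca-style prompt template. Each training example is rendered
# into this layout, and the model is fine-tuned to continue it with the
# reference response.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

print(ALPACA_TEMPLATE.format(instruction="Write a poem about the sea."))
```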
Counterexamples
Things that might seem like Instruction Tuning but are not:
- Base GPT-4 before instruction tuning (just text completion)
- RLHF alone without instruction tuning
- Task-specific fine-tuning (not general instruction following)
Relations
- specializes fine-tuning (Instruction tuning is fine-tuning on instruction data)
- overlapsWith rlhf (Often followed by RLHF for further alignment)
- requires foundation-model (Instruction tuning starts from a base model)
Implementations
Tools and frameworks that implement this concept:
- Axolotl (primary)
- Meta Llama (secondary)