Instruction Tuning
Also known as: Instruction Fine-tuning, Instruct Models
Definition
Fine-tuning a language model on datasets of instruction–response pairs to improve its ability to follow natural-language instructions. Instruction-tuned models understand and execute commands such as "Summarize this" or "Write a poem about X" far better than base models, which only continue text by predicting the next token.
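For concreteness, a minimal sketch of what one training record might look like, assuming an Alpaca-style schema (field names and contents are illustrative, not any specific dataset's exact format):

```python
# One instruction-response training pair (illustrative schema; real datasets
# such as Alpaca or FLAN use similar but not identical field names).
example = {
    "instruction": "Write a one-sentence summary of the text below.",
    "input": "Instruction tuning fine-tunes a base language model on pairs of "
             "instructions and reference responses.",
    "response": "Instruction tuning teaches a base model to follow commands by "
                "training it on instruction-response pairs.",
}

# At training time the instruction (plus any optional input) becomes the prompt,
# and the model is optimized to generate the response as its continuation.
prompt = f"{example['instruction']}\n\n{example['input']}\n\n"
print(prompt + example["response"])
```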
What this is NOT
- Not RLHF (instruction tuning is supervised learning; RLHF adds a reinforcement learning stage on top)
- Not prompting (instruction tuning changes model weights; prompting does not)
- Not task-specific fine-tuning (instruction tuning targets general instruction following, not a single task)
Alternative Interpretations
Different communities use this term differently:
llm-practitioners
Training a base model on instruction-response pairs to create an "instruct" version that follows user requests. Examples: GPT-3.5-turbo, Llama-2-chat, Claude (all instruction-tuned).
Sources: FLAN paper (Wei et al., 2021), InstructGPT paper (Ouyang et al., 2022), model provider documentation
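As a rough illustration of the supervised objective behind this, here is a minimal sketch using Hugging Face transformers; the model name and the two toy pairs are placeholders, and real runs use large instruction datasets and a training framework (Trainer, trl, Axolotl) rather than a hand-rolled loop:

```python
# Minimal supervised instruction-tuning sketch. "gpt2" is only a stand-in
# for a real base model; the data here is two toy pairs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

pairs = [
    ("Translate to French: Good morning.", "Bonjour."),
    ("Give an antonym of 'hot'.", "Cold."),
]

model.train()
for instruction, response in pairs:
    prompt = instruction + "\n"
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + response + tokenizer.eos_token,
                         return_tensors="pt").input_ids
    labels = full_ids.clone()
    # Mask the prompt tokens so the loss is computed only on the response.
    labels[:, : prompt_ids.shape[1]] = -100
    loss = model(input_ids=full_ids, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Masking the prompt tokens (label value -100) is a common design choice: the model is penalized only for how it generates the response, not for reproducing the instruction itself.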
Examples
- FLAN-T5: T5 instruction-tuned on diverse tasks
- Llama-2-chat: Llama instruction-tuned for dialogue
- Alpaca: Llama instruction-tuned on GPT-generated instructions (prompt format sketched after this list)
- Training on ShareGPT conversation data (real user-assistant chats repurposed as instruction-response pairs)
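To make the Alpaca example concrete, here is a sketch of its prompt layout, paraphrased from the Stanford Alpaca repository (the no-input variant is shown; the exact wording may differ slightly):

```python
# Approximate Alpaca-style prompt template. Each training example is rendered
# into this layout, and the model is fine-tuned to continue it with the
# reference response.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

print(ALPACA_TEMPLATE.format(instruction="Write a poem about the sea."))
```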
Counterexamples
Things that might seem like Instruction Tuning but are not:
- Base GPT-4 before instruction tuning (just text completion)
- RLHF alone without instruction tuning
- Task-specific fine-tuning (not general instruction following)
Relations
- specializes fine-tuning (Instruction tuning is fine-tuning on instruction data)
- overlapsWith rlhf (Often followed by RLHF for further alignment)
- requires foundation-model (Instruction tuning starts from a base model)
Implementations
Tools and frameworks that implement this concept:
- Axolotl (primary)
- Meta Llama (secondary)