Unsloth

library · active · open-source

A library for fast, memory-efficient LLM fine-tuning. Unsloth optimizes the training loop with custom Triton kernels, claiming 2-5x faster fine-tuning with up to 70% less memory use. It integrates with Hugging Face Transformers and supports popular model families such as Llama, Mistral, and Gemma.
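A rough back-of-envelope shows where memory-reduction claims of this size come from. This is illustrative accounting only (assumed byte counts and an assumed ~1% adapter ratio), not Unsloth's published benchmark methodology:

```python
# Back-of-envelope memory math behind "70% less memory"-style claims.
# All byte counts below are generic assumptions, not Unsloth's numbers.
B = 7_000_000_000                      # a 7B-parameter model

# Full fine-tuning, fp16 weights with Adam (fp32 master copy + 2 moments):
full_ft = B * (2 + 4 + 4 + 4)          # bytes per parameter: w + master + m + v

# QLoRA-style: 4-bit frozen base plus ~1% trainable adapter params,
# which are the only ones that need optimizer state.
lora = int(B * 0.01)                   # assumed adapter size (illustrative)
qlora = B * 0.5 + lora * (2 + 4 + 4 + 4)

print(f"full fine-tune ≈ {full_ft / 2**30:.0f} GiB")
print(f"QLoRA-style    ≈ {qlora / 2**30:.0f} GiB")
```

Because optimizer state dominates full fine-tuning, shrinking the trainable set shrinks memory far more than the 4-bit base weights alone would suggest.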

Implements

Concepts this tool claims to implement:

  • Optimized LoRA and QLoRA training

    Custom Triton kernels for attention and MLP layers. Gradient checkpointing with reduced memory overhead. Claimed 2-5x speedup over standard Hugging Face training.

  • Quantization (secondary)

    4-bit quantization for training (QLoRA). GGUF export for inference. Memory-efficient training on consumer GPUs.
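The two ideas the bullets above combine can be sketched in a few lines of plain Python. This is a toy with invented names and toy sizes — absmax 4-bit quantization of a frozen base weight plus a small trainable low-rank (LoRA) update — not Unsloth's actual Triton implementation:

```python
# Toy sketch of the QLoRA recipe: 4-bit frozen base + trainable LoRA factors.
# Sizes and names here are illustrative, not Unsloth internals.
import random

def quantize_absmax_4bit(weights):
    """Map floats to signed 4-bit integers in [-7, 7] via absmax scaling."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    return [v * scale for v in q]

random.seed(0)
d, r = 64, 4                       # toy hidden size and LoRA rank

# Frozen base weight: stored in 4 bits, never updated during training.
base = [random.uniform(-1, 1) for _ in range(d * d)]
q, scale = quantize_absmax_4bit(base)
assert all(-7 <= v <= 7 for v in q)

# Trainable LoRA factors B (d x r) and A (r x d): full precision, tiny.
lora_params = d * r + r * d
full_params = d * d
print(lora_params, full_params)    # 512 vs 4096: ~8x fewer trainable params

# The forward pass effectively sees dequant(base) + B @ A; fusing steps
# like dequantization into the matmul is the kind of thing custom kernels do.
recon = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(base, recon))
assert max_err <= scale / 2 + 1e-9  # absmax rounding error bound
```

Only the low-rank factors receive gradients and optimizer state, which is why training fits on consumer GPUs.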

Integration Surfaces

  • Python library
  • Hugging Face Transformers integration
  • Google Colab notebooks
  • Kaggle integration

Details

Vendor
Unsloth AI
License
Apache-2.0
Runs On
local, cloud
Used By
human, system

Notes

Unsloth makes fine-tuning accessible on limited hardware. It is popular with Google Colab users who want to fine-tune LLMs on free-tier GPUs. The speed improvements are significant and well documented. It works as a drop-in optimization for existing Hugging Face training code.
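Gradient checkpointing, listed under Implements, is the recompute-instead-of-store trick behind much of that memory headroom. A toy sketch of the general technique — scalar layers with hand-written derivatives, not Unsloth's implementation:

```python
# Toy gradient checkpointing on a chain of scalar layers y = tanh(w * x).
# Instead of caching every activation for backward, we cache one per
# segment and recompute the rest on demand, trading compute for memory.
import math

def layer(x, w):
    return math.tanh(w * x)

def layer_grads(x, w):
    y = math.tanh(w * x)
    dy = 1 - y * y                 # derivative of tanh at w*x
    return dy * w, dy * x          # (dy/dx, dy/dw)

def backward_with_checkpoints(x0, ws, every=4):
    # Assumes len(ws) % every == 0, for brevity.
    ckpts, x = {0: x0}, x0
    for i, w in enumerate(ws):     # forward: store only every `every`-th input
        x = layer(x, w)
        if (i + 1) % every == 0:
            ckpts[i + 1] = x
    grad_ws = [0.0] * len(ws)
    upstream = 1.0                 # d(output)/d(output)
    for seg_end in range(len(ws), 0, -every):
        seg_start = seg_end - every
        xs, x = [], ckpts[seg_start]
        for i in range(seg_start, seg_end):   # recompute this segment only
            xs.append(x)
            x = layer(x, ws[i])
        for i in range(seg_end - 1, seg_start - 1, -1):
            dx, dw = layer_grads(xs[i - seg_start], ws[i])
            grad_ws[i] = upstream * dw
            upstream *= dx
    return grad_ws

ws = [0.5, -0.3, 0.8, 0.2, -0.6, 0.4, 0.1, 0.9]
grads = backward_with_checkpoints(1.0, ws, every=4)
# Peak storage: 2 checkpoints + one 4-activation segment, not all 8 layers.
```

The same idea applied to transformer blocks is what lets long training runs fit in limited VRAM at the cost of one extra forward pass per segment.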