Unsloth
A library for fast and memory-efficient LLM fine-tuning. Unsloth rewrites the training hot path with hand-written Triton kernels, enabling 2-5x faster fine-tuning with about 70% less memory. It integrates with Hugging Face Transformers and supports popular model families such as Llama, Mistral, and Gemma.
Implements
Concepts this tool claims to implement:
- Fine-Tuning (primary)
Optimized LoRA and QLoRA training: custom Triton kernels for the attention and MLP layers, gradient checkpointing with reduced memory overhead, and a 2-5x speedup over standard Hugging Face training.
- Quantization (secondary)
4-bit quantization for training (QLoRA), GGUF export for inference, and memory-efficient training on consumer GPUs.
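The low-rank adaptation that LoRA and QLoRA rely on can be sketched in plain Python. This is an illustrative toy with hypothetical names, not Unsloth's fused Triton implementation: a frozen weight W is augmented with two small trainable matrices A and B, and only r * (d_out + d_in) parameters are trained instead of d_out * d_in.

```python
# Toy sketch of the LoRA forward pass (illustrative only -- Unsloth
# fuses these steps into Triton kernels and keeps W quantized in QLoRA).
# For a frozen weight W (d_out x d_in), LoRA learns A (r x d_in) and
# B (d_out x r) so the adapted layer computes:
#   y = W @ x + (alpha / r) * B @ (A @ x)

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha, r):
    base = matvec(W, x)                 # frozen base projection
    low_rank = matvec(B, matvec(A, x))  # trainable rank-r correction
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]

# Tiny example: d_out = d_in = 2, rank r = 1.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen identity weight
A = [[1.0, 1.0]]               # 1 x 2, trainable
B = [[0.5], [0.5]]             # 2 x 1, trainable
y = lora_forward(W, A, B, [2.0, 4.0], alpha=1, r=1)
# base = [2, 4]; A @ x = [6]; B @ (A @ x) = [3, 3]; y = [5, 7]
```

Because W stays frozen (and, in QLoRA, 4-bit), optimizer state is only needed for the small A and B matrices, which is where the memory savings come from.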
Integration Surfaces
Details
- Vendor
- Unsloth AI
- License
- Apache-2.0
- Runs On
- local, cloud
- Used By
- human, system
Links
Notes
Unsloth makes fine-tuning accessible on limited hardware and is popular with Google Colab users fine-tuning LLMs on free-tier GPUs. The speed improvements are significant and well documented. It works as a near drop-in optimization for existing Hugging Face training code: models are loaded through Unsloth's wrappers while the rest of the training setup stays unchanged.
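As a rough illustration of how 4-bit storage enables training on consumer GPUs, here is a toy blockwise absmax quantizer in plain Python. This is only a sketch of the general idea: QLoRA's actual 4-bit path uses the NF4 data type from bitsandbytes, not the linear scheme shown here.

```python
# Toy blockwise absmax quantization to signed 4-bit integers.
# Each block of weights is stored as small integers in [-7, 7] plus one
# float scale, cutting memory roughly 4x versus 16-bit storage.
# (Illustrative only; QLoRA uses the non-linear NF4 code book instead.)

def quantize_block(values):
    """Map floats to signed 4-bit integers plus a per-block scale."""
    absmax = max(abs(v) for v in values) or 1.0
    scale = absmax / 7.0
    return [round(v / scale) for v in values], scale

def dequantize_block(q, scale):
    """Recover approximate floats from the stored integers."""
    return [qi * scale for qi in q]

weights = [0.6, -1.4, 0.0, 1.0]
q, scale = quantize_block(weights)      # q == [3, -7, 0, 5]
restored = dequantize_block(q, scale)
# Each restored weight is within scale / 2 of the original.
```

In QLoRA-style training the frozen base weights live in this compressed form and are dequantized on the fly for each forward pass, while gradients flow only into the small LoRA adapters.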