Snorkel AI
A data-centric AI platform focused on programmatic labeling and data development. Snorkel helped popularize programmatic weak supervision: writing labeling functions that generate training labels in code instead of annotating every example by hand. Its enterprise platform extends this with LLM-assisted labeling and data quality tools.
Implements
Concepts this tool claims to implement:
- Synthetic Data primary
Programmatic data labeling with labeling functions. Weak supervision to combine noisy label sources. LLM-assisted label generation for scaling annotation.
- Training Data primary
Data slicing and analysis. Error analysis on model predictions. Data versioning and lineage tracking.
- Annotation secondary
SME (subject matter expert) labeling interface. Combine programmatic and human labels. Label model for denoising and combining sources.
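The labeling-function idea above can be sketched in plain Python. This is an illustration only and does not use the Snorkel API: the function names, the spam/ham task, and the sample texts are all hypothetical, and a simple majority vote stands in for the label model that Snorkel uses to denoise and combine sources.

```python
from collections import Counter
from typing import Callable, List

# Label values (hypothetical task: spam detection). ABSTAIN means "no opinion".
ABSTAIN, HAM, SPAM = -1, 0, 1

LabelingFunction = Callable[[str], int]


def lf_contains_link(text: str) -> int:
    """Heuristic: messages containing a URL are likely spam."""
    return SPAM if "http://" in text or "https://" in text else ABSTAIN


def lf_prize_keywords(text: str) -> int:
    """Heuristic: promotional keywords suggest spam."""
    lowered = text.lower()
    return SPAM if any(w in lowered for w in ("free", "winner", "prize")) else ABSTAIN


def lf_short_message(text: str) -> int:
    """Heuristic: very short messages are usually legitimate."""
    return HAM if len(text.split()) <= 4 else ABSTAIN


def apply_lfs(texts: List[str], lfs: List[LabelingFunction]) -> List[List[int]]:
    """Build the label matrix: one row per text, one column per labeling function."""
    return [[lf(text) for lf in lfs] for text in texts]


def majority_vote(votes: List[int]) -> int:
    """Combine noisy votes; abstain when nobody votes or the top two labels tie."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


LFS = [lf_contains_link, lf_prize_keywords, lf_short_message]
texts = [
    "Claim your free prize at https://example.com now",
    "see you tonight",
    "The quarterly report is attached for your review today",
    "free lunch today",
]
label_matrix = apply_lfs(texts, LFS)
labels = [majority_vote(row) for row in label_matrix]
print(labels)
```

Rows where the vote abstains (no function fired, or the functions disagreed) are the natural candidates to route to human annotators, which is one way programmatic and human labels can be combined.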
Integration Surfaces
Details
- Vendor: Snorkel AI Inc.
- License: Apache-2.0 (open-source library) / Proprietary (Snorkel Flow)
- Runs On: local, cloud
- Used By: human, system
Notes
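Snorkel's distinguishing focus is programmatic labeling. The open-source library implements the core weak supervision workflow: labeling functions plus a label model that combines their noisy output. Snorkel Flow, the proprietary enterprise platform, builds on this with LLM-assisted labeling and data quality tools. A good fit for teams that can encode labeling heuristics as functions rather than relying purely on manual annotation.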
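Whether labeling functions pay off for a team depends largely on how much data they cover and how often they disagree. A minimal sketch of that check follows, in plain Python rather than the Snorkel API; the helper name `lf_summary`, the function names, and the label matrix are hypothetical values chosen for illustration.

```python
from typing import Dict, List

ABSTAIN = -1  # Convention assumed here: -1 means the function gave no label.


def lf_summary(label_matrix: List[List[int]], names: List[str]) -> Dict[str, Dict[str, float]]:
    """Per labeling function, report coverage and conflict as fractions of all rows.

    coverage: share of rows where the function did not abstain.
    conflict: share of rows where it voted and another function voted differently.
    """
    n_rows = len(label_matrix)
    summary: Dict[str, Dict[str, float]] = {}
    for j, name in enumerate(names):
        voted = 0
        conflicts = 0
        for row in label_matrix:
            if row[j] == ABSTAIN:
                continue
            voted += 1
            others = (v for k, v in enumerate(row) if k != j)
            if any(v != ABSTAIN and v != row[j] for v in others):
                conflicts += 1
        summary[name] = {
            "coverage": voted / n_rows if n_rows else 0.0,
            "conflict": conflicts / n_rows if n_rows else 0.0,
        }
    return summary


# Hypothetical label matrix: rows are examples, columns are labeling functions.
label_matrix = [
    [1, 1, -1],
    [-1, -1, 0],
    [-1, -1, -1],
    [-1, 1, 0],
]
names = ["lf_contains_link", "lf_prize_keywords", "lf_short_message"]
report = lf_summary(label_matrix, names)
for name, stats in report.items():
    print(name, stats)
```

Low coverage across all functions suggests the heuristics are too narrow to replace much manual annotation, while high conflict suggests they need refinement before their output is combined.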