Open Source Data Labeling Tools
Argilla helps domain experts and data teams build better NLP datasets in less time.
License: Apache License 2.0
Baal is an active learning library that supports both industrial applications and research use cases.
License: Apache License 2.0
Web-based text annotation tool for named-entity recognition tasks.
License: Other
Python library for data-centric AI. It can automatically find mislabeled data, detect outliers, estimate consensus and annotator quality for multi-annotator datasets, and suggest which data is best to (re)label next.
License: GNU Affero General Public License v3.0
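To make "consensus" and "annotator quality" concrete, here is a minimal sketch using plain majority voting. It is an illustration only, not the method or API of the library described above; the function names and the toy annotations are hypothetical.

```python
from collections import Counter


def majority_consensus(annotations):
    """Pick the most frequent label per item.

    annotations maps item id -> {annotator name -> label}.
    Ties go to the label seen first (an arbitrary choice for this sketch).
    """
    return {
        item: Counter(votes.values()).most_common(1)[0][0]
        for item, votes in annotations.items()
    }


def annotator_quality(annotations, consensus):
    """Fraction of each annotator's labels that match the consensus label."""
    agree, total = Counter(), Counter()
    for item, votes in annotations.items():
        for annotator, label in votes.items():
            total[annotator] += 1
            if label == consensus[item]:
                agree[annotator] += 1
    return {a: agree[a] / total[a] for a in total}


# Hypothetical toy data: three annotators label three images.
annotations = {
    "img1": {"ann_a": "cat", "ann_b": "cat", "ann_c": "dog"},
    "img2": {"ann_a": "dog", "ann_b": "dog", "ann_c": "dog"},
    "img3": {"ann_a": "cat", "ann_b": "dog", "ann_c": "dog"},
}

consensus = majority_consensus(annotations)
quality = annotator_quality(annotations, consensus)
```

Items where an annotator disagrees with the consensus (here "img1" for ann_c and "img3" for ann_a) are natural candidates for review or relabeling.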
Web-based image segmentation tool for object detection, localization, and keypoints.
License: MIT License
OpenCV’s web-based annotation tool for both videos and images, aimed at computer vision algorithms.
License: MIT License
Open source text annotation tool for humans, with support for sentiment analysis, named entity recognition, and machine translation.
License: MIT License
Collaborative image labelling tool supporting bounding-box, polygon, line, and point labelling, label export, and more.
License: MIT License
Image annotation tool for bounding boxes with auto-suggestion and plugin extensibility.
License: MIT License
Multi-domain data labeling and annotation tool with standardized output format.
License: Apache License 2.0
Open source graphical image annotation tool, written in Python with a Qt graphical interface, focusing primarily on bounding boxes.
License: MIT License
Free-to-use online tool for labelling photos. Prepared labels can be downloaded in any of several supported formats.
License: GNU General Public License v3.0
modAL is an active learning framework designed with modularity, flexibility and extensibility in mind.
License: MIT License
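As an illustration of the idea behind active learning, the sketch below implements least-confidence uncertainty sampling in plain Python: the samples whose predicted class probabilities are least decisive are sent to a human for labeling first. It does not use any library's API; the function name and the example probabilities are assumptions made for this sketch.

```python
def least_confident(probabilities, k=1):
    """Return the indices of the k samples with the lowest top-class probability.

    probabilities is a list of per-sample class-probability lists,
    e.g. the output of a classifier on the unlabeled pool.
    """
    confidence = [max(p) for p in probabilities]
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return ranked[:k]


# Hypothetical model outputs for four unlabeled samples (two classes).
pool_probs = [
    [0.90, 0.10],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.50, 0.50],
]

# Samples 3 and 1 are the ones the model is least sure about.
query_indices = least_confident(pool_probs, k=2)
```

In a full loop, the queried samples would be labeled, added to the training set, and the model retrained before the next query.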
Open source tool for labelling images with support for labels and edges, as well as image resizing and zooming.
License: Apache License 2.0
Image annotation tool with the ability to “colour” on images to select labels for segmentation. The process is semi-automated using OpenCV’s marker-based watershed algorithm.
License: GNU Lesser General Public License v3.0
The data scientist’s open-source choice to scale, assess and maintain natural language data.
License: Apache License 2.0
Open-source tool for tracking, exploring, and labeling data for AI projects.
License: Apache License 2.0
Hitachi’s open source tool for labelling camera and LIDAR data.
License: MIT License
Snorkel is a system for quickly generating training data with weak supervision.
License: Apache License 2.0
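Weak supervision replaces hand labeling with several noisy heuristics ("labeling functions") whose votes are combined into a training label. The sketch below shows the idea with a simple majority vote in plain Python. It is a simplification, not Snorkel's API or its label model, and the labeling functions and example texts are hypothetical.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion


def lf_mentions_refund(text):
    return "complaint" if "refund" in text.lower() else ABSTAIN


def lf_says_thanks(text):
    return "praise" if "thank" in text.lower() else ABSTAIN


def lf_many_exclamations(text):
    return "complaint" if text.count("!") >= 2 else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_refund, lf_says_thanks, lf_many_exclamations]


def weak_label(text, labeling_functions=LABELING_FUNCTIONS):
    """Combine labeling-function votes by majority, ignoring abstentions."""
    votes = [lf(text) for lf in labeling_functions]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN  # no heuristic fired, so the sample stays unlabeled
    return Counter(votes).most_common(1)[0][0]


texts = ["I want a refund!!", "Thank you so much", "Order arrived"]
labels = [weak_label(t) for t in texts]
```

Samples on which every function abstains stay unlabeled, which is why weak supervision usually relies on many overlapping heuristics.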
superintendent provides an ipywidget-based interactive labelling tool for your data.
License: No License
Last Updated: Dec 26, 2023