Doccano
Web App · Free
Open-source text annotation for NLP tasks.
Capabilities (13 decomposed)
multi-task text annotation with project-scoped label schemas
Medium confidence
Enables creation of annotation projects supporting text classification, sequence labeling (NER), and sequence-to-sequence tasks through a unified project management interface. Each project defines its own label taxonomy and annotation type, with the backend Django REST API enforcing schema validation and persisting annotations to SQLite or PostgreSQL. The Vue.js frontend renders task-specific annotation interfaces dynamically based on project configuration, so one deployment can host projects that use different annotation paradigms.
Uses a project-scoped label schema pattern where each project's annotation type and labels are defined once at creation, enforced server-side via Django serializers, and rendered dynamically in Vue.js components — avoiding the complexity of runtime task switching while maintaining simplicity for single-task projects
Simpler than Label Studio's complex conditional logic system but more focused on NLP tasks; lighter than Prodigy's ML-in-the-loop approach, making it better for teams prioritizing collaborative annotation over active learning
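The project-scoped schema idea described above can be illustrated with a minimal sketch in plain Python. It is not Doccano's code: the `Project` class, the task type names, and `validate_annotation` are hypothetical stand-ins for what the listing attributes to Django serializers.

```python
from dataclasses import dataclass, field

# Hypothetical task type names, chosen for illustration only.
TASK_TYPES = {"classification", "sequence_labeling", "seq2seq"}


@dataclass
class Project:
    name: str
    task_type: str                      # fixed when the project is created
    labels: set = field(default_factory=set)

    def __post_init__(self):
        if self.task_type not in TASK_TYPES:
            raise ValueError(f"unknown task type: {self.task_type}")


def validate_annotation(project, annotation):
    """Return a list of error strings; an empty list means the annotation is valid."""
    errors = []
    if project.task_type == "seq2seq":
        # Free-text targets have no label to check against the schema.
        if not annotation.get("text"):
            errors.append("seq2seq annotation needs non-empty 'text'")
        return errors
    label = annotation.get("label")
    if label not in project.labels:
        errors.append(f"label {label!r} is not in the project schema")
    if project.task_type == "sequence_labeling":
        start, end = annotation.get("start"), annotation.get("end")
        if not (isinstance(start, int) and isinstance(end, int) and 0 <= start < end):
            errors.append("span needs integer offsets with 0 <= start < end")
    return errors
```

Because the label set and task type live on the project, the same validation function can serve every project without per-request task switching.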
collaborative team annotation with role-based access control
Medium confidence
Implements multi-user annotation workflows through Django's authentication system with role-based access control (RBAC) at the project level. Users are assigned roles (admin, annotator, viewer) with granular permissions enforced in the REST API layer before data access. The backend tracks annotation ownership, supports concurrent editing without locking, and maintains audit trails of who annotated what. The Vue.js frontend respects role permissions in the UI, hiding actions unavailable to the current user's role.
Uses Django's permission framework with project-level role assignment, where roles are enforced at the serializer level in REST endpoints — each API call checks user.has_perm() before returning data, ensuring no leakage of unauthorized annotations
More lightweight than enterprise platforms like Labelbox (no custom role hierarchies) but more structured than Prodigy's single-user focus; better for teams needing basic RBAC without complex permission matrices
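A project-level role check of the kind described above might look like the following sketch. The role names come from the listing; the permission names, the `memberships` mapping, and `require_permission` are assumptions made for illustration, not Doccano's API.

```python
# Role names follow the listing; the permission names are hypothetical.
ROLE_PERMISSIONS = {
    "admin": {"view", "annotate", "manage_members", "delete_project"},
    "annotator": {"view", "annotate"},
    "viewer": {"view"},
}


class PermissionDenied(Exception):
    pass


def require_permission(memberships, user, project_id, action):
    """Raise PermissionDenied unless the user's role in this project allows the action.

    memberships maps (user, project_id) -> role, so a user can hold
    different roles in different projects.
    """
    role = memberships.get((user, project_id))
    if role is None or action not in ROLE_PERMISSIONS[role]:
        raise PermissionDenied(f"{user} may not {action} in project {project_id}")
    return role
```

Calling the check before any data is loaded keeps the denial path free of side effects, which is the property the listing attributes to enforcement in the API layer.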
docker containerization with environment-based configuration
Medium confidence
Provides Docker Compose configuration for single-command deployment of Doccano with all dependencies (Django backend, Vue.js frontend, PostgreSQL, Redis). Environment variables control database connection, secret keys, allowed hosts, and feature flags. The Dockerfile uses multi-stage builds to minimize image size. Supports both development (with hot-reload) and production (with gunicorn) configurations. Pre-built images are published to Docker Hub, eliminating build time.
Uses Docker Compose with environment variable substitution for configuration, multi-stage Dockerfile for minimal image size, and pre-built images on Docker Hub — deployment is one command (docker-compose up) with no build step required
More convenient than manual installation but less flexible than Kubernetes manifests; better for teams wanting quick deployment without container orchestration expertise
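Environment-based configuration, as described above, usually reduces to reading a handful of variables with defaults at startup. The sketch below shows the pattern; the variable names (`SECRET_KEY`, `DATABASE_URL`, `ALLOWED_HOSTS`, `DEBUG`) and defaults are assumptions for illustration and should be checked against the deployment's own documentation.

```python
import os


def load_settings(env=None):
    """Build a settings dict from environment variables (names are illustrative)."""
    env = os.environ if env is None else env
    secret = env.get("SECRET_KEY")
    if not secret:
        # Fail fast instead of running with an empty signing key.
        raise RuntimeError("SECRET_KEY must be set")
    hosts = [h.strip() for h in env.get("ALLOWED_HOSTS", "localhost").split(",") if h.strip()]
    return {
        "SECRET_KEY": secret,
        "DATABASE_URL": env.get("DATABASE_URL", "sqlite:///db.sqlite3"),
        "ALLOWED_HOSTS": hosts,
        "DEBUG": env.get("DEBUG", "false").lower() in {"1", "true", "yes"},
    }
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise without touching the real environment.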
project cloning and template reuse for rapid project setup
Medium confidence
Allows administrators to clone existing projects (including label schema, annotation guidelines, and UI configuration) to create new projects without manual reconfiguration. Cloning copies project metadata but not annotations, enabling rapid setup of similar projects. Supports exporting project configuration as a template file and importing it into other Doccano instances. Templates are JSON files containing label definitions, UI settings, and guidelines.
Implements project cloning via Django model copying with selective field inclusion (labels, UI config, guidelines) but exclusion of annotations, and template export/import via JSON serialization — enables rapid project setup and cross-instance configuration sharing
More convenient than manual reconfiguration but less sophisticated than Label Studio's workspace templates; better for teams with repetitive project structures
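Cloning with selective field inclusion can be sketched as a JSON round trip: export only the configuration fields, then import them under a new name with an empty annotation list. The field names and dict-based project shape below are hypothetical; they are not Doccano's model or template format.

```python
import json

# Hypothetical set of fields that make up a reusable template.
TEMPLATE_FIELDS = ("task_type", "labels", "guidelines", "ui_config")


def export_template(project):
    """Serialize configuration only; annotations are deliberately left out."""
    return json.dumps({k: project[k] for k in TEMPLATE_FIELDS}, sort_keys=True)


def import_template(template_json, name):
    data = json.loads(template_json)
    missing = [k for k in TEMPLATE_FIELDS if k not in data]
    if missing:
        raise ValueError(f"template is missing fields: {missing}")
    project = {k: data[k] for k in TEMPLATE_FIELDS}
    project.update(name=name, annotations=[])
    return project


def clone_project(project, new_name):
    # The JSON round trip doubles as a deep copy, so the clone shares no state.
    return import_template(export_template(project), new_name)
```

Routing the clone through the same export and import functions means an in-instance clone and a cross-instance template transfer cannot drift apart in which fields they carry.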
multi-language support with unicode text handling and rtl language rendering
Medium confidence
Supports annotation in multiple languages including right-to-left (RTL) languages (Arabic, Hebrew, Persian) with proper Unicode text handling and bidirectional text rendering. The frontend uses CSS flexbox with direction properties to render RTL text correctly, while the backend stores all text as UTF-8 without language-specific processing. Language selection is per-project, affecting UI language and text rendering direction.
Implements bidirectional text rendering with CSS direction properties for RTL languages, enabling native annotation in Arabic, Hebrew, and Persian without manual text reversal. All text is stored as UTF-8, avoiding language-specific encoding issues.
Provides native multilingual support with RTL rendering, whereas Label Studio requires custom CSS modifications for RTL languages and Prodigy has limited non-English support
asynchronous data import with format auto-detection and validation
Medium confidence
Processes bulk data imports through a Celery task queue that handles CSV, JSON, JSONL, and other formats without blocking the web interface. The backend detects file format, validates against project schema (ensuring required text fields exist), and creates Example records in batches. Large imports are chunked to avoid memory exhaustion, with progress tracking via Celery task IDs. Failed rows are logged separately, allowing users to retry or inspect errors without re-importing successful records.
Uses Celery task queue with format auto-detection via file extension and content sniffing, combined with Django's bulk_create() for batch inserts — imports are tracked by task ID, allowing users to check progress and retrieve error logs without blocking the UI
More scalable than synchronous imports in Prodigy but less sophisticated than Label Studio's streaming parser; better for teams with large datasets and limited patience for blocking uploads
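The import flow described above (detect the format, validate rows, insert in batches, log failures separately) can be sketched without Celery or Django. Everything here is an illustrative assumption: the function names, the sniffing heuristic, and the in-memory `batches` list that stands in for a bulk database insert.

```python
import csv
import io
import json
from itertools import islice


def detect_format(filename, head):
    """Prefer the file extension; fall back to sniffing the first characters."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext in {"csv", "json", "jsonl"}:
        return ext
    stripped = head.lstrip()
    if stripped.startswith("["):
        return "json"
    if stripped.startswith("{"):
        return "jsonl"
    return "csv"


def parse_rows(fmt, content):
    if fmt == "json":
        yield from json.loads(content)
    elif fmt == "jsonl":
        for line in content.splitlines():
            if line.strip():
                yield json.loads(line)
    else:
        yield from csv.DictReader(io.StringIO(content))


def import_in_batches(filename, content, batch_size=2):
    """Return (batches, failed); failed holds (row_number, reason) pairs."""
    fmt = detect_format(filename, content[:256])
    rows = enumerate(parse_rows(fmt, content), start=1)
    batches, failed = [], []
    while True:
        chunk = list(islice(rows, batch_size))  # only one chunk is held at a time
        if not chunk:
            break
        valid = []
        for number, row in chunk:
            if isinstance(row, dict) and row.get("text"):
                valid.append(row)
            else:
                failed.append((number, "missing 'text' field"))
        if valid:
            batches.append(valid)  # stand-in for a bulk insert
    return batches, failed
```

This sketch only handles rows that parse but fail validation; a malformed JSONL line would still raise from the parser, so a fuller version would catch and log that case too.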
structured data export with format conversion and filtering
Medium confidence
Exports annotated datasets in multiple formats (JSON, JSONL, CSV, CoNLL for sequence labeling) through a Django REST endpoint that queries the database, applies user-specified filters (by label, annotator, status), and serializes annotations with metadata. Export jobs can be async for large datasets, returning a download URL. The serialization layer handles format-specific transformations: CoNLL format converts span annotations to BIO tags, CSV flattens nested structures, JSONL preserves full annotation objects.
Uses Django serializers with format-specific subclasses (CoNLLSerializer, CSVSerializer, JSONLSerializer) that transform the same underlying annotation data into task-specific formats — each serializer handles format rules (BIO tagging, flattening, etc.) without duplicating query logic
More flexible than Prodigy's fixed export formats but less customizable than Label Studio's template-based exports; better for standard NLP formats (CoNLL, BIO) but requires custom code for proprietary formats
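The span-to-BIO conversion mentioned for CoNLL export is the least obvious of these transformations, so here is a minimal sketch. It assumes whitespace tokenization, character offsets with an exclusive end, and spans that start on token boundaries; none of the function names are taken from Doccano.

```python
def whitespace_tokenize(text):
    """Split on whitespace and keep (token, start, end) character offsets."""
    tokens, pos = [], 0
    for word in text.split():
        start = text.index(word, pos)
        end = start + len(word)
        tokens.append((word, start, end))
        pos = end
    return tokens


def spans_to_bio(tokens, spans):
    """Tag each token B-/I-/O given spans as (start, end, label) tuples."""
    tags = []
    for _, tok_start, tok_end in tokens:
        tag = "O"
        for span_start, span_end, label in spans:
            if tok_start >= span_start and tok_end <= span_end:
                # The first token of a span gets B-, later ones get I-.
                tag = ("B-" if tok_start == span_start else "I-") + label
                break
        tags.append(tag)
    return tags


def to_conll(text, spans):
    tokens = whitespace_tokenize(text)
    tags = spans_to_bio(tokens, spans)
    return "\n".join(f"{tok}\t{tag}" for (tok, _, _), tag in zip(tokens, tags))
```

For example, `to_conll("Barack Obama visited Paris", [(0, 12, "PER"), (21, 26, "LOC")])` yields one token per line tagged `B-PER`, `I-PER`, `O`, `B-LOC`.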
auto-labeling with external service integration and custom rest templates
Medium confidence
Integrates with external ML services (OpenAI, Hugging Face, custom REST APIs) to pre-label examples before human annotation. Users configure auto-labeling via a template system that specifies request format, response parsing, and label mapping. The backend sends text to the external service, parses the response, and creates annotations programmatically. Supports both batch pre-labeling (all examples at once) and on-demand labeling (per-example). Failed requests are retried with exponential backoff; results are cached to avoid duplicate API calls.
Uses a template-based configuration system where users define request/response formats in the UI without code, with Jinja2 templating for dynamic field substitution and regex/JSONPath for response parsing — auto-labeling jobs are queued via Celery and results are cached by content hash to avoid duplicate API calls
More flexible than Prodigy's hardcoded model integrations (supports any REST API) but less robust than Label Studio's plugin system (no type safety or validation); better for teams with custom models but requires careful template configuration
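The three mechanics named above (label mapping, retry with exponential backoff, and caching by content hash) fit in one short function. This is a sketch under assumptions: `call_service` is a hypothetical callable standing in for the external API, the response shape (a list of dicts with a `label` key) is invented, and the templating layer is omitted.

```python
import hashlib
import time


def auto_label(text, call_service, label_map, cache,
               retries=3, base_delay=0.5, sleep=time.sleep):
    """Return project labels for text, calling the service at most once per distinct text."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key in cache:
        return cache[key]
    for attempt in range(retries):
        try:
            response = call_service(text)
            break
        except ConnectionError:
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    # Keep only service labels that have a mapping into the project schema.
    labels = [label_map[item["label"]] for item in response if item["label"] in label_map]
    cache[key] = labels
    return labels
```

Injecting `sleep` and `cache` keeps the function testable; in a real deployment the cache would need to be shared storage rather than a local dict.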
example assignment and sampling strategies for annotation distribution
Medium confidence
Distributes examples to annotators using configurable sampling strategies (sequential, random, stratified by label) to ensure balanced workload and coverage. The backend tracks assignment state (unassigned, assigned, completed) per annotator and prevents double-assignment. Supports batch assignment (assign N examples to annotator) and dynamic assignment (assign next unassigned example on-demand). The Vue.js frontend shows annotators their assigned examples in a queue, with progress tracking.
Uses Django ORM with assignment state tracking (unassigned/assigned/completed) and pluggable sampling strategies (SequentialSampler, RandomSampler, StratifiedSampler) that can be swapped at runtime — prevents double-assignment via database constraints and atomic transactions
More lightweight than Label Studio's complex task routing but more flexible than Prodigy's single-user model; better for teams needing simple, fair distribution without ML-based prioritization
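Pluggable sampling with assignment state tracking can be sketched as a small class that takes the sampler as a plain function. The names below are hypothetical and the state lives in memory; the listing describes database constraints and transactions doing this job in the real system.

```python
import random


def sequential_sampler(pool, n, rng=None):
    return pool[:n]


def random_sampler(pool, n, rng=None):
    rng = rng or random.Random()
    return rng.sample(pool, min(n, len(pool)))


class Assigner:
    """Tracks unassigned/assigned/completed state and never hands out an example twice."""

    def __init__(self, example_ids, sampler=sequential_sampler):
        self.state = {ex: "unassigned" for ex in example_ids}
        self.owner = {}
        self.sampler = sampler  # can be swapped at runtime

    def assign(self, annotator, n, rng=None):
        pool = [ex for ex, s in self.state.items() if s == "unassigned"]
        chosen = self.sampler(pool, n, rng)
        for ex in chosen:
            self.state[ex] = "assigned"
            self.owner[ex] = annotator
        return chosen

    def complete(self, annotator, ex):
        if self.owner.get(ex) != annotator:
            raise ValueError(f"example {ex} is not assigned to {annotator}")
        self.state[ex] = "completed"
```

Because each sampler only ever sees the unassigned pool, double assignment is ruled out regardless of which strategy is plugged in.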
restful api for programmatic project and annotation management
Medium confidence
Exposes full Doccano functionality through a Django REST Framework API with endpoints for creating projects, uploading data, retrieving annotations, and managing users. All operations that can be done in the UI are available via HTTP endpoints with JSON request/response bodies. The API uses token-based authentication (JWT or session tokens) and enforces the same RBAC as the UI. Supports pagination for large result sets and filtering by query parameters. API documentation is auto-generated via drf-spectacular (OpenAPI/Swagger).
Built on Django REST Framework with auto-generated OpenAPI documentation via drf-spectacular, providing a complete REST surface that mirrors the UI's capabilities — all endpoints enforce the same RBAC and serialization logic as the web interface
More complete than Prodigy's limited API but less feature-rich than Label Studio's extensive REST/GraphQL endpoints; better for teams needing basic programmatic access without complex query requirements
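For clients consuming paginated results, a small generator that follows `next` links is often enough. The sketch assumes the response envelope used by Django REST Framework's built-in paginators (`count`, `next`, `previous`, `results`); whether a given Doccano endpoint uses that envelope is an assumption to verify against its OpenAPI schema. `fetch` is a hypothetical callable that takes a URL and returns the decoded JSON body.

```python
def iter_results(fetch, url):
    """Yield every item across pages by following the 'next' link until it is null."""
    while url:
        page = fetch(url)
        yield from page["results"]
        url = page.get("next")
```

In practice `fetch` would wrap an authenticated HTTP GET; keeping it as a parameter leaves the authentication scheme and HTTP library up to the caller.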
multi-language annotation interface with rtl and character-set support
Medium confidence
Provides a Vue.js frontend that supports annotation in 20+ languages with proper right-to-left (RTL) text rendering for Arabic, Hebrew, and Persian. The UI dynamically switches text direction and font rendering based on detected language. Character-set support includes CJK (Chinese, Japanese, Korean), Devanagari, and other non-Latin scripts. Language is set per-project and enforced in the annotation interface, with translations for UI elements (labels, buttons, help text) provided via i18n framework.
Uses Vue.js i18n plugin with dynamic text direction switching (dir attribute) and CSS flexbox/grid for RTL layouts — language is set at project creation and enforced throughout the UI, with character rendering delegated to the browser's Unicode support
More comprehensive RTL support than Prodigy (which is English-only) but less sophisticated than Label Studio's language-specific UI customization; better for teams needing basic multilingual support without complex localization
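Choosing a text direction from content can be done with a first-strong-character heuristic over Unicode bidirectional classes. The sketch below uses the standard library's `unicodedata.bidirectional`; it illustrates the general technique and is not taken from Doccano's frontend, which the listing says relies on a per-project language setting.

```python
import unicodedata


def text_direction(text, default="ltr"):
    """Return 'rtl' or 'ltr' based on the first strongly directional character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):   # Hebrew-type and Arabic-type letters
            return "rtl"
        if bidi == "L":           # Latin, CJK, Devanagari and other LTR letters
            return "ltr"
    return default                # digits, spaces, punctuation only
```

The returned value maps directly onto an HTML `dir` attribute, leaving glyph shaping and reordering to the browser.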
annotation interface customization via project-level ui configuration
Medium confidence
Allows project administrators to customize the annotation interface without code by configuring display options (show/hide fields, reorder labels, set keyboard shortcuts) through the Django admin or API. Configuration is stored per-project and applied dynamically in the Vue.js frontend. Supports task-specific customizations: text classification can show labels as buttons or dropdown, sequence labeling can show span highlighting or tag list, etc. Changes apply immediately to all annotators without redeployment.
Stores UI configuration in the Django model as JSON, which is fetched by the Vue.js frontend and applied dynamically to control component rendering and behavior — no code changes required, changes are immediate across all annotators
More flexible than Prodigy's fixed UI but less powerful than Label Studio's template system; better for teams wanting simple customization without learning a templating language
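Storing UI configuration as JSON typically means merging stored overrides onto defaults and rejecting unknown keys before the frontend sees them. The option names below (`label_widget`, `show_metadata`, `shortcuts`) are hypothetical examples, not Doccano's configuration schema.

```python
import json

# Hypothetical defaults; a real schema would be defined by the application.
DEFAULT_UI = {"label_widget": "buttons", "show_metadata": True, "shortcuts": {}}


def resolve_ui_config(stored_json):
    """Merge a project's stored JSON overrides onto the defaults."""
    overrides = json.loads(stored_json) if stored_json else {}
    unknown = set(overrides) - set(DEFAULT_UI)
    if unknown:
        raise ValueError(f"unknown UI options: {sorted(unknown)}")
    config = {**DEFAULT_UI, **overrides}
    # Merge the nested shortcut map separately so defaults are not mutated.
    config["shortcuts"] = {**DEFAULT_UI["shortcuts"], **overrides.get("shortcuts", {})}
    return config
```

Resolving the configuration on every request is what lets a saved change reach all annotators without a redeploy.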
annotation quality monitoring with inter-annotator agreement metrics
Medium confidence
Provides built-in metrics for measuring annotation quality through inter-annotator agreement (IAA) calculations. Supports Cohen's Kappa for pairs of annotators, Fleiss' Kappa for three or more annotators, and Krippendorff's Alpha for sequence labeling. Metrics are computed on overlapping annotations (examples assigned to multiple annotators) and displayed in the admin dashboard. The backend computes metrics on-demand via a Celery task, caching results for performance. Supports filtering by label, date range, and annotator pair.
Implements multiple IAA metrics (Cohen's Kappa, Fleiss' Kappa, Krippendorff's Alpha) via scikit-learn, computed asynchronously via Celery and cached in the database — metrics are filterable by label, date, and annotator pair, enabling drill-down analysis of disagreement
More comprehensive than Prodigy (which has no IAA support) but less sophisticated than specialized quality tools like Labelbox's quality metrics; better for teams needing standard IAA metrics without custom analysis
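Cohen's Kappa for two annotators is simple enough to compute by hand on overlapping examples, which also makes it usable on exported annotations. It is defined as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each annotator's label frequencies. The function below is a standalone sketch, not code from Doccano.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators who labeled the same examples in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: both annotators independently pick the same label.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

As a worked case, annotator A giving `pos, pos, neg, neg` and annotator B giving `pos, neg, neg, neg` agree on 3 of 4 items (p_o = 0.75) with chance agreement p_e = (2·1 + 2·3) / 16 = 0.5, so kappa = 0.5.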
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts (sharing capabilities)
Artifacts that share capabilities with Doccano, ranked by overlap. Discovered automatically through the match graph.
Label Studio
Open-source multi-modal data labeling platform.
label-studio
Label Studio annotation tool
Supervisely
Enterprise computer vision platform for teams.
Labelbox
AI-powered data labeling platform for CV and NLP.
SuperAnnotate
Enhance AI with advanced annotation, model tuning, and...
Best For
- ✓ ML teams building NLP datasets with mixed task types
- ✓ Researchers prototyping multiple annotation paradigms
- ✓ Organizations standardizing on a single annotation platform
- ✓ Distributed teams annotating large datasets
- ✓ Organizations with strict data governance requirements
- ✓ Projects requiring audit trails for compliance
- ✓ DevOps engineers deploying Doccano to production
- ✓ Teams using Kubernetes or container orchestration
Known Limitations
- ⚠ No hierarchical label support: labels are flat per project, limiting complex taxonomies
- ⚠ Annotation type is immutable after project creation; switching tasks requires recreating the project
- ⚠ No built-in inter-annotator agreement metrics; requires external analysis of exported annotations
- ⚠ No optimistic locking: concurrent edits to the same annotation can overwrite each other without warning
- ⚠ RBAC is project-scoped only; no organization-level or resource-level permissions
- ⚠ No built-in conflict resolution for simultaneous annotations of the same example
About
Open-source text annotation tool for machine learning practitioners. Supports sequence labeling, text classification, and sequence-to-sequence tasks with a collaborative web interface, multi-language support, and dataset export in common formats.