Capability
Consensus-Based Annotation Workflows with Quality Scoring
20 artifacts provide this capability.
Top Matches
via “human quality rating aggregation with inter-annotator agreement metrics”
161K human-written messages in 35 languages with quality ratings.
Unique: Provides raw per-annotator ratings alongside aggregates, enabling downstream systems to compute custom agreement metrics and weight examples by confidence rather than using fixed aggregation. Most datasets only expose final scores.
vs others: Richer annotation metadata than single-rater datasets (e.g., Alpaca) or datasets with binary labels, allowing nuanced quality-based filtering and confidence-weighted training.
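The per-annotator ratings this dataset exposes make confidence-weighted training straightforward. As a minimal sketch (field names like `ratings` and the 1-5 scale are assumptions, not the dataset's actual schema), one simple custom agreement metric is the spread of raw ratings: low spread among annotators yields a weight near 1, maximal disagreement a weight near 0.

```python
from statistics import mean, pstdev

def confidence_weight(ratings, scale_min=1, scale_max=5):
    """Confidence weight in [0, 1] from raw per-annotator ratings.

    Low spread (annotators agree) -> weight near 1;
    maximal spread (e.g. ratings at both scale ends) -> weight near 0.
    """
    if len(ratings) < 2:
        return 0.5  # single rater: neutral confidence (a design choice, not from the dataset)
    half_range = (scale_max - scale_min) / 2
    spread = pstdev(ratings)  # population standard deviation of the raw ratings
    return max(0.0, 1.0 - spread / half_range)

# Hypothetical records mirroring a "raw per-annotator ratings" layout
examples = [
    {"text": "...", "ratings": [4, 4, 5]},  # strong agreement
    {"text": "...", "ratings": [1, 5, 3]},  # heavy disagreement
]

for ex in examples:
    ex["score"] = mean(ex["ratings"])            # custom aggregate instead of a fixed one
    ex["weight"] = confidence_weight(ex["ratings"])  # example weight for training
```

Spread-based weighting is only one option; with the raw ratings available, downstream systems can just as well compute chance-corrected agreement statistics (e.g. Krippendorff's alpha) across the full corpus.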