
Domain-Independent Quality Measures for Crowd Truth Disagreement

This video was recorded at the 12th International Semantic Web Conference (ISWC), Sydney 2013. Using crowdsourcing platforms such as CrowdFlower and Amazon Mechanical Turk to gather human annotation data has now become a mainstream process. Such crowd involvement can reduce the time needed to solve an annotation task and, with a large number of annotators, can be a valuable source of annotation diversity. To harness this diversity across domains, it is critical to establish a common ground for assessing the quality of the results. In this paper we report our experiences in optimizing and adapting crowdsourcing micro-tasks across domains, considering three aspects: (1) the micro-task template, (2) the quality measurements for the workers' judgments and (3) the overall annotation workflow. We performed experiments in two domains, i.e. event extraction (MRP project) and medical relation extraction (Crowd-Watson project). The results confirm our main hypothesis that some aspects of the evaluation metrics can be defined in a domain-independent way for micro-tasks that assess the parameters to harness the diversity of annotations and the useful disagreement between workers. This paper focuses specifically on the parameters relevant for the 'event extraction' ground-truth data collection and demonstrates their reusability from the medical domain.



