Machine learning from crowds using candidate set-based labelling
Date
2022Metadata
Show full item recordAbstract
Crowdsourcing is a popular cheap alternative in machine learning for gathering information from a set of annotators. Learning from crowd-labelled data involves dealing with its inherent uncertainty and inconsistencies. In the classical framework, each annotator provides a single label per example, which fails to capture the complete knowledge of annotators. We propose candidate labelling, that is, to allow annotators to provide a set of candidate labels for each example and thus express their doubts. We propose an appropriate model for the annotators, and present two novel learning methods that deal with the two basic steps (label aggregation and model learning) sequentially or jointly. Our empirical study shows the advantage of candidate labelling and the proposed methods with respect to the classical framework.