On the evaluation and selection of classifier learning algorithms with crowdsourced data
Date: 2019-02-16

Abstract
In many current problems, the actual class of the instances, the ground truth, is unavailable. Instead, with the intention of learning a model, labels can be crowdsourced by harvesting them from different annotators. Among such problems, this work focuses on binary classification. Specifically, our main objective is to explore the evaluation and selection of models through a quantitative assessment of the goodness of evaluation methods capable of dealing with this kind of context; this is a key task for selecting evaluation methods that can perform a sensible model selection. Regarding the evaluation and selection of models in such contexts, we identify three general approaches, each based on a different interpretation of the nature of the underlying ground truth: deterministic, subjectivist, or probabilistic. For the analysis of these three approaches, we propose how to estimate the Area Under the ROC (Receiver Operating Characteristic) Curve (AUC) within each interpretation, thus deriving three evaluation methods. These methods are compared in extensive experiments whose empirical results show that the probabilistic method generally outperforms the other two, from which we conclude that it is advisable to use that method when performing the evaluation in such contexts. In further studies, it would be interesting to extend our research to multiclass classification problems.
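To make the three interpretations concrete, the following is a minimal Python sketch of how each one could turn crowdsourced binary votes into an AUC estimate. The simulated data, the majority-vote and vote-proportion estimators, and the soft pairwise AUC shown here are illustrative assumptions of ours, not the exact procedures of the thesis.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical data: 200 instances, 5 annotators with binary votes, and the
# scores of the classifier being evaluated.
n_inst, n_annot = 200, 5
latent_p = rng.uniform(size=n_inst)                        # class-1 probability
votes = (rng.uniform(size=(n_inst, n_annot)) < latent_p[:, None]).astype(int)
scores = latent_p + rng.normal(scale=0.3, size=n_inst)     # noisy model scores

# Deterministic interpretation: a single hidden label exists per instance;
# approximate it by majority vote and compute the standard AUC against it.
majority = (votes.mean(axis=1) >= 0.5).astype(int)
auc_det = roc_auc_score(majority, scores)

# Subjectivist interpretation: each annotator's labelling is a legitimate
# ground truth of its own; average the AUC computed against each annotator.
auc_subj = np.mean([roc_auc_score(votes[:, j], scores) for j in range(n_annot)])

# Probabilistic interpretation: the ground truth of an instance is a class-1
# probability (estimated here by the vote proportion); compute a soft AUC as
# the probability-weighted fraction of correctly ranked (positive, negative)
# instance pairs, counting ties as half.
p = votes.mean(axis=1)
w = p[:, None] * (1.0 - p[None, :])                        # pair weights
np.fill_diagonal(w, 0.0)                                   # exclude self-pairs
correct = scores[:, None] > scores[None, :]
ties = scores[:, None] == scores[None, :]
auc_prob = (w * (correct + 0.5 * ties)).sum() / w.sum()

print(f"deterministic={auc_det:.3f}  subjectivist={auc_subj:.3f}  "
      f"probabilistic={auc_prob:.3f}")
```

Note how, in this probabilistic variant, an instance with a vote proportion near 0.5 contributes with small weight to both sides of a pair, so annotator disagreement is used by the estimate rather than discarded by a hard labelling.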