Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines

R. Sabou, K. Bontcheva, L. Derczynski, A. Scharl:
"Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines";
Talk: Ninth International Conference on Language Resources and Evaluation (LREC'14), Reykjavik, Iceland; 26.05.2014 - 31.05.2014; in: "Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)", European Language Resources Association (ELRA), (2014), ISBN: 978-2-9517408-8-4.


Abstract:


Crowdsourcing is an emerging collaborative approach for acquiring annotated corpora and a wide range of other linguistic resources. Although the use of this approach is intensifying in all its key genres (paid-for crowdsourcing, games with a purpose, volunteering-based approaches), the community still lacks a set of best-practice guidelines similar to the annotation best practices for traditional, expert-based corpus acquisition. In this paper we focus on the use of crowdsourcing methods for corpus acquisition and propose a set of best practice guidelines based on our own experiences in this area and an overview of related literature. We also introduce GATE Crowd, a plugin of the GATE platform that relies on these guidelines and offers tool support for using crowdsourcing in a more principled and efficient manner.