Project "DALI" data sheet

The following table provides information about the project.


Organization address
address: 327 MILE END ROAD
city: LONDON
postcode: E1 4NS

Project details
coordinator country: United Kingdom [UK]
total cost: 2,499,471 €
EC max contribution: 2,499,471 € (100%)
programme: 1. H2020-EU.1.1. (EXCELLENT SCIENCE - European Research Council (ERC))
call code: ERC-2015-AdG
funding scheme: ERC-ADG
starting year: 2016
duration (year-month-day): from 2016-09-01 to 2021-08-31


Take a look at the project's partnership.

#  participant                      country (city)   role         EC contribution [€]
1  QUEEN MARY UNIVERSITY OF LONDON  UK (LONDON)      coordinator  2,197,167.00
2  UNIVERSITY OF ESSEX              UK (COLCHESTER)  participant  302,303.00


 Project objective

Natural language expressions are supposed to be unambiguous in context. Yet more and more examples are emerging of expressions that are ambiguous in context but nevertheless felicitous and rhetorically unmarked. In my own work, I demonstrated that ambiguity in anaphoric reference is ubiquitous, through the study of disagreements in annotation, which I pioneered in computational linguistics (CL). Since then, additional cases of ambiguous anaphoric reference have been found, and similar findings have been made for other aspects of language interpretation, including word sense disambiguation and even part-of-speech tagging. Using the Phrase Detectives Game-With-A-Purpose to collect massive amounts of judgments online, we found that up to 30% of anaphoric expressions in our data are ambiguous. These findings raise a serious challenge for CL, as assumptions about the existence of a single interpretation in context are built into the dominant methodology, which depends on a reliably annotated gold standard. The goal of the proposed project is to tackle this fundamental issue of disagreements in interpretation by using computational methods for collecting and analysing such disagreements; some of these methods already exist but have never before been applied in linguistics on a large scale, and some we will develop from scratch. First, I propose to develop more advanced games-with-a-purpose to collect massive amounts of data about anaphora from people playing a game. Second, I propose to use Bayesian models of annotation, widely used in epidemiology but not in linguistics, to analyse such data and identify genuine ambiguities; doing this for anaphora will require novel methods. Third, I propose to use these data to revisit current theories about anaphoric expressions that do not seem to cause infelicity when ambiguous. Finally, I propose to develop the first supervised approach to anaphora resolution that does not require a gold standard, as a blueprint for other areas.


List of publications.

year: 2018
authors: Massimo Poesio, Yulia Grishina, Varada Kolhatkar, Nafise Sadat Moosavi, Ina Roesiger, Adam Roussel, Fabian Simonjetz, Alexandra Uma, Olga Uryupina, Juntao Yu, Heike Zinsmeister
title: Anaphora Resolution with the ARRAU corpus
journal: Proceedings of the NAACL Workshop on Computational Models of Reference, Anaphora and Coreference
last update: 2019-10-09

year: 2018
authors: Silviu Paun, Jon Chamberlain, Udo Kruschwitz, Juntao Yu, Massimo Poesio
title: A probabilistic annotation model for crowdsourcing coreference
journal: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
published pages: 1926-1937
last update: 2019-10-09

year: 2018
authors: Jon Chamberlain, Udo Kruschwitz, Massimo Poesio
title: Optimising crowdsourcing efficiency: Amplifying human computation with validation
journal: it - Information Technology 60/1
published pages: 41-49
ISSN: 2196-7032
DOI: 10.1515/itit-2017-0020
last update: 2019-10-09

year: 2019
authors: Olga Uryupina, Ron Artstein, Antonella Bristot, Federica Cavicchio, Francesca Delogu, Kepa Rodriguez, Massimo Poesio
title: Annotating a broad range of anaphoric phenomena, in a variety of genres: the ARRAU corpus
journal: Natural Language Engineering
ISSN: 1351-3249
last update: 2019-10-09

year: 2018
authors: Silviu Paun, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz, Massimo Poesio
title: Comparing Bayesian Models of Annotation
journal: Transactions of the Association for Computational Linguistics
published pages: 571-585
ISSN: 2307-387X
last update: 2019-10-09

year: 2017
authors: Chris Madge, Jon Chamberlain, Udo Kruschwitz, Massimo Poesio
title: Experiment-Driven Development of a GWAP for Marking Segments in Text
journal: Proceedings of the CHIPLAY Conference
last update: 2019-06-13

year: 2017
authors: Chris Madge, Richard Bartle, Jon Chamberlain, Udo Kruschwitz, Massimo Poesio
title: Testing game mechanics in games with a purpose for NLP applications
journal: Proceedings of the GAMES4NLP Symposium
last update: 2019-06-13

year: 2017
authors: Jon Chamberlain, Richard Bartle, Udo Kruschwitz, Chris Madge, Massimo Poesio
title: Metrics of games-with-a-purpose for NLP applications
journal: Proceedings of GAMES4NLP 2017
last update: 2019-06-13

