Opendata, web and dolomites


Workflows for the Large-Scale Collection and Transference of Knowledge across Languages: Using Natural Language Processing to Produce High-Quality Contents with Language Learners

Project "WIKOLLECT" data sheet

The following table provides information about the project.


Organization address
address: VIALE DRUSO 1
postcode: 39100


Coordinator country: Italy [IT]
Total cost: 183,473 €
EC max contribution: 183,473 € (100%)
Programme: 1. H2020-EU.1.3.2. (Nurturing excellence by means of cross-border and cross-sector mobility)
Call code: H2020-MSCA-IF-2018
Funding scheme: MSCA-IF-EF-ST
Starting year: 2020
Duration (year-month-day): from 2020-09-01 to 2022-08-31


Take a look at the project's partnership.

#   participant                     country        role          EC contrib. [€]
1   ACCADEMIA EUROPEA DI BOLZANO    IT (BOLZANO)   coordinator   183,473.00


 Project objective

WiKollect aims to create a workflow for the large-scale transfer of high-quality content across languages. The workflow is divided into four cyclic steps. In step (i), an automatic model will identify content available in a document in language A that is missing from a document on the same topic in language B. In step (ii), candidates to fill the gaps in the language-B document will be generated automatically. In step (iii), these candidates will be manually evaluated by language learners. In step (iv), the content identified as high quality will be promoted to fill the gaps in the language-B document.

WiKollect will take advantage of the barely exploited synergy among natural language processing, language learning, and crowdsourcing. To address the research challenges posed by the design and implementation of the workflow, it will create an innovative and reusable hybrid intelligence architecture combining (a) artificial intelligence (such as machine learning and natural language processing) to identify content worth transferring across languages and to generate potential translations, and (b) human intelligence (by means of implicit crowdsourcing), relying on a crowd of language learners to flag good content.

WiKollect will create several by-products in addition to the research products generated by addressing each of the four steps. Language learning exercises on specific topics and complexity levels will be generated. The fair re-use of content across languages will be promoted through the mass production of high-quality content. During the MSC period, WiKollect will target the generation of Wiktionary content in Italian and German. Still, the workflow is flexible and extensible and can be applied to other documents (e.g., Wikipedia articles, news) and languages in the near future.
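
The four-step cycle described above can be illustrated with a minimal Python sketch. This is not the project's implementation: the names `Candidate`, `find_gaps`, `generate_candidates`, `record_votes` and `promote`, the representation of document contents as lists of string identifiers, and the thresholds `min_votes` and `min_approval` are all assumptions made for illustration. The automatic model of step (i) is stood in for by a plain set difference, and the generator of step (ii) is a caller-supplied function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Candidate:
    """A proposed piece of content for the language-B document (hypothetical type)."""
    gap: str                                   # identifier of the missing item
    text: str                                  # generated candidate content
    votes: List[bool] = field(default_factory=list)  # learner judgements, True = good


def find_gaps(items_a: List[str], items_b: List[str]) -> List[str]:
    """Step (i): items present in document A but missing from document B.

    A set difference stands in for the automatic model; order of A is kept.
    """
    present = set(items_b)
    return [item for item in items_a if item not in present]


def generate_candidates(gaps: List[str],
                        generate: Callable[[str], List[str]]) -> List[Candidate]:
    """Step (ii): ask a generator (e.g. a translation system) for fillers per gap."""
    return [Candidate(gap, text) for gap in gaps for text in generate(gap)]


def record_votes(candidates: List[Candidate],
                 judgements: Dict[str, List[bool]]) -> None:
    """Step (iii): attach language learners' judgements, keyed by candidate text."""
    for candidate in candidates:
        candidate.votes.extend(judgements.get(candidate.text, []))


def promote(candidates: List[Candidate],
            min_votes: int = 3,
            min_approval: float = 0.8) -> Dict[str, str]:
    """Step (iv): keep, per gap, the first candidate with enough approving votes."""
    accepted: Dict[str, str] = {}
    for candidate in candidates:
        if len(candidate.votes) < min_votes:
            continue  # not enough judgements yet; stays in the cycle
        approval = sum(candidate.votes) / len(candidate.votes)
        if approval >= min_approval and candidate.gap not in accepted:
            accepted[candidate.gap] = candidate.text
    return accepted


if __name__ == "__main__":
    # Toy data: identifiers of dictionary senses in two language editions.
    doc_a = ["sense:river-bank", "sense:financial-bank"]
    doc_b = ["sense:financial-bank"]
    gaps = find_gaps(doc_a, doc_b)
    candidates = generate_candidates(gaps, lambda gap: [f"draft for {gap}"])
    record_votes(candidates, {"draft for sense:river-bank": [True, True, True]})
    print(promote(candidates))
```

Candidates that have not yet gathered enough votes are simply skipped, which reflects the cyclic nature of the workflow: they can be shown to learners again in a later round.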

Are you the coordinator (or a participant) of this project? Please send me more information about the "WIKOLLECT" project.

For instance: the website URL (it has not been provided by EU-opendata yet), the logo, a more detailed description of the project (in plain text as an RTF file or a Word file), some pictures (as picture files, not embedded in a Word file), Twitter account, LinkedIn page, etc.

Send me an email and I will put them on your project's page as soon as possible.

Thanks. And then put a link to this page on your project's website.

The information about "WIKOLLECT" is provided by the European Opendata Portal: CORDIS opendata.
