
DASMT SIGNED

Domain Adaptation for Statistical Machine Translation


Project "DASMT" data sheet

The following table provides information about the project.

Coordinator
LUDWIG-MAXIMILIANS-UNIVERSITAET MUENCHEN 

Organization address
address: GESCHWISTER SCHOLL PLATZ 1
city: MUENCHEN
postcode: 80539
website: www.uni-muenchen.de


Coordinator country: Germany [DE]
Project website: http://www.cis.uni-muenchen.de/
Total cost: 1,228,625 €
EC max contribution: 1,228,625 € (100%)
Programme: 1. H2020-EU.1.1. (EXCELLENT SCIENCE - European Research Council (ERC))
Call code: ERC-2014-STG
Funding scheme: ERC-STG
Starting year: 2015
Duration (year-month-day): from 2015-12-01 to 2020-11-30

 Partnership

Take a look at the project's partnership.

#  participant  country  role  EC contrib. [€]
1  LUDWIG-MAXIMILIANS-UNIVERSITAET MUENCHEN  DE (MUENCHEN)  coordinator  1,228,625.00


 Project objective

Rapid translation between European languages is a cornerstone of good governance in the EU, and of great academic and commercial interest. Statistical approaches to machine translation constitute the state of the art. The basic knowledge source is a parallel corpus: texts paired with their translations. For domains where large parallel corpora are available, such as the proceedings of the European Parliament, a high level of translation quality is reached. However, in countless other domains where large parallel corpora are not available, such as medical literature or legal decisions, translation quality is unacceptably poor.

Domain adaptation as a problem of statistical machine translation (SMT) is a relatively new research area, and there are no standard solutions. The literature contains inconsistent results and heuristics are widely used. We will solve the problem of domain adaptation for SMT on a larger scale than has been previously attempted, and base our results on standardized corpora and open source translation systems.

We will solve two basic problems. The first problem is determining how to benefit from large out-of-domain parallel corpora in domain-specific translation systems. This is an unsolved problem. The second problem is mining and appropriately weighting knowledge available from in-domain texts which are not parallel. While there is initial promising work on mining, weighting is not well studied, an omission which we will correct. We will scale mining by first using Wikipedia, and then mining from the entire web.

Our work will lead to a breakthrough in translation quality for the vast number of domains with little parallel text available, and will have a direct impact on SMEs providing translation services. The academic impact of our work will be large because solutions to the challenge of domain adaptation apply to all natural language processing systems and to numerous other areas of artificial intelligence research based on machine learning approaches.

 Publications

List of publications.
Each entry gives: year and authors; title; pages, ISSN and DOI; journal or proceedings and date of last update.
2018 Viktor Hangya, Fabienne Braune, Alexander Fraser, Hinrich Schütze
Two Methods for Domain Adaptation of Bilingual Tasks: Delightfully Simple and Broadly Applicable
published pages: , ISSN: , DOI:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2019-05-29
2017 Matthias Huck, Simon Riess, Alexander Fraser
Target-side Word Segmentation Strategies for Neural Machine Translation
published pages: 56-67, ISSN: , DOI: 10.18653/v1/W17-4706
Proceedings of the Second Conference on Machine Translation 2019-05-29
2017 Hassan Sajjad, Helmut Schmid, Alexander Fraser, Hinrich Schütze
Statistical Models for Unsupervised, Semi-Supervised, and Supervised Transliteration Mining
published pages: 349-375, ISSN: 0891-2017, DOI: 10.1162/COLI_a_00286
Computational Linguistics 43/2 2019-05-29
2016 Anita Ramm, Alexander Fraser
Modeling verbal inflection for English to German SMT
published pages: 21-31, ISSN: , DOI: 10.18653/v1/W16-2203
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers 2019-05-29
2017 Aleš Tamchyna, Marion Weller-Di Marco, Alexander Fraser
Modeling Target-Side Inflection in Neural Machine Translation
published pages: 32-42, ISSN: , DOI: 10.18653/v1/W17-4704
Proceedings of the Second Conference on Machine Translation 2019-05-29
2017 Valentin Deyringer, Alexander Fraser, Helmut Schmid, Tsuyoshi Okita
Parallelization of Neural Network Training for NLP with Hogwild!
published pages: 29-38, ISSN: 1804-0462, DOI: 10.1515/pralin-2017-0036
The Prague Bulletin of Mathematical Linguistics 109/1 2019-05-29
2018 Costanza Conforti, Matthias Huck, Alexander Fraser
Neural Morphological Tagging of Lemma Sequences for Machine Translation
published pages: , ISSN: , DOI:
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Papers) 2019-05-29
2017 Matthias Huck, Aleš Tamchyna, Ondřej Bojar, Alexander Fraser
Producing Unseen Morphological Variants in Statistical Machine Translation
published pages: 369-375, ISSN: , DOI: 10.18653/v1/E17-2059
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers 2019-05-29
2017 Marion Weller-Di Marco, Alexander Fraser, Sabine Schulte im Walde
Addressing Problems across Linguistic Levels in SMT: Combining Approaches to Model Morphology, Syntax and Lexical Choice
published pages: 625-630, ISSN: , DOI: 10.18653/v1/E17-2099
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers 2019-05-29
2017 Leonie Weissweiler, Alexander Fraser
Developing a Stemmer for German Based on a Comparative Analysis of Publicly Available Stemmers
published pages: 81-94, ISSN: , DOI: 10.1007/978-3-319-73706-5_8
Language Technologies for the Challenges of the Digital Age. GSCL 2017. Lecture Notes in Computer Science 10713 2019-05-29
2016 Marion Weller-Di Marco, Alexander Fraser, Sabine Schulte im Walde
Modeling Complement Types in Phrase-Based SMT
published pages: 43-53, ISSN: , DOI: 10.18653/v1/W16-2205
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers 2019-05-29
2017 Matthias Huck, Fabienne Braune, Alexander Fraser
LMU Munich's Neural Machine Translation Systems for News Articles and Health Information Texts
published pages: 315-322, ISSN: , DOI: 10.18653/v1/W17-4730
Proceedings of the Second Conference on Machine Translation 2019-05-29
2017 Anita Ramm, Sharid Loáiciga, Annemarie Friedrich, Alexander Fraser
Annotating tense, mood and voice for English, French and German
published pages: 1-6, ISSN: , DOI: 10.18653/v1/P17-4001
Proceedings of ACL 2017, System Demonstrations 2019-05-29
2018 Fabienne Braune, Viktor Hangya, Tobias Eder, Alexander Fraser
Evaluating bilingual word embeddings on the long tail
published pages: 188-193, ISSN: , DOI: 10.18653/v1/N18-2030
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers) 2019-05-29
2018 Philipp Dufter, Mengjie Zhao, Martin Schmitt, Alexander Fraser, Hinrich Schütze
Embedding Learning Through Multilingual Concept Induction
published pages: , ISSN: , DOI:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2019-05-29
2016 Fabienne Braune, Alexander Fraser, Hal Daumé III, Aleš Tamchyna
A Framework for Discriminative Rule Selection in Hierarchical Moses
published pages: 92-101, ISSN: , DOI: 10.18653/v1/W16-2210
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers 2019-05-27
2016 Aleš Tamchyna, Alexander Fraser, Ondřej Bojar, Marcin Junczys-Dowmunt
Target-Side Context for Discriminative Models in Statistical Machine Translation
published pages: 1704-1714, ISSN: , DOI: 10.18653/v1/P16-1161
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2019-05-27


The information about "DASMT" is provided by the European Opendata Portal: CORDIS opendata.
