Opendata, web and dolomites


Robust End-To-End SPEAKER recognition based on deep learning and attention models

Project "ETE SPEAKER" data sheet

The following table provides information about the project.


Organization address
address: ANTONINSKA 548/1
postcode: 601 90

coordinator country: Czech Republic [CZ]
total cost: 120˙817 €
EC max contribution: 120˙817 € (100%)
programme: H2020-EU.1.3.2. (Nurturing excellence by means of cross-border and cross-sector mobility)
call code: H2020-MSCA-IF-2018
funding scheme: MSCA-IF-EF-ST
starting year: 2019
duration (year-month-day): from 2019-06-01 to 2021-01-31


Take a look at the project's partnership.

#    participant    country    role    EC contrib. [€]
1    VYSOKE UCENI TECHNICKE V BRNE    CZ (BRNO STRED)    coordinator    120˙817.00


 Project objective

This project focuses on automatic speaker recognition (SID), the task of determining the identity of the speaker in a speech recording. Disentangling the speaker-specific information from the remaining nuisance variability requires complex models. Deep neural networks (DNNs) have recently shown their potential for this, as exemplified by the popular x-vector learnt by a DNN. Here, we aim for end-to-end SID, where the system is optimized as a whole for the target task. Despite several attempts in this line of research, many aspects remain unexplored or have not been explored thoroughly. We propose to explore recurrent approaches, which are suitable for dealing with temporal signals, as well as different pooling methods to obtain a fixed-length representation from a variable-length input sequence of speech features. Next, we want to explore different flavors of attention mechanisms, which make the DNN focus on relevant parts of the input and provide a way to quantify how much evidence has been collected about the speaker identity and the uncertainty of the obtained representation, which is a critical issue when making (Bayesian) decisions in SID. Finally, other approaches, such as using the raw signal (instead of features) or other advances that might arise, will also be explored for SID and related tasks. To achieve our goals, we will start from theory, implement the proposed approaches and test them on public SID benchmarks such as the NIST SREs. The outcomes are intended to benefit both the scientific community and the speech processing industry. The applicant, Dr. Alicia Lozano-Diez, is an excellent female researcher who completed her Ph.D. at Audias (Universidad Autonoma de Madrid, Spain), a respected research lab. The host group, Speech@FIT from Brno University of Technology (Czechia), has a top-class track record in speech processing research. Thus, we expect the combination of the researcher and the host to boost the researcher's career and benefit the host group (and its industrial European partners).
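
To make the pooling and attention ideas in the objective more concrete, here is a minimal sketch, not taken from the project itself. It assumes frame-level features are plain lists of floats and uses a fixed, hand-picked query vector where a real system would learn the attention parameters; the names `mean_pool`, `attentive_pool` and `query` are hypothetical. It shows how recordings of different lengths map to a fixed-length vector, and how attention weights shift that vector toward selected frames.

```python
import math


def mean_pool(frames):
    """Average frame-level feature vectors over time (fixed-length output)."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(frames, query):
    """Weighted average of frames.

    Each frame is scored by its dot product with `query` (an assumption for
    illustration; a trained model would learn this scoring function), and the
    scores are normalized with softmax to obtain the attention weights.
    """
    scores = [sum(x * q for x, q in zip(f, query)) for f in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    return pooled, weights


# Two toy "recordings" with different numbers of frames, 2 features per frame.
short_rec = [[1.0, 0.0], [3.0, 2.0]]
long_rec = [[1.0, 0.0], [3.0, 2.0], [2.0, 1.0], [2.0, 1.0]]

# Both recordings map to a vector of the same length.
short_mean = mean_pool(short_rec)
long_mean = mean_pool(long_rec)

# A zero query gives uniform weights, so attentive pooling reduces to the mean.
uniform_pooled, uniform_weights = attentive_pool(short_rec, [0.0, 0.0])

# A query that favors the second feature concentrates weight on the second frame.
focused_pooled, focused_weights = attentive_pool(short_rec, [0.0, 5.0])
```

In this toy setting the attention weights always sum to one, so how concentrated or spread out they are gives a rough picture of which frames drive the pooled representation. How the project would turn such weights into an evidence or uncertainty measure is not specified in the text above.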

Are you the coordinator (or a participant) of this project? Please send me more information about the "ETE SPEAKER" project.

For instance: the website URL (it has not been provided by EU-opendata yet), the logo, a more detailed description of the project (in plain text, as an RTF file or a Word file), some pictures (as picture files, not embedded in a Word file), Twitter account, LinkedIn page, etc.

Send me an email and I will put them on your project's page as soon as possible.

Thanks. And then put a link to this page on your project's website.

The information about "ETE SPEAKER" is provided by the European Opendata Portal: CORDIS opendata.

More projects from the same programme (H2020-EU.1.3.2.)

MIGPSC (2018)

Shaping the European Migration Policy: the role of the security industry

MingleIFT (2020)

Multi-color and single-molecule fluorescence imaging of intraflagellar transport in the phasmid chemosensory cilia of C. elegans

ToMComputations (2019)

How other minds are represented in the human brain: Neural computations underlying Theory of Mind
