Report

Teaser, summary, work performed and final results

Periodic Reporting for period 1 - ENRICH (Enriched communication across the lifespan)

Summary

Communication using speech is natural, ubiquitous, and efficient in daily life. Reduced capacity in speaking or hearing can therefore create a significant barrier to social inclusion throughout life: in education, at work and at home. Trends in health, technology and social mobility are creating challenges for listeners. We are living longer, and in Europe age-related hearing loss will affect around half of all adults by the age of 80. At the same time, listeners are faced with artificial speech from devices in the home, at work, and while travelling, often in noisy environments. Because of increasing mobility, we are exposed to more speech spoken non-natively, or spoken in a first language other than our own. All of these factors are known to reduce intelligibility or make speech more demanding to process. Technological solutions such as hearing aids ('prostheses'), voice modification and speech synthesis can help, yet their use often imposes even greater effort on the listener. We need a clearer understanding of how listeners process different styles of speech, which aspects make one form of speech more intelligible or easier to process than another, how different types of listener are affected by distinct speech styles, and how to design effective algorithms and strategies ('speech enrichments') to make life easier for listeners.

ENRICH (Enriched Communication across the Lifespan) is an EU-funded Marie Skłodowska-Curie European Training Network developing novel algorithms for speech enrichment. We aim to demonstrate the benefit of enrichment on cognitive effort for listener groups and individuals, and to pilot the application of speech enrichment in assistive technology. ENRICH is training the next generation of scientists to devise, evaluate and apply new approaches to the generation and processing of speech that will reduce listening effort while maintaining or increasing intelligibility. ENRICH is tackling many distinct forms of speech: healthy and disordered speech, pre-recorded speech, and computer-generated speech. We are measuring the improvements from enriched speech on message understanding and cognitive effort in varying listening conditions, using a wide variety of methods: physiological measures (pupil size, or brain activity measured by EEG), dual-task designs, subjective judgements, identification scores, and task completion rates. Advanced virtual reality software is used to create well-controlled but highly realistic simulated environments. Speech modifications are tested with normal-hearing and hearing-impaired listeners, non-native listeners, and listeners with enhanced listening skills.

ENRICH is home to 14 early-stage researchers with backgrounds in psychology, linguistics, engineering and computer science, hosted by leading European research groups in 8 organisations from the academic and industrial sectors in 5 countries, along with 7 partners from research institutes, the hearing and voice technology industries, and clinics. The network provides unparalleled access to state-of-the-art expertise, specialist equipment, speech corpora, algorithms, protocols and listener/speaker population samples speaking a range of European languages. Through an ambitious training programme, ENRICH researchers are gaining skills across the spectrum of disciplines required to make novel contributions in speech communication. They are also getting hands-on training in complementary areas including entrepreneurship, technical writing, scientific conduct and public dissemination. See also the ENRICH video (search 'youtube enrich etn').

Work performed

Researchers have been in post for 12-20 months and have spent that time devising their experimental programmes, identifying spoken language resources, collecting new corpora, and designing algorithms. Initial results have been disseminated in 24 conference posters, 13 talks and 9 published papers. Substantial effort has gone into compiling a repository of shared data resources, and 7 completely new corpora have been collected. Eight training events have been held so far, most lasting 3-4 days, covering topic-specific scientific themes, data science (programming and statistics), and complementary skills. Researchers have made 14 cross-site visits, with many secondments planned in the next 12 months.

Some research highlights to date:
(i) Synthetic speech requires more listening effort than plain speech, and different types of speech generation approaches differ in the cognitive load they impose on the listener. This finding, based on pupillometry and dual-task paradigms, will help in the design of new 'low effort' synthesis approaches.
(ii) A pupillometry-based study has demonstrated that speech can be algorithmically modified to result in a lower listening effort than natural speech.
(iii) Self-reported listening effort is highly correlated with intelligibility for oesophageal speech. Voice familiarity does not improve intelligibility, but listening effort is lower for people familiar with the talker's voice.
(iv) An implicit trainable technique based on deep neural networks can match the success of an explicit speech modification algorithm in improving intelligibility without increasing volume.
(v) The degree of entrainment to a target talker in the presence of a competing talker has been measured using EEG, providing insights into how a listener's attentional focus is affected under challenging listening conditions characterised by the presence of noise or distorted forms of speech.

Final results

The last decade has seen significant progress in understanding which factors contribute to intelligibility in everyday listening conditions, in developing metrics to predict intelligibility, and in using these metrics to design intelligibility-enhancing algorithms. The next challenge, which ENRICH is tackling, is to do the same for listening effort. By the end of the project we will have developed listening effort predictors and used them to generate forms of speech that are easier for listeners to process, while maintaining intelligibility. We will have made progress on understanding which factors lead to higher effort by studying natural speech collected from a sizeable sample of talkers in different languages. We will have developed a clearer understanding of how listening effort affects different groups of listeners. We will have trained a community of researchers who also have a good understanding of the needs of industry and of the requirements for commercial applications. By disseminating our findings in academic conferences and journals, and at industry and public understanding events, we aim to make the concept of speech enrichment central to the next generation of hearing and communication technology, improving the lives of many citizens in the future.

Website & more info

More info: http://www.enrich-etn.eu/.