Interview with Jérôme Farinas, automatic speech processing

We are publishing the seventh and final video in our series of interviews presenting the research work of our departments. Jérôme FARINAS, Associate Professor at UT3 in the SI department, SAMoVA team, explains his research on automatic speech processing.

What is automatic speech processing?

Automatic speech processing is a field of research that develops models and systems able to extract, automatically by computer, a wide range of information from an audio recording of speech. This information includes the recording conditions of the audio extract, the language used, the text spoken, the characteristics of the speaker, and the intonations and emotions present, among others. It is a multidisciplinary field that works with acoustic signals and requires skills in signal processing, mathematics, computer science (especially artificial intelligence), linguistics, and even neuroscience, when the functioning of the human brain must be understood in order to build computer models.

What is the challenge of research in this area?

Enormous progress has been made since the 2010s. The mastery of deep neural network training has profoundly changed the landscape of research in pattern recognition, and particularly in the field of speech. Artificial intelligence, and machine learning as one of its main branches, has undergone a revolution by benefiting from the increase in computing power and the availability of large data collections. This revolution has changed the models traditionally used and has made it possible to propose solutions based on deep networks, convolutional networks, encoder-decoders and attention models, all of which are extremely demanding to train. Despite these advances, performance still falls short when speech quality is degraded or when listening conditions are disturbed. One example is the use of speech recognition to simulate presbycusis (age-related hearing loss), with the goal of proposing improved hearing aid settings.

Currently, Jérôme Farinas’ research focuses on parameters, or even a single measure, that would characterize speech production disorders objectively, i.e. computed automatically, particularly in people suffering from pathologies such as ear, nose and throat (ENT) cancers or Parkinson’s-type diseases. The field of application is clearly the hospital environment, where speech processing will facilitate the collection of information for both care and clinical research, with the aim of providing medical solutions in the years to come.