Seminars

 

Because IRIT is spread across several sites, its seminars are organized and held at Université Toulouse 3 Paul Sabatier (UT3), Université Toulouse 1 Capitole (UT1), INP-ENSEEIHT, or Université Toulouse 2 Jean Jaurès (UT2J).

 

Speech Dereverberation using EM Algorithm and Kalman Filtering

Sharon GANNOT - Bar-Ilan University (Israel)

Monday, 1 October 2018, 14:00 - 15:00
INP-ENSEEIHT, Room C103

Abstract

Reverberation is a typical acoustic phenomenon that is attributed to a multitude of reflections from the walls, ceiling, and objects in enclosures. Reverberation is known to deteriorate the accuracy of automatic speech recognition systems as well as the quality of the speech signals. In severe cases, it might also hamper the intelligibility of the speech signals. Dereverberation algorithms aim at reducing the reverberation and, as a result, at emphasizing the anechoic speech signal. A non-Bayesian, Maximum-Likelihood (ML) approach for single-speaker dereverberation using multiple microphones is taken in this work. We first define a statistical model for the speech signal and for the associated acoustic impulse responses, and for the speech and noise power spectral densities. The estimate-maximize (EM) approach is then employed to infer the ML estimate of the deterministic parameters. It is shown that the clean speech is estimated in the E-step using a Kalman smoother, and the acoustic parameters are updated in the M-step. For online applications and dynamic scenarios, i.e. when the speaker and/or the microphones are moving, we derive a recursive EM (REM) algorithm which uses the Kalman filter rather than the Kalman smoother, together with an online update of the parameters that uses only the current observed data. Two extensions of this approach were developed as well. The first extension is a segmental algorithm, taking an intermediate batch-recursion approach, where iterations are performed over short segments and smoothness between the estimated parameters is preserved. The latency of the segmental algorithm can hence be controlled by setting the segment length, while the accuracy of the estimates can be iteratively improved. The second extension is a binaural algorithm mainly applicable to hearing aids. The binaural algorithm trades off between the reduction of reverberation and the preservation of the spatial perception of the user.
An extensive simulation study, as well as real recordings of moving speakers in our acoustic laboratory, demonstrates the performance of the presented algorithms.
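
To illustrate the E-step/M-step split described in the abstract, here is a minimal toy sketch, not the speaker's algorithm. It assumes a scalar AR(1) state observed in white noise, with known noise variances `q` and `r`, and estimates only the transition coefficient `a`. The model, the function names, and all parameter values are assumptions made for this illustration; the dereverberation method in the talk works with multiple microphones and acoustic impulse responses, which this sketch does not attempt to model.

```python
import random


def simulate(a, q, r, n, seed=0):
    """Toy data: state x_t = a*x_{t-1} + w_t, observation y_t = x_t + v_t."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, q ** 0.5)
        ys.append(x + rng.gauss(0.0, r ** 0.5))
    return ys


def kalman_filter(ys, a, q, r, p0=1.0):
    """Forward pass: filtered and one-step-predicted means and variances."""
    m_pred, p_pred, m_filt, p_filt = [], [], [], []
    m, p = 0.0, p0  # assumed prior on the first state
    for y in ys:
        m_pred.append(m)
        p_pred.append(p)
        k = p / (p + r)              # Kalman gain
        m = m + k * (y - m)          # measurement update
        p = (1.0 - k) * p
        m_filt.append(m)
        p_filt.append(p)
        m, p = a * m, a * a * p + q  # time update
    return m_filt, p_filt, m_pred, p_pred


def rts_smoother(m_filt, p_filt, m_pred, p_pred, a):
    """Backward (Rauch-Tung-Striebel) pass: smoothed means, variances, gains."""
    n = len(m_filt)
    m_s, p_s, gains = m_filt[:], p_filt[:], [0.0] * n
    for t in range(n - 2, -1, -1):
        g = a * p_filt[t] / p_pred[t + 1]
        gains[t] = g
        m_s[t] = m_filt[t] + g * (m_s[t + 1] - m_pred[t + 1])
        p_s[t] = p_filt[t] + g * g * (p_s[t + 1] - p_pred[t + 1])
    return m_s, p_s, gains


def em_estimate_a(ys, a_init, q, r, n_iter=50):
    """EM for the AR coefficient only; q and r are treated as known."""
    a = a_init
    for _ in range(n_iter):
        # E-step: smoothed state moments under the current parameter value.
        m_f, p_f, m_p, p_p = kalman_filter(ys, a, q, r)
        m_s, p_s, gains = rts_smoother(m_f, p_f, m_p, p_p, a)
        # M-step: closed-form update from E[x_{t+1} x_t] and E[x_t^2];
        # the lag-one covariance is gains[t] * p_s[t + 1].
        pairs = range(len(ys) - 1)
        num = sum(m_s[t + 1] * m_s[t] + gains[t] * p_s[t + 1] for t in pairs)
        den = sum(m_s[t] ** 2 + p_s[t] for t in pairs)
        a = num / den
    return a
```

For example, `em_estimate_a(simulate(0.9, 0.1, 0.1, 2000), a_init=0.2, q=0.1, r=0.1)` should return a value near the simulated coefficient. The smoother needs the whole record before it can run, which is why a batch scheme of this kind is not directly suited to online use.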

Short Bio: Sharon Gannot received the B.Sc. degree (summa cum laude) from the Technion-Israel Institute of Technology, Haifa, Israel, in 1986, and the M.Sc. (cum laude) and Ph.D. degrees from Tel-Aviv University, Tel Aviv, Israel, in 1995 and 2000, respectively, all in electrical engineering. In 2001, he held a postdoctoral position with the Department of Electrical Engineering, KU Leuven, Leuven, Belgium. From 2002 to 2003, he held a research and teaching position with the Faculty of Electrical Engineering, Technion-Israel Institute of Technology. He is currently a Full Professor with the Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel, where he heads the Speech and Signal Processing Laboratory and the Signal Processing Track. Since April 2018, he has also been a part-time Professor at the Technical Faculty of IT and Design, Aalborg University, Denmark.

 
