Romain Contrain/ June 30, 2021/ Current

Foreign Language Learning assisted by Artificial Intelligence

Apprentissage des Langues Assisté par Intelligence Artificielle

ALAIA is a joint laboratory between IRIT and Archean Technologies. Funded for 3 years by the ANR, the French National Research Agency, this LabCom started in March 2019. ALAIA is based on the synergy between the SAMoVA team and Archean Labs, the R&D department of Archean Technologies.

Main issues and objectives

The core of ALAIA is to put computer technologies and artificial intelligence methods at the service of foreign language learning. The originality of our project is to focus on oral expression, through the evaluation of the quality of utterance pronunciation by foreign language learners, while considering the impact of their mother tongue on the target language. Our main objectives are to:

  • rely on the partners’ expertise in order to develop and deploy innovative software in the field of foreign language learning;
  • adopt a highly interdisciplinary approach based on the fields of didactics and linguistics, computer research and techniques for interaction with learners;
  • integrate the methods from these three domains in the development of software building blocks.

Synergy between ALAIA’s partners, learners and experts

We focus on the Japanese-French language pair, with Japanese as the mother tongue (L1) and French as the target language (L2). The methodology will subsequently be applied to other language pairs. This work relies on the expertise of teachers in foreign language didactics, who are already working with ALAIA’s partners.

Current work

Within the framework of the ALAIA innovative program, our first step is to work on phonetic-phonological skills by focusing on the pronunciation of phonemes at word level. Automatic detection, localization and characterization of segmental production errors is our main focus. This relies on:

(1) Speech stimuli in French.

Based on LexPro, this dataset was developed in collaboration with Waseda University (S. Detey) and the University of Pau (F. Hapel and C. Domin). About 2500 utterances were recorded in standard French and transcribed orthographically and phonetically.

(2) Recordings of Japanese learners’ production (in French).

Resulting from former collaborations with Waseda University, a first dataset of about 15000 recordings has been put to use within ALAIA.

(3) Phonetic annotations made by experts

Over 7100 utterances, based on 200 stimuli (distinct words or short sentences) and produced by 67 learners, were manually transcribed at the phone level. This first annotation step was carried out by experts using specifically developed tools. About 55000 phones were labeled in terms of correctness or error type. Another expert has recently added information about temporal phone segmentation and alignment.

Thanks to this sizable dataset, we worked on acoustic modeling enabling transcription at phone level. The different steps of our process are summed up in the figure below.

System and resources required to achieve pronunciation error detection, localization and characterization.

(4) Acoustic modeling adaptation to Japanese learners of French

More specifically, we used the phonetically annotated dataset mentioned above to adapt pre-existing acoustic-phonetic models ([Li, 2020], [Gelin, 2021]) to the domain of Japanese learners speaking French, which improved the quality of their transcriptions. The adapted model was then integrated into a tool for detecting and identifying pronunciation errors, which is still in development.
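
As an illustration only: one common way to detect and localize segmental errors from a phone-level transcription is to align the expected phone sequence of a stimulus with the phones recognized in the learner's production. The sketch below does this with a plain edit-distance alignment in Python. It is not the ALAIA tool. The function name `align_phones`, the error labels and the ASCII phone symbols in the example are assumptions made for this illustration, and the example sequences are invented rather than taken from the dataset.

```python
from typing import List, Optional, Tuple

# One aligned position: (label, expected phone, produced phone).
# Labels are hypothetical: "correct", "substitution", "deletion", "insertion".
Op = Tuple[str, Optional[str], Optional[str]]


def align_phones(expected: List[str], produced: List[str]) -> List[Op]:
    """Align two phone sequences with unit-cost edit distance."""
    n, m = len(expected), len(produced)

    # cost[i][j] = edit distance between expected[:i] and produced[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (expected[i - 1] != produced[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back from the bottom-right corner to recover the operations.
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (
            expected[i - 1] != produced[j - 1]
        ):
            same = expected[i - 1] == produced[j - 1]
            ops.append(("correct" if same else "substitution",
                        expected[i - 1], produced[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            # Expected phone with no counterpart in the production.
            ops.append(("deletion", expected[i - 1], None))
            i -= 1
        else:
            # Extra produced phone with no counterpart in the target.
            ops.append(("insertion", None, produced[j - 1]))
            j -= 1
    ops.reverse()
    return ops


if __name__ == "__main__":
    # Invented example: target "t R E", production with an extra vowel
    # and a different rhotic.
    for op in align_phones(["t", "R", "E"], ["t", "o", "r", "E"]):
        print(op)
```

On this invented example, the alignment reports an inserted "o" and a substitution of "R" by "r", with the other two phones marked correct. A real system would likely weight substitutions by phonetic similarity and use the temporal segmentation to place each error in the audio; the unit costs here are only the simplest choice.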

(5) Error detection and characterization at lexical and syntactic levels

An industrial PhD (CIFRE) started in January 2021 to address higher linguistic levels, focusing on lexical and syntactic errors occurring in learners’ oral production. The aim of this research is to propose a comprehensibility measure covering both pronunciation and linguistic levels, applied to more spontaneous utterances, which differ from word or sentence repetitions. This is a continuation of the research conducted in Estelle Randria’s PhD thesis [Randria, 2022].

LabCom Members

After Antoine Viette and Gautier Arcin, who worked with us as research engineers in 2021, Romain Contrain is now in charge of our research developments on pronunciation error detection and characterization. Complementary annotations and temporal alignment were done by Sang-Ho Kim. Verdiana De Fino, as an industrial PhD student, is working on errors at higher linguistic levels. They work in close collaboration with the steering committee members (see below).

LabCom governance

The LabCom governance is organized through two committees:

Steering Committee

Strategic Committee

  • Xavier Aumont: President of Archean Technologies
  • Jean-Pierre Jessel: Vice President of Research – University Toulouse III
  • Jérôme Lelasseux: Local SATT representative – TTT Toulouse Tech Transfer
  • Jean-Marc Pierson: Head of IRIT, representing also the head of the INS2I institute of the CNRS
  • Charlotte Sicre: Data protection referent at IRIT – GDPR correspondent
  • Jean-Marc Fourcade: Engineering and Computer Science Officer – Region Occitanie
  • Olivier Baude: Pr. of Language Sciences, University of Nanterre – Director of the TGIR Huma-Num
  • Sylvain Detey: Pr. of Language Sciences, Waseda University – Japan – Expert in language didactics

Latest meeting: 21st November 2022

Publications related to ALAIA


Funding and Schedule

  • ANR Joint Laboratory Program ANR-18-LCV3-001 (FR) – 300k€
  • Start date: 1st March 2019 – End of project: 31st December 2023


  • [Li, 2020] Li, X., Dalmia, S., Li, J., Lee, M., Littell, P., Yao, J., … & Metze, F. (2020, May). Universal phone recognition with a multilingual allophone system. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 8249-8253). IEEE.
  • [Gelin, 2021] Gelin, L., Daniel, M., Pinquier, J., & Pellegrini, T. (2021). End-to-end acoustic modelling for phone recognition of young readers. Speech Communication, 134, 71-84.
  • [Randria, 2022] Randria, E. (2022). Compréhensibilité de contenus audiovisuels : quelles approches pour une mesure objective ? PhD thesis, Université Paul Sabatier (Toulouse 3). In French.