LabCom ALAIA

Iferrane / June 30, 2021 / Current

Foreign Language Learning assisted by Artificial Intelligence

Apprentissage des Langues Assisté par Intelligence Artificielle

ALAIA is a joint laboratory between IRIT and Archean Technologies. Funded for 3 years by the ANR, the French National Research Agency, this LabCom started in March 2019. ALAIA is based on the synergy between the SAMoVA team and Archean Labs, the R&D department of Archean Technologies.

Main issues and objectives

The core of ALAIA is to put computer technologies and artificial intelligence methods at the service of foreign language learning. The originality of our project is to focus on oral expression, through the evaluation of the quality of utterance pronunciation by foreign language learners, while considering the impact of their mother tongue on the target language. Our main objectives are to:

  • rely on the partners’ expertise in order to develop and deploy innovative software in the field of foreign language learning;
  • adopt a highly interdisciplinary approach based on the fields of didactics and linguistics, computer research and techniques for interaction with learners;
  • integrate the methods from these three domains in the development of software building blocks.
Synergy between ALAIA’s partners, learners and experts

We focus on the Japanese-French language pair, with Japanese as the mother tongue (L1) and French as the target language (L2). The methodology will subsequently be applied to other language pairs. This work relies on the expertise of teachers in foreign language didactics, who are already working with ALAIA’s partners.

Current work

Within the framework of the ALAIA innovative program, our first step is to work on phonetic-phonological skills by focusing on the pronunciation of phonemes at word level. Automatic detection, localization and characterization of segmental production errors is our main focus. This relies on:

(1) Speech stimuli in French.

Based on LexPro, this dataset has been developed in collaboration with Waseda University (S. Detey) and the University of Pau (F. Hapel and C. Domin). About 2500 statements were recorded in standard French and transcribed both orthographically and phonetically.

(2) Recordings of Japanese learners’ production (in French).

Resulting from earlier collaborations with Waseda University, a first dataset of about 15000 recordings has been put to use within ALAIA.

(3) Phonetic annotations made by experts

So far, more than 7100 utterances, based on 200 stimuli (distinct words or short sentences) and produced by 67 learners, have been manually transcribed at the phoneme level. This first annotation step was carried out by experts using specifically developed tools. About 55000 phonemes were labeled as correct or assigned an error type. Another expert recently added temporal phoneme alignment information.

Thanks to this substantial dataset, we are also currently working on acoustic modeling that enables transcription at the phoneme level. The different steps of our process are summarized in the figure below.

System and resources required to achieve pronunciation error detection, localization and characterization.
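To make the three tasks concrete, here is a minimal sketch of how detection, localization and characterization could be expressed once a phoneme-level transcription of the learner's production is available. It is not ALAIA's actual system, which is not detailed on this page: the function name `align_phonemes`, the four labels and the example words are assumptions chosen for illustration, and the alignment uses a plain edit distance rather than acoustic models.

```python
from typing import List, Tuple

Step = Tuple[int, str, str, str]  # (index in expected, label, expected, produced)


def align_phonemes(expected: List[str], produced: List[str]) -> List[Step]:
    """Align expected and produced phoneme sequences with edit distance.

    Labels (illustrative, not ALAIA's annotation scheme):
    'correct', 'substitution', 'deletion', 'insertion'.
    """
    n, m = len(expected), len(produced)
    # cost[i][j] = minimal number of edits turning expected[:i] into produced[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (expected[i - 1] != produced[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back from the end to recover one optimal alignment.
    steps: List[Step] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (
            expected[i - 1] != produced[j - 1]
        ):
            same = expected[i - 1] == produced[j - 1]
            label = "correct" if same else "substitution"
            steps.append((i - 1, label, expected[i - 1], produced[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            steps.append((i - 1, "deletion", expected[i - 1], ""))
            i -= 1
        else:
            # Extra phoneme produced before expected position i.
            steps.append((i, "insertion", "", produced[j - 1]))
            j -= 1
    steps.reverse()
    return steps


# Hypothetical example: expected /t y/, produced /t u/.
for step in align_phonemes(["t", "y"], ["t", "u"]):
    print(step)
```

In this sketch, detection amounts to checking whether any step is not labeled "correct", localization is given by the index in the expected sequence, and characterization by the label together with the expected and produced phonemes. When several alignments have the same cost, the one returned depends on the order of the checks in the backtrace.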

(4) Acoustic modeling adaptation to Japanese learners of French

to be completed

(5) Error detection and characterization at lexical and syntactic levels

An industrial PhD (CIFRE) started in January 2021 to address lexical and syntactic errors in learners’ productions. The aim of this research is to propose a comprehensibility measure covering both pronunciation and linguistic levels, applied to more spontaneous utterances, which differ from word or sentence repetitions.

LabCom Members

Following Antoine Viette and Gautier Arcin, who worked with us as research engineers in 2021, Romain Contrain is now in charge of our research developments on pronunciation error detection and characterization. Complementary annotations and temporal alignment were done by Sang-Ho Kim. Verdiana De Fino, an industrial PhD student, is working on errors at higher linguistic levels. They work in close collaboration with the steering committee members (see below).

LabCom governance

The LabCom governance is organized through two committees:

Steering Committee

Strategic Committee

  • Xavier Aumont: President of Archean Technologies
  • Jean-Pierre Jessel: Vice President of Research – University Toulouse III
  • Jérôme Lelasseux: Local SATT representative – Toulouse Tech Transfer (TTT)
  • Jean-Marc Pierson: Head of IRIT, representing also the head of the INS2I institute of the CNRS
  • Charlotte Sicre: Data protection referent at IRIT – GDPR correspondent
  • Jean-Marc Fourcade: Engineering and Computer Science Officer – Region Occitanie
  • Olivier Baude: Prof. of Language Sciences, University of Nanterre – Director of the TGIR Huma-Num
  • Sylvain Detey: Prof. of Language Sciences, Waseda University – Japan – Expert in language didactics

Latest meeting: 21st November 2022

Publications related to ALAIA

Collaborations

  • Sylvain Detey: Waseda University – Japan – Expert in L2 didactics

Funding and Schedule

  • ANR Joint Laboratory Program – a 3-year project plus a 12-month extension (due to COVID conditions) and a complementary 10-month extension to the end of 2023.
  • Start date: 1st March 2019 – End of project: 31st December 2023
