Impact of gestures on the pronunciation of foreign language learners
Main issues and objectives
The question of associating gesture and speech in a second language (L2) learning context emerged from a workshop organized by the project partners, who are involved at different levels in L2 learning. Starting from an initial reflection on the technical possibilities of combining gesture production and pronunciation learning, several international academic experts in the fields of (1) gesture analysis, (2) language acquisition, and (3) the link between didactics and digital processing were invited to IRIT in October 2017 to discuss this matter. The company Archean Technologies, involved in language learning software, was also asked to participate and give its industrial point of view on the following questions: “What do we know about the role of gestures while acquiring the pronunciation of a foreign language? How can we study and measure this link?” That is how the idea of the INGPRO project emerged. Our proposal was submitted and accepted in 2018 as a Regional Council Project (Occitanie).
INGPRO aims at proposing a methodology that can answer these questions, validate the underlying assumptions and integrate the software functionalities into an experimental platform in order to carry out a set of tests for a wide audience of learners. During this two-year project we aimed to demonstrate the feasibility of our innovative proposal.
Our team was mainly involved in the study of speech and gesture coordination. We relied on a preliminary study carried out by the Octogone laboratory (project leader) on the gestures performed by L2 teachers in the context of pronunciation learning, especially for French. The objective of this study was to produce a typology and a characterization of the gestures used for pronunciation correction. It was followed by a study on the coordination of gesture and speech in situations of phonetic correction (Ballentine, 2019). Based on this work, we proposed a methodology for animating a 3D avatar that produces a set of corrective gestures associated with their corrective stimulus (Donny, 2020). Both pieces of work are summarized below.
Study of Speech and Gesture Coordination
From a set of video recordings of phonetic correction sessions carried out by teachers of FFL (French as a Foreign Language), a manual annotation step of the audio track was done in order to highlight the various phases describing the anatomy of gestures as defined by McNeill and by Bressem & Ladewig (rest position, preparation, stroke, hold, retraction, …), as well as the parts of the human body involved in producing the gesture (hands, arms, bust, head, gaze, …). These annotations were aligned with the teacher’s utterances, and more particularly with annotations made manually or automatically at the phoneme level. This is illustrated in Figure 1, where the red frame shows the annotation of body part information, the yellow one shows the temporal sequence of the different gesture phases, and the green one shows the annotations available at the phoneme level.
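As a minimal sketch of what aligning gesture-phase annotations with phoneme-level annotations can look like, the Python snippet below lists, for each gesture phase, the phonemes whose time span overlaps it. The `Interval` type, the `overlapping` helper, and all labels and time values are hypothetical and chosen for illustration; they are not the project's annotation format or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One annotation on a tier: a label with start/end times in seconds."""
    label: str
    start: float
    end: float


def overlapping(phase, phonemes):
    """Return the phonemes whose time span overlaps the given gesture phase."""
    return [p for p in phonemes if p.start < phase.end and p.end > phase.start]


# Hypothetical annotations (illustrative values, not project data).
gesture_phases = [
    Interval("preparation", 0.00, 0.40),
    Interval("stroke", 0.40, 0.90),
    Interval("retraction", 0.90, 1.30),
]
phonemes = [
    Interval("b", 0.10, 0.35),
    Interval("on", 0.35, 0.60),
    Interval("j", 0.60, 0.85),
    Interval("u", 0.85, 1.10),
    Interval("r", 1.10, 1.25),
]

# Map each gesture phase to the phoneme labels it co-occurs with.
alignment = {
    phase.label: [p.label for p in overlapping(phase, phonemes)]
    for phase in gesture_phases
}
```

With such a mapping, one can ask, for example, which phonemes co-occur with the stroke of a corrective gesture. A strict overlap test is used here, so a phoneme that only touches a phase boundary is not counted.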
Towards a gesture typology
The temporal decomposition of each type of gesture (as shown in Figure 2) was studied in order to highlight the teachers’ intra- and inter-personal variability or regularity when performing corrective gestures.
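One simple way to quantify this kind of variability, given here only as a sketch, is to compare phase durations within and across teachers. In the Python example below, intra-personal variability is taken as the standard deviation of one teacher's phase durations, and inter-personal variability as the standard deviation of the per-teacher means. These measures, the `phase_variability` function, and the duration values are assumptions made for illustration, not the analysis actually used in the project.

```python
from collections import defaultdict
from statistics import mean, pstdev

# (teacher, gesture type, phase, duration in seconds): hypothetical values.
records = [
    ("T1", "upward", "stroke", 0.50),
    ("T1", "upward", "stroke", 0.60),
    ("T1", "upward", "stroke", 0.70),
    ("T2", "upward", "stroke", 0.90),
    ("T2", "upward", "stroke", 1.10),
]


def phase_variability(records, gesture_type, phase):
    """Return per-teacher (mean, std) durations and the std of the teacher means."""
    by_teacher = defaultdict(list)
    for teacher, g_type, g_phase, duration in records:
        if g_type == gesture_type and g_phase == phase:
            by_teacher[teacher].append(duration)
    # Intra-personal: spread of durations within each teacher.
    intra = {t: (mean(d), pstdev(d)) for t, d in by_teacher.items()}
    # Inter-personal: spread of the per-teacher mean durations.
    inter = pstdev([m for m, _ in intra.values()])
    return intra, inter


intra, inter = phase_variability(records, "upward", "stroke")
```

A small intra-personal spread combined with a large inter-personal spread would suggest that each teacher is consistent but that teachers differ from one another.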
Modeling gesture types
We designed an experimental protocol to submit pronunciation exercises, in the target foreign language, to different groups of learners. The idea was to carry out a contrastive study of the effect of gestures used to reinforce pronunciation correction. We chose to use a 3D avatar as a virtual teacher for two main reasons: (1) to better control the experimental conditions, in particular the type of corrective gestures presented to the learners (gesture parameters, body parts involved, combination with spoken corrective stimuli, etc.), and (2) to reduce teachers’ inter-personal variability and limit the presence of other communication signs produced unconsciously, which could influence the learner.
From the literature and the analysis of teacher video corpora, six different co-verbal gestures were selected. They can be used in an iconic way for suprasegmental correction (illustrating a concrete concept like rising intonation) or in a metaphoric way (illustrating an abstract concept like tension or relaxation). Each type of gesture is illustrated below:
|Upward|Upward + Downward|Monotone|Relaxing|Tension|Opening|
Animation of a 3D avatar
The 3D avatar (Figure 3 – left) has to perform a gesture that must be synchronized with the pronunciation of the corrective stimulus it is associated with (with or without emphasis, as shown in the video below). A tool has been developed to generate animations that integrate lip synchronisation (based on a phoneme-to-viseme map (Bear & Harvey, 2017); see Figure 3 – middle) with the associated corrective gesture (Figure 3 – right).
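The general idea of lip synchronisation from a phoneme-to-viseme map, and of timing a gesture against the spoken stimulus, can be sketched in Python as follows. The viseme classes in `PHONEME_TO_VISEME`, the `viseme_track` and `stroke_onset` functions, and the timing values are all hypothetical; they are neither the mapping from the cited work nor the tool developed in the project.

```python
# Hypothetical phoneme-to-viseme table (illustrative classes only).
PHONEME_TO_VISEME = {
    "p": "closed_lips", "b": "closed_lips", "m": "closed_lips",
    "f": "lip_teeth", "v": "lip_teeth",
    "a": "open", "o": "rounded", "u": "rounded",
}


def viseme_track(phonemes, default="neutral"):
    """Turn (phoneme, start, end) tuples into merged (viseme, start, end) keyframes."""
    track = []
    for label, start, end in phonemes:
        viseme = PHONEME_TO_VISEME.get(label, default)
        # Merge with the previous keyframe when the mouth shape does not change.
        if track and track[-1][0] == viseme and track[-1][2] == start:
            track[-1] = (viseme, track[-1][1], end)
        else:
            track.append((viseme, start, end))
    return track


def stroke_onset(target_start, preparation_duration):
    """Time at which the gesture must begin so that its stroke starts on the target segment."""
    return max(0.0, target_start - preparation_duration)


# Hypothetical phoneme timings for a short stimulus (seconds).
stimulus = [("b", 0.0, 0.1), ("o", 0.1, 0.3), ("u", 0.3, 0.5), ("m", 0.5, 0.6)]
lips = viseme_track(stimulus)
gesture_start = stroke_onset(target_start=0.5, preparation_duration=0.25)
```

Merging consecutive identical visemes keeps the number of animation keyframes small, and starting the gesture one preparation phase ahead of the target segment is one possible way to make the stroke coincide with the sound being corrected.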
In mid-November 2020, Amélie Le Chevanton joined us as a research engineer to further develop the Web platform dedicated to foreign language learning. Her first mission was to integrate the avatar into the web service platform used for the experimental sessions organized in spring 2021 (after being delayed because of the COVID period). This tool was designed to collect learners’ speech production and study the impact of gestures performed during phonetic correction.
Marie Philippart de Foy and Léonardo Contreras Roa joined us, as part-time postdoctoral researchers, in the last phase of the project. They analyzed in detail the data collected during the experimental sessions with Japanese-speaking students that took place in May 2021. We warmly thank Corentin Barcat (PhD, Tokyo University of Foreign Studies) for his help in collecting the data using the aforementioned application.
Within the scope of INGPRO, complementary work was carried out to transfer gesture generation and audio alignment to beat gestures, in order to help visualize the tonic accent when learning specialized English vocabulary.
David McNeill, Gesture: A Psycholinguistic Approach, University of Chicago. For the Psycholinguistics Section, The Encyclopedia of Language and Linguistics.
Jana Bressem & Silva H. Ladewig, Rethinking gesture phases: Articulatory features of gestural movement?, Semiotica 2011 (184): 53-91 (2011).
Kellian Ballentine, Speech analysis and gesture coordination for learning French as a foreign language, Master 2 internship report, August 2019.
Franck Donny, Project INGPRO: animation of a 3D avatar, Master 2 internship report, September 2020.
Bear H. L., Harvey R., Phoneme-to-viseme mappings: the good, the bad, and the ugly, Speech Communication, Vol. 95 (Dec 2017), 40–67.
Alazard-Guiu C., Ferrané I., Fontan L., Le Coz M., Pellegrini T., Pinquier J. & Yassine-Diab N. (2021), Studying the impact of congruent and incongruent corrective gestures on the acquisition of L2 pronunciation, Trend In Pedagogical Transmission Of Prosody (TIP TOP), Konstanz, 12-13 October.
Charlotte Alazard-Guiu, Nadia Yassine-Diab, Isabelle Ferrané, Lionel Fontan, Julien Pinquier, Thomas Pellegrini, Amélie Le Chevanton, Studying the impact of congruent and incongruent corrective gestures on the acquisition of L2 pronunciation, ONELA 2021: Outils et Nouvelles Explorations de la Linguistique Appliquée / Instrumentation and New Explorations in Applied Linguistics, Toulouse, 19-21 October 2021.
- UT2J – Octogone Lordat: Charlotte Alazard-Guiu (project coordinator)
- ARCHEAN Technologies – ARCHEAN LABS: Lionel Fontan
- UPS – LAIRDIL: Nadia Yassine-Diab
- UPS – IRIT – SAMoVA: Isabelle Ferrané, Thomas Pellegrini
- Two-year project supported by the Occitanie Regional Council (+ 9-month extension)
- Start time: 1st January 2019
- End time: 30th September 2021 (including a nine-month extension granted by the Regional Council)