Thomas Pellegrini @ University of Toulouse

Publications

International Journals

L. Cances, E. Labbé, T. Pellegrini, 2022. Comparison of semi-supervised deep learning algorithms for audio classification. EURASIP Journal on Audio, Speech, and Music Processing, 2022:23, https://doi.org/10.1186/s13636-022-00255-6. (HTML link)

L. Gelin, M. Daniel, J. Pinquier, T. Pellegrini, 2021. End-to-end acoustic modelling for phone recognition of young readers. Speech Communication, 134, pp. 71-84. (HTML link)

C.-E. Himeur, T. Lejemble, T. Pellegrini, M. Paulin, L. Barthe, N. Mellado, 2021. PCEDNet: A Lightweight Neural Network for Fast and Interactive Edge Detection in 3D Point Clouds. ACM Trans. on Graphics (TOG), 41:1, pp. 1-21. (HTML link)

A. Paroni, N. Henrich Bernardoni, C. Savariaux, H. Lœvenbruck, P. Calabrese, T. Pellegrini, S. Mouysset, and S. Gerber, 2021. Vocal drum sounds in human beatboxing: An acoustic and articulatory exploration using electromagnetic articulography. The Journal of the Acoustical Society of America, 149(1), pp. 191-206. (HTML link)

T. Pellegrini, L. Fontan, J. Mauclair, J. Farinas, C. Alazard-Guiu, M. Robert, P. Gatignol. Automatic Assessment of Speech Capability Loss in Disordered Speech, In ACM Trans. on Accessible Computing, ACM, Special Issue on Speech and Language Processing for AT (Part 1), Vol. 6:3, May 2015 (HTML link)

T. Pellegrini, R. Correia, I. Trancoso, J. Baptista, N. Mamede, M. Eskenazi. ASR-based exercises for listening comprehension practice in European Portuguese, In Computer Speech & Language, ISSN 0885-2308, 10.1016/j.csl.2013.02.004, Vol. 27:5, pp. 1127–1142, August 2013 (HTML link)

T. Pellegrini, L. Lamel. Automatic word decompounding for ASR in a morphologically rich language: application to Amharic, In IEEE Trans. on Audio, Speech and Language Processing, Volume 17:5, pp. 863-873, July 2009 (HTML link)

National Journals

S. Detey, L. Fontan, T. Pellegrini. Traitement de la prononciation en langue étrangère : approches didactiques, méthodes automatiques et enjeux pour l'apprentissage. In Traitement Automatique des Langues, Association pour le Traitement Automatique des Langues (ATALA), Vol. 57, N. 3, (online), 2016

International Conferences

2023

Thomas Pellegrini, Ismail Khalfaoui-Hassani, Etienne Labbé, Timothée Masquelier. Adapting a ConvNeXt model to audio classification on AudioSet. Accepted to INTERSPEECH, PDF

Etienne Labbé, Julien Pinquier, Thomas Pellegrini. Multitask learning in Audio Captioning: a sentence embedding regression loss acts as a regularizer. Accepted to EUSIPCO, PDF

Ismail Khalfaoui-Hassani, Thomas Pellegrini, Timothée Masquelier. Dilated convolution with learnable spacings. Proc. ICLR, Kigali, May 2023, PDF

2022

T. Pellegrini. Language-based audio retrieval with textual embeddings of tag names. In Proc. Workshop DCASE, Nancy, Nov. 2022, PDF

E. Labbé, T. Pellegrini, J. Pinquier. Is my automatic audio captioning system so bad? SPIDEr-max: a metric to consider several caption candidates. In Proc. Workshop DCASE, Nancy, Nov. 2022, PDF

2021

L. Gravellier, J. Hunter, P. Muller, T. Pellegrini, I. Ferrané. Weakly supervised discourse segmentation for multiparty oral conversations. In Proc. EMNLP, Punta Cana (Online), Nov. 2021

L. Gelin, T. Pellegrini, J. Pinquier, M. Daniel. Simulating reading mistakes for child speech Transformer-based phone recognition. In Proc. Interspeech, Brno (Online), Sept. 2021, PDF

T. Pellegrini. Deep-learning-based central African primate species classification with MixUp and SpecAugment. In Proc. Interspeech, Brno (Online), Sept. 2021, PDF, award for best results in the challenge, URL

E. Labbé, T. Pellegrini, IRIT-UPS DCASE 2021 Audio Captioning System. Technical Report, July 2021, Challenge results, PDF

T. Pellegrini, T. Masquelier. Fast threshold optimization for multi-label audio tagging using Surrogate gradient learning. In Proc. ICASSP, Toronto (Online), June 2021, Paper, code, video

L. Cances, T. Pellegrini. Comparison of Deep Co-Training and Mean-Teacher approaches for semi-supervised audio tagging. In Proc. ICASSP, Toronto (Online), June 2021

L. Pibre, S. Mechrouh, T. Pellegrini, J. Pinquier, I. Ferrané. Automatic macro segmentation into interaction sequence: a silence-based approach for meeting structuring. In Proc. CBMI, Lille (Online), Jun 2021

T. Pellegrini, R. Zimmer, T. Masquelier. Low-activity supervised convolutional spiking neural networks applied to speech commands recognition. In Proc. IEEE Spoken Language Technology Workshop, Shenzhen (Online), Jan. 2021

2020

T. Pellegrini, IRIT-UPS DCASE 2020 audio captioning system. Technical report, July 2020, Challenge results, PDF, code

2019

L. Cances, P. Guyot, T. Pellegrini. Evaluation of Post-Processing Algorithms for Polyphonic Sound Event Detection. In Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 318-322, October 2019. URL, Bibtex

L. Cances, T. Pellegrini, P. Guyot. Multi-task learning and post-processing for sound event detection. Technical Report DCASE 2019, October 2019. Code+results+PDF

A. Heba, T. Pellegrini, J.-P. Lorré, R. André-Obrecht. Char+CV-CTC: Combining Graphemes and Consonant/Vowel Units for CTC-Based ASR Using Multitask Learning. In Proc. INTERSPEECH, Graz, pp. 1611-1615, Sept. 2019. URL, Bibtex

T. Pellegrini, J. Farinas, E. Delpech, F. Lancelot. The Airbus Air Traffic Control speech recognition 2018 challenge: towards ATC automatic transcription and call sign detection. In Proc. INTERSPEECH, Graz, pp. 2993-2997, Sept. 2019. PDF

C. Gendrot, E. Ferragne, T. Pellegrini. Deep learning and voice comparison: phonetically-motivated vs. automatically-learned features. In Proc. International Congress of Phonetic Sciences (ICPhS 2019), Melbourne, International Phonetic Association, pp. 1-5, Aug. 2019.

E. Ferragne, C. Gendrot, T. Pellegrini. Towards phonetic interpretability in deep learning applied to voice comparison. In Proc. International Congress of Phonetic Sciences (ICPhS 2019), Melbourne, International Phonetic Association, pp. 1-5, Aug. 2019.

T. Rolland, A. Basarab, T. Pellegrini. Label-consistent sparse auto-encoders. In Proc. Workshop on Signal Processing with Adaptive Sparse Structured Representations, Toulouse, July 2019. PDF

J. Mamou, T. Pellegrini, D. Kouamé, A. Basarab. A convolutional neural network for 250-MHz quantitative acoustic-microscopy resolution enhancement. In Proc. IEEE International Engineering in Medicine and Biology Conference (EMBC 2019), Berlin, July 2019

T. Pellegrini, L. Cances. Cosine-similarity penalty to discriminate sound classes in weakly-supervised sound event detection. In Proc. International Joint Conference on Neural Networks (IJCNN 2019), Budapest, 14/07/2019-19/07/2019, INNS : International Neural Network Society, pp. 1-8, July 2019. PDF

2018

L. Cances, T. Pellegrini, P. Guyot. Sound event detection from weak annotations: weighted-GRU versus multi-instance-learning. In Proc. IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE 2018), Surrey, UK, 19/11/2018-20/11/2018, Tampere University of Technology, pp. 64-68, Nov 2018. URL

S. Cosentino, E. Randria, J.-Y. Lin, T. Pellegrini, S. Sessa, A. Takanishi. Group Emotion Recognition Strategies for Entertainment Robots. In Proc. IEEE/RSJ International Conference on Intelligent RObots and Systems (IROS 2018), Madrid, Oct 2018. PDF

2017

C. Manenti, T. Pellegrini, J. Pinquier, Unsupervised Speech Unit Discovery Using K-means and Neural Networks, in Proc. International Conference on Statistical Language and Speech Processing, pp. 169-180, Le Mans, Oct. 2017

A. Heba, T. Pellegrini, T. Jorquera, R. André-Obrecht, J.-P. Lorré, Lexical Emphasis Detection in Spoken French using F-BANKs and neural networks, in Proc. International Conference on Statistical Language and Speech Processing, pp. 241-249, Le Mans, Oct. 2017

T. Pellegrini, Densely Connected CNNs for Bird Audio Detection, in Proc. European Signal and Image Processing Conference (EUSIPCO 2017), pp. 1734-1738, EURASIP, Kos, Sept. 2017

E. Randria, S. Cosentino, J.-Y. Lin, T. Pellegrini, S. Sessa, A. Takanishi, Audience mood estimation for the Waseda Anthropomorphic Saxophonist 5 (WAS-5) using cloud cognitive services, in Proc. of the 35th Annual Conference of the Robotics Society of Japan (RSJ) - Special Issue on Robotics and AI, pp. 1-4, Tokyo, Sept. 2017

C. Senac, T. Pellegrini, F. Mouret, J. Pinquier, Music Feature Maps with Convolutional Neural Networks for Music Genre Classification, in Proc. International Workshop on Content-Based Multimedia Indexing (CBMI 2017), Florence, pp. 1-5, June 2017

2016

T. Pellegrini, S. Mouysset, Inferring Phonemic Classes from CNN Activation Maps Using Clustering Techniques, in Proc. INTERSPEECH, San Francisco, Sept. 2016 (PDF, slides)

C. Manenti, T. Pellegrini, J. Pinquier, CNN-Based Phone Segmentation Experiments in a Less-Represented Language, in Proc. INTERSPEECH, San Francisco, Sept. 2016 (PDF, slides)

V. Laborde, T. Pellegrini, L. Fontan, J. Mauclair, H. Sahraoui, J. Farinas, Pronunciation Assessment of Japanese Learners of French with GOP Scores and Phonetic Information, in Proc. INTERSPEECH, San Francisco, Sept. 2016 (PDF, poster)

P. Guyot, A. Eldridge, Y. Chen Eyre-Walker, A. Johnston, T. Pellegrini, M. Peck, Sinusoidal modelling for ecoacoustics, in Proc. INTERSPEECH, San Francisco, Sept. 2016 (PDF)

M. Thlithi, J. Pinquier, T. Pellegrini, R. André-Obrecht. Filterbank coefficients selection for segmentation in singer turns, In Proc. CBMI, pp. 1-6, Bucharest, June 2016

2015

T. Pellegrini, V. Barriere. Time-continuous estimation of emotion in music with recurrent neural networks, in Working Notes Proceedings of the MediaEval 2015 Workshop, Wurzen, Sept. 2015 (PDF)

T. Pellegrini. Comparing SVM, softmax, and shallow neural networks for eating condition classification, in Proc. INTERSPEECH, Dresden, Sept. 2015 (PDF)

M. Thlithi, C. Barras, J. Pinquier, T. Pellegrini, Singer diarization: application to ethnomusicological recordings, in Proc. International workshop on Folk Music Analysis (FMA 2015), Paris, June 2015

V. Bragard, T. Pellegrini, J. Pinquier, Pyc2Sound: a Python tool to convert images into sound, in Proc. Audio Mostly, Thessaloniki, ACM, Nov. 2015

L. Fontan, T. Pellegrini, J. Olcoz, A. Abad, Predicting disordered speech comprehensibility from Goodness of Pronunciation scores, in Proc. Workshop on Speech and Language Processing for Assistive Technologies (SLPAT 2015), Dresden, Sept 2015.

National Conferences

L. Gelin, T. Pellegrini, J. Pinquier, M. Daniel. Améliorations d'un système Transformer de reconnaissance de phonèmes appliqué à la parole d'enfants apprenants lecteurs. In Actes JEP, June 2022, Noirmoutier. PDF

J. Farinas, T. Pellegrini, J. Pinquier. Comparaison de systèmes automatiques de reconnaissance grand vocabulaire appliqué à de la parole pathologique. In Proc. JPC, May 2019, Mons

T. Pellegrini, L. Fontan, H. Sahraoui. Réseau de neurones convolutif pour l'évaluation automatique de la prononciation, In Proc. des Journées d'Etudes sur la Parole (JEP), pp. 624-632, Paris, July 2016 (PDF in French)

C. Manenti, T. Pellegrini and J. Pinquier. Influence de la quantité de données sur une tâche de segmentation de phones fondée sur les réseaux de neurones, In Proc. des Journées d'Etudes sur la Parole (JEP), pp. 392-400, Paris, July 2016 (PDF in French)

Misc

T. Pellegrini. La parole « non-standard » : un défi pour les outils de traitement automatique de la parole, invited talk at the study day Regards Croisés sur la Voix, Strasbourg, June 10, 2016 (slides PDF in French)

T. Pellegrini. Punctuation recovery with recurrent neural networks, presentation slides at SAMoVERT seminar, La Fraissinede, June 2015 (PDF in Franglish)

Thomas Pellegrini

Assistant Professor in Computer Science
