Functional impact of speech disorders
The reduction of mortality and the extension of life expectancy following cancer make it a priority to manage the after-effects of the pathology and its treatments.
Oral and oropharyngeal cancers affect the structures involved in speech production, so the quality of life of these patients may be affected by changes in their communication skills.
In clinical practice, the evaluation of speech disorders relies mainly on perceptual assessments, which are subject to significant inter- and intra-rater variability. Automatic speech processing tools could make this evaluation more objective and reproducible.
The objective of this research is to develop an automatic tool for the analysis of pathological speech in patients treated for oral and oropharyngeal cancer.
This would allow a better adaptation of therapeutic strategies, taking into account the patients’ needs and the functional impact on their communication skills and quality of life.
This research is summarized in four sections:
- Systematic review
- C2SI index
- Spontaneous speech analysis
- Functional data and health-related quality of life
2.1. Systematic review
Context: A systematic review was conducted to describe the effects of head and neck cancer on speech intelligibility using acoustic analysis. It focuses on speech intelligibility, assessed by acoustic measures, in adults with oral or oropharyngeal cancer.
Materials and methods: Two databases (PubMed and Embase) were surveyed to retrieve article references. The methodology used, in accordance with the PRISMA recommendations, led to a final selection of 22 articles.
Results: The studies focus mainly on patients treated by surgery. The languages studied are varied, with a predominance of English and Dutch. Most studies (54%) concern small-volume tumours (T1 and T2 in the TNM classification). Perceptual assessment is the comparison criterion in 27% of cases. Characteristics of isolated phonemes are measured in 64% of studies.
Table 1: Constitution of the speech samples
|Speech sample|Number of studies (%)|
|Isolated phonemes (extracted from a read text; extracted from isolated words; combination of sustained vowels and phonemes in words)|14 (64%)|
|Syllables and diadochokinesis|2 (9%)|
|Read text|4 (17%)|
|Not reported|1 (5%)|
Nasality is mainly analyzed in patients treated for oropharyngeal cancer. Vowels are mainly studied by analyzing the formants and vowel space area, consonants by means of specific parameters according to their phonetic characteristics. Machine learning methods (with the use of cepstral coefficients), which are still uncommon, make it possible to classify the speech of patients with tumours of type T3 or T4 (according to the TNM classification) as “intelligible” or “not intelligible”.
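As an illustration of the vowel space area mentioned above, the following minimal sketch computes the area of the polygon formed by corner vowels in the F1-F2 plane, using the shoelace formula. The function name `vowel_space_area` and the formant values are hypothetical and chosen for illustration only; they do not come from the reviewed studies.

```python
def vowel_space_area(corners):
    """Return the area (in Hz^2) of the polygon whose vertices are
    (F1, F2) pairs, listed in order around the polygon (shoelace formula)."""
    n = len(corners)
    if n < 3:
        raise ValueError("at least three corner vowels are needed")
    twice_area = 0.0
    for i in range(n):
        f1_a, f2_a = corners[i]
        f1_b, f2_b = corners[(i + 1) % n]  # wrap around to close the polygon
        twice_area += f1_a * f2_b - f1_b * f2_a
    return abs(twice_area) / 2.0


# Hypothetical mean formants (Hz) for the corner vowels /i/, /a/, /u/.
example_corners = [(300, 2300), (750, 1300), (320, 800)]
example_area = vowel_space_area(example_corners)
print(f"Vowel space area: {example_area:.0f} Hz^2")
```

A reduced area is usually read as a sign of vowel centralization, which is why this measure appears in studies of speech after oral cavity surgery.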
Conclusion: The acoustic measurements used depend on the tumour location: nasality is mainly assessed in oropharyngeal cancers, and the characteristics of vowels and consonants in oral cavity cancers. Speech disorders induced by small volume tumours are measured by phonetic characteristics, while those induced by larger volume tumours are measured automatically. No trends were found depending on the type of treatment and language of the speech sample. The development of more complex automatic models, combining binary classification (intelligible or not), a description of low-level acoustic alterations, as well as more global elements of the spontaneous speech signal, could allow the functional and psychosocial impact of speech disorders to be measured.
2.2. C2SI index
The aim of the C2SI project (Carcinologic Speech Severity Index, Institut National du Cancer 2014-2018) was to build an automatic index to measure the impact of speech disorders on communication abilities in patients treated for oral and/or oropharyngeal cancer.
Objective: to assess the validity of the different measurement scores of speech disorders, resulting from an automatic signal analysis, in patients treated for upper aerodigestive tract cancer, to build a global automatic score.
Material and methods: 87 patients treated for oral cavity or oropharynx cancer and 42 controls performed various speech production tasks targeting vocal production, prosody, comprehensibility, acoustic-phonetic decoding, and intelligibility. The audio recordings of these productions underwent both human perceptual evaluation and automatic processing. Self-administered questionnaires on quality of life and perceived speech disability were given to the participants to study the links between speech disorder and perceived impact. Individual, clinical and treatment metadata were also collected. Construct validity, criterion validity and reliability were analysed. An automatic index was finally built by modeling.
Results: Among all the parameters that can be extracted by automatic processing of the speech signal, 6 were selected because they are consistent with the literature, satisfy construct validity by discriminating extreme groups, and correlate both with the perceptual score, used as the gold standard, and with the speech disability scores (criterion validity). A factor analysis confirms their structure in two domains: 2 parameters belong to the “voice” domain (interquartile range of the fundamental frequency, and amplitude instability), and 4 belong to the “speech” domain (likelihood scores in reading and in acoustic-phonetic decoding, cumulative rank and anomaly rate in acoustic-phonetic decoding). They show good internal consistency (Cronbach’s alpha greater than or equal to 0.90 in the “speech” domain). An automatic score with good metric qualities was then built by modeling these parameters.
Y(full_C2SI) = 11.26482 + (-0.0049184 * F0-IQR) + (-0.0946604 * Pitch instability AAA) + (-0.147016 * Likelihood score DAP) + (1.391981 * Likelihood score LEC) + (-2.09e-06 * Cumulative rank DAP) + (-0.0111486 * Anomaly rate DAP)
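For readers unfamiliar with the internal consistency measure reported in the results, the sketch below computes Cronbach's alpha from its standard definition: alpha = k / (k - 1) * (1 - sum of item variances / variance of the total score), where k is the number of items. The function name and the scores are hypothetical; this is not the project's data or analysis code.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    `items` is a list of k lists; each inner list holds one parameter's
    scores for the same n subjects, in the same subject order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are needed")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all items must cover the same subjects")
    # Total score of each subject across the k items.
    totals = [sum(col[j] for col in items) for j in range(n)]
    sum_item_variances = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Hypothetical standardized scores of 5 subjects on 3 parameters.
example_items = [
    [0.2, 1.1, -0.5, 0.9, -1.3],
    [0.1, 1.0, -0.7, 1.2, -1.1],
    [0.4, 0.8, -0.4, 0.7, -1.5],
]
print(f"alpha = {cronbach_alpha(example_items):.2f}")
```

Values close to 1 indicate that the items vary together across subjects, which is the property reported above for the “speech” domain.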
Discussion: Automatic speech processing makes it possible to define valid, reliable and reproducible parameters. For routine clinical use, the protocol may be simplified by reducing the number of tasks.
A reduced score was therefore built, designed to include as few tasks as possible without losing its good psychometric properties:
Y(reduced_C2SI) = 11.48726 + (1.52926 * Likelihood score LEC) + (-1.94e-06 * Cumulative rank DAP)
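The two equations above are plain linear combinations, so they can be applied directly once the automatic parameters are available. The sketch below copies the coefficients from the full and reduced formulas; the function names, the dictionary keys and the input values are hypothetical, and the units and scaling of each parameter are assumed to follow the C2SI protocol, which is not detailed here.

```python
# Coefficients copied from the full and reduced C2SI equations above.
FULL_INTERCEPT = 11.26482
FULL_COEFFS = {
    "f0_iqr": -0.0049184,
    "pitch_instability_aaa": -0.0946604,
    "likelihood_dap": -0.147016,
    "likelihood_lec": 1.391981,
    "cumulative_rank_dap": -2.09e-06,
    "anomaly_rate_dap": -0.0111486,
}
REDUCED_INTERCEPT = 11.48726
REDUCED_COEFFS = {
    "likelihood_lec": 1.52926,
    "cumulative_rank_dap": -1.94e-06,
}


def linear_score(intercept, coeffs, params):
    """Intercept plus the weighted sum of the required parameters."""
    missing = sorted(set(coeffs) - set(params))
    if missing:
        raise KeyError(f"missing parameters: {missing}")
    return intercept + sum(c * params[name] for name, c in coeffs.items())


def full_c2si(params):
    return linear_score(FULL_INTERCEPT, FULL_COEFFS, params)


def reduced_c2si(params):
    return linear_score(REDUCED_INTERCEPT, REDUCED_COEFFS, params)


# Hypothetical parameter values for one speaker (illustration only).
speaker = {
    "f0_iqr": 30.0,
    "pitch_instability_aaa": 5.0,
    "likelihood_dap": -2.0,
    "likelihood_lec": -1.5,
    "cumulative_rank_dap": 100000.0,
    "anomaly_rate_dap": 20.0,
}
print(f"full C2SI:    {full_c2si(speaker):.2f}")
print(f"reduced C2SI: {reduced_c2si(speaker):.2f}")
```

The reduced score only needs the reading likelihood score and the cumulative rank from acoustic-phonetic decoding, so extra keys in the input are simply ignored.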
However, the “social role functioning” dimension of the SF-36 (SF36-SF), a generic self-assessed quality of life questionnaire, was only weakly correlated with the perceptual (r = 0.13) and automatic (r = 0.31) scores. The link between the SF36-SF and the C2SI is shown in Figure 2 (x-axis: C2SI score, y-axis: SF36-SF score).
Regarding speech-related quality of life, the functional dimensions of the SHI and the PHI (two validated tools for assessing perceived speech impairment) are moderately correlated with the perceptual score (r = -0.39 for both questionnaires) and with the automatic score (r = -0.31 in both cases).
The correlations between the speech severity scores and the psychosocial dimensions of the questionnaires are also moderate (r = -0.50 between the perceptual score and the PHI “psychosocial” dimension; r = -0.52 between the automatic C2SI score and the “psychosocial” dimension).
This study shows that the correlation between the quality of life and the speech disorder severity is only moderate, whether the impairment is perceptually or automatically assessed. This requires studying the intermediate step between the speech disorder severity and the speech-related quality of life, which is the functional impact of the disorder on the patient’s everyday life activities.
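One way to read these coefficients is through the squared correlation, r², which gives the proportion of variance shared by two scores under a linear relationship. The short sketch below applies this to the values reported above. The dictionary labels are illustrative, and the computation is standard arithmetic rather than an analysis performed in the study.

```python
# Correlation coefficients as reported in the text above.
reported_r = {
    "SF36-SF vs perceptual score": 0.13,
    "SF36-SF vs automatic score": 0.31,
    "SHI/PHI functional vs perceptual score": -0.39,
    "SHI/PHI functional vs automatic score": -0.31,
    "PHI psychosocial vs perceptual score": -0.50,
    "psychosocial vs automatic C2SI score": -0.52,
}


def shared_variance(r):
    """Proportion of variance shared by two variables with correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient lies in [-1, 1]")
    return r * r


for label, r in reported_r.items():
    print(f"{label}: r = {r:+.2f}, r^2 = {shared_variance(r):.2f}")
```

Even the strongest coefficient (r = -0.52) corresponds to about 27% of shared variance, which is consistent with describing the link as only moderate.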
2.3. Spontaneous speech analysis
To this end, comprehensive models that combine different acoustic measures and speech processing tools, and thereby avoid perceptual biases, would allow this functional impact to be better taken into account. Further analyses are needed to determine whether these measures remain relevant on conversational speech, which is a more natural context of speech production.
The study of the speech of about twenty-five patients treated for oral cavity and oropharyngeal cancer will make it possible to meet these objectives, based on the following recordings:
- Vocal parameters: production of a sustained /a/
- Parameters at the segmental level: production of sentences containing all the consonants of French in the same vowel context, and a pseudo-word repetition task
- Semi-spontaneous speech: reading of a text and of sentences
- Spontaneous speech: recording of the interview for the ECVB questionnaire (measurement of the communicative impact of the speech disorder).
Our hypothesis is that the impact of speech disorders on communication and quality of life will be found in a spontaneous speech task, i.e. the one that is unconstrained and as close as possible to the daily life of the patient.
Several analyses are planned on spontaneous speech:
- At the signal level:
  - macrovariability parameters (fundamental frequency, intensity) and cepstral parameters such as CPP (Cepstral Peak Prominence), slope or tilt (Praat)
  - detection of voiced, unvoiced and pause segments (speechTools)
  - speech rate and rhythm (Praat and WebRTC)
- At the phonetic level: global phonetic inventory and analyses of confidence scores associated with recognized phonemes (TDNNf-HMM based on Kaldi)
- At the lexical level: lexical inventory and analyses of confidence scores (TDNNf-HMM), semantic and syntactic analysis of content.
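To make the signal-level timing measures more concrete, here is a minimal sketch that derives a pause ratio, a speech rate and an articulation rate from a segmentation into voiced, unvoiced and pause segments. It assumes that the segmentation and the syllable count have already been produced by an upstream tool; the `Segment` class, the `timing_summary` function and the example values are hypothetical and do not reproduce the Praat, speechTools or WebRTC interfaces.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    label: str    # "voiced", "unvoiced" or "pause"
    start: float  # seconds
    end: float    # seconds


def timing_summary(segments, n_syllables):
    """Basic timing measures from a labeled segmentation."""
    total = sum(s.end - s.start for s in segments)
    pause = sum(s.end - s.start for s in segments if s.label == "pause")
    speech = total - pause
    if total <= 0 or speech <= 0:
        raise ValueError("the recording must contain some speech")
    return {
        "pause_ratio": pause / total,               # share of time spent in pauses
        "speech_rate": n_syllables / total,         # syllables/s, pauses included
        "articulation_rate": n_syllables / speech,  # syllables/s, pauses excluded
    }


# Hypothetical 4-second excerpt containing 14 syllables.
example_segments = [
    Segment("voiced", 0.0, 1.0),
    Segment("pause", 1.0, 1.5),
    Segment("unvoiced", 1.5, 2.0),
    Segment("voiced", 2.0, 4.0),
]
summary = timing_summary(example_segments, n_syllables=14)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Separating speech rate from articulation rate helps distinguish slow articulation from frequent or long pauses, two patterns that may affect communication differently.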
2.4. Functional data and health-related quality of life
The functional outcome will be measured by the following elements:
- Communication needs and measurement of perceived functional impact: evaluation of social circles (ECCS), QFS (Zanello, 2006), DIP (Letanneux, 2013), ECVB (Darrigrand, 2000)
- Quality of life: EORTC QLQ-C30, QLQ-H&N35
The intermediate factors between speech and communication disorders on the one hand, and quality of life on the other, must also be taken into account:
- Cognitive and anxiety/depression status: MoCA, HAD (Zigmond & Snaith, 1983), EDP (Dolbeault, 2008)
- Associated oncological deficits: CHI (Balaguer, 2017)
- Perceived speech disability: PHI (Fichaux-Bourin, 2009; Balaguer, 2019)
Few of these questionnaires are validated in a head and neck cancer population. Several studies have been conducted, and others are still ongoing, to validate them in our population of interest.
- RUGBI: Agence Nationale de la Recherche, ANR-18-CE45-0008
- DAPADAF-E: Ministère de la Santé, DGOS, PHRIP-19-0004
- C2SI: Institut National du Cancer, 2014-135
Balaguer, M., Pommée, T., Farinas, J., Pinquier, J., Woisard, V., Speyer, R. Effects of oral and oropharyngeal cancer on speech intelligibility using acoustic analysis: Systematic review. Head and Neck. (2019). https://doi.org/10.1002/hed.25949
Balaguer M, Champenois M, Farinas J, Pinquier J, Woisard V. The (head and neck) carcinologic handicap index: validation of a modular type questionnaire and its ability to prioritise patients’ needs. European Archives of Oto-Rhino-Laryngology. 2020. https://doi.org/10.1007/s00405-020-06201-6
Woisard V, Astésano C, Balaguer M, Farinas J, Fredouille C, Gaillard P, Ghio A, Giusti L, Laaridh I, Lalain M, Lepage B, Mauclair J, Nocuadie O, Pinquier J, Pouchoulin G, Puech M, Robert D, Roger V. C2SI corpus: a database of speech disorder productions to assess intelligibility and quality of life in head and neck cancers. Language Resources and Evaluation. 2020. https://doi.org/10.1007/s10579-020-09496-3
Balaguer M, Farinas J, Fichaux-Bourin P, Puech M, Pinquier J, Woisard V. Validation of the French Versions of the Speech Handicap Index and the Phonation Handicap Index in Patients Treated for Cancer of the Oral Cavity or Oropharynx. Folia Phoniatr Logop. 2019:1-14. https://doi.org/10.1159/000503448
Balaguer, M., Boisguérin, A., Galtier, A., Gaillard, N., Puech, M., Woisard, V. Assessment of impairment of intelligibility and of speech signal after oral cavity and oropharynx cancer. European Annals of Otorhinolaryngology, Head and Neck Diseases (2019). https://doi.org/10.1016/j.anorl.2019.05.012
Balaguer, M., Boisguerin, A., Galtier, A. et al. Factors influencing intelligibility and severity of chronic speech disorders of patients treated for oral or oropharyngeal cancer. Eur Arch Otorhinolaryngol (2019). https://doi.org/10.1007/s00405-019-05397-6
Balaguer M, Percodani J, Woisard V. The Carcinologic Handicap Index (CHI): A disability self-assessment questionnaire for head and neck cancer patients. Eur Ann Otorhinolaryngol Head Neck Dis. 2017 Aug 18. pii: S1879-7296(17)30107-2. https://doi.org/10.1016/j.anorl.2017.06.010
Borggreven, P. A., Verdonck-De Leeuw, I. M., Muller, M. J., Heiligers, M. L. C. H., De Bree, R., Aaronson, N. K., & Leemans, C. R. (2007). Quality of life and functional status in patients with cancer of the oral cavity and oropharynx: Pretreatment values of a prospective study. European Archives of Oto-Rhino-Laryngology, 264(6), 651–657. https://doi.org/10.1007/s00405-007-0249-5
Corine Astésano, Mathieu Balaguer, Jerome Farinas, Corinne Fredouille, Alain Ghio, et al. Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer. Language Resources and Evaluation Conference (LREC), May 2018, Miyazaki, Japan. ⟨hal-01770168⟩
Middag, C. (2012). Automatic analysis of pathological speech. Doctoral dissertation, Ghent University, Department of Electronics and Information Systems, Ghent, Belgium.
Perneger, T. V., Leplège, A., Etter, J. F., & Rougemont, A. (1995). Validation of a French-language version of the MOS 36-Item Short Form Health Survey (SF-36) in young healthy adults. Journal of Clinical Epidemiology, 48(8), 1051–1060. https://doi.org/10.1016/0895-4356(94)00227-H
Rinkel, R. N., Leeuw, I. M. V., Van-Reij, E. J., Aaronson, N. K., & Leemans, R. (2008). Speech Handicap Index in patients with oral and pharyngeal cancer: better understanding of patients’ complaints. Head and Neck, 30, 868–874. https://doi.org/10.1002/HED
Fichaux-Bourin, P., Woisard, V., Grand, S., Puech, M., & Bodin, S. (2009). Validation d’un questionnaire d’auto-évaluation de la parole (Parole Handicap Index). Rev Laryngol Otol Rhinol, 130, 45–51.