Monophony / Polyphony Distinction
In many fields of music analysis (for example, source separation and instrument recognition), it can be useful to know how many instruments are present or how many notes are played at the same time. We propose a method for the latter problem. Here, a “monophonic” sound is defined as one note at a time (either played by an instrument or sung by a singer), while a “polyphonic” sound is defined as several notes played simultaneously.
A brief overview of our method is given in Figure 1. The originality of our approach lies in the extracted features and in their modelling, which uses bivariate Weibull distributions.
Figure 1: Monophony / Polyphony distinction process
The parameters extracted from the signal come from the YIN algorithm, a well-known pitch estimator [de Cheveigné et al.]. This estimator outputs a value that can be interpreted as the inverse of a confidence indicator: the lower the value, the more reliable the estimated pitch.
We assume that the estimated pitch is reliable when a single note is present and unreliable when several notes are present. We therefore take as parameters the short-term mean and the short-term variance of this “confidence indicator”.
The joint distribution of these two parameters is modelled using bivariate Weibull distributions (see Figure 2), which have proved (see publications below) to fit the experimental distribution best.
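The feature extraction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the “confidence indicator” is the minimum of YIN's cumulative mean normalized difference function over the candidate lags, and the function names, frame length, hop size, lag range, and context length are all hypothetical choices.

```python
import numpy as np


def yin_indicator(frame, tau_max):
    """Minimum of YIN's cumulative mean normalized difference for one frame.

    Assumption: this minimum plays the role of the inverse confidence value.
    """
    w = len(frame) - tau_max  # integration window length
    taus = np.arange(1, tau_max + 1)
    # Difference function d(tau) = sum_j (x[j] - x[j + tau])^2
    d = np.array([np.sum((frame[:w] - frame[tau:tau + w]) ** 2) for tau in taus])
    # Cumulative mean normalization: d'(tau) = d(tau) * tau / sum_{k<=tau} d(k)
    cmnd = d * taus / np.maximum(np.cumsum(d), 1e-12)
    return float(cmnd.min())


def short_term_features(x, frame_len=1024, hop=256, tau_max=400, n_ctx=20):
    """Short-term mean and variance of the indicator over n_ctx frames."""
    starts = range(0, len(x) - frame_len + 1, hop)
    ind = np.array([yin_indicator(x[s:s + frame_len], tau_max) for s in starts])
    chunks = [ind[i:i + n_ctx] for i in range(len(ind) - n_ctx + 1)]
    means = np.array([c.mean() for c in chunks])
    variances = np.array([c.var() for c in chunks])
    return means, variances


# Synthetic check: one sinusoid versus two sinusoids a tritone apart.
sr = 16000
t = np.arange(sr) / sr
mono = np.sin(2 * np.pi * 220.0 * t)
poly = mono + np.sin(2 * np.pi * 311.13 * t)
mono_mean, mono_var = short_term_features(mono)
poly_mean, poly_var = short_term_features(poly)
```

On these synthetic signals, the short-term mean of the indicator should be lower for the single sinusoid than for the two-sinusoid mixture, which is the behaviour the features rely on. Real recordings would need tuning of the hypothetical parameters above.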
Figure 2: Estimated Weibull bivariate distributions – Left: Monophonic music – Right: Polyphonic music
A new method for estimating the parameters of a bivariate Weibull distribution, based on the method of moments, has been proposed (for more details, see the publications below).
This monophony/polyphony distinction is used as a preprocessing step in the singing voice detection process.
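As a simplified illustration of the method of moments, the sketch below fits a univariate Weibull distribution by matching the sample mean and variance. It is not the bivariate estimator proposed in the publications, whose details are not reproduced here; the function name and the bisection bounds are hypothetical. For shape k and scale lam, the mean is lam·Γ(1 + 1/k) and the variance is lam²·[Γ(1 + 2/k) − Γ(1 + 1/k)²], so the ratio variance/mean² depends on k alone.

```python
import math
import random


def weibull_moment_fit(samples, k_lo=0.1, k_hi=100.0, iters=100):
    """Method-of-moments fit of a univariate Weibull (shape k, scale lam)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    target = var / mean ** 2  # squared coefficient of variation

    def cv2(k):
        g1 = math.gamma(1.0 + 1.0 / k)
        g2 = math.gamma(1.0 + 2.0 / k)
        return g2 / g1 ** 2 - 1.0

    # cv2 decreases with k, so bisect on the shape parameter.
    for _ in range(iters):
        k_mid = 0.5 * (k_lo + k_hi)
        if cv2(k_mid) > target:
            k_lo = k_mid
        else:
            k_hi = k_mid
    k = 0.5 * (k_lo + k_hi)
    lam = mean / math.gamma(1.0 + 1.0 / k)
    return k, lam


rng = random.Random(0)
# random.weibullvariate takes (scale, shape) in that order.
data = [rng.weibullvariate(3.0, 2.0) for _ in range(100000)]
shape_hat, scale_hat = weibull_moment_fit(data)
```

With this many samples, the estimates should land close to the shape 2.0 and scale 3.0 used to generate the data. A bivariate fit would additionally need a moment that captures the dependence between the two parameters.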
Hélène Lachambre, Régine André-Obrecht, Julien Pinquier. Distinguishing Monophonies from Polyphonies using Weibull Bivariate Distributions. In: IEEE Transactions on Audio, Speech, and Language Processing (to appear).
Hélène Lachambre, Régine André-Obrecht, Julien Pinquier. Estimation des paramètres d’une loi de Weibull bivariée par la méthode des moments – Application à la séparation Monophonie / Polyphonie. In: XVIèmes Rencontres de la Société Francophone de Classification (SFC 2009), 2009 (in French).
[de Cheveigné et al.] A. de Cheveigné, H. Kawahara. YIN, a Fundamental Frequency Estimator for Speech and Music. In: Journal of the Acoustical Society of America, vol. 111, no. 4, pp. 1917-1930, 2002.