Multiple sources detection
Overview
Detecting when multiple harmonic sources are present is essential for structuring various types of audio content. We propose a method for detecting areas with simultaneous harmonic sources using graph analysis of the tracked main frequencies.
As our approach seems to work on choir detection, we propose to generalise it to identify overlapping harmonic sources using sinusoidal segments. Compared with the previous approach, no time-frequency restriction is applied: the spectrum is analysed for every frame and every frequency up to 3000 Hz.
Peak selection and tracking
To limit the impact of noise on the tracking, the approach no longer selects all the peaks but uses a piecewise linear threshold to keep only the most prominent ones. This threshold also takes into account the fact that amplitudes decrease with frequency, as shown in the figure below.
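The peak selection can be illustrated with a minimal sketch. The breakpoint values, the dB scale and the function names below are assumptions made for illustration only; the actual threshold shape used by the method is not given here. The sketch keeps a local maximum of the spectrum only if it lies above a threshold that decreases linearly, piece by piece, with frequency.

```python
def piecewise_linear_threshold(freq_hz, breakpoints):
    """Interpolate a threshold (dB) at freq_hz from (frequency, level) breakpoints."""
    pts = sorted(breakpoints)
    if freq_hz <= pts[0][0]:
        return pts[0][1]
    for (f0, t0), (f1, t1) in zip(pts, pts[1:]):
        if freq_hz <= f1:
            return t0 + (t1 - t0) * (freq_hz - f0) / (f1 - f0)
    return pts[-1][1]


def select_peaks(freqs_hz, mags_db, breakpoints, max_freq_hz=3000.0):
    """Keep the local maxima of one frame that reach the threshold."""
    peaks = []
    for i in range(1, len(mags_db) - 1):
        f = freqs_hz[i]
        if f > max_freq_hz:
            break  # the analysis stops at 3000 Hz
        is_local_max = mags_db[i] > mags_db[i - 1] and mags_db[i] >= mags_db[i + 1]
        if is_local_max and mags_db[i] >= piecewise_linear_threshold(f, breakpoints):
            peaks.append((f, mags_db[i]))
    return peaks


# Hypothetical breakpoints: the threshold drops as frequency rises.
BREAKPOINTS = [(0.0, -20.0), (1000.0, -30.0), (3000.0, -50.0)]

# Toy spectrum of a single frame (frequency bins in Hz, magnitudes in dB).
freqs = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0]
mags = [-60.0, -10.0, -60.0, -40.0, -60.0, -15.0, -60.0]

peaks = select_peaks(freqs, mags, BREAKPOINTS)
print(peaks)
```

In this toy frame the local maximum at 300 Hz is discarded because it lies below the threshold, while the peaks at 100 Hz and 500 Hz are kept.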

The tracking method is restored to its initial configuration, allowing only one successor per peak.
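As a rough sketch of what "one successor per peak" can mean in practice, the code below links peaks of consecutive frames greedily by frequency proximity. The greedy matching rule, the `max_jump_hz` parameter and the function name are assumptions for illustration, not the tracking algorithm of the publications; the only property taken from the text is that each peak is continued by at most one peak in the next frame.

```python
def track_peaks(frames, max_jump_hz=20.0):
    """Link per-frame peak frequencies into sinusoidal segments.

    frames: list of lists of peak frequencies (Hz), one list per frame.
    Returns a list of segments, each a list of (frame_index, frequency).
    """
    finished = []
    active = []  # segments that reached the previous frame
    for t, peaks in enumerate(frames):
        # All admissible (segment, peak) pairs, closest in frequency first.
        candidates = sorted(
            (abs(seg[-1][1] - f), si, pi)
            for si, seg in enumerate(active)
            for pi, f in enumerate(peaks)
            if abs(seg[-1][1] - f) <= max_jump_hz
        )
        used_segs, used_peaks = set(), set()
        for _, si, pi in candidates:
            if si in used_segs or pi in used_peaks:
                continue  # one successor per peak, one predecessor per peak
            active[si].append((t, peaks[pi]))
            used_segs.add(si)
            used_peaks.add(pi)
        # Segments without a successor end here.
        finished.extend(seg for si, seg in enumerate(active) if si not in used_segs)
        active = [seg for si, seg in enumerate(active) if si in used_segs]
        # Unmatched peaks start new segments.
        active.extend([(t, f)] for pi, f in enumerate(peaks) if pi not in used_peaks)
    finished.extend(active)
    return finished


# Toy input: peak frequencies for four consecutive frames.
frames = [[200.0, 400.0], [205.0, 398.0], [210.0], [212.0, 600.0]]
segments = track_peaks(frames)
for seg in segments:
    print(seg)
```

With this input the peak near 200 Hz is followed over all four frames, the one near 400 Hz ends after two frames, and the 600 Hz peak starts a new segment in the last frame.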
Graph analysis
To detect the different sources, our approach groups the sinusoidal segments that correspond to a single source. To group them, we use the harmonicity between their frequencies to create a graph of the links between sinusoidal segments.

Each connected component of the graph is therefore used as the signature of a source.
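The grouping step can be sketched as follows. The harmonicity test used here (the frequency ratio of two time-overlapping segments is close to an integer), the tolerance value and the `(start_frame, end_frame, mean_frequency)` representation of a segment are all assumptions for illustration; the exact harmonicity criterion of the method is not specified in this page. Connected components are computed with a small union-find structure.

```python
def are_harmonic(f_a, f_b, tol=0.03):
    """Assumed test: the higher/lower frequency ratio is close to an integer."""
    lo, hi = sorted((f_a, f_b))
    ratio = hi / lo
    nearest = round(ratio)
    return abs(ratio - nearest) <= tol * nearest


def source_signatures(segments, tol=0.03):
    """Group segments into connected components of the harmonicity graph.

    segments: list of (start_frame, end_frame, mean_frequency_hz).
    Returns a sorted list of groups, each a list of segment indices.
    """
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, (s_i, e_i, f_i) in enumerate(segments):
        for j in range(i + 1, len(segments)):
            s_j, e_j, f_j = segments[j]
            overlap_in_time = s_i <= e_j and s_j <= e_i
            if overlap_in_time and are_harmonic(f_i, f_j, tol):
                parent[find(i)] = find(j)  # add an edge: merge components

    groups = {}
    for i in range(len(segments)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy segments: harmonics of 100 Hz, then harmonics of 130 Hz starting later.
segments = [
    (0, 50, 100.0), (0, 50, 200.0), (0, 50, 300.0),
    (20, 80, 130.0), (20, 80, 260.0),
]
signatures = source_signatures(segments)
print(signatures)
```

Note that the 200 Hz and 300 Hz segments are not directly linked under this test (their ratio is 1.5), but they end up in the same component through the 100 Hz segment, which is why the connected component, rather than the individual link, serves as the signature.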

Decision
Once every source is localised in the time-frequency space, we detect source-overlapping areas simply by counting the number of source signatures present at each time.
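The counting step can be sketched as below. The segment representation `(start_frame, end_frame, mean_frequency)`, the example signatures and the `min_sources=2` decision rule are assumptions for illustration: a signature is considered present at a frame if at least one of its segments covers that frame, and a region is flagged as overlapping when two or more signatures are present.

```python
def count_sources_per_frame(segments, signatures, n_frames):
    """Count how many source signatures are present at each frame."""
    counts = [0] * n_frames
    for group in signatures:
        active_frames = set()
        for idx in group:
            start, end, _ = segments[idx]
            active_frames.update(range(max(start, 0), min(end, n_frames - 1) + 1))
        for t in active_frames:
            counts[t] += 1  # each signature counts once per frame
    return counts


def overlap_regions(counts, min_sources=2):
    """Return (first_frame, last_frame) runs where enough sources coexist."""
    regions, start = [], None
    for t, c in enumerate(counts):
        if c >= min_sources and start is None:
            start = t
        elif c < min_sources and start is not None:
            regions.append((start, t - 1))
            start = None
    if start is not None:
        regions.append((start, len(counts) - 1))
    return regions


# Hypothetical segments and signatures (indices into `segments`).
segments = [
    (0, 50, 100.0), (0, 50, 200.0), (0, 50, 300.0),
    (20, 80, 130.0), (20, 80, 260.0),
]
signatures = [[0, 1, 2], [3, 4]]

counts = count_sources_per_frame(segments, signatures, n_frames=100)
regions = overlap_regions(counts)
print(regions)
```

Here the two signatures coexist between frames 20 and 50, which is the only region reported as overlapping.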
Contributors
- Maxime Le Coz (lecoz@irit.fr)
- Régine André-Obrecht
- Julien Pinquier
Main publications
Maxime Le Coz, Julien Pinquier, Régine André-Obrecht. Superposed Speech Localisation using Frequency Tracking (regular paper). In: INTERSPEECH, Lyon, 25/08/2013-29/08/2013, International Speech Communication Association (ISCA), p. 714-717, August 2013.
Maxime Le Coz, Julien Pinquier, Régine André-Obrecht, Julie Mauclair. Audio Indexing Including Frequency Tracking of Simultaneous Multiple Sources in Speech and Music (regular paper). In: International Workshop on Content-Based Multimedia Indexing (CBMI 2013), Veszprem, 17/06/2013-19/06/2013, IEEE, p. 23-25, June 2013.