Multiple sources detection

Jpetiot / January 7, 2013 / Analysis


Detecting when multiple harmonic sources are present is essential for structuring various types of audio content. We propose a method for detecting regions with simultaneous harmonic sources using graph analysis of the tracks of the main frequencies.

As our approach seems to work for choir detection, we propose to generalise it to identify overlapping harmonic sources using sinusoidal segments. Compared with the previous approach, no time-frequency restriction is applied: the spectrum is analysed for every frame and every frequency up to 3000 Hz.

Peak selection and tracking

To limit the impact of noise on the tracking, the approach no longer selects all the peaks but uses a piecewise linear threshold to select only the most prominent ones. This threshold also takes into account the fact that amplitudes decrease with frequency, as shown in the figure below.
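The peak selection can be sketched as follows. This is a minimal illustration, not the implementation used in the publications: the breakpoint values in `BREAKPOINTS`, the dB scale, and the function names `threshold_db` and `select_peaks` are all assumptions chosen for the example. Only the idea of a piecewise linear threshold that decreases with frequency, and the 3000 Hz limit, come from the text.

```python
from bisect import bisect_right

# Hypothetical breakpoints (frequency in Hz, threshold in dB).
# The values are illustrative only; the source does not give them.
BREAKPOINTS = [(0.0, -30.0), (1000.0, -40.0), (3000.0, -60.0)]


def threshold_db(freq_hz, breakpoints=BREAKPOINTS):
    """Piecewise linear threshold: interpolate between breakpoints, clamp outside."""
    freqs = [f for f, _ in breakpoints]
    if freq_hz <= freqs[0]:
        return breakpoints[0][1]
    if freq_hz >= freqs[-1]:
        return breakpoints[-1][1]
    i = bisect_right(freqs, freq_hz)
    (f0, t0), (f1, t1) = breakpoints[i - 1], breakpoints[i]
    return t0 + (t1 - t0) * (freq_hz - f0) / (f1 - f0)


def select_peaks(freqs_hz, mags_db, max_freq_hz=3000.0, breakpoints=BREAKPOINTS):
    """Keep the local maxima of one frame's spectrum that exceed the threshold.

    freqs_hz must be sorted in ascending order and aligned with mags_db.
    """
    peaks = []
    for k in range(1, len(mags_db) - 1):
        f, m = freqs_hz[k], mags_db[k]
        if f > max_freq_hz:
            break  # bins are sorted, nothing left below the frequency limit
        is_local_max = mags_db[k - 1] < m >= mags_db[k + 1]
        if is_local_max and m > threshold_db(f, breakpoints):
            peaks.append((f, m))
    return peaks


# Toy spectrum with three local maxima; only the strongest passes the threshold.
example_freqs = [0, 500, 1000, 1500, 2000, 2500, 3000]
example_mags = [-80, -20, -80, -48, -80, -70, -80]
example_peaks = select_peaks(example_freqs, example_mags)
```

Because the threshold falls with frequency, a weak high-frequency peak can still be selected while a peak of the same amplitude at low frequency is rejected.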

The tracking method is restored to its initial configuration, allowing only one successor per peak.   
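A minimal sketch of tracking with at most one successor per peak is given below. The greedy nearest-frequency matching, the `max_jump_hz` tolerance, and the function name `track_peaks` are assumptions made for illustration; the source only states that each peak is allowed a single successor.

```python
def track_peaks(frames, max_jump_hz=30.0):
    """Link peaks across consecutive frames into sinusoidal segments.

    frames: list of lists of peak frequencies (Hz), one list per frame.
    Returns a list of segments, each a list of (frame_index, frequency).
    Each peak gets at most one successor and at most one predecessor.
    """
    finished = []
    active = []  # segments whose last point lies in the previous frame
    for t, peaks in enumerate(frames):
        # All admissible links, closest in frequency first (greedy assumption).
        candidates = sorted(
            (abs(seg[-1][1] - f), si, pi)
            for si, seg in enumerate(active)
            for pi, f in enumerate(peaks)
            if abs(seg[-1][1] - f) <= max_jump_hz
        )
        used_segs, used_peaks = set(), set()
        for _, si, pi in candidates:
            if si in used_segs or pi in used_peaks:
                continue  # this segment or this peak is already linked
            active[si].append((t, peaks[pi]))
            used_segs.add(si)
            used_peaks.add(pi)
        # Segments without a successor are closed.
        finished.extend(seg for si, seg in enumerate(active) if si not in used_segs)
        next_active = [seg for si, seg in enumerate(active) if si in used_segs]
        # Unmatched peaks start new segments.
        for pi, f in enumerate(peaks):
            if pi not in used_peaks:
                next_active.append([(t, f)])
        active = next_active
    finished.extend(active)
    return finished


# Two tracks start together; one stops after two frames, a third starts later.
example_frames = [[200, 400], [205, 410], [208], [600]]
example_segments = track_peaks(example_frames)
```

With one successor per peak, a track never splits: when two peaks in the next frame are both within tolerance, only the closest continues the segment and the other starts a new one.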

Graph analysis

To detect the different sources, our approach groups the sinusoidal segments that correspond to a single source. To group them, we use the harmonicity between their frequencies to create a graph of links between sinusoidal segments.

Each connected component of the graph is then used as the signature of a source.
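The grouping step can be sketched with a union-find structure. The harmonicity test used here is an assumption, since the source does not define it: two segments are linked when they overlap in time and the higher mean frequency is close to an integer multiple of the lower one. The segment representation `(start_frame, end_frame, mean_freq_hz)`, the tolerance `tol`, and the function names are likewise hypothetical.

```python
def is_harmonic(f_low, f_high, tol=0.03, max_order=10):
    """Assumed criterion: f_high is within tol (relative) of n * f_low, n >= 2."""
    ratio = f_high / f_low
    n = round(ratio)
    return 2 <= n <= max_order and abs(ratio - n) <= tol * n


def group_segments(segments, tol=0.03):
    """Return the connected components of the harmonic-link graph.

    segments: list of (start_frame, end_frame, mean_freq_hz).
    Returns a sorted list of components, each a list of segment indices.
    """
    parent = list(range(len(segments)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            s1, e1, f1 = segments[i]
            s2, e2, f2 = segments[j]
            overlap_in_time = s1 <= e2 and s2 <= e1
            if overlap_in_time and is_harmonic(min(f1, f2), max(f1, f2), tol):
                parent[find(i)] = find(j)  # add an edge: merge the components

    groups = {}
    for i in range(len(segments)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two overlapping harmonic sources (about 200 Hz and 330 Hz) and a later isolated segment.
example_segments = [
    (0, 10, 200.0), (0, 10, 401.0), (2, 9, 598.0),
    (5, 15, 330.0), (5, 15, 660.0),
    (20, 30, 400.0),
]
example_groups = group_segments(example_segments)
```

Note that the second and third segments (401 Hz and 598 Hz) are not directly linked under this criterion; they end up in the same component only through their common fundamental, which is what using connected components rather than pairwise links provides.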


Once every source is localised in the time-frequency space, we detect source-overlapping areas simply by counting the number of source signatures present at each time.
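The counting step could look like the sketch below. It assumes a source is present at a frame when at least one of its segments covers that frame; the data layout (segments as `(start_frame, end_frame, mean_freq_hz)`, groups as lists of segment indices) and the function names are hypothetical.

```python
def count_sources_per_frame(segments, groups, n_frames):
    """Count how many source signatures are present at each frame."""
    counts = [0] * n_frames
    for group in groups:
        active_frames = set()
        for i in group:
            start, end, _ = segments[i]
            active_frames.update(range(max(start, 0), min(end, n_frames - 1) + 1))
        for t in active_frames:
            counts[t] += 1  # each source counts once per frame
    return counts


def overlap_regions(counts, min_sources=2):
    """Return (first_frame, last_frame) intervals with at least min_sources sources."""
    regions, start = [], None
    for t, c in enumerate(counts):
        if c >= min_sources and start is None:
            start = t
        elif c < min_sources and start is not None:
            regions.append((start, t - 1))
            start = None
    if start is not None:
        regions.append((start, len(counts) - 1))
    return regions


# Illustrative data: three sources, the first two overlap on frames 5 to 10.
example_segments = [
    (0, 10, 200.0), (0, 10, 401.0), (2, 9, 598.0),
    (5, 15, 330.0), (5, 15, 660.0),
    (20, 30, 400.0),
]
example_groups = [[0, 1, 2], [3, 4], [5]]
example_counts = count_sources_per_frame(example_segments, example_groups, n_frames=31)
example_overlaps = overlap_regions(example_counts)
```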


  • Maxime Le Coz
  • Régine André-Obrecht
  • Julien Pinquier

Main publications

Maxime Le Coz, Julien Pinquier, Régine André-Obrecht. Superposed Speech Localisation using Frequency Tracking (regular paper). In: INTERSPEECH, Lyon, 25/08/2013-29/08/2013, International Speech Communication Association (ISCA), p. 714-717, August 2013.

Maxime Le Coz, Julien Pinquier, Régine André-Obrecht, Julie Mauclair. Audio Indexing Including Frequency Tracking of Simultaneous Multiple Sources in Speech and Music (regular paper). In: International Workshop on Content-Based Multimedia Indexing (CBMI 2013), Veszprem, 17/06/2013-19/06/2013, IEEE, p. 23-25, June 2013.
