Generic GLR/BIC Audio-Video Segmentation

Jpetiot / January 7, 2009 / Analysis

Context 

We hypothesize that basic audio or video features take homogeneous values within a given context, and that this homogeneity can be exploited by a GLR-BIC segmentation algorithm. Homogeneity is measured by how well the feature values can be described by a Gaussian distribution. The method applies the GLR (Generalized Likelihood Ratio) algorithm until it converges to the best partition of the signal into Gaussian distributions, then applies the BIC (Bayesian Information Criterion) to decide which candidate points are actual change points. This combined method is very precise, and the penalty coefficient that appears in the BIC expression remains constant, unlike in methods that rely on the BIC alone.
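To make the two criteria concrete, here is a sketch of the expressions commonly used for Gaussian GLR/BIC change detection. The page does not give its exact formulas, so the notation below (N, d, the covariance matrices, and the penalty coefficient written as lambda) is an assumption for illustration.

```latex
% Window of N feature vectors of dimension d, split at frame i into
% N_1 = i and N_2 = N - i vectors. \Sigma, \Sigma_1 and \Sigma_2 are the
% maximum-likelihood covariance matrices of the whole window and of each part.
\begin{align}
R(i) &= \frac{N}{2}\log|\Sigma| - \frac{N_1}{2}\log|\Sigma_1| - \frac{N_2}{2}\log|\Sigma_2| \\
\Delta\mathrm{BIC}(i) &= R(i) - \lambda P, \qquad
P = \frac{1}{2}\left(d + \frac{d(d+1)}{2}\right)\log N
\end{align}
```

In this formulation, the GLR step keeps the frame i that maximizes R(i) as the candidate change point, and the BIC step accepts that candidate only if \Delta\mathrm{BIC}(i) > 0, with \lambda playing the role of the fixed penalty coefficient.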

Overview

The proposed signal segmentation method follows four main steps, listed below and illustrated in Figure 1. Figure 1.a shows the theoretical segmentation points, for reference.

  • Splitting step (Figure 1.b): a GLR change point is detected in each fixed-size window;
  • Most probable point detection step (Figure 1.c): GLR is applied on overlapping, variable-size windows whose boundaries are every other point obtained at the previous step;
  • Re-adjustment step (Figure 1.d): GLR is applied repeatedly until the points stabilize;
  • Definitive change detection step (Figure 1.e): the BIC is applied with a fixed penalty coefficient.

Together, these steps aim to produce a segmentation that is as stable as possible and whose zones are homogeneous in terms of feature distributions.

Figure 1: GLR-BIC segmentation. 
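The four steps can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes full-covariance Gaussian models and the usual log-determinant form of the GLR, and the names `segment`, `best_split`, `MARGIN`, `window`, `lam` and `max_sweeps` are hypothetical. Steps (c) and (d) are merged into one loop that re-estimates each point between its two neighbours until nothing moves.

```python
import numpy as np

MARGIN = 10  # hypothetical: minimum number of frames kept on each side of a split


def _logdet(x):
    """Log-determinant of the maximum-likelihood covariance of x (frames x dims)."""
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    cov = cov + 1e-8 * np.eye(cov.shape[0])  # small ridge for numerical stability
    return np.linalg.slogdet(cov)[1]


def glr(seg, i):
    """Log generalized likelihood ratio for splitting seg at index i."""
    n = len(seg)
    return 0.5 * (n * _logdet(seg) - i * _logdet(seg[:i]) - (n - i) * _logdet(seg[i:]))


def best_split(x, start, end):
    """Most likely change point inside x[start:end], or None if the window is too short."""
    seg = x[start:end]
    if len(seg) < 2 * MARGIN:
        return None
    candidates = range(MARGIN, len(seg) - MARGIN + 1)
    return start + max(candidates, key=lambda i: glr(seg, i))


def delta_bic(x, left, point, right, lam):
    """GLR at `point` on x[left:right], minus the BIC penalty weighted by lam."""
    seg = x[left:right]
    d = x.shape[1]
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(len(seg))
    return glr(seg, point - left) - lam * penalty


def segment(x, window=100, lam=1.0, max_sweeps=20):
    """Return the detected change points (frame indices) of x (frames x dims)."""
    n = len(x)
    # (b) Splitting: one GLR candidate per fixed-size window.
    points = [best_split(x, s, min(s + window, n)) for s in range(0, n, window)]
    points = [p for p in points if p is not None]
    # (c) + (d) Re-estimate each point on the window bounded by its two
    # neighbours, and repeat until the points no longer move.
    for _ in range(max_sweeps):
        previous = list(points)
        for j in range(len(points)):
            left = points[j - 1] if j > 0 else 0
            right = points[j + 1] if j + 1 < len(points) else n
            new = best_split(x, left, right)
            if new is not None:
                points[j] = new
        if points == previous:
            break
    # (e) BIC with a fixed penalty coefficient: drop the weakest point
    # until every remaining point has a positive delta BIC.
    while points:
        bounds = [0] + points + [n]
        scores = [delta_bic(x, bounds[j], bounds[j + 1], bounds[j + 2], lam)
                  for j in range(len(points))]
        worst = int(np.argmin(scores))
        if scores[worst] > 0:
            break
        del points[worst]
    return points
```

Calling `segment(features)` on an array of shape (frames, dimensions) returns a sorted list of frame indices. The cap on the number of sweeps is a safeguard added here in case the points oscillate instead of stabilizing.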

Applications

Projects

  • EPAC Project (ANR 2006 Masse de données – Connaissances Ambiantes): Mass Audio Documents Exploration for Extraction and Processing of conversational speech;
  • MUSCLE Project;
  • A01 Project: Automatic Indexing of speakers in audio-visual sequences using a multimodal approach.

Contributors

Publications

  • Elie El Khoury, Christine Senac, Régine André-Obrecht. Speaker Diarization: Towards a more Robust and Portable System. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), Honolulu, Hawaii, USA, 15/04/07-20/04/07, IEEE, p. 489-492, 2007.
  • Elie El Khoury, Gaël Jaffré, Julien Pinquier, Christine Senac. Association of Audio and Video Segmentations for Automatic Person Indexing. In: International Workshop on Content-Based Multimedia Indexing (poster session) (CBMI 2007), Bordeaux, France, 25/06/07-27/06/07, IEEE, p. 287-294, 2007.
  • Elie El Khoury, Sylvain Meignier, Christine Senac. Projet EPAC: segmentation et regroupement en locuteurs. In: JEP, 2008.
