0701, 2009

Monophony / Polyphony Distinction

Jpetiot/ January 7, 2009/ Analysis

Context: In many fields of music analysis (for example: source separation, instrument recognition, …), it can be useful to know how many instruments are present, or how many notes are played at the same time. We propose a method for the latter problem. Here, a “monophonic” sound is defined as one note played at a time (either played by an

0701, 2009

Generic GLR/BIC Audio-Video Segmentation

Jpetiot/ January 7, 2009/ Analysis

Context: We hypothesize that basic video or audio features take homogeneous values depending on the particular context; this homogeneity can be exploited by a GLR-BIC segmentation algorithm. The homogeneity criterion is evaluated by how well these feature values can be described by a Gaussian distribution. This method consists in applying the GLR algorithm until convergence to the best repartition of Gaussian

0701, 2009

Acoustic-to-articulatory Inversion

Jpetiot/ January 7, 2009/ Modeling

Context: The aim of acoustic-to-articulatory inversion is to recover the vocal tract shape from the acoustics of the speech produced. This recovery is done by estimating the positions of flesh points located on the lips, tongue, jaw, and sometimes velum. In our case, six sensors are positioned on the upper lip, lower lip, jaw, front of the tongue, middle of the tongue and back of the tongue. System Overview: First, an

0801, 2008

QUAERO

Jpetiot/ January 8, 2008/ Previous

Objectives: Quaero is a collaborative research and development program centered on developing multimedia and multilingual indexing and management tools for professional and general public applications such as the automatic analysis, classification, extraction and exploitation of information. The research aims to facilitate the extraction of information from unlimited quantities of multimedia and multilingual documents, including written texts, speech and music audio

0701, 2008

Program Boundaries Detection

Jpetiot/ January 7, 2008/ Applications

Overview: Very little research has been done so far on program boundary detection in TV broadcasts. All existing approaches are based on spatiotemporal modelling of the content and on decision rules. Currently, this is the only way to reach the semantic quality required by search engines. But only collections of recordings that follow the same structure can benefit from such methods. Furthermore,

0701, 2008

Adaptive User-Defined Similarity Measure

Jpetiot/ January 7, 2008/ Audiovisual Content Structuring

Context: For browsing audiovisual databases without being limited to a predefined application context, a user-dependent, interactive visual organization is desirable: given the same small subset S of documents, a user should be able to explore several geographical representations of it; the rest of the corpus (or a part of it) has to reorganize itself regarding

0801, 2007

MISTRAL

Jpetiot/ January 8, 2007/ Previous

Open Source Platform for Biometric Authentication. Problem description: Biometric user authentication aims to verify the identity of a person based on physical or behavioural characteristics, such as fingerprints, DNA, iris, face or voice. This topic has attracted significantly increased interest over the last decade, in the commercial field (secured access to sensitive information,

0801, 2007

EPAC

Jpetiot/ January 8, 2007/ Previous

Exploring large sets of audio documents for extraction and processing of conversational speech. Problem description: Very large collections of speech data are now available and have to be indexed to allow later retrieval of recorded information. The cost of manually transcribing audio recordings is high, especially when specific indexing is wanted, such as the main topic, keywords or the

0701, 2007

Audio Primary Component Detection

Jpetiot/ January 7, 2007/ Applications

Context: Primary component detection is a first step in audiovisual indexing. The objective here is to describe the type of data (speech, music, singing voice) in order to provide information such as which parts of the document to transcribe. In our studies, we consider four primary components: speech, music, singing voice, and jingles. We construct a detector for

0701, 2007

Audio Video Characters Labelling

Jpetiot/ January 7, 2007/ Applications

Context: Much work has been carried out on audiovisual content characterization, and particularly on person detection. The majority of these studies are monomedia and allow the detection of a person either by visual appearance in a frame (such as a face) or by voice. More recent work approaches video content analysis by integrating both acoustic and visual features,