The Laboratory of Ethnology and Comparative Sociology (LESC), which includes the Research Center for Ethnomusicology (CREM) and the center for teaching and research in American Indian ethnology (EREA), and the anthropology laboratory of the National Museum of Natural History (MNHN) need to index the audio archives they manage while keeping track of their contents, which is a long, tedious and expensive task.
During the CNRS interdisciplinary summer school (Science and Voice 2010), a common interest emerged among acousticians, ethnomusicologists, and computer scientists: advanced audio analysis tools now exist, developed by indexing specialists (acousticians and computer scientists), that can make content access and indexing easier.
The context of this project is to index and improve access to the LESC audio archives: the CREM data and the EREA data on the Maya « singing/speaking » distinction, as well as the MNHN data (traditional African music). In 2007, as no open-source application existed for accessing the audio data recorded by researchers, the CREM-LESC, the LAM and the sound archives of the MNHN began designing an innovative and collaborative tool that meets professional needs (linked to the temporal span of the documents) while being adapted to the researchers' requirements. With financial support from the CNRS Très Grand Equipement (TGE) ADONIS and the Ministry of Culture, the Telemeta platform, developed by Parisson, has been online since May 2011.
On this platform, basic signal analysis tools are already available. However, a set of advanced and innovative tools is needed for automatic or semi-automatic indexing of this audio data, which includes sometimes long recordings with quite heterogeneous content and quality.
The aim of the DIADEMS project is to supply some of these tools and to integrate them into Telemeta while taking user needs into account. This requires the scientific objectives of the partners to be complementary:
For the technology providers, IRIT, LIMSI, LaBRI and LAM, the aim is twofold:
To provide existing technologies, such as speech and music detection and speaker segmentation. These tools aim at extracting homogeneous segments of interest for the users. These systems have been tested regularly in numerous national and international evaluation campaigns, in increasingly difficult contexts. However, none of these campaigns contains as much diversity as exists in the audio archives studied in this project. This heterogeneity is linked to the recording conditions, to the kinds of documents, and to their geographical origin. The challenge is therefore to adapt all these « state-of-the-art » systems to the users' needs.
To propose new tools for exploring the contents of the homogeneous segments. Research on the singing/speaking voice opposition, the singing voice, singing turns and musical similarity is not yet complete. A genuine research study on defining relevant features and how to take them into account still has to be carried out. Being able to interact with musicologists and ethnomusicologists is a major advantage in this context.
For ethnomusicologists and musicologists, the aims are different, depending on the usage:
For documentalists, the aim is to learn to use the tools and to contribute their practical knowledge in order to adapt the tools to their indexing needs. An important exchange must take place between the tool provider, the integrator and the user. The focus must be on the visualisation of the processing results, which should provide useful help for indexing.
For ethnomusicologists and musicologists, the aim goes beyond the indexing capabilities of the tools. There should therefore be exchanges with the technology providers to define which information retrieval tools are the most relevant.