Similarity Matrix

Jpetiot/ janvier 7, 2006/ Analysis

Context

Similarity between two video documents is a concept which shall take into account their content as well as their structure, in terms of time order. We propose an algorithm performing this kind of comparison and from which we derive various applications such as a generic measure to estimate the “style similarity”, or a segmentation tool to split a long recording in programs or sequences. We consider that similarity in style relies on the occurrence of common elements between the compared documents, from a production point of view. Those more or less perceptually human common elements – we call production invariants – can be characterized by a combination of audiovisual characteristics. They can be highlighted by the fact that the dominant color (corresponding to a given set, a given lightning) evolves in the same way along two different documents from a low-level point of view, or the fact that a same commercial is repeated at different moment in a TV program at a higher level point of view. Measuring the degree of style similarity between two compared documents relies then, on the ability to quantify the occurrence of these common elements.

For three consecutive sequences of two video excerpts to be compared, we may reinforce the similarity notion when this similarity can be observed on several features at the same time. Therefore one similarity matrix is computed for each considered features before being summed. In this example, the first excerpt corresponds to the horizontal dimension, while the second one appears on the vertical one. S (resp. T, C and P) stands for a similarity in terms of Shape (Resp. Texture, Color and Position).

Overview

Some video features can be seen as time series. To measure a similarity between two given subsequences, the most popular methods used are LCSS (Longest cCmmon SubSequence) and DTW (Dynamic Time Warping) and their derivatives. The algorithm we developed for features comparison is aiming at reducing the computational cost while approximating the results of the LCSS algorithm. There are 2 steps in this algorithm.

Step 1: Recursive quadratic intersection

The goal here is to avoid useless similarity estimations. For two sequences S1,S2 of real values to be compared, we extract the min and the max. If there is no intersection between the so defined intervals [min1, max1] and [min2, max2], then the overall similarity between these two sequences is considered to be 0. Else, the sequences are split on two subsequences : (S11, S12) and (S21, S22) of equal length, and the same test is performed in a recursive manner on each four couples which can be compared: (S11,S21) ; (S11,S22) ; (S12,S21) and (S12,S22). This process is iterated until the sequences are shorter than a given predefined size (tmax). The output of this process is a set of subsequences couple of size close to tmax, which may present some similarity. The parameter tmax must be adjusted as a tradeof between performance (for low values of tmax) and precision (for higher values).

Step 2: Cover rate computation

To compare two sequences S1 and S2 shorter than a tmax duration, we use another algorithm called the “cover rate” which estimate the number of subsequences of length tmin which can be matched in the same order. This cover rate is a rough recursive estimation.

The parameter tmin defines the temporal granularity on which the similarity is estimated. Operators e and d are the morphological erosion and dilation of sequences I and J with a structuring element of size tmin. The cover rate is inserted in the matrix in each cell along the diagonal of the box defined by the sequences I and J.

Applications

We have observed that our algorithm approximate results of a classical algorithm fo sub-sequance matching call the LCSS (Longest Common Sub-Sequence) with an error rate of at most 6 %. In the same time, the computational cost is divided by a factor of 10 to 15.

Hereafter are a presented a set of matrices obtained on various kind of content

Comparison of two commercial breaks

This matrix was obtained while comparing two commercial breaks. A post processing aiming to smoothing the similarity coefficient has been applied on this matrix. We can observe diagonal segments with high coefficients (red pixels) corresponding to the same videos inserted at different places in the two compared files. We can also denote the presence of blocs which are resulting from the comparison of different videos, but for a same products. Those videos are produced following the same production rules.

Autosimilarity of a TV stream

This similarity matrix compares a recording of TV stream with itself. We can observe that programs can be easily localized because they correspond to blocs of high coefficients around the first diagonal.

Similarity matrices can be used for various purpose such as:

Defining a distance between video styles,
Identifying production invariants,
Splitting videos in sequences or TV streams in programs.

Projects

CHAPITRE (ANR project with NPTV, NDS and Expway)

Contributors

Siba Haïdar,
Bilal Chebaro,
Philippe Joly (contact).

Main publications

Siba Haidar, Philippe Joly, Bilal Chebaro. Mining for Video Production Invariants to Measure Style Similarity. In : International Journal of Intelligent Systems, Wiley, Vol. 21 N. 7, p. 747-763, july 2006.
Siba Haidar, Philippe Joly, Bilal Chebaro. Audiovisual production invariant searching. In: 1ère Conférence en Recherche d’Information et Applications (CORIA’04), Toulouse, p. 333-346, march 2004.
Siba Haidar. Comparaison des documents audiovisuels par matrice de similarité. Thèse de doctorat, Université Paul Sabatier, september 2005. (In french)