LIMSI




CapRe

Visual perception tools for natural interaction
CapRe : A gaze capture and tracking system




Objective

In the near future, machines will be enhanced with perceptual abilities that allow them to interpret a range of human behaviors, exploiting both visual and verbal information from their users. We propose a non-intrusive vision system whose aim is to build perceptual tools that make natural interaction possible. This real-time, camera-based system is designed to track the gaze of a person in front of a machine from an image sequence (Fig. 1).

Description

We propose a wireless, inexpensive, real-time system (CapRe) that detects and tracks the components of the face in video sequences. CapRe combines several processing phases (Fig. 2), divided into two main processes. On the one hand, the "detection and tracking" process covers the first three phases, which mainly use image processing techniques and are strongly interdependent; these three phases alternate between two states, initialization and adaptation. On the other hand, the "measurement" process relies mostly on geometric computations and transformations.
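The control flow of these two processes can be summarized by the following minimal Python sketch. The phase functions, data structures and state handling are hypothetical placeholders chosen for illustration, not the actual CapRe implementation.

from enum import Enum, auto

class State(Enum):
    INITIALIZATION = auto()  # components detected from scratch, parameters estimated
    ADAPTATION = auto()      # components tracked, parameters updated incrementally

def detect_components(frame, params, state):
    """Placeholder for phases 1-3: face, nose and eye localization (None on failure)."""
    return {"face": (0, 0, 100, 120), "nose": (50, 70), "eyes": [(30, 45), (70, 45)]}

def update_parameters(params, components):
    """Placeholder for the photometric/geometric adaptation of recognition parameters."""
    return params

def measure_gaze(components):
    """Placeholder for the 'measurement' process: geometric computation of gaze direction."""
    return (0.0, 0.0, 1.0)

def run(frames):
    state, params = State.INITIALIZATION, {}
    for frame in frames:
        components = detect_components(frame, params, state)
        if components is None:
            state = State.INITIALIZATION   # tracking lost: re-initialize on the next frame
            continue
        params = update_parameters(params, components)
        state = State.ADAPTATION
        yield measure_gaze(components)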

CapRe first localizes and tracks the face, nose and eyes. The system continuously updates the recognition parameters to take into account photometric and geometric variations of the patterns found in the face components. CapRe then evaluates the orientation of the head and the face-relative orientation of the eyes in order to determine the gaze direction vector (Fig. 3).
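One simple way to combine a head orientation with a face-relative eye orientation into a single gaze direction vector is sketched below. The angle conventions and the composition by rotation matrices are assumptions for illustration, not the exact computation used by CapRe.

import numpy as np

def rotation_yaw_pitch(yaw, pitch):
    """Rotation matrix for a pose given as yaw (around y) and pitch (around x), in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    r_yaw = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    r_pitch = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    return r_yaw @ r_pitch

def gaze_vector(head_yaw, head_pitch, eye_yaw, eye_pitch):
    """Gaze = head rotation applied to the eye direction expressed in the face frame."""
    eye_dir_in_face = rotation_yaw_pitch(eye_yaw, eye_pitch) @ np.array([0.0, 0.0, 1.0])
    return rotation_yaw_pitch(head_yaw, head_pitch) @ eye_dir_in_face

# Example: head turned 10 degrees to the right, eyes looking 5 degrees up in the face frame.
g = gaze_vector(np.radians(10), 0.0, 0.0, np.radians(5))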

We gathered a corpus of 40 video recordings (15,000 images) of 32 different persons facing the screen (Fig. 4); each recording lasts approximately one minute. Using this corpus, an evaluation (Table 1) of the system was carried out for the first three phases, first with the initialization state only and then with both the initialization and adaptation states. We consider that face bounding-box localization fails if the box does not include the nose and both eyes. For nose or eye localization, an error occurs if the distance between the detected position and the actual position is greater than 5 millimeters. Table 1 reports the results of pupil localization for the full corpus and for the persons not wearing glasses; most of the errors for persons wearing glasses are due to light reflections.
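The success criteria above can be written as in the following sketch; the point formats and the pixel-to-millimeter scale factor are assumed for illustration only.

import math

def face_box_ok(box, nose, eyes):
    """Face localization succeeds only if the bounding box contains the nose and both eyes."""
    x, y, w, h = box
    return all(x <= px <= x + w and y <= py <= y + h for (px, py) in [nose, *eyes])

def point_ok(detected, reference, mm_per_pixel, tolerance_mm=5.0):
    """Nose/eye localization fails if the error to the reference position exceeds 5 mm."""
    dist_px = math.dist(detected, reference)
    return dist_px * mm_per_pixel <= tolerance_mm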

Results and prospects

Except for the 6th phase, all phases have been implemented for both states (initialization and adaptation). Phases 1, 2, 3 and 4 are integrated in the CapRe system; after evaluating them, we will integrate the remaining phases. In order to improve the robustness of the system, we plan to determine as many detection and recognition parameters as possible automatically, using stochastic methods. This requires building a larger corpus.

We plan to integrate the CapRe system into human-computer interface applications, such as the multimodal projects (Meditor, Mix3D) and the ARGo LSF gesture recognition system. We also envision using CapRe as a gaze measurement tool in cognitive experiments, in order to analyze the strategies used in tasks such as software training or Web information searching.

Last modification on 2006/12/01 by C. Collet