Human motion analysis

Jpetiot / January 7, 2006 / Analysis

Context

Human motion analysis has been addressed in different ways, depending on the intended goal. Methods based on low-level or high-level features, such as optical flow, have been proposed in the past. Most of them are dedicated to one specific task, such as recognizing a particular motion, and are therefore difficult to reuse in another context. Hence the need for a model of the human body, which provides a sharper and more flexible description.
Our objective is to build a system that is as generic and adaptable as possible. No prior knowledge about the performed motion or the subject's posture is required. Only one view is used, and the only assumption is that the camera is static.

Overview

Hierarchical body model

The principle is to describe the human body at several levels, organized hierarchically. The first level is quite rough, and the model becomes sharper at each level down to the last one in the hierarchy. This concept can be compared to a multi-resolution approach: the idea is to refine the localization of the desired feature from one level to the next.

In our approach, the image resolution is constant; what varies is the level of representation. The underlying idea is to first provide a fairly general description of the subject's pose. Depending on the application's goals, this degree of precision may or may not be sufficient. If it is not, the match is refined, starting from the first result, by using the next level of the model, which provides a sharper and more detailed representation. The same process is repeated until the expected level of precision is reached, within processing time constraints, or until the description cannot be refined further. In our implementation, we use a description composed of three levels (see Figure 1).
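The refinement loop described above can be sketched as follows. This is a minimal illustration, not the implementation used in the project: the function name `coarse_to_fine`, the `match` callback, the numeric precision score and the `time_budget_s` parameter are all assumptions introduced for the example.

```python
import time
from typing import Any, Callable, Optional, Sequence, Tuple


def coarse_to_fine(
    levels: Sequence[Any],
    match: Callable[[Any, Optional[Any]], Tuple[Any, float]],
    target_precision: float,
    time_budget_s: Optional[float] = None,
) -> Tuple[int, Any]:
    """Match model levels from coarsest to finest.

    `match(level, previous_result)` is a hypothetical callback returning
    (result, precision). Each level is seeded with the previous result.
    The loop stops when the precision is sufficient, when the optional
    time budget is spent, or when no finer level exists.
    Returns (index of the last level matched, its result).
    """
    if not levels:
        raise ValueError("at least one model level is required")
    start = time.monotonic()
    result = None
    reached = 0
    for index, level in enumerate(levels):
        result, precision = match(level, result)
        reached = index
        if precision >= target_precision:
            break  # precise enough for the application
        if time_budget_s is not None and time.monotonic() - start >= time_budget_s:
            break  # processing time constraint
    return reached, result
```

With three levels, an application that only needs a rough pose would stop at level 0, while a more demanding one would continue down to level 2.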

Figure 1: The three hierarchical levels of description of a human posture.

Model – Subject matching

The first knowledge about the subject's posture is obtained through its bounding box, which corresponds to level 0 in our model. As in many systems, a simple background subtraction retains only the subject's pixels. The model is matched at the region scale, with a similarity measure derived from the chamfer distance. The image is cut into search areas, one for each model element, and each element is matched against the pixels located within its own area. During the first iteration, since no a priori information about the subject's posture is available, the output of level 0 is cut according to the most probable locations of the subject's limbs in the image. Each model segment is matched within its own area, independently of the others.

When matching is performed on a video, the temporal relation between frames is exploited by redefining the search areas. An element is first matched at time t within its search area, which is an angular sector. From this result, a new search area is defined for time t+1: a satisfactory match leads to a smaller search space, whereas a poor match leads to a larger one.
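The steps of this stage can be illustrated with a small pure-Python sketch. It is only an assumption-laden example: grayscale images are plain lists of rows, the chamfer-style score is computed by brute force rather than with a distance transform, and the thresholds, the shrink and grow factors and all function names are hypothetical values chosen for illustration, not those of the actual system.

```python
from math import hypot


def foreground_mask(frame, background, threshold=30):
    """Simple background subtraction: True where the frame differs enough."""
    return [[abs(p - b) > threshold for p, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame, background)]


def foreground_points(mask):
    """List the (row, col) coordinates of the subject pixels."""
    return [(r, c) for r, row in enumerate(mask)
            for c, on in enumerate(row) if on]


def bounding_box(points):
    """Level 0: inclusive (top, left, bottom, right) box, or None if empty."""
    if not points:
        return None
    rows = [r for r, _ in points]
    cols = [c for _, c in points]
    return min(rows), min(cols), max(rows), max(cols)


def chamfer_score(model_points, subject_points):
    """Mean distance from each model point to its nearest subject pixel.

    Lower is better; 0.0 means every model point lies on the subject.
    """
    if not model_points or not subject_points:
        return float("inf")
    total = sum(min(hypot(mr - sr, mc - sc) for sr, sc in subject_points)
                for mr, mc in model_points)
    return total / len(model_points)


def update_sector(half_width_deg, score, good_score=2.0,
                  shrink=0.5, grow=1.5, min_width=5.0, max_width=90.0):
    """Angular search sector for time t+1 (all constants are assumptions).

    A good match at time t narrows the sector; a poor one widens it.
    """
    factor = shrink if score <= good_score else grow
    return max(min_width, min(max_width, half_width_deg * factor))
```

In a real system the subject points would be restricted to the search area of each model element before scoring, so that each segment is matched independently of the others.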

Feature correspondence at the sharper levels of the model relies on the results obtained at the upper levels. Each limb is decomposed into new sub-rectangles, and each of these elements is given a search area derived from the search area of the element it comes from.
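One possible way to picture this decomposition is sketched below. The geometry is a simplifying assumption: rectangles are axis-aligned `(x, y, width, height)` tuples, a limb is split along its longer side, and a child's search area is its own rectangle grown by a hypothetical `margin` and clipped to the parent's search area. The source does not specify these details.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height)


def split_rect(rect: Rect, parts: int = 2) -> List[Rect]:
    """Split a limb rectangle into `parts` equal sub-rectangles
    along its longer side."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    x, y, w, h = rect
    if h >= w:
        step = h / parts
        return [(x, y + i * step, w, step) for i in range(parts)]
    step = w / parts
    return [(x + i * step, y, step, h) for i in range(parts)]


def child_search_area(child: Rect, parent_area: Rect, margin: float) -> Rect:
    """Grow the child rectangle by `margin` on every side, then clip it
    to the search area of the element it comes from."""
    x, y, w, h = child
    px, py, pw, ph = parent_area
    left = max(px, x - margin)
    top = max(py, y - margin)
    right = min(px + pw, x + w + margin)
    bottom = min(py + ph, y + h + margin)
    return left, top, max(0.0, right - left), max(0.0, bottom - top)
```

Clipping to the parent's area keeps each finer-level search consistent with the match already obtained at the level above.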

Contributors

Thomas Fourès,
Philippe Joly (contact)

Main publications

Thomas Foures, Philippe Joly. Scalability in human shape analysis. In: IEEE International Conference on Multimedia & Expo (ICME 2006), Toronto – Ontario – Canada, 09/07/2006-12/07/2006, IEEE, p. 2109-2112, 2006.

Thomas Foures, Philippe Joly. Defining Search Areas to Localize Limbs in Body Motion Analysis. In: First Int. Workshop on Adaptative Multimedia Retrieval (AMR 2003) – LNCS 3094 (2004), Hamburg, Germany, 15/09/2003-16/09/2003, Springer-Verlag Berlin Heidelberg 2004, ISBN 3-540-22163-8, p. 147-163, September 2003.

Thomas Foures, Philippe Joly. A multi-level model for 2D human motion analysis and description. In: Electronic Imaging Science and Technology 2003 – Internet Imaging IV, Santa Clara, California, USA, 21/01/2003-22/01/2003, SPIE and IS&T, USA, p. 61-71, January 2003.

Thomas Foures. Description analytique de la posture du corps humain pour l’indexation video. Doctoral thesis, Université Paul Sabatier, June 2007 (in French).