VS 2008

The Eighth International Workshop on Visual Surveillance

ECCV2008

 

Invited Talk

"Scene Understanding and Activity Recognition"

Dr François Brémond

PULSAR, INRIA, Sophia Antipolis, Nice, France

 
Abstract

Scene understanding is the process of perceiving, analyzing and elaborating an interpretation of a 3D dynamic scene observed through a network of sensors. This process consists mainly of matching information from the sensors observing the scene with models. Thus, to understand a scene is both to add semantics to and to extract semantics from the sensor data describing it. The scene can contain a number of physical objects of various types (e.g. people, vehicles) interacting with each other or with their environment (e.g. equipment), which may be more or less structured. The scene can last a few instants (e.g. the fall of a person) or a few months (e.g. the depression of an elderly person), and can be limited to a laboratory slide observed through a microscope or extend beyond the size of a city. Sensors are mostly cameras (e.g. omnidirectional, infrared), but may also include microphones and other sensors (e.g. optical cells, contact sensors, physiological sensors, radars, smoke detectors).

Scene understanding is influenced by cognitive vision and requires the melding of at least three areas: computer vision, cognition and software engineering. Scene understanding can achieve four levels of generic computer vision functionality: detection, localization, recognition and understanding. But scene understanding systems go beyond the detection of visual features such as corners, edges and moving regions to extract information about the physical world that is meaningful for human operators. A further requirement is to achieve more robust, resilient and adaptable computer vision functionalities by endowing them with a cognitive faculty: the ability to learn, adapt, weigh alternative solutions, and develop new strategies for analysis and interpretation. The key characteristic of a scene understanding system is its capacity to exhibit robust performance even in circumstances that were not foreseen when it was designed. Furthermore, a scene understanding system should be able to anticipate events and adapt its operation accordingly. Ideally, a scene understanding system should be able to adapt to novel variations of the current environment, generalize to new contexts and application domains, interpret the intent of underlying behaviors to predict future configurations of the environment, and communicate an understanding of the scene to other systems, including humans.

To make this approach concrete, my talk will address the challenges associated with the following main themes: perception for scene understanding - the perceptual world; maintenance of 3D coherency over time - the physical world; event recognition - the semantic world; and evaluation, control and learning - autonomous systems.

Biography

François Brémond is a researcher in the PULSAR team at INRIA Sophia Antipolis. He obtained his Master's degree in 1992 at ENS Lyon. He has conducted research in video understanding since 1993, both at Sophia Antipolis and at USC (University of Southern California), LA. In 1997 he obtained his PhD degree at INRIA in video understanding, and he then pursued his research as a postdoctoral researcher at USC on the interpretation of videos taken from a UAV (Unmanned Airborne Vehicle) in the DARPA project VSAM (Visual Surveillance and Activity Monitoring). In 2007 he obtained his HDR degree (Habilitation à Diriger des Recherches) at Nice University on Scene Understanding.

Dr Brémond designs and develops generic systems for dynamic scene interpretation. The targeted class of applications is the automatic interpretation of indoor and outdoor partially structured scenes observed in particular with monocular colour cameras. These systems detect and track mobile objects, which can be either humans or vehicles, and recognize their behaviours. He is particularly interested in filling the gap between sensor information (pixel level) and behaviour recognition (semantic level).

François Brémond is the author or co-author of more than 60 scientific papers on video understanding published in international journals and conferences. He is a reviewer for several international journals (IJHCS, IEEE Transactions on Neural Networks, IEEE Systems, Man and Cybernetics, PAAJ, Eurasip JASP) and conferences (CVPR, ICVS,…). He has also participated in six European projects (PASSWORDS, ADVISOR, AVITRACK, SERKET, CARETAKER, CoFriend), one DARPA project, seven industrial research contracts and several international cooperations (USA, Taiwan, UK, Belgium) in video understanding. For instance, he has managed to recognize a large variety of scenarios in different applications: 1) fighting, abandoned luggage, graffiti, fraud and crowd behavior in metro stations, on roads and onboard trains, 2) aircraft arrival, aircraft refueling and luggage loading/unloading on airport aprons, 3) bank attacks in bank agencies and office behavior for ambient intelligence, 4) access control in buildings and 5) wasp monitoring for biological applications.

   

Held in conjunction with ECCV 2008