Automatic Human Face Segmentation from Video Sequences


Tania Douglas, Researcher, Human Science
I am working at NHK STRL as a post-doctoral research fellow sponsored by the Japanese Science and Technology Agency (STA). I joined in January 1999, just after completing my Ph.D. in Scotland. The research project I am involved in addresses automatic facial recognition from video sequences; my part of the project concerns segmenting faces from their background after they have been identified. Before joining STRL, my research specialty was medical imaging, and transferring my image processing skills from medical applications to facial recognition has been interesting and educational.
Our Laboratories have been conducting research on a face tracking and recognition system for the automatic indexing of video content. The research has reached the point where a prototype system can recognize face images taken from arbitrary and continuously changing angles. Although the system currently uses a limited database, its recognition accuracy has reached a level that makes practical use feasible.

Tania Douglas, an STA fellow at the STRL, has pursued research in precisely segmenting face regions based on the face recognition results. The recognition system produces estimates of the position of each face region in each frame, its size and angle, and the identity (ID) of the person. The objective of Dr Douglas's research has been to use this information to guide a precise region boundary estimation that is flexible enough to handle fine image variability beyond that estimated by the recognition system.
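As a rough illustration of how the recognition output could seed the boundary estimation, the sketch below turns one per-frame estimate (position, size, angle, ID) into a coarse elliptical mask that a finer boundary search could then refine. The article does not describe the actual method; the elliptical face model, the FaceEstimate fields, and the initial_face_mask function are assumptions made only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class FaceEstimate:
    """Hypothetical per-frame output of the recognition stage."""
    cx: float         # face centre, x (pixels)
    cy: float         # face centre, y (pixels)
    width: float      # estimated face width (pixels), must be > 0
    height: float     # estimated face height (pixels), must be > 0
    angle_deg: float  # in-plane rotation (degrees)
    person_id: str    # identity reported by the recognizer


def initial_face_mask(est, frame_w, frame_h, margin=1.0):
    """Return a frame_h x frame_w grid of 0/1 marking a rotated ellipse.

    The ellipse is only a starting region (an assumption of this sketch);
    a real system would refine its boundary against the image data.
    """
    a = 0.5 * est.width * margin    # semi-axis along the face width
    b = 0.5 * est.height * margin   # semi-axis along the face height
    cos_t = math.cos(math.radians(est.angle_deg))
    sin_t = math.sin(math.radians(est.angle_deg))
    mask = []
    for y in range(frame_h):
        row = []
        for x in range(frame_w):
            dx, dy = x - est.cx, y - est.cy
            # Rotate the pixel offset into the ellipse's own axes.
            u = dx * cos_t + dy * sin_t
            v = -dx * sin_t + dy * cos_t
            inside = (u / a) ** 2 + (v / b) ** 2 <= 1.0
            row.append(1 if inside else 0)
        mask.append(row)
    return mask
```

The margin parameter (also hypothetical) enlarges or shrinks the seed region, which is one simple way to leave room for image variability that the recognition estimate does not capture.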
This segmentation technology will enable the following automatic video processing functions, leading to new program production possibilities:
Without using "chroma keying," the system will be able to extract only the image of a face from a video sequence and automatically replace the background, with reduced distortion at the boundary of the face region.
The system will be able to automatically apply a mosaic effect to the face region, following the person's movement.
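Both functions above reduce to simple per-pixel operations once a face mask is available for each frame. The following is a minimal sketch under stated assumptions: frames are grayscale grids of integers, the mask is a binary grid of the same size, and the function names replace_background and mosaic_face are hypothetical. It is not the laboratory's implementation, and a hard 0/1 mask does nothing to reduce boundary distortion; that would need a soft-edged mask.

```python
def replace_background(frame, mask, background):
    """Keep frame pixels where mask is 1; elsewhere use the new background."""
    return [
        [f if m else b for f, m, b in zip(frow, mrow, brow)]
        for frow, mrow, brow in zip(frame, mask, background)
    ]


def mosaic_face(frame, mask, block=8):
    """Pixelate only the masked pixels.

    Each masked pixel is replaced by the integer mean of the masked
    pixels in its block x block tile; unmasked pixels are left untouched.
    """
    h, w = len(frame), len(frame[0])
    out = [row[:] for row in frame]  # copy so the input frame is unchanged
    for by in range(0, h, block):
        for bx in range(0, w, block):
            cells = [
                (y, x)
                for y in range(by, min(by + block, h))
                for x in range(bx, min(bx + block, w))
                if mask[y][x]
            ]
            if not cells:
                continue  # no face pixels in this tile
            mean = sum(frame[y][x] for y, x in cells) // len(cells)
            for y, x in cells:
                out[y][x] = mean
    return out
```

Applying either function frame by frame, with the mask recomputed for each frame, makes the effect follow the person's movement.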

In the medium to long term, this technology is expected to be one of the most important basic technologies for the following two systems:
An object coding system, which will achieve optimal coding for the image region corresponding to each object and for the image background.
A multimedia coding system, which will display object-related information according to a user's interests by attaching metadata to each individual object region.

A facial segmentation example is shown in the figure. This face recognition technology is expected to lead to the development of methods for the extraction of human body regions from video sequences through the introduction of new schemes such as body models.