Towards automatic indexing by image recognition technology

Our Laboratories have been conducting research into the automatic indexing of video content. In this article, I will introduce a recognition system that detects, recognizes and tracks people's faces in video material.

The ability to recognize face images will deliver the following benefits and lead to improvements in program production efficiency.

It will enable a rapid search of video material using key information such as a performer's name, the number of people in a shot, or a person's movement. This will increase the efficiency of the video editing process.
Sequences will be located using meaningful search keys describing the composition of the shot, such as "bust shot of A," "conversation involving B and C," or "D moving from left to right."
This technology will enable video retrieval from the massive video archives which broadcasters maintain, using key information such as the name of an individual appearing in the video.
Combined with a robot camera, it will make possible the automatic tracking and shooting of a specific subject.
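
To make the idea of searching by shot composition concrete, the following is a minimal sketch of how per-shot face metadata might be stored and queried. It is not the Laboratories' implementation: the names FaceTrack, Shot, movement and find_shots, the normalized horizontal coordinate, and the 0.2 movement threshold are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FaceTrack:
    """One tracked face within a shot (hypothetical structure)."""
    name: str       # identity from recognition, or "" if not identified
    x_start: float  # horizontal face centre at shot start (0 = left, 1 = right)
    x_end: float    # horizontal face centre at shot end


@dataclass
class Shot:
    start_frame: int
    end_frame: int
    tracks: list = field(default_factory=list)


def movement(track, threshold=0.2):
    """Classify horizontal movement; the threshold is an assumed value."""
    dx = track.x_end - track.x_start
    if dx > threshold:
        return "left_to_right"
    if dx < -threshold:
        return "right_to_left"
    return "static"


def find_shots(shots, names=(), people=None, moving=None):
    """Return shots matching all given keys.

    names:  identities that must all appear in the shot
    people: exact number of faces in the shot, if given
    moving: (name, direction) pair, if given
    """
    results = []
    for shot in shots:
        present = {t.name for t in shot.tracks}
        if not set(names) <= present:
            continue
        if people is not None and len(shot.tracks) != people:
            continue
        if moving is not None:
            who, direction = moving
            if not any(t.name == who and movement(t) == direction
                       for t in shot.tracks):
                continue
        results.append(shot)
    return results


# Made-up index entries mirroring the example search keys in the text.
shots = [
    Shot(0, 120, [FaceTrack("A", 0.5, 0.5)]),
    Shot(121, 300, [FaceTrack("B", 0.3, 0.3), FaceTrack("C", 0.7, 0.7)]),
    Shot(301, 450, [FaceTrack("D", 0.1, 0.9)]),
]

conversation = find_shots(shots, names=("B", "C"), people=2)
d_moving = find_shots(shots, moving=("D", "left_to_right"))
```

Here "conversation involving B and C" is approximated as a shot containing exactly those two faces, and "D moving from left to right" as a sufficiently large change in D's horizontal position over the shot. A real index would need richer descriptions than this.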

We are currently developing a prototype system in which facial images are first registered in the system database. When faces then appear in an input video sequence, the system tracks their locations, sizes and angles and, for faces that have been registered, recognizes their identities.
Although research on face recognition technology is being pursued worldwide, there are difficulties yet to be overcome. These arise from the image variability associated with changes in facial expression, position, size and angle, lighting conditions, image background and motion, and so on. Our Laboratories will pursue research into overcoming these challenges, aiming for a system with flexible real-time recognition capability.
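
The register-then-recognize flow of the prototype can be sketched roughly as follows. This is an illustration under stated assumptions, not the prototype itself: the article does not say how faces are detected or compared, so this sketch assumes each face is already reduced to a numeric feature vector and matches it to the nearest registered vector by Euclidean distance. The names FaceDatabase and index_frame and the max_distance value are hypothetical.

```python
import math


class FaceDatabase:
    """Registered faces, each stored as (name, feature vector)."""

    def __init__(self, max_distance=0.5):
        # Assumed cut-off: faces farther than this from every
        # registered entry are treated as unregistered.
        self.max_distance = max_distance
        self._entries = []

    def register(self, name, features):
        self._entries.append((name, list(features)))

    def identify(self, features):
        """Return the nearest registered name, or None if none is close enough."""
        best_name, best_dist = None, float("inf")
        for name, ref in self._entries:
            d = math.dist(ref, features)
            if d < best_dist:
                best_name, best_dist = name, d
        return best_name if best_dist <= self.max_distance else None


def index_frame(db, detections):
    """Attach an identity (or None) to each detected face in one frame.

    Each detection is a dict with the face's location, size, angle and
    feature vector, as produced by some upstream detector (not shown).
    """
    return [
        {
            "x": d["x"],
            "y": d["y"],
            "size": d["size"],
            "angle": d["angle"],
            "name": db.identify(d["features"]),
        }
        for d in detections
    ]


# Made-up data: two registered people and one frame with two faces.
db = FaceDatabase()
db.register("A", [0.0, 0.0])
db.register("B", [1.0, 1.0])

frame = [
    {"x": 100, "y": 80, "size": 64, "angle": 0.0, "features": [0.1, 0.0]},
    {"x": 300, "y": 90, "size": 48, "angle": 15.0, "features": [5.0, 5.0]},
]
indexed = index_frame(db, frame)
```

In this sketch the second face is still reported with its location, size and angle but with no name, matching the behavior described above in which every face is tracked and only registered faces are identified. The variability in expression, angle and lighting noted above is exactly what makes a fixed distance threshold like this fragile in practice.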

Image recognition
With a view to applications in security, video retrieval and human interfaces, research on the automatic recognition of identity, movement, expression and so on is being actively pursued worldwide. Various trials are also under way on compiling highlights by detecting specific events in video, such as goals in soccer, home runs in baseball and smashes in tennis.

