Saving Face With Person Spotter

The developers of an advanced face-recognition technology say their system can pick human faces out of a crowd on a video image, then analyze their gestures and expressions.

An advanced face-recognition technology, currently being used to test theories of how the human brain recognizes images, may one day find a home in ATMs, hotel room-access systems - even PCs, in the form of new interfaces that might execute commands by tracking a user's gaze and facial expressions.

Person Spotter was developed jointly over the past few years by the University of Southern California and the University of Bochum in Germany, with funding from the Army Research Laboratory.

"What we have here is much more than face recognition," said Hartmut Neven, a research assistant professor at USC and one of three associates who worked with project director Dr. Christoph von der Malsburg to develop the software.

Neven said Person Spotter is a comprehensive architecture that can determine how many people are in its field of vision, along with their locations, hand gestures, and, to some degree, facial expressions. If an individual is in the system's database, it can identify him or her - even if there is some change in appearance, such as a different hairstyle, facial hair, or glasses.

Neven said that the system could also potentially perform demographic analysis - discerning the race, gender, and age of people in a group in its field of view.

Face detection and face recognition are different biometric problems. Face recognition - the ability to match a particular face against a database - is a long-researched field of computer vision. Face detection, on the other hand - where human faces are picked out of a given image - has received significant attention only since the early '90s, said Henry A. Rowley, a researcher at Carnegie Mellon's MURI project, which explores these areas.

Von der Malsburg, a neurophysiologist and brain theorist, said that once a face is detected, the image must undergo a mathematical transformation called a wavelet transform. This process filters the image into many new frequency-specific images, which are then used to analyze a given face. Von der Malsburg said that he modeled the technique after his theories on how the visual system works in vertebrates.
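
The filtering step described above can be sketched in a few lines of code. The example below is only an illustration, not the Person Spotter software: it assumes Gabor wavelets (a common choice in face-recognition research, though the wavelet family is not named here), and the function names, wavelengths, number of orientations, and kernel size are all hypothetical values picked for the sketch.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor wavelet: a plane wave under a Gaussian envelope.

    Assumption: the Gabor family is used here for illustration only.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Coordinate measured along the direction in which the wave travels
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * x_rot / wavelength)
    kernel = envelope * carrier
    # Subtract the mean so that flat image regions produce no response
    return kernel - kernel.mean()


def filter_bank_responses(image, wavelengths=(4.0, 8.0, 16.0),
                          n_orientations=4, size=31):
    """Filter one grayscale image into a set of frequency-specific images.

    Returns a dict mapping (wavelength, orientation_index) to a response
    magnitude image with the same shape as the input. The image must be
    at least `size` pixels in each dimension. Convolution is circular
    (done via the FFT), so responses wrap around at the image borders.
    """
    image = np.asarray(image, dtype=float)
    half = size // 2
    image_fft = np.fft.fft2(image)
    responses = {}
    for wavelength in wavelengths:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            kernel = gabor_kernel(size, wavelength, theta,
                                  sigma=0.5 * wavelength)
            # Embed the kernel in an image-sized array, centered at (0, 0)
            padded = np.zeros(image.shape, dtype=complex)
            padded[:size, :size] = kernel
            padded = np.roll(padded, (-half, -half), axis=(0, 1))
            filtered = np.fft.ifft2(image_fft * np.fft.fft2(padded))
            responses[(wavelength, k)] = np.abs(filtered)
    return responses
```

Calling `filter_bank_responses(image)` on a two-dimensional grayscale array returns twelve response images with the default settings, one per combination of wavelength and orientation. A recognition system could then compare the responses sampled at selected points on a face against stored values, though how Person Spotter does this is not described here.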

The first working model of the wavelet transform, developed by one of von der Malsburg's students, took three quarters of an hour to execute, but subsequent code optimization brought that down to 10 minutes or so. Now the process consumes only seconds on a Silicon Graphics workstation, von der Malsburg said.

The processing time will be reduced further still, to a fraction of a second, with the help of a specialized chip being developed in Germany by Siemens, the chip manufacturer, in collaboration with von der Malsburg's team.

"The chip is an array of digital signal processors, and what's special about it is that it is designed around the needs of this application," said von der Malsburg. The hardware will help in getting the video data into the system, and also directly execute the software's specific, time-consuming algorithms.

Von der Malsburg founded Eyematic Interfaces, a Los Angeles-based company, to market the system. Neven, a vice president of the company, said that the team is porting the system to PC-based machines from the high-end SGI system.

Neven said they already have interested customers who hope to use the system for ATM security; hotel, business, and residential access control; and point-of-sale security.

But the applications for this technology can go beyond security, von der Malsburg said.

"It is not clear where the market will open up first," he said. "I am quite sure that in eight or 10 years' time, there will be a multibillion dollar market yearly for video analysis systems - but it's not clear where the whole thing will start."

Rowley suggested that his face detection and recognition software might find application as a user interface for computers, where a user's gaze and focus could be used to track movement and information presented on a monitor or other output device.

The potential of this kind of technology for new forms of human-computer interaction is not lost on von der Malsburg, who said that by analyzing facial expressions, computers could begin to develop a closer, more personal relationship with people.

"Once this is readily available, in the sense that the software is flexible enough so you can control such things and have those systems learn to do new things," he said, "it's going to transform our lives in many ways."