UCSD Jacobs School of Engineering
University of California San Diego
Pulse
 

Machine Vision's Brave New World

"The fundamental challenges in computer vision are not solved. But there is now recognition that, thanks to Moore's Law, vision algorithms that require extremely intense processing have become much more accessible." —Mohan Trivedi

Can computers 'see'? They can do that and more, if they are hooked up to camera sensors and use processing to recognize patterns from one video frame to the next. In a world of proliferating cameras, they promise greater public safety—but potentially less privacy. Enter a unique computer vision technology developed in the lab of electrical and computer engineering (ECE) professor Mohan Trivedi. Here, a video feed showing pedestrians in a walkway morphs into a simulated environment, with people now represented as colored rectangular boxes. The pedestrians' identities are protected, but their movements can be tracked and analyzed.

"We designed it as a privacy filter, limiting the intrusion at the point where an image is captured and not allowing the detail to be transmitted upstream," says Trivedi. "Privacy should not be an afterthought. Law enforcement could take off the filter if they detect suspicious activity, but this contextualization technology should give society at large more confidence that privacy can be maintained even in a world where thousands of cameras go online every day."
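A filter of the kind Trivedi describes can be sketched in a few lines: given per-frame person detections from some upstream detector, each detected region is overwritten with a solid colored box before the frame is transmitted. The function, parameter names and box color below are illustrative assumptions, not the lab's implementation.

```python
import numpy as np

def privacy_filter(frame, boxes, color=(0, 200, 0)):
    """Replace each detected person region with a solid colored box.

    frame: H x W x 3 image array; boxes: list of (x, y, w, h) person
    detections supplied by any pedestrian detector. Returns a redacted
    copy, leaving the original frame untouched.
    """
    out = frame.copy()
    for (x, y, w, h) in boxes:
        out[y:y + h, x:x + w] = color  # solid box hides identifying detail
    return out
```

In a deployed system the raw pixels would stay on the device, with only the redacted frames (and box trajectories for tracking) sent upstream.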

Computer vision has come a long way in the eight years since the director of UCSD's Computer Vision and Robotics Research lab began using the technology in a 'smart room' for new-age videoconferencing. Caltrans later asked Trivedi to apply his expertise to smart roads—camera systems on highways to help prevent or mitigate accidents. His deployment of distributed interactive video arrays (DIVAs) caught the eye of automakers, and in quick succession DaimlerChrysler, Volkswagen-Audi and Nissan began funding Trivedi's research into the use of computer-vision systems on-board vehicles, notably to reduce driver distraction. The lab has maintained research sponsorships from a variety of federal and state agencies, including NSF, DoD and the UC Discovery Grant program.

Major automakers are sponsoring UCSD research into the use of computer vision systems on-board vehicles to reduce driver distraction.

Now Trivedi has come full circle, and is marrying his lab's research on smart rooms and vehicles to the vision systems his team developed for outdoor spaces. The result is SHIVA, short for Systems for Human Interactivity Visualization and Analysis. The new lab and its infrastructure include dozens of cameras, microphone arrays and motion sensors to harness computer vision for analyzing body movements and interaction with the surrounding environment, even under widely varying lighting conditions. The DIVAs incorporate a range of color, omnidirectional (360-degree), infrared, thermal and stereo (3D) cameras.

"Our ultimate objective is for the spaces themselves to maintain an awareness of the surroundings, while extracting human body movements and tracking bodies indoors, outdoors, and in mobile environments," says Trivedi. "We are not specifically trying to recognize individual faces, but rather movements that may be meaningful, so a person could point at a screen and say 'enlarge', and the room would enlarge a section of what's displayed based on where the person is pointing or looking."

The SHIVA lab allows Trivedi and his fellow researchers to develop and test new algorithms to dissect how human bodies move within a scene. Postdoctoral researcher Sangho Park is analyzing the articulated movements of individual body parts, alone and in concert with other body parts, while fellow postdoc Tarak Gandhi investigates person and vehicle tracking, camera calibration, 3D geometry and motion algorithms for event recognition. Graduate student Shinko Cheng uses thermal cameras and a derived volumetric representation of the hand to capture its gestures, modeling the palm and 15 other segments with 24 degrees of freedom. To know where someone is looking, researchers are working on algorithms to analyze head poses as well as the location of eyes, lips and other facial landmarks. Ph.D. candidate Joel McCall has developed a technique for analyzing facial affect—for instance, whether the individual is drowsy, or angry. Postdoctoral researcher Kohsia Huang has developed a fully integrated system for tracking, voxelization, gesture recognition, face capture and face recognition using omni-directional cameras. Trivedi's team also collaborates closely with other UCSD faculty including psychologist Harold Pashler, cognitive scientist Jim Hollan, electrical engineer Bhaskar Rao and structural engineer Ahmed Elgamal.
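The pointing scenario in Trivedi's quote reduces to simple 3D geometry once the sensors supply eye and fingertip positions: extend the eye-to-fingertip ray until it meets the display. The sketch below assumes a planar screen at a known depth; the function and variable names are illustrative, not the lab's code.

```python
def pointing_target(eye, fingertip, screen_z):
    """Given 3D positions of the eye and fingertip (e.g. from a stereo
    camera rig), extend the eye->fingertip ray to the plane z = screen_z
    and return the (x, y) screen point being pointed at."""
    ex, ey, ez = eye
    fx, fy, fz = fingertip
    t = (screen_z - ez) / (fz - ez)  # ray parameter where the plane is hit
    return (ex + t * (fx - ex), ey + t * (fy - ey))
```

The hard part, of course, is not this geometry but robustly recovering the eye and fingertip positions from video in the first place.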

So far Trivedi's major projects have been in automotive and homeland security applications. Most recently, Trivedi launched a $500,000, 18-month project in August to prototype low-cost systems for continually tracking persons and analyzing activity in outdoor spaces, notably around critical infrastructure such as airports, border crossings and bridges. The systems would be capable of alerting a command center of any potentially threatening event, e.g., when a person enters an airport terminal and leaves a package unattended, or when someone enters a security perimeter without permission. The project is funded by the federal inter-agency Technical Support Working Group responsible for counter-terrorism technology research.

Earlier this year, Volkswagen and the UC Discovery Grant program awarded Trivedi $462,000 over two years to develop dynamic visual displays and vision modules for analyzing a driver's state and intent.

"We appreciate that the fundamental challenges in computer vision are not solved," admits Trivedi. "But there is now recognition that, thanks to Moore's Law, vision algorithms that require extremely intense processing have become much more accessible."

Ground Zero

Mohan Trivedi is not the only Jacobs School researcher making waves in computer vision. Over the summer, UCSD appeared to be Ground Zero for academic research in the field. Jacobs School faculty and graduate students helped organize the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) in San Diego. CSE professor David Kriegman co-chaired the conference, and ECE's Trivedi delivered one of the keynotes. UCSD was also well represented among the peer-reviewed papers accepted at CVPR, with four co-authored by electrical engineer Nuno Vasconcelos and his graduate students, and three by computer scientist Serge Belongie's team. Vasconcelos and Belongie were also singled out this year by the National Science Foundation for Faculty Early Career Development (CAREER) awards.

Belongie works in the area of 3D scene analysis and reconstruction. He is trying to make it easier for computers to recognize and track moving, non-rigid objects such as animals and humans in motion. "If you want to track a car using vision techniques, it is fairly easy as long as you know the shape of the car from the beginning," he explains. "But tracking a person who is running down a street is orders-of-magnitude more complicated."
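Belongie's car example can be illustrated with the simplest rigid-object tracker: exhaustive template matching, which works precisely because the object's appearance is fixed. The brute-force search below is a toy sketch under that assumption; a deformable target like a running person changes shape from frame to frame and defeats it, which is the harder problem his group studies.

```python
import numpy as np

def track_template(frame, template):
    """Locate a fixed-shape template in a grayscale frame by minimizing
    the sum of squared differences over every possible position.
    Returns the (row, col) of the best match."""
    H, W = frame.shape
    h, w = template.shape
    best, best_pos = float('inf'), (0, 0)
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            d = ((frame[y:y + h, x:x + w] - template) ** 2).sum()
            if d < best:
                best, best_pos = d, (y, x)
    return best_pos
```

Running this once per frame tracks a rigid object as long as its appearance stays close to the stored template.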

Like a medical researcher, Belongie has begun testing his algorithms on the lowly mouse. With funding from Calit2, his Smart Vivarium project aims to use networks of cameras for non-stop observation and analysis of laboratory mice, which have limited range of motion and are not as articulated as humans.

For his part, ECE's Nuno Vasconcelos is working on techniques that may accelerate the day when a computer can easily recognize millions of objects. He calls it 'weakly supervised recognition'—i.e., systems that can more easily detect and recognize objects in large image and video repositories. "We are hoping to lay the foundation for a long-term vision of recognition systems that would contain banks of recognition modules fully trainable by naive users, with minimal requirements in terms of manual data pre-processing and computational complexity," says Vasconcelos.
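A toy illustration of the weakly supervised setting (an assumption-laden sketch, not Vasconcelos' method): each training image is a 'bag' of local feature vectors carrying only an image-level label such as "object present", with no regions annotated by hand. Prediction then max-pools per-feature scores, so an image counts as positive if any region resembles the object.

```python
import numpy as np

def train_weak(bags, labels):
    """Each bag is an (n_features x d) array of local descriptors from one
    image; labels are image-level only (1 = object present, 0 = absent).
    Model each class crudely by the mean of all features pooled from its
    bags; no per-region supervision is used."""
    pos = np.vstack([b for b, y in zip(bags, labels) if y == 1])
    neg = np.vstack([b for b, y in zip(bags, labels) if y == 0])
    return pos.mean(axis=0), neg.mean(axis=0)

def predict(bag, model):
    """Score each local feature against both class means, then max-pool
    over the bag: the image is positive if its best region looks more
    like the object class than the background class."""
    mu_pos, mu_neg = model
    s_pos = -np.linalg.norm(bag - mu_pos, axis=1)
    s_neg = -np.linalg.norm(bag - mu_neg, axis=1)
    return int(s_pos.max() > s_neg.max())
```

The appeal of this setting is exactly what the quote describes: a naive user can label whole images ("contains a dog") without drawing a single bounding box.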

Related Links
Video: System for Human Interaction Visualization and Analysis (SHIVA) [RealPlayer required] Length: 5:06
Video: LISA 2005 [RealPlayer required] Length: 16:34
Website: Computer Vision and Robotics Research Laboratory
Website: Laboratory for Intelligent and Safe Automobiles
Website: Belongie and Kriegman Computer Vision Laboratory
Website: Vasconcelos Statistical Visual Computing Lab