MIT Artificial Intelligence Laboratory

The primary goal of the Artificial Intelligence Laboratory is to understand how computers can be made to exhibit intelligence. Current research in the area of vision at the Laboratory includes work on such diverse issues as object recognition, navigation, scene understanding, and active vision. We also have projects that address visual learning and biological vision. Particularly exciting applications of our vision research include our image-guided surgery project and our intelligent room project. The faculty members involved in vision-related research include Professor Eric Grimson, Professor Berthold Horn, Professor Tomas Lozano-Perez, Professor Tomaso Poggio, and new faculty member Paul Viola.

Demonstration Abstracts

The MIT Cheap Vision Machine (CVM)
Chris Barnhart and Ian Horswill
We will demonstrate a very low-cost vision system that packages a 60 MFLOPS digital signal processor with high-speed memory and a memory-mapped RGB/NTSC frame grabber and display. The system draws less than 10 watts and is small enough for embedded applications. It forms the basis of several of the real-time vision demos.

Fast Object Recognition in Noisy Images Using Simulated Annealing
Margrit Betke
We will present an automatic object recognition system based on fast simulated annealing. Object recognition is formulated as the problem of finding the best match between a hypothesized object and an image, with the normalized correlation coefficient as the match measure. Templates are generated on-line during the search by transforming model images. Simulated annealing reduces search time by orders of magnitude relative to exhaustive search. The algorithm is applied to the recognition of landmarks, e.g., traffic signs, by a navigating robot, and the system's performance is illustrated on real-world images of complicated scenes containing traffic signs.
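The two ingredients named above can be sketched in miniature. The following is an illustrative 1-D version, not the system's code: `ncc` is the standard normalized correlation coefficient, and `anneal_match` is a generic simulated-annealing search over template placements; the function names, the 1-D setting, and the cooling schedule are all assumptions made for illustration.

```python
import math
import random

def ncc(patch, template):
    """Normalized correlation coefficient between two equal-length signals.

    Invariant to affine intensity changes: +1 means a perfect (scaled,
    shifted) match, -1 a perfect inverted match.
    """
    n = len(template)
    mp = sum(patch) / n
    mt = sum(template) / n
    num = sum((p - mp) * (t - mt) for p, t in zip(patch, template))
    dp = math.sqrt(sum((p - mp) ** 2 for p in patch))
    dt = math.sqrt(sum((t - mt) ** 2 for t in template))
    return num / (dp * dt) if dp and dt else 0.0

def anneal_match(signal, template, steps=2000, t0=1.0, seed=0):
    """Search for the template placement maximizing NCC by simulated
    annealing over the shift parameter, instead of exhaustive search."""
    rng = random.Random(seed)
    max_shift = len(signal) - len(template)

    def score(s):
        return ncc(signal[s:s + len(template)], template)

    shift = rng.randrange(max_shift + 1)
    best = shift
    for k in range(steps):
        temp = t0 * (1 - k / steps) + 1e-9      # linear cooling schedule
        cand = min(max(shift + rng.choice([-2, -1, 1, 2]), 0), max_shift)
        d = score(cand) - score(shift)
        # Accept improvements always, worsenings with Boltzmann probability.
        if d > 0 or rng.random() < math.exp(d / temp):
            shift = cand
            if score(shift) > score(best):
                best = shift
    return best
```

In the real system the search space covers full template transformations (generated on-line from model images), not just a 1-D shift, which is where annealing's advantage over exhaustive search comes from.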

Synthesizing Virtual Views of Faces
David Beymer
We demonstrate a technique that, given one view of a face, synthesizes new views of the face as seen from different viewpoints or with different expressions. To synthesize these "virtual" views, a 2D deformation is measured between 2D views of a prototype face and then mapped onto the given target face. Come see virtual views of your own face, synthesized by a completely automatic technique in near real time!
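The final step, applying a measured 2D deformation to a target image, can be sketched as a simple inverse warp. This is a toy illustration with nearest-neighbor sampling and an assumed dense flow-field representation, not the demo's implementation:

```python
def warp(image, flow):
    """Inverse-warp a 2D image (list of rows) by a per-pixel flow field.

    flow[y][x] = (dy, dx): the output pixel at (y, x) is read from the
    input at (y + dy, x + dx), clamped to the image bounds.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            sy = min(max(y + dy, 0), h - 1)   # clamp source row
            sx = min(max(x + dx, 0), w - 1)   # clamp source column
            out[y][x] = image[sy][sx]
    return out
```

In the virtual-views setting, `flow` would be the deformation measured between two prototype views (e.g., frontal and rotated), transferred to the target face before warping.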

Medical Image Registration
Gil Ettinger
A key problem in the effective analysis of 3D medical imagery is the registration of scans across different coordinate frames: across modalities, across time, or across patients. We are developing automated 3D medical image registration algorithms that employ a combination of energy-minimization surface alignment techniques to achieve accurate and robust alignment of 3D data sets. We have applied these techniques to two problems: (1) image-guided surgery, in which we register MR imagery to the patient's coordinate frame to generate "enhanced reality visualizations" of the patient's internal anatomy, and (2) 3D change detection, in which structural anatomical changes are tracked over time.
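Energy-minimization surface alignment can be illustrated in miniature with an ICP-style loop: alternately match each moved source point to its nearest target point, then update the transform to minimize the summed squared residuals. The sketch below is an assumption-laden toy (2-D points, translation-only, least-squares update), whereas the actual system aligns 3-D surfaces with richer transforms:

```python
def align_translation(src, dst, iters=10):
    """Estimate a 2-D translation aligning point set src to point set dst
    by alternating nearest-neighbor matching and least-squares updates."""
    tx, ty = 0.0, 0.0
    for _ in range(iters):
        moved = [(x + tx, y + ty) for x, y in src]
        # Step 1: nearest-neighbor correspondences under the current transform.
        pairs = []
        for mx, my in moved:
            nx, ny = min(dst, key=lambda p: (p[0] - mx) ** 2 + (p[1] - my) ** 2)
            pairs.append(((mx, my), (nx, ny)))
        # Step 2: the translation minimizing summed squared residuals
        # is the mean residual over the matched pairs.
        dx = sum(n[0] - m[0] for m, n in pairs) / len(pairs)
        dy = sum(n[1] - m[1] for m, n in pairs) / len(pairs)
        tx += dx
        ty += dy
        if abs(dx) + abs(dy) < 1e-9:   # converged
            break
    return tx, ty
```

Each iteration is guaranteed not to increase the alignment energy, which is what makes this family of methods robust for surface registration.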

Grounding language in visual routines
Ian Horswill
We present a simple natural language understanding program that uses a real-time implementation of Ullman's visual routine processor theory to find the referents of simple noun phrases without a fully-articulated world model.

Image Analysis and Synthesis
Mike Jones and Steve Lines
We present an image analysis and synthesis system that analyzes line drawings of cartoon faces and then synthesizes a real image with roughly the same facial expression and pose. Both the analysis and synthesis modules use 2D prototype images to build a model.

An Active Attentive Visual System for Object Recognition
Aparna Lakshmi Ratan
We present an active, attentive vision system that finds target objects in a scene by integrating color and stereo cues to fixate candidate regions, then recognizes the target objects in those regions using alignment-style recognition methods.

Model Guided Correspondence for Recognition
Pamela Lipson
We present a model-guided approach to correspondence that efficiently and robustly establishes a pointwise correspondence between a model and an image. We have tested our approach within the framework of a linear-combination recognition scheme.

Visually-Guided Navigation in Rough Outdoor Terrain
Liana Lorigo
The task is autonomous obstacle avoidance in unmapped rocky terrain, and the platform is a small mobile robot. Preliminary results will be demonstrated.

Enhanced Reality Visualization
J.P. Mellor
Enhanced reality visualization is the process of enhancing an image by adding information that is not present in the original image. A wide variety of information can be added, ranging from hidden lines or surfaces to textual or iconic data about a particular part of the image. We will demonstrate enhancements that require geometrically accurate positioning.

Real-time Face Verification
Raquel Romano
We present a real-time face verification system which grabs a live image and automatically authenticates a given user by determining whether a frontal view of the subject is present in the image.

Vision and Touch Guided Manipulation
Salisbury and Slotine (PIs)
A high-performance robot and vision system capable of autonomously grasping stationary and moving objects.

Haptics
K. Salisbury (PI)
Systems which permit touching and physical interaction with virtual objects.

Human Face Detection in Cluttered Scenes
Kah-Kay Sung
We present a technique combining distribution-based modeling with example-based learning for finding human faces in cluttered scenes. During the demo, we will grab live images of subjects against a background of their choice and have the face detection algorithm locate the faces in the images.

Real-time vision-based robot mapping
Robert Thau
We present a robot that maps its immediate environs from camera data, in real time. The robot runs in an unrestricted office environment; a camera and odometry are its only sensors.

Gesture Recognition for Presentation Support
Mark Torrance
A demonstration that uses a motion-based visual tracking system developed by Sajit Rao, together with a continuous speech recognition system developed by the Spoken Language Systems group at the Laboratory for Computer Science, to support the use of audio-visual tools during a presentation.

Reubens: A Modular Visual Tracking System
Mike Wessler
Reubens is an active vision system that runs at video rate on a single C-31 DSP chip. It integrates independent saccade and smooth pursuit modules to make the complete system more robust than either of the modules alone.

Alignment by Maximization of Mutual Information
Paul Viola and William Wells
Over the last 30 years, the problems of image registration and recognition have proven more difficult than even the most pessimistic might have predicted. Progress has been hampered by the sheer complexity of the relationship between an image and an object, which involves the object's shape, surface properties, position, and illumination. We will present an alignment technique based on mutual information that applies both to recognizing real objects in images and to medical registration problems.
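The mutual-information criterion can be illustrated with a small histogram-based estimator. This is an illustrative sketch, not the authors' implementation: it assumes intensities normalized to [0, 1] and uses a simple joint-histogram estimate of MI, which an alignment procedure would then maximize over candidate transforms.

```python
import math
from collections import Counter

def mutual_information(a, b, bins=8):
    """Estimate the mutual information (in nats) between two intensity
    sequences of equal length, via a quantized joint histogram.

    MI = sum_{x,y} p(x,y) * log( p(x,y) / (p(x) * p(y)) ).
    High MI means one image's intensities predict the other's, even when
    the two are related by an arbitrary (non-linear) intensity mapping.
    """
    assert len(a) == len(b)
    n = len(a)
    # Quantize intensities (assumed in [0, 1]) into discrete bins.
    qa = [min(int(x * bins), bins - 1) for x in a]
    qb = [min(int(x * bins), bins - 1) for x in b]
    pa = Counter(qa)
    pb = Counter(qb)
    pab = Counter(zip(qa, qb))
    mi = 0.0
    for (i, j), c in pab.items():
        pxy = c / n
        # c * n / (pa[i] * pb[j]) == p(x,y) / (p(x) * p(y))
        mi += pxy * math.log(c * n / (pa[i] * pb[j]))
    return mi
```

Alignment by maximization of mutual information then amounts to sampling corresponding intensities under each candidate pose (or registration transform) and keeping the pose that maximizes this score; robustness to unknown illumination and inter-modality intensity relationships is what makes the criterion attractive for both object and medical alignment.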

Poster Titles

Error Propagation in Full 3d-from-2d Object Recognition
Tao Alter

An Analysis of Shashua and Ullman's Saliency Network
Tao Alter

Extracting Salient Contours Using Shortest Paths
Tao Alter

Segmentation of Brain Tissue from Magnetic Resonance Images
Tina Kapur

Object Recognition via Image Invariances
Pawan Sinha

Accurate Internal Camera Calibration using Rotation, with Analysis of Sources of Error
Gideon Stein


Back to ICCV '95 Home Page