The Recognition of Material Properties for Vision and Video Communication

MIT2000-02

Progress Report: July 1, 2001–December 31, 2001

Edward Adelson

 

 

Project Overview

How can we tell that an object is shiny or translucent or metallic by looking at it? Humans do this effortlessly, but machines cannot. Materials are important to humans and will be important to robots and other machine vision systems. For example, if a domestic cleaning robot finds something white on the kitchen floor, it needs to know whether it is a pile of sugar (use a vacuum cleaner), a smear of cream cheese (use a sponge), or a crumpled paper towel (grasp with fingers).

An object’s appearance depends on its shape, on the optical properties of its surface (the reflectance), on the surrounding distribution of light, and on the viewing position. All of these factors are combined in a single image. In the case of a chrome-plated object, the image consists of a distorted picture of the world; thus the object looks different every time it is placed in a new setting. Somehow humans are good at determining the "chromeness" that is common to all the chrome images.
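
One standard way to write this dependence is the reflection integral from radiometry. It is offered here as a sketch, not as notation taken from this project, and it assumes distant illumination and ignores interreflection and self-shadowing. The symbols are assumptions introduced for illustration: n(x) is the surface normal at point x (shape), f_r is the reflectance function, L_i is the incoming light from direction ω_i (illumination), ω_o is the viewing direction, and Ω is the hemisphere of directions around n(x).

```latex
L_o(\mathbf{x}, \omega_o) = \int_{\Omega} f_r(\omega_i, \omega_o)\, L_i(\omega_i)\, \bigl(\mathbf{n}(\mathbf{x}) \cdot \omega_i\bigr)\, d\omega_i
```

For an ideal mirror such as chrome, f_r concentrates on a single direction, so the integral reduces to a lookup into the surroundings along the reflected direction:

```latex
L_o(\mathbf{x}, \omega_o) = L_i\bigl(2\,(\mathbf{n}(\mathbf{x}) \cdot \omega_o)\,\mathbf{n}(\mathbf{x}) - \omega_o\bigr)
```

This is why the image of a chrome object is a copy of the world warped by the object's normals.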

 

Progress Through December 2001

Computational Algorithms for Reflectance Estimation

Previously, we built a system for classifying images of spheres based on the sphere’s reflectance. We then extended that system to support images of non-spherical objects with known geometry.

In the last six months, we have been investigating the performance of the system when the estimate of geometry is incorrect. We have also begun developing algorithms to handle incorrect estimates of geometry.

In addition, we have begun developing algorithms for the case in which no geometry estimate is available at all.

 

Psychophysics of Reflectance Estimation

Previously, we investigated how well humans could classify reflectances under different illuminations. In the last six months, we began investigating which specific properties of illumination affect this task. Specifically, we are interested in the characteristics of illumination that bias human judgments of reflectance.

Recovering Intrinsic Images from Single Images

Estimating material properties is difficult because an image is the combination of the material properties and other visual characteristics, such as the shape of the objects and their illumination. When attempting material recognition, it would be useful to be able to isolate specific characteristics of the objects in the image, such as the illumination of each object or each object’s reflectance. An image containing just one characteristic is called an intrinsic image.

We have begun developing a system to separate an image of a scene into intrinsic images, each containing one visual characteristic. Given an input image, the system returns an image containing the reflectance of every object in the scene and also returns an image containing the combination of shape and illumination at every point.

The current implementation works by distinguishing image changes caused by a change in reflectance from image changes caused by a change in an object’s shape. Once the changes are separated according to their cause, the output images can be created. Currently, color information is used to separate these image changes. Below are some preliminary results.

[Figures: preliminary intrinsic image results (images not reproduced)]
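
The separation described above can be illustrated with a minimal sketch on a single scanline. It is not the project's implementation. It assumes that the image is the product of reflectance and a scalar shading term, and that a change in shading scales red, green, and blue by the same factor while a change in reflectance alters their proportions. The function name `split_scanline` and the threshold `cos_thresh` are hypothetical.

```python
import numpy as np

def split_scanline(rgb, cos_thresh=0.999):
    """Split an (N, 3) scanline of positive RGB values into reflectance and shading.

    Assumed model (not from the report): rgb = reflectance * shading[:, None].
    """
    rgb = np.asarray(rgb, dtype=float)
    log_i = np.log(rgb)                    # products become sums in the log domain
    d_log = np.diff(log_i, axis=0)         # (N-1, 3) changes between neighbours

    # Colour test: neighbouring pixels whose RGB vectors point in the same
    # direction differ only in brightness, which is attributed to shading.
    unit = rgb / np.linalg.norm(rgb, axis=1, keepdims=True)
    cos = np.sum(unit[:-1] * unit[1:], axis=1)
    is_reflectance = cos < cos_thresh      # True where the colour direction changes

    # Keep only the changes attributed to shading and integrate them.
    d_shading = np.where(is_reflectance, 0.0, d_log.mean(axis=1))
    log_s = np.concatenate([[0.0], np.cumsum(d_shading)])
    log_r = log_i - log_s[:, None]         # whatever is left is reflectance
    return np.exp(log_r), np.exp(log_s), is_reflectance

# Two coloured patches under a smooth shading ramp.
shading = np.linspace(1.0, 0.5, 8)
reflectance = np.array([[0.8, 0.2, 0.2]] * 4 + [[0.2, 0.3, 0.8]] * 4)
scanline = reflectance * shading[:, None]
r_est, s_est, edges = split_scanline(scanline)
```

In this example only the step between the two patches is labelled as a reflectance change, and the estimated reflectance is constant within each patch. The shading is recovered only up to a scale factor, and a reflectance edge that changes brightness without changing color (gray to white, for instance) would be mislabelled as shading. That limitation is one reason color alone is not sufficient.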

Auditory Material Properties

In addition to the work on visual cues, we are embarking on a study of auditory cues. When people encounter objects of unknown material, they often knock on the object to hear how it sounds. Material recognition from sound can be posed as a classification task, in parallel with the classification we have done with images. We are now synthesizing the sounds produced by some standard shapes, such as bars and discs, when they are excited by standard inputs, such as hammer strikes.
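
One common way to synthesize such sounds is modal synthesis, in which the struck object is modelled as a sum of exponentially damped sinusoids. The sketch below illustrates that idea for a bar. The report does not say which synthesis method is used, so this should be read as an assumption. The mode ratios are the textbook values for a uniform bar with free ends; the function name `strike_bar`, the amplitude rule, and the damping rule are hypothetical choices for illustration, not measured material data.

```python
import numpy as np

# Frequency ratios of the first bending modes of a uniform free-free bar
# (Euler-Bernoulli beam theory).
BAR_MODE_RATIOS = np.array([1.0, 2.756, 5.404, 8.933])

def strike_bar(f0, decay_time, duration=1.0, sample_rate=44100):
    """Synthesize a struck bar as a sum of damped sinusoids.

    f0         : frequency of the lowest mode in Hz (depends on size and material)
    decay_time : seconds for the lowest mode to fall to 1/e of its amplitude
    """
    t = np.arange(int(duration * sample_rate)) / sample_rate
    freqs = f0 * BAR_MODE_RATIOS
    freqs = freqs[freqs < sample_rate / 2]      # drop modes above Nyquist
    signal = np.zeros_like(t)
    for k, f in enumerate(freqs):
        amp = 1.0 / (k + 1)                     # assumed: the strike favours low modes
        tau = decay_time * f0 / f               # assumed: higher modes die out faster
        signal += amp * np.exp(-t / tau) * np.sin(2 * np.pi * f * t)
    return signal / np.max(np.abs(signal))      # normalize peak to 1

# Illustrative settings only: a long ring for "metal", a short thud for "wood".
metal_like = strike_bar(f0=440.0, decay_time=0.8)
wood_like = strike_bar(f0=440.0, decay_time=0.05)
```

In this model the two sounds share the same mode frequencies and differ only in how quickly they decay, so decay rate is a natural candidate feature for a material classifier.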

 

Research Plan for the Next Six Months

First, we intend to continue to refine the reflectance classification techniques, especially when the geometry of the objects is unknown. We also hope to develop an analytic framework for reflectance estimation. This will place our algorithms on a firmer theoretical foundation.

Second, we intend to develop techniques for creating intrinsic images from gray-scale images. This will involve learning to differentiate image patterns that correspond to changes in an object’s shape from patterns that correspond to a change in reflectance.

Third, we will continue building tools for synthesizing sounds corresponding to standard shapes. We will also attempt to train a system that can recognize the sound of metal (as opposed to wood or plastic) for objects of various shapes with various inputs.