The Recognition of Material Properties for Vision and Video Communication
Progress Report: July 1, 2002 – December 31, 2002
Edward H. Adelson
How can we tell that an object is shiny or translucent or metallic by looking at it? Humans do this effortlessly, but machines cannot. Materials are important to humans and will be important to robots and other machine vision systems. For example, if a domestic cleaning robot finds something white on the kitchen floor, it needs to know whether it is a pile of sugar (use a vacuum cleaner), a smear of cream cheese (use a sponge), or a crumpled paper towel (grasp with fingers).
An object's appearance depends on its shape, the optical properties of its surface (the reflectance), the surrounding distribution of light, and the viewing position. All of these causes are combined in a single image. In the case of a chrome-plated object, the image consists of a distorted picture of the world; thus the object looks different every time it is placed in a new setting. Somehow humans are good at determining the "chromeness" that is common to all the chrome images.
Progress Through December 2002
Psychophysics of Reflectance Estimation
In previous work we measured the accuracy with which humans match surface reflectance properties across variations in illumination. We found that subjects could match surface gloss (specular reflection) reliably and accurately across variations in illumination, as long as the statistics of the illumination are characteristic of the real world. Critically, however, performance declined considerably under illuminations such as single point light sources and random noise patterns, which do not have realistic statistics.
Our theory is that real-world illumination has certain well-conserved statistical properties, which lead to reliable diagnostic features in images of glossy materials. The key research question is to identify the statistical properties of real-world illumination that allow subjects to estimate surface reflectance properties.
In the last six months we have successfully identified a number of properties of real-world illumination that are important for surface reflectance estimation. Illumination in the real world has many regularities, ranging from simple properties (such as a pixel histogram that is skewed toward low intensities) to higher-order regularities (such as a ground plane and the presence of recognizable objects such as buildings and trees). Which regularities are important for surface reflectance estimation?
We have found that certain statistics of intermediate complexity lead to a good impression of glossiness. Histograms of pixel intensities in real-world illumination are heavily skewed toward low intensities. This appears to be an important characteristic for surface reflectance estimation. Intuitively, the skew results from the fact that most of the surfaces in the world reflect rather than emit light. Although these surfaces structure the patterns of reflection by providing visible objects in the reflection, they are responsible for only a small proportion of the incoming light. By contrast, direct light sources, such as candles and the sun, are generally rather compact and sparsely distributed, although they provide the majority of the incident light. This finding is consistent with previous research (S. Nishida and M. Shinya. Use of image-based information in judgments of surface-reflectance properties. J. Opt. Soc. Am. A, 15:2951–2965, 1998), which found that image pixel histograms play an important role in surface reflectance matching.
We found that real-world illumination robbed of its characteristic skew makes glossy surfaces look dull, matte, and unrealistic. Conversely, random noise patterns can lead to vivid impressions of gloss when endowed with skew. This suggests that the pixel histogram skew, or sparseness of illumination, is an important characteristic of real-world illumination for surface reflectance estimation. We have also found that endowing random noise patterns with realistic wavelet statistics (i.e., those that are characteristic of real-world illumination) causes the illumination to yield glossy-looking surfaces. Examples of this are shown in Figure 1. Wavelet statistics capture local information about scale and orientation, which is important for representing the clumps, blobs, and edges that are found in real-world illumination.
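To make the notion of histogram skew concrete, the following is a minimal sketch, not the procedure used in the experiments. It measures the sample skewness of a set of pixel intensities and imposes the histogram of a skewed reference on Gaussian noise by rank-order matching. The log-normal reference is a hypothetical stand-in for a measured illumination map, and the function names are illustrative only.

```python
import random

def skewness(values):
    """Sample skewness (third standardized moment) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return third / var ** 1.5

def match_histogram(source, target):
    """Give `source` the intensity histogram of `target`.

    The rank order of `source` is preserved; only the intensity values
    change. Both lists must have the same length.
    """
    order = sorted(range(len(source)), key=lambda i: source[i])
    sorted_target = sorted(target)
    out = [0.0] * len(source)
    for rank, idx in enumerate(order):
        out[idx] = sorted_target[rank]
    return out

rng = random.Random(0)
# Symmetric noise "illumination": no skew.
noise = [rng.gauss(0.0, 1.0) for _ in range(10000)]
# Assumption: log-normal intensities as a stand-in for a real-world
# illumination map (mostly dark pixels, a few very bright ones).
skewed_reference = [rng.lognormvariate(0.0, 1.0) for _ in range(10000)]
# Noise endowed with the skewed histogram.
skewed_noise = match_histogram(noise, skewed_reference)

print("noise skewness:       ", skewness(noise))
print("skewed-noise skewness:", skewness(skewed_noise))
```

Rank-order matching keeps the spatial arrangement of the noise while changing only its intensity distribution, which is one simple way to isolate the effect of skew from the effect of spatial structure.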
This research has been accepted (with minor revisions) for publication in the Journal of Vision.
Recovering Intrinsic Images from a Single Image
Previously, we had developed a system for decomposing a single image into two images, one representing the shading of the scene and a second containing the reflectance of each point in the scene. This system works by calculating horizontal and vertical derivatives of the image, then classifying each derivative as being caused either by shading or by a reflectance change. This classification is made using local gray-scale and color image information.
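The derivative-classification idea can be illustrated on a single scanline. The sketch below is a simplified one-dimensional illustration under stated assumptions, not the system described above: it works in the log domain, and it uses a hypothetical magnitude threshold as the classifier (large jumps are labeled reflectance changes), whereas the actual system classifies derivatives from local gray-scale and color information. The function name and the threshold value are illustrative.

```python
import math

def decompose_1d(intensities, threshold=0.2):
    """Split a 1-D scanline into log-shading and log-reflectance parts.

    Each derivative of the log intensity is assigned wholly to one
    component, and each component is rebuilt by summing its derivatives.
    """
    log_i = [math.log(v) for v in intensities]
    derivs = [b - a for a, b in zip(log_i, log_i[1:])]
    # Hypothetical classifier: a large jump is a reflectance change,
    # a small one is shading.
    is_reflectance = [abs(d) > threshold for d in derivs]
    # The constant offset is ambiguous; here it is given to reflectance.
    shading = [0.0]
    reflectance = [log_i[0]]
    for d, r in zip(derivs, is_reflectance):
        shading.append(shading[-1] + (0.0 if r else d))
        reflectance.append(reflectance[-1] + (d if r else 0.0))
    return shading, reflectance, is_reflectance

# A smooth shading ramp multiplied by a reflectance step at position 5.
scanline = [(1.0 + 0.05 * k) * (0.3 if k < 5 else 0.9) for k in range(10)]
shading, reflectance, labels = decompose_1d(scanline)
print(labels)
```

In two dimensions the labeled horizontal and vertical derivatives are generally not integrable by a simple running sum, so each component has to be recovered by a least-squares reconstruction; the running sum here is only valid for the one-dimensional case.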
In many cases, the local evidence that we use is not sufficient to correctly classify derivatives. Figure 1 shows an example of this situation. When compared to the example shading and reflectance change, the center of the mouth is equally well classified with either label. However, the corners of the mouth can be classified as being caused by a reflectance change with little ambiguity. Since the derivatives in the corners of the mouth and in the center all lie on the same image contour, they should have the same classification. In the last six months, we have refined our techniques for propagating information from the corners of the mouth, where the correct classification is clear, into the center, where the local evidence is ambiguous.
We propagate information by treating the classification of each derivative as a random variable in a Markov Random Field, or MRF. In this model, the label of each derivative only directly depends on the labels of the neighboring derivatives. We set up the MRF so that derivatives along an image contour are strongly influenced to have the same label as their neighbors. The most probable label for each derivative is found using the Generalized Belief Propagation algorithm. As part of this work, we created and verified an efficient C++ implementation of the Generalized Belief Propagation algorithm.
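The effect of propagation along a contour can be sketched with a much simpler model than the one used in this work. The example below runs ordinary sum-product belief propagation on a chain of binary labels, where it is exact, rather than Generalized Belief Propagation on a two-dimensional field. The evidence values, the pairwise weight `same_label_weight`, and the function name are all assumptions chosen for illustration.

```python
def chain_marginals(evidence, same_label_weight=0.9):
    """Sum-product belief propagation on a chain of binary labels.

    evidence[i] is the local probability that derivative i is a
    reflectance change. Returns the posterior probability of
    "reflectance" at each node after propagation.
    """
    n = len(evidence)
    local = [(1.0 - e, e) for e in evidence]  # (shading, reflectance)
    # Pairwise potential: neighbors prefer to share a label.
    pair = [[same_label_weight, 1.0 - same_label_weight],
            [1.0 - same_label_weight, same_label_weight]]

    def normalize(m):
        total = m[0] + m[1]
        return (m[0] / total, m[1] / total)

    def pass_message(incoming, phi):
        # Combine the sender's incoming message with its local evidence,
        # then sum over the sender's label.
        belief = (incoming[0] * phi[0], incoming[1] * phi[1])
        return normalize(tuple(
            sum(belief[a] * pair[a][b] for a in (0, 1)) for b in (0, 1)))

    fwd = [(0.5, 0.5)] * n  # message arriving at node i from the left
    bwd = [(0.5, 0.5)] * n  # message arriving at node i from the right
    for i in range(1, n):
        fwd[i] = pass_message(fwd[i - 1], local[i - 1])
    for i in range(n - 2, -1, -1):
        bwd[i] = pass_message(bwd[i + 1], local[i + 1])
    return [normalize((fwd[i][0] * local[i][0] * bwd[i][0],
                       fwd[i][1] * local[i][1] * bwd[i][1]))[1]
            for i in range(n)]

# Confident evidence at the two ends of a contour, ambiguous in between.
evidence = [0.95, 0.5, 0.5, 0.5, 0.95]
posterior = chain_marginals(evidence)
print([round(p, 3) for p in posterior])
```

With these assumed values, the ambiguous middle nodes end up favoring the reflectance label because the confident end nodes influence them through the pairwise potentials. This sketch reports per-node marginals; choosing the label with the larger marginal at each node is a simplification of finding the most probable labeling.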
This work was presented at the Neural Information Processing Systems Conference.
Research Plan for the Next Six Months
Psychophysics of Reflectance Estimation
In direct collaboration with Shin'ya Nishida and colleagues we will replicate and extend our experiments with surface shapes other than spheres. We also hope to perform experiments with surfaces that have empirically measured BRDFs, rather than parametric approximations.
Recovering Intrinsic Images
We will investigate decompositions involving different types of visual characteristics, such as transparency or material properties. We are also interested in extending the technologies developed in this project into interactive computer graphics tools for synthetically relighting images.