Example-Based Image Synthesis

MIT2001-07

 

Progress Report: July 1, 2002 – December 31, 2002

 

William T. Freeman

 

 

 

Project Overview

 

Our initial research topic was example-based image synthesis: synthesizing image detail using a database of example images. This synthesis can be of texture, as in the Image Quilting technique, or of high-resolution image detail, as in Super-resolution.

 

After face-to-face meetings, we learned of a common research interest in shape estimation from images. Mr. Sato and Dr. Onozawa of NTT have developed laboratory equipment for analyzing object images under varying optical conditions, which is useful for shape estimation. Prof. Freeman and Dr. Torralba of MIT have developed a technique that improves shape estimates using the image information and the rendering parameters learned from the initial shape estimate. These two research components complement each other well. We have supplied the NTT laboratory with the techniques and code we have developed, and we look forward to applying our shape estimation method to their image and range data and to helping tailor the algorithms to their problems.

 

 

Progress Through December 2002

 

Under NTT support, we have developed shape recipes. These are functions that transform image intensities into shape subbands. In low-level vision, representations of scene properties such as shape and albedo are very high dimensional, because they have to describe complicated structures. The approach we have developed lets the image itself bear as much of the representational burden as possible. In many situations, scene and image are closely related, and it is possible to find a functional relationship between them. The scene information can then be represented in reference to the image, with the functional relationship specifying how to translate the image into the associated scene.

 

We have shown how to use this representation to improve initial estimates of shape.  The initial shape estimate lets us learn the functional relationship between image and shape.  Then we apply that relationship to improve the shape estimates using image details not taken advantage of by the initial shape estimate.
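The learn-then-apply idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the recipe is a single small linear kernel fitted by least squares between one image subband and the matching subband of the initial shape estimate. The function names (`neighborhoods`, `learn_recipe`, `apply_recipe`) and the 3 x 3 kernel size are hypothetical and are not taken from the code sent to NTT.

```python
import numpy as np

def neighborhoods(band, k):
    """Return every k x k neighborhood of a 2-D array as one row (valid region only)."""
    h, w = band.shape
    cols = [band[i:i + h - k + 1, j:j + w - k + 1].ravel()
            for i in range(k) for j in range(k)]
    return np.stack(cols, axis=1)  # shape: (num_pixels, k * k)

def learn_recipe(image_band, shape_band, k=3):
    """Least-squares fit of a k x k linear kernel (k odd) from image band to shape band.

    Assumption: a single linear kernel per subband is enough for this sketch.
    """
    X = neighborhoods(image_band, k)
    r = k // 2
    h, w = shape_band.shape
    # Targets are the shape values at the center of each neighborhood.
    y = shape_band[r:h - r, r:w - r].ravel()
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

def apply_recipe(image_band, weights, k=3):
    """Predict a shape subband (valid region) from an image subband."""
    h, w = image_band.shape
    return (neighborhoods(image_band, k) @ weights).reshape(h - k + 1, w - k + 1)
```

In use, one would fit `learn_recipe` on a subband where the initial shape estimate is trusted, then call `apply_recipe` on an image subband whose detail the initial estimate did not capture. Whether a recipe learned in one subband transfers to another is an assumption of this sketch.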

 

This work was presented and published at the NIPS conference, in a "poster spotlight" presentation, in Vancouver, BC, Canada, in December 2002. We recently completed a second manuscript and submitted it to CVPR 2003, a major computer vision conference.

We have sent the shape recipes code to Mr. Sato, and he has sent us images and initial shape estimates from the NTT visual hull system. These data have shown us directions for further algorithm development, and we have recently extended the algorithm to account for occlusion and reflectance boundaries, which are prevalent in the data from NTT.

 

Regarding example-based image synthesis, we have sent Mr. Sato and Dr. Onozawa both the texture synthesis code and the super-resolution code from our ongoing work on texture and high-resolution image synthesis. Mr. Sato has applied the super-resolution code to the view-interpolation task for the NTT image-based rendering system. The results show some improvement in resolution without observable artifacts.

 

Mr. Bryan Russell of MIT has been exploring sample-based methods for super-resolution of video sequences. We have found it difficult to create a training database that is manageably small yet still contains enough samples to handle even simple translation.

 

 

Research Plan for the Next Six Months

 

Regarding shape recipes: We will continue our work to extend shape recipes to images where occlusion and reflectance changes are dominant. This improvement, applied to Sato-san's image-based rendering system, should improve view interpolation or allow data collection from fewer views at the same image quality. Prof. Freeman will visit NTT (and Onozawa-san and Sato-san) with the MIT delegation in April 2003.

 

 

 

Regarding example-based image synthesis: For video sequences, training-based super-resolution methods have proved difficult. Our current approach is to exploit the Laplacian-shaped priors for the histograms of image subbands in both still and video image super-resolution. We are therefore studying the resolution enhancement obtainable from that simple but powerful prior probability on images. The two implementation issues are representation and inference. We plan to use a shape-recipes representation (one of a number of candidate linear regression formulas at each pixel) and Bayesian belief propagation as the inference engine. We expect this will allow inference of high-resolution subbands with a much smaller training set than is currently required.
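To make the Laplacian-shaped prior concrete, the following minimal sketch assumes a zero-mean Laplacian density p(x) = exp(-|x| / b) / (2b) on subband coefficients. It shows the maximum-likelihood scale, the log prior, and the textbook MAP estimate under additive Gaussian noise (soft thresholding). The function names are hypothetical, and the Gaussian noise model is an illustrative assumption; this is not the planned belief-propagation inference.

```python
import numpy as np

def laplacian_scale(coeffs):
    """Maximum-likelihood scale b of a zero-mean Laplacian: the mean absolute value."""
    return np.mean(np.abs(coeffs))

def laplacian_log_prior(coeffs, b):
    """Sum of log p(x) over all coefficients, with p(x) = exp(-|x| / b) / (2 b)."""
    return -np.sum(np.abs(coeffs)) / b - coeffs.size * np.log(2.0 * b)

def map_shrink(observed, b, noise_var):
    """MAP estimate of x from y = x + Gaussian noise under the Laplacian prior.

    Assumption: independent Gaussian noise with variance noise_var. The
    solution is soft thresholding with threshold noise_var / b.
    """
    threshold = noise_var / b
    return np.sign(observed) * np.maximum(np.abs(observed) - threshold, 0.0)
```

The prior favors subbands in which most coefficients are near zero and a few are large, which is why small observed coefficients are shrunk to zero while large ones are kept almost intact.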