Example-Based Image Synthesis
MIT2001-07
Progress Report: July 1, 2002–December 31, 2002
William T. Freeman
Project Overview
Our
initial research topic was example-based image synthesis: synthesizing image
detail using a database of example images. This synthesis can be of texture, as
in the Image Quilting technique, or of high-resolution image detail, as in
Super-resolution.
After
face-to-face meetings, we learned of a common research interest in shape
estimation from images. Mr. Sato and Dr. Onozawa of NTT have developed laboratory
equipment for analyzing object images under varying optical conditions, useful
for shape estimation. Prof. Freeman and Dr. Torralba of MIT have developed a
technique to improve shape estimates, using the image information and the
rendering parameters learned from the initial shape estimate. These two
research components are a great fit. We have supplied the NTT laboratory with
the techniques and code we have developed, and we look forward to applying our
shape estimation method to their image and range data, as well as helping to
tailor the algorithms to their problems.
Progress Through December 2002
Under
NTT support, we have developed shape recipes: functions that map image intensities to shape subbands. In low-level vision, the representations of scene properties such as shape and albedo are very high dimensional, since they must describe complicated structures. The approach we have developed lets the image itself bear as much of the representational burden as possible. In many situations, scene and image are closely related, and it is possible to find a functional relationship between them. The scene information can then be represented with reference to the image, where the function specifies how to translate the image into the associated scene.
We
have shown how to use this representation to improve initial estimates of
shape. The initial shape estimate
lets us learn the functional relationship between image and shape; we then apply that relationship to improve the shape estimate, using image details not exploited by the initial estimate.
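As a toy illustration of the shape-recipe idea (not the actual code we supplied to NTT), one can fit a regression from an image subband to the corresponding shape subband and then apply it to predict shape from image detail. The function names and the purely linear, per-band form here are simplifying assumptions:

```python
import numpy as np

def learn_shape_recipe(image_band, shape_band):
    """Fit a linear 'recipe' mapping an image subband to the
    corresponding shape subband by least squares.  A toy linear
    stand-in for the regressions used in shape recipes."""
    x = image_band.ravel()
    A = np.stack([x, np.ones_like(x)], axis=1)  # slope and intercept terms
    coeffs, *_ = np.linalg.lstsq(A, shape_band.ravel(), rcond=None)
    return coeffs  # (slope, intercept)

def apply_shape_recipe(image_band, coeffs):
    """Apply a learned recipe to an image subband to predict the
    shape subband, pixel by pixel."""
    slope, intercept = coeffs
    return slope * image_band + intercept

# Learn the recipe from an initial (coarse) estimate, then apply it
# to carry fine image detail over into the shape estimate.
image_band = np.linspace(-1.0, 1.0, 50)
shape_band = 2.0 * image_band + 0.5  # synthetic ground truth for illustration
recipe = learn_shape_recipe(image_band, shape_band)
refined = apply_shape_recipe(image_band, recipe)
```

In the full method, a separate recipe is learned for each subband and spatial region, which is what lets the image carry most of the representational burden.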
This
work was presented and published at the NIPS conference, in a "poster spotlight" presentation, in Vancouver, BC, Canada, in December 2002. We
recently completed a second manuscript and submitted it for publication in the
major computer vision conference, CVPR 2003.
We
have sent the shape recipes code to Mr. Sato, and he has sent us images and
initial shape estimates from the NTT visual hull system. These data have shown us directions for
further algorithm development, and we have recently extended the algorithm to
account for occlusion and reflectance boundaries, such as are prevalent in the
data from NTT.
Regarding
example-based image synthesis, we have sent to Mr. Sato and Dr. Onozawa both
the texture synthesis code and the super-resolution code from our ongoing work
on texture and high-resolution image synthesis. Mr. Sato has applied the super-resolution code to the
view-interpolation task for the NTT image-based rendering system. The results show some improvement in
resolution without observable artifacts.
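For readers unfamiliar with the example-based approach, its core lookup can be sketched as a nearest-neighbor search over paired training patches. This toy version omits the spatial-consistency machinery of the full method, and all names and data here are illustrative:

```python
import numpy as np

def best_detail_patch(low_patch, train_low, train_high):
    """Return the high-frequency detail patch paired with the
    training low-resolution patch nearest to `low_patch`.

    train_low:  (N, k) flattened low-resolution training patches
    train_high: (N, k) paired high-frequency detail patches"""
    dists = np.sum((train_low - low_patch.ravel()) ** 2, axis=1)
    return train_high[np.argmin(dists)]

# Tiny illustrative database of two patch pairs.
train_low = np.array([[0.0, 0.0], [1.0, 1.0]])
train_high = np.array([[10.0, 10.0], [20.0, 20.0]])
detail = best_detail_patch(np.array([0.9, 1.1]), train_low, train_high)
```

The synthesized high-resolution image is formed by pasting the retrieved detail patches onto the interpolated low-resolution input, with overlapping patches chosen for mutual consistency in the full algorithm.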
Mr.
Bryan Russell from MIT has been exploring sample-based methods for
super-resolution for video sequences.
We have found it difficult to create a manageably small database of training data that at the same time contains enough samples for simple translation.
Research Plan for the Next Six Months
Regarding shape recipes: We will continue our work to extend
shape recipes to images where occlusion and reflectance changes are
dominant. This improvement, applied to Sato-san's image-based rendering system, should yield better view interpolation, or permit data collection from fewer views at the same image quality.
Prof. Freeman will visit NTT (and Onozawa-san and Sato-san) with the MIT delegation in April 2003.
Regarding example-based image synthesis: For video sequences, training-based
super-resolution methods have proved difficult. Our current approach is to exploit the Laplacian-shaped
priors for the
histograms of image subbands in both still and video image
super-resolution. Thus, we are
studying the resolution enhancement obtainable by that simple, but powerful,
prior probability on images. The two implementation issues are representation and inference.
We plan to use a shape-recipes representation (one of a number of linear
regression candidate formulas at each pixel), and Bayesian belief propagation
as the inference engine. We expect
this will allow inference of high resolution subbands with a much smaller
training set than is currently required.
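As a concrete instance of how a Laplacian-shaped subband prior acts, note that under a Gaussian observation model the MAP estimate of a subband coefficient reduces to the familiar soft-threshold rule. This is a standard textbook sketch of the prior's effect, not our planned belief-propagation implementation:

```python
import numpy as np

def map_subband_estimate(y, noise_var, prior_scale):
    """MAP estimate of subband coefficients observed with Gaussian
    noise (variance `noise_var`) under a Laplacian prior
    p(x) ~ exp(-|x| / prior_scale): soft thresholding.
    Small coefficients are suppressed; large ones are shrunk."""
    thresh = noise_var / prior_scale
    return np.sign(y) * np.maximum(np.abs(y) - thresh, 0.0)

# Coefficients at 3.0, -0.5, and -2.0 with unit noise variance
# and unit prior scale: the -0.5 coefficient is thresholded away.
shrunk = map_subband_estimate(np.array([3.0, -0.5, -2.0]), 1.0, 1.0)
```

The heavy tails of the Laplacian-shaped histogram are what make this prior effective: it preserves the sparse, large subband coefficients that carry edges while suppressing small, noise-like ones.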