Progress on: Variable Viewpoint Reality and Image Database Retrieval

Overview of Presentation
Variable Viewpoint Reality
Overview
Progress at MIT
Image Database Retrieval
Overview
Progress
http://www.ai.mit.edu/projects/NTTCollaboration

VVR: Motivating Scenario
Construct a system that lets every user observe a sporting event from any viewpoint.
Provide high level commentary/statistics
Analyze plays

For example …

VVR Spectator Environment
Build an exciting, fun, high-profile system
Sports: Soccer, Hockey, Tennis, Basketball
Drama, Dance, Ballet
Leverage MIT technology in:
Vision/Video Analysis
Tracking, Calibration, Action Recognition
Image/Video Databases
Graphics
Build a system that provides data available nowhere else…
Record/Study Human movements and actions
Motion Capture / Motion Generation

Window of Opportunity
20-50 cameras in a stadium
Soon there will be many more
US HDTV is digital
Flexible, very high bandwidth digital transmissions
Future Televisions will be Computers
Plenty of extra computation available
3D Graphics hardware will be integrated
Economics of sports
Broadcasters' dollar investments are huge (billions)
Computation is getting cheaper

Progress at MIT
Simple intersection of silhouettes (Visual Hull)
Efficient but limited
Tomographic reconstruction
Based on medical reconstruction
Probabilistic Voxel Analysis (Poxels)
Handles occlusion & transparency
Parametric Human Forms

Visual Hull in 2D

Visual Hull: Segment

Visual Hull: Intersection

Idea in 2D: Visual Hull
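A minimal sketch of the 2D idea, assuming orthographic views along the two grid axes instead of real cameras. The function names and the toy grid are illustrative and do not come from the MIT system. Each view reduces the object to a 1D silhouette, and the hull is the set of cells consistent with every silhouette.

```python
def silhouettes(shape):
    """Return (column silhouette, row silhouette) of a 2D binary grid."""
    col_sil = [any(col) for col in zip(*shape)]  # view looking down the columns
    row_sil = [any(row) for row in shape]         # view looking along the rows
    return col_sil, row_sil


def visual_hull(col_sil, row_sil):
    """Keep a cell only if both views see the object along its line of sight."""
    return [[r and c for c in col_sil] for r in row_sil]


# Toy L-shaped object (1 = occupied); hypothetical data.
shape = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 1],
]
col_sil, row_sil = silhouettes(shape)
hull = visual_hull(col_sil, row_sil)
for row in hull:
    print("".join("#" if cell else "." for cell in row))
```

The hull contains the object but also fills the concave corner of the L, which illustrates why silhouette intersection is efficient but limited.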

Real Data: Tweety
Data acquired on a turntable
180 views are available… not all are used.

Intersection of Frusta
Intersection of 18 frusta
Computations are very fast
perhaps real-time
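A rough 2D analogue of the frustum intersection, in which each frustum becomes a wedge with its apex at a camera. The setup is assumed for illustration only: 18 cameras on a circle of radius 5 looking at a unit disc, with hypothetical function names. A point belongs to the hull only if it lies inside every camera's wedge.

```python
import math


def inside_wedge(camera, target, half_angle, point):
    """True if `point` lies within `half_angle` of the ray camera -> target."""
    ax, ay = target[0] - camera[0], target[1] - camera[1]
    vx, vy = point[0] - camera[0], point[1] - camera[1]
    # Signed angle between the viewing axis and the direction to the point.
    angle = math.atan2(ax * vy - ay * vx, ax * vx + ay * vy)
    return abs(angle) <= half_angle


def inside_hull(cameras, target, half_angle, point):
    """A point survives only if every camera's wedge contains it."""
    return all(inside_wedge(cam, target, half_angle, point) for cam in cameras)


# Assumed setup: 18 cameras on a circle of radius 5 around a unit disc.
NUM_VIEWS, RADIUS = 18, 5.0
cameras = [
    (RADIUS * math.cos(2 * math.pi * k / NUM_VIEWS),
     RADIUS * math.sin(2 * math.pi * k / NUM_VIEWS))
    for k in range(NUM_VIEWS)
]
center = (0.0, 0.0)
half_angle = math.asin(1.0 / RADIUS)  # angular half-width of the disc's silhouette

print(inside_hull(cameras, center, half_angle, (0.5, 0.0)))  # point inside the disc
print(inside_hull(cameras, center, half_angle, (3.0, 0.0)))  # point well outside it
```

Each test is a handful of multiplications and one `atan2` per camera, which is consistent with the claim that the computation is fast.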

New Apparatus

Current System
Real-time image acquisition
Silhouettes computed in parallel
Silhouettes sent to a central machine
15 per second
Real-time Intersection and Visual Hull
In progress
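The acquisition stage could be sketched as follows. The slides do not say how silhouettes are extracted, so thresholded background subtraction is an assumption here, and a thread pool stands in for the separate per-camera machines. All names and pixel values are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def silhouette(frame, background, threshold=30):
    """Mark pixels that differ from the background by more than `threshold`."""
    return [
        [abs(pixel - bg) > threshold for pixel, bg in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def collect_silhouettes(frames, backgrounds, threshold=30):
    """Compute one silhouette per camera in parallel, gathered in camera order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(
            lambda pair: silhouette(pair[0], pair[1], threshold),
            zip(frames, backgrounds),
        ))


# Hypothetical 2x3 grayscale frames from two cameras.
backgrounds = [[[10, 10, 10], [10, 10, 10]]] * 2
frames = [
    [[10, 200, 10], [10, 10, 10]],
    [[10, 10, 10], [10, 10, 180]],
]
masks = collect_silhouettes(frames, backgrounds)
print(masks[0])
```

Only the binary masks travel to the central machine, which keeps the per-frame payload small compared with the full images.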

Visual Hull is very coarse …

Tomographic Reconstruction
Motivated by medical imaging
CT - Computed Tomography
Measurements are line integrals in a volume
Reconstruction is by back-projection & deconvolution
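A toy illustration of back-projection, assuming only two orthogonal views of a small 2D grid (row sums and column sums as the line integrals). The deconvolution step is deliberately omitted, and the function names are illustrative.

```python
def project(image):
    """Line integrals of a 2D image along rows and along columns."""
    row_sums = [sum(row) for row in image]
    col_sums = [sum(col) for col in zip(*image)]
    return row_sums, col_sums


def back_project(row_sums, col_sums):
    """Smear each line integral evenly back along its line and add the views."""
    n_rows, n_cols = len(row_sums), len(col_sums)
    return [
        [row_sums[r] / n_cols + col_sums[c] / n_rows for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical image with a single bright cell.
image = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
row_sums, col_sums = project(image)
estimate = back_project(row_sums, col_sums)
for row in estimate:
    print(["%.2f" % value for value in row])
```

The estimate peaks at the bright cell but also smears intensity along its row and column, which is the blur that deconvolution is meant to remove.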

Back-projection of image intensities

Volume Render...
Captures shape very well
Intensities are not perfect

Poxels: An improvement to tomography
Tomography confuses color with transparency
Does not model occlusion...
The Probabilistic Voxel Approach: Poxel
Estimates both color and transparency
Models occlusion
Much better results
Though slower
Work submitted to ICCV 99
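To illustrate why color and transparency need to be estimated together, here is standard front-to-back compositing along a single ray. This is a common textbook transparency model and is only assumed to be in the spirit of the Poxels formulation; it is not taken from the slides or the paper.

```python
def composite_ray(samples):
    """Composite (color, opacity) samples ordered front to back along one ray.

    Returns the observed color and the transmittance left behind the last sample.
    """
    color = 0.0
    transmittance = 1.0  # fraction of light still reaching the camera
    for sample_color, opacity in samples:
        color += transmittance * opacity * sample_color
        transmittance *= 1.0 - opacity
    return color, transmittance


# An opaque front voxel completely hides the voxel behind it.
print(composite_ray([(0.2, 1.0), (0.9, 1.0)]))
# A half-transparent front voxel mixes with what lies behind.
print(composite_ray([(1.0, 0.5), (0.0, 1.0)]))
```

Under this model a voxel's contribution depends on everything in front of it, so a camera whose ray is blocked will disagree with cameras that see the voxel directly.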

Occlusion causes disagreement

Initial agreement is not enough…

Second pass uses information about occlusion

Poxels Algorithm: Definitions

Poxels: Model of Transparency

Poxels Algorithm: Agreement (Step 1)

Results…

From ICCV paper...

… additional results

Image Databases: Motivating Scenario
Image Databases are proliferating
The Web
Commercial Image Databases
Video Databases
Catalog Databases
“Find me a bag that looks like a Gucci.”
Virtual Museums
“Find me impressionist portraits.”
Travel Information
“Find me towns with Gothic architecture.”
Real-estate
“Find me a home that is sunny and open.”

But the problem is very hard…

We have made good progress...

Search for cars?

Complex Feature Representation
Motivated by the human brain…
Inferotemporal cortex computes many thousands of selective features
Features are selective yet insensitive to unimportant variations
Every object/image has some but not all of these features
Retrieval involves matching the most salient features

Image Database Retrieval
NTT: Visit
1/7/99

Overview of IDB Meeting
Motivation from MIT ...
Discuss current and related work
Flexible Templates
Complex Features
Demonstrations
Related NTT Efforts
Discussion of collaboration
Future work
Dinner

There is a very wide variety of images...

Search for images containing waterfalls?

Search for cars?

What makes IDB hard?
Finding the right features
Insensitive to movement of components
Sensitive to critical properties
Focusing attention
Not everything matters
Generalization based on class
Given two images:
A small black dog and a large white dog
(They don’t have much in common…)
Return other dogs

Complex Feature Representation
Motivated by the human brain…
Inferotemporal cortex computes many thousands of selective features
Features are selective yet insensitive to unimportant variations
Every object/image has some but not all of these features
Retrieval involves matching the most salient features

Resolution is reduced at each step…
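One way to picture the resolution reduction, assuming plain 2x2 averaging between steps. The slides do not specify the actual filtering or subsampling scheme, so this is only a sketch with hypothetical names and data.

```python
def downsample(image):
    """Halve resolution by averaging non-overlapping 2x2 blocks (even sizes assumed)."""
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, len(image[0]), 2)
        ]
        for r in range(0, len(image), 2)
    ]


def pyramid(image, steps):
    """Return the original image followed by `steps` successively coarser versions."""
    levels = [image]
    for _ in range(steps):
        levels.append(downsample(levels[-1]))
    return levels


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 4, 4],
    [0, 0, 4, 4],
]
levels = pyramid(feature_map, 2)
print([(len(level), len(level[0])) for level in levels])
```

Each step quarters the number of values, so later stages summarize progressively larger image regions.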

Not every feature is useful for a query

Normalization of Signature Space
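A sketch of one plausible normalization, assuming each feature is shifted and scaled to zero mean and unit standard deviation across the database. The slide gives no formula, so the z-score choice and the names below are assumptions.

```python
def normalize(signatures):
    """Z-score each feature dimension across all signatures in the database."""
    count = len(signatures)
    dims = len(signatures[0])
    means = [sum(sig[d] for sig in signatures) / count for d in range(dims)]
    stds = []
    for d in range(dims):
        variance = sum((sig[d] - means[d]) ** 2 for sig in signatures) / count
        stds.append(variance ** 0.5 or 1.0)  # avoid dividing by zero for constant features
    return [
        [(sig[d] - means[d]) / stds[d] for d in range(dims)]
        for sig in signatures
    ]


# Two hypothetical signatures whose features live on very different scales.
normalized = normalize([[1.0, 100.0], [3.0, 300.0]])
print(normalized)
```

After normalization both features span the same range, so no single feature dominates a distance computation purely because of its units.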

Distance/Similarity Measure
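A minimal sketch of a query-dependent measure, assuming a weighted L1 distance in which a weight of zero switches off a feature that is not useful for the query. The actual measure is not given on the slide; the names and data are hypothetical.

```python
def weighted_distance(query, candidate, weights):
    """Weighted L1 distance between two signatures of equal length."""
    return sum(w * abs(q - c) for q, c, w in zip(query, candidate, weights))


def rank(query, database, weights):
    """Return database keys ordered from most to least similar to the query."""
    return sorted(
        database,
        key=lambda name: weighted_distance(query, database[name], weights),
    )


# Hypothetical three-feature signatures.
database = {
    "a": [1.0, 0.1, -9.0],
    "b": [0.0, 1.0, 5.0],
}
query = [1.0, 0.0, 5.0]

print(rank(query, database, [1.0, 1.0, 1.0]))  # all features count
print(rank(query, database, [1.0, 1.0, 0.0]))  # third feature ignored
```

Changing the weights changes the ranking, which is how feature selection can make the same database answer different queries.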

Image Database Progress at MIT
Better learning algorithms to select features
Developed a very compact feature representation
Fewer features required
2-3 bits per feature
Pre-segmentation of images
Better learning
More selective queries
Construction of object models:
Faces, people, cars, etc. (ICCV 99)
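To give a feel for what 2 bits per feature means in practice, the sketch below quantizes each feature value to four levels and packs the codes into one integer. Uniform quantization over a known range is an assumption; the slides do not describe the actual encoding.

```python
def quantize(value, low, high, bits=2):
    """Map `value` in [low, high] to an integer code using `bits` bits."""
    levels = (1 << bits) - 1
    if high <= low:
        return 0
    fraction = min(1.0, max(0.0, (value - low) / (high - low)))
    return int(fraction * levels + 0.5)  # round to the nearest level


def pack(codes, bits=2):
    """Pack a list of small integer codes into one integer, first code highest."""
    word = 0
    for code in codes:
        word = (word << bits) | code
    return word


# Hypothetical feature values in the range [0, 1].
codes = [quantize(v, 0.0, 1.0) for v in (1.0, 0.0, 0.4)]
print(codes, pack(codes))
```

At 2 bits per feature, a signature of 1,000 features fits in 250 bytes, which suggests why a compact representation matters for large databases.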

Conclusions
Variable Viewpoint Reality
Prototypes constructed
New approaches
Image Database Retrieval
New, more efficient representations
Improved performance