6.892 Statistical Vision and Learning


Because the course is new, it will not follow a precise syllabus. My current plan is to spend the first two days motivating the course.

We will then spend 4 or 5 classes reviewing background material. This will include the basic principles of statistical inference, an introduction to the neurophysiology of the visual system, a small bit of information theory, and a discussion of some ``neural network'' research.

The first few lectures will follow Duda and Hart quite closely. Currently it appears as though this will be the only ``required'' textbook. It is my personal belief that perhaps 90% of current research on statistical approaches to computer vision, learning, or neural networks is closely related to sections of Duda and Hart. It is well worth the purchase price.

We will briefly review some of the known physiology of the visual cortex. While much has been revealed in recent years, as documented in literally thousands of journal articles, some very fundamental questions remain. Since our time is limited, the lectures will be aimed at getting the class to the point where we can understand some of the computational theories of perceptual processing in the visual cortex. In this section we will briefly cover:

This will bring us to the first material perhaps not covered in other classes: information transmission theories of visual processing. There were a number of pioneering theories of what visual processing might be for (for example, see Barlow, Lettvin and others). The visual processing areas are adaptive, and rely on visual experience to ensure proper development. Kittens raised in the dark grow up to become cats who cannot see. This provides an intriguing test for theories of visual processing: can they be used to define adaptation algorithms that, when exposed to various ``natural'' stimuli, yield physiologically plausible receptive fields? Before proceeding we will review information theory. The best textbook in this area is Cover and Thomas (it too is well worth the money but most likely I will try to copy sections of it...).

We will then discuss a number of influential theories in the area (2 or 3 classes).

If there is time and interest we will discuss some theories for the topographic layout of the visual cortex. These are interesting because they are related to the process of mixture modelling.

Finally, to round out our discussion of the role of statistics and information theory in understanding biological perceptual processing, we will discuss:

There have been a number of approaches that are related in mathematical form that have attempted to address ``higher-level'' processing. These theories are in an area between engineering and neuroscience.

At this point the course will segue into a discussion of engineering approaches to vision (i.e., those whose ultimate evaluation criterion is how well they work and how often). We will begin our discussion with low-level vision. Low-level visual processing is especially interesting because from first principles we can show that there are many valid interpretations of any image. These can only be differentiated with the help of a prior model of what it is that we are seeing. It is a bit surprising, but every time we look at a scene we must invoke Bayes!
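A minimal sketch of this argument, with made-up numbers: suppose a shaded patch is explained equally well by two scenes, a convex bump lit from above or a concave dent lit from below. The image alone cannot separate them; a prior that favors overhead lighting can. The scene names, the probabilities, and the helper `posterior` are all hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a finite set of candidate scenes.

    prior[s]      = P(scene s)            (assumed, illustrative values)
    likelihood[s] = P(image | scene s)    (assumed, illustrative values)
    Returns P(scene s | image) for every s.
    """
    unnormalized = {s: prior[s] * likelihood[s] for s in prior}
    evidence = sum(unnormalized.values())  # P(image)
    return {s: p / evidence for s, p in unnormalized.items()}


# Both interpretations explain the image equally well.
likelihood = {"bump lit from above": 0.5, "dent lit from below": 0.5}

# Hypothetical prior: light usually comes from above.
prior = {"bump lit from above": 0.8, "dent lit from below": 0.2}

post = posterior(prior, likelihood)
best = max(post, key=post.get)  # maximum a posteriori interpretation
print(best, post)
```

Because the likelihoods are equal, the posterior simply reproduces the prior, and the choice of interpretation is made entirely by the prior model.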

Somewhere in the midst of the above discussion we will digress on the topic of intermediate representations of images. This work is based on the insight that a compressed representation of an image is often a useful one for reasoning.

Bayes's law turns out to be equally useful when analyzing higher-level vision, such as object recognition:

Paul A. Viola
Wed Sep 4 18:44:23 EDT 1996