6.892 Machine learning seminar

Prof. Tommi Jaakkola
tommi@ai.mit.edu (preferred point of contact)
NE43-735, x3-0440

Prereq.: permission of instructor
Meets: M/W 2.30-4pm, 26-322.

Description:

Statistical machine learning is concerned with automated, formalized methods that can adapt, infer, or learn from experience for the purpose of prediction and decision making. The aim of this class is to provide students with the fundamentals of various machine learning techniques so that they can readily apply, analyze, or adjust existing methods. The emphasis will be on representational and computational issues. A wide range of topics will be covered, including representation of probabilities with graphs, inference and estimation on graphs, approximate methods, model selection, clustering, and generalization.

Background and other classes:

Students are expected to know the material in 6.893 taught by Prof. Viola (basics of estimation theory, linear algebra). 6.432 or a similar class would provide an excellent background for this class. Previous exposure to graph theory, information theory, or statistical physics would be helpful but is not required. The class is complementary to 9.641, now taught by Prof. Seung, but similar to the earlier 9.641 taught by Prof. Jordan (it differs, however, in emphasis and partly in the choice of topics). Some topics, e.g., support vector machines, are amply covered in 9.520; such material will either be excluded or emphasized differently in this class.

Format and requirements:

The format of the class is a mixture of lectures and paper presentations by the attendees. There will be weekly assignments in the form of brief critiques, proofs, analyses, or projects. Tutorials on several key topics can be arranged as needed.

Topics:

  • Elements of graphical models
  • Density estimation, classification, clustering
  • Graph representations and their properties
  • Exact inference algorithms
  • Approximate inference (sampling, graphical, variational)
  • Decision analysis
  • Gaussian process models and support vector machines
  • Kernel based methods and graphical models
  • Clustering
  • Model selection
  • Model averaging (Bayesian, bagging, boosting)
  • Dynamic models (temporally extended decision problems)

Miscellaneous topics covered include information geometry, group theory (invariances), and selected applications.