6.892 Machine learning seminar

Prof. Tommi Jaakkola
tommi@ai.mit.edu (preferred point of contact)
NE43-735, x3-0440

Prereq.: permission of instructor
Meets: M/W 2.30-4pm, 26-322.

Description:

Statistical machine learning is concerned with automated, formalized methods that can adapt, infer, or learn from experience for the purpose of prediction and decision making. The aim of this class is to provide students with the fundamentals of various machine learning techniques so that they can readily apply, analyze, or adjust existing methods. The emphasis will be on representational and computational issues. A wide range of topics will be covered, including representation of probabilities with graphs, inference and estimation on graphs, approximate methods, model selection, clustering, and generalization.

Background and other classes:

Students are expected to know the material in 6.893 taught by Prof. Viola (basics of estimation theory, linear algebra). 6.432 or a similar class would provide an excellent background for this class. Previous exposure to graph theory, information theory, or statistical physics would be helpful but is not required. The class is complementary to 9.641, now taught by Prof. Seung, but similar to the earlier 9.641 taught by Prof. Jordan (it differs, however, in emphasis and partly in the choice of topics). Some topics, e.g., support vector machines, are amply covered in 9.520; such material will either be excluded or emphasized differently in this class.

Format and requirements:

The format of the class is a mixture of lectures and paper presentations by the attendees. There will be weekly assignments in the form of brief critiques, proofs, analyses, or projects. Tutorials on several key topics can be arranged as needed.

Topics:

  • Elements of graphical models
  • Density estimation, classification, clustering
  • Graph representations and their properties
  • Exact inference algorithms
  • Approximate inference (sampling, graphical, variational)
  • Decision analysis
  • Gaussian process models and support vector machines
  • Kernel based methods and graphical models
  • Clustering
  • Model selection
  • Model averaging (Bayesian, bagging, boosting)
  • Dynamic models (temporally extended decision problems)

Miscellaneous topics covered include information geometry, group theory (invariances), and selected applications.