6.825 Project 2a: Bayes Net Learning

In this project, you will implement Bayes net learning. That is, you will write a program that will read in some data and come up with a Bayesian network that best matches the data. Reading and understanding Nilsson's Chapter 20, section 1 is necessary for doing this assignment.

This assignment is broken up into parts that will take you step-by-step through building a program for Bayes net learning.

Part 0: Designing the Data Structure for a Bayes Net

Please read through this whole assignment and make sure you understand everything you're asked to do. Then, design and implement a data structure for a Bayes net, taking into account what you'll have to do with the data structure.

Part 1: Estimating the CPTs

In this part, you are given the structure of a network and need to compute the conditional probability tables (CPTs). For this assignment, we will assume that there is no missing data, so this is just a matter of counting. For parts 1 and 2, simply use the raw counts (just as Nilsson does).

Your Tasks:

Part 2: Computing the Score of a Network

Pretend that you have the network already defined for you. That is, you already know the network structure and the conditional probability tables. How would you know whether it is a good network? This first part is about computing the score of a given network so that you can compare different networks.

We can compute the score of a network in any way we want, but a good scoring metric would take into account two things: (1) probability of the data given the network; (2) complexity of the network. For part 2, please use the scoring metric in Nilsson's chapter.

Your Tasks:

Part 3: Generating an Initial Network

Now you have the tools for searching through the space of network structures. However, like many other search problems, exhaustive search through the space of all possible network structures is not practical. In Part 4, you will need to implement some kind of greedy search, and as you should know by now, if you are not doing exhaustive search, your result will depend on how you choose the starting point. So in this part, you will think about how to choose an initial network. Possible candidates for an initial network are: Note that any network you choose (here and also throughout your search) cannot have directed cycles. Also note, from this point on, you may use bayesian correction if your network has some probability 0 events.

Your Tasks:

Part 4: Searching

Given the initial network in Part 3, implement a local greedy search to find a locally optimal network. (as described in Nilsson's Chapter).

Your Tasks:

Part 5: Experiment

In this part, you will do two experiments: one with Part 2, and another with either Part 3 or Part 4.

Your Tasks: