\title{Support Vector Machine Active Learning \\ with Applications
to Text Classification} 
\author{\name Simon Tong \email simon.tong@cs.stanford.edu \\
\name Daphne Koller \email koller@cs.stanford.edu \\
\addr Computer Science Department \\ 
Stanford University \\
Stanford, CA 94305-9010, USA
}

\editor{Leslie Pack Kaelbling}

\maketitle

\begin{abstract}%
Support vector machines have met with significant success in numerous
real-world learning tasks.  However, like most machine learning
algorithms, they are generally applied using a randomly selected
training set classified in advance.  In many settings, we also have
the option of using {\em pool-based active learning}. Instead of using
a randomly selected training set, the learner has access to a pool of
unlabeled instances and can request the labels for some number of
them. We introduce a new algorithm for performing active learning
with support vector machines, i.e., an algorithm for choosing which
instances to request next. We provide a theoretical motivation for the
algorithm using the notion of a {\em version space}.  We present
experimental results showing that employing our active learning method
can significantly reduce the need for labeled training instances in
both the standard inductive and transductive settings.
