Sparse Bayesian Learning and the Relevance Vector Machine
Michael E. Tipping
Journal of Machine Learning Research, 1(Jun):211-244, 2001.
Abstract
This paper introduces a general Bayesian framework for obtaining
sparse solutions to regression and classification tasks
utilising models linear in the parameters. Although this framework
is fully general, we illustrate our approach with a particular
specialisation that we denote the 'relevance vector machine' (RVM),
a model of identical functional form to the popular and
state-of-the-art 'support vector machine' (SVM). We demonstrate that
by exploiting a probabilistic Bayesian learning framework, we can
derive accurate prediction models which typically utilise
dramatically fewer basis functions than a comparable SVM while
offering a number of additional advantages. These include the
benefits of probabilistic predictions, automatic estimation of
'nuisance' parameters, and the facility to utilise arbitrary basis
functions (e.g. non-'Mercer' kernels).
We detail the Bayesian framework and associated learning algorithm
for the RVM, and give some illustrative examples of its application
along with some comparative benchmarks. We offer some explanation
for the exceptional degree of sparsity obtained, and discuss and
demonstrate some of the advantageous features, and potential
extensions, of Bayesian relevance learning.
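To make the idea of a sparse model that is linear in the parameters concrete, the following is a minimal regression sketch and not the paper's reference implementation. It assumes a Gaussian basis function centred on each training input plus a bias term, an independent zero-mean Gaussian prior on each weight with its own precision alpha, and the usual evidence-maximisation re-estimation of alpha and the noise precision beta. The names rbf_design and fit_sparse_bayes, the kernel width, the iteration count, and the pruning threshold are illustrative assumptions.

```python
import numpy as np


def rbf_design(X, centres, width):
    """Design matrix: a bias column followed by one Gaussian basis per centre."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.hstack([np.ones((X.shape[0], 1)), np.exp(-d2 / width ** 2)])


def fit_sparse_bayes(Phi, t, n_iter=500, prune_at=1e6):
    """Sparse Bayesian regression sketch (assumed setup, see lead-in).

    Returns the indices of the retained columns of Phi, the posterior mean
    and covariance of their weights, and the estimated noise precision.
    """
    N, M = Phi.shape
    alpha = np.ones(M)           # one prior precision per weight
    beta = 10.0 / np.var(t)      # initial noise precision (a heuristic choice)
    keep = np.arange(M)          # columns still in the model

    for _ in range(n_iter):
        P = Phi[:, keep]
        # Gaussian posterior over the retained weights.
        Sigma = np.linalg.inv(beta * P.T @ P + np.diag(alpha[keep]))
        mu = beta * Sigma @ P.T @ t
        # gamma_i measures how well-determined weight i is by the data.
        gamma = np.clip(1.0 - alpha[keep] * np.diag(Sigma), 1e-12, 1.0)
        alpha[keep] = gamma / (mu ** 2 + 1e-12)
        resid = t - P @ mu
        beta = (N - gamma.sum()) / (resid @ resid + 1e-12)
        # Weights whose precision grows very large are effectively zero.
        keep = keep[alpha[keep] < prune_at]
        if keep.size == 0:
            break

    # Recompute the posterior for the final set of retained columns.
    P = Phi[:, keep]
    Sigma = np.linalg.inv(beta * P.T @ P + np.diag(alpha[keep]))
    mu = beta * Sigma @ P.T @ t
    return keep, mu, Sigma, beta


def predict(Phi_new, keep, mu, Sigma, beta):
    """Predictive mean and variance for rows of a new design matrix."""
    P = Phi_new[:, keep]
    mean = P @ mu
    var = 1.0 / beta + np.einsum("ij,jk,ik->i", P, Sigma, P)
    return mean, var
```

Under these assumptions, the columns left in keep (other than the bias) play the role of the 'relevance vectors', and predict returns a variance alongside each mean, which is the kind of probabilistic prediction the abstract refers to.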