A look at Boosting

Taken from the paper "A decision-theoretic generalization of on-line learning and an application to boosting" by Freund and Schapire (1995).

We are interested in Section 4 onward.

Language:

X is the domain.

A concept is a Boolean function $c : X \to \{0, 1\}$.

A concept class $\mathcal{C}$ is a collection of concepts.

The learner has access to an oracle providing labeled examples of the form $(x, c(x))$, where $x$ is chosen randomly according to some fixed but unknown and arbitrary distribution $D$ on $X$, and $c \in \mathcal{C}$ is the target concept.

After some time, the learner outputs a hypothesis $h : X \to [0, 1]$.

The error of the hypothesis is $E_{x \sim D}(|h(x) - c(x)|)$, where $x$ follows distribution $D$.
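Since D is unknown, the error can only be estimated from samples. A minimal sketch of a Monte Carlo estimate follows. The names `estimate_error`, `target`, `hypothesis` and `uniform_sampler` are hypothetical, and the uniform distribution on [0, 1) is an assumption made purely for illustration.

```python
import random


def estimate_error(h, c, sample_x, n_samples=10_000, seed=0):
    """Monte Carlo estimate of E_{x~D} |h(x) - c(x)|."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = sample_x(rng)              # draw x according to D
        total += abs(h(x) - c(x))      # loss on this example
    return total / n_samples


# Assumed setup: D is uniform on [0, 1), the target concept is a threshold at 0.5.
def target(x):
    return 1 if x >= 0.5 else 0


# A hypothesis that places the threshold slightly too high.
def hypothesis(x):
    return 1 if x >= 0.6 else 0


def uniform_sampler(rng):
    return rng.random()


err = estimate_error(hypothesis, target, uniform_sampler)
print(f"estimated error: {err:.3f}")
```

Here the two functions disagree exactly on [0.5, 0.6), so the true error is 0.1 and the estimate should land close to it.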

A strong PAC-learning algorithm is an algorithm that, given $\epsilon, \delta > 0$ and access to random examples, outputs with probability $1 - \delta$ a hypothesis with error at most $\epsilon$.

A weak learner is one that satisfies the same conditions, but only for $\epsilon \ge 1/2 - \gamma$, where $\gamma > 0$.

In other words, it only has to do a little bit better than random guessing.
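As an illustration of what a candidate weak learner can look like, here is a sketch of a one-dimensional decision stump trained on weighted examples. The stump is an assumption (the notes do not name a particular weak learner), and whether it truly meets the weak-learning condition depends on the concept class. The names `stump_predict` and `train_stump` and the toy data are hypothetical.

```python
def stump_predict(x, threshold, flip):
    """Predict 1 when x >= threshold (or the opposite label if flip is True)."""
    pred = 1 if x >= threshold else 0
    return 1 - pred if flip else pred


def train_stump(xs, ys, p):
    """Return (weighted_error, threshold, flip) for the best stump under weights p."""
    best = None
    for threshold in sorted(set(xs)):
        for flip in (False, True):
            err = sum(w * abs(stump_predict(x, threshold, flip) - y)
                      for x, y, w in zip(xs, ys, p))
            if best is None or err < best[0]:
                best = (err, threshold, flip)
    return best


# Hypothetical toy data: six points on a line, with one "noisy" label at x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 1, 0, 1, 1]
p = [1 / 6] * 6                      # uniform distribution over the examples

err, threshold, flip = train_stump(xs, ys, p)
gamma = 0.5 - err                    # the "edge" over random guessing
print(f"threshold={threshold}, flip={flip}, error={err:.3f}, gamma={gamma:.3f}")
```

For any threshold, either the stump or its flipped version has weighted error at most 1/2, so the best stump is never worse than guessing. The weak-learning condition asks for more: an edge $\gamma$ that stays strictly positive.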

Boosting is a method for turning weak learners into strong learners.

The Boosting algorithm proceeds as follows:

Input: the $N$ labeled examples $(x_1, y_1), \ldots, (x_N, y_N)$, the distribution $D$ over the examples, the weak learning algorithm, and the number of iterations $T$.

Initialize the weight vector: $w_i^1 = D(i)$ for $i = 1, \ldots, N$.

Do for $t = 1, 2, \ldots, T$:

  1. Set

    $$p^t = \frac{w^t}{\sum_{i=1}^{N} w_i^t}$$

  2. Call the weak learner, providing it with the distribution $p^t$, and get back the hypothesis $h_t : X \to [0, 1]$.
  3. Calculate the error of $h_t$, where $y_i$ is the label of example $x_i$:

    $$\epsilon_t = \sum_{i=1}^{N} p_i^t \, |h_t(x_i) - y_i|$$

  4. Set $\beta_t = \epsilon_t / (1 - \epsilon_t)$.
  5. Reweight:

    $$w_i^{t+1} = w_i^t \, \beta_t^{1 - |h_t(x_i) - y_i|}$$

Output the hypothesis:

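$$h_f(x) = \begin{cases} 1 & \text{if } \sum_{t=1}^{T} \left(\log \frac{1}{\beta_t}\right) h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \log \frac{1}{\beta_t}, \\ 0 & \text{otherwise.} \end{cases}$$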
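The whole procedure can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation. The weak learner here simply picks the best hypothesis from a small hand-written pool (`POOL` and `pool_learner` are hypothetical), and the clamping of the error away from 0 and 1 is an added numerical safeguard that is not part of the algorithm as stated.

```python
import math


def adaboost(xs, ys, D, weak_learn, T):
    """Run T rounds of the boosting steps listed above; return h_f."""
    w = list(D)                                    # w_i^1 = D(i)
    hyps, betas = [], []
    for _ in range(T):
        total = sum(w)
        p = [wi / total for wi in w]               # step 1: normalize the weights
        h = weak_learn(xs, ys, p)                  # step 2: get h_t from the weak learner
        loss = [abs(h(x) - y) for x, y in zip(xs, ys)]
        eps = sum(pi * li for pi, li in zip(p, loss))   # step 3: error of h_t
        # Assumption (not in the notes): clamp eps so beta and log(1/beta) stay finite.
        eps = min(max(eps, 1e-10), 1 - 1e-10)
        beta = eps / (1 - eps)                     # step 4
        # Step 5: shrink the weights of examples that h_t got right.
        w = [wi * beta ** (1 - li) for wi, li in zip(w, loss)]
        hyps.append(h)
        betas.append(beta)

    def h_f(x):
        # Weighted vote: output 1 when the log(1/beta_t)-weighted sum
        # reaches half of the total vote weight.
        vote = sum(math.log(1 / b) * h(x) for b, h in zip(betas, hyps))
        return 1 if vote >= 0.5 * sum(math.log(1 / b) for b in betas) else 0

    return h_f


# Hypothetical weak learner: pick, from a small fixed pool, the hypothesis
# with the smallest weighted error under p.
POOL = [
    lambda x: 1,                    # always predict 1
    lambda x: 1 if x < 3 else 0,
    lambda x: 1 if x >= 5 else 0,
]


def pool_learner(xs, ys, p):
    def weighted_error(h):
        return sum(pi * abs(h(x) - y) for x, y, pi in zip(xs, ys, p))
    return min(POOL, key=weighted_error)


# Hypothetical toy data: no single pool member labels all six points correctly.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, 0, 0, 1, 1]
D = [1 / 6] * 6
h_f = adaboost(xs, ys, D, pool_learner, T=3)
predictions = [h_f(x) for x in xs]
print(predictions)
```

On this toy data each pool member misclassifies two of the six points, yet the three-round weighted vote labels all of them correctly. The example is meant to show the reweighting at work: each round puts more relative weight on the examples the previous hypothesis got wrong.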



Richard Waterman
Fri Mar 5 08:09:37 EST 1999