A scalable boosting learner using adaptive sampling

Document Type

Conference Proceeding

Publication Date

1-1-2015

Abstract

Sampling is an important technique for parameter estimation and hypothesis testing that is widely used in statistical analysis, machine learning, and knowledge discovery; it is particularly useful in data mining when the training data set is huge. In this paper, we present a new sampling-based method for learning by Boosting. We show how the adaptive sampling method of [2] for estimating classifier accuracy can be used to build an efficient ensemble learning method by Boosting, and we provide a preliminary theoretical analysis of the proposed sampling-based Boosting method. Empirical studies on four datasets from the UC Irvine Machine Learning Repository show that our method typically uses much smaller sample sizes (and is thus much more efficient) while maintaining competitive prediction accuracy, compared with Watanabe's sampling-based Boosting learner MadaBoost.
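The abstract's core idea is sequential (adaptive) estimation of a classifier's accuracy: draw test examples one at a time and stop as soon as a confidence bound certifies the running estimate. The sketch below illustrates that general idea with a Hoeffding-style stopping rule; the function name, the specific relative-error condition, and the parameters are illustrative assumptions, not the exact rule of [2] or of the paper's method.

```python
import math

def adaptive_accuracy_estimate(is_correct, epsilon=0.05, delta=0.05, max_n=100_000):
    """Sequentially estimate a classifier's accuracy.

    is_correct: callable returning True iff the classifier is correct on
                one freshly drawn random test example.
    Stops once a Hoeffding confidence half-width falls below epsilon times
    the (floored) running accuracy estimate, so the sample size adapts to
    the data instead of being fixed in advance.  This stopping rule is an
    illustrative assumption, not the rule used in the paper.
    """
    correct = 0
    n = 0
    while n < max_n:
        n += 1
        correct += int(is_correct())
        mean = correct / n
        # Hoeffding: P(|mean - p| > h) <= 2 * exp(-2 * n * h^2)
        half_width = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
        # Relative-error stop; floor the estimate at epsilon to avoid
        # an unbounded requirement when the observed accuracy is ~0.
        if half_width <= epsilon * max(mean, epsilon):
            break
    return mean, n
```

For a classifier that is always correct, the estimate converges to 1.0 and the loop stops after a few hundred samples rather than scanning an entire large test set; a less accurate classifier triggers proportionally more sampling before the bound is met.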

Publication Source (Journal or Book title)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

First Page

101

Last Page

111
