A scalable boosting learner using adaptive sampling
Document Type
Conference Proceeding
Publication Date
1-1-2015
Abstract
Sampling is an important technique for parameter estimation and hypothesis testing, widely used in statistical analysis, machine learning, and knowledge discovery. Sampling is particularly useful in data mining when the training data set is huge. In this paper, we present a new sampling-based method for learning by Boosting. We show how to utilize the adaptive sampling method in [2] for estimating classifier accuracy to build an efficient ensemble learning method by Boosting. We provide a preliminary theoretical analysis of the proposed sampling-based Boosting method. Empirical studies with four datasets from the UC Irvine ML repository show that our method typically uses a much smaller sample size (and is thus much more efficient) while maintaining competitive prediction accuracy compared with Watanabe's sampling-based Boosting learner MadaBoost.
Publication Source (Journal or Book title)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
First Page
101
Last Page
111
Recommended Citation
Chen, J., Burleigh, S., Chennupati, N., & Gudapati, B. (2015). A scalable boosting learner using adaptive sampling. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9384, 101-111. https://doi.org/10.1007/978-3-319-25252-0_11