COMBOOD: A Semiparametric Approach for Detecting Out-of-distribution Data for Image Classification
Document Type
Conference Proceeding
Publication Date
1-1-2024
Abstract
Identifying out-of-distribution (OOD) data at inference time is crucial for many machine learning applications, especially for automation. We present COMBOOD, a novel unsupervised semiparametric framework for OOD detection in image recognition. Our framework combines signals from two distance metrics, nearest-neighbor and Mahalanobis, to derive a confidence score that an inference point is out-of-distribution. The former provides a non-parametric approach to OOD detection. The latter provides a parametric, simple, yet effective method for detecting OOD data points, especially in the far-OOD scenario, where the inference point is far from the training data set in the embedding space. However, the performance of the Mahalanobis metric is not satisfactory in the near-OOD scenarios that arise in practical situations. Our COMBOOD framework combines the two signals in a semiparametric setting to provide a confidence score that is accurate in both the near-OOD and far-OOD scenarios. We show experimental results with the COMBOOD framework for different types of feature extraction strategies. We demonstrate experimentally that COMBOOD outperforms state-of-the-art OOD detection methods in terms of accuracy on the OpenOOD benchmark datasets (both version 1 and the most recent version 1.5, for both far-OOD and near-OOD) as well as on the documents dataset. On a majority of the benchmark datasets, the improvements in accuracy resulting from the COMBOOD framework are statistically significant. COMBOOD scales linearly with the size of the embedding space, making it ideal for many real-life applications.
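The abstract does not spell out how the two distance signals are combined, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes embeddings are already extracted as NumPy arrays, fits class means with a shared covariance for the Mahalanobis signal, uses the distance to the k-th nearest training point for the non-parametric signal, and mixes the two standardized signals with a hypothetical weight `alpha`. All function names and parameters here are illustrative assumptions.

```python
import numpy as np


def fit_gaussian(train_emb, train_labels):
    """Class means and a shared (tied) precision matrix for the Mahalanobis signal."""
    classes = np.unique(train_labels)
    means = np.stack([train_emb[train_labels == c].mean(axis=0) for c in classes])
    # Center every training point on its own class mean.
    centered = train_emb - means[np.searchsorted(classes, train_labels)]
    cov = centered.T @ centered / len(train_emb)
    cov += 1e-6 * np.eye(cov.shape[0])  # small ridge for numerical stability
    return means, np.linalg.inv(cov)


def mahalanobis_score(x, means, precision):
    """Parametric signal: distance to the closest class mean."""
    diff = x[:, None, :] - means[None, :, :]  # shape (n, classes, dim)
    d2 = np.einsum("ncd,de,nce->nc", diff, precision, diff)
    return np.sqrt(d2.min(axis=1))


def knn_score(x, train_emb, k=5):
    """Non-parametric signal: Euclidean distance to the k-th nearest training point."""
    d = np.linalg.norm(x[:, None, :] - train_emb[None, :, :], axis=2)
    return np.partition(d, k - 1, axis=1)[:, k - 1]


def combined_score(x, train_emb, train_labels, k=5, alpha=0.5):
    """Higher values suggest the point is more likely out-of-distribution.

    The weighted sum of standardized signals is an assumption made for
    illustration; the source does not state the actual combination rule.
    """
    means, precision = fit_gaussian(train_emb, train_labels)
    # Standardize each signal using its distribution on the training set.
    m_train = mahalanobis_score(train_emb, means, precision)
    k_train = knn_score(train_emb, train_emb, k=k + 1)  # k + 1 skips the self-match
    m = (mahalanobis_score(x, means, precision) - m_train.mean()) / m_train.std()
    n = (knn_score(x, train_emb, k=k) - k_train.mean()) / k_train.std()
    return alpha * m + (1 - alpha) * n
```

Calling `combined_score(test_emb, train_emb, train_labels)` returns one score per test embedding; a threshold chosen on held-out in-distribution data would turn the score into an OOD decision. The brute-force neighbor search is kept for readability and would normally be replaced by an index structure for large training sets.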
Publication Source (Journal or Book title)
Proceedings of the 2024 SIAM International Conference on Data Mining, SDM 2024
First Page
643
Last Page
651
Recommended Citation
Rajasekaran, M., Sajol, M., Berglind, F., Mukhopadhyay, S., & Das, K. (2024). COMBOOD: A Semiparametric Approach for Detecting Out-of-distribution Data for Image Classification. Proceedings of the 2024 SIAM International Conference on Data Mining, SDM 2024, 643-651. Retrieved from https://repository.lsu.edu/enviro_sciences_pubs/374