Learning scale awareness in keypoint extraction and description

Establishing a set of point-to-point correspondences in pixel space is an essential yet challenging task in computer vision, because accurate and robust recovery of relative camera motion depends on it. Although multi-scale design has been used with significant success in computer vision tasks such as object detection and semantic segmentation, it has not been fully exploited in learning-based image matching. In this work, we explore a scale-aware learning approach to finding pixel-level correspondences, based on the intuition that keypoints need to be extracted and described at an appropriate scale. With that insight, we propose a novel scale-aware network and develop a new fusion scheme that derives high-consistency response maps and high-precision descriptions. We also revise the Second Order Similarity Regularization (SOSR) to make it more effective for the end-to-end image matching network, which leads to significant improvement in local feature descriptions. Experiments on multiple datasets demonstrate that our approach performs better than state-of-the-art methods under multiple criteria.

Publication Source (Journal or Book title)

Pattern Recognition
