Feature matching aims to establish correspondences between two images. Detector-free methods achieve impressive results, but they often emphasize global features and neglect regions with subtle texture, which yields few matches in weakly textured areas. This paper proposes a feature-matching method based on local window aggregation that balances global features against local texture variation to produce more accurate matches, especially in weakly textured regions. The method first applies a local window aggregation module, which uses window attention to suppress irrelevant interference, and then applies global attention, generating coarse and fine-grained feature maps. A matching module processes these maps: it first obtains coarse matches by the nearest-neighbor principle, then refines them on the fine-grained maps through local window refinement. Experiments show that, under the same training conditions, our method surpasses state-of-the-art techniques in pose estimation, homography estimation, and visual localization.
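As a rough illustration of the coarse matching step, the sketch below pairs descriptors from two coarse feature maps by nearest neighbor under cosine similarity. It is not the paper's implementation: the function names, the mutual (two-way) consistency check, and the `min_sim` threshold are assumptions made for this example, since the abstract states only that coarse matches follow the nearest-neighbor principle.

```python
from math import sqrt


def l2_normalize(v):
    """Scale a descriptor to unit length so dot products are cosine similarities."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def mutual_nearest_neighbors(feats_a, feats_b, min_sim=0.0):
    """Return (i, j, similarity) for descriptors that pick each other as nearest.

    feats_a, feats_b: lists of equal-length descriptors, one per coarse cell.
    min_sim: hypothetical threshold below which a pair is discarded.
    """
    a = [l2_normalize(v) for v in feats_a]
    b = [l2_normalize(v) for v in feats_b]
    if not a or not b:
        return []

    # Dense similarity matrix: sim[i][j] = cosine similarity of a[i] and b[j].
    sim = [[sum(x * y for x, y in zip(u, v)) for v in b] for u in a]

    # Nearest neighbor in B for every descriptor in A, and the reverse.
    best_b_for_a = [max(range(len(b)), key=lambda j: row[j]) for row in sim]
    best_a_for_b = [
        max(range(len(a)), key=lambda i: sim[i][j]) for j in range(len(b))
    ]

    # Keep a pair only if each side selects the other (assumed consistency check).
    matches = []
    for i, j in enumerate(best_b_for_a):
        if best_a_for_b[j] == i and sim[i][j] >= min_sim:
            matches.append((i, j, sim[i][j]))
    return matches


if __name__ == "__main__":
    feats_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    feats_b = [[0.0, 2.0], [3.0, 0.1]]
    for i, j, s in mutual_nearest_neighbors(feats_a, feats_b):
        print(f"A[{i}] <-> B[{j}] (similarity {s:.3f})")
```

In a coarse-to-fine pipeline of the kind described, each surviving pair would then be refined within a small window on the fine-grained feature maps; that refinement is not shown here.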
Applied sciences; Computer science; Network modeling