A critical analysis of image-based camera pose estimation techniques

Meng Xu, Youchen Wang, Bin Xu, Jun Zhang, Jian Ren, Zhao Huang*, Stefan Poslad, Pengfei Xu

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

1 Citation (Scopus)

Abstract

Localizing a camera, and with it the objects within its field of view, can benefit many computer vision fields, such as autonomous driving, robot navigation, and augmented reality (AR). After decades of progress, camera localization, also called camera pose estimation, can compute the 6DoF pose of a camera, and of objects in its view, from a given image with respect to other images in a sequence or to data in other formats. Structure feature-based localization methods have achieved great success when integrated with image matching or with a coordinate regression stage. Absolute and relative pose regression methods using transfer learning support end-to-end localization by directly regressing a camera pose, but they are less accurate. Despite the rapid development of multiple branches in this area, a comprehensive, in-depth analysis that summarizes, classifies and compares structure feature-based and regression-based camera localization methods is lacking. Existing surveys either focus on larger SLAM (Simultaneous Localization and Mapping) systems or cover only part of the camera localization methods, and they lack detailed comparisons and descriptions of the methods and datasets used, of neural network designs such as loss designs, and of input formats. In this survey, we first introduce specific application areas and the evaluation metrics for camera pose localization according to different sub-tasks (learning-based 2D-2D, 2D-3D, and 3D-3D tasks). Then, we review common structure feature-based camera pose estimation approaches and absolute and relative pose regression approaches, critically modelling the methods to inspire further improvements in their algorithms, such as in loss functions and neural network structures. Furthermore, we summarize the popular datasets used for camera localization and compare the quantitative and qualitative results of these methods with detailed performance metrics. Finally, we discuss future research possibilities and applications.
Original language: English
Article number: 127125
Number of pages: 27
Journal: Neurocomputing
Volume: 570
Early online date: 13 Dec 2023
Publication status: Published - 14 Feb 2024
Externally published: Yes
