TY - JOUR
T1 - A critical analysis of image-based camera pose estimation techniques
AU - Xu, Meng
AU - Wang, Youchen
AU - Xu, Bin
AU - Zhang, Jun
AU - Ren, Jian
AU - Huang, Zhao
AU - Poslad, Stefan
AU - Xu, Pengfei
N1 - Funding information: This research was funded in part by a PhD scholarship jointly funded by the China Scholarship Council (CSC) and QMUL, and in part by Didi Chuxing and the Robotics and AI for Extreme Environments programme’s NCNR (National Centre for Nuclear Robotics) grant no. EP/R02572X/1.
PY - 2024/2/14
Y1 - 2024/2/14
N2 - Camera localization, together with the localization of objects within the camera's field of view, can benefit many computer vision applications, such as autonomous driving, robot navigation, and augmented reality (AR). After decades of progress, camera localization, also called camera pose estimation, can compute the 6DoF pose of a camera from a given image, with respect to other images in a sequence or in other formats. Structure feature-based localization methods have achieved great success when integrated with image matching or with a coordinate regression stage. Absolute and relative pose regression methods based on transfer learning support end-to-end localization by directly regressing a camera pose, but achieve lower accuracy. Despite the rapid development of multiple branches in this area, a comprehensive, in-depth and comparative analysis that summarizes, classifies and compares structure feature-based and regression-based camera localization methods is lacking. Existing surveys either focus on larger SLAM (Simultaneous Localization and Mapping) systems or cover only part of the camera localization pipeline, and lack detailed comparisons and descriptions of the methods and datasets used, neural network designs such as loss functions, and input formats. In this survey, we first introduce specific application areas and the evaluation metrics for camera pose estimation according to different sub-tasks (learning-based 2D-2D, 2D-3D, and 3D-3D tasks). Then, we review common structure feature-based camera pose estimation approaches, as well as absolute and relative pose regression approaches, critically modelling the methods to inspire further improvements in their algorithms, such as loss functions and neural network structures. Furthermore, we summarize the popular datasets used for camera localization and compare the quantitative and qualitative results of these methods using detailed performance metrics. Finally, we discuss future research possibilities and applications.
AB - Camera localization, together with the localization of objects within the camera's field of view, can benefit many computer vision applications, such as autonomous driving, robot navigation, and augmented reality (AR). After decades of progress, camera localization, also called camera pose estimation, can compute the 6DoF pose of a camera from a given image, with respect to other images in a sequence or in other formats. Structure feature-based localization methods have achieved great success when integrated with image matching or with a coordinate regression stage. Absolute and relative pose regression methods based on transfer learning support end-to-end localization by directly regressing a camera pose, but achieve lower accuracy. Despite the rapid development of multiple branches in this area, a comprehensive, in-depth and comparative analysis that summarizes, classifies and compares structure feature-based and regression-based camera localization methods is lacking. Existing surveys either focus on larger SLAM (Simultaneous Localization and Mapping) systems or cover only part of the camera localization pipeline, and lack detailed comparisons and descriptions of the methods and datasets used, neural network designs such as loss functions, and input formats. In this survey, we first introduce specific application areas and the evaluation metrics for camera pose estimation according to different sub-tasks (learning-based 2D-2D, 2D-3D, and 3D-3D tasks). Then, we review common structure feature-based camera pose estimation approaches, as well as absolute and relative pose regression approaches, critically modelling the methods to inspire further improvements in their algorithms, such as loss functions and neural network structures. Furthermore, we summarize the popular datasets used for camera localization and compare the quantitative and qualitative results of these methods using detailed performance metrics. Finally, we discuss future research possibilities and applications.
KW - Camera pose regression
KW - Structure feature-based localization
KW - Absolute pose regression
KW - Relative pose regression
U2 - 10.1016/j.neucom.2023.127125
DO - 10.1016/j.neucom.2023.127125
M3 - Article
SN - 0925-2312
VL - 570
JO - Neurocomputing
JF - Neurocomputing
M1 - 127125
ER -