Abstract
Moving object recognition (MOR) is an important but challenging problem in computer vision: the aim is to recognize moving objects in video data. Convolutional neural networks (CNNs) have been used extensively for image recognition and video analysis. Recently, 3D-CNNs, which contain 3D convolution layers, were proposed to address MOR by extracting spatiotemporal features. In this paper, a multi-view 3D-CNN (MV3D-CNN) is proposed for MOR. The model combines 3D-CNNs with multi-view (MV) learning. Because MV learning can obtain view-related features from videos captured by different cameras, the proposed model can extract more representative features. The model also contains a view-pooling layer that fuses the feature information from the preceding layers. The proposed MV3D-CNN is applied to real-world moving vehicle recognition and sign language recognition tasks, and the experimental results show that it performs well.
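As a rough illustration of what a view-pooling layer does, the sketch below fuses per-view feature volumes into a single volume. The abstract does not state which pooling operation the paper uses; the element-wise maximum over views shown here is an assumption borrowed from common multi-view CNN designs, and the function name `view_pool`, the tensor shapes, and the random features are all hypothetical.

```python
import numpy as np


def view_pool(view_features: np.ndarray) -> np.ndarray:
    """Fuse per-view features with an element-wise max over the view axis.

    Assumption: max pooling across views. The paper's actual view-pooling
    operation is not specified in the abstract.

    view_features: array of shape (num_views, channels, frames, height, width)
    returns:       array of shape (channels, frames, height, width)
    """
    if view_features.ndim != 5:
        raise ValueError("expected shape (num_views, channels, frames, height, width)")
    return view_features.max(axis=0)


# Hypothetical spatiotemporal features from two camera views, standing in
# for the outputs of per-view 3D convolution branches.
rng = np.random.default_rng(0)
features = rng.standard_normal((2, 4, 3, 5, 5))

fused = view_pool(features)
print(fused.shape)
```

The fused volume has the same shape as a single view's features, so the layers that follow can be written as if there were only one input stream.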
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 3827-3835 |
| Number of pages | 9 |
| Journal | Neural Computing and Applications |
| Volume | 28 |
| Issue number | 12 |
| Early online date | 23 Mar 2016 |
| Publication status | Published - Dec 2017 |
Keywords
- Moving object recognition
- Multi-view learning
- 3D convolutional neural networks
- Feature extraction
- Deep learning