TY - JOUR
T1 - A Robust and Scalable Visual Category and Action Recognition System using Kernel Discriminant Analysis with Spectral Regression
AU - Tahir, Muhammad
AU - Yan, Fei
AU - Koniusz, Peter
AU - Awais, Muhammad
AU - Barnard, Mark
AU - Mikolajczyk, Krystian
AU - Bouridane, Ahmed
AU - Kittler, Josef
PY - 2013
Y1 - 2013
N2 - Visual concept detection and action recognition are among the most important tasks in content-based multimedia information retrieval (CBMIR) technology. They aim at annotating images using a vocabulary defined by a set of concepts of interest, including scene types (mountains, snow, etc.) or human actions (phoning, playing instrument). This paper describes our system in the ImageCLEF@ICPR10, Pascal VOC 08 Visual Concept Detection and Pascal VOC 10 Action Recognition Challenges. The proposed system ranked first in these large-scale tasks when evaluated independently by the organizers. The proposed system involves state-of-the-art local descriptor computation, vector quantisation via clustering, structured scene or object representation via localised histograms of vector codes, similarity measure for kernel construction and classifier learning. The main novelty is the classifier-level and kernel-level fusion using Kernel Discriminant Analysis and Spectral Regression (SR-KDA) with RBF Chi-Squared kernels obtained from various image descriptors. The distinctiveness of the proposed method is also assessed experimentally using a video benchmark: the Mediamill Challenge along with benchmarks from ImageCLEF@ICPR10, Pascal VOC 10 and Pascal VOC 08. From the experimental results, it can be concluded that the presented system consistently yields significant performance gains when compared with the state-of-the-art methods. The other strong point is the introduction of SR-KDA in the classification stage where the time complexity scales linearly with respect to the number of concepts and the main computational complexity is independent of the number of categories.
AB - Visual concept detection and action recognition are among the most important tasks in content-based multimedia information retrieval (CBMIR) technology. They aim at annotating images using a vocabulary defined by a set of concepts of interest, including scene types (mountains, snow, etc.) or human actions (phoning, playing instrument). This paper describes our system in the ImageCLEF@ICPR10, Pascal VOC 08 Visual Concept Detection and Pascal VOC 10 Action Recognition Challenges. The proposed system ranked first in these large-scale tasks when evaluated independently by the organizers. The proposed system involves state-of-the-art local descriptor computation, vector quantisation via clustering, structured scene or object representation via localised histograms of vector codes, similarity measure for kernel construction and classifier learning. The main novelty is the classifier-level and kernel-level fusion using Kernel Discriminant Analysis and Spectral Regression (SR-KDA) with RBF Chi-Squared kernels obtained from various image descriptors. The distinctiveness of the proposed method is also assessed experimentally using a video benchmark: the Mediamill Challenge along with benchmarks from ImageCLEF@ICPR10, Pascal VOC 10 and Pascal VOC 08. From the experimental results, it can be concluded that the presented system consistently yields significant performance gains when compared with the state-of-the-art methods. The other strong point is the introduction of SR-KDA in the classification stage where the time complexity scales linearly with respect to the number of concepts and the main computational complexity is independent of the number of categories.
KW - Action recognition from still images
KW - SIFT
KW - kernel discriminant analysis
KW - visual category recognition
U2 - 10.1109/TMM.2013.2264927
DO - 10.1109/TMM.2013.2264927
M3 - Article
SN - 1520-9210
VL - 15
SP - 1653
EP - 1664
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
IS - 7
ER -