TY - JOUR
T1 - Learning Object-to-Class Kernels for Scene Classification
AU - Zhang, Lei
AU - Zhen, Xiantong
AU - Shao, Ling
PY - 2014/8
Y1 - 2014/8
N2 - High-level image representations have drawn increasing attention in visual recognition, e.g., scene classification, since the invention of the object bank. The object bank represents an image as a response map of a large number of pretrained object detectors and has achieved superior performance for visual recognition. In this paper, based on the object bank representation, we propose the object-to-class (O2C) distances to model scene images. In particular, four variants of O2C distances are presented, and with the O2C distances, we can represent the images using the object bank by lower-dimensional but more discriminative spaces, called distance spaces, which are spanned by the O2C distances. Due to the explicit computation of O2C distances based on the object bank, the obtained representations can possess more semantic meanings. To combine the discriminant ability of the O2C distances to all scene classes, we further propose to kernalize the distance representation for the final classification. We have conducted extensive experiments on four benchmark data sets, UIUC-Sports, Scene−15, MIT Indoor, and Caltech−101, which demonstrate that the proposed approaches can significantly improve the original object bank approach and achieve the state-of-the-art performance.
AB - High-level image representations have drawn increasing attention in visual recognition, e.g., scene classification, since the invention of the object bank. The object bank represents an image as a response map of a large number of pretrained object detectors and has achieved superior performance for visual recognition. In this paper, based on the object bank representation, we propose the object-to-class (O2C) distances to model scene images. In particular, four variants of O2C distances are presented, and with the O2C distances, we can represent the images using the object bank by lower-dimensional but more discriminative spaces, called distance spaces, which are spanned by the O2C distances. Due to the explicit computation of O2C distances based on the object bank, the obtained representations can possess more semantic meanings. To combine the discriminant ability of the O2C distances to all scene classes, we further propose to kernalize the distance representation for the final classification. We have conducted extensive experiments on four benchmark data sets, UIUC-Sports, Scene−15, MIT Indoor, and Caltech−101, which demonstrate that the proposed approaches can significantly improve the original object bank approach and achieve the state-of-the-art performance.
KW - Object bank
KW - scene classification
KW - object-to-class distances
KW - object filters
KW - kernels
UR - http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6825882&tag=1
U2 - 10.1109/TIP.2014.2328894
DO - 10.1109/TIP.2014.2328894
M3 - Article
VL - 23
SP - 3241
EP - 3253
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
SN - 1057-7149
IS - 8
ER -