TY - JOUR
T1 - A semi-supervised approach for dimensionality reduction with distributional similarity
AU - Zheng, Feng
AU - Song, Zhan
AU - Shao, Ling
AU - Chung, Ronald
AU - Jia, Kui
AU - Wu, Xinyu
PY - 2013/3/1
Y1 - 2013/3/1
N2 - Semi-supervised learning has recently received considerable attention in machine learning. In this paper, we propose a novel diffusion-maps-based semi-supervised algorithm for dimensionality reduction, visualization and data representation. Unlike previous work, which uses only geometric information to construct the similarity metric, we introduce a distributional similarity metric that modifies the geometric relationships between samples. This metric is defined using the posterior probability over the labels of each sample, learned through the Expectation–Maximization (EM) algorithm. The Euclidean distance between points on the intrinsic manifold learned by the proposed method equals the label-dependent “diffusion distance” in the original space, modified by the distributional-similarity metric. Our algorithm preserves the local manifold structure while separating samples of different classes, thus facilitating classification. Encouraging experimental results on handwritten digits, the Yale face database, UCI data sets and the Weizmann data set show that the algorithm improves classification accuracy significantly.
KW - Diffusion maps
KW - Manifold learning
KW - Label information
KW - Expectation–Maximization
KW - Distributional similarity
UR - http://www.sciencedirect.com/science/article/pii/S092523121200776X
DO - 10.1016/j.neucom.2012.09.023
M3 - Article
SN - 0925-2312
VL - 103
SP - 210
EP - 221
JO - Neurocomputing
JF - Neurocomputing
ER -