TY - JOUR
T1 - Learning Computational Models of Video Memorability from fMRI Brain Imaging
AU - Han, Junwei
AU - Chen, Changyuan
AU - Shao, Ling
AU - Hu, Xintao
AU - Han, Jungong
AU - Liu, Tianming
N1 - Published online October 2014.
PY - 2015/8
Y1 - 2015/8
AB - Generally, different visual media are not equally memorable to the human brain. This paper explores a new direction: modeling the memorability of video clips and automatically predicting how memorable they are by learning from brain functional magnetic resonance imaging (fMRI). We propose a novel computational framework that integrates the power of low-level audiovisual features and brain activity decoding via fMRI. Initially, a user study is performed to create a ground-truth database for measuring video memorability, and a set of effective low-level audiovisual features is examined on this database. Then, human subjects’ brain fMRI data are acquired while they watch the video clips. The fMRI-derived features that convey the brain activity of memorizing videos are extracted using a universal brain reference system. Finally, because fMRI scanning is expensive and time-consuming, a computational model is learned on our benchmark dataset with the objective of maximizing the correlation between the low-level audiovisual features and the fMRI-derived features via joint subspace learning. The learned model can then automatically predict the memorability of videos without fMRI scans. Evaluations on publicly available image and video databases demonstrate the effectiveness of the proposed framework.
KW - Audiovisual features
KW - brain imaging
KW - semantic gap
KW - video memorability (VM)
UR - http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6919270&tag=1
DO - 10.1109/TCYB.2014.2358647
M3 - Article
SN - 2168-2267
VL - 45
SP - 1692
EP - 1703
JO - IEEE Transactions on Cybernetics
JF - IEEE Transactions on Cybernetics
IS - 8
ER -