Abstract
Apparent first impression prediction has made great progress with deep neural networks. A current trend is multimodal fusion, in which features from different sources are combined to improve prediction accuracy. In real-life scenarios, however, it is often hard to gather features from multiple sources such as audio and background information, so it is desirable to improve prediction accuracy from a single source rather than multiple sources. This study developed a method to predict personality traits from a single source of information, i.e., facial information. Specifically, a pre-trained deep convolutional neural network was employed to extract the emotional expression frame by frame from each video clip, and the resulting sequence was fed into a Long Short-Term Memory (LSTM) model to predict scores on the “Big Five” personality traits. In parallel, a model based on static apparent facial features was trained, and finally the facial features and facial expressions were fused with demographic data (age and gender). The proposed system was tested on the ChaLearn dataset and achieved an accuracy of 90.67%, ranking just below the top five entries of the ChaLearn competition. The results also show that dynamic emotional patterns have a positive impact on first impression prediction, especially on extraversion.
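As a rough illustration of the pipeline the abstract describes, the sketch below combines a per-frame emotion sequence (as would come from a pre-trained CNN emotion classifier), a static facial-feature vector, and demographic inputs into a single Big Five regressor. All layer sizes, feature dimensionalities, and the fusion head are assumptions for the sketch, not details taken from the paper.

```python
import torch
import torch.nn as nn

class EmotionLSTMFusion(nn.Module):
    """Sketch of the described pipeline: per-frame emotion features
    summarized by an LSTM, fused with static facial features and
    demographics to regress the five trait scores.
    Dimensions are illustrative assumptions."""

    def __init__(self, emo_dim=7, static_dim=128, demo_dim=2, hidden_dim=64):
        super().__init__()
        # Temporal branch: a sequence of frame-level emotion vectors
        # is summarized by an LSTM.
        self.lstm = nn.LSTM(emo_dim, hidden_dim, batch_first=True)
        # Fusion head: concatenated temporal, static, and demographic
        # features are mapped to the five trait scores.
        self.head = nn.Sequential(
            nn.Linear(hidden_dim + static_dim + demo_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 5),   # Big Five traits
            nn.Sigmoid(),       # trait scores in [0, 1], as in ChaLearn
        )

    def forward(self, emo_seq, static_feat, demo):
        # emo_seq: (batch, frames, emo_dim) per-frame emotion features
        _, (h_n, _) = self.lstm(emo_seq)
        fused = torch.cat([h_n[-1], static_feat, demo], dim=1)
        return self.head(fused)

# Example: a batch of 8 clips, 100 frames each.
model = EmotionLSTMFusion()
scores = model(torch.rand(8, 100, 7), torch.rand(8, 128), torch.rand(8, 2))
print(scores.shape)  # torch.Size([8, 5])
```

Taking the last LSTM hidden state as the clip-level summary is one simple choice for encoding the dynamic emotional pattern; the paper's actual temporal aggregation and fusion scheme may differ.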
| Original language | English |
| --- | --- |
| Article number | 125114 |
| Pages (from-to) | 1-11 |
| Number of pages | 11 |
| Journal | Expert Systems with Applications |
| Volume | 259 |
| Early online date | 23 Aug 2024 |
| DOIs | |
| Publication status | E-pub ahead of print - 23 Aug 2024 |
Keywords
- First impression prediction
- Emotion expression patterns
- Deep learning
- Long short-term memory