TY - JOUR
T1 - A Quadruple Diffusion Convolutional Recurrent Network for Human Motion Prediction
AU - Men, Qianhui
AU - Ho, Edmond S. L.
AU - Shum, Hubert
AU - Leung, Howard
N1 - Funding Information:
Manuscript received June 26, 2020; revised September 23, 2020; accepted November 1, 2020. Date of publication November 13, 2020; date of current version September 3, 2021. This work was supported in part by the grants from the City University of Hong Kong (Project No. 9220077 and 9678139) and in part by the Royal Society (Ref: IES\R2\181024 and IES\R1\191147). This article was recommended by Associate Editor Z.-J. Zha. (Corresponding author: Edmond S. L. Ho.) Qianhui Men and Howard Leung are with the Department of Computer Science, City University of Hong Kong, Hong Kong, SAR, China.
Publisher Copyright:
© 1991-2012 IEEE.
PY - 2021/9
Y1 - 2021/9
N2 - Recurrent neural networks (RNNs) have become popular for human motion prediction thanks to their ability to capture temporal dependencies. However, they have limited capacity to model the complex spatial relationships in the human skeletal structure. In this work, we present a novel diffusion convolutional recurrent predictor for spatial and temporal movement forecasting, with multi-step random walks traversing bidirectionally along an adaptive graph to model the interdependency among body joints. In the temporal domain, existing methods rely on a single forward predictor whose produced motion tends to drift, which leads to error accumulation over time. We propose to supplement the forward predictor with a forward discriminator to alleviate such long-term motion drift under adversarial training. The solution is further enhanced by a backward predictor and a backward discriminator to effectively reduce the error, such that the system can also look into the past to improve the prediction at early frames. The two-way spatial diffusion convolutions and two-way temporal predictors together form a quadruple network. Furthermore, we train our framework by modeling the velocity from observed motion dynamics instead of static poses to predict future movements, which effectively reduces the discontinuity problem at early prediction. Our method outperforms the state of the art on both 3D and 2D datasets, including the Human3.6M, CMU Motion Capture and Penn Action datasets. The results also show that our method correctly predicts both high-dynamic and low-dynamic moving trends with less motion drift.
KW - human motion prediction
KW - body joint dynamics
KW - diffusion convolutions
KW - recurrent neural network
KW - bi-directional predictor
KW - Training
KW - Adaptation models
KW - Computational modeling
KW - Dynamics
KW - Hidden Markov models
KW - Bidirectional control
KW - Predictive models
UR - http://www.scopus.com/inward/record.url?scp=85097174931&partnerID=8YFLogxK
U2 - 10.1109/tcsvt.2020.3038145
DO - 10.1109/tcsvt.2020.3038145
M3 - Article
SN - 1051-8215
VL - 31
SP - 3417
EP - 3432
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 9
M1 - 9259055
ER -