Emerging multisource data offer a promising route to breakthroughs in remaining useful life (RUL) prediction. Because industrial sites are diverse and engineering systems are complex, much of the degradation information about machinery is hidden in multitype data, which makes it challenging to adequately capture the complex features that jointly affect RUL. To this end, we propose an interactive attention-based deep spatio-temporal network that effectively fuses vibration waveforms and time-varying operating signals. Specifically, the spatio-temporal structure of the proposed model mines long-term dependencies and local spatial information from raw multisource data simultaneously. An interactive attention mechanism dynamically weights the contributions of the features extracted from the different sources. Furthermore, a modified mean absolute percentage error criterion is designed for training to account for the inherent properties of RUL prediction. For illustration, a case study of rotating machinery in an oil refinery and a public aircraft engine dataset are investigated. Extensive experiments demonstrate that, compared with models relying solely on either vibration or operating signals and with alternative fusion strategies, the proposed model effectively integrates multisource data and reduces prediction loss while maintaining acceptable performance.
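
The idea of dynamically weighting features from two sources can be illustrated with a minimal sketch. This is not the proposed network: the function names, the scoring rule (each source's feature vector scored against the mean of both vectors), and the epsilon guard in the percentage error are all assumptions made for illustration. The abstract does not specify the attention formulation or how the mean absolute percentage error is modified, so the loss shown here is the standard one.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_fuse(vib_feat, op_feat):
    """Hypothetical fusion of two equal-length feature vectors.

    Each source is scored against a shared context (the mean of both
    vectors), so the weight of one source depends on the other.
    Returns (fused_vector, [weight_vibration, weight_operating]).
    """
    d = len(vib_feat)
    context = [(v + o) / 2.0 for v, o in zip(vib_feat, op_feat)]
    scale = math.sqrt(d)  # scaled dot-product scoring (an assumption)
    scores = [dot(vib_feat, context) / scale, dot(op_feat, context) / scale]
    w_vib, w_op = softmax(scores)
    fused = [w_vib * v + w_op * o for v, o in zip(vib_feat, op_feat)]
    return fused, [w_vib, w_op]


def mape(y_true, y_pred, eps=1e-8):
    """Standard mean absolute percentage error, returned as a fraction.

    The eps guard avoids division by zero when the true remaining
    useful life approaches zero; it is an assumption, not the paper's
    modification.
    """
    terms = [abs(t - p) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    fused, weights = attention_fuse([1.0, 0.0], [0.0, 2.0])
    print("weights:", weights)
    print("fused:", fused)
    print("MAPE:", mape([100.0, 50.0], [90.0, 55.0]))
```

In this toy rule the source whose features align more strongly with the shared context receives the larger weight, and the two weights always sum to one. A learned model would replace the fixed scoring with trainable projections.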