TY - GEN
T1 - A Hybrid LSTM-Attention Approach for Missing Data Imputation in IoT Time Series
AU - Laeeq, Ammara
AU - Adeel, Usman
AU - Li, Jie
AU - Starkey, Eleanor
PY - 2025/11/6
Y1 - 2025/11/6
N2 - IoT systems generate large volumes of time-series data, but sensor malfunctions often lead to missing values that reduce the effectiveness of machine learning models. We propose a novel hybrid architecture that interleaves Long Short-Term Memory (LSTM) layers with a multihead attention mechanism, where the first LSTM layer captures local temporal dependencies, the attention layer highlights long-range relationships, and the second LSTM layer integrates these features into a coherent sequence. This structured design, unlike conventional orderings, enhances robustness against irregular missingness. The model is evaluated on six months of soil surface temperature data with simulated missing rates from 10% to 90%, using mean absolute error (MAE), coefficient of determination (R2), and root mean squared error (RMSE). Performance is also compared against two baselines: a statistical technique, k-Nearest Neighbour (KNN), and a deep learning technique, Bidirectional Recurrent Imputation for Time Series (BRITS). Importantly, training with simulated missingness further improved generalization, underscoring the novelty and practical relevance of the proposed approach for real-world IoT scenarios.
AB - IoT systems generate large volumes of time-series data, but sensor malfunctions often lead to missing values that reduce the effectiveness of machine learning models. We propose a novel hybrid architecture that interleaves Long Short-Term Memory (LSTM) layers with a multihead attention mechanism, where the first LSTM layer captures local temporal dependencies, the attention layer highlights long-range relationships, and the second LSTM layer integrates these features into a coherent sequence. This structured design, unlike conventional orderings, enhances robustness against irregular missingness. The model is evaluated on six months of soil surface temperature data with simulated missing rates from 10% to 90%, using mean absolute error (MAE), coefficient of determination (R2), and root mean squared error (RMSE). Performance is also compared against two baselines: a statistical technique, k-Nearest Neighbour (KNN), and a deep learning technique, Bidirectional Recurrent Imputation for Time Series (BRITS). Importantly, training with simulated missingness further improved generalization, underscoring the novelty and practical relevance of the proposed approach for real-world IoT scenarios.
KW - Data Imputation
KW - LSTM
KW - Missing Data
KW - Multihead Attention
KW - Neural Network
KW - Time Series
UR - https://www.scopus.com/pages/publications/105022120528
U2 - 10.1007/978-3-032-10486-1_28
DO - 10.1007/978-3-032-10486-1_28
M3 - Conference contribution
AN - SCOPUS:105022120528
SN - 9783032104854
T3 - Lecture Notes in Computer Science
SP - 301
EP - 312
BT - Intelligent Data Engineering and Automated Learning – IDEAL 2025
A2 - Martínez, Luis
A2 - Camacho, David
A2 - Yin, Hujun
A2 - Dutta, Bapi
A2 - Yera, Raciel
A2 - Rodríguez Domínguez, Rosa M.
A2 - Tallón-Ballesteros, Antonio
PB - Springer
CY - Cham, Switzerland
T2 - 26th International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2025
Y2 - 13 November 2025 through 15 November 2025
ER -