TY - JOUR
T1 - Noise tolerant drift detection method for data stream mining
AU - Wang, Pingfan
AU - Jin, Nanlin
AU - Woo, Wai Lok
AU - Woodward, John R.
AU - Davies, Duncan
N1 - Funding information: This work was supported by the European Regional Development Fund (ERDF) [25R17P01847]; Northumbria University Newcastle, Newcastle upon Tyne, UK; Notify Technology Ltd, Newcastle, UK.
PY - 2022/9/1
Y1 - 2022/9/1
N2 - Drift detection methods identify changes in data streams. Such changes are called concept drifts. Existing drift detection methods often assume that the input is a noise-free data stream. However, in real-world applications, data streams, for example those generated by the Internet of Things, are normally contaminated with noise (i.e. class noise and/or attribute noise). In this paper, we propose a Noise Tolerant Drift Detection Method (NTDDM), which uses a two-step detection and validation function to detect drifts and filter out the false drifts caused by noise. The NTDDM is compared with six well-known drift detection methods and tested on four benchmarks having different levels. Three performance indicators are proposed to determine whether the drift detection is made within a reasonable time, and the length of time to the known drift starting point. The comparative studies demonstrate that NTDDM outperforms the existing methods on these performance indicators. Our proposed method achieves a statistically significant improvement in drift detection compared with the other methods in the experiments. The proposed NTDDM makes it possible to efficiently and effectively detect drift in a noisy data stream.
AB - Drift detection methods identify changes in data streams. Such changes are called concept drifts. Existing drift detection methods often assume that the input is a noise-free data stream. However, in real-world applications, data streams, for example those generated by the Internet of Things, are normally contaminated with noise (i.e. class noise and/or attribute noise). In this paper, we propose a Noise Tolerant Drift Detection Method (NTDDM), which uses a two-step detection and validation function to detect drifts and filter out the false drifts caused by noise. The NTDDM is compared with six well-known drift detection methods and tested on four benchmarks having different levels. Three performance indicators are proposed to determine whether the drift detection is made within a reasonable time, and the length of time to the known drift starting point. The comparative studies demonstrate that NTDDM outperforms the existing methods on these performance indicators. Our proposed method achieves a statistically significant improvement in drift detection compared with the other methods in the experiments. The proposed NTDDM makes it possible to efficiently and effectively detect drift in a noisy data stream.
KW - Data Stream Mining
KW - Drift Detection Method
KW - Noise in Signal
KW - Signal Processing and Analysis
UR - http://www.scopus.com/inward/record.url?scp=85135339792&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2022.07.065
DO - 10.1016/j.ins.2022.07.065
M3 - Article
AN - SCOPUS:85135339792
SN - 0020-0255
VL - 609
SP - 1318
EP - 1333
JO - Information Sciences
JF - Information Sciences
ER -