TY - JOUR
T1 - Subgroup Discovery in Smart Electricity Meter Data
AU - Jin, Nanlin
AU - Flach, Peter
AU - Wilcox, Tom
AU - Sellman, Royston
AU - Thumim, Joshua
AU - Knobbe, Arno
PY - 2014/5
Y1 - 2014/5
N2 - This work presents data mining methods for discovering unusual consumption patterns and their associated descriptive models from smart electricity meter data. At present, data mining and knowledge discovery in electricity meter data suffer from three notable weaknesses: 1) insufficient focus on intelligent data analysis of subgroups (subsets) whose patterns vary significantly from aggregate patterns embodied in an entire dataset; 2) a lack of effort towards generating intuitively understandable and practically applicable knowledge for industrial practitioners to identify such subgroups; and 3) limited knowledge regarding the link between unusual consumption patterns and household consumers' socio-demographic characteristics. This paper addresses these practically important but technically challenging issues by applying subgroup discovery algorithms to a real smart electricity meter dataset. Subgroups whose patterns are unusual and whose sizes are large enough are discovered, and their descriptive and predictive models are generated. Furthermore, to enrich subgroup discovery algorithms, three new quality measures for real-valued targets are proposed. The comparative studies empirically evaluate the effectiveness and usefulness of subgroup discovery on classification accuracy, predictive power, and computational resources. The methodologies and algorithms presented are generic, and therefore applicable to a wider range of data mining problems.
AB - This work presents data mining methods for discovering unusual consumption patterns and their associated descriptive models from smart electricity meter data. At present, data mining and knowledge discovery in electricity meter data suffer from three notable weaknesses: 1) insufficient focus on intelligent data analysis of subgroups (subsets) whose patterns vary significantly from aggregate patterns embodied in an entire dataset; 2) a lack of effort towards generating intuitively understandable and practically applicable knowledge for industrial practitioners to identify such subgroups; and 3) limited knowledge regarding the link between unusual consumption patterns and household consumers' socio-demographic characteristics. This paper addresses these practically important but technically challenging issues by applying subgroup discovery algorithms to a real smart electricity meter dataset. Subgroups whose patterns are unusual and whose sizes are large enough are discovered, and their descriptive and predictive models are generated. Furthermore, to enrich subgroup discovery algorithms, three new quality measures for real-valued targets are proposed. The comparative studies empirically evaluate the effectiveness and usefulness of subgroup discovery on classification accuracy, predictive power, and computational resources. The methodologies and algorithms presented are generic, and therefore applicable to a wider range of data mining problems.
KW - data mining
KW - knowledge discovery
KW - time series analysis
U2 - 10.1109/TII.2014.2311968
DO - 10.1109/TII.2014.2311968
M3 - Article
SN - 1551-3203
VL - 10
SP - 1327
EP - 1336
JO - IEEE Transactions on Industrial Informatics
JF - IEEE Transactions on Industrial Informatics
IS - 2
ER -