Addressing partial observability in reinforcement learning for energy management

Marco Biemann, Xiufeng Liu, Yifeng Zeng, Lizhen Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Automatic control of energy systems is affected by uncertainty in multiple factors, including weather, prices and human activities. The literature relies on Markov-based control, which takes only the current state into account. This limits control performance, because previous states provide additional context for decision making. We present two ways to learn non-Markovian policies, based on recurrent neural networks and variational inference. We evaluate the methods on a simulated data centre HVAC control task. The results show that the off-policy stochastic latent actor-critic algorithm can keep the temperature within the predefined range after three months of training without prior knowledge, while reducing energy consumption by more than 5% compared to Markovian policies.

Original language: English
Title of host publication: BuildSys 2021 - Proceedings of the 2021 ACM International Conference on Systems for Energy-Efficient Built Environments
Publisher: ACM
Pages: 324-328
Number of pages: 5
ISBN (Electronic): 9781450391146
Publication status: Published - 17 Nov 2021

Keywords

  • HVAC control
  • POMDP
  • energy management
  • recurrent neural networks
  • reinforcement learning
  • variational inference
