Abstract
Automatic control of energy systems is affected by uncertainty in multiple factors, including weather, prices, and human activity. The literature relies on Markov-based control, which takes into account only the current state. This limits control performance, since previous states provide additional context for decision making. We present two ways to learn non-Markovian policies, based on recurrent neural networks and on variational inference. We evaluate the methods on a simulated data centre HVAC control task. The results show that the off-policy stochastic latent actor-critic algorithm can maintain the temperature in the predefined range within three months of training without prior knowledge, while reducing energy consumption by more than 5% compared to Markovian policies.
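The abstract names two ingredients for non-Markovian control: a recurrent network that summarises the observation history, and a variational, stochastic latent state in the style of the stochastic latent actor-critic. The sketch below illustrates both ideas in minimal PyTorch; it is not the paper's implementation, and all layer sizes, observation/action dimensions, and the choice of a GRU are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the authors' code) of a non-Markovian
# HVAC policy: a GRU compresses the observation history into a hidden
# state, and the policy head maps that state to a bounded continuous
# action such as a cooling setpoint.
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, act_dim)

    def forward(self, obs_seq, h0=None):
        # obs_seq: (batch, time, obs_dim) history of sensor readings.
        out, h = self.gru(obs_seq, h0)
        # Conditioning on the whole history, not just the latest
        # observation, is what makes the policy non-Markovian.
        action = torch.tanh(self.head(out[:, -1]))  # bounded action
        return action, h

class LatentEncoder(nn.Module):
    """Variational-inference flavour: infer a stochastic latent state
    from the recurrent hidden state via the reparameterisation trick,
    as in latent-variable methods such as SLAC. Purely illustrative."""
    def __init__(self, hidden_dim: int = 64, latent_dim: int = 16):
        super().__init__()
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.log_std = nn.Linear(hidden_dim, latent_dim)

    def forward(self, h):
        mu, log_std = self.mu(h), self.log_std(h)
        z = mu + torch.randn_like(mu) * log_std.exp()  # reparameterise
        return z, mu, log_std

# Hypothetical usage: 8 sensor channels (temperatures, price, weather,
# IT load), 1 continuous action, a 24-step observation history.
policy = RecurrentPolicy(obs_dim=8, act_dim=1)
history = torch.randn(1, 24, 8)
action, hidden = policy(history)
z, mu, log_std = LatentEncoder()(hidden[-1])
```

In a full actor-critic setup the critic would condition on the latent state `z` rather than the raw observation, and the encoder would be trained with an evidence lower bound alongside the reinforcement-learning objective; those training loops are omitted here.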
| Original language | English |
| --- | --- |
| Title of host publication | BuildSys 2021 - Proceedings of the 2021 ACM International Conference on Systems for Energy-Efficient Built Environments |
| Publisher | ACM |
| Pages | 324-328 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781450391146 |
| DOIs | |
| Publication status | Published - 17 Nov 2021 |
Keywords
- HVAC control
- POMDP
- energy management
- recurrent neural networks
- reinforcement learning
- variational inference